Commit Graph

1345 Commits (4087d1941db7e4356af8e1165bfa3eef504e8d04)

Author SHA1 Message Date
Naisila Puka 76ff4ab188
Adds support for unlogged distributed sequences (#6292)
We can now do the following:
- Distribute sequence with logged/unlogged option
- ALTER TABLE my_sequence SET LOGGED/UNLOGGED
- ALTER SEQUENCE my_sequence SET LOGGED/UNLOGGED

Relevant PG commit
344d62fb9a
2022-09-13 10:53:39 +03:00
Nils Dijk cda3686d86
Feature: run rebalancer in the background (#6215)
DESCRIPTION: Add a rebalancer that uses background tasks for its
execution

Based on the baclground jobs and tasks introduced in #6296 we implement
a new rebalancer on top of the primitives of background execution. This
allows the user to initiate a rebalance and let Citus execute the long
running steps in the background until completion.

Users can invoke the new background rebalancer with `SELECT
citus_rebalance_start();`. It will output information on its job id and
how to track progress. Also it returns its job id for automation
purposes. If you simply want to wait till the rebalance is done you can
use `SELECT citus_rebalance_wait();`

A running rebalance can be canelled/stopped with `SELECT
citus_rebalance_stop();`.
2022-09-12 20:46:53 +03:00
naisila 47bea76c6c Revert "Support JSON_TABLE on PG 15 (#6241)"
This reverts commit 1f4fe35512.
2022-09-12 15:20:17 +03:00
Jelte Fennema a2d86214b2
Share more replication code between moves and splits (#6310)
The logical replication catchup part for shard splits and shard moves is
very similar. This abstracts most of that similarity away into a single
function. This also improves the logic for non blocking shard splits a
bit, by using faster foreign key creation. It also parallelizes index creation
which shard moves were already doing, but shard splits did not.
2022-09-09 16:45:38 +02:00
Marco Slot ba2fe3e3c4
Remove do_repair option from citus_copy_shard_placement (#6299)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-09 15:44:30 +02:00
Nils Dijk 00a94c7f13
Implement infrastructure to run sql jobs in the background (#6296)
DESCRIPTION: Add infrastructure to run long running management operations in background

This infrastructure introduces the primitives of jobs and tasks.
A task consists of a sql statement and an owner. Tasks belong to a
Job and can depend on other tasks from the same job.

When there are either runnable or running tasks we would like to
make sure a bacgrkound task queue monitor process is running. A Task
could be in running state while there is actually no monitor present
due to a database restart or failover. Once the monitor starts it
will reset any running task to its runnable state.

To make sure only one background task queue monitor is ever running
at once it will acquire an advisory lock that self conflicts.

Once a task is done it will find all tasks depending on this task.
After checking that the task doesn't have unmet dependencies it will
transition the task from blocked to runnable state for the task to
be picked up on a subsequent task start.

Currently only one task can be running at a time. This can be
improved upon in later releases without changes to the higher level
API.

The initial goal for this background tasks is to allow a rebalance
to run in the background. This will be implemented in a subsequent PR.
2022-09-09 16:11:19 +03:00
Jelte Fennema 76137e967f
Create all foreign keys quickly at the end of a shard move (#6148)
Previously we would create foreign keys to reference table in an extra
fast way at the end of a shard move. This uses that same logic to also
do it for foreign keys between distributed tables.

Fixes #6141
2022-09-09 09:58:33 +02:00
Ahmet Gedemenli eadc88a800
Introduce GUC citus.skip_constraint_validation (#6281)
Introduces a new GUC named citus.skip_constraint_validation, which basically skips constraint validation when set to on.
For some several places that we hack to skip the foreign key validation phase, now we use this GUC.
2022-09-08 18:13:18 +03:00
Nitish Upreti d7404a9446
'Deferred Drop' and robust 'Shard Cleanup' for Splits. (#6258)
DESCRIPTION:
This PR adds support for 'Deferred Drop' and robust 'Shard Cleanup' for Splits.

Common Infrastructure
This PR introduces new common infrastructure so as any operation that wants robust cleanup of resources can register with the cleaner and have the resources cleaned appropriately based on a specified policy. 'Shard Split' is the first consumer using this new infrastructure.
Note : We only support adding 'shards' as resources to be cleaned-up right now but the framework will be extended to support other resources in future.

Deferred Drop for Split
Deferred Drop Support ensures that shards undergoing split are not dropped inline as part of operation but dropped later when no active read queries are running on shard. This helps with :

Avoids any potential deadlock scenarios that can cause long running Split operation to rollback.
Avoids Split operation blocking writes and then getting blocked (due to running queries on the shard) when trying to drop shards.
Deferred drop is the new default behavior going forward.
Shard Cleaner Extension
Shard Cleaner is a background task responsible for deferred drops in case of 'Move' operations.
The cleaner has been extended to ensure robust cleanup of shards (dummy shards and split children) in case of a failure based on the new infrastructure mentioned above. The cleaner also handles deferred drop for 'Splits'.

TESTING:
New test ''citus_split_shard_by_split_points_deferred_drop' to test deferred drop support.
New test 'failure_split_cleanup' to test shard cleanup with failures in different stages.
Update 'isolation_blocking_shard_split and isolation_non_blocking_shard_split' for deferred drop.
Added non-deferred drop version of existing tests : 'citus_split_shard_no_deferred_drop' and 'citus_non_blocking_splits_no_deferred_drop'
2022-09-06 12:11:20 -07:00
aykut-bozkurt 69726648ab
verify shards if exists for insert, delete, update (#6280)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-06 15:29:14 +02:00
Naisila Puka d7f41cacbe
Prohibit renaming child trigger on distributed partition pre PG15 (#6290)
Pre PG15, renaming child triggers on partitions is allowed. When
creating a trigger in a distributed parent partitioned table, the
triggers on the shards of the partitions have the same name with
the triggers on the corresponding parent shards of the parent
table. Therefore, they don't have the same appended shard id as
the shard id of the partition. Hence, when trying to rename a
child trigger on a partition of a distributed table, we can't
correctly find the triggers on the shards of the partition in
order to rename them since we append a different shard id to the
name of the trigger. Since we can't find the trigger we get a
misleading error of inexistent trigger.

In this commit we prohibit renaming child triggers on distributed
partitions altogether.
2022-09-06 12:19:25 +03:00
Marco Slot e6b1845931
Change split logic to avoid EnsureReferenceTablesExistOnAllNodesExtended (#6208)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-05 22:02:18 +02:00
Önder Kalacı bd13836648
Add citus.skip_advisory_lock_permission_checks (#6293) 2022-09-05 17:47:41 +02:00
Jelte Fennema 1c5b8588fe
Address race condition in InitializeBackendData (#6285)
Sometimes in CI our isolation_citus_dist_activity test fails randomly
like this:
```diff
 step s2-view-dist:
  SELECT query, citus_nodename_for_nodeid(citus_nodeid_for_gpid(global_pid)), citus_nodeport_for_nodeid(citus_nodeid_for_gpid(global_pid)), state, wait_event_type, wait_event, usename, datname FROM citus_dist_stat_activity WHERE query NOT ILIKE ALL(VALUES('%pg_prepared_xacts%'), ('%COMMIT%'), ('%BEGIN%'), ('%pg_catalog.pg_isolation_test_session_is_blocked%'), ('%citus_add_node%')) AND backend_type = 'client backend' ORDER BY query DESC;

 query                                                                                                                                                                                                                                                                                                                                                                 |citus_nodename_for_nodeid|citus_nodeport_for_nodeid|state              |wait_event_type|wait_event|usename |datname
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------------------+-------------------+---------------+----------+--------+----------

   INSERT INTO test_table VALUES (100, 100);
                                                                                                                                                                                                                                                                                                                          |localhost                |                    57636|idle in transaction|Client         |ClientRead|postgres|regression
-(1 row)
+
+                SELECT coalesce(to_jsonb(array_agg(csa_from_one_node.*)), '[{}]'::JSONB)
+                FROM (
+                    SELECT global_pid, worker_query AS is_worker_query, pg_stat_activity.* FROM
+                    pg_stat_activity LEFT JOIN get_all_active_transactions() ON process_id = pid
+                ) AS csa_from_one_node;
+            |localhost                |                    57636|active             |               |          |postgres|regression
+(2 rows)

 step s3-view-worker:
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26692/workflows/3406e4b4-b686-4667-bec6-8253ee0809b1/jobs/765119

I intended to fix this with #6263, but the fix turned out to be
insufficient. This PR tries to address the issue by setting
distributedCommandOriginator correctly in more situations. However, even
with this change it's still possible to reproduce the flaky test in CI.
In any case this should fix at least some instances of this issue.

In passing this changes the isolation_citus_dist_activity test to allow
running it multiple times in a row.
2022-09-02 14:23:47 +02:00
Jelte Fennema 8bb082e77d
Fix reporting of progress on waiting and moved shards (#6274)
In commit 31faa88a4e I removed some features of the rebalance progress
monitor. I did this because the plan was to remove the foreground shard
rebalancer later in the PR that would add the background shard
rebalancer. So, I didn't want to spend time fixing something that we
would throw away anyway.

As it turns out we're not removing the foreground shard rebalancer after
all, so it made sens to fix the stuff that I broke. This PR does that.
For the most part this commit reverts the changes in commit 31faa88a4e.
It's not a full revert though, because it keeps the improved tests and
the changes to `citus_move_shard_placement`.
2022-08-31 14:55:47 +03:00
Marco Slot 6bb31c5d75
Add non-blocking variant of create_distributed_table (#6087)
Added create_distributed_table_concurrently which is nonblocking variant of create_distributed_table.

It bases on the split API which takes advantage of logical replication to support nonblocking split operations.

Co-authored-by: Marco Slot <marco.slot@gmail.com>
Co-authored-by: aykutbozkurt <aykut.bozkurt1995@gmail.com>
2022-08-30 15:35:40 +03:00
Jelte Fennema d68654680b
Fix flakyness in isolation_citus_dist_activity (#6263)
Sometimes in CI our isolation_citus_dist_activity test fails randomly
like this:
```diff
 step s2-view-dist:
  SELECT query, citus_nodename_for_nodeid(citus_nodeid_for_gpid(global_pid)), citus_nodeport_for_nodeid(citus_nodeid_for_gpid(global_pid)), state, wait_event_type, wait_event, usename, datname FROM citus_dist_stat_activity WHERE query NOT ILIKE ALL(VALUES('%pg_prepared_xacts%'), ('%COMMIT%'), ('%BEGIN%'), ('%pg_catalog.pg_isolation_test_session_is_blocked%'), ('%citus_add_node%')) AND backend_type = 'client backend' ORDER BY query DESC;

 query                                                                                                                                                                                                                                                                                                                                                                 |citus_nodename_for_nodeid|citus_nodeport_for_nodeid|state              |wait_event_type|wait_event|usename |datname
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------------------+-------------------+---------------+----------+--------+----------

   INSERT INTO test_table VALUES (100, 100);
                                                                                                                                                                                                                                                                                                                          |localhost                |                    57636|idle in transaction|Client         |ClientRead|postgres|regression
-(1 row)
+
+                SELECT coalesce(to_jsonb(array_agg(csa_from_one_node.*)), '[{}]'::JSONB)
+                FROM (
+                    SELECT global_pid, worker_query AS is_worker_query, pg_stat_activity.* FROM
+                    pg_stat_activity LEFT JOIN get_all_active_transactions() ON process_id = pid
+                ) AS csa_from_one_node;
+            |localhost                |                    57636|active             |               |          |postgres|regression
+(2 rows)

 step s3-view-worker:
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26605/workflows/56d284d2-5bb3-4e64-a0ea-7b9b1626e7cd/jobs/760633

The reason for this is that citus_dist_stat_activity sometimes shows the
query that it uses itself to get the data from pg_stat_activity. This is
actually a bug, because it's a worker query and thus shouldn't show up
there. To try and solve this bug, we remove two small opportunities for a
race condition. These race conditions could happen when the backenddata
was marked as active, but the distributedCommandOriginator was not set
correctly yet/anymore. There was an opportunity for this to happen both 
during connection start and shutdown.
2022-08-30 12:57:37 +03:00
Marco Slot 9bf3c3dd5c
Add an allow_unsafe_constraints flag for constraints without distribution column (#6237)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-25 11:37:50 +03:00
Marco Slot ac07d33a29
Remove unused reduceQuery from physical planning (#6221)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-24 17:24:27 +00:00
Naisila Puka 1f4fe35512
Support JSON_TABLE on PG 15 (#6241)
Postgres supports JSON_TABLE feature on PG 15.

We treat JSON_TABLE the same as correlated functions (e.g., recurring tuples).
In the end, for multi-shard JSON_TABLE commands, we apply the same
restrictions as reference tables (e.g., cannot be in the outer part of
an outer join etc.)

Co-authored-by: Onder Kalaci <onderkalaci@gmail.com>
2022-08-24 19:11:18 +03:00
Naisila Puka 35b4ddc355
Pg15 support (#6085)
* Adjust configure script to allow PG15

* Adds copy of ruleutils_14.c as ruleutils_15.c

* Uses get_namespace_name_or_temp in ruleutils_15.c

Relevant PG commit:
48c5c9068211e0a04fd9553c8714b2821ed3ad17

* Clean up code using "(expr) ? true : false" in ruleutils_15.c

Relevant PG commit:
fd0625c7a9c679c0c1e896014b8f49a489c3a245

* Change varno from Index (unsigned int) to int in ruleutils_15.c

Relevant PG commit:
e3ec3c00d85bd2844ffddee83df2bd67c4f8297f

* Adds find_recursive_union to ruleutils_15.c

Relevant PG commit:
3f50b82639637c9908afa2087de7588450aa866b

* Fix display of SQL-std func's args in INSERT/SELECT in ruleutils_15.c

Relevant PG commit:
a8d8445a7b2f80f6d0bfe97b19f90bd2cbef8759

* Fix ruleutils_15.c's dumping of whole-row Vars in more contexts

Relevant PG commit:
43c2175121c829c8591fc5117b725f1f22bfb670

* Fix assorted missing logic for GroupingFunc nodes in ruleutils_15.c

Relevant PG commit:
2591ee8ec44d8cbc8e1226550337a64c684746e4

* Adds grammar support for SQL/JSON clauses in ruleutils_15.c

Relevant PG commit:
f79b803dcc98d707450e158db3638dc67ff8380b

* Adds SQL/JSON constructors to ruleutils_15.c

Relevant PG commits:
f4fb45d15c59d7add2e1b81a9d477d0119a9691a
cc7401d5ca498a84d9b47fd2e01cebd8e830e558

* Adds support for MERGE in ruleutils_15.c

Relevant PG commit:
7103ebb7aae8ab8076b7e85f335ceb8fe799097c

* Add IS JSON predicate to ruleutils_15.c

Relevant PG commit:
33a377608fc29cdd1f6b63be561eab0aee5c81f0

* Add SQL/JSON query functions to ruleutils_15.c

Relevant PG commit:
1a36bc9dba8eae90963a586d37b6457b32b2fed4

* Adds three different SQL/JSON values to ruleutils_15.c

Relevant PG commits:
606948b058dc16bce494270eea577011a602810e
49082c2cc3d8167cca70cfe697afb064710828ca

* Adds JSON table functions in ruleutils_15.c

Relevant PG commit:
4e34747c88a03ede6e9d731727815e37273d4bc9

* Add PLAN function for JSON table in ruleutils_15.c

Relevant PG commit:
fadb48b00e02ccfd152baa80942de30205ab3c4f

* Remove extra blank lines before block-closing braces ruleutils_15.c

Relevant PG commit:
24d2b2680a8d0e01b30ce8a41c4eb3b47aca5031

* set_deparse_plan: Reuse variable to appease Coverity ruleutils_15.c

Relevant PG commit:
e70813fbc4aaca35ec012d5a426706bd54e4acab

* Mechanical code beautification ruleutils_15.c

Relevant PG commit:
23e7b38bfe396f919fdb66057174d29e17086418

* Rename value_type to item_type in ruleutils_15.c

Relevant PG commit:
3ab9a63cb638a1fd99475668e2da9c237495aeda

* Show 'AS "?column?"' explicitly when it's important in ruleutils_15.c

Relevant PG commit:
c7461fc25558832dd347a9c8150b0f1ed85e36e8

* Fix ruleutils_15.c issues with dropped cols in funcs-returning-composite

Relevant PG commit:
c1d1e8469c77ce6b8e5310955580b4a3eee7fe96

* Change comment regarding functions returning composite in ruleutils_15.c

Relevant PG commit:
c2fa113ddb1117b1f03e91960f65d5d7d8a90270

* Replace int nodes with bool nodes where needed

In PG15, Boolean nodes are added. Pre PG15, internal Boolean values
in Create Role commands were represented by Integer nodes. This
commit replaces int nodes logic with bool nodes logic where needed.
Mostly there are CREATE ROLE logic changes.

Relevant PG commit:
941460fcf731a32e6a90691508d5cfa3d1f8eeaf

* Handle new option colliculocale in CREATE COLLATION logic

In PG15, there is an added option to use ICU as global locale provider.
pg_collation has three locale-related fields: collcollate and collctype,
which are libc-related fields, and a new one colliculocale, which is the
ICU-related field. Only the libc-related fields or the ICU-related field
is set, never both.

Relevant PG commits:
f2553d43060edb210b36c63187d52a632448e1d2
54637508f87bd5f07fb9406bac6b08240283be3b

* Add PG15 tests to CI using test images that have 15beta2 (#6093)

* Change warning message in pg_signal_backend()

Relevant PG commit:
7fa945b857cc1b2964799411f1633468826861ff

* Revert "Add missing ifdef for PG 15"

This reverts commit c7b51025ab.

* Fixes tests for ALTER TRIGGER RENAME consistency for part. tables

Relevant PG commit:
80ba4bb383538a2ee846fece6a7b8da9518b6866

* Prevent creating child triggers on partitions when adding new node

Pre PG15, tgisinternal is true for a "child" trigger on a partition
cloned from the trigger on the parent.
In PG15, tgisinternal is false in that case. However, we don't want to
create this trigger on the partition since it will create a conflict
when we try to attach the partition to the parent table:
ERROR: trigger "..." for relation "{partition_name}" already exists

Relevant PG commit:
f4566345cf40b068368cb5617e61318da60676ec

* Fix tests for generated columns dependency changes

In PG15, For GENERATED columns, all dependencies of the generation
expression are recorded as NORMAL dependencies of the column itself.
This requires CASCADE to drop generated cols with the original col.
PRE PG15, dependencies were recorded as AUTO, with which
generated columns are silently dropped with the original column.

Relevant PG commit:
cb02fcb4c95bae08adaca1202c2081cfc81a28b5

* Explicitly cast catalog "char" column to text before concatenation

Relevant PG commit:
07eee5a0dc642d26f44d65c4e6263304208e8583

* Remove 'AS "?column?"' from test outputs

There were some instances in the following tst outputs
in planning debug outputs where AS "?column?" is added.
We add a normalization rule to remove it as it is not
important.

cte_inline.out
recursive_relation_planning_restriction_pushdown.out

Relevant PG commit:
c7461fc25558832dd347a9c8150b0f1ed85e36e8

* Use pg_backup_stop(PG15) instead of pg_stop_backup(PG<15)

Add an alternative test output because of the change in the
backup modes of Postgres. Specifically here, there is a renaming
issue: pg_stop_backup PRE PG15 vs pg_backup_stop PG15+
The alternative output can be deleted when we drop support for PG14

Relevant PG commit:
39969e2a1e4d7f5a37f3ef37d53bbfe171e7d77a

* Adds citus.mitmfifo GUC

Previously we setting this configuration parameter
in the fly for failure tests schedule.
However, PG15 doesn't allow that anymore: reserved prefixes
like "citus" cannot be used to set non-existing GUCs.

Relevant PG commit:
88103567cb8fa5be46dc9fac3e3b8774951a2be7

* Handles EXPLAIN output diffs in PG15 - Extra result lines

To handle extra "Result" lines in explain outputs, we add explain
method to multi_test_helpers.sql file
- plan_without_result_lines() is added for cases where we want the
whole explain output with only "Result" lines removed

* Handles EXPLAIN output diffs in PG15, Hash Agg/Join leverage

To handle differences in usage of GroupAggregate vs HashAggregate
or Merge Join vs Hash join in cases where this detail doesn't
seem to matter, we use coordinator_plan().
- coordinator_plan() is updated to remove "Result" lines

There are some cases where we have subplans so we add a new
function that prints all Task Count lines as well
- coordinator_plan_with_subplans()

Still not sure of the relevant PG commit
Could be db0d67db2401eb6238ccc04c6407a4fd4f985832
but disabling enable_group_by_reordering didn't help.

* Handles EXPLAIN output diffs in PG15: enable_group_by_reordering

Relevant PG commit
db0d67db2401eb6238ccc04c6407a4fd4f985832

* Normalizes Memory Usage, Buckets, Batches for PG15 explain diffs

We create a new function in multi_test_helpers, which is similar
to explain_merge function in PG15. This explain helper function
normalies Memory Usage, Buckets and Batches, and we use it in the
tests which give a different output for PG15.

* Bump test images to 15beta3 (#6172)

* Omit namespace in post-copy errmsg

Relevant PG commit:
069d33d0c5a021601245e44df77a0423ddd69359

* Handles EXPLAIN output diffs in PG15: extra arrows&result lines

To handle extra "->" arrows resulting from extra Result lines
in explain outputs, we add the following explain method to
multi_test_helpers.sql file

- plan_without_arrows() is added for cases where we want the
whole explain output without arrows and without Result lines

* Alters public schema's owner to pg_database_owner in PG15

In PG15, public schema is owned by pg_database_owner role.
In multi_extension, we drop and recreate the ppublic schema,
hence its owner become the default user in our tests, postgres.
Change that to pg_database_owner for PG15 consistency.

This results in alternative test output for public schema grants
in the following test:

grant_on_schema_propagation.sql

Relevant PG commit: b073c3ccd06e4cb845e121387a43faa8c68a7b62

* Add alternative test outputs for change in Insert Select display

citus_local_tables_queries.sql
coordinator_shouldhaveshards.sql
cte_inline.sql
insert_select_repartition.sql
intermediate_result_pruning.sql
local_shard_execution.sql
local_shard_execution_replicated.sql
multi_deparse_shard_query.sql
multi_insert_select.sql
multi_insert_select_conflict.sql
multi_mx_insert_select_repartition.sql
mx_coordinator_shouldhaveshards.sql
single_node.sql

Relevant PG commit:
a8d8445a7b2f80f6d0bfe97b19f90bd2cbef8759

* Fixes columnar tap tests for PG15

In PG15, Perl test modules have been moved to a new namespace.
Also, postgres node new() and get_new_node() methods have been
unified to one method: new()

We create separate tap tests for PG13/14 and PG15+
and update the Makefiles accordingly.

Relevant PG commits:
201a76183e2056c2217129e12d68c25ec9c559c8
b3b4d8e68ae83f432f43f035c7eb481ef93e1583

* Handles EXPLAIN output diffs in PG15: HashAgg Leverage,alt. output

Still not sure of the relevant PG commit
Could be db0d67db2401eb6238ccc04c6407a4fd4f985832
but disabling enable_group_by_reordering didn't help.
2022-08-24 17:59:17 +02:00
aykut-bozkurt 041f88d7bf
Revert "Revert "Creates new colocation for colocate_with:='none' too"" (#6227)
This reverts commit d171a736ab.
2022-08-24 10:54:04 +03:00
Marco Slot 639588bee0
Remove unused functions (#6220)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-22 11:53:25 +03:00
Jelte Fennema 31faa88a4e
Track rebalance progress at the shard move level (#6187)
We're in the processes of totally changing the shard rebalancer
experience and infrastructure. Soon the shard rebalancer will include
retries, crash recovery and support for running in the background.

These improvements come at a cost though, the way the
get_rebalance_progress UDF currently works is very hard to replicate
with this new structure. This is mostly because the old behaviour
doesn't really make sense anymore with this new infrastructure. A new
and better way to track the progress will be included as part of the new
infrastructure.

This PR is in preparation of the new code rebalancer experience.
It changes the get_rebalance_progress UDF to only display the moves that
are in progress at the moment, not the ones that happened in the past or
that are planned in the future. Another option would have been to
completely remove the current get_rebalance_progress functionality and
point people to the new way of tracking progress. But old blogposts
still reference the old UDF and users might have some automation on top
of it. Showing the progress of the current moves is fairly simple to
achieve, even with the new infrastructure.

So this PR is a kind of compromise: It doesn't have complete feature
parity with the old get_rebalance_progress, but the most common use
cases will still work.

There's also an advantage of the change: You can now see progress of
shard moves that were triggered by calling citus_move_shard_placement
manually. Instead of only being able to see progress of moves that were
initiated using get_rebalance_table_shards.
2022-08-18 18:57:04 +02:00
Onder Kalaci 9ec8e627c1 Support Sequences owned by columns before distributing tables
There are 3 different ways that a sequence can be interacting
with tables. (1) and (2) are already supported. This commit adds
support for (3).

     (1) column DEFAULT nextval('seq'):

	The dependency is roughly like below,
	and ExpandCitusSupportedTypes() is responsible
	for finding the depending sequences.

        schema <--- table <--- column <---- default value
         ^                                     |
         |------------------ sequence <--------|

    (2) serial columns: Bigserial/small serial etc:

	The dependency is roughly like below,
	and ExpandCitusSupportedTypes() is responsible
	for finding the depending sequences.

        schema <--- table <--- column <---- default value
                                 ^             |
				 |             |
          		     sequence <--------|

   (3) Sequence OWNED BY table.column: Added support for
       this type of resolution in this commit.

       The dependency is almost like the following, and
       ExpandCitusSupportedTypes() is NOT responsible for finding
       the dependency.

        schema <--- table <--- column
                                 ^
				 |
          		     sequence
2022-08-18 10:29:40 +02:00
Nils Dijk a9d47a96f6
Fix reference table lock contention (#6173)
DESCRIPTION: Fix reference table lock contention

Dropping and creating reference tables unintentionally blocked on each other due to the use of an ExclusiveLock for both the Drop and conditionally copying existing reference tables to (new) nodes.

The patch does the following:
 - Lower lock lever for dropping (reference) tables to `ShareLock` so they don't self conflict
 - Treat reference tables and distributed tables equally and acquire the colocation lock when dropping any table that is in a colocation group
 - Perform the precondition check for copying reference tables twice, first time with a lower lock that doesn't conflict with anything. Could have been a NoLock, however, in preparation for dropping a colocation group, it is an `AccessShareLock`

During normal operation the first check will always pass and we don't have to escalate that lock. Making it that we won't be blocked on adding and remove reference tables. Only after a node addition the first `create_reference_table` will still need to acquire an `ExclusiveLock` on the colocation group to perform the copy.
2022-08-17 18:19:28 +02:00
Jelte Fennema 3f6ce889eb
Use CreateSimpleHash (and variants) whenever possible (#6177)
This is a refactoring PR that starts using our new hash table creation
helper function. It adds a few more macros for ease of use, because C
doesn't have default arguments. It also adds a macro to check if a
struct contains automatic padding bytes. No struct that is hashed using
tag_hash should have automatic padding bytes, because those bytes are
undefined and thus using them to create a hash will result in undefined
behaviour (usually a random hash).
2022-08-17 13:01:59 +03:00
aykut-bozkurt be06d65721
Nonblocking tenant isolation is supported by using split api. (#6167) 2022-08-17 11:13:07 +03:00
Jelte Fennema 78a5013e24
Support changing CPU priorities for backends and shard moves (#6126)
**Intro**
This adds support to Citus to change the CPU priority values of
backends. This is created with two main usecases in mind:

1. Users might want to run the logical replication part of the shard moves
   or shard splits at a higher speed than they would do by themselves. 
   This might cause some small loss of DB performance for their regular 
   queries, but this is often worth it. During high load it's very possible
   that the logical replication WAL sender is not able to keep up with the
   WAL that is generated. This is especially a big problem when the
   machine is close to running out of disk when doing a rebalance.
2. Users might have certain long running queries that they don't impact
   their regular workload too much.

**Be very careful!!!**
Using CPU priorities to control scheduling can be helpful in some cases
to control which processes are getting more CPU time than others. 
However, due to an issue called "[priority inversion][1]" it's possible that
using CPU priorities together with the many locks that are used within
Postgres cause the exact opposite behavior of what you intended. This
is why this PR only allows the PG superuser to change the CPU priority 
of its own processes. Currently it's not recommended to set `citus.cpu_priority`
directly. Currently the only recommended interface for users is the setting 
called `citus.cpu_priority_for_logical_replication_senders`. This setting
controls CPU priority for a very limited set of processes (the logical 
replication senders). So, the dangers of priority inversion are also limited
with when using it for this usecase.

**Background**
Before reading the rest it's important to understand some basic
background regarding process CPU priorities, because they are a bit
counter intuitive. A lower priority value, means that the process will
be scheduled more and whatever it's doing will thus complete faster. The
default priority for processes is 0. Valid values are from -20 to 19
inclusive. On Linux a larger difference between values of two processes
will result in a bigger difference in percentage of scheduling.

**Handling the usecases**
Usecase 1 can be achieved by setting `citus.cpu_priority_for_logical_replication_senders`
to the priority value that you want it to have. It's necessary to set
this both on the workers and the coordinator. Example:
```
citus.cpu_priority_for_logical_replication_senders = -10
```

Usecase 2 can with this PR be achieved by running the following as
superuser. Note that this is only possible as superuser currently 
due to the dangers mentioned in the "Be very carefull!!!" section. 
And although this is possible it's **NOT** recommended:
```sql
ALTER USER background_job_user SET citus.cpu_priority = 5;
```

**OS configuration**
To actually make these settings work well it's important to run Postgres
with more a more permissive value for the 'nice' resource limit than
Linux will do by default. By default Linux will not allow a process to
set its priority lower than it currently is, even if it was lower when
the process originally started. This capability is necessary to reset
the CPU priority to its original value after a transaction finishes.
Depending on how you run Postgres this needs to be done in one of two
ways:

If you use systemd to start Postgres all you have to do is add  a line
like this to the systemd service file:
```conf
LimitNice=+0 # the + is important, otherwise its interpreted incorrectly as 20
```

If that's not the case you'll have to configure `/etc/security/limits.conf` 
like so, assuming that you are running Postgres as the `postgres` OS user:
```
postgres            soft    nice            0
postgres            hard    nice            0
```
Finally you'd have add the following line to `/etc/pam.d/common-session`
```
session required pam_limits.so
```

These settings would allow to change the priority back after setting it
to a higher value.

However, to actually allow you to set priorities even lower than the
default priority value you would need to change the values in the 
config to something lower than 0. So for example:
```conf
LimitNice=-10
```

or

```
postgres            soft    nice            -10
postgres            hard    nice            -10
```

If you use WSL2 you'll likely have to do another thing. You have to 
open a new shell, because when PAM is only used during login, and 
WSL2 doesn't actually log you in. You can force a login like this:
```
sudo su $USER --shell /bin/bash
```
Source: https://stackoverflow.com/a/68322992/2570866

[1]: https://en.wikipedia.org/wiki/Priority_inversion
2022-08-16 13:07:17 +03:00
Jelte Fennema 43c2a1e88b
Share more code between splits and moves (#6152)
When introducing non-blocking shard split functionality it was based
heavily on the non-blocking shard moves. However, differences between
usage was slightly to big to be able to reuse the existing functions
easily. So, most logical replication code was simply copied to dedicated
shard split functions and modified for that purpose.

This PR tries to create a more generic logical replication
infrastructure that can be used by both shard splits and shard moves.
There's probably more code sharing possible in the future, but I believe
this is at least a good start and addresses the lowest hanging fruit.

This also adds a CreateSimpleHash function that makes creating the
most common type of hashmap common.
2022-08-15 20:21:51 +03:00
aykut-bozkurt cc694b6bcf
we consider stat object as invalid if it is not owned by current user (#6130) 2022-08-09 20:59:30 +03:00
Jelte Fennema dd548ee3c7
Use faster custom copy logic for non-blocking shard moves (#6119)
DESCRIPTION: Use faster custom copy logic for non-blocking shard moves

Non-blocking shard moves consist of two main phases:
1. Initial data copy
2. Catchup phase

This changes the first of these phases significantly. Previously we used the
copy logic provided by postgres subscriptions. This meant we didn't have
to implement it ourselves, but it came with the downside of little control.
When implementing shard splits we needed more control to even make it
work, so we implemented our own logic for copying data between nodes.

This PR starts using that logic for non-blocking shard moves. Doing so
has four main advantages:
1. It uses COPY in binary format when possible, which is cheaper to encode 
    and decode. Furthermore it very often results in less data that needs to 
    be sent over the network.
2. It allows us to create the primary key (or other replica identity) after doing
    the initial data copy. This should give some speed up over the total run,
    because creating an index is bulk is much faster than incrementally building it.
3. It doesn't require a replication slot per parallel copy. Increasing the maximum
    number of replication slots uses resources in postgres, even if they are not used.
    So reducing the number of replication slots that shard moves need is nice.
4. Logical replication table_sync workers are slow to start up, so if lots of shards
    need to be copied that can make it quite slow. This can happen easily when
    combining Postgres partitioning with Citus.
2022-08-08 17:09:43 +02:00
Sameer Awasekar e236711eea Introduce Non-Blocking Shard Split Workflow 2022-08-04 16:32:38 +02:00
aykut-bozkurt 3ddc089651
stop distributing views with no distributed dependency if GUC DistributeLocalViews is set false. (#6083) 2022-08-04 12:34:40 +03:00
aykut-bozkurt 4ffe436bf9
we validate constraint as well if the statement is alter domain drop constraint (#6125) 2022-08-03 23:06:33 +03:00
aykutbozkurt 7387c7ed3d address method should take parameter isPostprocess 2022-08-02 21:00:23 +03:00
aykutbozkurt c98a68662a introduces operation type for dist ops 2022-08-02 20:42:32 +03:00
aykutbozkurt 57ce4cf8c4 use address method to decide if we should run preprocess and postprocess steps for a distributed object 2022-08-02 20:42:32 +03:00
aykut-bozkurt f372e93d22
we supress notice log during looking up function oid to not break pg vanilla tests. (#6082) 2022-08-01 10:14:35 +03:00
Onder Kalaci 149771792b Remove useless version compats
most likely leftover from earlier versions
2022-07-29 10:31:55 +02:00
Marco Slot cff013a057 Fix issues with insert..select casts and column ordering 2022-07-28 13:23:57 +02:00
Onder Kalaci d67cf907a2 Detach relation access tracking from connection management 2022-07-28 11:27:59 +02:00
aykut-bozkurt b08e5ec29d
added some missing object address callbacks (#6056) 2022-07-27 17:36:04 +03:00
Onder Kalaci 12fa3aaf6b Concurrent shard move/copy and colocated table creation fix
It turns out that create_distributed_table
and citus_move/copy_shard_placement does not
work well concurrently.

To fix that, we need to acquire a lock, which
sounds like a good use of colocation lock.

However, the current usage of colocation lock is
limited to higher level UDFs like rebalance_table_shards
etc. Those usage of lock is still useful, but
we cannot acquire the same lock on citus_move_shard_placement
etc. because the coordinator connects to itself to acquire
the lock. Hence, the high level UDF blocks itself.

To fix that, we use one more colocation lock, with the placements
are the main objects to consider.
2022-07-27 10:01:19 +02:00
Onder Kalaci 26fdcb68f0 Optimize StringJoin() for when prefix-postfix is needed
Before this commit, we required multiple copies of the
same stringInfo if we needed to append/prepend data to
the stringInfo. Now, we optionally get prefix/postfix.

For large string operations, this can save up to %10
memory.
2022-07-27 09:49:08 +02:00
aykut-bozkurt 5f27445b69
enable propagation warnings before postgres vanilla tests (#6081) 2022-07-27 10:34:41 +03:00
Onder Kalaci 6c65d29924 Check the PGPROC's validity properly
We used to only check whether the PID is valid
or not. However, Postgres does not necessarily
set the PID of the backend to 0 when it exists.

Instead, we need to be able to check it from procArray.
IsBackendPid() is what pg_stat_activity also relies
on for a similar purpose.
2022-07-26 17:44:44 +02:00
aykut-bozkurt 67ac3da2b0
added citus_depended_objects udf and HideCitusDependentObjects GUC to hide citus depended objects from pg meta queries (#6055)
use RecurseObjectDependencies api to find if an object is citus depended

make vanilla tests runnable to see if citus_depended function is working correctly
2022-07-25 16:43:34 +03:00
Naisila Puka 7d6410c838
Drop postgres 12 support (#6040)
* Remove if conditions with PG_VERSION_NUM < 13

* Remove server_above_twelve(&eleven) checks from tests

* Fix tests

* Remove pg12 and pg11 alternative test output files

* Remove pg12 specific normalization rules

* Some more if conditions in the code

* Change RemoteCollationIdExpression and some pg12/pg13 comments

* Remove some more normalization rules
2022-07-20 17:49:36 +03:00
aykutbozkurt ebb6d1c8c0 refactor code where GetObjectAddressFromParseTree is called because it returns list of addresses now 2022-07-19 18:13:12 +03:00
aykutbozkurt 9d232d7b00 change address method to return list of addresses 2022-07-19 18:13:11 +03:00
Önder Kalacı 90b1afe31e
Merge branch 'main' into baby_step_pg_15 2022-07-18 15:02:39 +02:00
Nitish Upreti 5b3537cdff
Shard Split for Citus (#6029)
* Blocking split setup

* Add missing type

* Missing API from Metadata Sync

* Shard Split e2e code

* Worker Split Copy DestReceiver skeleton

* Basic destreceiver code

* worker_split_copy UDF

* UDF calling

* Split points are text

* Isolate Tenant and Split Shard Unification

* Fixing executor and misc

* Reindent code

* Fixing UDF definitions

* Hello World Local Copy works

* Remote copy hello world works

* Local and Remote binary test

* Fixing text local copy and adding tests

* Hello World shard split works

* Negative tests

* Blocking Split workflow works

* Refactor

* Bug fix

* Reindent

* Cleaning up and adding comments

* Basic test for shard split workflow

* ReIndent

* Circle CI integration

* Removing include causing circle-ci build failure

* Remove SplitCopyDestReceiver and use PartitionedResultDestReceiver

* Add support for citus.enable_binary_protocol

* Reindent

* Fix build break

* Update Test

* Cleanup on catch

* Addressing open comments

* Update downgrade script and quote schema/table in COPY statement

* Fix metadata sync issue. Update regression test

* Isolation test and bug fix

* Add Isolation test, fix foreign constraint deadlock issue

* Misc code review comments

* Test name needing to be quoted

* Refactor code from review comments

* Explaining shardGroupSplitIntervalListList

* Fix upgrade & downgrade

* Fix broken test

* Test fix Round 2

* Fixing bug and modifying test appropriately

* Fully qualify copy udf name. Run Reindent

* Address PR comments

* Fix null handling when creating AuxiliaryStructures

* Ensure local copy is triggered in tests

* Limit max shards that can be created with split

* Test failure fix

* Remove split_mode and use shard_transfer_mode instead'

* Fix test failure

* Fix test failure

* Fixing permission issue when splitting non-superuser owned tables

* Fix test expected output

* Remove extra space

* Fix test

* attempt to fix test

* Addressing Marco's PR comment

* Only clean shards created by workflow

* Remove from merge

* Update test
2022-07-18 02:54:15 -07:00
Onder Kalaci 3eaef027e2 Remove unused code
Probably left over from removing old repartitioning code
2022-07-15 10:28:46 +02:00
Onder Kalaci 483a3a5875 PG 15 Compat: Resolve compile issues + shmem requests
Similar to #5897, one more step for running Citus with PG 15.

This PR at least make Citus run with PG 15. I have not tried running the tests with PG 15.

Shmem changes are based on 4f2400cb3f

Compile breaks are mostly due to #6008
2022-07-15 10:11:39 +02:00
ywj 1675519f93
Support citus_columnar as separate extension (#5911)
* Support upgrade and downgrade and separate columnar as citus_columnar extension

Co-authored-by: Yanwen Jin <yanwjin@microsoft.com>
Co-authored-by: Jeff Davis <jeff@j-davis.com>
2022-07-13 21:08:29 -07:00
Onder Kalaci b2e9a5baf1 Make sure citus_is_coordinator works on read replicas 2022-07-13 14:11:18 +02:00
Ahmet Gedemenli c8e1e243b8
Fix matviews for citus_add_local_table_to_metadata (#6023) 2022-07-04 17:00:07 +03:00
Aykut Bozkurt 6986f53835 propagate unqualified vacuum and analyze to all worker nodes 2022-06-23 15:33:14 +03:00
Jelte Fennema 184c7c0bce
Make enterprise features open source (#6008)
This PR makes all of the features open source that were previously only
available in Citus Enterprise.

Features that this adds:
1. Non blocking shard moves/shard rebalancer
   (`citus.logical_replication_timeout`)
2. Propagation of CREATE/DROP/ALTER ROLE statements
3. Propagation of GRANT statements
4. Propagation of CLUSTER statements
5. Propagation of ALTER DATABASE ... OWNER TO ...
6. Optimization for COPY when loading JSON to avoid double parsing of
   the JSON object (`citus.skip_jsonb_validation_in_copy`)
7. Support for row level security
8. Support for `pg_dist_authinfo`, which allows storing different
   authentication options for different users, e.g. you can store
   passwords or certificates here.
9. Support for `pg_dist_poolinfo`, which allows using connection poolers
   in between coordinator and workers
10. Tracking distributed query execution times using
   citus_stat_statements (`citus.stat_statements_max`,
   `citus.stat_statements_purge_interval`,
   `citus.stat_statements_track`). This is disabled by default.
11. Blocking tenant_isolation
12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`
2022-06-16 00:23:46 -07:00
Ahmet Gedemenli 268d3fa3a6
Fix materialized view intermediate result filename (#5982) 2022-06-14 15:07:08 +03:00
Halil Ozan Akgul b255706189 Fixes the bug where undistribute can drop Citus extension 2022-05-31 16:23:28 +03:00
Ahmet Gedemenli 26d927178c
Propagate dependent views upon distribution (#5950) 2022-05-26 14:23:45 +03:00
Onder Kalaci dd02e1755f Parallelize metadata syncing on node activate
It is often useful to be able to sync the metadata in parallel
across nodes.

Also citus_finalize_upgrade_to_citus11() uses
start_metadata_sync_to_primary_nodes() after this commit.

Note that this commit does not parallelize all pieces of node
activation or metadata syncing. Instead, it tries to parallelize
potenially large parts of metadata, which is the objects and
distributed tables (in general Citus tables).

In the future, it would be nice to sync the reference tables
in parallel across nodes.

Create ~720 distributed tables / ~23450 shards
```SQL
-- declaratively partitioned table
CREATE TABLE github_events_looooooooooooooong_name (
  event_id bigint,
  event_type text,
  event_public boolean,
  repo_id bigint,
  payload jsonb,
  repo jsonb,
  actor jsonb,
  org jsonb,
  created_at timestamp
) PARTITION BY RANGE (created_at);

SELECT create_time_partitions(
  table_name         := 'github_events_looooooooooooooong_name',
  partition_interval := '1 day',
  end_at             := now() + '24 months'
);

CREATE INDEX ON github_events_looooooooooooooong_name USING btree (event_id, event_type, event_public, repo_id);
SELECT create_distributed_table('github_events_looooooooooooooong_name', 'repo_id');

SET client_min_messages TO ERROR;

```

across 1 node: almost same as expected
```SQL

SELECT start_metadata_sync_to_primary_nodes();
Time: 15664.418 ms (00:15.664)

select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node;
Time: 14284.069 ms (00:14.284)
```

across 7 nodes: ~3.5x improvement
```SQL

SELECT start_metadata_sync_to_primary_nodes();
┌──────────────────────────────────────┐
│ start_metadata_sync_to_primary_nodes │
├──────────────────────────────────────┤
│ t                                    │
└──────────────────────────────────────┘
(1 row)

Time: 25711.192 ms (00:25.711)

-- across 7 nodes
select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node;
Time: 82126.075 ms (01:22.126)
```
2022-05-23 09:15:48 +02:00
Ying Xu a1151c2395
Clear metadatacache during abort for create extension (#5907)
* Bug fix for bug #5876. Memset MetadataCacheSystem every time there is an abort

* Created an ObjectAccessHook that saves the transactionlevel of when citus was created and will clear metadatacache if that transaction level is rolled back. Added additional tests to make sure metadatacache is cleared
2022-05-20 13:47:58 -07:00
Marco Slot 7abcfac61f Add caching for functions that check the backend type 2022-05-20 19:02:37 +02:00
Marco Slot 09ec366ff5 Improve nested execution checks and add GUC to disable 2022-05-20 18:55:43 +02:00
jeff-davis a9f8a60007
Columnar: support relation options with ALTER TABLE. (#5935)
Columnar: support relation options with ALTER TABLE.

Use ALTER TABLE ... SET/RESET to specify relation options rather than
alter_columnar_table_set() and alter_columnar_table_reset().

Not only is this more ergonomic, but it also allows better integration
because it can be treated like DDL on a regular table. For instance,
citus can use its own ProcessUtility_hook to distribute the new
settings to the shards.

DESCRIPTION: Columnar: support relation options with ALTER TABLE.
2022-05-20 08:35:00 -07:00
Marco Slot ad5214b50c Allow distributed execution from run_command_on_* functions 2022-05-20 15:26:47 +02:00
gledis69 4731630741 Add distributing lock command support 2022-05-20 12:28:07 +03:00
Onder Kalaci db998b3d66 Adds "sync" option to citus_disable_node() UDF
Before this commit, we had:
```SQL
SELECT citus_disable_node(nodename, nodeport, force boolean DEFAULT false)
```

Where, we allow forcing to disable first worker node with
`force:=true`. However, it entails the risk for losing
data / diverging placement data etc.

With `force` flag, we control disabling the first worker node,
and with `async` flag we control whether the changes are done
via bg worker or immediately.

```SQL
SELECT citus_disable_node(nodename, nodeport, force boolean DEFAULT false, sync boolean DEFAULT false)
```

Where we can achieve all the following:

| Mode  | Data loss possibility | Can run in 2PC | Handle multiple node failures | Immediately effective |
| --- |--- |--- |--- |--- |
| force:false, sync: false  | false   | true  | true  | false |
| force:false, sync: true   | false  | false | false | true |
| force:true, sync: false   | true   | true  | true   | false |
| force:true, sync: true    | false  | false | false  | true |
2022-05-18 17:21:12 +02:00
Nils Dijk b71a08955a
Refactor: reduce complexity and code duplication for Object Propagation
Over time we have added significantly improved the support for objects to be propagated by Citus as to make scaling out the database more seamless. It became evident that there was a lot of code duplication that got into the codebase to implement the propagation.

This PR tries to reduce the amount of repeated code that is at most only slightly different. To make things worse, most of the differences were actually oversights instead of correct.

This Patch introduces 3 reusable sets of pre/post processing steps for respectively
 - create
 - alter
 - drop

With the use of the common functionality we should have more coherent behaviour between different supported object by Citus.

Some steps either omit the Pre or Post processing step if they would not make sense to include.

All tests pass, only 1 test needed changing, foreign servers, as the dropping of foreign servers didn't implement support for dropping multiple foreign servers at once. Given the common approach correctly supports dropping of multiple objects, either distributed or not, the test that assumed it wouldn't work was now obsolete.
2022-05-18 15:58:28 +02:00
Onder Kalaci ee45e7bfbf Mark existing views as distributed when upgrade to 11.0+
We have a mechanism which ensures that newly distributed
objects are recorded during `alter extension citus update`.

However, the logic was lacking "view"s. With this commit, we make
sure that existing views are also marked as distributed during
upgrade.
2022-05-18 15:43:17 +02:00
Halil Ozan Akgul d171a736ab Revert "Creates new colocation for colocate_with:='none' too"
This reverts commit f74447b3b7.
2022-05-17 15:32:22 +03:00
Ahmet Gedemenli aa8f46ead0
Fix schema name bug for sequences (#5937) 2022-05-16 18:11:57 +03:00
Halil Ozan Akgul f74447b3b7 Creates new colocation for colocate_with:='none' too 2022-05-16 13:39:05 +03:00
Teja Mupparti e56fc34404 Fixes: #5787 In prepared statements, map any unused parameters
to a generic type.
2022-05-13 19:31:05 -07:00
Burak Velioglu 1875516ae9 Add ALTER VIEW support
Adds support for propagation ALTER VIEW commands to
- Change owner of view
- SET/RESET option
- Rename view and view's column name
- Change schema of the view

Since PG also supports targeting views with ALTER TABLE
commands, related code also added to direct such ALTER TABLE
commands to ALTER VIEW commands while sending them to workers.
2022-05-13 13:21:53 +03:00
Gledis Zeneli 4c6f62efc6
Switch to using LOCK instead of lock_relation_if_exists in TRUNCATE (#5930)
Breaking down #5899 into smaller PR-s

This particular PR changes the way TRUNCATE acquires distributed locks on the relations it is truncating to use the LOCK command instead of lock_relation_if_exists. This has the benefit of using pg's recursive locking logic it implements for the LOCK command instead of us having to resolve relation dependencies and lock them explicitly. While this does not directly affect truncate, it will allow us to generalize this locking logic to then log different relations where the pg recursive locking will become useful (e.g. locking views).

This implementation is a bit more complex that it needs to be due to pg not supporting locking foreign tables. We can however, still lock foreign tables with lock_relation_if_exists. So for a command:

TRUNCATE dist_table_1, dist_table_2, foreign_table_1, foreign_table_2, dist_table_3;

We generate and send the following command to all the workers in metadata:
```sql
SEL citus.enable_ddl_propagation TO FALSE;
LOCK dist_table_1, dist_table_2 IN ACCESS EXCLUSIVE MODE;
SELECT lock_relation_if_exists('foreign_table_1', 'ACCESS EXCLUSIVE');
SELECT lock_relation_if_exists('foreign_table_2', 'ACCESS EXCLUSIVE');
LOCK dist_table_3 IN ACCESS EXCLUSIVE MODE;
SEL citus.enable_ddl_propagation TO TRUE;
```

Note that we need to alternate between the lock command and lock_table_if_exists in order to preserve the TRUNCATE order of relations.
When pg supports locking foreign tables, we will be able to massive simplify this logic and send a single LOCK command.
2022-05-11 18:38:48 +03:00
Burak Velioglu 1460452442 Introduce CREATE/DROP VIEW
Adds support for propagating create/drop view commands and views to
worker node while scaling out the cluster. Since views are dropped while
converting the table type, metadata connection will be used while
propagating view commands to not switch to sequential mode.
2022-05-10 13:07:14 +03:00
Burak Velioglu 06a94d167e Use object address instead of relation id on DDLJob to decide on syncing metadata 2022-05-05 17:59:44 +03:00
Marco Slot ceb593c9da Convert citus.hide_shards_from_app_name_prefixes to citus.show_shards_for_app_name_prefixes 2022-05-03 14:22:13 +02:00
Nils Dijk 8897361f95
Implement DOMAIN propagation for citus 2022-04-08 15:25:39 +02:00
Onder Kalaci b0b91bab04 Rename metadata sync to node metadata sync where applicable 2022-04-07 17:51:31 +02:00
Marco Slot 9476f377b5 Remove old re-partitioning functions 2022-04-04 18:11:52 +02:00
Marco Slot 8c8c3b665d Add TABLESAMPLE support 2022-04-01 15:51:40 +02:00
jeff-davis c485a04139
Separate build of citus.so and citus_columnar.so. (#5805)
* Separate build of citus.so and citus_columnar.so.

Because columnar code is statically-linked to both modules, it doesn't
make sense to load them both at once.

A subsequent commit will make the modules entirely separate and allow
loading them both simultaneously.

Author: Yanwen Jin

* Separate citus and citus_columnar modules.

Now the modules are independent. Columnar can be loaded by itself, or
along with citus.

Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2022-03-31 19:47:17 -07:00
Jelte Fennema 3a44fa827a
Add versions of forboth that don't need ListCell (#5856)
We've had custom versions of Postgres its `foreach` macro which with a
hidden ListCell for quite some time now. People like these custom
macros, because they are easier to use and require less boilerplate.
This adds similar custom versions of Postgres its `forboth` macro. Now
you don't need ListCells anymore when looping over two lists at the same
time.
2022-03-23 14:50:36 +03:00
Onder Kalaci af4ba3eb1f Remove citus.enable_cte_inlining GUC
In Postgres 12+, users can adjust whether to inline/not inline CTEs
by [NOT] MATERIALIZED keywords. So, this GUC is already useless.
2022-03-22 17:14:44 +01:00
Ahmet Gedemenli 46c6630328
Qualify CREATE AGGREGATE stmts in Preprocess (#5834) 2022-03-21 13:55:09 +03:00
Onder Kalaci 953951007c Move wait event error checks to connection manager 2022-03-14 14:35:56 +01:00
Onder Kalaci db529facab Only change the sequence types if the target column type is a supported sequence type
Before this commit, we erroneously converted the sequence
type to the column's type it is used. However, it is possible
that the sequence is used in an expression which then converted
to a type that cannot be a sequence, such as text.

With this commit, we only try this conversion if the column
type is a supported sequence type (e.g., smallint, int and bigint).

Note that we do this conversion because if the column type is a
bigint and the sequence is NOT a bigint, users would be in trouble
because sequences would generate values that are out of the range
of the column. (The other ways are already not supported such as
the column is int and the sequence is bigint would fail on the worker.)

In other words, with this commit, we scope this optimization only
when the target column type is a supported sequence type. Otherwise,
we let users to more freely use the sequences.
2022-03-11 16:06:00 +01:00
Hanefi Onaldi b0eb685101
Add support for TEXT SEARCH DICTIONARY objects
TEXT SEARCH DICTIONARY objects depend on TEXT SEARCH TEMPLATE objects.
Since we do not yet support distributed TS TEMPLATE objects, we skip
dependency checks for text search templates, similar to what we do for
roles.

The user is expected to manually create the TEXT SEARCH TEMPLATE objects
before a) adding new nodes, b) creating TEXT SEARCH DICTIONARY objects.
2022-03-11 03:40:20 +03:00
Gledis Zeneli 2cb02bfb56
Fix node adding itself with citus_add_node leading to deadlock (Fix #5720) (#5758)
If a worker node is being added, a command is sent to get the server_id of the worker from the pg_dist_node_metadata table. If the worker's id is the same as the node executing the code, we will know the node is trying to add itself. If the node tries to add itself without specifying `groupid:=0` the operation will result in an error.
2022-03-10 17:46:33 +03:00
Burak Velioglu 547f6b18ef
Ensure dependencies exists for all alter owner commands 2022-03-10 16:37:55 +03:00
Marco Slot 8e43c8094d Fix CREATE EXTENSION propagation with custom version 2022-03-09 17:40:50 +01:00
Burak Velioglu bbe1b16125
Check whether the object has unsupported or circular dependency 2022-03-09 16:37:53 +03:00
Onder Kalaci c32b2de1a7 Improve citus_lock_waits
1) Remove useless columns
2) Show backends that are blocked on a DDL even before
   gpid is assigned
3) One minor bugfix, where we clear distributedCommandOriginator
   properly.
2022-03-07 11:10:44 +01:00
Nils Dijk 3801576dfb
Move pg_dist_object to pg_catalog (#5765)
DESCRIPTION: Move pg_dist_object to pg_catalog

Historically `pg_dist_object` had been created in the `citus` schema as an experiment to understand if we could move our catalog tables to a branded schema. We quickly realised that this interfered with the UX on our managed services and other environments, where users connected via a user with the name of `citus`.

By default postgres put the username on the search_path. To be able to read the catalog in the `citus` schema we would need to grant access permissions to the schema. This caused newly created objects like tables etc, to default to this schema for creation. This failed due to the write permissions to that schema.

With this change we move the `pg_dist_object` catalog table to the `pg_catalog` schema, where our other schema's are also located. This makes the catalog table visible and readable by any user, like our other catalog tables, for debugging purposes.

Note: due to the change of schema, we had to disable 1 test that was running into a discrepancy between the schema and binary. Secondly, we needed to make the lookup functions for the `pg_dist_object` relation and their indexes less strict on the fallback of the naming due to an other test that, due to an unfortunate cache invalidation, needed to lookup the relation again. This makes that we won't default to _only_ resolving from `pg_catalog` outside of upgrades.
2022-03-04 17:40:38 +00:00
Burak Velioglu cb6d67a9a9
Make sure that all dependencies of citus tables can be distributed 2022-03-03 20:08:09 +03:00
Marco Slot 3ba61244b8 Synchronize pg_dist_colocation metadata 2022-03-03 11:01:59 +01:00
Marco Slot 43e4dd3808 Add a citus.internal_reserved_connections setting 2022-03-02 19:13:53 +01:00
Onder Kalaci 35ec9721b4 Add a new API for enabling Citus MX for clusters upgrading from earlier versions
Clusters created pre-Citus 11 mostly didn't have metadata sync enabled.
For those clusters, we add a utility UDF which fixes some minor issues
and sync the necessary objects to the workers.
2022-03-02 17:02:55 +01:00
Ahmet Gedemenli e1809af376 Propagate CREATE AGGREGATE commands 2022-03-02 10:52:43 +03:00
Nils Dijk 65bd540943
Feature: configure object propagation behaviour in transactions (#5724)
DESCRIPTION: Add GUC to control ddl creation behaviour in transactions

Historically we would _not_ propagate objects when we are in a transaction block. Creation of distributed tables would not always work in sequential mode, hence objects created in the same transaction as distributing a table that would use the just created object wouldn't work. The benefit was that the user could still benefit from parallelism.

Now that the creation of distributed tables is supported in sequential mode it would make sense for users to force transactional consistency of ddl commands for distributed tables. A transaction could switch more aggressively to sequential mode when creating new objects in a transaction.

We don't change the default behaviour just yet.

Also, many objects would not even propagate their creation when the transaction was already set to sequential, leaving the probability of a self deadlock. The new policy checks solve this discrepancy between objects as well.
2022-03-01 17:29:31 +03:00
Burak Velioglu f17872aed4
Expand functions while resolving dependencies 2022-03-01 17:08:46 +03:00
Gledis Zeneli b825232ecb
Handle rebalance / replication when a node is disabled (Fix #5664) (#5729)
The issue in question is caused when rebalance / replication call `FullShardPlacementList` which returns all shard placements (including those in disabled nodes with `citus_disable_node`).  Eventually, `FindFillStateForPlacement` looks for the state across active workers and fails to find a state for the placements which are in the disabled workers causing a seg fault shortly after.

Approach:
* `ActivePlacementHash` was not using the status of the shard placement's node to determine if the node it is active. Initially, I just fixed that.
* Additionally, I refactored the code which handles active shards in replication / rebalance to:
	* use a single function to determine if a shard placement is active. 
	* do the shard active shard filtering before calling `RebalancePlacementUpdates` and `ReplicationPlacementUpdates`, so test methods like `shard_placement_rebalance_array` and `shard_placement_replication_array` which have different shard placement active requirements can do their own filtering while using the same rebalance / replicate logic that `rebalance_table_shards` and `replicate_table_shards` use. 

Fix #5664
2022-02-25 19:54:30 +03:00
Onder Kalaci df95d59e33 Drop support for CitusInitiatedBackend
CitusInitiatedBackend was a pre-mature implemenation of the whole
GlobalPID infrastructure. We used it to track whether any individual
query is triggered by Citus or not.

As of now, after GlobalPID is already in place, we don't need
CitusInitiatedBackend, in fact it could even be wrong.
2022-02-24 12:12:43 +01:00
Marco Slot 490765a754 Enable re-partition joins after local execution 2022-02-23 19:40:21 +01:00
Marco Slot 72d8fde28b Use intermediate results for re-partition joins 2022-02-23 19:40:21 +01:00
Teja Mupparti a62901396b Allow unsafe triggers via a GUC 2022-02-21 22:45:17 -08:00
Onder Kalaci 95d5918967 Properly set worker_query and use 2022-02-21 18:22:33 +01:00
Onder Kalaci 331af3dce8 Dumping wait edges becomes optionally scan all backends
Before this commit, dumping wait edges can only be used for
distributed deadlock detection purposes. With this commit,
we open the possibility that we can use it for any backend.
2022-02-21 17:37:07 +01:00
Halil Ozan Akgul f6cd4d0f07 Overrides pg_cancel_backend and pg_terminate_backend to accept global pid 2022-02-21 16:41:35 +03:00
Ahmet Gedemenli 2bc6a00408 Refactor CreateDistributedTable to take column name 2022-02-21 12:07:17 +03:00
Burak Velioglu fa6866ed36
Start to propagate functions to worker nodes with
CREATE FUNCTION command together with it's dependencies.

If the function depends on any nondistributable object,
function will be created only locally. Parameterless
version of create_distributed_function becomes obsolete
with this change, it will deprecated from the code with a subsequent PR.
2022-02-18 13:56:51 +03:00
gledis69 a14fada153 Prevent Deadlocks When a Worker Tries to Create Collation (Fix #5583)
* When a worker tried to create a collation which had a dependency in the same worker node,
it would cause a deadlock, now it throws the correct "not a coordinator" error.
2022-02-18 12:28:02 +03:00
Teja Mupparti 46fa47beea Force-delegated functions' distribution argument must be reset as soon as the routine completes execution,
and not wait until the top level Executor ends. This fixes issue #5687
2022-02-17 10:48:30 -08:00
Nils Dijk ea86f9f94e
Add support for TEXT SEARCH CONFIGURATION objects (#5685)
DESCRIPTION: Implement TEXT SEARCH CONFIGURATION propagation

The change adds support to Citus for propagating TEXT SEARCH CONFIGURATION objects. TSConfig objects cannot always be created in one create statement, and instead require a create statement followed by many alter statements to get turned into the object they should represent.

To support this we add functionality to the worker to create or replace objects based on a list of statements. When the lists of the local object and the remote object correspond 1:1 we skip the creation of the object and simply mark it distributed. This is especially important for TSConfig objects as initdb pre-populates databases with a dozen configurations (for many different languages).

When the user creates a new TSConfig based on the copy of an existing configuration there is no direct link to the object copied from. Since there is no link we can't simply rely on propagating the dependencies to the worker and send a qualified
2022-02-17 13:12:46 +01:00
Onder Kalaci abd5b1c506 Prevent any monitoring view/udf to show already exited backends
The low-level StoreAllActiveTransactions() function filters out
backends that exited.

Before this commit, if you run a pgbench, after that you'd still
see the backends show up:
```SQL
 select count(*) from get_global_active_transactions();
┌───────┐
│ count │
├───────┤
│   538 │
└───────┘
```

After this patch, only active backends show-up:

```SQL
 select count(*) from get_global_active_transactions();
┌───────┐
│ count │
├───────┤
│    72 │
└───────┘
```
2022-02-14 17:34:32 +01:00
Ahmet Gedemenli 0411a98c99
Refactor EnsureSequentialMode functions (#5704) 2022-02-14 18:38:21 +03:00
Ahmet Gedemenli 76b63a307b Propagate create/drop schema commands 2022-02-10 14:58:09 +03:00
Halil Ozan Akgul 8ee02b29d0 Introduce global PID 2022-02-08 16:49:38 +03:00
Burak Velioglu 8ae7577581
Use superuser connection while syncing dependent objects' pg_dist_object tuples 2022-02-07 17:50:45 +03:00
Marco Slot 872f0a79db Remove random shard placement policy 2022-02-06 21:55:58 +01:00
Marco Slot 0cae8e7d6b Remove local-node-first shard placement 2022-02-06 21:36:34 +01:00
Onder Kalaci ff234fbfd2 Unify old GUCs into a single one
Replaces citus.enable_object_propagation with citus.enable_metadata_sync

Also, within Citus 11 release cycle, we added citus.enable_metadata_sync_by_default,
that is also replaced with citus.enable_metadata_sync.

In essence, when citus.enable_metadata_sync is set to true, all the objects
and the metadata is send to the remote node.

We strongly advice that the users never changes the value of
this GUC.
2022-02-04 10:52:56 +01:00
Teja Mupparti f31bce5b48 Fixes the issue seen in https://github.com/citusdata/citus-enterprise/issues/745
With this commit, rebalancer backends are identified by application_name = citus_rebalancer
and the regular internal backends are identified by application_name = citus_internal
2022-02-03 09:40:46 -08:00
Burak Velioglu f88cc230bf
Handle tables and objects as metadata. Update UDFs accordingly
With this commit we've started to propagate sequences and shell
tables within the object dependency resolution. So, ensuring any
dependencies for any object will consider shell tables and sequences
as well. Separate logics for both shell tables and sequences have
been removed.

Since both shell tables and sequences logic were implemented as a
part of the metadata handling before that logic, we were propagating
them while syncing table metadata. With this commit we've divided
metadata (which means anything except shards thereafter) syncing
logic into multiple parts and implemented it either as a part of
ActivateNode. You can check the functions called in ActivateNode
to check definition of different metadata.

Definitions of start_metadata_sync_to_node and citus_activate_node
have also been updated. citus_activate_node will basically create
an active node with all metadata and reference table shards.
start_metadata_sync_to_node will be same with citus_activate_node
except replicating reference tables. stop_metadata_sync_to_node
will remove all the metadata. All of those UDFs need to be called
by superuser.
2022-01-31 16:20:15 +03:00
Onur Tirtir 4dc38e9e3d
Use EnsureCompatibleLocalExecutionState instead (#5640) 2022-01-21 15:37:59 +01:00
Onur Tirtir 181111b84f Drop ruleutils copied for statistics 2022-01-20 17:28:19 +03:00
Onur Tirtir 7b59295af2 Drop ruleutils copied for triggers 2022-01-20 17:28:19 +03:00
Teja Mupparti 54862f8c22 (1) Functions will be delegated even when present in the scope of an explicit
BEGIN/COMMIT transaction block or in a UDF calling another UDF.
(2) Prohibit/Limit the delegated function not to do a 2PC (or any work on a
remote connection).
(3) Have a safety net to ensure the (2) i.e. we should block the connections
from the delegated procedure or make sure that no 2PC happens on the node.
(4) Such delegated functions are restricted to use only the distributed argument
value.

Note: To limit the scope of the project we are considering only Functions(not
procedures) for the initial work.

DESCRIPTION: Introduce a new flag "force_delegation" in create_distributed_function(),
which will allow a function to be delegated in an explicit transaction block.

Fixes #3265

Once the function is delegated to the worker, on that node during the planning

distributed_planner()
TryToDelegateFunctionCall()
CheckDelegatedFunctionExecution()
EnableInForceDelegatedFuncExecution()
Save the distribution argument (Constant)
ExecutorStart()
CitusBeginScan()
IsShardKeyValueAllowed()
Ensure to not use non-distribution argument.

ExecutorRun()
AdaptiveExecutor()
StartDistributedExecution()
EnsureNoRemoteExecutionFromWorkers()
Ensure all the shards are local to the node in the remoteTaskList.
NonPushableInsertSelectExecScan()
InitializeCopyShardState()
EnsureNoRemoteExecutionFromWorkers()
Ensure all the shards are local to the node in the placementList.

This also fixes a minor issue: Properly handle expressions+parameters in distribution arguments
2022-01-19 16:43:33 -08:00
Marco Slot 33bfa0b191 Hide shards from application_name's with a specific prefix 2022-01-18 15:20:55 +04:00
Onur Tirtir 3cc44ed8b3
Tell other backends it's safe to ignore the backend that concurrently built the shell table index (#5520)
In addition to starting a new transaction, we also need to tell other
backends --including the ones spawned for connections opened to
localhost to build indexes on shards of this relation-- that concurrent
index builds can safely ignore us.

Normally, DefineIndex() only does that if index doesn't have any
predicates (i.e.: where clause) and no index expressions at all.
However, now that we already called standard process utility, index
build on the shell table is finished anyway.

The reason behind doing so is that we cannot guarantee not grabbing any
snapshots via adaptive executor, and the backends creating indexes on
local shards (if any) might block on waiting for current xact of the
current backend to finish, which would cause self deadlocks that are not
detectable.
2022-01-10 10:23:09 +03:00
Marco Slot ee3b50b026 Disallow remote execution from queries on shards 2022-01-07 17:46:21 +01:00
Onder Kalaci 9f2d9e1487 Move placement deletion from disable node to activate node
We prefer the background daemon to only sync node metadata. That's
why we move placement metadata changes from disable node to
activate node. With that, we can make sure that disable node
only changes node metadata, whereas activate node syncs all
the metadata changes. In essence, we already expect all
nodes to be up when a node is activated. So, this does not change
the behavior much.
2022-01-07 09:56:03 +01:00
Ahmet Gedemenli 45e423136c
Support foreign tables in MX (#5461) 2022-01-06 18:50:34 +03:00
Önder Kalacı 5305aa4246
Do not drop sequences when dropping metadata (#5584)
Dropping sequences means we need to recreate
and hence losing the sequence.

With this commit, we keep the existing sequences
such that resyncing wouldn't drop the sequence.

We do that by breaking the dependency of the sequence
from the table.
2022-01-06 09:48:34 +01:00
jeff-davis 2e03efd91e
Columnar: move DDL hooks to citus to remove dependency. (#5547)
Add a new hook ColumnarTableSetOptions_hook so that citus can get
control when the columnar table options change.
2022-01-04 23:26:46 -08:00
jeff-davis c9292cfad1
Make pg_version_compat.h and listutils.c dependency-free. (#5548)
Split distributed/version_compat.h into dependency-free
pg_version_compat.h, and the original which still has
dependencies. The original doesn't have much purpose, but until other
files have better discipline about including the correct header files,
then it's still needed.

Also make distributed/listutils.h dependency-free. Should be moved
outside of 'distributed' subdirectory, but that will cause significant
code churn, so leave for another cleanup patch.

Now both files can be included in columnar without creating a
dependency on citus.
2022-01-04 23:02:08 -08:00
Ahmet Gedemenli 042d45b263 Propagate foreign server ops 2021-12-23 17:54:04 +03:00
Onder Kalaci fc98f83af2 Add citus.grep_remote_commands
Simply applies

```SQL
SELECT textlike(command, citus.grep_remote_commands)
```
And, if returns true, the command is logged. Else, the log is ignored.

When citus.grep_remote_commands is empty string, all commands are
logged.
2021-12-17 11:47:40 +01:00
Burak Velioglu ed8e32de5e
Sync pg_dist_object on an update and propagate while syncing to a new node
Before that PR we were updating citus.pg_dist_object metadata, which keeps
the metadata related to objects on Citus, only on the coordinator node. In
order to allow using those object from worker nodes (or erroring out with
proper error message) we've started to propagate that metedata to worker
nodes as well.
2021-12-06 19:25:50 +03:00
Onder Kalaci 549edcabb6 Allow disabling node(s) when multiple failures happen
As of master branch, Citus does all the modifications to replicated tables
(e.g., reference tables and distributed tables with replication factor > 1),
via 2PC and avoids any shardstate=3. As a side-effect of those changes,
handling node failures for replicated tables change.

With this PR, when one (or multiple) node failures happen, the users would
see query errors on modifications. If the problem is intermitant, that's OK,
once the node failure(s) recover by themselves, the modification queries would
succeed. If the node failure(s) are permenant, the users should call
`SELECT citus_disable_node(...)` to disable the node. As soon as the node is
disabled, modification would start to succeed. However, now the old node gets
behind. It means that, when the node is up again, the placements should be
re-created on the node. First, use `SELECT citus_activate_node()`. Then, use
`SELECT replicate_table_shards(...)` to replicate the missing placements on
the re-activated node.
2021-12-01 10:19:48 +01:00
Onder Kalaci d405993b57 Make sure to use a dedicated metadata connection
With this commit, we make sure to use a dedicated connection per
node for all the metadata operations within the same transaction.

This is needed because the same metadata (e.g., metadata includes
the distributed table on the workers) can be modified accross
multiple connections.

With this connection we guarantee that there is a single metadata connection.
But note that this connection can be used for any other operation.
In other words, this connection is not only reserved for metadata
operations.
2021-11-26 14:36:28 +01:00
Onder Kalaci 38b08ebde9 Generalize the error checks while removing node
The checks for preventing to remove a node are very much reference
table centric. We are soon going to add the same checks for replicated
tables. So, make the checks generic such that:
	 (a) replicated tables fit naturally
	 (b) we can the same checks in `citus_disable_node`.
2021-11-26 14:25:29 +01:00
Marco Slot 56eae48daf Stop updating shard range in citus_update_shard_statistics 2021-11-19 10:51:15 +01:00
Hanefi Onaldi c0d43d4905
Prevent cache usage on citus_drop_trigger codepaths 2021-11-18 20:24:51 +03:00
Marco Slot 9e6ca23286 Remove cstore_fdw-related logic 2021-11-16 13:59:03 +01:00
Önder Kalacı 8c0bc94b51
Enable replication factor > 1 in metadata syncing (#5392)
- [x] Add some more regression test coverage
- [x] Make sure returning works fine in case of
     local execution + remote execution
     (task->partiallyLocalOrRemote works as expected, already added tests)
- [x] Implement locking properly (and add isolation tests)
     - [x] We do #shardcount round-trips on `SerializeNonCommutativeWrites`.
           We made it a single round-trip.
- [x] Acquire locks for subselects on the workers & add isolation tests
- [x] Add a GUC to prevent modification from the workers, hence increase the
      coordinator-only throughput
       - The performance slightly drops (~%15), unless
         `citus.allow_modifications_from_workers_to_replicated_tables`
         is set to false
2021-11-15 15:10:18 +03:00
Ahmet Gedemenli 14a33d4e8e Introduce GUC citus.use_citus_managed_tables 2021-11-11 14:09:06 +03:00
Onder Kalaci d5e89b1132 Unify distributed execution logic for single replicated tables
Citus does not acquire any executor locks for shard replication == 1.
With this commit, we unify this decision and exit early.
2021-11-08 13:52:20 +01:00
Önder Kalacı d5b371b2e0
Merge branch 'master' into naisila/fix-partitioned-index 2021-11-08 10:53:16 +01:00
naisila 385ba94d15 Run fix_partition_shard_index_names after each wrong naming command 2021-11-08 10:43:34 +01:00
Marco Slot 78866df13c Remove master_append_table_to_shard UDF 2021-11-08 10:43:24 +01:00
Marco Slot fba93df4b0 Remove copy into new append shard logic 2021-11-07 21:01:40 +01:00
Sait Talha Nisanci ab29c25658 Fix missing from entry 2021-11-04 18:54:52 +03:00
naisila 796d56a7b1 Rename ddlJob->commandString to ddlJob->metadataSyncCommand 2021-10-29 23:45:43 +03:00
Ahmet Gedemenli 67dca4363d
Dont auto-undistribute user-added citus local tables (#5314)
* Disable auto-undistribute for user-added citus local tables
2021-10-28 12:10:26 +03:00
Philip Dubé cc50682158 Fix typos. Spurred spotting "connectios" in logs 2021-10-25 13:54:09 +00:00
Onder Kalaci 575bb6dde9 Drop support for Inactive Shard placements
Given that we do all operations via 2PC, there is no way
for any placement to be marked as INACTIVE.
2021-10-22 18:03:35 +02:00
Önder Kalacı b3299de81c
Drop support for citus.multi_shard_commit_protocol (#5380)
In the past, we allowed users to manually switch to 1PC
(e.g., one phase commit). However, with this commit, we
don't. All multi-shard modifications are done via 2PC.
2021-10-21 14:01:28 +02:00
Önder Kalacı 3f726c72e0
When replication factor > 1, all modifications are done via 2PC (#5379)
With Citus 9.0, we introduced `citus.single_shard_commit_protocol` which
defaults to 2PC.

With this commit, we prevent any user to set it to 1PC and drop support
for `citus.single_shard_commit_protocol`.

Although this might add some overhead for users, it is already the default
behaviour (so less likely) and marking placements as INVALID is much
worse.
2021-10-20 01:39:03 -07:00
Marco Slot 096660d61d Remove master_apply_delete_command 2021-10-18 22:29:37 +02:00
Önder Kalacı 31c8f279ac
Add helper UDFs to inspect object dependencies (#5293)
- citus_get_all_dependencies_for_object: emulate what Citus
                                         would qualify as
					 dependency when adding
					 a new node
- citus_get_dependencies_for_object: emulate what Citus would qualify
				     as dependency when creating an
				     object

Example use:
```SQL
-- find all the depedencies of table test
SELECT
	pg_identify_object(t.classid, t.objid, t.objsubid)
FROM
	(SELECT * FROM pg_get_object_address('table', '{test}', '{}')) as addr
JOIN LATERAL
	citus_get_all_dependencies_for_object(addr.classid, addr.objid, addr.objsubid) as t(classid oid, objid oid, objsubid int)
ON TRUE
	ORDER BY 1;
```
2021-10-18 14:46:49 +03:00
Teja Mupparti a8348047c5
Pushdown procedures with OUT parameters (#5348) 2021-10-11 23:14:36 -07:00
Onur Tirtir f7f4a93073 Remove get_relation_trigger_oid_compat 2021-10-11 11:53:00 +03:00
Ahmet Gedemenli d19793c174 Add partitioning support for citus local tables
Add/fix tests

Fix creating partitions

Add test for mx - partition creating case

Enable cascading to partitioned tables

Fix mx partition adding test

Fix cascading through fkeys

Style

Disable converting with non-inherited fkeys

Fix detach bug

Early return in case of cascade & Add tests

Style

Fix undistribute_table bug & Fix test outputs

Remove RemovePartitionRelationIds

Test with undistribute_table

Add test for mx+convert+undistribute

Remove redundant usage of CreatePartitionedCitusLocalTable

Add some comments

Introduce bulk functions for generating attach/detach partition commands

Fix: Convert partitioned tables after adding fkey

Change the error message for partitions

Introduce function ErrorIfPartitionTableAddedToMetadata

Polish attach/detach command generation functions

Use time_partitions for testing

Move mx tests to citus_local_tables_mx

Add new partitioned table to cascade test

Add test with time series management UDFs

Fix test output

Fix: Assertion fail on relation access tracking

Style

Refactor creating partitioned citus local tables

Remove CreatePartitionedCitusLocalTable

Style

Error out if converting multi-level table

Revert some old tests

Error out adding partitioned partition

Polish

Polish/address

Fix create table partition of case

Use CascadeOperationForRelationIdList if no cascade needed

Fix create partition bug

Revert / Add new tests to mx

Style

Fix dropping fkey bug

Add test with IF NOT EXISTS

Convert to CLT when doing ATTACH PARTITION

Add comments

Add more tests with time series management

Edit the error message for converting the child

Use OR instead of AND in ErrorIfUnsupportedAlterTableStmt

Edit/improve tests

Disable ddl prop when dropping default column definitions

Disable/enable ddl prop just before/after the command

Add comment

Add sequence test

Add trigger test

Remove NeedCascadeViaForeignKeys

Add one more insert to sequence test

Add comment

Style

Fix test output shard ids

Update comments

Disable creating fkey on partitions

Move partition check to CreateCitusLocalTable

Add comment

Add check for  attachingmulti-level  partition

Add test for pg_constraint

Check pg_dist_partition in tests

Add test inserting on the worker
2021-10-11 10:45:07 +03:00
Halil Ozan Akgul 9c9d4b5eeb Turn MX on by default 2021-10-08 18:17:21 +03:00
Naisila Puka a69abe3be0
Fixes bug about int and smallint sequences on MX (#5254)
* Introduce worker_nextval udf for int&smallint column defaults

* Fix current tests and add new ones for worker_nextval
2021-09-09 23:41:07 +03:00
Marco Slot 4faa49775b Perform copy command as regular user in worker_append_table_to_shard 2021-09-09 11:00:29 +02:00
Sait Talha Nisanci 3ad3bbba84 Apply latest version compat without conflicts 2021-09-03 16:09:59 +03:00
Halil Ozan Akgul 7823e49219 Introduces pg_get_statisticsobj_worker_compat macro
Relevant PG commit:
a4d75c86bf15220df22de0a92c819ecef9db3849
2021-09-03 15:41:28 +03:00
Halil Ozan Akgul f16d5e1833 Introduces make_simple_restrictinfo_compat and pull_varnos_compat macros
make_simple_restrictinfo and pull_varnos functions now have a new parameter
These new macros give us the ability to use this new parameter for PG14 and they don't give the parameter for previous versions

Relevant PG commit:
55dc86eca70b1dc18a79c141b3567efed910329d
2021-09-03 15:41:28 +03:00
Sait Talha Nisanci a1bfb4f31b Fix unlimited copy size variable's value 2021-09-03 15:41:28 +03:00
Halil Ozan Akgul cb3b76ed24 Introduces get_partition_parent_compat and RelationGetPartitionDesc_compat macros
get_partition_parent and RelationGetPartitionDesc functions now have new parameters to also include detached partitions
Thess new macros give us the ability to use these new parameter for PG14 and they don't give the parameters for previous versions
Existing parameters are set to not accept detached partitions

Relevant PG commit:
71f4c8c6f74ba021e55d35b1128d22fb8c6e1629
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 898d3bb8d3 Introduces proc_statusflags_compat macro
In two commits vacuumFlags in PGXACT is moved and then renamed to status flags
This macro uses the appropriate version of the flag

Relevant PG commits:
5788e258bb26495fab65ff3aa486268d1c50b123
cd9c1b3e197a9b53b840dcc87eb41b04d601a5f9
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 287706b717 Introduces SetTuplestoreDestReceiverParams_compat macro
SetTuplestoreDestReceiverParams function now has two new parameters
This new macro give us the ability to use this new parameter for PG14 and it doesn't give the parameter for previous versions
Existing parameters are set to NULL to keep previous behavior

Relevant PG commit:
2f48ede080f42b97b594fb14102c82ca1001b80c
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul c3f0528607 Extends statistics on expressions in ruleutils_14.c
Relevant PG commit:
a4d75c86bf15220df22de0a92c819ecef9db3849
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 1d5053b652 Removes support for old protocols in Copy functions from PG14
Some Copy related functions copied from Postgres had support for both old and new protocols
Postgres removed support for old version so we remove it too

Relevant PG commit:
3174d69fb96a66173224e60ec7053b988d5ed4d9
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 82858ca8fe Introduces ProcessUtility macros for readOnlyTree parameter
New macros: standard_ProcessUtility_compat, ProcessUtility_compat, ColumnarProcessUtility_compat, PrevProcessUtilityHook_compat

The functions now have a new bool parameter: readOnlyTree
These new macros give us the ability to use this new parameter for PG14 and it doesn't give the parameter for previous versions

In multi_ProcessUtility and ColumnarProcessUtility, before doing anything else, we check if readOnlyTree parameter is true and create a copy of pstmt
Existing readOnlyTree parameters are set to false since we already handle the read only case at multi_ProcessUtility and ColumnarProcessUtility

Relevant PG commit:
7c337b6b527b7052e6a751f966d5734c56f668b5
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul db2d9af863 Introduces BeginCopyFrom_compat macro
BeginCopyFrom function now has a new whereClause parameter.
In the function this parameter is assigned to the whereClause field of the CopyFromState returned
Currently in Postgres there is only one place where this argument isn't NULL, and in previous PG version the whereClause argument of copy state is set right after the function call
Since we don't have such example all current whereClause parameters are set to NULL

Relevant PG commit:
c532d15dddff14b01fe9ef1d465013cb8ef186df
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 35cfa5d7b9 Introduces CopyFromState_compat macro
CopyState struct is divided into parts and one of them is CopyFromState
This macro uses the appropriate one for PG versions

Relevant PG commit:
c532d15dddff14b01fe9ef1d465013cb8ef186df
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 8f34f84ce6 Introduces IsReindexWithParam_compat macro
In ReindexStmt concurrent field is moved to options and then options are converted to params list.
This macro uses previous fields for previous versions and the new params list with a new function named IsReindexWithParam for PG14

Relevant PG commits:
844c05abc3f1c1703bf17cf44ab66351ed9711d2
b5913f6120792465f4394b93c15c2e2ac0c08376
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 37ae22ce3e Introduces macros for vacuum options
VacOptTernaryValue enum is renamed to VacOptValue.
In the enum there were three values, VACOPT_TERNARY_DEFAULT, VACOPT_TERNARY_DISABLED, and VACOPT_TERNARY_ENABLED
Now there are four values VACOPTVALUE_UNSPECIFIED, VACOPTVALUE_AUTO, VACOPTVALUE_DISABLED, and VACOPTVALUE_ENABLED

New macros are VacOptValue_compat, VACOPTVALUE_UNSPECIFIED_COMPAT, VACOPTVALUE_DISABLED_COMPAT, and VACOPTVALUE_ENABLED_COMPAT
The VACOPTVALUE_UNSPECIFIED_COMPAT matches VACOPT_TERNARY_DEFAULT and VACOPTVALUE_UNSPECIFIED. And there are no macro for VACOPTVALUE_AUTO.

Relevant PG commit:
3499df0dee8c4ea51d264a674df5b5e31991319a
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul ebf1b7e23f Introduces macros for functions that now have include_out_arguments argument
New macros: FuncnameGetCandidates_compat and expand_function_arguments_compat

The functions (the ones without _compat) now have a new bool include_out_arguments parameter
These new macros give us the ability to use this new parameter for PG14 and it doesn't give the parameter for previous versions
Existing include_out_arguments parameters are set to 'false' to keep current behavior

Relevant PG commit:
e56bce5d43789cce95d099554ae9593ada92b3b7
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 347ae2928f Introduces stats_compat macro for MemoryContextMethods->stats
stats function now have a new bool print_to_stderr parameter
This new macro gives us the ability to use this new parameter for PG14 and it doesn't give the parameter for previous versions
Existing print_to_stderr parameter is set to true to keep current behavior

Relevant PG commit:
43620e328617c1f41a2a54c8cee01723064e3ffa
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 54ee93885a Introduces getObjectTypeDescription_compat and getObjectIdentity_compat macros
getObjectTypeDescription and getObjectIdentity functions now have a new bool missing_ok parameter
These new macros give us the ability to use this new parameter for PG14 and they don't give the parameter for previous versions
Currently all missing_ok parameters are set to false to keep current behavior

Relevant PG commit:
2a10fdc4307a667883f7a3369cb93a721ade9680
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul f8d3e50f25 Introduces STATUS_WAITING_COMPAT macro
The STATUS_WAITING define is removed and an enum with PROC_WAIT_STATUS_WAITING is added instead
This macro uses appropriate one

Relevant PG commit:
a513f1dfbf2c29a51b0f7cbd5913ce2d2ee452c5
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 3c10e0f568 Introduces ROLE_MONITOR_COMPAT macro
DEFAULT_ROLE_MONITOR is renamed to ROLE_PG_MONITOR
This macro uses appropriate one

Relevant PG commit:
c9c41c7a337d3e2deb0b2a193e9ecfb865d8f52b
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul b790ecf180 Introduces F_NEXTVAL_COMPAT macro
Name of F_NEXTVAL_OID is changed to F_NEXTVAL

Relevant PG commit:
8e1f37c07aafd4bb7aa6e1e1982010af11f8b5c7
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 63cdb4b70a Adds AlterTableStmtObjType macro
AlterTableStmt's relkind field is changed into objtype
New AlterTableStmtObjType macro uses the appropriate one

Relevant PG commit:
cc35d8933a211d9965eb1c1d2749a903d5735db2
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 1b6c8348fb Adds PG14 to version_compat.h and columnar_version_compat.h files 2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 7a27d7cee3 Adds copy of ruleutils_13.c as ruleutils_14.c 2021-09-03 15:27:24 +03:00
Naisila Puka 4fb05efabb
Distributes partition-to-be table before ProcessUtility (#5191)
* Skip ALTER TABLE constraint checks while planning

* Revert previous commit's solution, keep tests

* Distribute partition-to-be table before ProcessUtility

* Acquire locks in PreprocessAlterTableStmtAttachPartition
2021-09-02 13:07:42 +03:00
Hanefi Onaldi 7e39c7ea83
Replace master with citus in logs and comments (#5210)
I replaced 

- master_add_node,
- master_add_inactive_node
- master_activate_node

with

- citus_add_node,
- citus_add_inactive_node
- citus_activate_node

respectively.
2021-08-26 11:31:17 +03:00
Onur Tirtir 4b03195c06 Use RelationGetStatExtList instead of GetExplicitStatisticsIdList 2021-08-18 17:50:57 +03:00
Ahmet Gedemenli 9e90894f21
Synchronize hasmetadata flag on mx workers (#5086)
* Synchronize hasmetadata flag on mx workers

* Switch to sequential execution

* Add test

* Use SetWorkerColumn

* Add test for stop_sync

* Remove usage of UpdateHasmetadataOnWorkersWithMetadata

* Remove MarkNodeMetadataSynced

* Fix test for metadatasynced

* Remove MarkNodeMetadataSynced

* Style

* Remove MarkNodeHasMetadata

* Remove UpdateDistNodeBoolAttr

* Refactor SetWorkerColumn

* Use SetWorkerColumnLocalOnly when setting up dependencies

* Use SetWorkerColumnLocalOnly in TriggerSyncMetadataToPrimaryNodes

* Style

* Make update command generator functions static

* Set metadatasynced before syncing

* Call SetWorkerColumn only if the sync is successful

* Try to sync all nodes

* Fix indexno

* Update metadatasynced locally first

* Break if a node fails to sync metadata

* Send worker commands optional

* Style & Rebase

* Add raiseOnError param to SetWorkerColumn

* Style

* Set metadatasynced for all metadata nodes

* Style

* Introduce SetWorkerColumnOptional

* Polish

* Style

* Dont send set command to not synced metadata nodes

* Style

* Polish

* Add test for stop_sync

* Add test for shouldhaveshards

* Add test for isactive flag

* Sort by placementid in the function verify_metadata

* Cover edge cases for failing nodes

* Add comments

* Add nodeport to isactive test

* Add warning if metadata out of sync

* Update warning message
2021-08-12 14:16:18 +03:00
Onder Kalaci 5f02d18ef8 transactional metadata sync for maintanince daemon
As we use the current user to sync the metadata to the nodes
with #5105 (and many other PRs), there is no reason that
prevents us to use the coordinated transaction for metadata syncing.

This commit also renames few functions to reflect their actual
implementation.
2021-08-09 10:34:55 +02:00