287 Commits (3b1c082791592097e2c944ef98ec86e4234882bf)

Author SHA1 Message Date
Önder Kalacı bd13836648
Add citus.skip_advisory_lock_permission_checks (#6293) 2022-09-05 17:47:41 +02:00
Jelte Fennema 1c5b8588fe
Address race condition in InitializeBackendData (#6285)
Sometimes in CI our isolation_citus_dist_activity test fails randomly
like this:
```diff
 step s2-view-dist:
  SELECT query, citus_nodename_for_nodeid(citus_nodeid_for_gpid(global_pid)), citus_nodeport_for_nodeid(citus_nodeid_for_gpid(global_pid)), state, wait_event_type, wait_event, usename, datname FROM citus_dist_stat_activity WHERE query NOT ILIKE ALL(VALUES('%pg_prepared_xacts%'), ('%COMMIT%'), ('%BEGIN%'), ('%pg_catalog.pg_isolation_test_session_is_blocked%'), ('%citus_add_node%')) AND backend_type = 'client backend' ORDER BY query DESC;

 query                                                                                                                                                                                                                                                                                                                                                                 |citus_nodename_for_nodeid|citus_nodeport_for_nodeid|state              |wait_event_type|wait_event|usename |datname
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------------------+-------------------+---------------+----------+--------+----------

   INSERT INTO test_table VALUES (100, 100);
                                                                                                                                                                                                                                                                                                                          |localhost                |                    57636|idle in transaction|Client         |ClientRead|postgres|regression
-(1 row)
+
+                SELECT coalesce(to_jsonb(array_agg(csa_from_one_node.*)), '[{}]'::JSONB)
+                FROM (
+                    SELECT global_pid, worker_query AS is_worker_query, pg_stat_activity.* FROM
+                    pg_stat_activity LEFT JOIN get_all_active_transactions() ON process_id = pid
+                ) AS csa_from_one_node;
+            |localhost                |                    57636|active             |               |          |postgres|regression
+(2 rows)

 step s3-view-worker:
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26692/workflows/3406e4b4-b686-4667-bec6-8253ee0809b1/jobs/765119

I intended to fix this with #6263, but the fix turned out to be
insufficient. This PR tries to address the issue by setting
distributedCommandOriginator correctly in more situations. However, even
with this change it's still possible to reproduce the flaky test in CI.
In any case this should fix at least some instances of this issue.

In passing, this changes the isolation_citus_dist_activity test to allow
running it multiple times in a row.
2022-09-02 14:23:47 +02:00
Jelte Fennema d68654680b
Fix flakiness in isolation_citus_dist_activity (#6263)
Sometimes in CI our isolation_citus_dist_activity test fails randomly
like this:
```diff
 step s2-view-dist:
  SELECT query, citus_nodename_for_nodeid(citus_nodeid_for_gpid(global_pid)), citus_nodeport_for_nodeid(citus_nodeid_for_gpid(global_pid)), state, wait_event_type, wait_event, usename, datname FROM citus_dist_stat_activity WHERE query NOT ILIKE ALL(VALUES('%pg_prepared_xacts%'), ('%COMMIT%'), ('%BEGIN%'), ('%pg_catalog.pg_isolation_test_session_is_blocked%'), ('%citus_add_node%')) AND backend_type = 'client backend' ORDER BY query DESC;

 query                                                                                                                                                                                                                                                                                                                                                                 |citus_nodename_for_nodeid|citus_nodeport_for_nodeid|state              |wait_event_type|wait_event|usename |datname
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------------------+-------------------+---------------+----------+--------+----------

   INSERT INTO test_table VALUES (100, 100);
                                                                                                                                                                                                                                                                                                                          |localhost                |                    57636|idle in transaction|Client         |ClientRead|postgres|regression
-(1 row)
+
+                SELECT coalesce(to_jsonb(array_agg(csa_from_one_node.*)), '[{}]'::JSONB)
+                FROM (
+                    SELECT global_pid, worker_query AS is_worker_query, pg_stat_activity.* FROM
+                    pg_stat_activity LEFT JOIN get_all_active_transactions() ON process_id = pid
+                ) AS csa_from_one_node;
+            |localhost                |                    57636|active             |               |          |postgres|regression
+(2 rows)

 step s3-view-worker:
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26605/workflows/56d284d2-5bb3-4e64-a0ea-7b9b1626e7cd/jobs/760633

The reason for this is that citus_dist_stat_activity sometimes shows the
query that it uses itself to get the data from pg_stat_activity. This is
actually a bug, because it's a worker query and thus shouldn't show up
there. To try and solve this bug, we remove two small opportunities for a
race condition. These race conditions could happen when the backenddata
was marked as active, but the distributedCommandOriginator was not set
correctly yet/anymore. There was an opportunity for this to happen both 
during connection start and shutdown.
2022-08-30 12:57:37 +03:00
Marco Slot 9bf3c3dd5c
Add an allow_unsafe_constraints flag for constraints without distribution column (#6237)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-25 11:37:50 +03:00
Naisila Puka 35b4ddc355
Pg15 support (#6085)
* Adjust configure script to allow PG15

* Adds copy of ruleutils_14.c as ruleutils_15.c

* Uses get_namespace_name_or_temp in ruleutils_15.c

Relevant PG commit:
48c5c9068211e0a04fd9553c8714b2821ed3ad17

* Clean up code using "(expr) ? true : false" in ruleutils_15.c

Relevant PG commit:
fd0625c7a9c679c0c1e896014b8f49a489c3a245

* Change varno from Index (unsigned int) to int in ruleutils_15.c

Relevant PG commit:
e3ec3c00d85bd2844ffddee83df2bd67c4f8297f

* Adds find_recursive_union to ruleutils_15.c

Relevant PG commit:
3f50b82639637c9908afa2087de7588450aa866b

* Fix display of SQL-std func's args in INSERT/SELECT in ruleutils_15.c

Relevant PG commit:
a8d8445a7b2f80f6d0bfe97b19f90bd2cbef8759

* Fix ruleutils_15.c's dumping of whole-row Vars in more contexts

Relevant PG commit:
43c2175121c829c8591fc5117b725f1f22bfb670

* Fix assorted missing logic for GroupingFunc nodes in ruleutils_15.c

Relevant PG commit:
2591ee8ec44d8cbc8e1226550337a64c684746e4

* Adds grammar support for SQL/JSON clauses in ruleutils_15.c

Relevant PG commit:
f79b803dcc98d707450e158db3638dc67ff8380b

* Adds SQL/JSON constructors to ruleutils_15.c

Relevant PG commits:
f4fb45d15c59d7add2e1b81a9d477d0119a9691a
cc7401d5ca498a84d9b47fd2e01cebd8e830e558

* Adds support for MERGE in ruleutils_15.c

Relevant PG commit:
7103ebb7aae8ab8076b7e85f335ceb8fe799097c

* Add IS JSON predicate to ruleutils_15.c

Relevant PG commit:
33a377608fc29cdd1f6b63be561eab0aee5c81f0

* Add SQL/JSON query functions to ruleutils_15.c

Relevant PG commit:
1a36bc9dba8eae90963a586d37b6457b32b2fed4

* Adds three different SQL/JSON values to ruleutils_15.c

Relevant PG commits:
606948b058dc16bce494270eea577011a602810e
49082c2cc3d8167cca70cfe697afb064710828ca

* Adds JSON table functions in ruleutils_15.c

Relevant PG commit:
4e34747c88a03ede6e9d731727815e37273d4bc9

* Add PLAN function for JSON table in ruleutils_15.c

Relevant PG commit:
fadb48b00e02ccfd152baa80942de30205ab3c4f

* Remove extra blank lines before block-closing braces ruleutils_15.c

Relevant PG commit:
24d2b2680a8d0e01b30ce8a41c4eb3b47aca5031

* set_deparse_plan: Reuse variable to appease Coverity ruleutils_15.c

Relevant PG commit:
e70813fbc4aaca35ec012d5a426706bd54e4acab

* Mechanical code beautification ruleutils_15.c

Relevant PG commit:
23e7b38bfe396f919fdb66057174d29e17086418

* Rename value_type to item_type in ruleutils_15.c

Relevant PG commit:
3ab9a63cb638a1fd99475668e2da9c237495aeda

* Show 'AS "?column?"' explicitly when it's important in ruleutils_15.c

Relevant PG commit:
c7461fc25558832dd347a9c8150b0f1ed85e36e8

* Fix ruleutils_15.c issues with dropped cols in funcs-returning-composite

Relevant PG commit:
c1d1e8469c77ce6b8e5310955580b4a3eee7fe96

* Change comment regarding functions returning composite in ruleutils_15.c

Relevant PG commit:
c2fa113ddb1117b1f03e91960f65d5d7d8a90270

* Replace int nodes with bool nodes where needed

In PG15, Boolean nodes are added. Pre PG15, internal Boolean values
in Create Role commands were represented by Integer nodes. This
commit replaces int nodes logic with bool nodes logic where needed.
Mostly there are CREATE ROLE logic changes.

Relevant PG commit:
941460fcf731a32e6a90691508d5cfa3d1f8eeaf

* Handle new option colliculocale in CREATE COLLATION logic

In PG15, there is an added option to use ICU as global locale provider.
pg_collation has three locale-related fields: collcollate and collctype,
which are libc-related fields, and a new one colliculocale, which is the
ICU-related field. Only the libc-related fields or the ICU-related field
is set, never both.

Relevant PG commits:
f2553d43060edb210b36c63187d52a632448e1d2
54637508f87bd5f07fb9406bac6b08240283be3b

* Add PG15 tests to CI using test images that have 15beta2 (#6093)

* Change warning message in pg_signal_backend()

Relevant PG commit:
7fa945b857cc1b2964799411f1633468826861ff

* Revert "Add missing ifdef for PG 15"

This reverts commit c7b51025ab.

* Fixes tests for ALTER TRIGGER RENAME consistency for part. tables

Relevant PG commit:
80ba4bb383538a2ee846fece6a7b8da9518b6866

* Prevent creating child triggers on partitions when adding new node

Pre PG15, tgisinternal is true for a "child" trigger on a partition
cloned from the trigger on the parent.
In PG15, tgisinternal is false in that case. However, we don't want to
create this trigger on the partition since it will create a conflict
when we try to attach the partition to the parent table:
ERROR: trigger "..." for relation "{partition_name}" already exists

Relevant PG commit:
f4566345cf40b068368cb5617e61318da60676ec

* Fix tests for generated columns dependency changes

In PG15, for GENERATED columns, all dependencies of the generation
expression are recorded as NORMAL dependencies of the column itself.
This requires CASCADE to drop generated cols with the original col.
Pre PG15, dependencies were recorded as AUTO, with which
generated columns are silently dropped with the original column.

Relevant PG commit:
cb02fcb4c95bae08adaca1202c2081cfc81a28b5

* Explicitly cast catalog "char" column to text before concatenation

Relevant PG commit:
07eee5a0dc642d26f44d65c4e6263304208e8583

* Remove 'AS "?column?"' from test outputs

In the following test outputs, there were some instances in the
planning debug output where AS "?column?" is added.
We add a normalization rule to remove it as it is not
important.

cte_inline.out
recursive_relation_planning_restriction_pushdown.out

Relevant PG commit:
c7461fc25558832dd347a9c8150b0f1ed85e36e8

* Use pg_backup_stop(PG15) instead of pg_stop_backup(PG<15)

Add an alternative test output because of the change in the
backup modes of Postgres. Specifically here, there is a renaming
issue: pg_stop_backup PRE PG15 vs pg_backup_stop PG15+
The alternative output can be deleted when we drop support for PG14

Relevant PG commit:
39969e2a1e4d7f5a37f3ef37d53bbfe171e7d77a

* Adds citus.mitmfifo GUC

Previously we were setting this configuration parameter
on the fly for the failure test schedule.
However, PG15 doesn't allow that anymore: reserved prefixes
like "citus" cannot be used to set non-existing GUCs.

Relevant PG commit:
88103567cb8fa5be46dc9fac3e3b8774951a2be7

* Handles EXPLAIN output diffs in PG15 - Extra result lines

To handle extra "Result" lines in explain outputs, we add an explain
helper method to the multi_test_helpers.sql file
- plan_without_result_lines() is added for cases where we want the
whole explain output with only "Result" lines removed

* Handles EXPLAIN output diffs in PG15, Hash Agg/Join leverage

To handle differences in usage of GroupAggregate vs HashAggregate
or Merge Join vs Hash join in cases where this detail doesn't
seem to matter, we use coordinator_plan().
- coordinator_plan() is updated to remove "Result" lines

There are some cases where we have subplans so we add a new
function that prints all Task Count lines as well
- coordinator_plan_with_subplans()

Still not sure of the relevant PG commit
Could be db0d67db2401eb6238ccc04c6407a4fd4f985832
but disabling enable_group_by_reordering didn't help.

* Handles EXPLAIN output diffs in PG15: enable_group_by_reordering

Relevant PG commit
db0d67db2401eb6238ccc04c6407a4fd4f985832

* Normalizes Memory Usage, Buckets, Batches for PG15 explain diffs

We create a new function in multi_test_helpers, which is similar
to the explain_merge function in PG15. This explain helper function
normalizes Memory Usage, Buckets and Batches, and we use it in the
tests which give a different output for PG15.

* Bump test images to 15beta3 (#6172)

* Omit namespace in post-copy errmsg

Relevant PG commit:
069d33d0c5a021601245e44df77a0423ddd69359

* Handles EXPLAIN output diffs in PG15: extra arrows&result lines

To handle extra "->" arrows resulting from extra Result lines
in explain outputs, we add the following explain method to
multi_test_helpers.sql file

- plan_without_arrows() is added for cases where we want the
whole explain output without arrows and without Result lines

* Alters public schema's owner to pg_database_owner in PG15

In PG15, the public schema is owned by the pg_database_owner role.
In multi_extension, we drop and recreate the public schema,
hence its owner becomes the default user in our tests, postgres.
Change that to pg_database_owner for PG15 consistency.

This results in alternative test output for public schema grants
in the following test:

grant_on_schema_propagation.sql

Relevant PG commit: b073c3ccd06e4cb845e121387a43faa8c68a7b62

* Add alternative test outputs for change in Insert Select display

citus_local_tables_queries.sql
coordinator_shouldhaveshards.sql
cte_inline.sql
insert_select_repartition.sql
intermediate_result_pruning.sql
local_shard_execution.sql
local_shard_execution_replicated.sql
multi_deparse_shard_query.sql
multi_insert_select.sql
multi_insert_select_conflict.sql
multi_mx_insert_select_repartition.sql
mx_coordinator_shouldhaveshards.sql
single_node.sql

Relevant PG commit:
a8d8445a7b2f80f6d0bfe97b19f90bd2cbef8759

* Fixes columnar tap tests for PG15

In PG15, Perl test modules have been moved to a new namespace.
Also, postgres node new() and get_new_node() methods have been
unified to one method: new()

We create separate tap tests for PG13/14 and PG15+
and update the Makefiles accordingly.

Relevant PG commits:
201a76183e2056c2217129e12d68c25ec9c559c8
b3b4d8e68ae83f432f43f035c7eb481ef93e1583

* Handles EXPLAIN output diffs in PG15: HashAgg Leverage,alt. output

Still not sure of the relevant PG commit
Could be db0d67db2401eb6238ccc04c6407a4fd4f985832
but disabling enable_group_by_reordering didn't help.
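
The EXPLAIN helpers mentioned in this commit (`plan_without_result_lines()` and `plan_without_arrows()`) live in multi_test_helpers.sql and are not reproduced here. As a rough illustration of the idea only, the Python sketch below post-processes plan text in the way the commit describes: it drops "Result" nodes and can optionally strip the "->" arrows. The function name `normalize_plan` and the sample plan are hypothetical and are not taken from the Citus test suite.

```python
import re

# Optional indentation followed by the "->" prefix of a child plan node.
_ARROW = re.compile(r"^(\s*)->\s*")
# A "Result" node, either bare or followed by its "(cost=...)" details.
_RESULT_NODE = re.compile(r"Result(\s+\(|$)")


def normalize_plan(lines, strip_arrows=False):
    """Drop "Result" nodes from EXPLAIN text; optionally remove "->" arrows.

    Hypothetical helper for illustration; not the Citus implementation.
    """
    out = []
    for line in lines:
        node = _ARROW.sub("", line).strip()
        if _RESULT_NODE.match(node):
            continue  # skip the extra "Result" line
        if strip_arrows:
            # Keep the indentation, remove only the arrow itself.
            line = _ARROW.sub(r"\1", line)
        out.append(line)
    return out


sample = [
    "Sort",
    "  ->  Result",
    "        ->  Seq Scan on t",
]
print(normalize_plan(sample))
print(normalize_plan(sample, strip_arrows=True))
```

Normalizing the plan text rather than pinning exact output keeps one expected-output file usable across Postgres versions, which seems to be the motivation behind the helpers above.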
2022-08-24 17:59:17 +02:00
Jelte Fennema 78a5013e24
Support changing CPU priorities for backends and shard moves (#6126)
**Intro**
This adds support to Citus to change the CPU priority values of
backends. This is created with two main usecases in mind:

1. Users might want to run the logical replication part of the shard moves
   or shard splits at a higher speed than they would do by themselves. 
   This might cause some small loss of DB performance for their regular 
   queries, but this is often worth it. During high load it's very possible
   that the logical replication WAL sender is not able to keep up with the
   WAL that is generated. This is especially a big problem when the
   machine is close to running out of disk when doing a rebalance.
2. Users might have certain long running queries that they don't want to
   impact their regular workload too much.

**Be very careful!!!**
Using CPU priorities to control scheduling can be helpful in some cases
to control which processes are getting more CPU time than others.
However, due to an issue called "[priority inversion][1]" it's possible that
using CPU priorities together with the many locks that are used within
Postgres causes the exact opposite behavior of what you intended. This
is why this PR only allows the PG superuser to change the CPU priority
of its own processes. Currently it's not recommended to set `citus.cpu_priority`
directly. The only recommended interface for users is the setting
called `citus.cpu_priority_for_logical_replication_senders`. This setting
controls CPU priority for a very limited set of processes (the logical
replication senders). So, the dangers of priority inversion are also limited
when using it for this usecase.

**Background**
Before reading the rest it's important to understand some basic
background regarding process CPU priorities, because they are a bit
counterintuitive. A lower priority value means that the process will
be scheduled more and whatever it's doing will thus complete faster. The
default priority for processes is 0. Valid values are from -20 to 19
inclusive. On Linux a larger difference between values of two processes
will result in a bigger difference in percentage of scheduling.
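
To give a feel for how the difference between two priority values translates into scheduling share, here is a small Python sketch. It assumes the commonly documented Linux scheduler behavior where a priority value of 0 has weight 1024 and each step changes the weight by a factor of roughly 1.25. The real kernel uses a fixed lookup table, so these numbers are approximations, and the function names are hypothetical.

```python
def approx_weight(nice):
    """Approximate Linux scheduler weight for a priority (nice) value.

    Assumption: weight 1024 at nice 0 and a factor of ~1.25 per step.
    The kernel uses a precomputed table with slightly different values.
    """
    if not -20 <= nice <= 19:
        raise ValueError("priority values range from -20 to 19")
    return 1024 / (1.25 ** nice)


def cpu_share(nice_a, nice_b):
    """Approximate CPU fraction process A gets when A and B are both
    CPU-bound and compete for a single core."""
    weight_a = approx_weight(nice_a)
    weight_b = approx_weight(nice_b)
    return weight_a / (weight_a + weight_b)


if __name__ == "__main__":
    for nice in (0, -5, -10):
        share = cpu_share(nice, 0)
        print(f"priority {nice:>3} vs 0: about {share:.0%} of the CPU")
```

Only the difference between the two values matters in this model, which matches the statement above that a larger difference gives a bigger difference in scheduling.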

**Handling the usecases**
Usecase 1 can be achieved by setting `citus.cpu_priority_for_logical_replication_senders`
to the priority value that you want it to have. It's necessary to set
this both on the workers and the coordinator. Example:
```
citus.cpu_priority_for_logical_replication_senders = -10
```

Usecase 2 can with this PR be achieved by running the following as
superuser. Note that this is only possible as superuser currently
due to the dangers mentioned in the "Be very careful!!!" section.
And although this is possible it's **NOT** recommended:
```sql
ALTER USER background_job_user SET citus.cpu_priority = 5;
```

**OS configuration**
To actually make these settings work well it's important to run Postgres
with a more permissive value for the 'nice' resource limit than
Linux will do by default. By default Linux will not allow a process to
set its priority value lower than it currently is, even if it was lower when
the process originally started. This capability is necessary to reset
the CPU priority to its original value after a transaction finishes.
Depending on how you run Postgres this needs to be done in one of two
ways:

If you use systemd to start Postgres all you have to do is add a line
like this to the systemd service file:
```conf
# the + is important, otherwise it's interpreted incorrectly as 20
LimitNice=+0
```

If that's not the case you'll have to configure `/etc/security/limits.conf` 
like so, assuming that you are running Postgres as the `postgres` OS user:
```
postgres            soft    nice            0
postgres            hard    nice            0
```
Finally you'd have to add the following line to `/etc/pam.d/common-session`:
```
session required pam_limits.so
```

These settings would allow you to change the priority back after setting it
to a higher value.

However, to actually allow you to set priorities even lower than the
default priority value you would need to change the values in the 
config to something lower than 0. So for example:
```conf
LimitNice=-10
```

or

```
postgres            soft    nice            -10
postgres            hard    nice            -10
```
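
To make the relationship between these limits and priority values concrete: on Linux the `RLIMIT_NICE` resource limit is stored as `20 - nice value`, so the lowest value a process may set for itself is `20 - limit` (see getrlimit(2)). The Python sketch below illustrates that arithmetic. It is not part of Citus, and the function names are hypothetical.

```python
import resource


def nice_floor(rlimit_nice):
    """Lowest priority value an unprivileged process may set for itself.

    Linux stores RLIMIT_NICE as 20 minus the nice value (getrlimit(2)).
    A limit of 0 gives a floor of 20, which is above every valid value,
    so the process can never lower its priority value.
    """
    return 20 - rlimit_nice


def can_set_nice(current, target, rlimit_nice):
    """True if an unprivileged process may move from `current` to `target`.

    Raising the value is always allowed; lowering it is only allowed down
    to the floor. Processes with CAP_SYS_NICE bypass this check.
    """
    for value in (current, target):
        if not -20 <= value <= 19:
            raise ValueError("priority values range from -20 to 19")
    return target >= current or target >= nice_floor(rlimit_nice)


def current_nice_floor():
    """Floor for the running process, or None if RLIMIT_NICE is missing."""
    limit = getattr(resource, "RLIMIT_NICE", None)  # Linux-only constant
    if limit is None:
        return None
    soft, _hard = resource.getrlimit(limit)
    if soft == resource.RLIM_INFINITY:
        return -20
    return nice_floor(soft)


if __name__ == "__main__":
    print("priority floor of this process:", current_nice_floor())
```

With a nice limit of 0 (raw limit 20), `can_set_nice(5, 0, 20)` holds, so a backend that moved to 5 can return to 0. Going below 0 needs a limit such as -10 (raw limit 30).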

If you use WSL2 you'll likely have to do another thing. You have to
open a new shell, because PAM is only used during login, and
WSL2 doesn't actually log you in. You can force a login like this:
```
sudo su $USER --shell /bin/bash
```
Source: https://stackoverflow.com/a/68322992/2570866

[1]: https://en.wikipedia.org/wiki/Priority_inversion
2022-08-16 13:07:17 +03:00
Jelte Fennema 1a01c896f0
Fix description of citus.distributed_deadlock_detection_factor (#5860)
The long description of the `citus.distributed_deadlock_detection_factor` 
setting was incorrectly stating that 1000 would disable it. Instead -1 
is the value that disables distributed deadlock detection.
2022-08-16 01:19:49 +03:00
Teja Mupparti e962113c63 Remove the GUC mention in the error message as this config is meant for advanced users 2022-08-11 09:43:14 -07:00
Jelte Fennema dd548ee3c7
Use faster custom copy logic for non-blocking shard moves (#6119)
DESCRIPTION: Use faster custom copy logic for non-blocking shard moves

Non-blocking shard moves consist of two main phases:
1. Initial data copy
2. Catchup phase

This changes the first of these phases significantly. Previously we used the
copy logic provided by postgres subscriptions. This meant we didn't have
to implement it ourselves, but it came with the downside of little control.
When implementing shard splits we needed more control to even make it
work, so we implemented our own logic for copying data between nodes.

This PR starts using that logic for non-blocking shard moves. Doing so
has four main advantages:
1. It uses COPY in binary format when possible, which is cheaper to encode 
    and decode. Furthermore it very often results in less data that needs to 
    be sent over the network.
2. It allows us to create the primary key (or other replica identity) after doing
    the initial data copy. This should give some speed up over the total run,
    because creating an index in bulk is much faster than incrementally building it.
3. It doesn't require a replication slot per parallel copy. Increasing the maximum
    number of replication slots uses resources in postgres, even if they are not used.
    So reducing the number of replication slots that shard moves need is nice.
4. Logical replication table_sync workers are slow to start up, so if lots of shards
    need to be copied that can make it quite slow. This can happen easily when
    combining Postgres partitioning with Citus.
2022-08-08 17:09:43 +02:00
Sameer Awasekar e236711eea Introduce Non-Blocking Shard Split Workflow 2022-08-04 16:32:38 +02:00
aykut-bozkurt 3ddc089651
stop distributing views with no distributed dependency if GUC DistributeLocalViews is set false. (#6083) 2022-08-04 12:34:40 +03:00
aykut-bozkurt f372e93d22
we suppress notice log during looking up function oid to not break pg vanilla tests. (#6082) 2022-08-01 10:14:35 +03:00
Onder Kalaci d67cf907a2 Detach relation access tracking from connection management 2022-07-28 11:27:59 +02:00
Jelte Fennema 0f50bef696
Avoid possible information leakage about existing users (#6090) 2022-07-27 17:46:32 +02:00
aykut-bozkurt 5f27445b69
enable propagation warnings before postgres vanilla tests (#6081) 2022-07-27 10:34:41 +03:00
Onder Kalaci 6c65d29924 Check the PGPROC's validity properly
We used to only check whether the PID is valid
or not. However, Postgres does not necessarily
set the PID of the backend to 0 when it exits.

Instead, we need to be able to check it from procArray.
IsBackendPid() is what pg_stat_activity also relies
on for a similar purpose.
2022-07-26 17:44:44 +02:00
aykut-bozkurt 67ac3da2b0
added citus_depended_objects udf and HideCitusDependentObjects GUC to hide citus depended objects from pg meta queries (#6055)
use RecurseObjectDependencies api to find if an object is citus depended

make vanilla tests runnable to see if citus_depended function is working correctly
2022-07-25 16:43:34 +03:00
Onder Kalaci 483a3a5875 PG 15 Compat: Resolve compile issues + shmem requests
Similar to #5897, one more step for running Citus with PG 15.

This PR at least makes Citus run with PG 15. I have not tried running the tests with PG 15.

Shmem changes are based on 4f2400cb3f

Compile breaks are mostly due to #6008
2022-07-15 10:11:39 +02:00
Ahmet Gedemenli c8e1e243b8
Fix matviews for citus_add_local_table_to_metadata (#6023) 2022-07-04 17:00:07 +03:00
Jelte Fennema 184c7c0bce
Make enterprise features open source (#6008)
This PR makes all of the features open source that were previously only
available in Citus Enterprise.

Features that this adds:
1. Non blocking shard moves/shard rebalancer
   (`citus.logical_replication_timeout`)
2. Propagation of CREATE/DROP/ALTER ROLE statements
3. Propagation of GRANT statements
4. Propagation of CLUSTER statements
5. Propagation of ALTER DATABASE ... OWNER TO ...
6. Optimization for COPY when loading JSON to avoid double parsing of
   the JSON object (`citus.skip_jsonb_validation_in_copy`)
7. Support for row level security
8. Support for `pg_dist_authinfo`, which allows storing different
   authentication options for different users, e.g. you can store
   passwords or certificates here.
9. Support for `pg_dist_poolinfo`, which allows using connection poolers
   in between coordinator and workers
10. Tracking distributed query execution times using
   citus_stat_statements (`citus.stat_statements_max`,
   `citus.stat_statements_purge_interval`,
   `citus.stat_statements_track`). This is disabled by default.
11. Blocking tenant_isolation
12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`
2022-06-16 00:23:46 -07:00
Ying Xu a1151c2395
Clear metadatacache during abort for create extension (#5907)
* Bug fix for bug #5876. Memset MetadataCacheSystem every time there is an abort

* Created an ObjectAccessHook that saves the transactionlevel of when citus was created and will clear metadatacache if that transaction level is rolled back. Added additional tests to make sure metadatacache is cleared
2022-05-20 13:47:58 -07:00
Marco Slot 7abcfac61f Add caching for functions that check the backend type 2022-05-20 19:02:37 +02:00
Marco Slot 09ec366ff5 Improve nested execution checks and add GUC to disable 2022-05-20 18:55:43 +02:00
jeff-davis a9f8a60007
Columnar: support relation options with ALTER TABLE. (#5935)
Columnar: support relation options with ALTER TABLE.

Use ALTER TABLE ... SET/RESET to specify relation options rather than
alter_columnar_table_set() and alter_columnar_table_reset().

Not only is this more ergonomic, but it also allows better integration
because it can be treated like DDL on a regular table. For instance,
citus can use its own ProcessUtility_hook to distribute the new
settings to the shards.

DESCRIPTION: Columnar: support relation options with ALTER TABLE.
2022-05-20 08:35:00 -07:00
gledis69 4731630741 Add distributing lock command support 2022-05-20 12:28:07 +03:00
Marco Slot ceb593c9da Convert citus.hide_shards_from_app_name_prefixes to citus.show_shards_for_app_name_prefixes 2022-05-03 14:22:13 +02:00
Marco Slot 9476f377b5 Remove old re-partitioning functions 2022-04-04 18:11:52 +02:00
jeff-davis c485a04139
Separate build of citus.so and citus_columnar.so. (#5805)
* Separate build of citus.so and citus_columnar.so.

Because columnar code is statically-linked to both modules, it doesn't
make sense to load them both at once.

A subsequent commit will make the modules entirely separate and allow
loading them both simultaneously.

Author: Yanwen Jin

* Separate citus and citus_columnar modules.

Now the modules are independent. Columnar can be loaded by itself, or
along with citus.

Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2022-03-31 19:47:17 -07:00
Onder Kalaci af4ba3eb1f Remove citus.enable_cte_inlining GUC
In Postgres 12+, users can adjust whether to inline/not inline CTEs
by [NOT] MATERIALIZED keywords. So, this GUC is already useless.
2022-03-22 17:14:44 +01:00
Marco Slot 7559ad12ba Change create_object_propagation default to immediate 2022-03-09 17:40:50 +01:00
Onder Kalaci c32b2de1a7 Improve citus_lock_waits
1) Remove useless columns
2) Show backends that are blocked on a DDL even before
   gpid is assigned
3) One minor bugfix, where we clear distributedCommandOriginator
   properly.
2022-03-07 11:10:44 +01:00
Marco Slot 43e4dd3808 Add a citus.internal_reserved_connections setting 2022-03-02 19:13:53 +01:00
Marco Slot dcfbb51b6b Revert "Build Columnar.so and make Citus depends on it (#5661)"
This reverts commit a4133c69e8.
2022-03-02 11:33:15 +01:00
ywj a4133c69e8
Build Columnar.so and make Citus depends on it (#5661)
* [Columnar] Build columnar.so and let citus depends on it


Co-authored-by: Yanwen Jin <yanwjin@microsoft.com>
Co-authored-by: Ying Xu <32597660+yxu2162@users.noreply.github.com>
Co-authored-by: jeff-davis <Jeffrey.Davis@microsoft.com>
2022-03-01 23:31:14 +03:00
Nils Dijk 65bd540943
Feature: configure object propagation behaviour in transactions (#5724)
DESCRIPTION: Add GUC to control ddl creation behaviour in transactions

Historically we would _not_ propagate objects when we are in a transaction block. Creation of distributed tables would not always work in sequential mode, hence objects created in the same transaction as distributing a table that would use the just created object wouldn't work. The benefit was that the user could still benefit from parallelism.

Now that the creation of distributed tables is supported in sequential mode it would make sense for users to force transactional consistency of ddl commands for distributed tables. A transaction could switch more aggressively to sequential mode when creating new objects in a transaction.

We don't change the default behaviour just yet.

Also, many objects would not even propagate their creation when the transaction was already set to sequential, leaving the possibility of a self deadlock. The new policy checks solve this discrepancy between objects as well.
2022-03-01 17:29:31 +03:00
Marco Slot 3cd9aa655a Stop using citus.binary_worker_copy_format 2022-02-23 19:40:21 +01:00
Teja Mupparti a62901396b Allow unsafe triggers via a GUC 2022-02-21 22:45:17 -08:00
Onder Kalaci abd5b1c506 Prevent any monitoring view/udf to show already exited backends
The low-level StoreAllActiveTransactions() function filters out
backends that exited.

Before this commit, if you run a pgbench, after that you'd still
see the backends show up:
```SQL
 select count(*) from get_global_active_transactions();
┌───────┐
│ count │
├───────┤
│   538 │
└───────┘
```

After this patch, only active backends show up:

```SQL
 select count(*) from get_global_active_transactions();
┌───────┐
│ count │
├───────┤
│    72 │
└───────┘
```
2022-02-14 17:34:32 +01:00
Onder Kalaci 1c30f61a70 Prevent citus.node_conninfo to use "application_name"
With https://github.com/citusdata/citus/pull/5657, Citus uses
a fixed application_name while connecting to remote nodes
for internal purposes.

It means that we cannot allow users to override it via
citus.node_conninfo.
2022-02-09 13:22:04 +01:00
Halil Ozan Akgul 8ee02b29d0 Introduce global PID 2022-02-08 16:49:38 +03:00
Marco Slot 872f0a79db Remove random shard placement policy 2022-02-06 21:55:58 +01:00
Marco Slot 0cae8e7d6b Remove local-node-first shard placement 2022-02-06 21:36:34 +01:00
Ying Xu b5c116449b
Removed dependency from EnsureTableOwner (#5676)
Removed dependency for EnsureTableOwner. Also removed pg_fini() and columnar_tableam_finish(). We still need to remove the CheckCitusVersion dependency to make Columnar_tableam.h free of dependencies on Citus.
2022-02-04 12:45:07 -08:00
Onder Kalaci ff234fbfd2 Unify old GUCs into a single one
Replaces citus.enable_object_propagation with citus.enable_metadata_sync

Also, within Citus 11 release cycle, we added citus.enable_metadata_sync_by_default,
that is also replaced with citus.enable_metadata_sync.

In essence, when citus.enable_metadata_sync is set to true, all the objects
and the metadata are sent to the remote node.

We strongly advise that users never change the value of
this GUC.
2022-02-04 10:52:56 +01:00
Onur Tirtir ff3913ad99
Copy errmsg for distributed deadlock error into heap (#5641)
The multi_log_hook() hook is called by EmitErrorReport() when emitting the
ereport either to the frontend or to the server logs. Some callers of
EmitErrorReport() (e.g., errfinish()) seem to assume that the string fields
of the given ErrorData object need to be freed. For this reason, we copy the
message into the heap here.

I don't think we have faced such a problem before, but it seems worth
fixing as it is theoretically possible due to the reasoning above.
2022-01-24 06:27:41 -08:00
Marco Slot 33bfa0b191 Hide shards from application_name's with a specific prefix 2022-01-18 15:20:55 +04:00
jeff-davis 2e03efd91e
Columnar: move DDL hooks to citus to remove dependency. (#5547)
Add a new hook ColumnarTableSetOptions_hook so that citus can get
control when the columnar table options change.
2022-01-04 23:26:46 -08:00
Onder Kalaci fc98f83af2 Add citus.grep_remote_commands
Simply applies

```SQL
SELECT textlike(command, citus.grep_remote_commands)
```
If this returns true, the command is logged. Otherwise, the command is not logged.

When citus.grep_remote_commands is an empty string, all commands are
logged.
2021-12-17 11:47:40 +01:00
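The filtering rule described in the citus.grep_remote_commands commit above (fc98f83af2) can be illustrated outside the database. The following is a minimal Python sketch, not the Citus implementation: `like_to_regex` and `should_log` are hypothetical names, and the sketch assumes the default SQL LIKE semantics (`%` matches any sequence, `_` matches one character, backslash escapes the next character).

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # A backslash makes the following character literal.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")   # any sequence of characters
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def should_log(command, grep_pattern):
    """Hypothetical stand-in for the check made before logging a remote command."""
    if grep_pattern == "":
        # An empty setting means every command is logged.
        return True
    # LIKE must match the whole string, hence fullmatch.
    return like_to_regex(grep_pattern).fullmatch(command) is not None
```

Under these assumptions, a setting of `%INSERT%` would log only commands that contain `INSERT`, while an empty setting logs everything.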
Önder Kalacı 8c0bc94b51
Enable replication factor > 1 in metadata syncing (#5392)
- [x] Add some more regression test coverage
- [x] Make sure returning works fine in case of
     local execution + remote execution
     (task->partiallyLocalOrRemote works as expected, already added tests)
- [x] Implement locking properly (and add isolation tests)
     - [x] We do #shardcount round-trips on `SerializeNonCommutativeWrites`.
           We made it a single round-trip.
- [x] Acquire locks for subselects on the workers & add isolation tests
- [x] Add a GUC to prevent modification from the workers, hence increase the
      coordinator-only throughput
       - The performance drops slightly (~15%), unless
         `citus.allow_modifications_from_workers_to_replicated_tables`
         is set to false
2021-11-15 15:10:18 +03:00
Ahmet Gedemenli 14a33d4e8e Introduce GUC citus.use_citus_managed_tables 2021-11-11 14:09:06 +03:00
Marco Slot 78866df13c Remove master_append_table_to_shard UDF 2021-11-08 10:43:24 +01:00
Jelte Fennema 57a0228c52
Fix string-concatenation warning on Clang 13 (#5425)
Clang 13 complains about a suspicious string concatenation. It thinks we
might have missed a comma. This adds parentheses to make it clear that
concatenation is indeed what we meant.
2021-11-01 13:55:43 +03:00
Philip Dubé cc50682158 Fix typos. Spurred spotting "connectios" in logs 2021-10-25 13:54:09 +00:00
Önder Kalacı b3299de81c
Drop support for citus.multi_shard_commit_protocol (#5380)
In the past, we allowed users to manually switch to 1PC
(i.e., one-phase commit). With this commit, we no longer
do. All multi-shard modifications are done via 2PC.
2021-10-21 14:01:28 +02:00
Önder Kalacı 3f726c72e0
When replication factor > 1, all modifications are done via 2PC (#5379)
With Citus 9.0, we introduced `citus.single_shard_commit_protocol` which
defaults to 2PC.

With this commit, we prevent users from setting it to 1PC and drop support
for `citus.single_shard_commit_protocol`.

Although this might add some overhead for users, it is already the default
behaviour (so less likely) and marking placements as INVALID is much
worse.
2021-10-20 01:39:03 -07:00
Halil Ozan Akgul 9c9d4b5eeb Turn MX on by default 2021-10-08 18:17:21 +03:00
Jelte Fennema bb5c494104 Enable binary encoding by default on PG14
Since PG14 we can now use binary encoding for arrays and composite types
that contain user defined types. This was fixed in this commit in
Postgres: 670c0a1d47

This change starts using that knowledge, by not necessarily falling back
to text encoding anymore for those types.

While doing this and testing a bit more I found various cases where
binary encoding would fail that our checks didn't cover. This fixes
those cases and adds tests for those. It also fixes EXPLAIN ANALYZE
never using binary encoding, which was a leftover of a workaround that
was no longer necessary.

Finally, it changes the default for both `citus.enable_binary_protocol`
and `citus.binary_worker_copy_format` to `true` for PG14 and up. In our
cloud offering `binary_worker_copy_format` already was true by default.
`enable_binary_protocol` had some bug with MX and user defined types,
this bug was fixed by the above mentioned fixes.
2021-09-06 10:27:29 +02:00
SaitTalhaNisanci c8326df8c0
Fix missing comma in connection options (#5206) 2021-08-25 13:40:42 +03:00
Jelte Fennema a31429aae5
Allow configuring tcp_user_timeout using citus.node_conninfo (#5203)
`tcp_user_timeout` is the awesome relatively unknown big brother of the
TCP keepalive related options. Instead of depending on keepalives being
sent, this determines that a socket is dead by waiting at most N seconds
for an ack of data that it has sent. It's exposed in libpq starting from
PG12.
2021-08-24 11:48:40 +03:00
Ahmet Gedemenli 51d410bb7b Add check for alphabetically sorted gucs
Move to a separate script

Add the new script to readme
2021-08-05 16:37:49 +03:00
Onder Kalaci 2c349e6dfd Use current user to sync metadata
Before this commit, we always synced the metadata with superuser.
However, that creates various edge cases such as visibility errors
or self distributed deadlocks or complicates user access checks.

Instead, with this commit, we use the current user to sync the metadata.
Note that `start_metadata_sync_to_node` still requires superuser
because accessing certain metadata (like pg_dist_node) always requires
superuser (i.e., the current user should be a superuser).

However, metadata syncing operations regarding the distributed
tables can now be done with regular users, as long as the user
is the owner of the table. A table owner can still insert nonsense
metadata, but it would only affect their own table. So, we cannot do
anything about that.
2021-07-16 13:25:27 +02:00
Ahmet Gedemenli 089ef35940 Disable dropping and truncating known shards
Add test for disabling dropping and truncating known shards
2021-06-02 14:30:27 +02:00
Jelte Fennema 1a83628195 Use "orphaned shards" naming in more places
We were not very consistent in how we named these shards.
2021-06-04 11:39:19 +02:00
Ahmet Gedemenli 103cf34418 Sort GUCs in alphabetical order 2021-06-02 12:52:18 +03:00
Hanefi Onaldi 878513f325
Remove all occurrences of replication_model GUC 2021-05-21 16:14:59 +03:00
SaitTalhaNisanci 82f34a8d88
Enable citus.defer_drop_after_shard_move by default (#4961)
2021-05-21 10:48:32 +03:00
Jelte Fennema 10f06ad753 Fetch shard size on the fly for the rebalance monitor
Without this change the rebalancer progress monitor gets the shard sizes
from the `shardlength` column in `pg_dist_placement`. This column needs to
be updated manually by calling `citus_update_table_statistics`.
However, `citus_update_table_statistics` could lead to distributed
deadlocks while database traffic is on-going (see #4752).

To work around this we don't use `shardlength` column anymore. Instead
for every rebalance we now fetch all shard sizes on the fly.

Three additional things this does are:
1. It adds tests for the rebalance progress function.
2. If a shard move cannot be done because a source or target node is
   unreachable, then we error out and stop the rebalance, instead of showing
   a warning and continuing. When using the by_disk_size rebalance
   strategy it's not safe to continue with other moves if a specific
   move failed. It's possible that the failed move made space for the
   next move, and because the failed move never happened this space now
   does not exist.
3. Adds two new columns to the result of `get_rebalancer_progress` which
   shows the size of the shard on the source and target node.

Fixes #4930
2021-05-20 16:38:17 +02:00
Nils Dijk a6c2d2a4c4
Feature: alter database owner (#4986)
DESCRIPTION: Add support for ALTER DATABASE OWNER

This adds support for changing the database owner. It achieves this by marking the database as a distributed object. By marking the database as a distributed object it will look for its dependencies and order the user creation commands (enterprise only) before the alter of the database owner. This is mostly important when adding new nodes.

By having the database marked as a distributed object, Citus can easily determine which `ALTER DATABASE ... OWNER TO ...` commands to propagate: it resolves the object address of the database and verifies that it is a distributed object, and hence that changes of ownership should propagate to all workers.

Given that the ownership of the database might have implications for subsequent commands in a transaction, we force sequential mode for transactions that have an `ALTER DATABASE ... OWNER TO ...` command in them. This will fail the transaction with a meaningful hint when the transaction has already executed parallel statements.

By default the feature is turned off. Since roles are not automatically propagated, having it turned on would cause hard-to-understand errors for the user. It can be turned on by the user by setting `citus.enable_alter_database_owner`.
2021-05-20 13:27:44 +02:00
Onder Kalaci 926069a859 Wait until all connections are successfully established
Comment from the code:
/*
 * Iterate until all the tasks are finished. Once all the tasks
 * are finished, ensure that all the connection initializations
 * are also finished. Otherwise, those connections are terminated
 * abruptly before they are established (or failed). Instead, we let
 * ConnectionStateMachine() handle them properly.
 *
 * Note that we could have the connections that are not established
 * as a side effect of slow-start algorithm. At the time the algorithm
 * decides to establish new connections, the execution might have tasks
 * to finish. But, the execution might finish before the new connections
 * are established.
 */

 Note that the abruptly terminated connections lead to the following errors:

2020-11-16 21:09:09.800 CET [16633] LOG:  could not accept SSL connection: Connection reset by peer
2020-11-16 21:09:09.872 CET [16657] LOG:  could not accept SSL connection: Undefined error: 0
2020-11-16 21:09:09.894 CET [16667] LOG:  could not accept SSL connection: Connection reset by peer

To easily reproduce the issue:

- Create a single node Citus
- Add the coordinator to the metadata
- Create a distributed table with shards on the coordinator
- f.sql:  select count(*) from test;
- pgbench -f /tmp/f.sql postgres -T 12 -c 40 -P 1  or pgbench -f /tmp/f.sql postgres -T 12 -c 40 -P 1 -C
2021-05-19 15:59:13 +02:00
Onder Kalaci 995adf1a19 Executor takes connection establishment and task execution costs into account
With this commit, the executor becomes smarter about refraining from opening
new connections. A very basic example: if connection establishment
takes 1000 ms and task execution takes 5 ms, the executor
is smart enough not to establish new connections.
2021-05-19 15:48:07 +02:00
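The trade-off in the executor commit above (995adf1a19) can be made concrete with a small model. This Python sketch is only one plausible reading of the idea and is not the executor's actual logic: `should_open_connection` and all of its parameters are hypothetical, and it assumes that a new connection is worthwhile only when the existing connections would need longer to drain the pending tasks than it takes to establish a connection.

```python
def should_open_connection(pending_tasks, open_connections,
                           avg_task_ms, avg_connect_ms):
    """Decide whether establishing one more connection is likely to pay off."""
    if open_connections == 0:
        # Nothing can run without at least one connection.
        return True
    # Estimated time for the existing connections to finish the queue.
    drain_ms = pending_tasks * avg_task_ms / open_connections
    # Opening a connection only helps if it is ready before the queue is empty.
    return drain_ms > avg_connect_ms
```

With the numbers from the commit message (1000 ms to connect, 5 ms per task), ten pending tasks on two open connections drain in about 25 ms, so this model would not open a new connection.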
Nils Dijk c91f8d8a15
Feature: localhost guc (#4836)
DESCRIPTION: introduce `citus.local_hostname` GUC for connections to the current node

Citus once in a while needs to connect to itself for some system operations. This used to be hardcoded to `localhost`. The hardcoded hostname causes some issues, for example in environments where `sslmode=verify-full` is required. It is not always desirable or even feasible to get `localhost` as an alt name on the certificate.

By introducing a GUC to use when connecting to the current instance, the user has more control over which network path is used and which hostname is required to be present in the server certificate.
2021-05-12 16:59:44 +02:00
Jelte Fennema cbbd10b974
Implement an improvement threshold in the rebalancer (#4927)
Every move in the rebalancer algorithm results in an improvement in the
balance. However, even if the improvement in the balance was very small
the move was still chosen. This is especially problematic if the shard
itself is very big and the move will take a long time.

This changes the rebalancer algorithm to take the relative size of the
balance improvement into account when choosing moves. By default a move
will not be chosen if it improves the balance by less than half of the
size of the shard. An extra argument is added to the rebalancer
functions so that the user can decide to lower the default threshold if
the ignored move is wanted anyway.
2021-05-11 14:24:59 +02:00
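The threshold rule from the rebalancer commit above (cbbd10b974) amounts to a single comparison. The following Python sketch assumes that the balance improvement and the shard size are expressed in the same unit; `move_is_worthwhile` and its parameter names are hypothetical, and only the default of one half comes from the commit message.

```python
def move_is_worthwhile(balance_improvement, shard_size,
                       improvement_threshold=0.5):
    """Return True if a move improves the balance enough to justify its cost.

    improvement_threshold is the minimum improvement, expressed as a
    fraction of the size of the shard that would be moved.
    """
    return balance_improvement >= improvement_threshold * shard_size
```

For example, moving a shard of size 100 for an improvement of 10 is skipped with the default threshold, but accepted if the caller lowers the threshold to 0.05.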
SaitTalhaNisanci 6b1904d37a
When moving a shard to a new node ensure there is enough space (#4929)
* When moving a shard to a new node ensure there is enough space

* Add WaitForMiliseconds time utility

* Add more tests and increase readability

* Remove the retry loop and use a single udf for disk stats

* Address review

* address review

Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2021-05-06 17:28:02 +03:00
Ahmet Gedemenli fe65be993e Sort GUCs in alphabetic order 2021-04-26 15:05:42 +03:00
Onder Kalaci 918838e488 Allow constant VALUES clauses in pushdown queries
As long as the VALUES clause contains constant values, we should not
recursively plan the queries/CTEs.

This is a follow-up work of #1805. So, we can easily apply OUTER join
checks as if VALUES clause is a reference table/immutable function.
2021-04-21 14:28:08 +02:00
SaitTalhaNisanci 03832f353c Drop postgres 11 support 2021-03-25 09:20:28 +03:00
Marco Slot fbc2147e11 Replace MAX_PUT_COPY_DATA_BUFFER_SIZE by citus.remote_copy_flush_threshold GUC 2021-03-16 06:00:38 +01:00
Marco Slot 1646fca445 Add GUC to set maximum connection lifetime 2021-03-16 01:57:57 +01:00
Onder Kalaci f297c96ec5 Add regression tests for COPY into colocated intermediate results
To add the tests without too much data, make the copy switchover
configurable.
2021-02-11 15:41:06 +01:00
Onder Kalaci c804c9aa21 Allow local execution for intermediate results in COPY
When COPY is used for copying into co-located files, it was
not allowed to use local execution. The primary reason was
Citus treating co-located intermediate results as co-located
shards, and COPY into the distributed table was done via
"format result". And, local execution of such COPY commands
was not implemented.

With this change, we implement support for local execution with
"format result". To do that, we use the buffer for every file
on shardState->copyOutState, similar to how local copy on
shards are implemented. In fact, the logic is similar to
local copy on shards, but instead of writing to the shards,
Citus writes the results to a file.

The logic relies on LOCAL_COPY_FLUSH_THRESHOLD, and flushes
only when the size exceeds the threshold. But, unlike local
copy on shards, in this case we write the headers and footers
just once.
2021-02-09 15:00:06 +01:00
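The buffering behaviour described in the COPY commit above (c804c9aa21) can be sketched as follows. This is an illustrative Python model, not Citus code: `ResultFileWriter` and its parameters are hypothetical, and `flush_threshold` stands in for LOCAL_COPY_FLUSH_THRESHOLD. Rows accumulate in a buffer that is written out only when it exceeds the threshold, and the header and footer are written exactly once.

```python
import io


class ResultFileWriter:
    """Buffer rows and flush them to a file-like object past a threshold."""

    def __init__(self, out, flush_threshold, header=b"", footer=b""):
        self.out = out
        self.flush_threshold = flush_threshold
        self.footer = footer
        self.buffer = bytearray()
        self.flush_count = 0
        # The header is written once, before any row data.
        self.out.write(header)

    def append_row(self, row):
        self.buffer += row
        # Flush only when the buffered size exceeds the threshold.
        if len(self.buffer) > self.flush_threshold:
            self._flush()

    def _flush(self):
        if self.buffer:
            self.out.write(bytes(self.buffer))
            self.buffer.clear()
            self.flush_count += 1

    def finish(self):
        # Write whatever is left, then the footer, exactly once.
        self._flush()
        self.out.write(self.footer)
```

A caller would create one writer per result file, call `append_row` for each incoming row, and call `finish` when the COPY ends.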
Onder Kalaci 30d0a65f40 Adds citus.enable_local_reference_table_foreign_keys
When enabled, foreign keys between local tables and reference
tables are supported by converting the local table to a citus local
table.

When the coordinator is not in the metadata, the logic is disabled
as foreign keys are not allowed in this configuration.
2021-01-15 18:04:52 +03:00
Ahmet Gedemenli 9a100bcdb9 Remove unused GUCs
Remove deprecated variables

Remove GUC citus.sslmode

Remove GUC citus.expire_cached_shards

Remove GUC citus.task_tracker_delay

Remove GUC citus.max_assign_task_batch_size

Remove GUC citus.max_tracked_tasks_per_node

Remove GUC citus.max_running_tasks_per_node

Remove GUC citus.large_table_shard_count

Remove GUC citus.max_task_string_size

Remove GUC citus.binary_master_copy_format
2021-01-15 13:30:45 +03:00
Marco Slot 011283122b Add the shard rebalancer implementation 2021-01-07 16:51:55 +01:00
Sait Talha Nisanci 1d82972ff4 Increase the performance with a trick
Instead of sending NULLs over the network, we now convert the subqueries
into the form:

SELECT t.a, NULL, NULL FROM (SELECT a FROM table)t;

We then recursively plan the inner part so that we don't send the NULLs
over the network. We still need the NULLs in the outer subquery because we
currently don't have an easy way of updating all the necessary places in
the query.

Add some documentation for how the conversion is done
2020-12-15 18:18:36 +03:00
Sait Talha Nisanci 5618f3a3fc Use BaseRestrictInfo for finding equality columns
Baseinfo also has pushed down filters etc, so it makes more sense to use
BaseRestrictInfo to determine what columns have constant equality
filters.

Also RteIdentity is used for removing conversion candidates instead of
rteIndex.
2020-12-15 18:18:36 +03:00
Sait Talha Nisanci 26d9f0b457 Use auto mode in tests and fix debug message 2020-12-15 18:17:10 +03:00
Sait Talha Nisanci f3d55448b3 Choose distributed table if it has a unique index in filter
When doing local-distributed table joins, we convert one of them to a
subquery. The current policy is that we convert the distributed table to a
subquery if it has a filter on a column that has a unique
index (a primary key also has a unique index).
2020-12-15 18:17:10 +03:00
Onder Kalaci 8f8390ed6e Recursively plan local table joins
The logical planner cannot handle joins between local and distributed tables.
Instead, we can recursively plan one side of the join and let the logical
planner handle the rest.

Our algorithm is a little smart, trying not to recursively plan distributed
tables, but favors local tables.
2020-12-15 18:17:10 +03:00
Onder Kalaci c546ec5e78 Local node connection management
When Citus needs to parallelize queries on the local node (e.g., the node
executing the distributed query and the shards are the same), we need to
be mindful about the connection management. The reason is that the client
backends that are running distributed queries are competing with the client
backends that Citus initiates to parallelize the queries in order to get
a slot on the max_connections.

In that regard, we implemented a "failover" mechanism where, if the distributed
queries cannot get a connection, the execution fails the tasks over to local
execution.

The failover logic is as follows:

- Ask the connection manager if it is OK to get a connection
	- If yes, we are good.
	- If no, we fail the workerPool and the failure triggers
	  the failover of the tasks to the local execution queue

The decision of getting a connection is as follows:

/*
 * For local nodes, solely relying on citus.max_shared_pool_size or
 * max_connections might not be sufficient. The former gives us
 * a preview of the future (e.g., we allow the new connections to be
 * established, but they are not established yet). The latter gives us a
 * close to precise view of the past (e.g., the active number of client backends).
 *
 * Overall, we want to limit both of the metrics. The former limit typically
 * kicks in under regular loads, where the load of the database increases at
 * a reasonable pace. The latter limit typically kicks in when the database
 * is issued lots of concurrent sessions at the same time, such as benchmarks.
 */
2020-12-03 14:16:13 +03:00
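The decision and failover steps listed in the local node connection management commit above (c546ec5e78) can be modelled in a few lines. This Python sketch is a simplification under stated assumptions, not the Citus connection manager: the function names and the `max_client_backends` limit are hypothetical, standing in for the limits derived from `citus.max_shared_pool_size` and `max_connections`.

```python
def can_get_local_connection(shared_pool_in_use, max_shared_pool_size,
                             active_client_backends, max_client_backends):
    """Allow a new local connection only if both limits have headroom."""
    # "Preview of the future": connections already allowed or reserved.
    within_shared_pool = shared_pool_in_use < max_shared_pool_size
    # "View of the past": client backends that are active right now.
    within_backend_limit = active_client_backends < max_client_backends
    return within_shared_pool and within_backend_limit


def route_tasks(tasks, **counters):
    """Return (remote_tasks, local_tasks) for tasks targeting the local node."""
    if can_get_local_connection(**counters):
        # A connection is available, so the tasks run over it.
        return list(tasks), []
    # No connection: fail the pool and fall back to local execution.
    return [], list(tasks)
```

Checking both counters reflects the comment's point that either limit alone can miss a class of load: gradual growth is caught by the shared pool limit, sudden bursts by the backend count.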
Onder Kalaci 629ecc3dee Add the infrastructure to count the number of client backends
Considering the adaptive connection management
improvements that we plan to roll out soon, it is
very helpful to know the number of active client
backends.

We are doing this addition to simplify the adaptive connection
management for single node Citus. In single node Citus, both the
client backends and Citus parallel queries would compete to get
slots on Postgres' `max_connections` on the same Citus database.

With adaptive connection management, we have the counters for
Citus parallel queries. That helps us to adaptively decide
on the remote executions pool size (e.g., throttle connections
if necessary).

However, we do not have any counters for the total number of
client backends on the database. For single node Citus, we
should consider all the client backends, not only the remote
connections that Citus does.

Of course Postgres internally knows how many client
backends are active. However, to get that number Postgres
iterates over all the backends. For example, see [pg_stat_get_db_numbackends](8e90ec5580/src/backend/utils/adt/pgstatfuncs.c (L1240))
where Postgres iterates over all the backends.

For our purposes, we need this information on every connection
establishment. That's why we cannot afford to do this kind of
iteration.
2020-11-25 19:19:24 +01:00
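The reasoning in the client backend counter commit above (629ecc3dee) is that a maintained counter can be read in constant time, whereas counting by iteration costs time proportional to the number of backends. The Python sketch below illustrates that idea only: `BackendCounter` is a hypothetical name, and a lock-protected integer stands in for a counter kept in shared memory.

```python
import threading


class BackendCounter:
    """Counter updated on backend start and exit, readable in O(1)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def increment(self):
        # Called once when a client backend starts.
        with self._lock:
            self._count += 1

    def decrement(self):
        # Called once when a client backend exits.
        with self._lock:
            self._count -= 1

    def value(self):
        # No iteration over backends is needed to answer this.
        with self._lock:
            return self._count
```

The cost moves from every read to every backend start and exit, which suits a value that has to be consulted on each connection establishment.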
Nils Dijk 213eb93e6d
make columnar compile and functionally working 2020-11-17 18:55:34 +01:00
SaitTalhaNisanci 45bb0fb587
Do initial cleanup only once in pg_init (#4213)
In the postmaster's execution of _PG_init, IsUnderPostmaster will be false and
we want to do the cleanup at that time only, otherwise there is a chance that
there will be parallel queries and we might do a cleanup for things that are
already in use.
2020-10-02 09:12:39 +03:00
Ahmet Gedemenli abfb79bda6 Sort explain analyze output by task time
Add sort method parameter for regression tests

Fix check-style

Change sorting method parameters to enum

Polish

Add task fields to OutTask

Add test into multi_explain

Fix isolation test
2020-09-24 11:38:40 +03:00
Önder Kalacı 983206c5e1
Hide `citus.subquery_pushdown` flag and NOTICE when enabled (#4124)
* Hide citus.subquery_pushdown flag

This flag is dangerous and is likely to let queries
return wrong results.

The flag has a very specific purpose for a very specific
data distribution and query structure. In those cases, when
the flag is set, the user can skip recursive planning altogether
*at their own risk*.

The meaning of the flag is "I know what I'm doing: the query
structure/data distribution is under my control, so Citus
can skip many correctness checks".

For regular users, enabling this flag is discouraged. We have to
keep the support only for backward compatibility for some users.

In addition to that, give a NOTICE to discourage new users from
using it.
2020-08-28 14:53:09 +02:00
Onder Kalaci eeb8c81de2 Implement shared connection count reservation & enable `citus.max_shared_pool_size` for COPY
With this patch, we introduce `locally_reserved_shared_connections.c/h` files
which are responsible for reserving some space in shared memory counters
upfront.

We sometimes need to reserve connections, but not necessarily
establish them. For example:
-  COPY command should reserve connections as it cannot know which
   connections it needs in which order. COPY establishes connections
   as any input data hits the workers. For example, for router COPY
   command, it only establishes 1 connection.

   As discussed here (https://github.com/citusdata/citus/pull/3849#pullrequestreview-431792473),
   COPY needs to reserve connections up-front, otherwise we can end
   up with resource starvation/un-detected deadlocks.
2020-08-03 18:51:40 +02:00
Sait Talha Nisanci a3dc8fe2b5 remove occurrences of task-tracker from gucs 2020-07-21 16:19:46 +03:00
SaitTalhaNisanci b3af63c8ce
Remove task tracker executor (#3850)
* use adaptive executor even if task-tracker is set

* Update check-multi-mx tests for adaptive executor

Basically repartition joins are enabled where necessary. For parallel
tests max adaptive executor pool size is decreased to 2, otherwise we
would get too many clients error.

* Update limit_intermediate_size test

It seems that when we use adaptive executor instead of task tracker, we
exceed the intermediate result size less in the test. Therefore updated
the tests accordingly.

* Update multi_router_planner

It seems that there is one problem with multi_router_planner when we use
adaptive executor, we should fix the following error:
+ERROR:  relation "authors_range_840010" does not exist
+CONTEXT:  while executing command on localhost:57637

* update repartition join tests for check-multi

* update isolation tests for repartitioning

* Error out if shard_replication_factor > 1 with repartitioning

As we are removing the task tracker, we cannot switch to it if
shard_replication_factor > 1. In that case, we simply error out.

* Remove MULTI_EXECUTOR_TASK_TRACKER

* Remove multi_task_tracker_executor

Some utility methods are moved to task_execution_utils.c.

* Remove task tracker protocol methods

* Remove task_tracker.c methods

* remove unused methods from multi_server_executor

* fix style

* remove task tracker specific tests from worker_schedule

* comment out task tracker udf calls in tests

We were using task tracker udfs to test permissions in
multi_multiuser.sql. We should find some other way to test them, then we
should remove the commented out task tracker calls.

* remove task tracker test from follower schedule

* remove task tracker tests from multi mx schedule

* Remove task-tracker specific functions from worker functions

* remove multi task tracker extra schedule

* Remove unused methods from multi physical planner

* remove task_executor_type related things in tests

* remove LoadTuplesIntoTupleStore

* Do initial cleanup for repartition leftovers

During startup, task tracker would call TrackerCleanupJobDirectories and
TrackerCleanupJobSchemas to clean up leftover directories and job
schemas. With adaptive executor, while doing repartitions it is possible
to leak these things as well. We don't retry cleanups, so it is possible
to have leftover in case of errors.

TrackerCleanupJobDirectories is renamed as
RepartitionCleanupJobDirectories since it is repartition specific now,
however TrackerCleanupJobSchemas cannot be used currently because it is
task tracker specific. The thing is that this function is a no-op
currently.

We should add cleaning up intermediate schemas to the DoInitialCleanup
method when that problem is solved. (We might want to solve it in this PR
as well.)

* Revert "remove task tracker tests from multi mx schedule"

This reverts commit 03ecc0a681.

* update multi mx repartition parallel tests

* not error with task_tracker_conninfo_cache_invalidate

* not run 4 repartition queries in parallel

It seems that when we run 4 repartition queries in parallel we get too
many clients error on CI even though we don't get it locally. Our guess
is that, it is because we open/close many connections without doing some
work and postgres has some delay to close the connections. Hence even
though connections are removed from the pg_stat_activity, they might
still not be closed. If the above assumption is correct, it is unlikely
for it to happen in practice because:
- There is some network latency in clusters, so this leaves some time
for connections to be able to close
- Repartition joins return some data and that also leaves some time for
connections to be fully closed.

As we don't get this error locally, we currently assume that it is
not a bug. Ideally this wouldn't happen when we get rid of the
task-tracker repartition methods because they don't do any pruning and
might be opening more connections than necessary.

If this still gives us "too many clients" error, we can try to increase
the max_connections in our test suite(which is 100 by default).

Also there are different places where this error is given in postgres,
but adding some backtrace it seems that we get this from
ProcessStartupPacket. The backtraces can be found in this link:
https://circleci.com/gh/citusdata/citus/138702

* Set distributedPlan->relationIdList when it is needed

It seems that we were setting the distributedPlan->relationIdList after
JobExecutorType is called, which would choose task-tracker if
replication factor > 1 and there is a repartition query. However, it
uses relationIdList to decide if the query has a repartition query, and
since it was not set yet, it would always think it is not a repartition
query and would choose adaptive executor when it should choose
task-tracker.

* use adaptive executor even with shard_replication_factor > 1

It seems that we were already using adaptive executor when
replication_factor > 1. So this commit removes the check.

* remove multi_resowner.c and deprecate some settings

* remove TaskExecution related leftovers

* change deprecated API error message

* not recursively plan single relation repartition subquery

* recursively plan single relation repartition subquery

* test deprecated task tracker functions

* fix overlapping shard intervals in range-distributed test

* fix error message for citus_metadata_container

* drop task-tracker deprecated functions

* put the implementation back to worker_cleanup_job_schema_cache since citus cloud uses it

* drop some functions, add downgrade script

Some deprecated functions are dropped.
Downgrade script is added.
Some gucs are deprecated.
A new guc for repartition joins bucket size is added.

* order by a test to fix flappiness
2020-07-18 13:11:36 +03:00
Nils Dijk d0b6e62c9a
change wording to allowlist and the likes (#3906)
In the same line as #3904

Change wording to better reflect use and remove words that enforce/maintain bias.
2020-07-15 16:24:40 +02:00
Jelte Fennema 392c5e2c34
Fix wrong cancellation message about distributed deadlocks (#3956) 2020-06-30 14:57:46 +02:00
Marco Slot 2a3234ca26 Rename masterQuery to combineQuery 2020-06-17 14:14:37 +02:00