Commit Graph

2488 Commits (321fcfcdb5cefc02feb0f13e760211511c7d91c1)

Author SHA1 Message Date
Halil Ozan Akgül 321fcfcdb5
Add Support for Single Shard Tables in update_distributed_table_colocation (#6924)
Adds Support for Single Shard Tables in
`update_distributed_table_colocation`.

This PR changes checks that make sure tables should be hash distributed
table to hash or single shard distributed tables.
2023-05-29 11:47:50 +03:00
Ahmet Gedemenli 1ca80813f6
Citus UDFs support for single shard tables (#6916)
Verify Citus UDFs work well with single shard tables

SUPPORTED
* citus_table_size
* citus_total_relation_size
* citus_relation_size
* citus_shard_sizes
* truncate_local_data_after_distributing_table
* create_distributed_function // test function colocated with a single
shard table
* undistribute_table
* alter_table_set_access_method

UNSUPPORTED - error out for single shard tables
* master_create_empty_shard
* create_distributed_table_concurrently
* create_distributed_table
* create_reference_table
* citus_add_local_table_to_metadata
* citus_split_shard_by_split_points
* alter_distributed_table
2023-05-26 17:30:05 +03:00
Onur Tirtir 246b054a7d
Add support for schema-based-sharding via a GUC (#6866)
DESCRIPTION: Adds citus.enable_schema_based_sharding GUC that allows
sharding the database based on schemas when enabled.

* Refactor the logic that automatically creates Citus managed tables 

* Refactor CreateSingleShardTable() to allow specifying colocation id
instead

* Add support for schema-based-sharding via a GUC

### What this PR is about:
Add **citus.enable_schema_based_sharding GUC** to enable schema-based
sharding. Each schema created while this GUC is ON will be considered
as a tenant schema. Later on, regardless of whether the GUC is ON or
OFF, any table created in a tenant schema will be converted to a
single shard distributed table (without a shard key). All the tenant
tables that belong to a particular schema will be co-located with each
other and will have a shard count of 1.

We introduce a new metadata table --pg_dist_tenant_schema-- to do the
bookkeeping for tenant schemas:
```sql
psql> \d pg_dist_tenant_schema
          Table "pg_catalog.pg_dist_tenant_schema"
┌───────────────┬─────────┬───────────┬──────────┬─────────┐
│    Column     │  Type   │ Collation │ Nullable │ Default │
├───────────────┼─────────┼───────────┼──────────┼─────────┤
│ schemaid      │ oid     │           │ not null │         │
│ colocationid  │ integer │           │ not null │         │
└───────────────┴─────────┴───────────┴──────────┴─────────┘
Indexes:
    "pg_dist_tenant_schema_pkey" PRIMARY KEY, btree (schemaid)
    "pg_dist_tenant_schema_unique_colocationid_index" UNIQUE, btree (colocationid)

psql> table pg_dist_tenant_schema;
┌───────────┬───────────────┐
│ schemaid  │ colocationid  │
├───────────┼───────────────┤
│     41963 │            91 │
│     41962 │            90 │
└───────────┴───────────────┘
(2 rows)
```

Colocation id column of pg_dist_tenant_schema can never be NULL even
for the tenant schemas that don't have a tenant table yet. This is
because, we assign colocation ids to tenant schemas as soon as they
are created. That way, we can keep associating tenant schemas with
particular colocation groups even if all the tenant tables of a tenant
schema are dropped and recreated later on.

When a tenant schema is dropped, we delete the corresponding row from
pg_dist_tenant_schema. In that case, we delete the corresponding
colocation group from pg_dist_colocation as well.

### Future work for 12.0 release:
We're building schema-based sharding on top of the infrastructure that
adds support for creating distributed tables without a shard key
(https://github.com/citusdata/citus/pull/6867).
However, not all the operations that can be done on distributed tables
without a shard key necessarily make sense (in the same way) in the
context of schema-based sharding. For example, we need to think about
what happens if user attempts altering schema of a tenant table. We
will tackle such scenarios in a future PR.

We will also add a new UDF --citus.schema_tenant_set() or such-- to
allow users to use an existing schema as a tenant schema, and another
one --citus.schema_tenant_unset() or such-- to stop using a schema as
a tenant schema in future PRs.
2023-05-26 10:49:58 +03:00
Halil Ozan Akgül 2c7beee562
Fix citus.tenant_stats_limit test by setting it to 2 (#6899)
citus.tenant_stats_limit was set to 2 when we were adding tests for it.
Then we changed it to 10, making the tests incorrect.
This PR fixes that without breaking other tests.
2023-05-23 17:44:07 +03:00
Emel Şimşek 02f815ce1f
Disable local execution when Explain Analyze is requested for a query. (#6892)
DESCRIPTION: Fixes a crash when explain analyze is requested for a query
that is normally locally executed.

When explain analyze is requested for a query, a task with two queries
is created. Those two queries are
    
1. Wrapped Query --> `SELECT ... FROM
worker_save_query_explain_analyze(<query>, <explain analyze options>)`
2. Fetch Query -->` SELECT explain_analyze_output, execution_duration
FROM worker_last_saved_explain_analyze();`

When the query is locally executed a task with multiple queries causes a
crash in production. See the Assert at
57455dc64d/src/backend/distributed/executor/tuple_destination.c#:~:text=Assert(task%2D%3EqueryCount%20%3D%3D%201)%3B

This becomes a critical issue when auto_explain extension is used. When
auto_explain extension is enabled, explain analyze is automatically
requested for every query.

One possible solution could be not to create two queries for a locally
executed query. The fetch part may not have to be a query since the
values are available in local variables.

Until we enable local execution for explain analyze, it is best to
disable local execution.

Fixes #6777.
2023-05-23 14:33:22 +03:00
Emel Şimşek f9a5be59b9
Run replicate_reference_tables background task as superuser. (#6930)
DESCRIPTION: Fixes a bug in background shard rebalancer where the
replicate reference tables task fails if the current user is not a
superuser.

This change is to be backported to earlier releases. We should fix the
permissions for replicate_reference_tables on main branch such that it
can be run by non-superuser roles.

Fixes #6925.
Fixes #6926.
2023-05-18 23:46:32 +03:00
Hanefi Onaldi 6a83290d91
Add ORDER BY clauses to some flaky tests (#6931)
I observed a flaky test output
[here](https://app.circleci.com/pipelines/github/citusdata/citus/32692/workflows/32464a22-7fd6-440a-9ff7-cfa62f9ff58a/jobs/1126144)
and added `ORDER BY` clauses to similar queries in the failing test
file.

```diff
 SELECT pg_identify_object_as_address(classid, objid, objsubid) from pg_catalog.pg_dist_object where objid IN('viewsc.prop_view3'::regclass::oid, 'viewsc.prop_view4'::regclass::oid);
   pg_identify_object_as_address  
 ---------------------------------
- (view,"{viewsc,prop_view3}",{})
  (view,"{viewsc,prop_view4}",{})
+ (view,"{viewsc,prop_view3}",{})
 (2 rows)
```
2023-05-18 12:45:39 +03:00
Onur Tirtir 8ff9dde4b3
Prevent pushing down INSERT .. SELECT queries that we shouldn't (and allow some more) (#6752)
Previously INSERT .. SELECT planner were pushing down some queries that should not be pushed down due to wrong colocation checks. It was checking whether one of the table in SELECT part and target table are colocated. But now, we check colocation for all tables in SELECT part and the target table.

Another problem with INSERT .. SELECT planner was that some queries, which is valid to be pushed down, were not pushed down due to unnecessary checks which are currently supported. e.g. UNION check. As solution, we reused the pushdown planner checks for INSERT .. SELECT planner.


DESCRIPTION: Fixes a bug that causes incorrectly pushing down some
INSERT .. SELECT queries that we shouldn't
DESCRIPTION: Prevents unnecessarily pulling the data into coordinator
for some INSERT .. SELECT queries
DESCRIPTION: Drops support for pushing down INSERT .. SELECT with append
table as target

Fixes #6749.
Fixes #1428.
Fixes #6920.

---------

Co-authored-by: aykutbozkurt <aykut.bozkurt1995@gmail.com>
2023-05-17 15:05:08 +03:00
Onur Tirtir 56d217b108
Mark objects as distributed even when pg_dist_node is empty (#6900)
We mark objects as distributed objects in Citus metadata only if we need
to propagate given the command that creates it to worker nodes. For this
reason, we were not doing this for the objects that are created while
pg_dist_node is empty.

One implication of doing so is that we defer the schema propagation to
the time when user creates the first distributed table in the schema.
However, this doesn't help for schema-based sharding (#6866) because we
want to sync pg_dist_tenant_schema to the worker nodes even for empty
schemas too.

* Support test dependencies for isolation tests without a schedule

* Comment out a test due to a known issue (#6901)

* Also, reduce the verbosity for some log messages and make some
   tests compatible with run_test.py.
2023-05-16 11:45:42 +03:00
Onur Tirtir e7abde7e81
Prevent downgrades when there is a single-shard table in the cluster (#6908)
Also add a few tests for Citus/PG upgrade/downgrade scenarios.
2023-05-16 09:44:28 +02:00
Onur Tirtir 893ed416f1
Disable citus.enable_non_colocated_router_query_pushdown by default (#6909)
Fixes #6779.

DESCRIPTION: Disables citus.enable_non_colocated_router_query_pushdown
GUC by default to ensure generating a consistent distributed plan for
the queries that reference non-colocated distributed tables

We already have tests for the cases where this GUC is disabled,
so I'm not adding any more tests in this PR.

Also make multi_insert_select_window idempotent.

Related to: #6793
2023-05-15 12:07:50 +03:00
Ivan Kush e3c6b8a10e
Fix flaky clolumnar_permissions test (#6913)
As attr_num isn't ordered, order may be random. And regression test may
be failed.
This MR adds attr_num to ORDER BY


```
  3 --- /build/contrib/citus/src/test/regress/expected/columnar_permissions.out.modified    2023-05-05 11:13:44.926085432 +0000
  4 +++ /build/contrib/citus/src/test/regress/results/columnar_permissions.out.modified 2023-05-05 11:13:44.934085414 +0000
  5 @@ -124,24 +124,24 @@
  6    from columnar.chunk
  7    where relation in ('no_access'::regclass, 'columnar_permissions'::regclass)
  8    order by relation, stripe_num;
  9         relation       | stripe_num | attr_num | chunk_group_num | value_count
 10  ----------------------+------------+----------+-----------------+-------------
 11   no_access            |          1 |        1 |               0 |           1
 12   no_access            |          2 |        1 |               0 |           1
 13   no_access            |          3 |        1 |               0 |           1
 14   columnar_permissions |          1 |        1 |               0 |           1
 15   columnar_permissions |          1 |        2 |               0 |           1
 16 - columnar_permissions |          2 |        1 |               0 |           1
 17   columnar_permissions |          2 |        2 |               0 |           1
 18 - columnar_permissions |          3 |        1 |               0 |           1
 19 + columnar_permissions |          2 |        1 |               0 |           1
 20   columnar_permissions |          3 |        2 |               0 |           1
 21 + columnar_permissions |          3 |        1 |               0 |           1
 22   columnar_permissions |          4 |        1 |               0 |           1
 23   columnar_permissions |          4 |        2 |               0 |           1
 24  (11 rows)
```

Co-authored-by: Ivan Kush <ivan.kush@tantorlabs.ru>
2023-05-09 12:42:37 +02:00
Hanefi Onaldi 06e6f8e428
Normalize columnar version in tests (#6917)
When we bump columnar version, some tests fail because of the output
change. Instead of changing those lines every time, I think it is better
to normalize it in tests.
2023-05-08 16:10:55 +03:00
Naisila Puka 905fd46410
Fixes flakiness in background_rebalance_parallel test (#6910)
Fixes the following flaky outputs by decreasing citus_task_wait loop
interval, and changing the order of wait commands.

https://app.circleci.com/pipelines/github/citusdata/citus/32102/workflows/19958297-6c7e-49ef-9bc2-8efe8aacb96f/jobs/1089589

``` diff
SELECT job_id, task_id, status, nodes_involved
 FROM pg_dist_background_task WHERE job_id in (:job_id) ORDER BY task_id;
  job_id | task_id |  status  | nodes_involved 
 --------+---------+----------+----------------
   17779 |    1013 | done     | {50,56}
   17779 |    1014 | running  | {50,57}
-  17779 |    1015 | running  | {50,56}
-  17779 |    1016 | blocked  | {50,57}
+  17779 |    1015 | done     | {50,56}
+  17779 |    1016 | running  | {50,57}
   17779 |    1017 | runnable | {50,56}
   17779 |    1018 | blocked  | {50,57}
   17779 |    1019 | runnable | {50,56}
   17779 |    1020 | blocked  | {50,57}
 (8 rows)
```

https://github.com/citusdata/citus/pull/6893#issuecomment-1525661408
```diff
SELECT job_id, task_id, status, nodes_involved
 FROM pg_dist_background_task WHERE job_id in (:job_id) ORDER BY task_id;
  job_id | task_id |  status  | nodes_involved 
 --------+---------+----------+----------------
   17779 |    1013 | done     | {50,56}
-  17779 |    1014 | running  | {50,57}
+  17779 |    1014 | runnable | {50,57}
   17779 |    1015 | running  | {50,56}
   17779 |    1016 | blocked  | {50,57}
   17779 |    1017 | runnable | {50,56}
   17779 |    1018 | blocked  | {50,57}
   17779 |    1019 | runnable | {50,56}
   17779 |    1020 | blocked  | {50,57}
 (8 rows)
```
2023-05-05 16:47:01 +03:00
Hanefi Onaldi 3217e3f181
Fix flaky background rebalance parallel test (#6893)
A test in background_rebalance_parallel.sql was failing intermittently
where the order of tasks in the output was not deterministic. This
commit fixes the test by removing id columns for the background tasks in
the output.

A sample failing diff before this patch is below:

```diff
 SELECT D.task_id,
        (SELECT T.command FROM pg_dist_background_task T WHERE T.task_id = D.task_id),
        D.depends_on,
        (SELECT T.command FROM pg_dist_background_task T WHERE T.task_id = D.depends_on)
 FROM pg_dist_background_task_depend D  WHERE job_id in (:job_id) ORDER BY D.task_id, D.depends_on ASC;
  task_id |                               command                               | depends_on |                               command
 ---------+---------------------------------------------------------------------+------------+---------------------------------------------------------------------
-    1014 | SELECT pg_catalog.citus_move_shard_placement(85674026,50,57,'auto') |       1013 | SELECT pg_catalog.citus_move_shard_placement(85674025,50,56,'auto')
-    1016 | SELECT pg_catalog.citus_move_shard_placement(85674032,50,57,'auto') |       1015 | SELECT pg_catalog.citus_move_shard_placement(85674031,50,56,'auto')
-    1018 | SELECT pg_catalog.citus_move_shard_placement(85674038,50,57,'auto') |       1017 | SELECT pg_catalog.citus_move_shard_placement(85674037,50,56,'auto')
-    1020 | SELECT pg_catalog.citus_move_shard_placement(85674044,50,57,'auto') |       1019 | SELECT pg_catalog.citus_move_shard_placement(85674043,50,56,'auto')
+    1014 | SELECT pg_catalog.citus_move_shard_placement(85674038,50,57,'auto') |       1013 | SELECT pg_catalog.citus_move_shard_placement(85674037,50,56,'auto')
+    1016 | SELECT pg_catalog.citus_move_shard_placement(85674044,50,57,'auto') |       1015 | SELECT pg_catalog.citus_move_shard_placement(85674043,50,56,'auto')
+    1018 | SELECT pg_catalog.citus_move_shard_placement(85674026,50,57,'auto') |       1017 | SELECT pg_catalog.citus_move_shard_placement(85674025,50,56,'auto')
+    1020 | SELECT pg_catalog.citus_move_shard_placement(85674032,50,57,'auto') |       1019 | SELECT pg_catalog.citus_move_shard_placement(85674031,50,56,'auto')
 (4 rows)
```

Notice that the dependent and dependee tasks have some commands, but
they have different task ids.
2023-05-05 12:07:46 +03:00
Naisila Puka 072ae44742
Adjusts query's CoerceViaIO & RelabelType nodes that are improper for deparsing (#6391)
Adjusts query's CoerceViaIO & RelabelType nodes that are
improper for deparsing

The standard planner converts some `::text` casts to `::cstring` and
here we convert back because `cstring` is a pseudotype and it cannot be
casted to most types. This problem occurs in CoerceViaIO nodes.
There was another problem with RelabelType nodes fixed in the following
PR:
https://github.com/citusdata/citus/pull/4580
We undo the changes in that PR, and fix both CoerceViaIO and RelabelType
nodes in the planning phase (not in the deparsing phase in ruleutils)

Fixes https://github.com/citusdata/citus/issues/5646
Fixes https://github.com/citusdata/citus/issues/5033
Fixes https://github.com/citusdata/citus/issues/6061
2023-05-04 16:46:02 +03:00
Onur Tirtir db2514ef78 Call null-shard-key tables as single-shard distributed tables in code 2023-05-03 17:02:43 +03:00
Onur Tirtir 39b7711527 Add support for more pushable / non-pushable insert .. select queries with null-shard-key tables (#6823)
* Add support for dist insert select by selecting from a reference
table.
  
  This was the only pushable insert .. select case that
  #6773 didn't cover.

* For the cases where we insert into a Citus table but the INSERT ..
SELECT
  query cannot be pushed down, allow pull-to-coordinator when possible.

  Remove the checks that we had at the very beginning of
  CreateInsertSelectPlanInternal so that we can try insert .. select via
  pull-to-coordinator for the cases where we cannot push-down the insert
  .. select query. What we support via pull-to-coordinator is still
  limited due to lacking of logical planner support for SELECT queries,
but this commit at least allows using pull-to-coordinator for the cases
  where the select query can be planned via router planner, without
  limiting ourselves to restrictive top-level checks.

  Also introduce some additional restrictions into
CreateDistributedInsertSelectPlan for the cases it was missing to check
  for null-shard-key tables. Indeed, it would make more sense to have
those checks for distributed tables in general, via separate PRs against
  main branch. See https://github.com/citusdata/citus/pull/6817.

* Add support for inserting into a Postgres table.
2023-05-03 16:24:20 +03:00
Onur Tirtir 85745b46d5 Add initial sql support for distributed tables that don't have a shard key (#6773/#6822)
Enable router planner and a limited version of INSERT .. SELECT planner
for the queries that reference colocated null shard key tables.

* SELECT / UPDATE / DELETE / MERGE is supported as long as it's a router
query.
* INSERT .. SELECT is supported as long as it only references colocated
  null shard key tables.

Note that this is not only limited to distributed INSERT .. SELECT but
also
covers a limited set of query types that require pull-to-coordinator,
e.g.,
  due to LIMIT clause, generate_series() etc. ...
(Ideally distributed INSERT .. SELECT could handle such queries too,
e.g.,
when we're only referencing tables that don't have a shard key, but
today
this is not the case. See
https://github.com/citusdata/citus/pull/6773#discussion_r1140130562.
2023-05-03 16:24:20 +03:00
Onur Tirtir ac0ffc9839 Add a config for arbitrary config tests where all the tables are null-shard-key tables (#6783/#6788) 2023-05-03 16:18:27 +03:00
Ahmet Gedemenli cdf54ff4b1 Add DDL support null-shard-key tables(#6778/#6784/#6787/#6859)
Add tests for ddl coverage:
* indexes
* partitioned tables + indexes with long names
* triggers
* foreign keys
* statistics
* grant & revoke statements
* truncate & vacuum
* create/test/drop view that depends on a dist table with no shard key
* policy & rls test

* alter table add/drop/alter_type column (using sequences/different data
  types/identity columns)
* alter table add constraint (not null, check, exclusion constraint)
* alter table add column with a default value / set default / drop
  default
* alter table set option (autovacuum)

* indexes / constraints without names
* multiple subcommands

Adds support for
* Creating new partitions after distributing (with null key) the parent
table
* Attaching partitions to a distributed table with null distribution key
(and automatically distribute the new partition with null key as well)
* Detaching partitions from it
2023-05-03 16:18:27 +03:00
Onur Tirtir fa467e05e7 Add support for creating distributed tables with a null shard key (#6745)
With this PR, we allow creating distributed tables with without
specifying a shard key via create_distributed_table(). Here are the
the important details about those tables:
* Specifying `shard_count` is not allowed because it is assumed to be 1.
* We mostly call such tables as "null shard-key" table in code /
comments.
* To avoid doing a breaking layout change in create_distributed_table();
instead of throwing an error, it will inform the user that
`distribution_type`
  param is ignored unless it's explicitly set to NULL or  'h'.
* `colocate_with` param allows colocating such null shard-key tables to
  each other.
* We define this table type, i.e., NULL_SHARD_KEY_TABLE, as a subclass
of
  DISTRIBUTED_TABLE because we mostly want to treat them as distributed
  tables in terms of SQL / DDL / operation support.
* Metadata for such tables look like:
  - distribution method => DISTRIBUTE_BY_NONE
  - replication model => REPLICATION_MODEL_STREAMING
- colocation id => **!=** INVALID_COLOCATION_ID (distinguishes from
Citus local tables)
* We assign colocation groups for such tables to different nodes in a
  round-robin fashion based on the modulo of "colocation id".

Note that this PR doesn't care about DDL (except CREATE TABLE) / SQL /
operation (i.e., Citus UDFs) support for such tables but adds a
preliminary
API.
2023-05-03 16:18:27 +03:00
Teja Mupparti e444dd4f3f MERGE: Support reference table as source with local table as target 2023-05-02 11:37:29 -07:00
Hanefi Onaldi efd41e8ea5
Bump columnar to 11.3 (#6898)
When working on changelog, Marco suggested in
https://github.com/citusdata/citus/pull/6856#pullrequestreview-1386601215
that we should bump columnar version to 11.3 as well.

This PR aims to contain all the necessary changes to allow upgrades to
and downgrades from 11.3.0 for columnar. Note that updating citus
extension version does not affect columnar as the two extension versions
are not really coupled.

The same changes will also be applied to the release branch in
https://github.com/citusdata/citus/pull/6897
2023-05-02 11:58:32 +03:00
Ahmet Gedemenli 59ccf364df
Ignore nodes not allowed for shards, when planning rebalance steps (#6887)
We are handling colocation groups with shard group count less than the
worker node count, using a method different than the usual rebalancer.
See #6739
While making the decision of using this method or not, we should've
ignored the nodes that are marked `shouldhaveshards = false`. This PR
excludes those nodes when making the decision.

Adds a test such that:
 coordinator: []
 worker 1: [1_1, 1_2]
 worker 2: [2_1, 2_2]
(rebalance)
 coordinator: []
 worker 1: [1_1, 2_1]
 worker 2: [1_2, 2_2]

If we take the coordinator into account, the rebalancer considers the
first state as balanced and does nothing (because shard_count <
worker_count)
But with this pr, we ignore the coordinator because it's
shouldhaveshards = false
So the rebalancer distributes each colocation group to both workers

Also, fixes an unrelated flaky test in the same file
2023-05-01 12:21:08 +02:00
aykut-bozkurt 8cb69cfd13
break sequence dependency during table creation (#6889)
We need to break sequence dependency for a table while creating the
table during non-transactional metadata sync to ensure idempotency of
the creation of the table.

**Problem:**
When we send `SELECT
pg_catalog.worker_drop_sequence_dependency(logicalrelid::regclass::text)
FROM pg_dist_partition` to workers during the non-transactional sync,
table might not be in `pg_dist_partition` at worker, and sequence
dependency is not broken at the worker.

**Solution:** 
We break sequence dependency via `SELECT
pg_catalog.worker_drop_sequence_dependency(logicalrelid::regclass::text)`
for each table while creating it at the workers. It is safe to send
since the udf is a no-op when there is no sequence dependency.

DESCRIPTION: Fixes a bug related to sequence idempotency at
non-transactional sync.

Fixes https://github.com/citusdata/citus/issues/6888.
2023-04-28 15:09:09 +03:00
Jelte Fennema a5f4fece13
Fix running PG upgrade tests with run_test.py (#6829)
In #6814 we started using the Python test runner for upgrade tests in
run_test.py, instead of the Perl based one. This had a problem though,
not all tests in minimal_schedule can be run with the Python runner.
This adds a separate minimal schedule for the pg_upgrade tests which
doesn't include the tests that break with the Python runner.

This PR also fixes various other issues that came up while testing
the upgrade tests.
2023-04-24 15:54:32 +02:00
aykut-bozkurt 08e2820c67
skip restriction clause if it contains placeholdervar (#6857)
`PlaceHolderVar` is not relevant to be processed inside a restriction
clause. Otherwise, `pull_var_clause_default` would throw error. PG would
create the restriction to physical `Var` that `PlaceHolderVar` points to
anyway, so it is safe to skip this restriction.

DESCRIPTION: Fixes a bug related to WHERE clause list which contains
placeholder.

Fixes https://github.com/citusdata/citus/issues/6758
2023-04-17 18:14:01 +03:00
Emel Şimşek 2675a68218
Make coordinator always in metadata by default in regression tests. (#6847)
DESCRIPTION: Changes the regression test setups adding the coordinator
to metadata by default.

When creating a Citus cluster, coordinator can be added in metadata
explicitly by running `citus_set_coordinator_host ` function. Adding the
coordinator to metadata allows to create citus managed local tables.
Other Citus functionality is expected to be unaffected.

This change adds the coordinator to metadata by default when creating
test clusters in regression tests.

There are 3 ways to run commands in a sql file (or a schedule which is a
sequence of sql files) with Citus regression tests. Below is how this PR
adds the coordinator to metadata for each.

1. `make <schedule_name>`
Changed the sql files (sql/multi_cluster_management.sql and
sql/minimal_cluster_management.sql) which sets up the test clusters such
that they call `citus_set_coordinator_host`. This ensures any following
tests will have the coordinator in metadata by default.
 
2. `citus_tests/run_test.py <sql_file_name>`
Changed the python code that sets up the cluster to always call `
citus_set_coordinator_host`.
For the upgrade tests, a version check is included to make sure
`citus_set_coordinator_host` function is available for a given version.

3. ` make check-arbitrary-configs  `     
Changed the python code that sets up the cluster to always call
`citus_set_coordinator_host `.

#6864 will be used to track the remaining work which is to change the
tests where coordinator is added/removed as a node.
2023-04-17 14:14:37 +03:00
Gokhan Gulbiz 8782ea1582
Ensure partitionKeyValue and colocationId are set for proper tenant stats gathering (#6834)
This PR updates the tenant stats implementation to set partitionKeyValue
and colocationId in ExecuteLocalTaskListExtended, in addition to
LocallyExecuteTaskPlan. This ensures that tenant stats can be properly
gathered regardless of the code path taken. The changes were initially
made while testing stored procedure calls for tenant stats.
2023-04-17 09:35:26 +03:00
aykut-bozkurt 3286ec59e9
fix 3 flaky tests in failure schedule (#6846)
Fixed 3 flaky tests in failure tests which caused flakiness in other
tests due to changed node and group sequence ids during node
addition-removal.
2023-04-13 13:13:28 +03:00
Halil Ozan Akgül 9ba70696f7
Add CPU usage to citus_stat_tenants (#6844)
This PR adds CPU usage to `citus_stat_tenants` monitor.
CPU usage is tracked in periods, similar to query counts.
2023-04-12 16:23:00 +03:00
Halil Ozan Akgül 8b50e95dc8
Fix citus_stat_tenants period updating bug (#6833)
Fixes the bug that causes updating the citus_stat_tenants periods
incorrectly.

`TimestampDifferenceExceeds` expects the difference in milliseconds but
it was microseconds, this is fixed.
`tenantStats->lastQueryTime` was updated during monitoring too, now it's
updated only when there are tenant queries.
2023-04-11 17:40:07 +03:00
aykut-bozkurt a20f7e1a55
fixes update propagation bug when `citus_set_coordinator_host` is called more than once (#6837)
DESCRIPTION: Fixes update propagation bug when
`citus_set_coordinator_host` is called more than once.

Fixes https://github.com/citusdata/citus/issues/6731.
2023-04-11 11:27:16 +03:00
Onur Tirtir 0194657c5d
Bump Citus to 12.0devel (#6840) 2023-04-10 12:05:18 +03:00
Naisila Puka 84f2d8685a
Adds control for background task executors involving a node (#6771)
DESCRIPTION: Adds control for background task executors involving a node

### Background and motivation

Nonblocking concurrent task execution via background workers was
introduced in [#6459](https://github.com/citusdata/citus/pull/6459), and
concurrent shard moves in the background rebalancer were introduced in
[#6756](https://github.com/citusdata/citus/pull/6756) - with a hard
dependency that limits to 1 shard move per node. As we know, a shard
move consists of a shard moving from a source node to a target node. The
hard dependency was used because the background task runner didn't have
an option to limit the parallel shard moves per node.

With the motivation of controlling the number of concurrent shard
moves that involve a particular node, either as source or target, this
PR introduces a general new GUC
citus.max_background_task_executors_per_node to be used in the
background task runner infrastructure. So, why do we even want to
control and limit the concurrency? Well, it's all about resource
availability: because the moves involve the same nodes, extra
parallelism won’t make the rebalance complete faster if some resource is
already maxed out (usually cpu or disk). Or, if the cluster is being
used in a production setting, the moves might compete for resources with
production queries much more than if they had been executed
sequentially.

### How does it work?

A new column named nodes_involved is added to the catalog table that
keeps track of the scheduled background tasks,
pg_dist_background_task. It is of type integer[] - to store a list
of node ids. It is NULL by default - the column will be filled by the
rebalancer, but we may not care about the nodes involved in other uses
of the background task runner.

Table "pg_catalog.pg_dist_background_task"

     Column     |           Type           
============================================
 job_id         | bigint
 task_id        | bigint
 owner          | regrole
 pid            | integer
 status         | citus_task_status
 command        | text
 retry_count    | integer
 not_before     | timestamp with time zone
 message        | text
+nodes_involved | integer[]

A hashtable named ParallelTasksPerNode keeps track of the number of
parallel running background tasks per node. An entry in the hashtable is
as follows:

ParallelTasksPerNodeEntry
{
	node_id // The node is used as the hash table key 
	counter // Number of concurrent background tasks that involve node node_id
                // The counter limit is citus.max_background_task_executors_per_node
}

When the background task runner assigns a runnable task to a new
executor, it increments the counter for each of the nodes involved with
that runnable task. The limit of each counter is
citus.max_background_task_executors_per_node. If the limit is reached
for any of the nodes involved, this runnable task is skipped. And then,
later, when the running task finishes, the background task runner
decrements the counter for each of the nodes involved with the done
task. The following functions take care of these increment-decrement
steps:

IncrementParallelTaskCountForNodesInvolved(task)
DecrementParallelTaskCountForNodesInvolved(task)

citus.max_background_task_executors_per_node can be changed in the
fly. In the background rebalancer, we simply give {source_node,
target_node} as the nodesInvolved input to the
ScheduleBackgroundTask function. The rest is taken care of by the
general background task runner infrastructure explained above. Check
background_task_queue_monitor.sql and
background_rebalance_parallel.sql tests for detailed examples.

#### Note

This PR also adds a hard node dependency if a node is first being used
as a source for a move, and then later as a target. The reason this
should be a hard dependency is that the first move might make space for
the second move. So, we could run out of disk space (or at least
overload the node) if we move the second shard to it before the first
one is moved away.

Fixes https://github.com/citusdata/citus/issues/6716
2023-04-06 14:12:39 +03:00
Gokhan Gulbiz fa00fc6e3e
Add upgrade/downgrade paths between v11.2.2 and v11.3.1 (#6820)
DESCRIPTION: PR description that will go into the change log, up to 78
characters

---------

Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
2023-04-06 12:46:09 +03:00
Ahmet Gedemenli 83a2cfbfcf
Move cleanup record test to upgrade schedule (#6794)
DESCRIPTION: Move cleanup record test to upgrade schedule
2023-04-06 11:42:49 +03:00
Naisila Puka fc479bfa49
Fixes flakiness in multi_metadata_sync test (#6824)
Fixes flakiness in multi_metadata_sync test


https://app.circleci.com/pipelines/github/citusdata/citus/31863/workflows/ea937480-a4cc-4646-815c-bb2634361d98/jobs/1074457
```diff
SELECT
 	logicalrelid, repmodel
 FROM
 	pg_dist_partition
 WHERE
 	logicalrelid = 'mx_test_schema_1.mx_table_1'::regclass
 	OR logicalrelid = 'mx_test_schema_2.mx_table_2'::regclass;
         logicalrelid         | repmodel 
 -----------------------------+----------
- mx_test_schema_1.mx_table_1 | s
  mx_test_schema_2.mx_table_2 | s
+ mx_test_schema_1.mx_table_1 | s
 (2 rows)
```
This is a simple issue of missing `ORDER BY` clauses. I went ahead and
added some other missing ones in the same file as well. Also, I replaced
existing `ORDER BY logicalrelid` with `ORDER BY logicalrelid::text`, in
order to compare names, not OIDs.
2023-04-06 11:19:32 +03:00
Halil Ozan Akgül 52ad2d08c7
Multi tenant monitoring (#6725)
DESCRIPTION: Adds views that monitor statistics on tenant usages

This PR adds `citus_stats_tenants` view that monitors the tenants on the
cluster.

`citus_stats_tenants` shows the node id, colocation id, tenant
attribute, read count in this period and last period, and query count in
this period and last period of the tenant.
Tenant attribute currently is the tenant's distribution column value,
later when schema based sharding is introduced, this meaning might
change.
A period is a time bucket the queries are counted by. Read and query
counts for this period can increase until the current period ends. After
that those counts are moved to last period's counts, which cannot
change. The period length can be set using 'citus.stats_tenants_period'.

`SELECT` queries are counted as _read_ queries, `INSERT`, `UPDATE` and
`DELETE` queries are counted as _write_ queries. So in the view read
counts are `SELECT` counts and query counts are `SELECT`, `INSERT`,
`UPDATE` and `DELETE` count.

The data is stored in shared memory, in a struct named
`MultiTenantMonitor`.

`citus_stats_tenants` shows the data from local tenants.

`citus_stats_tenants` show up to `citus.stats_tenant_limit` number of
tenants.
The tenants are scored based on the number of queries they run and the
recency of those queries. Every query ran increases the score of tenant
by `ONE_QUERY_SCORE`, and after every period ends the scores are halved.
Halving is done lazily.
To retain information a longer the monitor keeps up to 3 times
`citus.stats_tenant_limit` tenants. When the tenant count hits `3 *
citus.stats_tenant_limit`, last `citus.stats_tenant_limit` tenants are
removed. To see all stored tenants you can use
`citus_stats_tenants(return_all_tenants := true)`

- [x] Create collector view that gets data from all nodes. #6761 
- [x] Add monitoring log #6762 
- [x] Create enable/disable GUC #6769 
- [x] Parse the annotation string correctly #6796 
- [x] Add local queries and prepared statements #6797
- [x] Rename to citus_stat_statements #6821 
- [x] Run pgbench
- [x] Fix role permissions #6812

---------

Co-authored-by: Gokhan Gulbiz <ggulbiz@gmail.com>
Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2023-04-05 17:44:17 +03:00
Naisila Puka eda3cc418a
Fixes flakiness in multi_cluster_management test (#6825)
Fixes flakiness in multi_cluster_management test

https://app.circleci.com/pipelines/github/citusdata/citus/31816/workflows/2f455a30-1c0b-4b21-9831-f7cf2169df5a/jobs/1071444
```diff
SELECT public.wait_until_metadata_sync();
+WARNING:  waiting for metadata sync timed out
  wait_until_metadata_sync 
 --------------------------
  
 (1 row)
```
Default timeout value is 15000. I increased it to 60000.
2023-04-05 15:50:22 +03:00
Jelte Fennema dcee370270
Fix flakyness in citus_split_shard_by_split_points_deferred_drop (#6819)
In CI we would sometimes get this failure:
```diff
 -- The original shard is marked for deferred drop with policy_type = 2.
 -- The previous shard should be dropped at the beginning of the second split call
 SELECT * from pg_dist_cleanup;
  record_id | operation_id | object_type |                               object_name                                | node_group_id | policy_type
 -----------+--------------+-------------+--------------------------------------------------------------------------+---------------+-------------
+        60 |          778 |           3 | citus_shard_split_slot_18_21216_778                                      |            16 |           0
        512 |          778 |           1 | citus_split_shard_by_split_points_deferred_schema.table_to_split_8981001 |            16 |           2
-(1 row)
+(2 rows)
```

Replication slots sometimes cannot be deleted right away. Which is hard
to resolve, but luckily we can filter these cleanup records out easily
by filtering by policy_type.

While debugging this issue I learnt that we did not use
`GetNextCleanupRecordId` in all places where we created cleanup
records. This caused test failures when running tests multiple times,
when they set `citus.next_cleanup_record_id`. I tried fixing that by
calling GetNextCleanupRecordId in all places but that caused many
other tests to fail due to deadlocks. So, instead this adresses
that issue by using `ALTER SEQUENCE ... RESTART` instead of
`citus.next_cleanup_record_id`. In a follow up PR we should
probably get rid of `citus.next_cleanup_record_id`, since it's
only used in one other file.
2023-04-04 09:45:48 +02:00
Marco Slot 7c0589abb8
Do not override combinefunc of custom aggregates with common names (#6805)
DESCRIPTION: Fix an issue that caused some queries with custom
aggregates to fail

While playing around with https://github.com/pgvector/pgvector I noticed
that the AVG query was broken. That's because we treat it as any other
AVG by breaking it down in SUM and COUNT, but there are no SUM/COUNT
functions in this case, but there is a perfectly usable combinefunc.
This PR changes our aggregate logic to prefer custom aggregates with a
combinefunc even if they have a common name.

Co-authored-by: Marco Slot <marco.slot@gmail.com>
2023-04-03 19:43:09 +02:00
Jelte Fennema 085b59f586
Fix flaky multi_mx_schema_support test (#6813)
This happened sometimes:
```diff
 SELECT objid::oid::regnamespace as "Distributed Schemas"
     FROM pg_catalog.pg_dist_object
     WHERE objid::oid::regnamespace IN ('mx_old_schema', 'mx_new_schema');
  Distributed Schemas
 ---------------------
- mx_old_schema
  mx_new_schema
+ mx_old_schema
 (2 rows)
```

Source:
https://app.circleci.com/pipelines/github/citusdata/citus/31706/workflows/edc84a6a-dfef-42b3-ab5c-54daa64c2154/jobs/1065463

In passing make multi_mx_schema_support runnable with run_test.py
2023-03-31 12:36:53 +02:00
aykutbozkurt dc57e4b2d8 PR #6728  / commit - 13
Add failure tests for nontransactional metadata sync mode.
2023-03-30 11:06:16 +03:00
aykutbozkurt 35dbdae5a4 PR #6728  / commit - 11
Let AddNodeMetadata to use metadatasync api during node addition.
2023-03-30 11:06:16 +03:00
aykutbozkurt fe00b3263a PR #6728  / commit - 10
Do not acquire strict lock on separate transaction to localhost as we already take the lock before.
But make sure that caller has the ExclusiveLock.
2023-03-30 11:06:16 +03:00
aykutbozkurt a74232bb39 PR #6728  / commit - 9
Do not enforce distributed transaction at `EnsureCoordinatorInitiatedOperation`.
2023-03-30 10:53:22 +03:00
aykutbozkurt 1fb3de14df PR #6728  / commit - 6
Let `activate_node_snapshot` use new metadata sync api.
2023-03-30 10:53:22 +03:00
aykutbozkurt bc25ba51c3 PR #6728  / commit - 5
Let `ActivateNode` use new metadata sync api.
2023-03-30 10:53:22 +03:00