Commit Graph

6746 Commits (release-12.1-hanefi-fixing-tests)

Author SHA1 Message Date
ahmet gedemenli f4b2494d0c Disable update_distributed_table_colocation for tenant tables 2023-06-02 14:48:07 +03:00
Halil Ozan Akgül 3e183746b7
Single Shard Misc UDFs 2 (#6963)
Creating a second PR to make reviewing easier.
This PR tests:
- replicate_reference_tables
- fix_partition_shard_index_names
- isolate_tenant_to_new_shard
- replicate_table_shards
2023-06-02 13:46:14 +03:00
Halil Ozan Akgül ac7f732be2
Add Single Shard Table Tests for Dependency UDFs (#6960)
This PR tests:
- citus_get_all_dependencies_for_object
- citus_get_dependencies_for_object
- is_citus_depended_object
2023-06-02 11:57:53 +03:00
Teja Mupparti ff2062e8c3 Rename insert-select redistribute code base to generic purpose 2023-06-01 09:43:43 -07:00
Halil Ozan Akgül 9961d39d97
Adds Single Shard Table Tests for Foreign Key UDFs (#6959)
This PR adds tests for:
- get_referencing_relation_id_list
- get_referenced_relation_id_list
- get_foreign_key_connected_relations
2023-06-01 12:56:06 +03:00
Ahmet Gedemenli 3cd81a7107
Add test for rebalancer with single shard tables (#6949)
Adds test for shard moves / rebalancer with single shard tables
2023-05-31 14:58:23 +03:00
ahmet gedemenli 8ace5a7af5 Use citus_drain_node with single shard tables 2023-05-31 14:01:52 +03:00
ahmet gedemenli ee42af7ad2 Add test for rebalancer with single shard tables 2023-05-31 11:48:49 +03:00
Teja Mupparti f9dbe7784b This commit adds a safety net for the issue seen in #6785. The fix for the underlying issue will be in PR #6943 2023-05-30 10:53:05 -07:00
Halil Ozan Akgül d99a5e2f62
Single Shard Table Tests for Shard Lock UDFs (#6944)
This PR adds single shard table tests for the shard lock UDFs
`shard_lock_metadata` and `shard_lock_resources`.
2023-05-30 12:23:41 +03:00
Halil Ozan Akgül 5b54700b93
Single Shard Table Tests for Time Partitions (#6941)
This PR adds tests for the time partition UDFs and view with single shard
tables.
2023-05-29 14:18:56 +03:00
Halil Ozan Akgül 9d9b3817c1
Single Shard Table Columnar UDFs Tests (#6937)
Adds columnar UDF tests for single shard tables.
2023-05-29 13:53:00 +03:00
Halil Ozan Akgül 321fcfcdb5
Add Support for Single Shard Tables in update_distributed_table_colocation (#6924)
Adds Support for Single Shard Tables in
`update_distributed_table_colocation`.

This PR relaxes the checks that previously required tables to be hash
distributed, so that they now accept both hash distributed and single
shard distributed tables.
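For illustration, a hedged sketch of the relaxed behavior, with hypothetical table names (a NULL distribution column creates a single shard distributed table):

```sql
CREATE TABLE tenant_a_events (id bigint);
CREATE TABLE tenant_b_events (id bigint);
SELECT create_distributed_table('tenant_a_events', null);
SELECT create_distributed_table('tenant_b_events', null, colocate_with => 'none');

-- With this PR, the colocation check accepts single shard tables too:
SELECT update_distributed_table_colocation('tenant_b_events',
                                           colocate_with => 'tenant_a_events');
```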
2023-05-29 11:47:50 +03:00
Ahmet Gedemenli 1ca80813f6
Citus UDFs support for single shard tables (#6916)
Verify Citus UDFs work well with single shard tables

SUPPORTED
* citus_table_size
* citus_total_relation_size
* citus_relation_size
* citus_shard_sizes
* truncate_local_data_after_distributing_table
* create_distributed_function // test function colocated with a single shard table
* undistribute_table
* alter_table_set_access_method

UNSUPPORTED - error out for single shard tables
* master_create_empty_shard
* create_distributed_table_concurrently
* create_distributed_table
* create_reference_table
* citus_add_local_table_to_metadata
* citus_split_shard_by_split_points
* alter_distributed_table
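For instance, a hedged sketch of a few of the supported calls above (table name is hypothetical):

```sql
CREATE TABLE orders (id bigint, payload jsonb);
SELECT create_distributed_table('orders', null);  -- single shard table

SELECT citus_table_size('orders');
SELECT citus_total_relation_size('orders');

-- Converting back to a local table is supported as well:
SELECT undistribute_table('orders');
```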
2023-05-26 17:30:05 +03:00
Onur Tirtir 246b054a7d
Add support for schema-based-sharding via a GUC (#6866)
DESCRIPTION: Adds citus.enable_schema_based_sharding GUC that allows
sharding the database based on schemas when enabled.

* Refactor the logic that automatically creates Citus managed tables 

* Refactor CreateSingleShardTable() to allow specifying colocation id instead

* Add support for schema-based-sharding via a GUC

### What this PR is about:
Add **citus.enable_schema_based_sharding GUC** to enable schema-based
sharding. Each schema created while this GUC is ON will be considered
a tenant schema. Later on, regardless of whether the GUC is ON or
OFF, any table created in a tenant schema will be converted to a
single shard distributed table (without a shard key). All the tenant
tables that belong to a particular schema will be colocated with each
other and will have a shard count of 1.
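A minimal sketch of that flow (schema and table names are hypothetical):

```sql
SET citus.enable_schema_based_sharding TO ON;
CREATE SCHEMA tenant_1;  -- registered as a tenant schema

-- Regardless of the GUC's current value, tables created in the tenant
-- schema become single shard distributed tables, colocated with each other:
CREATE TABLE tenant_1.users (id bigint PRIMARY KEY);
CREATE TABLE tenant_1.orders (id bigint, user_id bigint);
```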

We introduce a new metadata table --pg_dist_tenant_schema-- to do the
bookkeeping for tenant schemas:
```sql
psql> \d pg_dist_tenant_schema
          Table "pg_catalog.pg_dist_tenant_schema"
┌───────────────┬─────────┬───────────┬──────────┬─────────┐
│    Column     │  Type   │ Collation │ Nullable │ Default │
├───────────────┼─────────┼───────────┼──────────┼─────────┤
│ schemaid      │ oid     │           │ not null │         │
│ colocationid  │ integer │           │ not null │         │
└───────────────┴─────────┴───────────┴──────────┴─────────┘
Indexes:
    "pg_dist_tenant_schema_pkey" PRIMARY KEY, btree (schemaid)
    "pg_dist_tenant_schema_unique_colocationid_index" UNIQUE, btree (colocationid)

psql> table pg_dist_tenant_schema;
┌───────────┬───────────────┐
│ schemaid  │ colocationid  │
├───────────┼───────────────┤
│     41963 │            91 │
│     41962 │            90 │
└───────────┴───────────────┘
(2 rows)
```

The colocationid column of pg_dist_tenant_schema can never be NULL, even
for tenant schemas that don't have a tenant table yet. This is
because we assign colocation ids to tenant schemas as soon as they
are created. That way, we can keep associating tenant schemas with
particular colocation groups even if all the tenant tables of a tenant
schema are dropped and recreated later on.

When a tenant schema is dropped, we delete the corresponding row from
pg_dist_tenant_schema. In that case, we delete the corresponding
colocation group from pg_dist_colocation as well.

### Future work for 12.0 release:
We're building schema-based sharding on top of the infrastructure that
adds support for creating distributed tables without a shard key
(https://github.com/citusdata/citus/pull/6867).
However, not all the operations that can be done on distributed tables
without a shard key necessarily make sense (in the same way) in the
context of schema-based sharding. For example, we need to think about
what happens if a user attempts to alter the schema of a tenant table.
We will tackle such scenarios in a future PR.

We will also add a new UDF --citus.schema_tenant_set() or such-- to
allow users to use an existing schema as a tenant schema, and another
one --citus.schema_tenant_unset() or such-- to stop using a schema as
a tenant schema in future PRs.
2023-05-26 10:49:58 +03:00
Halil Ozan Akgül 2c7beee562
Fix citus.tenant_stats_limit test by setting it to 2 (#6899)
citus.tenant_stats_limit was set to 2 when we were adding tests for it.
Then we changed it to 10, making the tests incorrect.
This PR fixes that without breaking other tests.
2023-05-23 17:44:07 +03:00
Jelte Fennema 350a0f6417
Support running Citus upgrade tests with run_test.py (#6832)
Citus upgrade tests require some additional logic to run, because we
have a before and after schedule and we need to swap the Citus
version in-between. This adds that logic to `run_test.py`.

In passing this makes running upgrade tests locally multiple times
faster by caching tarballs.
2023-05-23 14:38:54 +02:00
Emel Şimşek 02f815ce1f
Disable local execution when Explain Analyze is requested for a query. (#6892)
DESCRIPTION: Fixes a crash when explain analyze is requested for a query
that is normally locally executed.

When explain analyze is requested for a query, a task with two queries
is created. Those two queries are:

1. Wrapped Query --> `SELECT ... FROM worker_save_query_explain_analyze(<query>, <explain analyze options>)`
2. Fetch Query --> `SELECT explain_analyze_output, execution_duration FROM worker_last_saved_explain_analyze();`

When the query is locally executed, a task with multiple queries causes a
crash in production. See the Assert at
57455dc64d/src/backend/distributed/executor/tuple_destination.c#:~:text=Assert(task%2D%3EqueryCount%20%3D%3D%201)%3B

This becomes a critical issue when the auto_explain extension is used:
when auto_explain is enabled, explain analyze is automatically
requested for every query.
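For example, a hedged sketch of an auto_explain setup that would hit this path (the final query is hypothetical):

```sql
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;  -- explain every statement
SET auto_explain.log_analyze = true;    -- implicitly requests EXPLAIN ANALYZE

-- A query that Citus would normally execute locally; with the settings
-- above it gets a two-query task and reaches the Assert mentioned above.
SELECT * FROM some_local_shard_table WHERE id = 1;
```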

One possible solution could be not to create two queries for a locally
executed query. The fetch part may not have to be a query since the
values are available in local variables.

Until we enable local execution for explain analyze, it is best to
disable local execution.

Fixes #6777.
2023-05-23 14:33:22 +03:00
Emel Şimşek f9a5be59b9
Run replicate_reference_tables background task as superuser. (#6930)
DESCRIPTION: Fixes a bug in background shard rebalancer where the
replicate reference tables task fails if the current user is not a
superuser.

This change is to be backported to earlier releases. We should fix the
permissions for replicate_reference_tables on main branch such that it
can be run by non-superuser roles.

Fixes #6925.
Fixes #6926.
2023-05-18 23:46:32 +03:00
Hanefi Onaldi 6a83290d91
Add ORDER BY clauses to some flaky tests (#6931)
I observed a flaky test output
[here](https://app.circleci.com/pipelines/github/citusdata/citus/32692/workflows/32464a22-7fd6-440a-9ff7-cfa62f9ff58a/jobs/1126144)
and added `ORDER BY` clauses to similar queries in the failing test
file.

```diff
 SELECT pg_identify_object_as_address(classid, objid, objsubid) from pg_catalog.pg_dist_object where objid IN('viewsc.prop_view3'::regclass::oid, 'viewsc.prop_view4'::regclass::oid);
   pg_identify_object_as_address  
 ---------------------------------
- (view,"{viewsc,prop_view3}",{})
  (view,"{viewsc,prop_view4}",{})
+ (view,"{viewsc,prop_view3}",{})
 (2 rows)
```
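A sketch of the fix pattern, ordering by the function output so that the row order is deterministic:

```sql
SELECT pg_identify_object_as_address(classid, objid, objsubid)
FROM pg_catalog.pg_dist_object
WHERE objid IN ('viewsc.prop_view3'::regclass::oid, 'viewsc.prop_view4'::regclass::oid)
ORDER BY 1;
```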
2023-05-18 12:45:39 +03:00
Onur Tirtir 8ff9dde4b3
Prevent pushing down INSERT .. SELECT queries that we shouldn't (and allow some more) (#6752)
Previously, the INSERT .. SELECT planner was pushing down some queries that should not be pushed down, due to wrong colocation checks: it only checked whether one of the tables in the SELECT part was colocated with the target table. Now, we check colocation between all tables in the SELECT part and the target table.

Another problem with the INSERT .. SELECT planner was that some queries that were valid to push down were not pushed down, due to unnecessary checks for constructs that are actually supported (e.g., the UNION check). As a solution, we reused the pushdown planner checks for the INSERT .. SELECT planner.
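A hedged illustration with hypothetical hash distributed tables; pushdown is only safe when every distributed table in the SELECT part is colocated with the target:

```sql
INSERT INTO target_dist (key, value)
SELECT a.key, b.value
FROM dist_a a
JOIN dist_b b USING (key);
-- Previously only one of dist_a / dist_b was checked against target_dist;
-- now colocation is verified for all of them.
```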


DESCRIPTION: Fixes a bug that causes incorrectly pushing down some
INSERT .. SELECT queries that we shouldn't
DESCRIPTION: Prevents unnecessarily pulling the data into coordinator
for some INSERT .. SELECT queries
DESCRIPTION: Drops support for pushing down INSERT .. SELECT with append
table as target

Fixes #6749.
Fixes #1428.
Fixes #6920.

---------

Co-authored-by: aykutbozkurt <aykut.bozkurt1995@gmail.com>
2023-05-17 15:05:08 +03:00
Onur Tirtir 56d217b108
Mark objects as distributed even when pg_dist_node is empty (#6900)
We mark objects as distributed in Citus metadata only if we need to
propagate the command that creates them to worker nodes. For this
reason, we were not doing this for objects that are created while
pg_dist_node is empty.

One implication of doing so is that we defer schema propagation to the
time when the user creates the first distributed table in the schema.
However, this doesn't help for schema-based sharding (#6866) because we
want to sync pg_dist_tenant_schema to the worker nodes even for empty
schemas.

* Support test dependencies for isolation tests without a schedule

* Comment out a test due to a known issue (#6901)

* Also, reduce the verbosity for some log messages and make some
   tests compatible with run_test.py.
2023-05-16 11:45:42 +03:00
Onur Tirtir e7abde7e81
Prevent downgrades when there is a single-shard table in the cluster (#6908)
Also add a few tests for Citus/PG upgrade/downgrade scenarios.
2023-05-16 09:44:28 +02:00
Onur Tirtir 893ed416f1
Disable citus.enable_non_colocated_router_query_pushdown by default (#6909)
Fixes #6779.

DESCRIPTION: Disables citus.enable_non_colocated_router_query_pushdown
GUC by default to ensure generating a consistent distributed plan for
the queries that reference non-colocated distributed tables

We already have tests for the cases where this GUC is disabled,
so I'm not adding any more tests in this PR.

Also make multi_insert_select_window idempotent.

Related to: #6793
2023-05-15 12:07:50 +03:00
Jelte Fennema 07b8cd2634
Forward to existing emit_log_hook in our log hook (#6877)
DESCRIPTION: Forward to existing emit_log_hook in our log hook

This makes us work better with other extensions installed in Postgres.
Without this change we would overwrite their emit_log_hook, causing it
to never be called.

Fixes #6874
2023-05-09 16:55:56 +02:00
Ivan Kush e3c6b8a10e
Fix flaky columnar_permissions test (#6913)
Because attr_num isn't part of the ORDER BY, the row order may be random
and the regression test may fail.
This MR adds attr_num to the ORDER BY clause.


```
--- /build/contrib/citus/src/test/regress/expected/columnar_permissions.out.modified    2023-05-05 11:13:44.926085432 +0000
+++ /build/contrib/citus/src/test/regress/results/columnar_permissions.out.modified 2023-05-05 11:13:44.934085414 +0000
@@ -124,24 +124,24 @@
   from columnar.chunk
   where relation in ('no_access'::regclass, 'columnar_permissions'::regclass)
   order by relation, stripe_num;
        relation       | stripe_num | attr_num | chunk_group_num | value_count
 ----------------------+------------+----------+-----------------+-------------
  no_access            |          1 |        1 |               0 |           1
  no_access            |          2 |        1 |               0 |           1
  no_access            |          3 |        1 |               0 |           1
  columnar_permissions |          1 |        1 |               0 |           1
  columnar_permissions |          1 |        2 |               0 |           1
- columnar_permissions |          2 |        1 |               0 |           1
  columnar_permissions |          2 |        2 |               0 |           1
- columnar_permissions |          3 |        1 |               0 |           1
+ columnar_permissions |          2 |        1 |               0 |           1
  columnar_permissions |          3 |        2 |               0 |           1
+ columnar_permissions |          3 |        1 |               0 |           1
  columnar_permissions |          4 |        1 |               0 |           1
  columnar_permissions |          4 |        2 |               0 |           1
 (11 rows)
```
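A sketch of the stabilized query, assuming the select list matches the output columns shown above:

```sql
select relation, stripe_num, attr_num, chunk_group_num, value_count
  from columnar.chunk
  where relation in ('no_access'::regclass, 'columnar_permissions'::regclass)
  order by relation, stripe_num, attr_num;
```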

Co-authored-by: Ivan Kush <ivan.kush@tantorlabs.ru>
2023-05-09 12:42:37 +02:00
Hanefi Onaldi 06e6f8e428
Normalize columnar version in tests (#6917)
When we bump the columnar version, some tests fail because of the output
change. Instead of changing those lines every time, I think it is better
to normalize the version in test output.
2023-05-08 16:10:55 +03:00
aykut-bozkurt 73c771d6ed
Update readme for 11.3 (#6903)
Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
Co-authored-by: Jelte Fennema <jelte.fennema@microsoft.com>
2023-05-05 19:08:35 +03:00
Naisila Puka 905fd46410
Fixes flakiness in background_rebalance_parallel test (#6910)
Fixes the following flaky outputs by decreasing the citus_task_wait loop
interval and changing the order of wait commands.

https://app.circleci.com/pipelines/github/citusdata/citus/32102/workflows/19958297-6c7e-49ef-9bc2-8efe8aacb96f/jobs/1089589

``` diff
SELECT job_id, task_id, status, nodes_involved
 FROM pg_dist_background_task WHERE job_id in (:job_id) ORDER BY task_id;
  job_id | task_id |  status  | nodes_involved 
 --------+---------+----------+----------------
   17779 |    1013 | done     | {50,56}
   17779 |    1014 | running  | {50,57}
-  17779 |    1015 | running  | {50,56}
-  17779 |    1016 | blocked  | {50,57}
+  17779 |    1015 | done     | {50,56}
+  17779 |    1016 | running  | {50,57}
   17779 |    1017 | runnable | {50,56}
   17779 |    1018 | blocked  | {50,57}
   17779 |    1019 | runnable | {50,56}
   17779 |    1020 | blocked  | {50,57}
 (8 rows)
```

https://github.com/citusdata/citus/pull/6893#issuecomment-1525661408
```diff
SELECT job_id, task_id, status, nodes_involved
 FROM pg_dist_background_task WHERE job_id in (:job_id) ORDER BY task_id;
  job_id | task_id |  status  | nodes_involved 
 --------+---------+----------+----------------
   17779 |    1013 | done     | {50,56}
-  17779 |    1014 | running  | {50,57}
+  17779 |    1014 | runnable | {50,57}
   17779 |    1015 | running  | {50,56}
   17779 |    1016 | blocked  | {50,57}
   17779 |    1017 | runnable | {50,56}
   17779 |    1018 | blocked  | {50,57}
   17779 |    1019 | runnable | {50,56}
   17779 |    1020 | blocked  | {50,57}
 (8 rows)
```
2023-05-05 16:47:01 +03:00
Hanefi Onaldi 3217e3f181
Fix flaky background rebalance parallel test (#6893)
A test in background_rebalance_parallel.sql was failing intermittently
because the order of tasks in the output was not deterministic. This
commit fixes the test by removing the id columns for the background tasks
from the output.

A sample failing diff before this patch is below:

```diff
 SELECT D.task_id,
        (SELECT T.command FROM pg_dist_background_task T WHERE T.task_id = D.task_id),
        D.depends_on,
        (SELECT T.command FROM pg_dist_background_task T WHERE T.task_id = D.depends_on)
 FROM pg_dist_background_task_depend D  WHERE job_id in (:job_id) ORDER BY D.task_id, D.depends_on ASC;
  task_id |                               command                               | depends_on |                               command
 ---------+---------------------------------------------------------------------+------------+---------------------------------------------------------------------
-    1014 | SELECT pg_catalog.citus_move_shard_placement(85674026,50,57,'auto') |       1013 | SELECT pg_catalog.citus_move_shard_placement(85674025,50,56,'auto')
-    1016 | SELECT pg_catalog.citus_move_shard_placement(85674032,50,57,'auto') |       1015 | SELECT pg_catalog.citus_move_shard_placement(85674031,50,56,'auto')
-    1018 | SELECT pg_catalog.citus_move_shard_placement(85674038,50,57,'auto') |       1017 | SELECT pg_catalog.citus_move_shard_placement(85674037,50,56,'auto')
-    1020 | SELECT pg_catalog.citus_move_shard_placement(85674044,50,57,'auto') |       1019 | SELECT pg_catalog.citus_move_shard_placement(85674043,50,56,'auto')
+    1014 | SELECT pg_catalog.citus_move_shard_placement(85674038,50,57,'auto') |       1013 | SELECT pg_catalog.citus_move_shard_placement(85674037,50,56,'auto')
+    1016 | SELECT pg_catalog.citus_move_shard_placement(85674044,50,57,'auto') |       1015 | SELECT pg_catalog.citus_move_shard_placement(85674043,50,56,'auto')
+    1018 | SELECT pg_catalog.citus_move_shard_placement(85674026,50,57,'auto') |       1017 | SELECT pg_catalog.citus_move_shard_placement(85674025,50,56,'auto')
+    1020 | SELECT pg_catalog.citus_move_shard_placement(85674032,50,57,'auto') |       1019 | SELECT pg_catalog.citus_move_shard_placement(85674031,50,56,'auto')
 (4 rows)
```

Notice that the dependent and depended-on tasks have the same commands,
but the task ids differ between runs.
2023-05-05 12:07:46 +03:00
Teja Mupparti b58665773b Move all pre-15-defined routines to the bottom of the file 2023-05-04 10:07:08 -07:00
Naisila Puka 072ae44742
Adjusts query's CoerceViaIO & RelabelType nodes that are improper for deparsing (#6391)
Adjusts query's CoerceViaIO & RelabelType nodes that are
improper for deparsing

The standard planner converts some `::text` casts to `::cstring`, and
here we convert them back because `cstring` is a pseudotype and cannot be
cast to most types. This problem occurs in CoerceViaIO nodes.
There was another problem with RelabelType nodes, fixed in the following
PR:
https://github.com/citusdata/citus/pull/4580
We undo the changes in that PR, and fix both CoerceViaIO and RelabelType
nodes in the planning phase (not in the deparsing phase in ruleutils).

Fixes https://github.com/citusdata/citus/issues/5646
Fixes https://github.com/citusdata/citus/issues/5033
Fixes https://github.com/citusdata/citus/issues/6061
2023-05-04 16:46:02 +03:00
Önder Kalacı 1662694471
Update CHANGELOG.md (#6907)
Change `citus_stats_tenants` to `citus_stat_tenants`

Thanks @clairegiordano for noticing
2023-05-04 11:45:02 +03:00
Onur Tirtir aeaa48c197
Add support for creating distributed tables without shard key [merging the main devel branch] (#6867)
DESCRIPTION: Adds support for creating distributed tables without shard
key

Commits proposed in this PR have already been reviewed in other PRs,
noted for each commit.

With this PR, we allow creating distributed tables without
specifying a shard key via create_distributed_table(). Here are the
important details about those tables:
* Specifying `shard_count` is not allowed because it is assumed to be 1.
* We mostly call such tables "single-shard" distributed tables in code
  / comments.
* The `colocate_with` param allows colocating such single-shard tables
  with each other.
* We define this table type, i.e., SINGLE_SHARD_DISTRIBUTED, as a
  subclass of DISTRIBUTED_TABLE because we mostly want to treat them as
  distributed tables in terms of SQL / DDL / operation support.
* Metadata for such tables looks like:
  - distribution method => DISTRIBUTE_BY_NONE
  - replication model => REPLICATION_MODEL_STREAMING
  - colocation id => **!=** INVALID_COLOCATION_ID (distinguishes them
    from Citus local tables)
* We assign colocation groups for such tables to different nodes in a
  round-robin fashion based on the modulo of "colocation id".
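For illustration, a minimal sketch of creating and colocating such tables (table names are hypothetical):

```sql
CREATE TABLE events (id bigint, payload jsonb);
SELECT create_distributed_table('events', null);  -- no shard key; shard count is 1

CREATE TABLE event_metadata (id bigint, note text);
SELECT create_distributed_table('event_metadata', null,
                                colocate_with => 'events');  -- colocate the two
```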

There is still more work to be done, such as improving SQL support,
making sure that Citus operations work well with such distributed
tables, and making sure that the latest features merged in 11.3 / 12.0
(such as CDC) work fine. We will take care of them in subsequent PRs.

In this release, we will build schema-based sharding on top of this
infrastructure. And it's likely that we will use this infra for some
other nice features in the future too.
2023-05-03 17:15:22 +03:00
Ahmet Gedemenli 4321286005 Disable master_create_empty_shard udf for single shard tables (#6902) 2023-05-03 17:02:43 +03:00
Onur Tirtir db2514ef78 Call null-shard-key tables as single-shard distributed tables in code 2023-05-03 17:02:43 +03:00
Onur Tirtir 39b7711527 Add support for more pushable / non-pushable insert .. select queries with null-shard-key tables (#6823)
* Add support for distributed INSERT .. SELECT by selecting from a
  reference table.

  This was the only pushable INSERT .. SELECT case that
  #6773 didn't cover.

* For the cases where we insert into a Citus table but the INSERT ..
  SELECT query cannot be pushed down, allow pull-to-coordinator when
  possible.

  Remove the checks that we had at the very beginning of
  CreateInsertSelectPlanInternal so that we can try INSERT .. SELECT via
  pull-to-coordinator for the cases where we cannot push down the query.
  What we support via pull-to-coordinator is still limited due to lack
  of logical planner support for SELECT queries, but this commit at
  least allows using pull-to-coordinator for the cases where the SELECT
  query can be planned via the router planner, without limiting
  ourselves to restrictive top-level checks.

  Also introduce some additional restrictions into
  CreateDistributedInsertSelectPlan for the checks it was missing for
  null-shard-key tables. Indeed, it would make more sense to have those
  checks for distributed tables in general, via separate PRs against the
  main branch. See https://github.com/citusdata/citus/pull/6817.

* Add support for inserting into a Postgres table.
2023-05-03 16:24:20 +03:00
Onur Tirtir 85745b46d5 Add initial sql support for distributed tables that don't have a shard key (#6773/#6822)
Enable the router planner and a limited version of the INSERT .. SELECT
planner for queries that reference colocated null shard key tables.

* SELECT / UPDATE / DELETE / MERGE is supported as long as it's a router
  query.
* INSERT .. SELECT is supported as long as it only references colocated
  null shard key tables.
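A hedged sketch of the query shapes this enables, assuming hypothetical colocated null shard key tables t1(a, b) and t2(a, b):

```sql
SELECT t1.a, t2.b
FROM t1 JOIN t2 USING (a);            -- router SELECT

UPDATE t1 SET b = b + 1 WHERE a = 5;  -- router UPDATE

INSERT INTO t1 SELECT a, b FROM t2;   -- INSERT .. SELECT between colocated tables
```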

Note that this is not only limited to distributed INSERT .. SELECT but
also covers a limited set of query types that require
pull-to-coordinator, e.g., due to a LIMIT clause, generate_series(),
etc. (Ideally distributed INSERT .. SELECT could handle such queries
too, e.g., when we're only referencing tables that don't have a shard
key, but today this is not the case. See
https://github.com/citusdata/citus/pull/6773#discussion_r1140130562.)
2023-05-03 16:24:20 +03:00
Onur Tirtir ac0ffc9839 Add a config for arbitrary config tests where all the tables are null-shard-key tables (#6783/#6788) 2023-05-03 16:18:27 +03:00
Ahmet Gedemenli cdf54ff4b1 Add DDL support for null-shard-key tables (#6778/#6784/#6787/#6859)
Add tests for ddl coverage:
* indexes
* partitioned tables + indexes with long names
* triggers
* foreign keys
* statistics
* grant & revoke statements
* truncate & vacuum
* create/test/drop view that depends on a dist table with no shard key
* policy & rls test

* alter table add/drop/alter_type column (using sequences/different data
  types/identity columns)
* alter table add constraint (not null, check, exclusion constraint)
* alter table add column with a default value / set default / drop
  default
* alter table set option (autovacuum)

* indexes / constraints without names
* multiple subcommands

Adds support for:
* Creating new partitions after distributing the parent table (with null
  key)
* Attaching partitions to a distributed table with a null distribution
  key (and automatically distributing the new partition with a null key
  as well)
* Detaching partitions from it
2023-05-03 16:18:27 +03:00
Onur Tirtir fa467e05e7 Add support for creating distributed tables with a null shard key (#6745)
With this PR, we allow creating distributed tables without
specifying a shard key via create_distributed_table(). Here are the
important details about those tables:
* Specifying `shard_count` is not allowed because it is assumed to be 1.
* We mostly call such tables "null shard-key" tables in code / comments.
* To avoid a breaking layout change in create_distributed_table(),
  instead of throwing an error it informs the user that the
  `distribution_type` param is ignored unless it's explicitly set to
  NULL or 'h'.
* The `colocate_with` param allows colocating such null shard-key tables
  with each other.
* We define this table type, i.e., NULL_SHARD_KEY_TABLE, as a subclass
  of DISTRIBUTED_TABLE because we mostly want to treat them as
  distributed tables in terms of SQL / DDL / operation support.
* Metadata for such tables looks like:
  - distribution method => DISTRIBUTE_BY_NONE
  - replication model => REPLICATION_MODEL_STREAMING
  - colocation id => **!=** INVALID_COLOCATION_ID (distinguishes them
    from Citus local tables)
* We assign colocation groups for such tables to different nodes in a
  round-robin fashion based on the modulo of "colocation id".

Note that this PR doesn't care about DDL (except CREATE TABLE) / SQL /
operation (i.e., Citus UDFs) support for such tables but adds a
preliminary API.
2023-05-03 16:18:27 +03:00
aykut-bozkurt 2d005ac777
Query Generator Seed (#6883)
- Give a seed number as an argument to the query generator to reproduce
  a previous run.
- Expose the difference between results, if any, as an artifact on CI.
2023-05-03 15:54:11 +03:00
Teja Mupparti e444dd4f3f MERGE: Support reference table as source with local table as target 2023-05-02 11:37:29 -07:00
Hanefi Onaldi efd41e8ea5
Bump columnar to 11.3 (#6898)
When working on changelog, Marco suggested in
https://github.com/citusdata/citus/pull/6856#pullrequestreview-1386601215
that we should bump columnar version to 11.3 as well.

This PR aims to contain all the necessary changes to allow upgrades to
and downgrades from 11.3.0 for columnar. Note that updating citus
extension version does not affect columnar as the two extension versions
are not really coupled.

The same changes will also be applied to the release branch in
https://github.com/citusdata/citus/pull/6897
2023-05-02 11:58:32 +03:00
Hanefi Onaldi 934430003e
Changelog entries for 11.3.0 (#6856)
In this release, I tried something different. I experimented with adding
the PR number and title to the changelog right before each changelog
entry. This way, it is easier to track where a particular changelog
entry comes from. After reviews are over, I plan to remove those lines
with PR numbers and titles.

I went through all the PRs that were merged after the 11.2.0 release and came
up with a list of PRs that may need help with changelog entries. You can
see details on PRs grouped in several sections below.

## PRs with missing entries

The following PRs below do not have a changelog entry. If you think that
this is a mistake, please share it in this PR along with a suggestion on
what the changelog item should be.

PR #6846 : fix 3 flaky tests in failure schedule
PR #6844 : Add CPU usage to citus_stat_tenants
PR #6833 : Fix citus_stat_tenants period updating bug
PR #6787 : Add more tests for ddl coverage
PR #6842 : Add build-cdc-* temporary directories to .gitignore
PR #6841 : Add build-cdc-* temporary directories to .gitignore
PR #6840 : Bump Citus to 12.0devel
PR #6824 : Fixes flakiness in multi_metadata_sync test
PR #6811 : Backport identity column improvements to v11.2
PR #6830 : In run_test.py actually return worker_count
PR #6825 : Fixes flakiness in multi_cluster_management test
PR #6816 : Refactor run_test.py
PR #6817 : Explicitly disallow local rels when inserting into dist table
PR #6821 : Rename citus stats tenants
PR #6822 : Add some more tests for initial sql support
PR #6819 : Fix flakyness in
citus_split_shard_by_split_points_deferred_drop
PR #6814 : Make python-regress based tests runnable with run_test.py
PR #6813 : Fix flaky multi_mx_schema_support test
PR #6720 : Convert columnar tap tests to pytest
PR #6812 : Revoke statistics permissions from public and grant them to
pg_monitor
PR #6769 : Citus stats tenants guc
PR #6807 : Fix the incorrect (constant) value passed to pointer-to-bool
parameter, pass a NULL as the value is not used
PR #6797 : Attribute local queries and cached plans on local execution
PR #6796 : Parse the annotation string correctly
PR #6762 : Add logs to citus_stats_tenants
PR #6773 : Add initial sql support for distributed tables that don't
have a shard key
PR #6792 : Disentangle MERGE planning code from the modify-planning code
path
PR #6761 : Citus stats tenants collector view
PR #6791 : Make 8 more tests runnable multiple times via run_test.py
PR #6786 : Refactor some of the planning code to accommodate a new
planning path for MERGE SQL
PR #6789 : Rename AllRelations.. functions to AllDistributedRelations..
PR #6788 : Actually skip arbitrary_configs_router & nested_execution for
AllNullDistKeyDefaultConfig
PR #6783 : Add a config for arbitrary config tests where all the tables
are null-shard-key tables
PR #6784 : Fix attach partition: citus local to null distributed
PR #6782 : Add an arbitrary config test heavily based on
multi_router_planner_fast_path.sql
PR #6781 : Decide what to do with router planner error at one place
PR #6778 : Support partitioning for dist tables with null dist keys
PR #6766 : fix pip lock file
PR #6764 : Make workerCount configurable for regression tests
PR #6745 : Add support for creating distributed tables with a null shard
key
PR #6696 : This implements MERGE phase-III
PR #6767 : Add pytest depedencies to Pipfile
PR #6760 : Decide core distribution params in CreateCitusTable
PR #6759 : Add multi_create_fdw into minimal_schedule
PR #6743 : Replace CITUS_TABLE_WITH_NO_DIST_KEY checks with
HasDistributionKey()
PR #6751 : Stabilize single_node.sql and others that report illegal node
removal
PR #6742 : Refactor CreateDistributedTable()
PR #6747 : Remove unused lock functions
PR #6744 : Fix multiple output version arbitrary config tests
PR #6741 : Stabilize single node tests
PR #6740 : Fix string eval bug in migration files check
PR #6736 : Make run_test.py and create_test.py importable without errors
PR #6734 : Don't blanket ignore flake8 E402 error
PR #6737 : Fixes bookworm packaging pipeline problem
PR #6735 : Fix run_test.py on python 3.9
PR #6733 : MERGE: In deparser, add missing check for RETURNING clause.
PR #6714 : Remove auto_explain workaround in citus explain hook for
ALTER TABLE
PR #6719 : Fix flaky test
PR #6718 : Add more powerfull dependency tracking to run_test.py
PR #6710 : Install non-vulnerable cryptography package
PR #6711 : Support compilation and run tests on latest PG versions
PR #6700 : Add auto-formatting and linting to our python code
PR #6707 : Allow multi_insert_select to run repeatably
PR #6708 : Fix flakyness in failure_create_distributed_table_non_empty
PR #6698 : Miscellaneous cleanup
PR #6704 : Update README for 11.2
PR #6703 : Fix dubious ownership error from git
PR #6690 : Bump Citus to 11.3devel

## Too long changelog entries

The following PRs have changelog entries that are too long to fit in a
single line. I'd expect authors to supply changelog entries in
`DESCRIPTION:` lines that are at most 78 characters. If you want to
supply multi-line changelog items, you can have multiple lines that
start with `DESCRIPTION:` instead.

PR #6837 : fixes update propagation bug when
`citus_set_coordinator_host` is called more than once
PR #6738 :  Identity column implementation refactorings
PR #6756 : Schedule parallel shard moves in background rebalancer by
removing task dependencies between shard moves across colocation groups.
PR #6793 : Add a GUC to disallow planning the queries that reference
non-colocated tables via router planner
PR #6726 : fix memory leak during altering distributed table with a lot
of partition and shards
PR #6722 : fix memory leak during distribution of a table with a lot of
partitions
PR #6693 : prevent memory leak during ConvertTable with a lot of
partitions

## Empty changelog entries

The following PRs have an empty `DESCRIPTION:` line. This generates an
empty changelog line that needs to be removed manually. Please either
provide a short entry, or remove the `DESCRIPTION:` line completely.

PR #6810 : Make CDC decoder an independent extension
PR #6827 : Makefile changes to build CDC in builddir for pgoutput and
wal2json.

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2023-05-02 11:29:24 +03:00
Ahmet Gedemenli 59ccf364df
Ignore nodes not allowed for shards, when planning rebalance steps (#6887)
We handle colocation groups whose shard group count is less than the
worker node count using a method different from the usual rebalancer
(see #6739).
While deciding whether to use this method, we should've ignored the
nodes that are marked `shouldhaveshards = false`. This PR excludes those
nodes when making the decision.

Adds a test such that:
 coordinator: []
 worker 1: [1_1, 1_2]
 worker 2: [2_1, 2_2]
(rebalance)
 coordinator: []
 worker 1: [1_1, 2_1]
 worker 2: [1_2, 2_2]

If we take the coordinator into account, the rebalancer considers the
first state balanced and does nothing (because shard_count <
worker_count). But with this PR, we ignore the coordinator because it
has shouldhaveshards = false, so the rebalancer distributes each
colocation group to both workers.
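A hedged sketch of that setup (host name and port are hypothetical):

```sql
-- Keep the coordinator in the metadata but exclude it from shard placement:
SELECT citus_set_node_property('coordinator-host', 5432, 'shouldhaveshards', false);

-- The rebalancer then ignores the coordinator when counting eligible nodes:
SELECT citus_rebalance_start();
```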

Also fixes an unrelated flaky test in the same file.
2023-05-01 12:21:08 +02:00
aykut-bozkurt 8cb69cfd13
break sequence dependency during table creation (#6889)
We need to break the sequence dependency for a table while creating it
during non-transactional metadata sync, to ensure idempotency of table
creation.

**Problem:**
When we send `SELECT
pg_catalog.worker_drop_sequence_dependency(logicalrelid::regclass::text)
FROM pg_dist_partition` to workers during the non-transactional sync,
the table might not be in `pg_dist_partition` on the worker yet, so the
sequence dependency is not broken there.

**Solution:** 
We break the sequence dependency via `SELECT
pg_catalog.worker_drop_sequence_dependency(logicalrelid::regclass::text)`
for each table while creating it at the workers. It is safe to send
since the UDF is a no-op when there is no sequence dependency.

DESCRIPTION: Fixes a bug related to sequence idempotency at
non-transactional sync.

Fixes https://github.com/citusdata/citus/issues/6888.
2023-04-28 15:09:09 +03:00
Hanefi Onaldi 135aaf45ca
Add missing entry for 10.0.8 (#6891)
When creating tags for backport releases, I realized that I missed one
changelog item. Adding it on the default branch in this commit. See
#6885 for the relevant PR for the release branch.
2023-04-27 16:01:04 +03:00
aykut-bozkurt a7fa1db696
fix flaky test regex (#6890)
There was a bug related to the regex: we sometimes caught the wrong line
when the test name also appeared in a comment.
Example: we caught the wrong line below, as multi_metadata_sync appears
in the comment before the test line.

```
# ----------
# multi_metadata_sync tests the propagation of mx-related metadata changes to metadata workers
# multi_unsupported_worker_operations tests that unsupported operations error out on metadata workers
# ----------
test: multi_metadata_sync
```

Solution: Make the regex rule more restrictive.
2023-04-27 13:14:40 +03:00
Hanefi Onaldi 5fc5931506
Skip some versions on changelog (#6882)
We had 10.1.5, 10.0.7, and 9.5.11 in the changelog, but those versions
are already used in the enterprise repository. This commit skips those
versions and uses 10.1.6, 10.0.8, and 9.5.12 instead to prevent clashes.
2023-04-26 12:05:27 +03:00