Commit Graph

6545 Commits (multi-tenant-monitoring-annotate-local-queries)

Author SHA1 Message Date
Onur Tirtir 4960ced175
Add an arbitrary config test heavily based on multi_router_planner_fast_path.sql (#6782)
This would be useful for testing #6773. This is because, given that
#6773
only adds support for router / fast-path queries, theoretically almost
all
the tests that we have in that test file should work for null-shard-key
tables too (and they indeed do).

I deliberately did not replace multi_router_planner_fast_path.sql with
the one that I'm adding into arbitrary configs because we might still
want to see when we're able to go through fast-path planning for the
usual distributed tables (the ones that have a shard key).
2023-03-22 10:49:08 +03:00
Ahmet Gedemenli 2713e015d6
Check before logicalrep for rebalancer, error if needed (#6754)
DESCRIPTION: Check before logicalrep for rebalancer, error if needed

Check if we can use logical replication or not, in case of shard
transfer mode = auto, before executing the shard moves. If we can't,
error out. Before this PR, we used to error out in the middle of shard
moves:
```sql
set citus.shard_count = 4; -- just to get the error sooner
select citus_remove_node('localhost',9702);

create table t1 (a int primary key);
select create_distributed_table('t1','a');
create table t2 (a bigint);
select create_distributed_table('t2','a');

select citus_add_node('localhost',9702);
select rebalance_table_shards();
NOTICE:  Moving shard 102008 from localhost:9701 to localhost:9702 ...
NOTICE:  Moving shard 102009 from localhost:9701 to localhost:9702 ...
NOTICE:  Moving shard 102012 from localhost:9701 to localhost:9702 ...
ERROR:  cannot use logical replication to transfer shards of the relation t2 since it doesn't have a REPLICA IDENTITY or PRIMARY KEY
```

Now we check and error out in the beginning, without moving the shards.

fixes: #6727
2023-03-21 16:34:52 +03:00
Onur Tirtir aa465b6de1
Decide what to do with router planner error at one place (#6781) 2023-03-21 14:04:07 +03:00
Gokhan Gulbiz 852db8f2f9
Rewrite ExtractTopComment by using strstr() and stringinfo 2023-03-21 12:34:05 +03:00
Gokhan Gulbiz 16c75165cf
Use stringinfo for escaping/unescaping 2023-03-21 11:27:00 +03:00
Gokhan Gulbiz 26226d0eb4
Update comment
Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2023-03-21 11:22:29 +03:00
Gokhan Gulbiz 3f61a62420
Add an additional comment
Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2023-03-21 11:22:08 +03:00
aykut-bozkurt aa33988c6e
fix pip lock file (#6766)
ci/fix_styles.sh were complaining about `black` and `isort` packages are
not found even if I `pipenv install --dev` due to broken lock file. I
regenerated the lock file and now it works fine. We also wanted to
upgrade required python version for the pipfile.
2023-03-21 00:58:12 +03:00
aykut-bozkurt ea3093bdb6
Make workerCount configurable for regression tests (#6764)
Make worker count flexible in our regression tests instead of hardcoding
it to 2 workers.
2023-03-20 12:06:31 +03:00
Gokhan Gulbiz 1df9260f82
Revert "Use text_substr for getting top comment"
This reverts commit 9531cfd3bf.
2023-03-20 11:02:34 +03:00
Gokhan Gulbiz 34e1dafb18
Convert char* to text for text_substr call 2023-03-20 09:40:08 +03:00
Gokhan Gulbiz df3ad5c4f6
Indent 2023-03-20 08:49:39 +03:00
Gokhan Gulbiz f972b6fae9
Handle no comment end chars 2023-03-20 08:46:52 +03:00
Gokhan Gulbiz 9531cfd3bf
Use text_substr for getting top comment 2023-03-20 08:46:22 +03:00
Gokhan Gulbiz 5929fda143
Use palloc instead of malloc 2023-03-20 08:15:11 +03:00
Teja Mupparti cf55136281 1) Restrict MERGE command INSERT to the source's distribution column
Fixes #6672

2) Move all MERGE related routines to a new file merge_planner.c

3) Make ConjunctionContainsColumnFilter() static again, and rearrange the code in MergeQuerySupported()
4) Restore the original format in the comments section.
5) Add big serial test. Implement latest set of comments
2023-03-16 13:43:08 -07:00
Teja Mupparti 1e42cd3da0 Support MERGE on distributed tables with restrictions
This implements the phase - II of MERGE sql support

Support routable query where all the tables in the merge-sql are distributed, co-located, and both the source and
target relations are joined on the distribution column with a constant qual. This should be a Citus single-task
query. Below is an example.

SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => ‘t1’);

MERGE INTO t1
USING s1 ON t1.id = s1.id AND t1.id = 100
WHEN MATCHED THEN
UPDATE SET val = s1.val + 10
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED THEN
INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src)

Basically, MERGE checks to see if

There are a minimum of two distributed tables (source and a target).
All the distributed tables are indeed colocated.
MERGE relations are joined on the distribution column
MERGE .. USING .. ON target.dist_key = source.dist_key
The query should touch only a single shard i.e. JOIN AND with a constant qual
MERGE .. USING .. ON target.dist_key = source.dist_key AND target.dist_key = <>
If any of the conditions are not met, it raises an exception.

(cherry picked from commit 44c387b978)

This implements MERGE phase3

Support pushdown query where all the tables in the merge-sql are Citus-distributed, co-located, and both
the source and target relations are joined on the distribution column. This will generate multiple tasks
which execute independently after pushdown.

SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => ‘t1’);

MERGE INTO t1
USING s1
ON t1.id = s1.id
        WHEN MATCHED THEN
                UPDATE SET val = s1.val + 10
        WHEN MATCHED THEN
                DELETE
        WHEN NOT MATCHED THEN
                INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src)

*The only exception for both the phases II and III is, UPDATEs and INSERTs must be done on the same shard-group
as the joined key; for example, below scenarios are NOT supported as the key-value to be inserted/updated is not
guaranteed to be on the same node as the id distribution-column.

MERGE INTO target t
USING source s ON (t.customer_id = s.customer_id)
WHEN NOT MATCHED THEN - -
     INSERT(customer_id, …) VALUES (<non-local-constant-key-value>, ……);

OR this scenario where we update the distribution column itself

MERGE INTO target t
USING source s On (t.customer_id = s.customer_id)
WHEN MATCHED THEN
     UPDATE SET customer_id = 100;

(cherry picked from commit fa7b8949a8)
2023-03-16 13:43:08 -07:00
Gokhan Gulbiz bee5b6d58f
Indent 2023-03-16 14:52:36 +03:00
Gokhan Gulbiz 7074035307
Indent 2023-03-16 11:31:12 +03:00
Gokhan Gulbiz de527dbb71
Refactoring to reduce nesting 2023-03-16 11:02:52 +03:00
Gokhan Gulbiz ae276e70ab
Remove unnecessary check 2023-03-16 11:00:37 +03:00
Gokhan Gulbiz c8ed4d174a
Renamings 2023-03-16 10:54:07 +03:00
Gokhan Gulbiz 05f90b30d3
Set INVALID_COLOCATION_ID if colocationId doesn't exist in the annotation. 2023-03-16 10:40:02 +03:00
Jelte Fennema b8b85072d6
Add pytest depedencies to Pipfile (#6767)
In #6720 I'm adding a `pytest` based testing framework. This adds the
dependencies for those. They have already been [merged into our docker
files][the-process-merge] in the the-process repo preparation for #6720.
But by not having them on our citus main branch it is impossible to
make changes to the Pipfile, because our CI Dockerfiles and master
are out of date.

Since #6720 will need some more discussion and might take a few more
weeks to be merged, this takes out the Pipfile changes. By merging this
PR we can unblock new Pipfile changes.

Unblocks and partially addresses #6766 

[the-process-merge]: https://github.com/citusdata/the-process/pull/117
2023-03-15 14:53:14 +01:00
Gokhan Gulbiz 174f9574a1
Indent 2023-03-15 13:35:28 +03:00
Gokhan Gulbiz 7c5a724b3b
Merge branch 'multi-tenant-monitoring-annotation-parsing' into multi-tenant-monitoring-annotate-local-queries 2023-03-15 09:19:59 +03:00
Gokhan Gulbiz 57eaeb2ecb
Indent 2023-03-15 09:01:41 +03:00
Gokhan Gulbiz 5cd3c8ad11
Indent 2023-03-15 08:36:43 +03:00
Gokhan Gulbiz 23f4406d7f
Merge remote-tracking branch 'upstream/multi-tenant-monitoring' into multi-tenant-monitoring-annotation-parsing 2023-03-15 08:18:21 +03:00
Gokhan Gulbiz 44fec55d58
Escape/Unescape sql comment chars 2023-03-15 08:11:07 +03:00
Gokhan Gulbiz 230a189c1f
Minor renamings and refactorings 2023-03-15 08:10:45 +03:00
Gokhan Gulbiz 5cedee7a7d
Add comment chars escaping 2023-03-15 08:09:24 +03:00
Gokhan Gulbiz 3a435e7c14
Fix attribute prefix 2023-03-15 08:08:33 +03:00
Onur Tirtir a0a41943d7
Remove pg_depend entries from columnar metadata indexes to columnar-am (inserted in #5456) (#6628)
DESCRIPTION: Fixes (pg_dump/pg_upgrade) dependency loop warnings caused
by pg_depend entries inserted by citus_columnar

Fixes #5510.

In the past, having columnar tables in the cluster was causing pg
upgrades to fail when attempting to access columnar metadata. This is
because, pg_dump doesn't see objects that we use for columnar-am related
booking as the dependencies of the tables using columnar-am.
To fix that; in #5456, we inserted some "normal dependency" edges (from
those objects to columnar-am) into pg_depend.

This helped us ensuring the existency of a class of metadata objects
--such as columnar.storageid_seq-- and helped fixing #5437.

However, the normal-dependency edges that we added for indexes on
columnar metadata tables --such columnar.stripe_pkey-- didn't help at
all because they were indeed causing dependency loops (#5510) and
pg_dump was not able to take those dependency edges into the account.

For this reason, this commit deletes those dependency edges so that
pg_dump stops complaining about them. Note that it's not critical to
delete those edges from pg_depend since they're not breaking pg upgrades
but were triggering some warning messages. And given that backporting
a sql change into older versions is hard a lot, we skip backporting
this.
2023-03-15 01:24:57 +03:00
Halil Ozan Akgul 33265dd411 Fix indent 2023-03-14 17:44:33 +03:00
Onur Tirtir 9550ebd118 Remove pg_depend entries from columnar metadata indexes to columnar-am
In the past, having columnar tables in the cluster was causing pg
upgrades to fail when attempting to access columnar metadata. This is
because, pg_dump doesn't see objects that we use for columnar-am related
booking as the dependencies of the tables using columnar-am.
To fix that; in #5456, we inserted some "normal dependency" edges (from
those objects to columnar-am) into pg_depend.

This helped us ensuring the existency of a class of metadata objects
--such as columnar.storageid_seq-- and helped fixing #5437.

However, the normal-dependency edges that we added for indexes on
columnar metadata tables --such columnar.stripe_pkey-- didn't help at
all because they were indeed causing dependency loops (#5510) and
pg_dump was not able to take those dependency edges into the account.

For this reason, this commit deletes those dependency edges so that
pg_dump stops complaining about them. Note that it's not critical to
delete those edges from pg_depend since they're not breaking pg upgrades
but were triggering some warning messages. And given that backporting
a sql change into older versions is hard a lot, we skip backporting
this.
2023-03-14 17:13:52 +03:00
Onur Tirtir be0735a329 Use "cpp" to expand "#include" directives in columnar sql files 2023-03-14 17:13:52 +03:00
Onur Tirtir 2b4be535de Do clean-up before upgrade_columnar_before to make it runnable multiple times
So that flaky test detector can run upgrade_columnar_before.sql multiple
times.
2023-03-14 17:13:52 +03:00
Onur Tirtir 994f67185f Make upgrade_columnar_after runnable multiple times
This commit hides port numbers in upgrade_columnar_after because the
port numbers assigned to nodes in upgrade schedule differ from the ones
that flaky test detector assigns.
2023-03-14 17:13:52 +03:00
Onur Tirtir 821f26cc74 Fix flaky test detection for upgrade tests
When run_test.py is run for an upgrade_.*_after.sql then, then
automatically run the corresponding uprade_.*_before.sql file first.
This is because all those upgrade_.*_after.sql files depend on the
objects created in upgrade_.*_before.sql files by definition.
2023-03-14 17:13:52 +03:00
Halil Ozan Akgul 5e9e67e5bf Include update and delete queries 2023-03-14 15:52:36 +03:00
Onur Tirtir f68fc9e69c
Decide core distribution params in CreateCitusTable (#6760)
Decide core distribution params in CreateCitusTable to reduce the
chances of
creating Citus tables based on incorrect combinations of distribution
method
and replication model params.

Also introduce DistributedTableParams struct to encapsulate the
parameters
that are specific to distributed tables.
2023-03-14 14:24:52 +03:00
Halil Ozan Akgul 2aa30dd303 Remove citus_stats_tenants from tests 2023-03-14 12:42:45 +03:00
Halil Ozan Akgul 93fb53b72b Remove storage view 2023-03-14 10:53:27 +03:00
Onur Tirtir cc945fa331
Add multi_create_fdw into minimal_schedule (#6759)
So that we can run the tests that require fake_fdw by using minimal
schedule too.

Also move multi_create_fdw.sql up in multi_1_schedule to make it
available to more tests.
2023-03-14 10:22:34 +03:00
Halil Ozan Akgul bb1db823f7 Add tests 2023-03-13 19:41:55 +03:00
Gokhan Gulbiz c241af1eb9
Fix tenant statistics annotations normalization 2023-03-13 14:32:16 +03:00
Gokhan Gulbiz 596954622c
Indent 2023-03-13 14:31:41 +03:00
Gokhan Gulbiz 59990addd1
Validate attribute prefix existance on query string 2023-03-13 13:24:27 +03:00
Gokhan Gulbiz 0de2ebad3f
Validate input string length 2023-03-13 12:44:57 +03:00