Commit Graph

6464 Commits (508ee987e64195a78cfb49de1ea93056b5337658)

Author SHA1 Message Date
Gokhan Gulbiz 508ee987e6
Rewrite ExtractTopComment by using strstr() and stringinfo 2023-03-28 08:33:02 +03:00
Gokhan Gulbiz 0b06e64c3f
Use stringinfo for escaping/unescaping 2023-03-28 08:33:02 +03:00
Gokhan Gulbiz 0744384bac
Update comment
Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2023-03-28 08:33:02 +03:00
Gokhan Gulbiz 5e9dd3c894
Add an additional comment
Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2023-03-28 08:33:01 +03:00
Gokhan Gulbiz 2ac56ef955
Revert "Use text_substr for getting top comment"
This reverts commit 9531cfd3bf.
2023-03-28 08:33:01 +03:00
Gokhan Gulbiz 81953d7ac6
Convert char* to text for text_substr call 2023-03-28 08:33:01 +03:00
Gokhan Gulbiz 426cfd3ce5
Indent 2023-03-28 08:33:01 +03:00
Gokhan Gulbiz 8b09a4f8c0
Handle no comment end chars 2023-03-28 08:33:01 +03:00
Gokhan Gulbiz fc23fd5061
Use text_substr for getting top comment 2023-03-28 08:33:00 +03:00
Gokhan Gulbiz 905cc5b4f3
Use palloc instead of malloc 2023-03-28 08:33:00 +03:00
Gokhan Gulbiz 3ec2994abd
Indent 2023-03-28 08:33:00 +03:00
Gokhan Gulbiz 3a5c7c3280
Indent 2023-03-28 08:33:00 +03:00
Gokhan Gulbiz 486e2a622a
Refactoring to reduce nesting 2023-03-28 08:33:00 +03:00
Gokhan Gulbiz 8e1e827242
Remove unnecessary check 2023-03-28 08:32:59 +03:00
Gokhan Gulbiz 5e6ac25885
Renamings 2023-03-28 08:32:59 +03:00
Gokhan Gulbiz a355825bfe
Set INVALID_COLOCATION_ID if colocationId doesn't exist in the annotation. 2023-03-28 08:32:59 +03:00
Gokhan Gulbiz c60de03d6a
Indent 2023-03-28 08:32:21 +03:00
Gokhan Gulbiz da24f2fd62
Indent 2023-03-28 08:32:20 +03:00
Gokhan Gulbiz 3cfa197f69
Escape/Unescape sql comment chars 2023-03-28 08:32:20 +03:00
Gokhan Gulbiz 80dd73711e
Minor renamings and refactorings 2023-03-28 08:32:20 +03:00
Gokhan Gulbiz dbc26cacb5
Add comment chars escaping 2023-03-28 08:32:20 +03:00
Gokhan Gulbiz 89e6623960
Fix attribute prefix 2023-03-28 08:32:20 +03:00
Gokhan Gulbiz bb4aacb92f
Fix tenant statistics annotations normalization 2023-03-28 08:32:19 +03:00
Gokhan Gulbiz e9a6f8a7c5
Indent 2023-03-28 08:32:19 +03:00
Gokhan Gulbiz 21298f6661
Validate attribute prefix existence on query string 2023-03-28 08:32:19 +03:00
Gokhan Gulbiz 6d8cd8a9a0
Validate input string length 2023-03-28 08:31:26 +03:00
Gokhan Gulbiz eaa896e744
Normalize multiline sql comment statements 2023-03-28 08:31:26 +03:00
Gokhan Gulbiz 517ceb2d22
Use strncpy_s instead of strncpy 2023-03-28 08:31:26 +03:00
Gokhan Gulbiz fda680d22e
Use palloc instead of malloc 2023-03-28 08:31:26 +03:00
Gokhan Gulbiz 024526ab2f
Introduce JSON based annotation parsing 2023-03-28 08:31:25 +03:00
Gokhan Gulbiz 9d2d97fe67
Add ExtractFieldInt32(..) to jsonbutils 2023-03-28 08:03:26 +03:00
Halil Ozan Akgül b989e8872c
Citus stats tenants collector view (#6761)
Add a view that collects statistics from all nodes
2023-03-27 17:42:22 +03:00
Halil Ozan Akgul d6603390ab Add multi tenant statistics monitoring 2023-03-27 17:13:24 +03:00
Onur Tirtir 372a93b529
Make 8 more tests runnable multiple times via run_test.py (#6791)
Soon I will be making some changes related to #692 in the router planner,
and those changes require updating ~5-6 tests related to router
planning. To make those test files runnable by run_test.py
multiple times, we also need to make some other tests (ones that they
run in parallel with or depend on) ready for run_test.py.
2023-03-27 12:19:06 +03:00
Teja Mupparti da7db53c87 Refactor some of the planning code to accommodate a new planning path for MERGE SQL 2023-03-22 11:29:24 -07:00
Onur Tirtir e1f1d63050
Rename AllRelations.. functions to AllDistributedRelations.. (#6789)
Because they're only interested in distributed tables. Even more,
this replaces HasDistributionKey() check with
IsCitusTableType(DISTRIBUTED_TABLE) because this doesn't make a
difference on main and sounds slightly more intuitive. Plus, this
would also allow safely using this function in
https://github.com/citusdata/citus/pull/6773.
2023-03-22 15:15:23 +03:00
Onur Tirtir 4960ced175
Add an arbitrary config test heavily based on multi_router_planner_fast_path.sql (#6782)
This would be useful for testing #6773. This is because, given that #6773
only adds support for router / fast-path queries, theoretically almost all
the tests that we have in that test file should work for null-shard-key
tables too (and they indeed do).

I deliberately did not replace multi_router_planner_fast_path.sql with
the one that I'm adding into arbitrary configs because we might still
want to see when we're able to go through fast-path planning for the
usual distributed tables (the ones that have a shard key).
2023-03-22 10:49:08 +03:00
Ahmet Gedemenli 2713e015d6
Check before logicalrep for rebalancer, error if needed (#6754)
DESCRIPTION: Check before logicalrep for rebalancer, error if needed

Check if we can use logical replication or not, in case of shard
transfer mode = auto, before executing the shard moves. If we can't,
error out. Before this PR, we used to error out in the middle of shard
moves:
```sql
set citus.shard_count = 4; -- just to get the error sooner
select citus_remove_node('localhost',9702);

create table t1 (a int primary key);
select create_distributed_table('t1','a');
create table t2 (a bigint);
select create_distributed_table('t2','a');

select citus_add_node('localhost',9702);
select rebalance_table_shards();
NOTICE:  Moving shard 102008 from localhost:9701 to localhost:9702 ...
NOTICE:  Moving shard 102009 from localhost:9701 to localhost:9702 ...
NOTICE:  Moving shard 102012 from localhost:9701 to localhost:9702 ...
ERROR:  cannot use logical replication to transfer shards of the relation t2 since it doesn't have a REPLICA IDENTITY or PRIMARY KEY
```

Now we check and error out in the beginning, without moving the shards.
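
The ordering change described above can be illustrated with a minimal Python sketch. It is not the Citus implementation (which is written in C); `Table`, `ShardTransferError`, `check_logical_replication_possible` and `rebalance` are hypothetical names used only to show the idea of validating every relation before the first shard move starts.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Table:
    """Hypothetical stand-in for a distributed table's catalog entry."""
    name: str
    has_primary_key: bool = False
    has_replica_identity: bool = False


class ShardTransferError(Exception):
    """Raised when a shard transfer cannot use logical replication."""


def check_logical_replication_possible(tables: Iterable[Table]) -> None:
    """Raise before any shard is moved if a table cannot be replicated logically."""
    for table in tables:
        if not (table.has_primary_key or table.has_replica_identity):
            raise ShardTransferError(
                "cannot use logical replication to transfer shards of the "
                f"relation {table.name} since it doesn't have a REPLICA IDENTITY "
                "or PRIMARY KEY"
            )


def rebalance(tables: List[Table], shard_ids: List[int]) -> List[int]:
    """Validate all tables up front, then perform the (simulated) shard moves."""
    check_logical_replication_possible(tables)
    moved = []
    for shard_id in shard_ids:
        moved.append(shard_id)  # placeholder for the real shard move
    return moved
```

With `t1` (primary key) and `t2` (no key) as in the example, `rebalance` raises before any shard id is appended to `moved`, which mirrors erroring out at the beginning rather than midway.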

fixes: #6727
2023-03-21 16:34:52 +03:00
Onur Tirtir aa465b6de1
Decide what to do with router planner error at one place (#6781) 2023-03-21 14:04:07 +03:00
aykut-bozkurt aa33988c6e
fix pip lock file (#6766)
ci/fix_styles.sh was complaining that the `black` and `isort` packages
could not be found even after running `pipenv install --dev`, due to a
broken lock file. I regenerated the lock file and now it works fine. We
also wanted to upgrade the required Python version for the Pipfile.
2023-03-21 00:58:12 +03:00
aykut-bozkurt ea3093bdb6
Make workerCount configurable for regression tests (#6764)
Make worker count flexible in our regression tests instead of hardcoding
it to 2 workers.
2023-03-20 12:06:31 +03:00
Teja Mupparti cf55136281 1) Restrict MERGE command INSERT to the source's distribution column
Fixes #6672

2) Move all MERGE related routines to a new file merge_planner.c

3) Make ConjunctionContainsColumnFilter() static again, and rearrange the code in MergeQuerySupported()
4) Restore the original format in the comments section.
5) Add big serial test. Implement latest set of comments
2023-03-16 13:43:08 -07:00
Teja Mupparti 1e42cd3da0 Support MERGE on distributed tables with restrictions
This implements phase II of MERGE SQL support

Support routable query where all the tables in the merge-sql are distributed, co-located, and both the source and
target relations are joined on the distribution column with a constant qual. This should be a Citus single-task
query. Below is an example.

SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => 't1');

MERGE INTO t1
USING s1 ON t1.id = s1.id AND t1.id = 100
WHEN MATCHED THEN
UPDATE SET val = s1.val + 10
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED THEN
INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src)

Basically, MERGE checks to see if

- There are a minimum of two distributed tables (source and a target).
- All the distributed tables are indeed colocated.
- MERGE relations are joined on the distribution column:
  MERGE .. USING .. ON target.dist_key = source.dist_key
- The query should touch only a single shard, i.e. JOIN AND with a constant qual:
  MERGE .. USING .. ON target.dist_key = source.dist_key AND target.dist_key = <>

If any of the conditions are not met, it raises an exception.
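
The conditions listed above can be sketched as a small validation routine. This is a hedged Python illustration only, not the planner code from this commit; `MergeQuery` and `check_single_shard_merge` are hypothetical names, and the query is reduced to the few facts the checks need.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MergeQuery:
    """Hypothetical, simplified description of a MERGE statement."""
    colocation_ids: List[int]         # one entry per distributed table involved
    joined_on_dist_key: bool          # ON target.dist_key = source.dist_key
    constant_dist_key: Optional[int]  # constant in "AND target.dist_key = <const>"


def check_single_shard_merge(query: MergeQuery) -> None:
    """Raise ValueError unless the MERGE satisfies all phase II conditions."""
    if len(query.colocation_ids) < 2:
        raise ValueError("MERGE needs at least a distributed source and target")
    if len(set(query.colocation_ids)) != 1:
        raise ValueError("all distributed tables in MERGE must be colocated")
    if not query.joined_on_dist_key:
        raise ValueError("MERGE relations must be joined on the distribution column")
    if query.constant_dist_key is None:
        raise ValueError("MERGE must filter the distribution column by a constant")
```

The example statement above (`t1.id = s1.id AND t1.id = 100` on colocated tables) would correspond to `MergeQuery([c, c], True, 100)` for some shared colocation id `c`, which passes every check.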

(cherry picked from commit 44c387b978)

This implements MERGE phase3

Support pushdown query where all the tables in the merge-sql are Citus-distributed, co-located, and both
the source and target relations are joined on the distribution column. This will generate multiple tasks
which execute independently after pushdown.

SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => 't1');

MERGE INTO t1
USING s1
ON t1.id = s1.id
        WHEN MATCHED THEN
                UPDATE SET val = s1.val + 10
        WHEN MATCHED THEN
                DELETE
        WHEN NOT MATCHED THEN
                INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src)

*The only exception for both the phases II and III is, UPDATEs and INSERTs must be done on the same shard-group
as the joined key; for example, below scenarios are NOT supported as the key-value to be inserted/updated is not
guaranteed to be on the same node as the id distribution-column.

MERGE INTO target t
USING source s ON (t.customer_id = s.customer_id)
WHEN NOT MATCHED THEN - -
     INSERT(customer_id, …) VALUES (<non-local-constant-key-value>, ……);

OR this scenario where we update the distribution column itself

MERGE INTO target t
USING source s ON (t.customer_id = s.customer_id)
WHEN MATCHED THEN
     UPDATE SET customer_id = 100;

(cherry picked from commit fa7b8949a8)
2023-03-16 13:43:08 -07:00
Jelte Fennema b8b85072d6
Add pytest dependencies to Pipfile (#6767)
In #6720 I'm adding a `pytest` based testing framework. This adds the
dependencies for those. They have already been [merged into our docker
files][the-process-merge] in the the-process repo in preparation for #6720.
But by not having them on our citus main branch it is impossible to
make changes to the Pipfile, because our CI Dockerfiles and master
are out of date.

Since #6720 will need some more discussion and might take a few more
weeks to be merged, this takes out the Pipfile changes. By merging this
PR we can unblock new Pipfile changes.

Unblocks and partially addresses #6766 

[the-process-merge]: https://github.com/citusdata/the-process/pull/117
2023-03-15 14:53:14 +01:00
Onur Tirtir a0a41943d7
Remove pg_depend entries from columnar metadata indexes to columnar-am (inserted in #5456) (#6628)
DESCRIPTION: Fixes (pg_dump/pg_upgrade) dependency loop warnings caused
by pg_depend entries inserted by citus_columnar

Fixes #5510.

In the past, having columnar tables in the cluster caused pg
upgrades to fail when attempting to access columnar metadata. This is
because pg_dump doesn't see the objects that we use for columnar-am related
bookkeeping as dependencies of the tables using columnar-am.
To fix that, in #5456 we inserted some "normal dependency" edges (from
those objects to columnar-am) into pg_depend.

This helped us ensure the existence of a class of metadata objects
--such as columnar.storageid_seq-- and helped fix #5437.

However, the normal-dependency edges that we added for indexes on
columnar metadata tables --such as columnar.stripe_pkey-- didn't help at
all because they were in fact causing dependency loops (#5510) and
pg_dump was not able to take those dependency edges into account.

For this reason, this commit deletes those dependency edges so that
pg_dump stops complaining about them. Note that it's not critical to
delete those edges from pg_depend since they're not breaking pg upgrades
but were only triggering some warning messages. And given that backporting
a sql change into older versions is quite hard, we skip backporting
this.
2023-03-15 01:24:57 +03:00
Onur Tirtir 9550ebd118 Remove pg_depend entries from columnar metadata indexes to columnar-am
In the past, having columnar tables in the cluster caused pg
upgrades to fail when attempting to access columnar metadata. This is
because pg_dump doesn't see the objects that we use for columnar-am related
bookkeeping as dependencies of the tables using columnar-am.
To fix that, in #5456 we inserted some "normal dependency" edges (from
those objects to columnar-am) into pg_depend.

This helped us ensure the existence of a class of metadata objects
--such as columnar.storageid_seq-- and helped fix #5437.

However, the normal-dependency edges that we added for indexes on
columnar metadata tables --such as columnar.stripe_pkey-- didn't help at
all because they were in fact causing dependency loops (#5510) and
pg_dump was not able to take those dependency edges into account.

For this reason, this commit deletes those dependency edges so that
pg_dump stops complaining about them. Note that it's not critical to
delete those edges from pg_depend since they're not breaking pg upgrades
but were only triggering some warning messages. And given that backporting
a sql change into older versions is quite hard, we skip backporting
this.
2023-03-14 17:13:52 +03:00
Onur Tirtir be0735a329 Use "cpp" to expand "#include" directives in columnar sql files 2023-03-14 17:13:52 +03:00
Onur Tirtir 2b4be535de Do clean-up before upgrade_columnar_before to make it runnable multiple times
So that flaky test detector can run upgrade_columnar_before.sql multiple
times.
2023-03-14 17:13:52 +03:00
Onur Tirtir 994f67185f Make upgrade_columnar_after runnable multiple times
This commit hides port numbers in upgrade_columnar_after because the
port numbers assigned to nodes in upgrade schedule differ from the ones
that flaky test detector assigns.
2023-03-14 17:13:52 +03:00
Onur Tirtir 821f26cc74 Fix flaky test detection for upgrade tests
When run_test.py is run for an upgrade_.*_after.sql file, then
automatically run the corresponding upgrade_.*_before.sql file first.
This is because all those upgrade_.*_after.sql files depend on the
objects created in upgrade_.*_before.sql files by definition.
2023-03-14 17:13:52 +03:00