Description:
Implements CDC changes using logical replication to avoid re-publishing events
multiple times, by setting up a replication origin session that adds
"DoNotReplicateId" to every WAL entry generated during:
- shard splits
- shard moves
- create distributed table
- undistribute table
- alter distributed tables (for some cases)
- reference table operations
The Citus decoder, which decodes WAL events for CDC clients, ignores any WAL
entry whose replication origin is not zero. It also maps shard names to
distributed table names.
Today we allow planning queries that reference non-colocated tables if the
shards the query targets are placed on the same node. However, this may not
remain the case, e.g., after rebalancing shards, because those shards are no
longer guaranteed to be on the same node. This commit adds the
citus.enable_non_colocated_router_query_pushdown GUC, which can be used to
disallow planning such queries via the router planner when it is set to false.
Note that the default value for this GUC is "true" for 11.3, but we will
change it to "false" in 12.0 so as not to introduce a breaking change in a
minor release.
Closes #692.
Even more, allowing such queries to go through the router planner also causes
generating an incorrect plan for DML queries that reference distributed tables
sharded with different replication factor settings. For this reason, #6779 can
be closed after changing the default value of this GUC to "false", hence not
now.
DESCRIPTION: Adds `citus.enable_non_colocated_router_query_pushdown` GUC
to ensure generating a consistent distributed plan for the queries that
reference non-colocated distributed tables (when set to "false", the
default is "true").
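As a quick illustration (not taken from the PR itself), opting into the stricter behavior simply means flipping the GUC:
```sql
-- Disallow router planning for queries on non-colocated distributed tables;
-- note that "true" remains the default in 11.3.
SET citus.enable_non_colocated_router_query_pushdown TO false;
```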
Soon I will be doing some changes related to #692 in the router planner, and
those changes require updating ~5-6 tests related to router planning. To make
those test files runnable by run_test.py multiple times, we need to make some
other tests (that they run in parallel with / depend heavily on) ready for
run_test.py too.
This is because they're only interested in distributed tables. Moreover, this
replaces the HasDistributionKey() check with
IsCitusTableType(DISTRIBUTED_TABLE) because it doesn't make a difference on
main and reads slightly more intuitively. Plus, this would also allow safely
using this function in https://github.com/citusdata/citus/pull/6773.
This would be useful for testing #6773. Given that #6773 only adds support for
router / fast-path queries, theoretically almost all the tests we have in that
test file should work for null-shard-key tables too (and they indeed do).
I deliberately did not replace multi_router_planner_fast_path.sql with
the one that I'm adding into arbitrary configs because we might still
want to see when we're able to go through fast-path planning for the
usual distributed tables (the ones that have a shard key).
DESCRIPTION: Check before logicalrep for rebalancer, error if needed
Check if we can use logical replication or not when the shard transfer mode is
auto, before executing the shard moves. If we can't, error out. Before this
PR, we used to error out in the middle of the shard moves:
```sql
set citus.shard_count = 4; -- just to get the error sooner
select citus_remove_node('localhost',9702);
create table t1 (a int primary key);
select create_distributed_table('t1','a');
create table t2 (a bigint);
select create_distributed_table('t2','a');
select citus_add_node('localhost',9702);
select rebalance_table_shards();
NOTICE: Moving shard 102008 from localhost:9701 to localhost:9702 ...
NOTICE: Moving shard 102009 from localhost:9701 to localhost:9702 ...
NOTICE: Moving shard 102012 from localhost:9701 to localhost:9702 ...
ERROR: cannot use logical replication to transfer shards of the relation t2 since it doesn't have a REPLICA IDENTITY or PRIMARY KEY
```
Now we check and error out in the beginning, without moving the shards.
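As a sketch of how a user could resolve the new up-front error (not taken from the PR; `t2` is the table from the example above), either give the table a replica identity or pick a transfer mode that doesn't need logical replication:
```sql
-- Option 1: add a primary key so logical replication can be used.
ALTER TABLE t2 ADD PRIMARY KEY (a);
SELECT rebalance_table_shards();

-- Option 2: avoid logical replication for this rebalance altogether.
SELECT rebalance_table_shards(shard_transfer_mode => 'block_writes');
```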
Fixes: #6727
ci/fix_styles.sh was complaining that the `black` and `isort` packages could
not be found, even after `pipenv install --dev`, due to a broken lock file. I
regenerated the lock file and now it works fine. We also wanted to upgrade the
required Python version in the Pipfile.
Fixes #6672
2) Move all MERGE-related routines to a new file, merge_planner.c
3) Make ConjunctionContainsColumnFilter() static again, and rearrange the code in MergeQuerySupported()
4) Restore the original format in the comments section.
5) Add a big serial test; implement the latest set of comments.
This implements phase II of MERGE SQL support: routable queries where all the
tables in the merge SQL are distributed and co-located, and the source and
target relations are joined on the distribution column with a constant qual.
This should be a Citus single-task query. Below is an example.
```sql
SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => 't1');

MERGE INTO t1
USING s1 ON t1.id = s1.id AND t1.id = 100
WHEN MATCHED THEN
    UPDATE SET val = s1.val + 10
WHEN MATCHED THEN
    DELETE
WHEN NOT MATCHED THEN
    INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src);
```
Basically, MERGE checks that:
- There are a minimum of two distributed tables (a source and a target).
- All the distributed tables are indeed colocated.
- The MERGE relations are joined on the distribution column:
  MERGE .. USING .. ON target.dist_key = source.dist_key
- The query touches only a single shard, i.e., the join condition is ANDed with a constant qual:
  MERGE .. USING .. ON target.dist_key = source.dist_key AND target.dist_key = <>
If any of these conditions are not met, it raises an exception.
(cherry picked from commit 44c387b978)
This implements MERGE phase III: pushdown queries where all the tables in the
merge SQL are Citus-distributed and co-located, and the source and target
relations are joined on the distribution column. This generates multiple tasks
which execute independently after pushdown.
```sql
SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => 't1');

MERGE INTO t1
USING s1
ON t1.id = s1.id
WHEN MATCHED THEN
    UPDATE SET val = s1.val + 10
WHEN MATCHED THEN
    DELETE
WHEN NOT MATCHED THEN
    INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src);
```
The only exception for both phases II and III is that UPDATEs and INSERTs must
be done on the same shard group as the join key; for example, the scenarios
below are NOT supported, as the key value to be inserted/updated is not
guaranteed to be on the same node as the id distribution column.
MERGE INTO target t
USING source s ON (t.customer_id = s.customer_id)
WHEN NOT MATCHED THEN
INSERT (customer_id, …) VALUES (<non-local-constant-key-value>, ……);
Or this scenario, where we update the distribution column itself:
MERGE INTO target t
USING source s ON (t.customer_id = s.customer_id)
WHEN MATCHED THEN
UPDATE SET customer_id = 100;
(cherry picked from commit fa7b8949a8)
In #6720 I'm adding a `pytest` based testing framework. This adds the
dependencies for those. They have already been [merged into our docker
files][the-process-merge] in the the-process repo, in preparation for #6720.
But by not having them on our citus main branch, it is impossible to make
changes to the Pipfile, because our CI Dockerfiles and master would be out of
sync.
Since #6720 will need some more discussion and might take a few more
weeks to be merged, this takes out the Pipfile changes. By merging this
PR we can unblock new Pipfile changes.
Unblocks and partially addresses #6766
[the-process-merge]: https://github.com/citusdata/the-process/pull/117
In the past, having columnar tables in the cluster was causing pg upgrades to
fail when attempting to access columnar metadata. This is because pg_dump
doesn't see the objects that we use for columnar-am related bookkeeping as
dependencies of the tables using columnar-am.
To fix that, in #5456 we inserted some "normal dependency" edges (from those
objects to columnar-am) into pg_depend.
This helped us ensure the existence of a class of metadata objects
--such as columnar.storageid_seq-- and helped fix #5437.
However, the normal-dependency edges that we added for indexes on columnar
metadata tables --such as columnar.stripe_pkey-- didn't help at all because
they were causing dependency loops (#5510) and pg_dump was not able to take
those dependency edges into account.
For this reason, this commit deletes those dependency edges so that pg_dump
stops complaining about them. Note that it's not critical to delete those
edges from pg_depend since they're not breaking pg upgrades, but they were
triggering some warning messages. And given that backporting a SQL change into
older versions is quite hard, we skip backporting this.
This commit hides port numbers in upgrade_columnar_after because the port
numbers assigned to nodes in the upgrade schedule differ from the ones that
the flaky test detector assigns.
When run_test.py is run for an upgrade_.*_after.sql file, it now automatically
runs the corresponding upgrade_.*_before.sql file first. This is because all
those upgrade_.*_after.sql files by definition depend on the objects created
in the upgrade_.*_before.sql files.
Decide core distribution params in CreateCitusTable to reduce the chances of
creating Citus tables based on incorrect combinations of distribution method
and replication model params.
Also introduce a DistributedTableParams struct to encapsulate the parameters
that are specific to distributed tables.
So that we can run the tests that require fake_fdw using the minimal schedule
too.
Also move multi_create_fdw.sql up in multi_1_schedule to make it available to
more tests.
Now that we will soon add another table type that has DISTRIBUTE_BY_NONE as
its distribution method, and given that we want the code to interpret such
tables mostly as distributed tables, let's make the definition of the other
two table types stricter by removing the CITUS_TABLE_WITH_NO_DIST_KEY macro.
Instead, use the HasDistributionKey() check in the places where the logic
applies to all table types that have / don't have a distribution key. In
future PRs, we might want to convert some of those HasDistributionKey() checks
if the logic only applies to Citus local / reference tables, not the others.
Adding HasDistributionKey() also allows us to consider DISTRIBUTE_BY_NONE as a
"table attribute" that can apply to distributed tables too, rather than
something that determines the table type.
Split the main logic that allows creating a Citus table into the
internal function CreateCitusTable().
The old CreateDistributedTable() function assumed that it was creating a
reference table when the distribution method is DISTRIBUTE_BY_NONE. However,
this will soon no longer be the case when we add support for creating
single-shard distributed tables, because their distribution method will be the
same.
Now the internal function CreateCitusTable() doesn't make any assumptions
about the table's replication model or the like. Instead, it expects callers
to properly set all such metadata bits.
Even more, some of the parameters the old CreateDistributedTable() takes
--such as the shard count-- were not meaningful for a reference table, and the
same applies to the new table type.
DESCRIPTION: Fixes a bug in shard copy operations.
For copying shards in both shard move and shard split operations, Citus
uses the COPY statement.
A COPY statement of the following form, which copies all columns,
`COPY target_shard FROM STDIN;`
throws an error when there is a GENERATED column in the shard table.
To fix this issue, we need to exclude the GENERATED columns from the COPY and
the matching SELECT statements. Hence this fix converts the COPY and SELECT
statements to the following form:
```
COPY target_shard (col1, col2, ..., coln) FROM STDIN;
SELECT col1, col2, ..., coln FROM source_shard;
```
where (col1, col2, ..., coln) does not include a GENERATED column.
GENERATED column values are created in the target_shard as the values
are inserted.
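For illustration only (the table definition below is hypothetical, not part of the fix), this is the kind of shard layout that previously broke the COPY and now works with explicit column lists:
```sql
CREATE TABLE source_shard (
    col1 int,
    col2 int,
    col3 int GENERATED ALWAYS AS (col1 + col2) STORED
);
-- Only non-generated columns are listed; col3 is recomputed on the target
-- as the rows are inserted:
--   SELECT col1, col2 FROM source_shard;
--   COPY target_shard (col1, col2) FROM STDIN;
```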
Fixes #6705.
---------
Co-authored-by: Teja Mupparti <temuppar@microsoft.com>
Co-authored-by: aykut-bozkurt <51649454+aykut-bozkurt@users.noreply.github.com>
Co-authored-by: Jelte Fennema <jelte.fennema@microsoft.com>
Co-authored-by: Gürkan İndibay <gindibay@microsoft.com>
DESCRIPTION: Adds logic to distribute unbalanced shards
If the number of shard placements (for a colocation group) is less than the
number of workers, it means that some of the workers will remain empty. With
this PR, we take these shard groups into account during rebalancing, in order
to distribute them across the cluster as evenly as possible.
Example:
```sql
create table t1 (a int primary key);
create table t2 (a int primary key);
create table t3 (a int primary key);
set citus.shard_count =1;
select create_distributed_table('t1','a');
select create_distributed_table('t2','a',colocate_with=>'t1');
select create_distributed_table('t3','a',colocate_with=>'t2');
create table tb1 (a bigint);
create table tb2 (a bigint);
select create_distributed_table('tb1','a');
select create_distributed_table('tb2','a',colocate_with=>'tb1');
select citus_add_node('localhost',9702);
select rebalance_table_shards();
```
Here we have two colocation groups, each with one shard group. Both
shard groups are placed on the first worker node. When we add a new
worker node and try to rebalance table shards, the rebalance planner
considers it well balanced and does nothing. With this PR, the
rebalancer tries to distribute these shard groups evenly across the
cluster as much as possible. For this example, with this PR, the
rebalancer moves one of the shard groups to the second worker node.
Fixes: #6715
DESCRIPTION: Correctly report shard size in citus_shards view
When looking at citus_shards, people are interested in the actual size
that all the data related to the shard takes up on disk.
`pg_total_relation_size` is the function to use for that purpose. The
previously used `pg_relation_size` does not include indexes or TOAST.
Especially the missing TOAST data can have an enormous impact on the reported
size.
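For reference, the difference between the two functions can be seen directly in PostgreSQL (relation name hypothetical):
```sql
SELECT pg_relation_size('my_shard')       AS main_fork_only,
       pg_total_relation_size('my_shard') AS including_indexes_and_toast;
```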
With this small change, arbitrary config tests can have multiple acceptable correct outputs.
For an arbitrary config test named `t`, you can now define `expected/t.out`, `expected/t_0.out`, `expected/t_1.out`, etc., and the test will succeed if the output of `sql/t.sql` is equal to any of the `t.out` or `t_{0, 1, ...}.out` files.
First of all, this commit sets next_shard_id for
single_node_truncate.sql because shard ids in the test output were
changing whenever we modify a prior test file.
Then the flaky test detector started complaining about
single_node_truncate.sql. We fix that by specifying the correct
test dependency for it in run_test.py.
2 improvements to prevent memory leaks when altering or undistributing
distributed tables with a lot of partitions and shards:
1. Free memory for each call to ConvertTable, so that colocated and partition
tables processed by `AlterDistributedTable`, `UndistributeTable`, or
`AlterTableSetAccessMethod` do not cause an increase in memory usage,
2. Free memory while executing attach partition commands for each partition
table at `AlterDistributedTable`, to prevent an increase in memory usage.
DESCRIPTION: Fixes a memory leak issue when altering a distributed table
with a lot of partitions and shards.
Fixes https://github.com/citusdata/citus/issues/6503.
In #6718 I accidentally added Python type hint syntax that was only
supported on Python 3.10. Our CI uses 3.9, so this PR changes that to a
syntax that's supported on 3.9 too.
We have a memory leak during distribution of a table with a lot of partitions,
because we do not release the ExprContext memory until all partitions are
distributed. We improved 2 things to resolve the issue:
1. We create and delete a MemoryContext for each call to
`CreateDistributedTable` for partitions,
2. We rebuild the cache after we insert all the placements, instead of after
each placement of a shard.
DESCRIPTION: Fixes memory leak during distribution of a table with a lot
of partitions and shards.
Fixes https://github.com/citusdata/citus/issues/6572.
When the auto_explain module is loaded and configured, EXPLAIN is implicitly
run for all supported commands. Postgres does not support `EXPLAIN` for the
`ALTER` command; however, auto_explain will try to `EXPLAIN` other supported
commands internally triggered by `ALTER`.
For instance,
`ALTER TABLE target_table ADD CONSTRAINT fkey_167 FOREIGN KEY (col_1)
REFERENCES ref_table(key) ... `
command may trigger a SELECT command in the following form for foreign
key validation purpose:
`SELECT fk.col_1 FROM ONLY target_table fk LEFT OUTER JOIN ONLY
ref_table pk ON ( pk.key OPERATOR(pg_catalog.=) fk.col_1) WHERE pk.key
IS NULL AND (fk.col_1 IS NOT NULL) `
For Citus tables, the Citus utility hook should ensure that constraint
validation is skipped for shell tables but they are done for shard
tables. The reason behind this design choice can be summed up as:
- An ALTER TABLE command via coordinator node is run in a distributed
transaction.
- Citus does not support nested distributed transactions.
- A SELECT query on a distributed table (aka shell table) is also run in
a distributed transaction.
- Therefore, Citus does not support running a SELECT query on a shell
table while an ALTER TABLE command is running.
With eadc88a800, a bug was introduced that breaks the skip-constraint-validation
behaviour of Citus. With this change, we see that validation queries on
distributed tables are triggered within an `ALTER` command that adds
constraints with a validation check. This regression did not cause an issue
for regular use cases since the Citus executor hook blocks those queries
heuristically when there is an ALTER TABLE command in progress.
The issue surfaced as a crash (#6424: Workers, when configured to use
auto_explain, crash during distributed transactions) when auto_explain is
enabled. This is due to auto_explain trying to execute the SELECT queries in a
nested distributed transaction.
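A hedged sketch of a session-level setup that exercises this code path (the GUCs are standard auto_explain settings; the ALTER TABLE is the example from above):
```sql
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;  -- EXPLAIN every statement
SET auto_explain.log_analyze = on;
-- auto_explain then tries to EXPLAIN the implicit foreign key validation
-- SELECT triggered by this command, inside the distributed transaction:
ALTER TABLE target_table ADD CONSTRAINT fkey_167
    FOREIGN KEY (col_1) REFERENCES ref_table(key);
```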
Now since the regression with constraint validation is fixed in
https://github.com/citusdata/citus/issues/6543, we should be able to
remove the workaround.
We should not omit freeing the PGresult when we receive a single-tuple result
from an internal backend.
Single-tuple results are normally freed by our ReceiveResults for the
`tupleDescriptor != NULL` flow, but not for those with `tupleDescriptor
== NULL`. See PR #6722 for details.
DESCRIPTION: Fixes a memory leak issue with query results that return a
single row.
- A simple fix is to add ORDER BY to have deterministic results.
- Add search_path explicitly after reconnecting; this avoids creating objects in the public schema,
which would prevent us from running the tests repeatedly.
- multi_mx_modification is not designed to run repeatedly, so isolate it.
Some of our tests depend on previous tests. Normally all these tests
should be part of a base schedule, but that's not always the case. The
flaky test detection script should ensure that we don't introduce other
dependencies by accident in new tests. But we have many old tests that
are not worth the effort of changing. This adds a way to define such
test dependencies in `run_test.py`, so that it can make sure to run any
dependencies before the actual test.
Our repo was complaining about the cryptography package being
vulnerable. This updates it, including our mitmproxy fork, because that
was pinning an outdated version.
Relevant commit on our mitmproxy fork:
2fd18ef051
Relevant PR on the-process:
https://github.com/citusdata/the-process/pull/112
Prevents a memory leak during the ConvertTable call for a table with a lot of
partitions.
DESCRIPTION: Fixes memory leak during undistribution and alteration of a
table with a lot of partitions.
Postgres got minor updates; this starts using the images with the latest
versions for our tests.
These new Postgres versions caused a compilation issue on PG14 and PG13 due to
a function being backported that we had already backported ourselves. Since
this backport is a static inline function, it doesn't matter who provides it,
and there will be no linkage errors when running old Citus packages on new PG
versions or the other way around.
The failure_create_distributed_table_non_empty test would sometimes fail
like this:
```diff
-- in the first test, cancel the first connection we sent from the coordinator
SELECT citus.mitmproxy('conn.cancel(' || pg_backend_pid() || ')');
- mitmproxy
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR: canceling statement due to user request
+CONTEXT: COPY mitmproxy_result, line 0
+SQL statement "COPY mitmproxy_result FROM '/home/circleci/project/src/test/regress/tmp_check/mitmproxy.fifo'"
+PL/pgSQL function citus.mitmproxy(text) line 11 at EXECUTE
SELECT create_distributed_table('test_table', 'id');
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/30474/workflows/be1c9f9d-22c9-465c-964a-dcdd1cb8c99c/jobs/985441
Because the cancel command had no filter, it would actually sometimes cancel
the mitmproxy cancel command itself. This PR addresses that by simply removing
this test.
This is basically the exact same issue as #6217, only in a different
place in the file. It's fixed here by removing the test since there's
already many different similar tests.
In #6314 I refactored the connection cleanup to be simpler to
understand and use. However, by doing so I introduced a use-after-free
possibility (that valgrind luckily picked up):
In the `ShouldShutdownConnection` path of
`AfterXactHostConnectionHandling`
we free connections without removing the `transactionNode` from the
dlist that it might be part of. Before the refactoring this wasn't a
problem, because the dlist would be completely reset quickly after in
`ResetGlobalVariables` (without reading or writing the dlist entries).
The refactoring changed this by moving the `dlist_delete` call to
`ResetRemoteTransaction`, which in turn was called in the
`!ShouldShutdownConnection` path of `AfterXactHostConnectionHandling`.
Thus this `!ShouldShutdownConnection` path would now delete from the
`dlist`, but the `ShouldShutdownConnection` path would not. Thus to
remove itself the deleting path would sometimes update nodes in the list
that were freed right before.
There are two ways of fixing this:
1. Call `dlist_delete` from **both** of paths.
2. Call `dlist_delete` from **neither** of the paths.
This commit implements the second approach, and #6684 implements the
first. We need to choose which approach we prefer.
To make calling `dlist_delete` from both paths actually work, we also need to
use a slightly different check to determine whether we need to call
dlist_delete. Various regression tests showed that there can be cases where
the `transactionState` is something other than `REMOTE_TRANS_NOT_STARTED`, but
the connection was not added to the `InProgressTransactions` list.
One example of such a case is running `TransactionStateMachine` without
calling `StartRemoteTransactionBegin` beforehand. In those cases the
connection won't be added to `InProgressTransactions`, but the
`transactionState` is changed to `REMOTE_TRANS_SENT_COMMAND`.
Sidenote: This bug already existed in 11.1, but valgrind didn't catch it
back then. My guess is that this happened because #6314 was merged after
the initial release branch was cut.
Fixes #6638
Previously, if there was a problem with an ongoing rebalance, we did not show
details on background tasks that were stuck in the runnable state. Similar to
how we show details for errored tasks, we now show details on tasks that are
being retried.
Earlier we showed the following output when a task was stuck:
```
┌────────────────────────────┐
│ { ↵│
│ "tasks": [ ↵│
│ ], ↵│
│ "task_state_counts": {↵│
│ "done": 13, ↵│
│ "blocked": 2, ↵│
│ "runnable": 1 ↵│
│ } ↵│
│ } │
└────────────────────────────┘
```
Now we show details like the following:
```
+-----------------------------------------------------------------------
| {
| "tasks": [
| {
| "state": "runnable",
| "command": "SELECT pg_catalog.citus_move_shard_placement(1
| "message": "ERROR: Moving shards to a node that shouldn't
| "retried": 2,
| "task_id": 3
| }
| ],
| "task_state_counts": {
| "blocked": 1,
| "runnable": 1
| }
| }
+-----------------------------------------------------------------------
```
DESCRIPTION: Fix background rebalance when reference table has no PK
For the background rebalance, we would always fail if a reference table that
was not replicated to all nodes did not have a PK (or replica identity), even
when we used force_logical or block_writes as the shard transfer mode. This
fixes that and adds some regression tests.
Fixes #6680
We should disallow dropping the table_name option if the foreign table is in
metadata. Otherwise, we get a "table not found" error that contains the
shard id.
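The now-disallowed command looks like this (foreign table name hypothetical):
```sql
ALTER FOREIGN TABLE foreign_table_in_metadata OPTIONS (DROP table_name);
```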
DESCRIPTION: Fixes an unexpected foreign table error by disallowing to drop the table_name option.
Fixes #6663
This change is a precursor to attempts to add more editorconfig rules to our
codebase. It is a good idea to comply with POSIX standards and have an empty
newline at the end of text files. However, before this change, the arbitrary
config scripts would fail once we had such a rule.
Related: #5981
Fixes #6570.
In the past, having columnar tables in the cluster was causing pg upgrades to
fail when attempting to access columnar metadata. This is because pg_dump
doesn't see the objects that we use for columnar-am related bookkeeping as
dependencies of the tables using columnar-am.
To fix that, in #5456 we inserted some "normal dependency" edges (from those
objects to columnar-am) into pg_depend.
This helped us ensure the existence of a class of metadata objects
--such as columnar.storageid_seq-- and helped fix #5437.
However, the normal-dependency edges that we added for indexes on columnar
metadata tables --such as columnar.stripe_pkey-- didn't help at all because
they were causing dependency loops (#5510) and pg_dump was not able to take
those dependency edges into account.
For this reason, instead of inserting such dependency edges from indexes to
columnar-am, we allow columnar metadata accessors to fall back to a sequential
scan during pg upgrades.
Sometimes isolation_non_blocking_shard_split would fail like this:
```diff
step s2-show-pg_dist_cleanup:
SELECT object_name, object_type, policy_type FROM pg_dist_cleanup;
object_name |object_type|policy_type
------------------------------+-----------+-----------
+citus_shard_split_slot_2_10_39| 3| 0
public.to_split_table_1500001 | 1| 2
-(1 row)
+(2 rows)
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/30237/workflows/edcf34b7-d7d3-4d10-8293-b6f59b00cdf2/jobs/970960
The reason is that replication slots have now become part of
pg_dist_cleanup too, and sometimes they cannot be cleaned up right away.
This is harmless as they will be cleaned up eventually. So this simply
filters out the replication slots for those tests.
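A sketch of the filtering the test now applies; the object_type value for replication slots is taken from the diff above and should be treated as illustrative:
```sql
SELECT object_name, object_type, policy_type
FROM pg_dist_cleanup
WHERE object_type != 3;  -- exclude replication slot cleanup records
```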
The recursive planner should handle the whole tree from bottom to top in a
single pass, i.e. it should have already recursively planned all required
parts in its first pass. Otherwise, it means we have a bug in the recursive
planner, which needs to be handled. We add a check here and return an error.
DESCRIPTION: Fixes wrong results by throwing an error in case the recursive
planner passes over the query multiple times.
We found 3 different cases that cause the recursive planner to pass over the
query multiple times:
1. A sublink in the WHERE clause is planned at the second pass after we
recursively planned a distributed table at the first pass. Fixed by PR #6657.
2. Local-distributed joins are recursively planned at both the first and
the second pass. Issue #6659.
3. Some parts of the query are considered non-colocated at the second pass,
as we do not generate attribute equivalences between non-distributed and
distributed tables. Issue #6653.
DESCRIPTION: Fix foreign key validation skip at the end of shard move
In eadc88a we started completely skipping foreign key constraint validation at
the end of a non-blocking shard move, instead of only for foreign keys to
reference tables. However, it turns out that this didn't work at all because
of a hard-to-notice bug: by resetting the SkipConstraintValidation flag at the
end of our utility hook, we actually make the SET command that sets it a
no-op.
This fixes that bug by removing the code that resets it. This is fine
because #6543 removed the only place where we set the flag in C code. So
the resetting of the flag has no purpose anymore. This PR also adds a
regression test, because it turned out we didn't have any; otherwise we would
have caught that the feature was completely broken.
It also moves the constraint validation skipping to the utility hook.
The reason is that #6550 showed us that this is the better place to skip
it, because it will also skip the planning phase and not just the
execution.
We should do the sublink conversions at the end of recursive planning, because
earlier steps might have transformed the query into a shape that needs the
sublinks to be recursively planned.
DESCRIPTION: Fixes early sublink check in the recursive planner.
Related to PR https://github.com/citusdata/citus/pull/6650
Fixes #6655.
heap_modify_tuple() fetches values[i] if replace[i] is set to true, regardless
of whether isnull[i] is true or false. So, similar to replace[], let's
initialize values[] and isnull[] too.
DESCRIPTION: Fixes an uninitialized memory access in
create_distributed_function()
This change allows creating a constraint without a name using an index.
The index name will be used as the constraint name the same way postgres
handles it.
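A minimal sketch of the now-supported form (table and index names hypothetical):
```sql
CREATE TABLE dist_table (a int);
SELECT create_distributed_table('dist_table', 'a');
CREATE UNIQUE INDEX dist_table_a_idx ON dist_table (a);
-- The constraint takes its name from the index, as in vanilla PostgreSQL.
ALTER TABLE dist_table ADD UNIQUE USING INDEX dist_table_a_idx;
```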
Fixes issue #6644
This commit also cleans up some leftovers from nameless constraint checks.
With this commit, we now fully support adding all nameless constraints
directly to a table.
Co-authored-by: naisila <nicypp@gmail.com>
Adds the NOT VALID option to the deparser. When we need to deparse
"ALTER TABLE ADD FOREIGN KEY ... NOT VALID" or
"ALTER TABLE ADD CHECK ... NOT VALID",
the NOT VALID option should be propagated to the workers.
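For example, a command like the following needs its NOT VALID option preserved when deparsed for the workers (names hypothetical):
```sql
ALTER TABLE orders
    ADD CONSTRAINT orders_customer_fkey
    FOREIGN KEY (customer_id) REFERENCES customers (id) NOT VALID;
```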
Fixes issue #6646
This commit also uses AppendColumnNameList function
instead of repeated code blocks in two appropriate places
in the "ALTER TABLE" deparser.
If an update query on a reference table has a RETURNING clause with a subquery
that accesses some other local table, we end up with a crash.
This commit prevents the crash, but does not prevent other error messages from
happening due to Citus not being able to push down the results of that
subquery in a valid SQL command.
Related: #6634
DESCRIPTION: Fix regression in allowed foreign keys on distributed
tables
In commit eadc88a we changed how we skip foreign key validation. The
goal was to skip it in more cases. However, one change had the
unintended regression of introducing failures when trying to create
certain foreign keys. This reverts that part of the change.
The way of skipping foreign key validation that was introduced in eadc88a
skipped validation during execution. The reason this caused the regression is
that some foreign key validation queries already fail during planning; in
those cases it never gets to the execution step where validation would later
be skipped.
Fixes #6543
Multiple `check-xxx` targets create tablespaces. If you run
two of these at the same time you would get an error like:
```diff
CREATE TABLESPACE test_tablespace LOCATION :'test_tablespace';
+ERROR: directory "/home/rajesh/citus/citus/src/test/regress/tmp_check/ts0/PG_14_202107181" already in use as a tablespace
```
This fixes that by moving tablespace directory creation and removal to
pg_regress_multi.pl instead of doing it in the Makefile.
DESCRIPTION: Enable adding FOREIGN KEY constraints on Citus tables
without a name
This PR enables adding a foreign key to a distributed/reference/Citus
local table without specifying the name of the constraint, e.g. `ALTER
TABLE items ADD FOREIGN KEY (user_id) REFERENCES users (id);`
citus_job_list() lists all background jobs by simply showing the records
in pg_dist_background_job.
citus_job_status(job_id bigint, raw boolean default false) shows the
status of a single background job by appending a jsonb details column to
the associated row from pg_dist_background_job. If the raw argument is
set, machine readable sizes are used instead of human readable
alternatives.
citus_rebalance_status(raw boolean default false) shows the status of
the last rebalance operation. If the raw argument is set, machine
readable sizes are used instead of human readable alternatives.
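Example invocations of the new UDFs (the job id shown is hypothetical):
```sql
SELECT * FROM citus_job_list();
SELECT * FROM citus_job_status(1);
SELECT * FROM citus_rebalance_status(raw := true);
```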
The original implementation of GPIDs didn't work correctly when using
`pg_dist_poolinfo` together with PgBouncer. The reason is that it
assumed that once a connection was made to a worker, the originating
GPID should stay the same forever. But when pg_dist_poolinfo is used
this isn't the case, because the same connection on the worker might be
used by different backends of the coordinator.
This fixes that issue by updating the GPID whenever a new application
name is set on a connection. This is the only thing that's needed,
because PgBouncer already sets the application name correctly on the
server connection whenever a client is updated.
DESCRIPTION: Enable adding CHECK constraints on distributed tables
without the client having to provide a constraint name.
This PR enables the following command syntax for adding check
constraints to distributed tables.
ALTER TABLE ... ADD CHECK ...
by creating a default constraint name and transforming the command into
the below syntax before sending it to workers.
ALTER TABLE ... ADD CONSTRAINT <conname> CHECK ...
Table constraints UNIQUE, PRIMARY KEY and EXCLUDE may have the DEFERRABLE
option in their command syntax. This PR handles the option when deparsing the
relevant constraint statements.
NOT DEFERRABLE and INITIALLY IMMEDIATE (if DEFERRABLE) are the default values
for the option, so we only append the non-default values to the ALTER TABLE
statement.
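An example of a constraint that uses the non-default options and therefore gets them appended during deparsing (table and constraint names hypothetical):
```sql
ALTER TABLE orders
    ADD CONSTRAINT orders_id_unique UNIQUE (id)
    DEFERRABLE INITIALLY DEFERRED;
```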
In #6412 I made a change to not re-assign the global PID if it was
already set. This inadvertently introduced a regression where `userId`
and `databaseId` would not be set on the backend data when the global
PID was assigned in the authentication hook.
This fixes it by doing two things:
1. Removing `userId` from `BackendData`, since it's not used anywhere
anyway.
2. Move assignment of `databaseId` to a dedicated
`SetBackendDataDatabaseId` function, which isn't a no-op when the global
pid is already set.
Since #6412 is not released yet this does not need a description.
In #6598 it was noticed that Citus could generate syntactically invalid
statements during logical replication. With #6603 we resolved the direct
issue, by only generating valid subscription names. But there was also
the underlying problem that we did not escape certain identifier
strings. While in theory this should be okay since we should only
generate names that are valid, this issue reiterated that we should not
take this for granted. As an extra line of defense this quotes all
identifiers we use during logical replication setup.
Apparently no-one actually ran the mx_base_schedule, because the tests in the
schedule itself were already failing. This updates it to be in line with
multi_mx_schedule again to make the tests pass. Notably, it doesn't contain
multi_mx_node_metadata and multi_extension, because those tests take long to
run and were not necessary to make multi_mx_create_table pass again.