DESCRIPTION: Fixes a bug which causes an error when creating a FOREIGN
KEY constraint without a name if the referenced table is schema
qualified.
When deparsing the command `ALTER TABLE s1.t1 ADD FOREIGN KEY (key)
REFERENCES s2.t2;` back from its cooked form, we should schema-qualify
the referenced table.
Fixes #6982.
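A minimal sketch of the scenario, using the schema and table names from the command above (the rest is illustrative):
```sql
CREATE SCHEMA s1;
CREATE SCHEMA s2;
CREATE TABLE s2.t2 (id int PRIMARY KEY);
CREATE TABLE s1.t1 (key int);
SELECT create_distributed_table('s2.t2', 'id');
SELECT create_distributed_table('s1.t1', 'key', colocate_with => 's2.t2');
-- unnamed constraint; the referenced table is schema qualified, and the
-- deparsed command previously lost the "s2." qualification
ALTER TABLE s1.t1 ADD FOREIGN KEY (key) REFERENCES s2.t2 (id);
```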
When we add the coordinator to metadata, reference tables get
replicated to the coordinator. As a result, we lose some test coverage
since some queries start to run locally instead of being pushed down.
This PR adds new test cases involving distributed tables instead of
reference tables for covering distributed execution in related cases.
The `citus_table_type` column of `citus_tables` and `citus_shards` will
show "schema" for tenant schema tables and "distributed" for single
shard tables that are not in a tenant schema.
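For illustration, assuming a tenant schema table and a plain single shard table exist, the new labels show up like this (names and output hypothetical):
```sql
SELECT table_name, citus_table_type FROM citus_tables;
--    table_name   | citus_table_type
-- ----------------+------------------
--  tenant_1.users | schema
--  single_shard_t | distributed
```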
Changes test files in the multi and multi-1 schedules such that they
accommodate the coordinator in metadata.
Changes fall into the following buckets:
1. When coordinator is in metadata, reference table shards are present
in coordinator too.
This changes test outputs checking the table size, shard numbers etc.
for reference tables.
2. When coordinator is in metadata, postgres tables are converted to
citus local tables whenever a foreign key relationship to them is
created. This changes some test cases which test that it should not be
possible to create foreign keys to postgres tables.
3. Remove lines that add/remove coordinator for testing purposes.
Creating a second PR to make reviewing easier.
This PR tests:
- replicate_reference_tables
- fix_partition_shard_index_names
- isolate_tenant_to_new_shard
- replicate_table_shards
Adds Support for Single Shard Tables in
`update_distributed_table_colocation`.
This PR relaxes the checks that required tables to be hash distributed
so that they accept both hash distributed and single shard distributed
tables.
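A minimal sketch, assuming two existing single shard tables with hypothetical names:
```sql
SELECT update_distributed_table_colocation('single_shard_t1',
                                           colocate_with => 'single_shard_t2');
```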
Verify Citus UDFs work well with single shard tables
SUPPORTED
* citus_table_size
* citus_total_relation_size
* citus_relation_size
* citus_shard_sizes
* truncate_local_data_after_distributing_table
* create_distributed_function // test function colocated with a single
shard table
* undistribute_table
* alter_table_set_access_method
UNSUPPORTED - error out for single shard tables
* master_create_empty_shard
* create_distributed_table_concurrently
* create_distributed_table
* create_reference_table
* citus_add_local_table_to_metadata
* citus_split_shard_by_split_points
* alter_distributed_table
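For example, a supported UDF runs normally on a single shard table while an unsupported one errors out (table name illustrative):
```sql
SELECT citus_table_size('single_shard_t');       -- supported
SELECT create_reference_table('single_shard_t'); -- errors out
```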
DESCRIPTION: Adds citus.enable_schema_based_sharding GUC that allows
sharding the database based on schemas when enabled.
* Refactor the logic that automatically creates Citus managed tables
* Refactor CreateSingleShardTable() to allow specifying a colocation id
* Add support for schema-based-sharding via a GUC
### What this PR is about:
Add **citus.enable_schema_based_sharding GUC** to enable schema-based
sharding. Each schema created while this GUC is ON will be considered
a tenant schema. Later on, regardless of whether the GUC is ON or
OFF, any table created in a tenant schema will be converted to a
single shard distributed table (without a shard key). All the tenant
tables that belong to a particular schema will be co-located with each
other and will have a shard count of 1.
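A minimal sketch of that flow (schema and table names are illustrative):
```sql
SET citus.enable_schema_based_sharding TO ON;
CREATE SCHEMA tenant_1; -- registered as a tenant schema
SET citus.enable_schema_based_sharding TO OFF;
-- regardless of the GUC, tables created in a tenant schema become single
-- shard distributed tables, colocated with each other
CREATE TABLE tenant_1.users (id bigint PRIMARY KEY);
CREATE TABLE tenant_1.orders (id bigint PRIMARY KEY, user_id bigint);
```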
We introduce a new metadata table --pg_dist_tenant_schema-- to do the
bookkeeping for tenant schemas:
```sql
psql> \d pg_dist_tenant_schema
Table "pg_catalog.pg_dist_tenant_schema"
┌───────────────┬─────────┬───────────┬──────────┬─────────┐
│ Column │ Type │ Collation │ Nullable │ Default │
├───────────────┼─────────┼───────────┼──────────┼─────────┤
│ schemaid │ oid │ │ not null │ │
│ colocationid │ integer │ │ not null │ │
└───────────────┴─────────┴───────────┴──────────┴─────────┘
Indexes:
"pg_dist_tenant_schema_pkey" PRIMARY KEY, btree (schemaid)
"pg_dist_tenant_schema_unique_colocationid_index" UNIQUE, btree (colocationid)
psql> table pg_dist_tenant_schema;
┌───────────┬───────────────┐
│ schemaid │ colocationid │
├───────────┼───────────────┤
│ 41963 │ 91 │
│ 41962 │ 90 │
└───────────┴───────────────┘
(2 rows)
```
The colocation id column of pg_dist_tenant_schema can never be NULL,
even for tenant schemas that don't have a tenant table yet. This is
because we assign colocation ids to tenant schemas as soon as they
are created. That way, we can keep associating tenant schemas with
particular colocation groups even if all the tenant tables of a tenant
schema are dropped and recreated later on.
When a tenant schema is dropped, we delete the corresponding row from
pg_dist_tenant_schema. In that case, we delete the corresponding
colocation group from pg_dist_colocation as well.
### Future work for 12.0 release:
We're building schema-based sharding on top of the infrastructure that
adds support for creating distributed tables without a shard key
(https://github.com/citusdata/citus/pull/6867).
However, not all the operations that can be done on distributed tables
without a shard key necessarily make sense (in the same way) in the
context of schema-based sharding. For example, we need to think about
what happens if a user attempts to alter the schema of a tenant table.
We will tackle such scenarios in a future PR.
We will also add a new UDF --citus.schema_tenant_set() or such-- to
allow users to use an existing schema as a tenant schema, and another
one --citus.schema_tenant_unset() or such-- to stop using a schema as
a tenant schema in future PRs.
citus.tenant_stats_limit was set to 2 when we were adding tests for it.
Then we changed it to 10, making the tests incorrect.
This PR fixes that without breaking other tests.
Citus upgrade tests require some additional logic to run, because we
have a before and after schedule and we need to swap the Citus
version in-between. This adds that logic to `run_test.py`.
In passing this makes running upgrade tests locally multiple times
faster by caching tarballs.
DESCRIPTION: Fixes a crash when explain analyze is requested for a query
that is normally locally executed.
When explain analyze is requested for a query, a task with two queries
is created. Those two queries are
1. Wrapped Query --> `SELECT ... FROM
worker_save_query_explain_analyze(<query>, <explain analyze options>)`
2. Fetch Query --> `SELECT explain_analyze_output, execution_duration
FROM worker_last_saved_explain_analyze();`
When the query is locally executed a task with multiple queries causes a
crash in production. See the Assert at
57455dc64d/src/backend/distributed/executor/tuple_destination.c#:~:text=Assert(task%2D%3EqueryCount%20%3D%3D%201)%3B
This becomes a critical issue when auto_explain extension is used. When
auto_explain extension is enabled, explain analyze is automatically
requested for every query.
One possible solution could be not to create two queries for a locally
executed query. The fetch part may not have to be a query since the
values are available in local variables.
Until we enable local execution for explain analyze, it is best to
disable local execution.
Fixes #6777.
DESCRIPTION: Fixes a bug in background shard rebalancer where the
replicate reference tables task fails if the current user is not a
superuser.
This change is to be backported to earlier releases. We should fix the
permissions for replicate_reference_tables on main branch such that it
can be run by non-superuser roles.
Fixes #6925.
Fixes #6926.
I observed a flaky test output
[here](https://app.circleci.com/pipelines/github/citusdata/citus/32692/workflows/32464a22-7fd6-440a-9ff7-cfa62f9ff58a/jobs/1126144)
and added `ORDER BY` clauses to similar queries in the failing test
file.
```diff
SELECT pg_identify_object_as_address(classid, objid, objsubid) from pg_catalog.pg_dist_object where objid IN('viewsc.prop_view3'::regclass::oid, 'viewsc.prop_view4'::regclass::oid);
pg_identify_object_as_address
---------------------------------
- (view,"{viewsc,prop_view3}",{})
(view,"{viewsc,prop_view4}",{})
+ (view,"{viewsc,prop_view3}",{})
(2 rows)
```
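The fix makes the output deterministic along these lines (a sketch):
```sql
SELECT pg_identify_object_as_address(classid, objid, objsubid)
FROM pg_catalog.pg_dist_object
WHERE objid IN ('viewsc.prop_view3'::regclass::oid,
                'viewsc.prop_view4'::regclass::oid)
ORDER BY 1;
```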
Previously, the INSERT .. SELECT planner was pushing down some queries that should not be pushed down, due to wrong colocation checks. It checked whether one of the tables in the SELECT part was colocated with the target table. Now, we check colocation between all tables in the SELECT part and the target table.
Another problem with the INSERT .. SELECT planner was that some queries that are valid to push down were not pushed down, due to unnecessary checks (e.g. the UNION check) for cases that are actually supported. As a solution, we reused the pushdown planner checks for the INSERT .. SELECT planner.
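A hypothetical illustration of the colocation fix, assuming t1 and t2 are colocated while t3 is not:
```sql
-- previously only one of t2/t3 was checked for colocation with t1, so this
-- could be wrongly pushed down; now every table in the SELECT part must be
-- colocated with the target table
INSERT INTO t1 (a, b)
SELECT t2.a, t3.b FROM t2 JOIN t3 ON t2.a = t3.a;
```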
DESCRIPTION: Fixes a bug that causes incorrectly pushing down some
INSERT .. SELECT queries that we shouldn't
DESCRIPTION: Prevents unnecessarily pulling the data into coordinator
for some INSERT .. SELECT queries
DESCRIPTION: Drops support for pushing down INSERT .. SELECT with append
table as target
Fixes #6749.
Fixes #1428.
Fixes #6920.
---------
Co-authored-by: aykutbozkurt <aykut.bozkurt1995@gmail.com>
We mark objects as distributed in Citus metadata only if we need to
propagate the command that creates them to worker nodes. For this
reason, we were not doing this for the objects that are created while
pg_dist_node is empty.
One implication of doing so is that we defer the schema propagation to
the time when user creates the first distributed table in the schema.
However, this doesn't help for schema-based sharding (#6866) because we
want to sync pg_dist_tenant_schema to the worker nodes even for empty
schemas.
* Support test dependencies for isolation tests without a schedule
* Comment out a test due to a known issue (#6901)
* Also, reduce the verbosity for some log messages and make some
tests compatible with run_test.py.
Fixes #6779.
DESCRIPTION: Disables citus.enable_non_colocated_router_query_pushdown
GUC by default to ensure generating a consistent distributed plan for
the queries that reference non-colocated distributed tables
We already have tests for the cases where this GUC is disabled,
so I'm not adding any more tests in this PR.
Also make multi_insert_select_window idempotent.
Related to: #6793
When we bump columnar version, some tests fail because of the output
change. Instead of changing those lines every time, I think it is better
to normalize it in tests.
A test in background_rebalance_parallel.sql was failing intermittently
where the order of tasks in the output was not deterministic. This
commit fixes the test by removing id columns for the background tasks in
the output.
A sample failing diff before this patch is below:
```diff
SELECT D.task_id,
(SELECT T.command FROM pg_dist_background_task T WHERE T.task_id = D.task_id),
D.depends_on,
(SELECT T.command FROM pg_dist_background_task T WHERE T.task_id = D.depends_on)
FROM pg_dist_background_task_depend D WHERE job_id in (:job_id) ORDER BY D.task_id, D.depends_on ASC;
task_id | command | depends_on | command
---------+---------------------------------------------------------------------+------------+---------------------------------------------------------------------
- 1014 | SELECT pg_catalog.citus_move_shard_placement(85674026,50,57,'auto') | 1013 | SELECT pg_catalog.citus_move_shard_placement(85674025,50,56,'auto')
- 1016 | SELECT pg_catalog.citus_move_shard_placement(85674032,50,57,'auto') | 1015 | SELECT pg_catalog.citus_move_shard_placement(85674031,50,56,'auto')
- 1018 | SELECT pg_catalog.citus_move_shard_placement(85674038,50,57,'auto') | 1017 | SELECT pg_catalog.citus_move_shard_placement(85674037,50,56,'auto')
- 1020 | SELECT pg_catalog.citus_move_shard_placement(85674044,50,57,'auto') | 1019 | SELECT pg_catalog.citus_move_shard_placement(85674043,50,56,'auto')
+ 1014 | SELECT pg_catalog.citus_move_shard_placement(85674038,50,57,'auto') | 1013 | SELECT pg_catalog.citus_move_shard_placement(85674037,50,56,'auto')
+ 1016 | SELECT pg_catalog.citus_move_shard_placement(85674044,50,57,'auto') | 1015 | SELECT pg_catalog.citus_move_shard_placement(85674043,50,56,'auto')
+ 1018 | SELECT pg_catalog.citus_move_shard_placement(85674026,50,57,'auto') | 1017 | SELECT pg_catalog.citus_move_shard_placement(85674025,50,56,'auto')
+ 1020 | SELECT pg_catalog.citus_move_shard_placement(85674032,50,57,'auto') | 1019 | SELECT pg_catalog.citus_move_shard_placement(85674031,50,56,'auto')
(4 rows)
```
Notice that the dependent and depended-on tasks have the same commands,
but they have different task ids.
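A sketch of a stabilized form of the query, dropping the nondeterministic task ids from the output:
```sql
SELECT (SELECT T.command FROM pg_dist_background_task T
        WHERE T.task_id = D.task_id) AS command,
       (SELECT T.command FROM pg_dist_background_task T
        WHERE T.task_id = D.depends_on) AS depends_on_command
FROM pg_dist_background_task_depend D
WHERE job_id IN (:job_id)
ORDER BY 1, 2;
```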
* Add support for dist insert select by selecting from a reference
table.
This was the only pushable insert .. select case that
#6773 didn't cover.
* For the cases where we insert into a Citus table but the INSERT ..
SELECT query cannot be pushed down, allow pull-to-coordinator when
possible.
Remove the checks that we had at the very beginning of
CreateInsertSelectPlanInternal so that we can try insert .. select via
pull-to-coordinator for the cases where we cannot push down the insert
.. select query. What we support via pull-to-coordinator is still
limited due to lack of logical planner support for SELECT queries,
but this commit at least allows using pull-to-coordinator for the cases
where the select query can be planned via router planner, without
limiting ourselves to restrictive top-level checks.
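Hypothetical examples of the two paths described above:
```sql
-- pushable: the SELECT part only reads from a reference table
INSERT INTO dist_table SELECT * FROM ref_table;
-- not pushable, but planned via pull-to-coordinator as long as the SELECT
-- part can be planned by the router planner
INSERT INTO dist_table
SELECT * FROM other_dist_table WHERE a = 1 LIMIT 10;
```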
Also introduce some additional restrictions into
CreateDistributedInsertSelectPlan for the cases it was missing to check
for null-shard-key tables. Indeed, it would make more sense to have
those checks for distributed tables in general, via separate PRs against
main branch. See https://github.com/citusdata/citus/pull/6817.
* Add support for inserting into a Postgres table.
Enable router planner and a limited version of INSERT .. SELECT planner
for the queries that reference colocated null shard key tables.
* SELECT / UPDATE / DELETE / MERGE is supported as long as it's a router
query.
* INSERT .. SELECT is supported as long as it only references colocated
null shard key tables.
Note that this is not only limited to distributed INSERT .. SELECT but
also covers a limited set of query types that require
pull-to-coordinator, e.g., due to a LIMIT clause, generate_series() etc.
(Ideally distributed INSERT .. SELECT could handle such queries too,
e.g., when we're only referencing tables that don't have a shard key,
but today this is not the case. See
https://github.com/citusdata/citus/pull/6773#discussion_r1140130562.)
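A minimal sketch, assuming two colocated tables created without a shard key (names illustrative):
```sql
SELECT create_distributed_table('nullkey_t1', null);
SELECT create_distributed_table('nullkey_t2', null, colocate_with => 'nullkey_t1');
-- router queries are supported
SELECT * FROM nullkey_t1 JOIN nullkey_t2 USING (a);
-- INSERT .. SELECT between colocated null shard key tables is supported
INSERT INTO nullkey_t1 SELECT * FROM nullkey_t2;
```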
Add tests for ddl coverage:
* indexes
* partitioned tables + indexes with long names
* triggers
* foreign keys
* statistics
* grant & revoke statements
* truncate & vacuum
* create/test/drop view that depends on a dist table with no shard key
* policy & rls test
* alter table add/drop/alter_type column (using sequences/different data
types/identity columns)
* alter table add constraint (not null, check, exclusion constraint)
* alter table add column with a default value / set default / drop
default
* alter table set option (autovacuum)
* indexes / constraints without names
* multiple subcommands
Adds support for
* Creating new partitions after distributing (with null key) the parent
table
* Attaching partitions to a distributed table with null distribution key
(and automatically distribute the new partition with null key as well)
* Detaching partitions from it
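A sketch of the partitioning flow, with illustrative names:
```sql
CREATE TABLE events (id bigint, at date) PARTITION BY RANGE (at);
SELECT create_distributed_table('events', null);
-- the new partition is automatically distributed with a null key and
-- colocated with its parent
CREATE TABLE events_2023 PARTITION OF events
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');
ALTER TABLE events DETACH PARTITION events_2023;
```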
With this PR, we allow creating distributed tables without specifying a
shard key via create_distributed_table(). Here are the important details
about those tables:
* Specifying `shard_count` is not allowed because it is assumed to be 1.
* We mostly call such tables "null shard-key" tables in code / comments.
* To avoid a breaking layout change in create_distributed_table(),
instead of throwing an error, it informs the user that the
`distribution_type` param is ignored unless it's explicitly set to NULL
or 'h'.
* `colocate_with` param allows colocating such null shard-key tables
with each other.
* We define this table type, i.e., NULL_SHARD_KEY_TABLE, as a subclass
of DISTRIBUTED_TABLE because we mostly want to treat them as distributed
tables in terms of SQL / DDL / operation support.
* Metadata for such tables look like:
- distribution method => DISTRIBUTE_BY_NONE
- replication model => REPLICATION_MODEL_STREAMING
- colocation id => **!=** INVALID_COLOCATION_ID (distinguishes from
Citus local tables)
* We assign colocation groups for such tables to different nodes in a
round-robin fashion based on the modulo of "colocation id".
Note that this PR doesn't care about DDL (except CREATE TABLE) / SQL /
operation (i.e., Citus UDFs) support for such tables but adds a
preliminary API.
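A minimal sketch of the API and the resulting metadata (names illustrative):
```sql
CREATE TABLE nullkey_table (a int, b text);
SELECT create_distributed_table('nullkey_table', null, colocate_with => 'none');
-- expect DISTRIBUTE_BY_NONE ('n'), the streaming replication model, and a
-- valid colocation id that distinguishes it from Citus local tables
SELECT partmethod, repmodel, colocationid
FROM pg_dist_partition
WHERE logicalrelid = 'nullkey_table'::regclass;
```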
When working on changelog, Marco suggested in
https://github.com/citusdata/citus/pull/6856#pullrequestreview-1386601215
that we should bump columnar version to 11.3 as well.
This PR aims to contain all the necessary changes to allow upgrades to
and downgrades from 11.3.0 for columnar. Note that updating citus
extension version does not affect columnar as the two extension versions
are not really coupled.
The same changes will also be applied to the release branch in
https://github.com/citusdata/citus/pull/6897
We are handling colocation groups with shard group count less than the
worker node count, using a method different than the usual rebalancer.
See #6739
While making the decision of using this method or not, we should've
ignored the nodes that are marked `shouldhaveshards = false`. This PR
excludes those nodes when making the decision.
Adds a test such that:
coordinator: []
worker 1: [1_1, 1_2]
worker 2: [2_1, 2_2]
(rebalance)
coordinator: []
worker 1: [1_1, 2_1]
worker 2: [1_2, 2_2]
If we take the coordinator into account, the rebalancer considers the
first state as balanced and does nothing (because shard_count <
worker_count)
But with this PR, we ignore the coordinator because it's marked
`shouldhaveshards = false`.
So the rebalancer distributes each colocation group to both workers
Also, fixes an unrelated flaky test in the same file
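For reference, a node can be marked this way (host and port are illustrative):
```sql
SELECT citus_set_node_property('localhost', 57636, 'shouldhaveshards', false);
```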
We need to break sequence dependency for a table while creating the
table during non-transactional metadata sync to ensure idempotency of
the creation of the table.
**Problem:**
When we send `SELECT
pg_catalog.worker_drop_sequence_dependency(logicalrelid::regclass::text)
FROM pg_dist_partition` to workers during the non-transactional sync,
the table might not be in `pg_dist_partition` at the worker, and the
sequence dependency is not broken at the worker.
**Solution:**
We break sequence dependency via `SELECT
pg_catalog.worker_drop_sequence_dependency(logicalrelid::regclass::text)`
for each table while creating it at the workers. It is safe to send
since the udf is a no-op when there is no sequence dependency.
DESCRIPTION: Fixes a bug related to sequence idempotency at
non-transactional sync.
Fixes https://github.com/citusdata/citus/issues/6888.
There was a bug related to regex. We sometimes caught the wrong line
when the test name is also included in comments.
Example: We caught the wrong line as multi_metadata_sync is included in
the comment before the test line.
```
# ----------
# multi_metadata_sync tests the propagation of mx-related metadata changes to metadata workers
# multi_unsupported_worker_operations tests that unsupported operations error out on metadata workers
# ----------
test: multi_metadata_sync
```
Solution: Restrict regex rule better.
In #6814 we started using the Python test runner for upgrade tests in
run_test.py, instead of the Perl based one. This had a problem though,
not all tests in minimal_schedule can be run with the Python runner.
This adds a separate minimal schedule for the pg_upgrade tests which
doesn't include the tests that break with the Python runner.
This PR also fixes various other issues that came up while testing
the upgrade tests.
- Query generator is used to create queries allowed by the grammar documented at `query_generator/query_gen.py` (currently it contains only joins).
- This PR adds a CI test which utilizes the query generator to compare the results of generated queries that are executed on Citus tables and local (undistributed) tables. It fails if there is an unexpected error in the results. The error can be related to Citus, the query generator, or even Postgres.
- The tool is configured by the file `query_generator/config/config.yaml`, which limits table counts in generated queries and sets many table-related parameters (e.g. row count).
- Run time of the CI task can be configured from the config file. By default, we run 250 queries with a maximum table count of 40 inside each query.
`PlaceHolderVar` is not relevant to be processed inside a restriction
clause; otherwise, `pull_var_clause_default` would throw an error. PG
would create the restriction on the physical `Var` that the
`PlaceHolderVar` points to anyway, so it is safe to skip this
restriction.
DESCRIPTION: Fixes a bug related to a WHERE clause list that contains a
placeholder.
Fixes https://github.com/citusdata/citus/issues/6758
DESCRIPTION: Changes the regression test setups adding the coordinator
to metadata by default.
When creating a Citus cluster, the coordinator can be added to metadata
explicitly by running the `citus_set_coordinator_host` function. Adding
the coordinator to metadata allows creating citus managed local tables.
Other Citus functionality is expected to be unaffected.
This change adds the coordinator to metadata by default when creating
test clusters in regression tests.
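Concretely, the setup boils down to a call like the following (host and port are illustrative):
```sql
SELECT citus_set_coordinator_host('localhost', 57636);
```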
There are 3 ways to run commands in a sql file (or a schedule which is a
sequence of sql files) with Citus regression tests. Below is how this PR
adds the coordinator to metadata for each.
1. `make <schedule_name>`
Changed the sql files (sql/multi_cluster_management.sql and
sql/minimal_cluster_management.sql) which sets up the test clusters such
that they call `citus_set_coordinator_host`. This ensures any following
tests will have the coordinator in metadata by default.
2. `citus_tests/run_test.py <sql_file_name>`
Changed the python code that sets up the cluster to always call
`citus_set_coordinator_host`.
For the upgrade tests, a version check is included to make sure the
`citus_set_coordinator_host` function is available for a given version.
3. `make check-arbitrary-configs`
Changed the python code that sets up the cluster to always call
`citus_set_coordinator_host`.
#6864 will be used to track the remaining work which is to change the
tests where coordinator is added/removed as a node.
This PR updates the tenant stats implementation to set partitionKeyValue
and colocationId in ExecuteLocalTaskListExtended, in addition to
LocallyExecuteTaskPlan. This ensures that tenant stats can be properly
gathered regardless of the code path taken. The changes were initially
made while testing stored procedure calls for tenant stats.
Fixes the bug that causes updating the citus_stat_tenants periods
incorrectly.
`TimestampDifferenceExceeds` expects the difference in milliseconds but
it was given microseconds; this is fixed.
`tenantStats->lastQueryTime` was updated during monitoring too; now it's
updated only when there are tenant queries.
DESCRIPTION:
Makefile changes to build different versions of the CDC decoder for
different base decoders like pgoutput and wal2json, with the same name,
and copy them to the $packagelib/cdc_decoders dir. This helps the user
use logical replication slots normally with pgoutput without being aware
of the CDC decoder.
1) Changed src/backend/distributed/cdc/Makefile to set up a build
directory for CDC in the build-cdc-$(DECODER) dir, copy the source files
(.c, .h and Makefile.decoder) to the build dir, and build it for each
base decoder.
2) Copy the pgoutput.so and wal2json.so into the above build dir and
install them in the PG packagelibdir/citus_decoders directory.
3) Added a testcase 016_cdc_wal2json.pl for testing the wal2json decoder
using the pg_recv_logical_changes function.
DESCRIPTION: Adds control for background task executors involving a node
### Background and motivation
Nonblocking concurrent task execution via background workers was
introduced in [#6459](https://github.com/citusdata/citus/pull/6459), and
concurrent shard moves in the background rebalancer were introduced in
[#6756](https://github.com/citusdata/citus/pull/6756) - with a hard
dependency that limits to 1 shard move per node. As we know, a shard
move consists of a shard moving from a source node to a target node. The
hard dependency was used because the background task runner didn't have
an option to limit the parallel shard moves per node.
With the motivation of controlling the number of concurrent shard
moves that involve a particular node, either as source or target, this
PR introduces a general new GUC
citus.max_background_task_executors_per_node to be used in the
background task runner infrastructure. So, why do we even want to
control and limit the concurrency? Well, it's all about resource
availability: because the moves involve the same nodes, extra
parallelism won’t make the rebalance complete faster if some resource is
already maxed out (usually cpu or disk). Or, if the cluster is being
used in a production setting, the moves might compete for resources with
production queries much more than if they had been executed
sequentially.
### How does it work?
A new column named nodes_involved is added to the catalog table that
keeps track of the scheduled background tasks,
pg_dist_background_task. It is of type integer[] - to store a list
of node ids. It is NULL by default - the column will be filled by the
rebalancer, but we may not care about the nodes involved in other uses
of the background task runner.
Table "pg_catalog.pg_dist_background_task"
Column | Type
============================================
job_id | bigint
task_id | bigint
owner | regrole
pid | integer
status | citus_task_status
command | text
retry_count | integer
not_before | timestamp with time zone
message | text
+nodes_involved | integer[]
A hashtable named ParallelTasksPerNode keeps track of the number of
parallel running background tasks per node. An entry in the hashtable is
as follows:
```
ParallelTasksPerNodeEntry
{
    node_id // The node id is used as the hash table key
    counter // Number of concurrent background tasks that involve node node_id
            // The counter limit is citus.max_background_task_executors_per_node
}
```
When the background task runner assigns a runnable task to a new
executor, it increments the counter for each of the nodes involved with
that runnable task. The limit of each counter is
citus.max_background_task_executors_per_node. If the limit is reached
for any of the nodes involved, this runnable task is skipped. And then,
later, when the running task finishes, the background task runner
decrements the counter for each of the nodes involved with the done
task. The following functions take care of these increment-decrement
steps:
IncrementParallelTaskCountForNodesInvolved(task)
DecrementParallelTaskCountForNodesInvolved(task)
citus.max_background_task_executors_per_node can be changed on the
fly. In the background rebalancer, we simply give {source_node,
target_node} as the nodesInvolved input to the
ScheduleBackgroundTask function. The rest is taken care of by the
general background task runner infrastructure explained above. Check
background_task_queue_monitor.sql and
background_rebalance_parallel.sql tests for detailed examples.
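For example, the limit can be adjusted at runtime (value illustrative):
```sql
ALTER SYSTEM SET citus.max_background_task_executors_per_node = 2;
SELECT pg_reload_conf();
```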
#### Note
This PR also adds a hard node dependency if a node is first being used
as a source for a move, and then later as a target. The reason this
should be a hard dependency is that the first move might make space for
the second move. So, we could run out of disk space (or at least
overload the node) if we move the second shard to it before the first
one is moved away.
Fixes https://github.com/citusdata/citus/issues/6716
---------
Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
Fixes flakiness in multi_metadata_sync test
https://app.circleci.com/pipelines/github/citusdata/citus/31863/workflows/ea937480-a4cc-4646-815c-bb2634361d98/jobs/1074457
```diff
SELECT
logicalrelid, repmodel
FROM
pg_dist_partition
WHERE
logicalrelid = 'mx_test_schema_1.mx_table_1'::regclass
OR logicalrelid = 'mx_test_schema_2.mx_table_2'::regclass;
logicalrelid | repmodel
-----------------------------+----------
- mx_test_schema_1.mx_table_1 | s
mx_test_schema_2.mx_table_2 | s
+ mx_test_schema_1.mx_table_1 | s
(2 rows)
```
This is a simple issue of missing `ORDER BY` clauses. I went ahead and
added some other missing ones in the same file as well. Also, I replaced
existing `ORDER BY logicalrelid` with `ORDER BY logicalrelid::text`, in
order to compare names, not OIDs.
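The stabilized query thus becomes:
```sql
SELECT logicalrelid, repmodel
FROM pg_dist_partition
WHERE logicalrelid = 'mx_test_schema_1.mx_table_1'::regclass
   OR logicalrelid = 'mx_test_schema_2.mx_table_2'::regclass
ORDER BY logicalrelid::text;
```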
DESCRIPTION: Adds views that monitor statistics on tenant usages
This PR adds `citus_stats_tenants` view that monitors the tenants on the
cluster.
`citus_stats_tenants` shows the node id, colocation id, tenant
attribute, read count in this period and last period, and query count in
this period and last period of the tenant.
Tenant attribute currently is the tenant's distribution column value,
later when schema based sharding is introduced, this meaning might
change.
A period is a time bucket the queries are counted by. Read and query
counts for this period can increase until the current period ends. After
that those counts are moved to last period's counts, which cannot
change. The period length can be set using 'citus.stats_tenants_period'.
`SELECT` queries are counted as _read_ queries, `INSERT`, `UPDATE` and
`DELETE` queries are counted as _write_ queries. So in the view read
counts are `SELECT` counts and query counts are `SELECT`, `INSERT`,
`UPDATE` and `DELETE` count.
The data is stored in shared memory, in a struct named
`MultiTenantMonitor`.
`citus_stats_tenants` shows the data from local tenants.
`citus_stats_tenants` shows up to `citus.stats_tenant_limit` number of
tenants.
The tenants are scored based on the number of queries they run and the
recency of those queries. Every query run increases the score of the
tenant by `ONE_QUERY_SCORE`, and after every period ends the scores are
halved. Halving is done lazily.
To retain information longer, the monitor keeps up to 3 times
`citus.stats_tenant_limit` tenants. When the tenant count hits `3 *
citus.stats_tenant_limit`, the last `citus.stats_tenant_limit` tenants
are removed. To see all stored tenants you can use
`citus_stats_tenants(return_all_tenants := true)`.
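For example, assuming column names that follow the description above:
```sql
SELECT tenant_attribute, query_count_in_this_period FROM citus_stats_tenants;
-- include all stored tenants, not only the top-scored ones
SELECT tenant_attribute, query_count_in_this_period
FROM citus_stats_tenants(return_all_tenants := true);
```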
- [x] Create collector view that gets data from all nodes. #6761
- [x] Add monitoring log #6762
- [x] Create enable/disable GUC #6769
- [x] Parse the annotation string correctly #6796
- [x] Add local queries and prepared statements #6797
- [x] Rename to citus_stat_statements #6821
- [x] Run pgbench
- [x] Fix role permissions #6812
---------
Co-authored-by: Gokhan Gulbiz <ggulbiz@gmail.com>
Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
Over the last few months run_test.py got more and more complex. This
refactors the code in `run_test.py` to be better understandable. Mostly
this splits up separate pieces of logic into separate functions.
In CI we would sometimes get this failure:
```diff
-- The original shard is marked for deferred drop with policy_type = 2.
-- The previous shard should be dropped at the beginning of the second split call
SELECT * from pg_dist_cleanup;
record_id | operation_id | object_type | object_name | node_group_id | policy_type
-----------+--------------+-------------+--------------------------------------------------------------------------+---------------+-------------
+ 60 | 778 | 3 | citus_shard_split_slot_18_21216_778 | 16 | 0
512 | 778 | 1 | citus_split_shard_by_split_points_deferred_schema.table_to_split_8981001 | 16 | 2
-(1 row)
+(2 rows)
```
Replication slots sometimes cannot be deleted right away. Which is hard
to resolve, but luckily we can filter these cleanup records out easily
by filtering by policy_type.
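A sketch of the filtered check, keeping only the deferred-drop records the test is actually about:
```sql
-- policy_type = 2 marks shards deferred for drop; replication slot records
-- are filtered out
SELECT * FROM pg_dist_cleanup WHERE policy_type = 2;
```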
While debugging this issue I learnt that we did not use
`GetNextCleanupRecordId` in all places where we created cleanup
records. This caused test failures when running tests multiple times,
when they set `citus.next_cleanup_record_id`. I tried fixing that by
calling GetNextCleanupRecordId in all places but that caused many
other tests to fail due to deadlocks. So, instead this addresses
that issue by using `ALTER SEQUENCE ... RESTART` instead of
`citus.next_cleanup_record_id`. In a follow up PR we should
probably get rid of `citus.next_cleanup_record_id`, since it's
only used in one other file.
DESCRIPTION: Fix an issue that caused some queries with custom
aggregates to fail
While playing around with https://github.com/pgvector/pgvector I noticed
that the AVG query was broken. That's because we treat it like any other
AVG by breaking it down into SUM and COUNT, but there are no SUM/COUNT
functions in this case; there is, however, a perfectly usable
combinefunc. This PR changes our aggregate logic to prefer custom
aggregates with a combinefunc even if they have a common name.
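For instance, with pgvector's vector type a query like this started working (illustrative):
```sql
CREATE TABLE items (id bigint, embedding vector(3));
SELECT create_distributed_table('items', 'id');
-- pgvector's avg(vector) has no SUM/COUNT pair, only a combinefunc
SELECT avg(embedding) FROM items;
```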
Co-authored-by: Marco Slot <marco.slot@gmail.com>
DESCRIPTION:
- The CDC decoder is refactored into a separate extension that can be
loaded dynamically without having to reload citus.
- CDC decoder code can be compiled using the DECODER flag to work with
different decoders like pgoutput and wal2json. By default the base
decoder is "pgoutput".
- The dynamic_library_path config is adjusted dynamically to prefer the
decoders in the cdc_decoders directory during citus init, so that users
can use the replication subscription commands without having to make any
config changes.
For some tests such as upgrade tests and arbitrary config tests we set
up the citus cluster using Python. This setup is slightly different from
the perl based setup script (`multi_regress.pl`). Most importantly it
uses replication factor 1 by default.
This changes our run_test.py script to be able to run a schedule using
python instead of `multi_regress.pl`, for the tests that require it.
For now arbitrary config tests are still not runnable with
`run_test.py`, but this brings us one step closer to being able to do
that.
Fixes #6804
Having as little Perl as possible in our repo seems a worthy goal.
Sadly, Postgres's Perl-based TAP infrastructure was the only way in
which we could run tests that were hard to do using only SQL commands.
This change adds infrastructure to run such "application style tests"
using python and converts all our existing Perl TAP tests to this new
infrastructure.
Some of the helper functions that are added in this PR are currently
unused. Most of these will be used by the CDC PR that depends on this.
Some others are there because they were needed by the PgBouncer test
framework that this is based on, and the functions seemed useful enough
to citus testing to keep.
The main features of the test suite are:
1. Application style tests using a programming language that our
developers know how to write.
2. Caching of Citus clusters in-between tests using the ["fixture"
pattern][fixture] from `pytest` to achieve speedy tests. To make this
work in practice any changes made during a test are automatically
undone. Schemas, replication slots, subscriptions, publications are
dropped at the end of each test. And any changes made by `ALTER SYSTEM`
or manually editing of `pg_hba.conf` are undone too.
3. Automatic parallel execution of tests using the `-n auto` flag that's
added by `pytest-xdist`. This improved the speed of tests greatly with
the similar test framework I created for PgBouncer. Right now it doesn't
help much yet though, since this PR only adds two tests (one of which
takes ~10 times longer than the other).
Possible future improvements are:
1. Clean up even more things at the end of each test (e.g. users that
were created). These are fairly easy to add, but I have not done so yet
since they were not needed yet for this PR or the CDC PR. So I would not
be able to test the cleanup easily.
2. Support for query block detection similar to what we can now do using
isolation tests.
[fixture]: https://docs.pytest.org/en/6.2.x/fixture.html
Add new metadata sync methods which use the MemorySyncContext API so
that during the sync we can
- free memory to prevent OOM,
- use either transactional or nontransactional modes according to the
GUC.
This pull request proposes a change to the logic used for propagating
identity columns to worker nodes in citus. Instead of creating a
dependent sequence for each identity column and changing its default
value to `nextval(seq)/worker_nextval(seq)`, this update will pass the
identity columns as-is to the worker nodes.
Please note that there are a few limitations to this change.
1. Only bigint identity columns will be allowed in distributed tables to
ensure compatibility with the DDL from any node functionality. Our
current distributed sequence implementation only allows insert
statements from all nodes for bigint sequences.
2. `alter_distributed_table` and `undistribute_table` operations will
not be allowed for tables with identity columns. This is because we do
not have a proper way of keeping sequence states consistent across the
cluster.
DESCRIPTION: Prevents using identity columns on data types other than
`bigint` on distributed tables
DESCRIPTION: Prevents using `alter_distributed_table` and
`undistribute_table` UDFs when a table has identity columns
DESCRIPTION: Fixes a bug that prevents enforcing identity column
restrictions on worker nodes
Depends on #6740. Fixes #6694.
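A sketch of the new restrictions (names illustrative):
```sql
-- bigint identity columns are allowed
CREATE TABLE items (id bigint GENERATED ALWAYS AS IDENTITY, val int);
SELECT create_distributed_table('items', 'val');
-- smaller identity types error out
CREATE TABLE bad_items (id int GENERATED ALWAYS AS IDENTITY, val int);
SELECT create_distributed_table('bad_items', 'val'); -- errors out
-- so do alter_distributed_table / undistribute_table on such tables
SELECT undistribute_table('items'); -- errors out
```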
DESCRIPTION: This PR removes the task dependencies between shard moves
for which the shards belong to different colocation groups. This change
results in scheduling multiple tasks in the RUNNABLE state. Therefore it
is possible that the background task monitor can run them concurrently.
Previously, all the shard moves planned in a rebalance operation took
dependency on each other sequentially.
For instance, given the following tables and shards
```
colocation group 1            colocation group 2
table1   table2   table3      table4   table5
shard11  shard21  shard31     shard41  shard51
shard12  shard22  shard32     shard42  shard52
```
if the rebalancer planner returned the below set of moves
`{move(shard11), move(shard12), move(shard41), move(shard42)}`
background rebalancer scheduled them such that they depend on each other
sequentially.
```
{move(reftables) if there is any, none}
|
move(shard11)
|
move(shard12)
| {move(shard41)<--- move(shard12)} This is an artificial dependency
move(shard41)
|
move(shard42)
```
This results in artificial dependencies between otherwise independent
moves.
Considering that the shards in different colocation groups can be moved
concurrently, this PR changes the dependency relationship between the
moves as follows:
```
{move(reftables) if there is any, none} {move(reftables) if there is any, none}
| |
move(shard11) move(shard41)
| |
move(shard12) move(shard42)
```
---------
Co-authored-by: Jelte Fennema <jelte.fennema@microsoft.com>
Description:
Implementing CDC changes using Logical Replication to avoid
re-publishing events multiple times by setting up a replication origin
session, which will add "DoNotReplicateId" to every WAL entry, during:
- shard splits
- shard moves
- create distributed table
- undistribute table
- alter distributed tables (for some cases)
- reference table operations
The citus decoder which will be decoding WAL events for CDC clients,
ignores any WAL entry with replication origin that is not zero.
It also maps the shard names to distributed table names.
Today we allow planning the queries that reference non-colocated tables
if the shards that query targets are placed on the same node. However,
this may not be the case, e.g., after rebalancing shards because it's
not guaranteed to have those shards on the same node anymore.
This commit adds citus.enable_non_colocated_router_query_pushdown GUC
that can be used to disallow planning such queries via router planner,
when it's set to false. Note that the default value for this GUC will be
"true" for 11.3, but we will alter it to "false" on 12.0 to not
introduce
a breaking change in a minor release.
Closes#692.
Even more, allowing such queries to go through the router planner also
causes generating an incorrect plan for the DML queries that reference
distributed tables that are sharded based on different replication
factor settings. For this reason, #6779 can be closed after altering the
default value for this GUC to "false", hence not now.
DESCRIPTION: Adds `citus.enable_non_colocated_router_query_pushdown` GUC
to ensure generating a consistent distributed plan for the queries that
reference non-colocated distributed tables (when set to "false", the
default is "true").
Soon I will be doing some changes related to #692 in router planner
and those changes require updating ~5/6 tests related to router
planning. And to make those test files runnable by run_test.py
multiple times, we need to make some other tests (that they're
run in parallel / they badly depend on) ready for run_test.py too.
This would be useful for testing #6773. This is because, given that
#6773 only adds support for router / fast-path queries, theoretically
almost all the tests that we have in that test file should work for
null-shard-key tables too (and they indeed do).
I deliberately did not replace multi_router_planner_fast_path.sql with
the one that I'm adding into arbitrary configs because we might still
want to see when we're able to go through fast-path planning for the
usual distributed tables (the ones that have a shard key).
DESCRIPTION: Check before logicalrep for rebalancer, error if needed
Check if we can use logical replication or not, in case of shard
transfer mode = auto, before executing the shard moves. If we can't,
error out. Before this PR, we used to error out in the middle of shard
moves:
```sql
set citus.shard_count = 4; -- just to get the error sooner
select citus_remove_node('localhost',9702);
create table t1 (a int primary key);
select create_distributed_table('t1','a');
create table t2 (a bigint);
select create_distributed_table('t2','a');
select citus_add_node('localhost',9702);
select rebalance_table_shards();
NOTICE: Moving shard 102008 from localhost:9701 to localhost:9702 ...
NOTICE: Moving shard 102009 from localhost:9701 to localhost:9702 ...
NOTICE: Moving shard 102012 from localhost:9701 to localhost:9702 ...
ERROR: cannot use logical replication to transfer shards of the relation t2 since it doesn't have a REPLICA IDENTITY or PRIMARY KEY
```
Now we check and error out in the beginning, without moving the shards.
fixes: #6727
ci/fix_styles.sh was complaining that the `black` and `isort` packages
were not found, even after `pipenv install --dev`, due to a broken lock
file. I regenerated the lock file and now it works fine. We also wanted
to upgrade the required python version for the pipfile.
Fixes #6672
2) Move all MERGE related routines to a new file merge_planner.c
3) Make ConjunctionContainsColumnFilter() static again, and rearrange the code in MergeQuerySupported()
4) Restore the original format in the comments section.
5) Add big serial test. Implement latest set of comments
This implements phase II of MERGE SQL support.
Support routable query where all the tables in the merge-sql are
distributed and co-located, and both the source and target relations are
joined on the distribution column with a constant qual. This should be a
Citus single-task query. Below is an example.
```sql
SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => 't1');

MERGE INTO t1
USING s1 ON t1.id = s1.id AND t1.id = 100
WHEN MATCHED THEN
    UPDATE SET val = s1.val + 10
WHEN MATCHED THEN
    DELETE
WHEN NOT MATCHED THEN
    INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src);
```
Basically, MERGE checks to see if:
- There are a minimum of two distributed tables (source and a target).
- All the distributed tables are indeed colocated.
- MERGE relations are joined on the distribution column:
  MERGE .. USING .. ON target.dist_key = source.dist_key
- The query touches only a single shard, i.e. the JOIN is ANDed with a
  constant qual:
  MERGE .. USING .. ON target.dist_key = source.dist_key AND target.dist_key = <>
If any of the conditions are not met, it raises an exception.
(cherry picked from commit 44c387b978)
This implements MERGE phase III.
Support pushdown query where all the tables in the merge-sql are
Citus-distributed and co-located, and both the source and target
relations are joined on the distribution column. This will generate
multiple tasks which execute independently after pushdown.
```sql
SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => 't1');

MERGE INTO t1
USING s1
ON t1.id = s1.id
WHEN MATCHED THEN
    UPDATE SET val = s1.val + 10
WHEN MATCHED THEN
    DELETE
WHEN NOT MATCHED THEN
    INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src);
```
*The only exception for both phases II and III is that UPDATEs and
INSERTs must be done on the same shard-group as the joined key; for
example, the below scenarios are NOT supported, as the key-value to be
inserted/updated is not guaranteed to be on the same node as the id
distribution-column.
```sql
MERGE INTO target t
USING source s ON (t.customer_id = s.customer_id)
WHEN NOT MATCHED THEN
    INSERT (customer_id, …) VALUES (<non-local-constant-key-value>, ……);
```
Or this scenario where we update the distribution column itself:
```sql
MERGE INTO target t
USING source s ON (t.customer_id = s.customer_id)
WHEN MATCHED THEN
    UPDATE SET customer_id = 100;
```
(cherry picked from commit fa7b8949a8)
In #6720 I'm adding a `pytest` based testing framework. This adds the
dependencies for it. They have already been [merged into our docker
files][the-process-merge] in the the-process repo in preparation for
#6720. But by not having them on our citus main branch it is impossible
to make changes to the Pipfile, because our CI Dockerfiles and master
are out of date.
Since #6720 will need some more discussion and might take a few more
weeks to be merged, this takes out the Pipfile changes. By merging this
PR we can unblock new Pipfile changes.
Unblocks and partially addresses #6766
[the-process-merge]: https://github.com/citusdata/the-process/pull/117
In the past, having columnar tables in the cluster was causing pg
upgrades to fail when attempting to access columnar metadata. This is
because pg_dump doesn't see the objects that we use for columnar-am
related bookkeeping as dependencies of the tables using columnar-am.
To fix that, in #5456 we inserted some "normal dependency" edges (from
those objects to columnar-am) into pg_depend.
This helped us ensure the existence of a class of metadata objects
--such as columnar.storageid_seq-- and helped fix #5437.
However, the normal-dependency edges that we added for indexes on
columnar metadata tables --such as columnar.stripe_pkey-- didn't help at
all because they were indeed causing dependency loops (#5510) and
pg_dump was not able to take those dependency edges into account.
For this reason, this commit deletes those dependency edges so that
pg_dump stops complaining about them. Note that it's not critical to
delete those edges from pg_depend since they're not breaking pg upgrades
but were triggering some warning messages. And given that backporting
a sql change into older versions is quite hard, we skip backporting
this.
This commit hides port numbers in upgrade_columnar_after because the
port numbers assigned to nodes in the upgrade schedule differ from the
ones that the flaky test detector assigns.
When run_test.py is run for an upgrade_.*_after.sql file, automatically
run the corresponding upgrade_.*_before.sql file first. This is because
all those upgrade_.*_after.sql files depend on the objects created in
the upgrade_.*_before.sql files by definition.
So that we can run the tests that require fake_fdw by using minimal
schedule too.
Also move multi_create_fdw.sql up in multi_1_schedule to make it
available to more tests.
DESCRIPTION: Fixes a bug in shard copy operations.
For copying shards in both shard move and shard split operations, Citus
uses the COPY statement.
A COPY-all statement in the following form
`COPY target_shard FROM STDIN;`
throws an error when there is a GENERATED column in the shard table.
In order to fix this issue, we need to exclude the GENERATED columns in
the COPY and the matching SELECT statements. Hence this fix converts the
COPY and SELECT all statements to the following form:
```
COPY target_shard (col1, col2, ..., coln) FROM STDIN;
SELECT (col1, col2, ..., coln) FROM source_shard;
```
where (col1, col2, ..., coln) does not include a GENERATED column.
GENERATED column values are created in the target_shard as the values
are inserted.
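For illustration, with a hypothetical GENERATED column:
```sql
CREATE TABLE target_shard (col1 int,
                           col2 int GENERATED ALWAYS AS (col1 * 2) STORED);
-- a plain "COPY target_shard FROM STDIN;" fails; listing only the
-- non-generated columns works, and col2 is computed as rows are inserted
COPY target_shard (col1) FROM STDIN;
```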
Fixes #6705.
---------
Co-authored-by: Teja Mupparti <temuppar@microsoft.com>
Co-authored-by: aykut-bozkurt <51649454+aykut-bozkurt@users.noreply.github.com>
Co-authored-by: Jelte Fennema <jelte.fennema@microsoft.com>
Co-authored-by: Gürkan İndibay <gindibay@microsoft.com>
DESCRIPTION: Adds logic to distribute unbalanced shards
If the number of shard placements (for a colocation group) is less than
the number of workers, it means that some of the workers will remain
empty. With this PR, we consider these shard groups as a colocation
group, in order to distribute them as evenly as possible across the
cluster.
Example:
```sql
create table t1 (a int primary key);
create table t2 (a int primary key);
create table t3 (a int primary key);
set citus.shard_count =1;
select create_distributed_table('t1','a');
select create_distributed_table('t2','a',colocate_with=>'t1');
select create_distributed_table('t3','a',colocate_with=>'t2');
create table tb1 (a bigint);
create table tb2 (a bigint);
select create_distributed_table('tb1','a');
select create_distributed_table('tb2','a',colocate_with=>'tb1');
select citus_add_node('localhost',9702);
select rebalance_table_shards();
```
Here we have two colocation groups, each with one shard group. Both
shard groups are placed on the first worker node. When we add a new
worker node and try to rebalance table shards, the rebalance planner
considers it well balanced and does nothing. With this PR, the
rebalancer tries to distribute these shard groups evenly across the
cluster as much as possible. For this example, with this PR, the
rebalancer moves one of the shard groups to the second worker node.
fixes: #6715
DESCRIPTION: Correctly report shard size in citus_shards view
When looking at citus_shards, people are interested in the actual size
that all the data related to the shard takes up on disk.
`pg_total_relation_size` is the function to use for that purpose. The
previously used `pg_relation_size` does not include indexes or TOAST.
Especially the missing TOAST can have an enormous impact on the size of
the shown data.
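The gist of the change, in terms of the functions involved (shard name hypothetical):
```sql
SELECT pg_relation_size('some_shard_102008');       -- before: no indexes/TOAST
SELECT pg_total_relation_size('some_shard_102008'); -- after: full footprint
```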
With this small change, arbitrary config tests can have multiple acceptable correct outputs.
For an arbitrary config tests named `t`, now you can define `expected/t.out`, `expected/t_0.out`, `expected/t_1.out` etc and the test will succeed if the output of `sql/t.sql` is equal to any of the `t.out` or `t_{0, 1, ...}.out` files.
First of all, this commit sets next_shard_id for
single_node_truncate.sql because shard ids in the test output were
changing whenever we modify a prior test file.
Then the flaky test detector started complaining about
single_node_truncate.sql. We fix that by specifying the correct
test dependency for it in run_test.py.
In #6718 I accidentally added Python type hint syntax that was only
supported on Python 3.10. Our CI uses 3.9, so this PR changes that to a
syntax that's supported on 3.9 too.
* Simple fix is to add ORDER BY to have definitive results.
* Add search_path explicitly after reconnecting; this avoids creating
objects in the public schema, which prevents us from running the tests
repetitively.
* multi_mx_modification is not designed to run repetitively, so isolate
it.
Some of our tests depend on previous tests. Normally all these tests
should be part of a base schedule, but that's not always the case. The
flaky test detection script should ensure that we don't introduce other
dependencies by accident in new tests. But we have many old tests that
are not worth the effort of changing. This adds a way to define such
test dependencies in `run_test.py`, so that it can make sure to run any
dependencies before the actual test.
Our repo was complaining about the cryptography package being
vulnerable. This updates it, including our mitmproxy fork, because that
was pinning an outdated version.
Relevant commit on our mitmproxy fork:
2fd18ef051
Relevant PR on the-process:
https://github.com/citusdata/the-process/pull/112
The failure_create_distributed_table_non_empty test would sometimes fail
like this:
```diff
-- in the first test, cancel the first connection we sent from the coordinator
SELECT citus.mitmproxy('conn.cancel(' || pg_backend_pid() || ')');
- mitmproxy
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR: canceling statement due to user request
+CONTEXT: COPY mitmproxy_result, line 0
+SQL statement "COPY mitmproxy_result FROM '/home/circleci/project/src/test/regress/tmp_check/mitmproxy.fifo'"
+PL/pgSQL function citus.mitmproxy(text) line 11 at EXECUTE
SELECT create_distributed_table('test_table', 'id');
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/30474/workflows/be1c9f9d-22c9-465c-964a-dcdd1cb8c99c/jobs/985441
Because the cancel command had no filter it would actually sometimes
cancel the mitmproxy cancel command itself. This PR addresses that by
simply removing this test.
This is basically the exact same issue as #6217, only in a different
place in the file. It's fixed here by removing the test since there's
already many different similar tests.
DESCRIPTION: Fix background rebalance when reference table has no PK
For the background rebalance we would always fail if a reference table
that was not replicated to all nodes would not have a PK (or replica
identity). Even when we used force_logical or block_writes as the shard
transfer mode. This fixes that and adds some regression tests.
Fixes#6680
We should disallow dropping the table_name option if the foreign table
is in metadata. Otherwise, we get a table-not-found error which contains
the shard id.
DESCRIPTION: Fixes an unexpected foreign table error by disallowing
dropping the table_name option.
Fixes #6663
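The now-disallowed command looks like this (table name hypothetical):
```sql
-- errors out if the foreign table is in Citus metadata
ALTER FOREIGN TABLE foreign_table OPTIONS (DROP table_name);
```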
This change is a precursor to attempts to add more editorconfig rules
in our codebase. It is a good idea to comply with POSIX standards and
have an empty newline at the end of text files. However, once we have
such a rule, the arbitrary config scripts would fail without this
change.
Related: #5981
Fixes #6570.
In the past, having columnar tables in the cluster was causing pg
upgrades to fail when attempting to access columnar metadata. This is
because pg_dump doesn't see the objects that we use for columnar-am
related bookkeeping as dependencies of the tables using columnar-am.
To fix that, in #5456 we inserted some "normal dependency" edges (from
those objects to columnar-am) into pg_depend.
This helped us ensure the existence of a class of metadata objects
--such as columnar.storageid_seq-- and helped fix #5437.
However, the normal-dependency edges that we added for indexes on
columnar metadata tables --such as columnar.stripe_pkey-- didn't help at
all because they were indeed causing dependency loops (#5510) and
pg_dump was not able to take those dependency edges into account.
For this reason, instead of inserting such dependency edges from indexes
to columnar-am, we allow columnar metadata accessors to fall-back to
sequential scan during pg upgrades.
Sometimes isolation_non_blocking_shard_split would fail like this:
```diff
step s2-show-pg_dist_cleanup:
SELECT object_name, object_type, policy_type FROM pg_dist_cleanup;
object_name |object_type|policy_type
------------------------------+-----------+-----------
+citus_shard_split_slot_2_10_39| 3| 0
public.to_split_table_1500001 | 1| 2
-(1 row)
+(2 rows)
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/30237/workflows/edcf34b7-d7d3-4d10-8293-b6f59b00cdf2/jobs/970960
The reason is that replication slots have now become part of
pg_dist_cleanup too, and sometimes they cannot be cleaned up right away.
This is harmless as they will be cleaned up eventually. So this simply
filters out the replication slots for those tests.
The recursive planner should handle the whole tree from bottom to top
in a single pass, i.e. it should have already recursively planned all
required parts in its first pass. Otherwise, it means we have a bug in
the recursive planner, which needs to be handled. We add a check here
and return an error.
DESCRIPTION: Fixes wrong results by throwing an error in case the
recursive planner passes over the query multiple times.
We found 3 different cases which causes recursive planner passes the
query multiple times.
1. Sublink in WHERE clause is planned at second pass after we
recursively planned a distributed table at the first pass. Fixed by PR
#6657.
2. Local-distributed joins are recursively planned at both the first and
the second pass. Issue #6659.
3. Some parts of the query are considered to be noncolocated at the
second pass, as we do not generate attribute equivalences between
nondistributed and distributed tables. Issue #6653
DESCRIPTION: Fix foreign key validation skip at the end of shard move
In eadc88a we started completely skipping foreign key constraint
validation at the end of a non blocking shard move, instead of only for
foreign keys to reference tables. However, it turns out that this didn't
work at all because of a hard to notice bug: By resetting the
SkipConstraintValidation flag at the end of our utility hook, we
actually make the SET command that sets it a no-op.
This fixes that bug by removing the code that resets it. This is fine
because #6543 removed the only place where we set the flag in C code. So
the resetting of the flag has no purpose anymore. This PR also adds a
regression test, because it turned out we didn't have any; otherwise we
would have caught that the feature was completely broken.
It also moves the constraint validation skipping to the utility hook.
The reason is that #6550 showed us that this is the better place to skip
it, because it will also skip the planning phase and not just the
execution.
We should do the sublink conversions at the end of recursive planning,
because earlier steps might have transformed the query into a shape that
needs its sublinks recursively planned.
DESCRIPTION: Fixes early sublink check at recursive planner.
Related to PR https://github.com/citusdata/citus/pull/6650
This change allows creating a constraint without a name using an index.
The index name will be used as the constraint name the same way postgres
handles it.
Fixes issue #6644
This commit also cleans up some leftovers from nameless constraint checks.
With this commit, we now fully support adding all nameless constraints
directly to a table.
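For example (names are illustrative), the unique index lends its name
to the new constraint:
```sql
CREATE UNIQUE INDEX items_key_idx ON items (key);
ALTER TABLE items ADD UNIQUE USING INDEX items_key_idx;
-- the resulting constraint is named items_key_idx
```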
Co-authored-by: naisila <nicypp@gmail.com>
Adds the NOT VALID option to the deparser. When we need to deparse:
"ALTER TABLE ADD FOREIGN KEY ... NOT VALID"
"ALTER TABLE ADD CHECK ... NOT VALID"
the NOT VALID option should be propagated to the workers.
Fixes issue #6646
This commit also uses AppendColumnNameList function
instead of repeated code blocks in two appropriate places
in the "ALTER TABLE" deparser.
If an update query on a reference table has a RETURNING clause with a
subquery that accesses some other local table, we end up with a crash.
This commit prevents the crash, but does not prevent other error
messages from happening due to Citus not being able to pushdown the
results of that subquery in a valid SQL command.
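A hypothetical minimal shape of the problematic query, where `ref` is a
reference table and `local_t` a plain postgres table:
```sql
UPDATE ref SET a = 1
RETURNING a, (SELECT max(b) FROM local_t);
```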
Related: #6634
DESCRIPTION: Fix regression in allowed foreign keys on distributed
tables
In commit eadc88a we changed how we skip foreign key validation. The
goal was to skip it in more cases. However, one change had the
unintended regression of introducing failures when trying to create
certain foreign keys. This reverts that part of the change.
The way of skipping validation of foreign keys that was introduced in
eadc88a was skipping validation during execution. The reason that
this caused this regression was because some foreign key validation
queries already fail during planning. In those cases it never gets to
the execution step where it would later be skipped.
Fixes #6543
Multiple `check-xxx` targets create tablespaces. If you run
two of these at the same time you would get an error like:
```diff
CREATE TABLESPACE test_tablespace LOCATION :'test_tablespace';
+ERROR: directory "/home/rajesh/citus/citus/src/test/regress/tmp_check/ts0/PG_14_202107181" already in use as a tablespace
```
This fixes that by moving tablespace directory creation and removal
into pg_regress_multi.pl instead of the Makefile.
DESCRIPTION: Enable adding FOREIGN KEY constraints on Citus tables
without a name
This PR enables adding a foreign key to a distributed/reference/Citus
local table without specifying the name of the constraint, e.g. `ALTER
TABLE items ADD FOREIGN KEY (user_id) REFERENCES users (id);`
This implements phase II of MERGE SQL support:
Support routable queries where all the tables in the merge-sql are
distributed and co-located, and both the source and target relations are
joined on the distribution column with a constant qual. This should be a
Citus single-task query. Below is an example.
```sql
SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => 't1');

MERGE INTO t1
USING s1 ON t1.id = s1.id AND t1.id = 100
WHEN MATCHED THEN
    UPDATE SET val = s1.val + 10
WHEN MATCHED THEN
    DELETE
WHEN NOT MATCHED THEN
    INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src);
```
Basically, MERGE checks to see if:
* There are a minimum of two distributed tables (a source and a target).
* All the distributed tables are indeed colocated.
* The MERGE relations are joined on the distribution column:
  MERGE .. USING .. ON target.dist_key = source.dist_key
* The query touches only a single shard, i.e. the JOIN is ANDed with a
  constant qual:
  MERGE .. USING .. ON target.dist_key = source.dist_key AND target.dist_key = <>
If any of the conditions are not met, it raises an exception.
citus_job_list() lists all background jobs by simply showing the records
in pg_dist_background_job.
citus_job_status(job_id bigint, raw boolean default false) shows the
status of a single background job by appending a jsonb details column to
the associated row from pg_dist_background_job. If the raw argument is
set, machine readable sizes are used instead of human readable
alternatives.
citus_rebalance_status(raw boolean default false) shows the status of
the last rebalance operation. If the raw argument is set, machine
readable sizes are used instead of human readable alternatives.
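Usage sketches for these UDFs, following the signatures above (the job
id is illustrative):
```sql
SELECT * FROM citus_job_list();
SELECT * FROM citus_job_status(1);                  -- human readable sizes
SELECT * FROM citus_rebalance_status(raw => true);  -- machine readable sizes
```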
The original implementation of GPIDs didn't work correctly when using
`pg_dist_poolinfo` together with PgBouncer. The reason is that it
assumed that once a connection was made to a worker, the originating
GPID should stay the same forever. But when pg_dist_poolinfo is used
this isn't the case, because the same connection on the worker might be
used by different backends of the coordinator.
This fixes that issue by updating the GPID whenever a new application
name is set on a connection. This is the only thing that's needed,
because PgBouncer already sets the application name correctly on the
server connection whenever a client is updated.
DESCRIPTION: Enable adding CHECK constraints on distributed tables
without the client having to provide a constraint name.
This PR enables the following command syntax for adding check
constraints to distributed tables.
ALTER TABLE ... ADD CHECK ...
by creating a default constraint name and transforming the command into
the below syntax before sending it to workers.
ALTER TABLE ... ADD CONSTRAINT \<conname> CHECK ...
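For example (names illustrative; the generated name follows postgres
naming conventions):
```sql
-- what the client runs
ALTER TABLE products ADD CHECK (price > 0);
-- roughly what gets sent to the workers
ALTER TABLE products ADD CONSTRAINT products_price_check CHECK (price > 0);
```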
Table Constraints UNIQUE, PRIMARY KEY and EXCLUDE may have option
DEFERRABLE in their command syntax. This PR handles the option when
deparsing the relevant constraint statements.
NOT DEFERRABLE
and
INITIALLY IMMEDIATE (if DEFERRABLE)
are the default values for the option, so we only append the non-default
values to the alter table statement.
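For example (names illustrative), only the non-default choices show up
in the deparsed statement:
```sql
ALTER TABLE reservations
    ADD CONSTRAINT reservations_room_day_key UNIQUE (room_id, day)
    DEFERRABLE INITIALLY DEFERRED;
```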
Apparently no-one actually ran the mx_base_schedule, because the tests
in the schedule itself were already failing. This updates it to be in
line with multi_mx_schedule again to make the tests pass. Notably it
doesn't contain multi_mx_node_metadata and multi_extension, because
those tests take long to run and they were not necessary to make
multi_mx_create_table pass again.
DESCRIPTION: Adds support for creating table constraints UNIQUE and
EXCLUDE via ALTER TABLE command without client having to specify a name.
ALTER TABLE ... ADD CONSTRAINT <conname> UNIQUE ...
ALTER TABLE ... ADD CONSTRAINT <conname> EXCLUDE ...
commands require the client to provide an explicit constraint name.
However, in postgres it is possible for clients not to provide a name
and let postgres generate one, using the following commands:
ALTER TABLE ... ADD UNIQUE ...
ALTER TABLE ... ADD EXCLUDE ...
This PR enables the same functionality for citus tables.
DESCRIPTION: Drop `SHARD_STATE_TO_DELETE` and use the cleanup records
instead
Drops the shard state that is used to mark shards as orphaned. Now we
insert cleanup records into `pg_dist_cleanup` so "orphaned" shards will
be dropped either by the maintenance daemon or by internal cleanup
calls. With this PR, we make the "cleanup orphaned shards" functions
no-ops, as they are not needed anymore.
This PR includes some naming changes to placement functions. We don't
need functions that filter orphaned shards, as there will be no orphaned
shards anymore.
We will also be introducing a small script with this PR, for users with
orphaned shards. We'll basically delete the orphaned shard entries from
`pg_dist_placement` and insert cleanup records into `pg_dist_cleanup`
for each one of them, during Citus upgrade.
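A rough sketch of what that script boils down to (the record contents
are simplified, and `qualified_shard_name` is a hypothetical helper for
building the shard's name):
```sql
-- turn legacy TO_DELETE placements into cleanup records
INSERT INTO pg_dist_cleanup (operation_id, object_type, object_name,
                             node_group_id, policy_type)
SELECT 0, 1, qualified_shard_name(shardid), groupid, 0
FROM pg_dist_placement
WHERE shardstate = 4;  -- the old TO_DELETE state
DELETE FROM pg_dist_placement WHERE shardstate = 4;
```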
We also have a lot of flakiness fixes in this PR.
Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
Sometimes our `isolation_insert_vs_vacuum` test would fail like this.
```diff
step s2-vacuum-analyze:
VACUUM ANALYZE test_insert_vacuum;
-
+ <waiting ...>
step s1-commit:
COMMIT;
+step s2-vacuum-analyze: <... completed>
```
The reason seems to be that VACUUM ANALYZE tries to take some locks that
conflict with the other transaction, but these locks somehow get
released or VACUUM ANALYZE stops waiting for them. This is somewhat
expected since VACUUM has some special locking logic.
To solve the flakiness we now make VACUUM ANALYZE always report as
blocking, and after that we explicitly wait for it to complete. This is
done as suggested by the flaky test tips from postgres:
c68a183990/src/test/isolation/README (L152)
I've confirmed that this fixes the issue using our flaky-test-debugging
CI workflow.
DESCRIPTION: Defers cleanup after a failure in shard move or split
We don't need to do a cleanup in case of failure on a shard transfer or
split anymore, because:
* The maintenance daemon will clean them up anyway.
* We trigger a cleanup at the beginning of shard transfers/splits.
* The cleanup-on-failure logic can itself fail sometimes, and instead of
the original error we then throw the error raised by the cleanup
procedure, which causes confusion.
* Skip some exceptional test files in the flaky workflow, like
multi_extension
* Run some tests without a schedule, like single_node_enterprise
* Use minimal schedule for the tests in split and operations schedules
DESCRIPTION: Cleanup the shard on the target node in case of a
failed/aborted shard move
Inserts a cleanup record for the moved shard placement on the target
node. If the move operation succeeds, the record will be deleted. If
not, it will remain there to be cleaned up later.
fixes: #6580
We have several version checks in our Citus upgrade tests. However, as
we drop support for PG versions, we need to update the Citus versions
used in our CI images. Therefore we must compare Citus versions in our
tests instead of using equality checks, so that the queries are run in
all the associated Citus versions.
For example, we have many conditionals where we early exit if the Citus
version is not equal to 9.0. However, as of today we never use version
9.0 and thus we always early exit in those tests.
All the tables (target, source or any CTE present) in the SQL statement
are local, i.e. a merge-sql with a combination of Citus local and
non-Citus tables (regular Postgres tables) should work and give the same
result as Postgres MERGE on regular tables. Catch and throw an exception
(not-yet-supported) for all other scenarios during the Citus planning
phase.
DESCRIPTION: Support ALTER TABLE .. ADD PRIMARY KEY ... command
Before processing an
> **ALTER TABLE ... ADD PRIMARY KEY ...**
command:
1. Create a primary key name to use as the constraint name.
2. Change the **ALTER TABLE ... ADD PRIMARY KEY ...** command into the
**ALTER TABLE ... ADD CONSTRAINT \<constraint name> PRIMARY KEY ...**
form. This is the only form in which we can specify a name for a
primary key. If we ran ALTER TABLE .. ADD PRIMARY KEY, postgres would
create a constraint name internally using its own scheme. But the
problem is that we need to create constraint names for shards in our
own scheme, which is \<constraint name>_\<shardid>. Hence we need to
create a name and send it to the workers so that they can append the
shardid.
3. Run the changed command on the coordinator to make sure we are using
the same constraint name across the board.
4. Send the changed command to the workers so that it is executed for
the main table as well as for the shards.
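A sketch of the transformation (names are illustrative; Citus generates
its own constraint name):
```sql
-- what the client runs
ALTER TABLE items ADD PRIMARY KEY (id);
-- roughly what is executed on the coordinator and the workers
ALTER TABLE items ADD CONSTRAINT items_pkey PRIMARY KEY (id);
```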
Fixes #6515.
Before this commit, we created an additional WaitEventSet for
checking whether the remote socket is closed per connection -
only once at the start of the execution.
However, for certain workloads, such as pgbench select-only
workloads, the creation/deletion of the additional WaitEventSet
adds ~7% CPU overhead, which is also reflected on the benchmark
results.
With this commit, we use the same WaitEventSet for the purposes
of checking the remote socket at the start of the execution.
We use "rebuildWaitEventSet" flag so that the executor can re-use
the existing WaitEventSet.
As a result, we see the following improvements on PG 15:
main : 120051 tps, 0.532 ms latency avg.
avoid_wes_rebuild: 127119 tps, 0.503 ms latency avg.
And, on PG 14, as expected, there is no difference:
main : 129191 tps, 0.495 ms latency avg.
avoid_wes_rebuild: 129480 tps, 0.494 ms latency avg.
But, note that PG 15 is slightly (~1.5%) slower than PG 14.
That is probably the overhead of checking the remote socket.
- Because of the make command used for vanilla tests, the test status is
always shown as success on CI. As a fix, I added `&& false` at the end
of the diff-file copy command, to make the command fail when
check-vanilla fails.
```make
check-vanilla: all
$(pg_regress_multi_check) --vanillatest || (cp $(vanilla_diffs_file) $(citus_abs_srcdir)/regression.diffs && false)
```
- I also fixed some vanilla tests that failed due to recently added
clock-related operators showing up in some queries.
We already have citus_job_wait to wait until a job reaches the desired
state. This PR adds waiting on task state to allow more granular
waiting. It can be used for Citus operations and is also useful for
testing purposes (wait until a task reaches a specified state).
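A hedged usage sketch, assuming the task-level wait mirrors the
citus_job_wait signature (ids and status values are illustrative):
```sql
SELECT citus_job_wait(101, desired_status => 'finished');
SELECT citus_task_wait(1001, desired_status => 'done');
```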
Related to #6459.
Fixes task executor SIGTERM handling.
Problem:
When task executors are sent SIGTERM, their default handler
`bgworker_die`, which is set at worker startup, logs a FATAL error. But
they do not release their locks there before logging the error, which
sometimes causes the monitor to hang, e.g. the monitor waits for the
lock forever at pg_stat flush after calling proc_exit.
Solution:
Because executors have a connection to the backend, they should handle
SIGTERM similarly to normal backends. Normal backends use the `die`
handler, in which they set the ProcDiePending flag, and the next
CHECK_FOR_INTERRUPTS call handles it gracefully by releasing any locks
before termination.
This PR adds a new CI workflow named ```flaky-test``` to run flaky test
detection on newly introduced regression tests.
Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
Adding a testing function `wait_for_resource_cleanup` which waits until
all records in `pg_dist_cleanup` are cleaned up. The motivation is to
prevent flakiness in our tests, since the `NOTICE: cleaned up X orphaned
resources` message is not consistent in many cases. This PR replaces
`citus_cleanup_orphaned_resources` calls with
`wait_for_resource_cleanup` calls.
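A minimal sketch of the idea behind such a helper (not the actual
implementation): poll pg_dist_cleanup until it is empty.
```sql
CREATE OR REPLACE FUNCTION wait_for_resource_cleanup()
RETURNS void AS $$
BEGIN
    -- the real helper presumably bounds this loop; kept unbounded for brevity
    WHILE EXISTS (SELECT 1 FROM pg_dist_cleanup) LOOP
        PERFORM pg_sleep(0.1);
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```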
DESCRIPTION: Create replication artifacts with unique names
We're creating replication objects with generic names. This prevents us
from enabling parallel shard moves, as two operations might use the same
objects. With this PR, we'll create the objects below with operation
specific names, by appending the OperationId to the names.
* Subscriptions
* Publications
* Replication Slots
* Users created for subscriptions
1) Regular users fail to use the clock UDF due to a permission issue.
2) Clock functions were declared as STABLE, whereas by definition they
are VOLATILE. By design, any clock/time function returns different
results for each call, even within a single SQL statement.
Note: the UDF citus_get_transaction_clock() is a misnomer, as it
internally calls the clock tick, which always returns different results
for every invocation in the same transaction.
Adds signal handlers for graceful termination, cancellation of
task executors and detecting config updates. Related to PR #6459.
#### How to handle the termination signal?
The monitor needs to gracefully terminate all running task executors
before terminating itself. Hence, we have a SIGTERM handler for the
monitor.
#### How to handle the cancellation signal?
The monitor needs to gracefully cancel all running task executors before
terminating. Hence, we have a SIGINT handler for the monitor.
#### How to detect configuration changes?
The monitor has a SIGHUP handler to reflect configuration changes while
executing tasks.
We are having some flakiness in our test schedule because of objects
left over from shard moves/splits. With this commit we prevent logging
cleanup object counts.
fixes: #6534
DESCRIPTION: Extend cleanup process for replication artifacts
This PR adds new cleanup record types for:
* Subscriptions
* Replication slots
* Publications
* Users created for subscriptions
We add records for these object types, to `pg_dist_cleanup` during
creation phase. Once the operation is done, in case of success or
failure, we iterate those records and drop the objects. With this PR we
will not be dropping any of these objects during the operation. In
short, we will always be deferring the drop.
One thing that's worth mentioning is that we sort cleanup records before
processing (dropping) them, because of dependency relations among those
objects, e.g. a subscription might depend on a publication. Therefore, we
always drop subscriptions before publications.
We have some renames in this PR:
* `TryDropOrphanedShards` -> `TryDropOrphanedResources`
* `DropOrphanedShardsForCleanup` -> `DropOrphanedResourcesForCleanup`
* `run_try_drop_marked_shards` -> `run_try_drop_marked_resources`
as these functions now process replication artifacts as well.
This PR drops the function `DropAllLogicalReplicationLeftovers` and all
of its usages, since we now rely on the deferred drop mechanism.
Improvement on our background task monitoring API (PR #6296) to support
concurrent and nonblocking task execution.
Mainly, we have a queue monitor background process which forks task
executors for `Runnable` tasks and then monitors their status by
fetching messages from a shared memory queue in a nonblocking way.
**Problem**: Currently, we error out if we detect recurring tuples in
one side without checking the other side of the join.
**Solution**: When one side of the full join consists of recurring
tuples and the other side consists of nonrecurring tuples, we should not
push down, to prevent duplicate results. Otherwise, it is safe to push
down.
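A hypothetical example (`ref` is a reference table, so its tuples recur
on every shard; `dist` is a distributed table):
```sql
-- pushing this down would repeat ref's unmatched rows once per shard,
-- so it must not be pushed down
SELECT * FROM ref FULL JOIN dist ON ref.id = dist.id;
```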
This PR changes the
```citus.propagate_session_settings_for_loopback_connection``` default
value to off, so as not to expose this feature publicly at this point.
See #6488 for details.
When debugging issues it's quite useful to see the originating gpid in
the application_name of a query on a worker. This already happens for
most queries, but not for queries created by the rebalancer or by
run_command_on_worker. This adds a gpid to those two application_names
too.
Note that if the GPID in the new application_name is different from the
backend's current GPID, the backend will continue to keep the old gpid
as its actual GPID. This PR is just meant to make sure
that the application_name is as useful as it can be for users to
look at. Updating of gpids will be done in a follow-up PR, and
adding gpids to all internal connections will make this easier.
Fixes a bug that causes a crash when using the auto_explain extension
with ALTER TABLE...ADD FOREIGN KEY... queries.
Those queries trigger a SELECT query on the citus tables as part of the
foreign key constraint validation check. At the explain hook, workers
try to explain this SELECT query as a distributed query causing memory
corruption in the connection data structures. Hence, we will not explain
ALTER TABLE...ADD FOREIGN KEY... and the triggered queries on the
workers.
Fixes #6424.
I recently cleaned up our test suite from redundant test outputs: #6111,
#6140, #6214, #6434.
I had to check many files manually, as they didn't have any
documentation on why the alternative test output existed in the first
place.
Adding a section in our test docs to remind developers to add
alternative test outputs with enough information/keywords.
DESCRIPTION: Makes sure to disallow triggers that depend on extensions
We were already doing so for `ALTER trigger DEPENDS ON EXTENSION`
commands. However, we also need to disallow creating Citus tables
having such triggers already, so this PR fixes that.
DESCRIPTION: Allow citus_update_node() to work with nodes from different clusters
citus_update_node(), citus_nodename_for_nodeid(), and citus_nodeport_for_nodeid() only checked for nodes in their own cluster; hence the last two returned NULL and the first one showed an error if the nodeId was from a different cluster.
Fixes https://github.com/citusdata/citus/issues/6433
Introduces a monotonically increasing logical clock. The clock is
guaranteed never to go back in value after restarts, and makes a best
attempt to keep the value close to unix epoch time in milliseconds.
Also introduces a new GUC "citus.enable_cluster_clock"; when true, every
distributed transaction is stamped with the logical causal clock and
persisted in the catalog pg_dist_commit_transaction.
DESCRIPTION: Drops GUC defer_drop_after_shard_split
DESCRIPTION: Drops GUC defer_drop_after_shard_move
Drop GUCs and related parts from the code.
Delete tests that were specifically added for the GUCs.
Keep tests that can be used without the GUCs.
Update test output changes.
The motivation for this PR is to have an "always deferring" mechanism.
These two GUCs provided an option to not defer dropping objects during a
shard move/split, but to drop them immediately. With this PR, we will
always defer dropping orphaned shards and other types of objects.
We will have a separate PR to extend the deferred cleanup operation, so
that we create records for deferred drop, for Subscriptions,
Publications, Replication Slots etc. This will make it possible to keep
track of objects created during a shard move/split that need to be
dropped. We will have objects created specifically for the current
operation; and those objects will be dropped at the end.
We have an issue (a draft roadmap) for enabling parallel shard moves.
For details please see: https://github.com/citusdata/citus/issues/6437
Sometimes in CI our failure_split_cleanup test would fail like this:
```diff
CALL pg_catalog.citus_cleanup_orphaned_resources();
-NOTICE: cleaned up 79 orphaned resources
+NOTICE: cleaned up 82 orphaned resources
SELECT operation_id, object_type, object_name, node_group_id, policy_type
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/28107/workflows/4ec712c9-98b5-4e90-9806-e02a37d71679/jobs/846107
The reason was that previous tests in the schedule would also create
some orphaned resources. Sometimes some of those would already be
cleaned up by the maintenance daemon, resulting in a different number of
cleaned up resources than expected. This cleans up any previously
created resources at the start of the test without logging how many
exactly were cleaned up. As a bonus this now also allows running this
test using check-failure-base.
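The idea boils down to something like the following (a sketch; the
test's exact statements may differ):
```sql
SET client_min_messages TO WARNING;  -- hide the "cleaned up N ..." NOTICE
CALL pg_catalog.citus_cleanup_orphaned_resources();
RESET client_min_messages;
```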
Good PR descriptions for flaky tests are quite helpful when reviewing.
Although obviously no PR description is the same, there's a few common
pieces of information that are useful for all PRs that fix flaky tests.
DESCRIPTION: Don't leak search_path to workers on DDL
For DDL we have to set the `search_path` on workers to the same as on
the coordinator for some DDL to work. Previously this search_path would
leak outside of the transaction that was used for the DDL. This fixes
that by using `SET LOCAL` instead of `SET`. The only place where we
still use plain `SET` is for DDL commands that are not allowed within
transactions, such as `CREATE INDEX CONCURRENTLY`.
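A sketch of the difference (the schema name is illustrative):
```sql
BEGIN;
SET LOCAL search_path TO ddl_schema, public;
-- ... propagated DDL runs here ...
COMMIT;
SHOW search_path;  -- back to the session's original value
```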
This fixes this flaky test:
```diff
CONTEXT: SQL statement "SELECT change_id FROM distributed_triggers.data_changes
WHERE shard_key_value = NEW.shard_key_value AND object_id = NEW.object_id
ORDER BY change_id DESC LIMIT 1"
-PL/pgSQL function record_change() line XX at SQL statement
+PL/pgSQL function distributed_triggers.record_change() line 17 at SQL statement
while executing command on localhost:57638
DELETE FROM data_ref_table where shard_key_value = 'hello';
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/27849/workflows/75ae5f1a-100b-4b7a-b991-7de069f39ee1/jobs/831429
I had tried to fix this flaky test in #5894 and then I tried
implementing a better fix in #5896, where @marcocitus suggested this
better fix. This change reverts the fix from #5894 and implements the
fix suggested by Marco.
Our multi_mx_alter_distributed_table test actually depended on the old
buggy search_path leaking behavior. After fixing the bug that test would
fail like this:
```diff
CALL proc_0(1.0);
DEBUG: pushing down the procedure
-NOTICE: Res: 3
-DETAIL: from localhost:xxxxx
+ERROR: relation "test_proc_colocation_0" does not exist
+CONTEXT: PL/pgSQL function mx_alter_distributed_table.proc_0(double precision) line 5 at SQL statement
+while executing command on localhost:57637
RESET client_min_messages;
```
I fixed this test by fully qualifying the table names used in the
procedure. I think it's quite unlikely that actual users depend on this
behavior though, since it would require first doing DDL before calling a
procedure in a session where the search_path was changed after
connecting.
DESCRIPTION: Adds failure test for shard move
DESCRIPTION: Remove function `WaitForAllSubscriptionsToBecomeReady` and
related tests
Adding some failure tests for shard moves.
Dropping the no-longer-needed function
`WaitForAllSubscriptionsToBecomeReady`, as the subscriptions now start
as ready from the beginning, because we don't use logical replication
table sync workers anymore.
fixes: #6260
In CI shard_rebalancer sometimes fails with this error:
```diff
SET citus.node_connection_timeout to 60;
BEGIN;
SET LOCAL citus.shard_replication_factor TO 2;
SET citus.log_remote_commands TO ON;
SET SESSION citus.max_adaptive_executor_pool_size TO 5;
SELECT replicate_table_shards('dist_table_test_2', max_shard_copies := 4, shard_transfer_mode:='block_writes');
+WARNING: could not establish connection after 60 ms
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/28128/workflows/38eeacc4-4191-4366-87ed-9a628414965a/jobs/847458?invite=true#step-107-21
This PR avoids this issue by increasing
```citus.node_connection_timeout``` to 35s.
I fixed a lot of flaky tests recently and I found some patterns in the
type of issues and type of fixes. This adds a document that lists
these types of issues and explains how to fix them.
To be able to test non-blocking shard moves we take an advisory lock, so
we can pause the shard move at an interesting moment. Originally this
was during the logical replication catch up phase. But when I added
tests for the rebalancer progress I moved this lock before the initial
data copy. This allowed testing of the rebalance progress, but
inadvertently made our non-blocking tests not actually test if we held
unintended locks during logical replication catch up.
This fixes that by creating two types of advisory locks, one before the
copy and one after. This causes the tests to actually test their
intended scenario again.
Furthermore it starts using one of these locks for blocking shard moves
too, which allowed me to reduce the complexity of the rebalance progress
test suite quite a bit. It also allowed enabling some flaky tests again,
because this change stopped them from being flaky. And finally it
allowed testing of rebalance progress for blocking shard copy operations
as well.
In passing it fixes a flaky test during parallel blocking shard moves by
ordering the output.
DESCRIPTION: Adds status column to get_rebalance_progress()
Introduces a new column named `status` for the function
`get_rebalance_progress()`. For each ongoing shard move, this column
will reveal information about that shard move operation's current
status.
For now, candidate status messages could be one of the below.
* Not Started
* Setting Up
* Copying Data
* Catching Up
* Creating Constraints
* Final Catchup
* Creating Foreign Keys
* Completing
* Completed
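A hedged usage sketch (column list abbreviated):
```sql
SELECT table_name, shardid, status FROM get_rebalance_progress();
```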
The deparser function set_relation_column_names() knows that it needs
to re-evaluate column names based on the relation's tuple descriptor
when the rte belongs to a relation (RTE_RELATION).
However, before this commit, it didn't know about the fact that citus
might wrap such an rte with an rte that points to the
citus_extradata_container() placeholder.
Because of this, it was simply taking the column aliases (e.g., "bar"
in "foo AS bar") into account, and this might result in an incorrectly
deparsed query, as in the case below:
* Say we had a view based on the following query:
```sql
SELECT a FROM table;
```
* And if we rename column "a" to "b", the view query normally becomes:
```sql
SELECT b AS a FROM table;
```
* So before this commit, deparsing a query based on that view resulted
in the following query, due to deparsing based on the column aliases,
which is not correct:
```sql
SELECT a FROM table;
```
Fixes #5932.
DESCRIPTION: Fixes a bug that might cause failing to query the views
based on tables that have renamed columns
PostgreSQL 15 exposes WL_SOCKET_CLOSED in WaitEventSet API, which is
useful for detecting closed remote sockets. In this patch, we use this
new event and try to detect closed remote sockets in the executor.
When a closed socket is detected, the executor now has the ability to
retry the connection establishment. Note that, the executor can retry
connection establishments only for the connection that has not been
used. Basically, this patch is mostly useful for preventing the
executor from failing if a cached connection is closed because of a
worker node restart (or worker failover).
In other words, the executor cannot retry connection establishment if we
are in a distributed transaction AND any command has been sent over the
connection. That requires more sophisticated retry mechanisms. For now,
fixing the above use case is enough.
Fixes #5538
Earlier discussions: #5908, #6259 and #6283
### Summary of the current approach compared to earlier attempts
As noted, we explored some alternatives before getting into this.
https://github.com/citusdata/citus/pull/6283 is simple, but lacks an
important property. We should be checking for `WL_SOCKET_CLOSED`
_before_ sending anything over the wire. Otherwise, it becomes very
tricky to understand which connection is actually safe to retry. For
example, in the current patch, we can safely check
`transaction->transactionState == REMOTE_TRANS_NOT_STARTED` before
restarting a connection.
#6259 does what we intend here (e.g., check before sending any
command). However, as @marcocitus noted, it is very tricky to handle
`WaitEventSets` in multiple places. And the executor is designed such
that it reacts to events, so adding anything pre-executor seemed too
ugly.
In the end, I converged into this patch. This patch relies on the
simplicity of #6283 and also does a very limited handling of
`WaitEventSets`, just for our purpose. Just before we add any connection
to the execution, we check if the remote session has already been
closed. With that, we briefly interleave multiple wait event processing
loops, with different purposes. The new wait event processing we added
does not even consider cancellations; we let those be handled by the
main event processing loop.
Co-authored-by: Marco Slot <marco.slot@gmail.com>
In CI multi_partitioning sometimes fails with this error:
```diff
SELECT citus_remove_node('localhost', :master_port);
- citus_remove_node
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR: tuple concurrently deleted
-- d) invalid tables for helper UDFs
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/27993/workflows/685e5b20-c923-43e5-8a0d-b932ef4c4914/jobs/839466
This PR avoids this concurrency issue by not running the
multi_partitioning test in parallel with other tests.
If an operation requires having the coordinator in pg_dist_node and
that is not the case, then we automatically add the coordinator into
pg_dist_node, provided the user hasn't added any worker nodes yet.
However, if the user has already added some worker nodes, we throw an
error. With this commit, we improve the error thrown in that case.
Closes #6423, based on the discussion made there.
Sometimes our CI randomly fails on a test in a way similar to this:
```diff
step s2-drop:
DROP TABLE cancel_table;
-
+ <waiting ...>
+step s2-drop: <... completed>
starting permutation: s1-timeout s1-begin s1-sleep10000 s1-rollback s1-reset s1-drop
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/26524/workflows/5415b84f-13a3-482f-bef9-648314c79a67/jobs/756377
I tried to fix that already in #6252 by disabling the maintenance daemon
during isolation tests. But it seems that hasn't fixed all cases of
these errors. This is another attempt at fixing these issues that seems
to have better results.
What it does is start using the pInterestingPids parameter that
citus_isolation_test_session_is_blocked receives. With this change we
start filtering out block-edges that are not caused by any of these
pids.
In passing this change also makes it possible to run
`isolation_create_distributed_table_concurrently` with
`check-isolation-base`
DESCRIPTION: Fix bug in global PID assignment for rebalancer
sub-connections
In CI our isolation_shard_rebalancer_progress test would sometimes fail
like this:
```diff
+isolationtester: canceling step s1-rebalance-c1-block-writes after 60 seconds
step s1-rebalance-c1-block-writes:
SELECT rebalance_table_shards('colocated1', shard_transfer_mode:='block_writes');
- <waiting ...>
+
+ERROR: canceling statement due to user request
step s7-get-progress:
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/27855/workflows/2a7e335a-f3e8-46ed-b6bd-6920d42f7214/jobs/831710
It turned out this was an actual bug in the way our assigning of global
PIDs interacts with the way we connect to ourselves as the shard
rebalancer. The first command the shard rebalancer sends is a SET
command to change the application_name to `citus_rebalancer`. If
`StartupCitusBackend` is called after this command is processed, then it
overwrites the global PID that was extracted from the previous
application_name. This makes sure that we don't do that, and continue to
use the original global PID. While it might seem that we only call
`StartupCitusBackend` once for each query backend, this isn't actually
the case. Whenever pg_dist_partition gets ANALYZEd by autovacuum
we indirectly call `StartupCitusBackend` again, because we invalidate
the cache then.
In passing this fixes two other things as well:
1. It sets `distributedCommandOriginator` correctly in
`AssignGlobalPID`, by using IsExternalClientBackend(). This doesn't
matter much anymore, since AssignGlobalPID effectively becomes a
no-op in this PR for any non-external client backends.
2. It passes the application_name to InitializeBackendData in
StartupCitusBackend, instead of INVALID_CITUS_INTERNAL_BACKEND_GPID
(which effectively got cast to NULL). In practice this doesn't
change the behaviour of the call, since the call is a no-op for every
backend except the maintenance daemon. And the behaviour of the call
is the same for NULL as for the application_name of the maintenance
daemon.
We decrease the verbosity level here to avoid the flaky output:
https://app.circleci.com/pipelines/github/citusdata/citus/27936/workflows/dc63128a-1570-41a0-8722-08f3e3cfe301/jobs/836153
```diff
select alter_table_set_access_method('ref','heap');
NOTICE: creating a new table for alter_table_set_access_method.ref
NOTICE: moving the data of alter_table_set_access_method.ref
NOTICE: dropping the old alter_table_set_access_method.ref
NOTICE: drop cascades to 2 other objects
-DETAIL: drop cascades to materialized view m_ref
-drop cascades to view v_ref
+DETAIL: drop cascades to view v_ref
+drop cascades to materialized view m_ref
CONTEXT: SQL statement "DROP TABLE alter_table_set_access_method.ref CASCADE"
NOTICE: renaming the new table to alter_table_set_access_method.ref
alter_table_set_access_method
-------------------------------
(1 row)
```
When you run vanilla tests in your local environment, some of the tests
try to find the path of regress.so, which is not in the default lib
path. That is why we need to specify the regress.so path via the dlpath
option.
Example failure:
```
LOAD :'regresslib';
+ERROR: could not access file "/home/aykutbozkurt/.pgenv/pgsql-15beta4/lib/regress.so": No such file or directory
```
It is actually in
`~/.pgenv/src/postgresql-15beta4/src/test/regress/regress.so` which is
found by `$regresslibdir`.
PG15 introduced an optimization on GROUP BY keys that is now reverted
in RC2.
Relevant PG commit:
Revert "Optimize order of GROUP BY keys".
443df6e2db932a7cd6d85ddfb67e11a43345130d
Fixes https://github.com/citusdata/citus/issues/6394.
DESCRIPTION: Fixes a bug that causes creating disabled-triggers on
shards as enabled
Since CREATE TRIGGER doesn't have syntax support to specify
whether the trigger should be enabled/disabled, the underlying
PG function (`pg_get_triggerdef()`) that we use to generate the
command to create the trigger is not enough. For this reason, we
append a second command to enable/disable trigger, right after
creating it.
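A sketch of the generated pair of commands for a disabled trigger, with
hypothetical shard and trigger names:
```sql
CREATE TRIGGER audit_trigger
    AFTER INSERT ON items_102008
    FOR EACH ROW EXECUTE FUNCTION record_audit();
ALTER TABLE items_102008 DISABLE TRIGGER audit_trigger;
```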
We also don't retain explicit extension dependencies set by using
`ALTER trigger DEPENDS ON EXTENSION` commands, but apparently the right
fix for that is to throw an error as in
`PreprocessAlterTriggerDependsStmt()`; so, a separate PR was opened to
fix that: #6399.
During alter_distributed_table, we create a new table like the
original table but with the altered options.
To retrieve the name of the distribution column, we were using
the attribute syscache of the new table, since we already created
the new table as identical to the original table.
However, the attribute syscaches of these two tables are not
the same if the original table has dropped columns. The reason
is that dropped columns are all still present in the cache.
Hence, for example, the attnos would be different in the syscaches.
So, let's use the attribute syscache of the original table.
DESCRIPTION: Fixes a bug that prevents retaining columnar table options after a table-rewrite
A fix for this issue: "Columnar: options ignored during ALTER TABLE
rewrite" #5927
The OID of the temporary table created during ALTER TABLE was not the
same as the original table's OID, so the columnar options were not being
applied during the rewrite.
The change is that I applied the original table's columnar options to
the new table so that it has the correct options during the rewrite. I
also added a test.
DESCRIPTION: Adds source_lsn and target_lsn fields into
get_rebalance_progress
Adding two fields named `source_lsn` and `target_lsn` to the function
`get_rebalance_progress`.
Target lsn data is fetched in `GetShardStatistics`, by expanding the
query sent to workers (joining with pg_subscription_rel and
pg_stat_subscription). It is then put into the hashmap, for each shard.
Source lsn data is fetched in `BuildWorkerShardStatisticsHash`, in the
loop that iterates over each node, by sending a pg_current_wal_lsn query
to each node. It is then put into the hashmap, for each node.