citus

Commit Graph

Author	SHA1	Message	Date
Colm	ec141f696a	Enhance MERGE .. WHEN NOT MATCHED BY SOURCE for repartitioned source (#7900 ) DESCRIPTION: Ensure that a MERGE command on a distributed table with a `WHEN NOT MATCHED BY SOURCE` clause runs against all shards of the distributed table. The Postgres MERGE command updates a table using a table or a query as a data source. It provides three ways to match the target table with the source: `WHEN MATCHED` means that there is a row in both the target and source; `WHEN NOT MATCHED` means that there is a row in the source that has no match (is not present) in the target; and, as of PG17, `WHEN NOT MATCHED BY SOURCE` means that there is a row in the target that has no match in the source. In Citus, when a MERGE command updates a distributed table using a local/reference table or a distributed query as source, that source is repartitioned, and for each repartitioned shard that has data (i.e. 1 or more rows) the MERGE is run against the corresponding distributed table shard. Suppose the distributed table has 32 shards, and the source repartitions into 4 shards that have data, with the remaining 28 shards being empty; then the MERGE command is performed on the 4 corresponding shards of the distributed table. However, the semantics of `WHEN NOT MATCHED BY SOURCE` are that the specified action must be performed on the target for each row in the target that is not in the source; so if the source is empty, all target rows should be updated. To see this, consider the following MERGE command: ``` MERGE INTO target AS t USING source AS s ON t.id = s.id WHEN NOT MATCHED BY SOURCE THEN UPDATE t SET t.col1 = 100 ``` If the source has zero rows then every row in the target is updated s.t. its col1 value is 100. Currently in Citus a MERGE on a distributed table with a local/reference table or a distributed query as source ignores shards of the distributed table when the corresponding shard of the repartitioned source has zero rows. However, if the MERGE command specifies a `WHEN NOT MATCHED BY SOURCE` clause, then the MERGE should be performed on all shards of the distributed table, to ensure that the specified action is performed on the target for each row in the target that is not in the source. This PR enhances Citus MERGE execution so that when a repartitioned source shard has zero rows, and the MERGE command specifies a `WHEN NOT MATCHED BY SOURCE` clause, the MERGE is performed against the corresponding shard of the distributed table using an empty (zero row) relation as source, by generating a query of the form: ``` MERGE INTO target_shard_0002 AS t USING (SELECT id FROM (VALUES (NULL) ) source_0002(id) WHERE FALSE) AS s ON t.id = s.id WHEN NOT MATCHED BY SOURCE THEN UPDATE t set t.col1 = 100 ``` This works because each row in the target shard will be updated, and `WHEN MATCHED` and `WHEN NOT MATCHED`, if specified, will be no-ops because the source has zero rows. To implement this when the source is a local or reference table involves teaching function `ExcuteSourceAtCoordAndRedistribution()` in `merge_executor.c` to not prune tasks when the query has `WHEN NOT MATCHED BY SOURCE` but to instead replace the task's query to one that uses an empty relation as source. And when the source is a distributed query, function `ExecuteMergeSourcePlanIntoColocatedIntermediateResults()` (also in `merge_executor.c`) instead of skipping empty tasks now generates a query that uses an empty relation as source for the corresponding target shard of the distributed table, but again only when the query has `WHEN NOT MATCHED BY SOURCE`. A new function `BuildEmptyResultQuery()` is added to `recursive_planning.c` and it is used by both the aforementioned functions in `merge_executor.c` to build an empty relation to use as the source. It applies the appropriate type to each column of the empty relation so the join with the target makes sense to the query compiler.	2025-03-12 12:43:01 +03:00
Naisila Puka	3b1c082791	Drops PG14 support (#7753 ) DESCRIPTION: Drops PG14 support 1. Remove "$version_num" != 'xx' from configure file 2. delete all PG_VERSION_NUM = PG_VERSION_XX references in the code 3. Look at pg_version_compat.h file, remove all _compat functions etc defined specifically for PGXX differences 4. delete all PG_VERSION_NUM >= PG_VERSION_(XX+1), PG_VERSION_NUM < PG_VERSION_(XX+1) ifs in the codebase 5. delete ruleutils_xx.c file 6. cleanup normalize.sed file from pg14 specific lines 7. delete all alternative output files for that particular PG version, server_version_ge variable helps here	2025-03-12 12:43:01 +03:00
Naisila Puka	dce54db494	PG17 compatibility: Resolve compilation issues (#7699 ) This PR provides successful compilation against PG17.0. - Remove ExecFreeExprContext call Relevant PG commit d060e921ea5aa47b6265174c32e1128cebdbc3df `d060e921ea` - PG17 uses streaming IO in analyze, fix scan_analyze_next_block function Relevant PG commit 041b96802efa33d2bc9456f2ad946976b92b5ae1 `041b96802e` - Define ObjectClass for PG17+ only since it's removed Relevant PG commit: 89e5ef7e21812916c9cf9fcf56e45f0f74034656 `89e5ef7e21` - Remove ReorderBufferTupleBuf structure. Relevant PG commit: 08e6344fd6423210b339e92c069bb979ba4e7cd6 `08e6344fd6` - Define colliculocale and daticulocale since they have been renamed Relevant PG commit: f696c0cd5f299f1b51e214efc55a22a782cc175d `f696c0cd5f` - makeStringConst defined in PG17 Relevant PG commit: de3600452b61d1bc3967e9e37e86db8956c8f577 `de3600452b` - RangeVarCallbackOwnsTable was replaced by RangeVarCallbackMaintainsTable Relevant PG commit: ecb0fd33720fab91df1207e85704f382f55e1eb7 `ecb0fd3372` - attstattarget is nullable, define pg compatible functions for it Relevant PG commit: 4f622503d6de975ac87448aea5cea7de4bc140d5 `4f622503d6` - stxstattarget is nullable in PG17, write compat functions for it Relevant PG commit: 012460ee93c304fbc7220e5b55d9d0577fc766ab `012460ee93` - Use ResourceOwner to track WaitEventSet in PG17 Relevant PG commit: 50c67c2019ab9ade8aa8768bfe604cd802fe8591 `50c67c2019` - getIdentitySequence now uses Relation instead of relation_id Relevant PG commit: 509199587df73f06eda898ae13284292f4ae573a `509199587d` - Remove no-op tuplestore_donestoring function Relevant PG commit: 75680c3d805e2323cd437ac567f0677fdfc7b680 `75680c3d80` - MergeAction can have 3 merge kinds (now enum) in PG17, write compat Relevant PG commit: 0294df2f1f842dfb0eed79007b21016f486a3c6c `0294df2f1f` - EXPLAIN (MEMORY) is added, make changes to ExplainOnePlan Relevant PG commit: 5de890e3610d5a12cdaea36413d967cf5c544e20 `5de890e361` - LIMIT_OPTION_DEFAULT has been removed as it's useless, use LIMIT_OPTION_COUNT Relevant PG commit: a6be0600ac3b71dda8277ab0fcbe59ee101ac1ce `a6be0600ac` - write compat for create_foreignscan_path bcs of more arguments in PG17 Relevant PG commit: 9e9931d2bf40e2fea447d779c2e133c2c1256ef3 `9e9931d2bf` - pgprocno and lxid have been combined into a struct in PGPROC Relevant PG commits: 28f3915b73f75bd1b50ba070f56b34241fe53fd1 `28f3915b73` ab355e3a88de745607f6dd4c21f0119b5c68f2ad `ab355e3a88` 024c521117579a6d356050ad3d78fdc95e44eefa `024c521117` - Simplify CitusNewNode (#7434) postgres refactored newNode() in PG 17, the main point for doing this is the original tricks is no longer neccessary for modern compilers[1]. This does the same for Citus. This should have no backward compatibility issues since it just replaces palloc0fast with palloc0. This is good for forward compatibility since palloc0fast no longer exists in PG 17. [1] https://www.postgresql.org/message-id/b51f1fa7-7e6a-4ecc-936d-90a8a1659e7c@iki.fi (cherry picked from commit `4b295cc`)	2025-03-12 11:01:49 +03:00
Naisila Puka	6bd3474804	Rename foreach_ macros to foreach_declared_ macros (#7700 ) This is prep work for successful compilation with PG17 PG17added foreach_ptr, foreach_int and foreach_oid macros Relevant PG commit 14dd0f27d7cd56ffae9ecdbe324965073d01a9ff `14dd0f27d7` We already have these macros, but they are different with the PG17 ones because our macros take a DECLARED variable, whereas the PG16 macros declare a locally-scoped loop variable themselves. Hence I am renaming our macros to foreach_declared_ I am separating this into its own PR since it touches many files. The main compilation PR is https://github.com/citusdata/citus/pull/7699	2025-03-12 11:01:49 +03:00
Karina	9ff8436f14	Create directories and files with pg_file_create_mode and pg_dir_create_mode permissions (#7479 ) Since Postgres commit da9b580d files and directories are supposed to be created with pg_file_create_mode and pg_dir_create_mode permissions when default permissions are expected. This fixes a failure of one of the postgres tests: If we create file add.conf containing ``` shared_preload_libraries='citus' ``` and run postgres tests ``` TEMP_CONFIG=/path/to/add.conf make installcheck -C src/bin/pg_ctl/ ``` then 001_start_stop.pl fails with ``` .../data/base/pgsql_job_cache mode must be 0750 ``` in the log. In passing this also stops creating directories that we haven't used since Citus 7.4 This change explicitely doesn't change permissions of certificates/keys that we create. --------- Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>	2024-02-07 12:48:31 +01:00
Gürkan İndibay	863713e9b7	Refactors ExtendedTaskList methods (#7372 ) ExecuteTaskListIntoTupleDestWithParam and ExecuteTaskListIntoTupleDest are nearly the same. I parameterized and a made a reusable structure here --------- Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2024-01-24 06:00:19 +00:00
eaydingol	ee11492a0e	Generate qualified relation name (#7427 ) This change refactors the code by using generate_qualified_relation_name from id instead of using a sequence of functions to generate the relation name. Fixes #6602	2024-01-22 17:32:49 +03:00
zhjwpku	51e607878b	remove a duplicate forward declaration and polish some comments (#7371 ) remove a duplicate forward declaration and polish some comments Signed-off-by: Zhao Junwang <zhjwpku@gmail.com>	2024-01-17 14:30:23 +00:00
Karina	20dc58cf5d	Fix getting heap tuple size (#7387 ) This fixes #7230. First of all, using HeapTupleHeaderGetDatumLength(heapTuple) is definetly wrong, it gives a number that's 4 times less than the correct tuple size (heapTuple.t_len). See https://github.com/postgres/postgres/blob/REL_16_0/src/include/access/htup_details.h#L455-L456 https://github.com/postgres/postgres/blob/REL_16_0/src/include/varatt.h#L279 https://github.com/postgres/postgres/blob/REL_16_0/src/include/varatt.h#L225-L226 When I fixed it, the limit_intermediate_size test failed, so I tried to understand what's going on there. In original commit `fd546cf` these queries were supposed to fail. Then in `b3af63c` three of the queries that were supposed to fail suddenly worked and tests were changed to pass without understanding why the output had changed or how to keep test testing what it had to test. Even comments saying that these queries should fail were left untouched. Commit message gives no clue about why exactly test has changed: > It seems that when we use adaptive executor instead of task tracker, we > exceed the intermediate result size less in the test. Therefore updated > the tests accordingly. Then `3fda2c3` also blindly raised the limit for one of the queries to keep it working: `3fda2c3254 (diff-a9b7b617f9dfd345318cb8987d5897143ca1b723c87b81049bbadd94dcc86570R19)` When in `fe3caf3` that HeapTupleHeaderGetDatumLength(heapTuple) call was finally added, one of those test queries became failing again. The other two of them now also failing after the fix. I don't understand how exactly the calculation of "intermediate result size" that is limited by citus.max_intermediate_result_size had changed through `b3af63c` and `fe3caf3`, but these numbers are now closer to what they originally were when this limitation was added in `fd546cf`. So these queries should fail, like in the original version of the limit_intermediate_size test. Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>	2024-01-08 17:09:30 +01:00
Nils Dijk	0620c8f9a6	Sort includes (#7326 ) This change adds a script to programatically group all includes in a specific order. The script was used as a one time invocation to group and sort all includes throught our formatted code. The grouping is as follows: - System includes (eg. `#include<...>`) - Postgres.h (eg. `#include "postgres.h"`) - Toplevel imports from postgres, not contained in a directory (eg. `#include "miscadmin.h"`) - General postgres includes (eg . `#include "nodes/..."`) - Toplevel citus includes, not contained in a directory (eg. `#include "citus_verion.h"`) - Columnar includes (eg. `#include "columnar/..."`) - Distributed includes (eg. `#include "distributed/..."`) Because it is quite hard to understand the difference between toplevel citus includes and toplevel postgres includes it hardcodes the list of toplevel citus includes. In the same manner it assumes anything not prefixed with `columnar/` or `distributed/` as a postgres include. The sorting/grouping is enforced by CI. Since we do so with our own script there are not changes required in our uncrustify configuration.	2023-11-23 18:19:54 +01:00
Nils Dijk	0dac63afc0	move pg_version_constants.h to toplevel include (#7335 ) In preparation of sorting and grouping all includes we wanted to move this file to the toplevel includes for good grouping/sorting.	2023-11-09 15:09:39 +00:00
cvbhjkl	e535f53ce5	Fix typo in local_executor.c (#7324 ) Fix a typo 'remaning' -> 'remaining' in local_executor.c	2023-11-03 12:14:11 +00:00
Gürkan İndibay	71a4633dad	Fixes typo and renames multi_process_utility (#7259 )	2023-10-17 16:39:37 +03:00
zhjwpku	5034f8eba5	polish the codebase by fixing dozens of typos (#7166 )	2023-09-01 12:21:53 +02:00
Önder Kalacı	4ae3982d14	Add single-shard router Merge command support (#7088 ) Similar to https://github.com/citusdata/citus/pull/7077. As PG 16+ has changed the join restriction information for certain outer joins, MERGE is also impacted given that is is also underlying an outer join. See #7077 for the details.	2023-08-04 08:16:29 +03:00
Önder Kalacı	960a5f6104	Improve failure handling of distributed execution (#7090 ) Prior to this commit, the code would skip processing the errors happened for local commands. Prior to https://github.com/citusdata/citus/pull/5379, it might make sense to allow the execution continue. But, as of today, if a modification fails on any placement, we can safely fail the execution. The first commit show the problem in action. The second commit includes the fix and the test fixes.	2023-08-01 16:47:59 +03:00
Naisila Puka	69af3e8509	Drop PG13 Support Phase 2 - Remove PG13 specific paths/tests (#7007 ) This commit is the second and last phase of dropping PG13 support. It consists of the following: - Removes all PG_VERSION_13 & PG_VERSION_14 from codepaths - Removes pg_version_compat entries and columnar_version_compat entries specific for PG13 - Removes alternative pg13 test outputs - Removes PG13 normalize lines and fix the test outputs based on that It is a continuation of `5bf163a27d`	2023-06-21 14:18:23 +03:00
aykut-bozkurt	f667f14029	Rewind tuple store to fix scrollable with hold cursor fetches (#7014 ) We need to rewind the tuplestorestate's tuple index to get correct results on fetching scrollable with hold cursors. `PersistHoldablePortal` is responsible for persisting out tuplestorestate inside a with hold cursor before commiting a transaction. It rewinds the cursor like below (`ExecutorRewindcalls` calls `rescan`): ```c if (portal->cursorOptions & CURSOR_OPT_SCROLL) { ExecutorRewind(queryDesc); } ``` At the end, it adjusts tuple index for holdStore in the portal properly. ```c if (portal->cursorOptions & CURSOR_OPT_SCROLL) { if (!tuplestore_skiptuples(portal->holdStore, portal->portalPos, true)) elog(ERROR, "unexpected end of tuple stream"); } ``` DESCRIPTION: Fixes incorrect results on fetching scrollable with hold cursors. Fixes https://github.com/citusdata/citus/issues/7010	2023-06-19 23:00:18 +03:00
Teja Mupparti	58da8771aa	This pull request introduces support for nonroutable merge commands in the following scenarios: 1) For distributed tables that are not colocated. 2) When joining on a non-distribution column for colocated tables. 3) When merging into a distributed table using reference or citus-local tables as the data source. This is accomplished primarily through the implementation of the following two strategies. Repartition: Plan the source query independently, execute the results into intermediate files, and repartition the files to co-locate them with the merge-target table. Subsequently, compile a final merge query on the target table using the intermediate results as the data source. Pull-to-coordinator: Execute the plan that requires evaluation at the coordinator, run the query on the coordinator, and redistribute the resulting rows to ensure colocation with the target shards. Direct the MERGE SQL operation to the worker nodes' target shards, using the intermediate files colocated with the data as the data source.	2023-06-19 12:23:40 -07:00
Naisila Puka	ba40eb363c	Fix some gucs' initial and boot values, and flag combinations (#6957 ) PG16beta1 added some sanity checks for GUCS, find the Relevant PG commits below: 1- Add check on initial and boot values when loading GUCs `a73952b795` 2- Extend check_GUC_init() with checks on flag combinations when loading GUCs `009f8d1714` I fixed our currently problematic GUCS, we can merge this directly into main as these make sense for any PG version. There was a particular NodeConninfo issue: Previously we would rely on the fact that NodeConninfo initial value is an empty string. However, with PG16 enforcing same initial and boot values, we can't use an empty initial value for NodeConninfo anymore. Therefore we add a new flag to indicate whether we are at boot check.	2023-06-14 11:55:52 +03:00
Gokhan Gulbiz	e0ccd155ab	Make citus_stat_tenants work with schema-based tenants. (#6936 ) DESCRIPTION: Enabling citus_stat_tenants to support schema-based tenants. This pull request modifies the existing logic to enable tenant monitoring with schema-based tenants. The changes made are as follows: - If a query has a partitionKeyValue (which serves as a tenant key/identifier for distributed tables), Citus annotates the query with both the partitionKeyValue and colocationId. This allows for accurate tracking of the query. - If a query does not have a partitionKeyValue, but its colocationId belongs to a distributed schema, Citus annotates the query with only the colocationId. The tenant monitor can then easily look up the schema to determine if it's a distributed schema and make a decision on whether to track the query. --------- Co-authored-by: Jelte Fennema <jelte.fennema@microsoft.com>	2023-06-13 14:11:45 +03:00
Teja Mupparti	f6a516dab5	Refactor repartitioning code into generic format	2023-06-05 09:06:05 -07:00
Teja Mupparti	ff2062e8c3	Rename insert-select redistribute code base to generic purpose	2023-06-01 09:43:43 -07:00
Emel Şimşek	02f815ce1f	Disable local execution when Explain Analyze is requested for a query. (#6892 ) DESCRIPTION: Fixes a crash when explain analyze is requested for a query that is normally locally executed. When explain analyze is requested for a query, a task with two queries is created. Those two queries are 1. Wrapped Query --> `SELECT ... FROM worker_save_query_explain_analyze(<query>, <explain analyze options>)` 2. Fetch Query -->` SELECT explain_analyze_output, execution_duration FROM worker_last_saved_explain_analyze();` When the query is locally executed a task with multiple queries causes a crash in production. See the Assert at `57455dc64d/src/backend/distributed/executor/tuple_destination.c`#:~:text=Assert(task%2D%3EqueryCount%20%3D%3D%201)%3B This becomes a critical issue when auto_explain extension is used. When auto_explain extension is enabled, explain analyze is automatically requested for every query. One possible solution could be not to create two queries for a locally executed query. The fetch part may not have to be a query since the values are available in local variables. Until we enable local execution for explain analyze, it is best to disable local execution. Fixes #6777.	2023-05-23 14:33:22 +03:00
Onur Tirtir	56d217b108	Mark objects as distributed even when pg_dist_node is empty (#6900 ) We mark objects as distributed objects in Citus metadata only if we need to propagate given the command that creates it to worker nodes. For this reason, we were not doing this for the objects that are created while pg_dist_node is empty. One implication of doing so is that we defer the schema propagation to the time when user creates the first distributed table in the schema. However, this doesn't help for schema-based sharding (#6866) because we want to sync pg_dist_tenant_schema to the worker nodes even for empty schemas too. * Support test dependencies for isolation tests without a schedule * Comment out a test due to a known issue (#6901) * Also, reduce the verbosity for some log messages and make some tests compatible with run_test.py.	2023-05-16 11:45:42 +03:00
Gokhan Gulbiz	8782ea1582	Ensure partitionKeyValue and colocationId are set for proper tenant stats gathering (#6834 ) This PR updates the tenant stats implementation to set partitionKeyValue and colocationId in ExecuteLocalTaskListExtended, in addition to LocallyExecuteTaskPlan. This ensures that tenant stats can be properly gathered regardless of the code path taken. The changes were initially made while testing stored procedure calls for tenant stats.	2023-04-17 09:35:26 +03:00
Halil Ozan Akgül	52ad2d08c7	Multi tenant monitoring (#6725 ) DESCRIPTION: Adds views that monitor statistics on tenant usages This PR adds `citus_stats_tenants` view that monitors the tenants on the cluster. `citus_stats_tenants` shows the node id, colocation id, tenant attribute, read count in this period and last period, and query count in this period and last period of the tenant. Tenant attribute currently is the tenant's distribution column value, later when schema based sharding is introduced, this meaning might change. A period is a time bucket the queries are counted by. Read and query counts for this period can increase until the current period ends. After that those counts are moved to last period's counts, which cannot change. The period length can be set using 'citus.stats_tenants_period'. `SELECT` queries are counted as _read_ queries, `INSERT`, `UPDATE` and `DELETE` queries are counted as _write_ queries. So in the view read counts are `SELECT` counts and query counts are `SELECT`, `INSERT`, `UPDATE` and `DELETE` count. The data is stored in shared memory, in a struct named `MultiTenantMonitor`. `citus_stats_tenants` shows the data from local tenants. `citus_stats_tenants` show up to `citus.stats_tenant_limit` number of tenants. The tenants are scored based on the number of queries they run and the recency of those queries. Every query ran increases the score of tenant by `ONE_QUERY_SCORE`, and after every period ends the scores are halved. Halving is done lazily. To retain information a longer the monitor keeps up to 3 times `citus.stats_tenant_limit` tenants. When the tenant count hits `3 * citus.stats_tenant_limit`, last `citus.stats_tenant_limit` tenants are removed. To see all stored tenants you can use `citus_stats_tenants(return_all_tenants := true)` - [x] Create collector view that gets data from all nodes. #6761 - [x] Add monitoring log #6762 - [x] Create enable/disable GUC #6769 - [x] Parse the annotation string correctly #6796 - [x] Add local queries and prepared statements #6797 - [x] Rename to citus_stat_statements #6821 - [x] Run pgbench - [x] Fix role permissions #6812 --------- Co-authored-by: Gokhan Gulbiz <ggulbiz@gmail.com> Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>	2023-04-05 17:44:17 +03:00
Marco Slot	343d1c5072	Refactor executor utility functions into multiple files (#6593 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2023-03-31 13:07:48 +02:00
Marco Slot	b09d239809	Propagate CREATE PUBLICATION statements	2023-03-29 00:59:12 +02:00
rajeshkt78	85b8a2c7a1	CDC implementation for Citus using Logical Replication (#6623 ) Description: Implementing CDC changes using Logical Replication to avoid re-publishing events multiple times by setting up replication origin session, which will add "DoNotReplicateId" to every WAL entry. - shard splits - shard moves - create distributed table - undistribute table - alter distributed tables (for some cases) - reference table operations The citus decoder which will be decoding WAL events for CDC clients, ignores any WAL entry with replication origin that is not zero. It also maps the shard names to distributed table names.	2023-03-28 16:00:21 +05:30
aykut-bozkurt	9e69dd0e7f	fix single tuple result memory leak (#6724 ) We should not omit to free PGResult when we receive single tuple result from an internal backend. Single tuple results are normally freed by our ReceiveResults for `tupleDescriptor != NULL` flow but not for those with `tupleDescriptor == NULL`. See PR #6722 for details. DESCRIPTION: Fixes memory leak issue with query results that returns single row.	2023-02-17 14:15:09 +03:00
Jelte Fennema	81dcddd1ef	Actually skip constraint validation on shards after shard move (#6640 ) DESCRIPTION: Fix foreign key validation skip at the end of shard move In `eadc88a` we started completely skipping foreign key constraint validation at the end of a non blocking shard move, instead of only for foreign keys to reference tables. However, it turns out that this didn't work at all because of a hard to notice bug: By resetting the SkipConstraintValidation flag at the end of our utility hook, we actually make the SET command that sets it a no-op. This fixes that bug by removing the code that resets it. This is fine because #6543 removed the only place where we set the flag in C code. So the resetting of the flag has no purpose anymore. This PR also adds a regression test, because it turned out we didn't have any otherwise we would have caught that the feature was completely broken. It also moves the constraint validation skipping to the utility hook. The reason is that #6550 showed us that this is the better place to skip it, because it will also skip the planning phase and not just the execution.	2023-01-27 13:08:05 +01:00
Onder Kalaci	feb5534c65	Do not create additional WaitEventSet for RemoteSocketClosed checks Before this commit, we created an additional WaitEventSet for checking whether the remote socket is closed per connection - only once at the start of the execution. However, for certain workloads, such as pgbench select-only workloads, the creation/deletion of the additional WaitEventSet adds ~7% CPU overhead, which is also reflected on the benchmark results. With this commit, we use the same WaitEventSet for the purposes of checking the remote socket at the start of the execution. We use "rebuildWaitEventSet" flag so that the executor can re-use the existing WaitEventSet. As a result, we see the following improvements on PG 15: main : 120051 tps, 0.532 ms latency avg. avoid_wes_rebuild: 127119 tps, 0.503 ms latency avg. And, on PG 14, as expected, there is no difference main : 129191 tps, 0.495 ms latency avg. avoid_wes_rebuild: 129480 tps, 0.494 ms latency avg. But, note that PG 15 is slightly (~1.5%) slower than PG 14. That is probably the overhead of checking the remote socket.	2022-12-14 22:42:55 +01:00
Onder Kalaci	d52da55ac0	Move WaitEvent to DistributedExecution Prep. for caching WaitEventsSet/WaitEvents	2022-12-14 21:59:19 +01:00
Marco Slot	666696c01c	Deprecate citus.replicate_reference_tables_on_activate, make it always off (#6474 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-11-04 16:21:10 +01:00
Önder Kalacı	8b624b5c9d	Detect remotely closed sockets and add a single connection retry in the executor (#6404 ) PostgreSQL 15 exposes WL_SOCKET_CLOSED in WaitEventSet API, which is useful for detecting closed remote sockets. In this patch, we use this new event and try to detect closed remote sockets in the executor. When a closed socket is detected, the executor now has the ability to retry the connection establishment. Note that, the executor can retry connection establishments only for the connection that has not been used. Basically, this patch is mostly useful for preventing the executor to fail if a cached connection is closed because of the worker node restart (or worker failover). In other words, the executor cannot retry connection establishment if we are in a distributed transaction AND any command has been sent over the connection. That requires more sophisticated retry mechanisms. For now, fixing the above use case is enough. Fixes #5538 Earlier discussions: #5908, #6259 and #6283 ### Summary of the current approach regards to earlier trials As noted, we explored some alternatives before getting into this. https://github.com/citusdata/citus/pull/6283 is simple, but lacks an important property. We should be checking for `WL_SOCKET_CLOSED` _before_ sending anything over the wire. Otherwise, it becomes very tricky to understand which connection is actually safe to retry. For example, in the current patch, we can safely check `transaction->transactionState == REMOTE_TRANS_NOT_STARTED` before restarting a connection. #6259 does what we intent here (e.g., check for sending any command). However, as @marcocitus noted, it is very tricky to handle `WaitEventSets` in multiple places. And, the executor is designed such that it reacts to the events. So, adding anything `pre-executor` seemed too ugly. In the end, I converged into this patch. This patch relies on the simplicity of #6283 and also does a very limited handling of `WaitEventSets`, just for our purpose. Just before we add any connection to the execution, we check if the remote session has already closed. With that, we do a brief interaction of multiple wait event processing, but with different purposes. The new wait event processing we added does not even consider cancellations. We let that handled by the main event processing loop. Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-10-14 15:08:49 +02:00
Ahmet Gedemenli	eadc88a800	Introduce GUC citus.skip_constraint_validation (#6281 ) Introduces a new GUC named citus.skip_constraint_validation, which basically skips constraint validation when set to on. For some several places that we hack to skip the foreign key validation phase, now we use this GUC.	2022-09-08 18:13:18 +03:00
Gokhan Gulbiz	ac96370ddf	Use IsMultiStatementTransaction for SELECT .. FOR UPDATE queries (#6288 ) * Use IsMultiStatementTransaction instead of IsTransaction for row-locking operations. * Add regression test for SELECT..FOR UPDATE statement	2022-09-06 16:38:41 +02:00
aykut-bozkurt	69726648ab	verify shards if exists for insert, delete, update (#6280 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-09-06 15:29:14 +02:00
Marco Slot	639588bee0	Remove unused functions (#6220 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-08-22 11:53:25 +03:00
Jelte Fennema	3f6ce889eb	Use CreateSimpleHash (and variants) whenever possible (#6177 ) This is a refactoring PR that starts using our new hash table creation helper function. It adds a few more macros for ease of use, because C doesn't have default arguments. It also adds a macro to check if a struct contains automatic padding bytes. No struct that is hashed using tag_hash should have automatic padding bytes, because those bytes are undefined and thus using them to create a hash will result in undefined behaviour (usually a random hash).	2022-08-17 13:01:59 +03:00
Onder Kalaci	bdaeb40b51	Add missing relation access record for local utility command While testing `5670dffd33`, I realized that we have a missing RecordNonDistTableAccessesForTask() for local utility commands. Although we don't have to record the relation access for local only cases, we really want to keep the behaviour for scale-out be the same with single node on all aspects. We wouldn't want any single node complex transaction to work on single machine, but not on multi node cluster. Hence, we apply the same restrictions. For example, on a distributed cluster, the following errors, and after this commit this errors locally as well ```SQL CREATE TABLE ref(a int primary key); INSERT INTO ref VALUES (1); CREATE TABLE dist(a int REFERENCES ref(a)); SELECT create_reference_table('ref'); SELECT create_distributed_table('dist', 'a'); BEGIN; SELECT * FROM dist; TRUNCATE ref CASCADE; ERROR: cannot execute DDL on table "ref" because there was a parallel SELECT access to distributed table "dist" in the same transaction HINT: Try re-running the transaction with "SET LOCAL citus.multi_shard_modify_mode TO 'sequential';" COMMIT; ``` We also add the comprehensive test suite and run the same locally.	2022-07-29 11:36:33 +02:00
Onder Kalaci	149771792b	Remove useless version compats most likely leftover from earlier versions	2022-07-29 10:31:55 +02:00
Marco Slot	cff013a057	Fix issues with insert..select casts and column ordering	2022-07-28 13:23:57 +02:00
Marco Slot	5fabf94e39	Allow WITH HOLD cursors with parameters	2022-07-21 12:00:59 +02:00
Naisila Puka	7d6410c838	Drop postgres 12 support (#6040 ) * Remove if conditions with PG_VERSION_NUM < 13 * Remove server_above_twelve(&eleven) checks from tests * Fix tests * Remove pg12 and pg11 alternative test output files * Remove pg12 specific normalization rules * Some more if conditions in the code * Change RemoteCollationIdExpression and some pg12/pg13 comments * Remove some more normalization rules	2022-07-20 17:49:36 +03:00
Önder Kalacı	90b1afe31e	Merge branch 'main' into baby_step_pg_15	2022-07-18 15:02:39 +02:00
Nitish Upreti	5b3537cdff	Shard Split for Citus (#6029 ) * Blocking split setup * Add missing type * Missing API from Metadata Sync * Shard Split e2e code * Worker Split Copy DestReceiver skeleton * Basic destreceiver code * worker_split_copy UDF * UDF calling * Split points are text * Isolate Tenant and Split Shard Unification * Fixing executor and misc * Reindent code * Fixing UDF definitions * Hello World Local Copy works * Remote copy hello world works * Local and Remote binary test * Fixing text local copy and adding tests * Hello World shard split works * Negative tests * Blocking Split workflow works * Refactor * Bug fix * Reindent * Cleaning up and adding comments * Basic test for shard split workflow * ReIndent * Circle CI integration * Removing include causing circle-ci build failure * Remove SplitCopyDestReceiver and use PartitionedResultDestReceiver * Add support for citus.enable_binary_protocol * Reindent * Fix build break * Update Test * Cleanup on catch * Addressing open comments * Update downgrade script and quote schema/table in COPY statement * Fix metadata sync issue. Update regression test * Isolation test and bug fix * Add Isolation test, fix foreign constraint deadlock issue * Misc code review comments * Test name needing to be quoted * Refactor code from review comments * Explaining shardGroupSplitIntervalListList * Fix upgrade & downgrade * Fix broken test * Test fix Round 2 * Fixing bug and modifying test appropriately * Fully qualify copy udf name. Run Reindent * Address PR comments * Fix null handling when creating AuxiliaryStructures * Ensure local copy is triggered in tests * Limit max shards that can be created with split * Test failure fix * Remove split_mode and use shard_transfer_mode instead' * Fix test failure * Fix test failure * Fixing permission issue when splitting non-superuser owned tables * Fix test expected output * Remove extra space * Fix test * attempt to fix test * Addressing Marco's PR comment * Only clean shards created by workflow * Remove from merge * Update test	2022-07-18 02:54:15 -07:00
Onder Kalaci	483a3a5875	PG 15 Compat: Resolve compile issues + shmem requests Similar to #5897, one more step for running Citus with PG 15. This PR at least make Citus run with PG 15. I have not tried running the tests with PG 15. Shmem changes are based on `4f2400cb3f` Compile breaks are mostly due to #6008	2022-07-15 10:11:39 +02:00
Jelte Fennema	184c7c0bce	Make enterprise features open source (#6008 ) This PR makes all of the features open source that were previously only available in Citus Enterprise. Features that this adds: 1. Non blocking shard moves/shard rebalancer (`citus.logical_replication_timeout`) 2. Propagation of CREATE/DROP/ALTER ROLE statements 3. Propagation of GRANT statements 4. Propagation of CLUSTER statements 5. Propagation of ALTER DATABASE ... OWNER TO ... 6. Optimization for COPY when loading JSON to avoid double parsing of the JSON object (`citus.skip_jsonb_validation_in_copy`) 7. Support for row level security 8. Support for `pg_dist_authinfo`, which allows storing different authentication options for different users, e.g. you can store passwords or certificates here. 9. Support for `pg_dist_poolinfo`, which allows using connection poolers in between coordinator and workers 10. Tracking distributed query execution times using citus_stat_statements (`citus.stat_statements_max`, `citus.stat_statements_purge_interval`, `citus.stat_statements_track`). This is disabled by default. 11. Blocking tenant_isolation 12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`	2022-06-16 00:23:46 -07:00

1 2 3 4 5 ...

704 Commits (77cd55939d281cb4e5b10a9d08c901cb4af0a10c)