citus

Commit Graph

Author	SHA1	Message	Date
Marco Slot	666696c01c	Deprecate citus.replicate_reference_tables_on_activate, make it always off (#6474 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-11-04 16:21:10 +01:00
Onur Tirtir	1af28b3f27	Use CommitContext for subxact mgmt and reduce memory usage in CommitContext (#6099 ) (Hopefully) Fixes #5000. If memory allocation done for `SubXactContext state` in `PushSubXact()` fails, then `PopSubXact()` might segfault, for example, when grabbing the topmost `SubXactContext` from `activeSubXactContexts` if this is the first ever subxact within the current xact, with the following stack trace: ```c citus.so!list_nth_cell(const List list, int n) (\opt\pgenv\pgsql-14.3\include\server\nodes\pg_list.h:260) citus.so!PopSubXact(SubTransactionId subId) (\home\onurctirtir\citus\src\backend\distributed\transaction\transaction_management.c:761) citus.so!CoordinatedSubTransactionCallback(SubXactEvent event, SubTransactionId subId, SubTransactionId parentSubid, void * arg) (\home\onurctirtir\citus\src\backend\distributed\transaction\transaction_management.c:673) CallSubXactCallbacks(SubXactEvent event, SubTransactionId mySubid, SubTransactionId parentSubid) (\opt\pgenv\src\postgresql-14.3\src\backend\access\transam\xact.c:3644) AbortSubTransaction() (\opt\pgenv\src\postgresql-14.3\src\backend\access\transam\xact.c:5058) AbortCurrentTransaction() (\opt\pgenv\src\postgresql-14.3\src\backend\access\transam\xact.c:3366) PostgresMain(int argc, char ** argv, const char * dbname, const char * username) (\opt\pgenv\src\postgresql-14.3\src\backend\tcop\postgres.c:4250) BackendRun(Port * port) (\opt\pgenv\src\postgresql-14.3\src\backend\postmaster\postmaster.c:4530) BackendStartup(Port * port) (\opt\pgenv\src\postgresql-14.3\src\backend\postmaster\postmaster.c:4252) ServerLoop() (\opt\pgenv\src\postgresql-14.3\src\backend\postmaster\postmaster.c:1745) PostmasterMain(int argc, char argv) (\opt\pgenv\src\postgresql-14.3\src\backend\postmaster\postmaster.c:1417) main(int argc, char argv) (\opt\pgenv\src\postgresql-14.3\src\backend\main\main.c:209) ``` For this reason, to be more defensive against memory-allocation errors that could happen at `PushSubXact()`, now we use our pre-allocated memory context for the objects created in `PushSubXact()`. This commit also attempts reducing the memory allocations done under CommitContext to reduce the chances of consuming all the memory available to CommitContext. Note that it's problematic to encounter with such a memory-allocation error for other objects created in `PushSubXact()` as well, so above is an example scenario that might result in a segfault. DESCRIPTION: Fixes a bug that might cause segfaults when handling deeply nested subtransactions	2022-11-03 00:57:32 +03:00
Onur Tirtir	a5f7f001b0	Make sure to disallow triggers that depend on extensions (#6399 ) DESCRIPTION: Makes sure to disallow triggers that depend on extensions We were already doing so for `ALTER trigger DEPENDS ON EXTENSION` commands. However, we also need to disallow creating Citus tables having such triggers already, so this PR fixes that.	2022-11-02 16:27:31 +03:00
Alexander Kukushkin	deeacfee04	Improve a query that terminates compeling backends from citus_update_node() (#6468 ) DESCRIPTION: Improve a query that terminates compeling backends from citus_update_node() 1. Use pg_blocking_pids() function instead of self join on pg_locks. It exists since 9.6 and more accurate than pg_locks. 2. Prefix all function calls with pg_catalog schema to prevent privilege escalation by creating functions with similar names in a public schema. 3. Change logs and update comments to reflect the fact that the pg_terminate_backend() function only sends SIGTERM but not wating for the actual backend termination.	2022-11-02 12:32:00 +01:00
Alexander Kukushkin	402a30a2b7	Allow citus_update_node() to work with nodes from different clusters (#6466 ) DESCRIPTION: Allow citus_update_node() to work with nodes from different clusters citus_update_node(), citus_nodename_for_nodeid(), and citus_nodeport_for_nodeid() functions only checked for nodes in their own clusters and hence last two returned NULLs and the first one showed an error is the nodeId was from a different cluster. Fixes https://github.com/citusdata/citus/issues/6433	2022-11-02 10:07:01 +01:00
oohira	3f66f3d9dd	Add missing space to citus.shard_count description (#6464 ) DESCRIPTION: Add missing space to citus.shard_count description	2022-10-31 10:37:14 +01:00
Teja Mupparti	01103ce05d	This implements a new UDF citus_get_cluster_clock() that returns a monotonically increasing logical clock. Clock guarantees to never go back in value after restarts, and makes best attempt to keep the value close to unix epoch time in milliseconds. Also, introduces a new GUC "citus.enable_cluster_clock", when true, every distributed transaction is stamped with logical causal clock and persisted in a catalog pg_dist_commit_transaction.	2022-10-28 10:15:08 -07:00
Ahmet Gedemenli	c379ff8614	Drop defer drop gucs (#6447 ) DESCRIPTION: Drops GUC defer_drop_after_shard_split DESCRIPTION: Drops GUC defer_drop_after_shard_move Drop GUCs and related parts from the code. Delete tests that specifically added for the GUCs. Keep tests that can be used without the GUCs. Update test output changes. The motivation for this PR is to have an "always deferring" mechanism. These two GUCs provide an option to not deferring dropping objects during a shard move/split, and dropping them immediately. With this PR, we will be always deferring dropping orphaned shards and other types of objects. We will have a separate PR to extend the deferred cleanup operation, so that we would create records for deferred drop, for Subscriptions, Publications, Replication Slots etc. This will make us be able to keep track of created objects that needs to be dropped, during a shard move/split. We will have objects created specifically for the current operation; and those objects will be dropped at the end. We have an issue (a draft roadmap) for enabling parallel shard moves. For details please see: https://github.com/citusdata/citus/issues/6437	2022-10-25 16:48:34 +03:00
Onur Tirtir	2d14dd85e9	Not hardcode "false" in UpdateAutoConvertedForConnectedRelations (#6452 ) This didn't cause any bugs since today we're always calling UpdateAutoConvertedForConnectedRelations with autoconverted=false, so we don't need to backport this to anywhere.	2022-10-21 18:14:20 +03:00
Onur Tirtir	dbe2749bbf	Drop unreachable code from query_pushdown_planning.c (#6451 ) Given that we cannot continue after a `RaiseDeferredErrorInternal(.., ERROR)` call.	2022-10-21 18:04:31 +03:00
aykut-bozkurt	162c8a5160	Drop worker_fetch_foreign_file/worker_repartition_cleanup only if they exist when upgrading Citus (#6441 ) We should not introduce breaking sql changes to upgrade files after they are released. We did that for worker_fetch_foreign_file in v9.0.0 and worker_repartition_cleanup in v9.2.0. Later when we try to drop those udfs, they were missing for some clients unexpectedly due to breaking change in an old upgrade script. For that case, the fix is to add DROP IF EXISTS for those 2 udfs in 11.0-4--11.1-1.	2022-10-21 14:32:42 +03:00
Emel Şimşek	02fd1e6c03	Fix the crash that happens when using auto_explain extension with recursive queries (#6406 ) This crash happens with recursively planned queries. For such queries, subplans are explained via the ExplainOnePlan function of postgresql. This function reconstructs the query description from the plan therefore it expects the ActiveSnaphot for the query be available. This fix makes sure that the snapshot is in the stack before calling ExplainOnePlan. Fixes #2920.	2022-10-19 18:04:45 +03:00
Jelte Fennema	737e2bb1bb	Don't leak search_path to workers on DDL (#6444 ) DESCRIPTION: Don't leak search_path to workers on DDL For DDL we have to set the `search_path` on workers to the same as on the coordinator for some DDL to work. Previously this search_path would leak outside of the transaction that was used for the DDL. This fixes that by using `SET LOCAL` instead of `SET`. The only place where we still use plain `SET` is for DDL commands that are not allowed within transactions, such as `CREATE INDEX CONCURRENLTY`. This fixes this flaky test: ```diff CONTEXT: SQL statement "SELECT change_id FROM distributed_triggers.data_changes WHERE shard_key_value = NEW.shard_key_value AND object_id = NEW.object_id ORDER BY change_id DESC LIMIT 1" -PL/pgSQL function record_change() line XX at SQL statement +PL/pgSQL function distributed_triggers.record_change() line 17 at SQL statement while executing command on localhost:57638 DELETE FROM data_ref_table where shard_key_value = 'hello'; ``` Source: https://app.circleci.com/pipelines/github/citusdata/citus/27849/workflows/75ae5f1a-100b-4b7a-b991-7de069f39ee1/jobs/831429 I had tried to fix this flaky test in #5894 and then I tried implementing a better fix in #5896, where @marcocitus suggested this better fix. This change reverts the fix from #5894 and implements the fix suggested by Marco. Our multi_mx_alter_distributed_table test actually depended on the old buggy search_path leaking behavior. After fixing the bug that test would fail like this: ```diff CALL proc_0(1.0); DEBUG: pushing down the procedure -NOTICE: Res: 3 -DETAIL: from localhost:xxxxx +ERROR: relation "test_proc_colocation_0" does not exist +CONTEXT: PL/pgSQL function mx_alter_distributed_table.proc_0(double precision) line 5 at SQL statement +while executing command on localhost:57637 RESET client_min_messages; ``` I fixed this test by fully qualifying the table names used in the procedure. I think it's quite unlikely that actual users depend on this behavior though. Since it would require first doing DDL before calling a procedure in a session where the search_path was changed after connecting.	2022-10-19 16:47:35 +02:00
Ahmet Gedemenli	cdbda9ea6b	Add failure test for shard move (#6325 ) DESCRIPTION: Adds failure test for shard move DESCRIPTION: Remove function `WaitForAllSubscriptionsToBecomeReady` and related tests Adding some failure tests for shard moves. Dropping the not-needed-anymore function `WaitForAllSubscriptionsToBecomeReady`, as the subscriptions now start as ready from the beginning because we don't use logical replication table sync workers anymore. fixes: #6260	2022-10-19 14:25:26 +02:00
Onur Tirtir	5aec88d084	Not try locking relations referencing to views (#6430 ) Since there can't be such a foreign key already. This mainly fixes the error that Citus throws when trying to truncate a distributed view. Fixes #5990.	2022-10-19 11:24:22 +03:00
Gokhan Gulbiz	e87eda6496	Introduce a new GUC to propagate local settings to new connections in rebalancer (#6396 ) DESCRIPTION: Introduce ```citus.propagate_session_settings_for_loopback_connection``` GUC to propagate local settings to new connections. Fixes: #5289	2022-10-18 12:50:30 +03:00
Jelte Fennema	60eb67b908	Increase shard move test coverage by improving advisory locks (#6429 ) To be able to test non-blocking shard moves we take an advisory lock, so we can pause the shard move at an interesting moment. Originally this was during the logical replication catch up phase. But when I added tests for the rebalancer progress I moved this lock before the initial data copy. This allowed testing of the rebalance progress, but inadvertently made our non-blocking tests not actually test if we held unintended locks during logical replication catch up. This fixes that by creating two types of advisory locks, one before the copy and one after. This causes the tests to actually test their intended scenario again. Furthermore it starts using one of these locks for blocking shard moves too. Which allowed me to reduce the complexity of the rebalance progress test suite quite a bit. It also allowed enabling some flaky tests again, because this stopped them from being flaky. And finally it allowed testing of rebalance progress for blocking shard copy operations as well. In passing it fixes a flaky test during parallel blocking shard moves by ordering the output.	2022-10-17 17:32:28 +02:00
Ahmet Gedemenli	96912d9ba1	Add status column to get_rebalance_progress() (#6403 ) DESCRIPTION: Adds status column to get_rebalance_progress() Introduces a new column named `status` for the function `get_rebalance_progress()`. For each ongoing shard move, this column will reveal information about that shard move operation's current status. For now, candidate status messages could be one of the below. * Not Started * Setting Up * Copying Data * Catching Up * Creating Constraints * Final Catchup * Creating Foreign Keys * Completing * Completed	2022-10-17 16:55:31 +03:00
Onur Tirtir	4152a391c2	Properly set col names for shard rels that citus_extradata_container points to (#6428 ) Deparser function set_relation_column_names() knows that it needs to re-evaluate column names based on relation's tuple descriptor when the rte belongs to a relation (RTE_RELATION). However before this commit, it didn't know about the fact that citus might wrap such an rte with an rte that points to citus_extradata_container() placeholder. And because of this, it was simply taking the column aliases (e.g., "bar" in "foo AS bar") into the account and this might result in an incorrectly deparsed query as in below case: * Say, if we had view based on following query: ```sql SELECT a FROM table; ``` * And if we rename column "a" to "b", the view query normally becomes: ```sql SELECT b AS a FROM table; ``` * So before this commit, deparsing a query based on that view was resulting in such a query due to deparsing based on the column aliases, which is not correct: ```sql SELECT a FROM table; ``` Fixes #5932. DESCRIPTION: Fixes a bug that might cause failing to query the views based on tables that have renamed columns	2022-10-14 17:31:25 +03:00
Önder Kalacı	8b624b5c9d	Detect remotely closed sockets and add a single connection retry in the executor (#6404 ) PostgreSQL 15 exposes WL_SOCKET_CLOSED in WaitEventSet API, which is useful for detecting closed remote sockets. In this patch, we use this new event and try to detect closed remote sockets in the executor. When a closed socket is detected, the executor now has the ability to retry the connection establishment. Note that, the executor can retry connection establishments only for the connection that has not been used. Basically, this patch is mostly useful for preventing the executor to fail if a cached connection is closed because of the worker node restart (or worker failover). In other words, the executor cannot retry connection establishment if we are in a distributed transaction AND any command has been sent over the connection. That requires more sophisticated retry mechanisms. For now, fixing the above use case is enough. Fixes #5538 Earlier discussions: #5908, #6259 and #6283 ### Summary of the current approach regards to earlier trials As noted, we explored some alternatives before getting into this. https://github.com/citusdata/citus/pull/6283 is simple, but lacks an important property. We should be checking for `WL_SOCKET_CLOSED` _before_ sending anything over the wire. Otherwise, it becomes very tricky to understand which connection is actually safe to retry. For example, in the current patch, we can safely check `transaction->transactionState == REMOTE_TRANS_NOT_STARTED` before restarting a connection. #6259 does what we intent here (e.g., check for sending any command). However, as @marcocitus noted, it is very tricky to handle `WaitEventSets` in multiple places. And, the executor is designed such that it reacts to the events. So, adding anything `pre-executor` seemed too ugly. In the end, I converged into this patch. This patch relies on the simplicity of #6283 and also does a very limited handling of `WaitEventSets`, just for our purpose. Just before we add any connection to the execution, we check if the remote session has already closed. With that, we do a brief interaction of multiple wait event processing, but with different purposes. The new wait event processing we added does not even consider cancellations. We let that handled by the main event processing loop. Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-10-14 15:08:49 +02:00
Jelte Fennema	0cee79a7ab	Actually enable improved blocked process detection (#6426 ) In #6405 I added better improved blocked process detection for isolation tests. But when cleaning up unnecessary code I cleaned up a bit too much. This actually includes the new function definition in our migrations.	2022-10-13 09:50:37 +02:00
Onur Tirtir	20847515fa	Hint users to call "citus_set_coordinator_host" first (#6425 ) If an operation requires having coordinator in pg_dist_node and if that is not the case, then we automatically add the coordinator into pg_dist_node if user didn't add any worker nodes yet. However, if user have already added some worker nodes before, we throw an error. With this commit, we improve the error thrown in that case. Closes #6423 based on the discussion made there.	2022-10-12 18:18:51 +03:00
Jelte Fennema	6277ffd69e	Reduce isolation flakyness by improving blocked process detection (#6405 ) Sometimes our CI randomly fails on a test in a way similar to this: ```diff step s2-drop: DROP TABLE cancel_table; - + <waiting ...> +step s2-drop: <... completed> starting permutation: s1-timeout s1-begin s1-sleep10000 s1-rollback s1-reset s1-drop ``` Source: https://app.circleci.com/pipelines/github/citusdata/citus/26524/workflows/5415b84f-13a3-482f-bef9-648314c79a67/jobs/756377 I tried to fix that already in #6252 by disabling the maintenance daemon during isolation tests. But it seems that hasn't fixed all cases of these errors. This is another attempt at fixing these issues that seems to have better results. What it does is that it starts using the pInterestingPids parameter that citus_isolation_test_session_is_blocked receives. With this change we start filter out block-edges that are not caused by any of these pids. In passing this change also makes it possible to run `isolation_create_distributed_table_concurrently` with `check-isolation-base`	2022-10-12 16:35:09 +02:00
Hanefi Onaldi	ec3eebbaf6	Rename a function that collides with PG15 (#6422 ) PG15 introduced a function called ReplicationSlotName that causes conflicts with our function with the same name. I solved this issue by renaming our function to ReplicationSlotNameForNodeAndOwner Relevant PG commit: `c3b5992b91`	2022-10-12 13:24:04 +03:00
Jelte Fennema	cb34adf7ac	Don't reassign global PID when already assigned (#6412 ) DESCRIPTION: Fix bug in global PID assignment for rebalancer sub-connections In CI our isolation_shard_rebalancer_progress test would sometimes fail like this: ```diff +isolationtester: canceling step s1-rebalance-c1-block-writes after 60 seconds step s1-rebalance-c1-block-writes: SELECT rebalance_table_shards('colocated1', shard_transfer_mode:='block_writes'); - <waiting ...> + +ERROR: canceling statement due to user request step s7-get-progress: ``` Source: https://app.circleci.com/pipelines/github/citusdata/citus/27855/workflows/2a7e335a-f3e8-46ed-b6bd-6920d42f7214/jobs/831710 It turned out this was an actual bug in the way our assigning of global PIDs interacts with the way we connect to ourselves as the shard rebalancer. The first command the shard rebalancer sends is a SET ommand to change the application_name to `citus_rebalancer`. If `StartupCitusBackend` is called after this command is processed, then it overwrites the global PID that was extracted from the previous application_name. This makes sure that we don't do that, and continue to use the original global PID. While it might seem that we only call `StartupCitusBackend` once for each query backend, this isn't actually the case. Whenever pg_dist_partition gets ANALYZEd by autovacuum we indirectly call `StartupCitusBackend` again, because we invalidate the cache then. In passing this fixes two other things as well: 1. It sets `distributedCommandOriginator` correctly in `AssignGlobalPID`, by using IsExternalClientBackend(). This doesn't matter much anymore, since AssignGlobalPID effectively becomes a no-op in this PR for any non-external client backends. 2. It passes the application_name to InitializeBackendData in StartupCitusBackend, instead of INVALID_CITUS_INTERNAL_BACKEND_GPID (which effectively got casted to NULL). In practice this doesn't change the behaviour of the call, since the call is a no-op for every backend except the maintenance daemon. And the behaviour of the call is the same for NULL as for the application_name of the maintenance daemon.	2022-10-11 16:41:01 +02:00
Naisila Puka	89aa9a015f	Fixes empty password issue (#6417 )	2022-10-11 15:56:44 +03:00
Onur Tirtir	517b72a9d5	Fix use-after-free in GetAlterTriggerStateCommand() (#6413 ) Fix use-after-free in GetAlterTriggerStateCommand() introduced in #6398.	2022-10-10 16:38:21 +03:00
Gokhan Gulbiz	1776bdf654	Limit citus_drain_node to drain the specified node only (#6361 ) DESCRIPTION: Fixes citus_drain_node to drain the specified worker only. Fixes #6267	2022-10-09 13:33:08 +03:00
Onur Tirtir	86e186f671	Retain trigger settings when re-creating the triggers (on shards) (#6398 ) Fixes https://github.com/citusdata/citus/issues/6394. DESCRIPTION: Fixes a bug that causes creating disabled-triggers on shards as enabled Since CREATE TRIGGER doesn't have syntax support to specify whether the trigger should be enabled/disabled, the underlying PG function (`pg_get_triggerdef()`) that we use to generate the command to create the trigger is not enough. For this reason, we append a second command to enable/disable trigger, right after creating it. We don't retain explicit extension dependencies set by using `ALTER trigger DEPENDS ON EXTENSION` commands too, but apparently right fix for that is to throw an error as in `PreprocessAlterTriggerDependsStmt()`; so, opened a separate PR to fix that #6399.	2022-10-06 14:51:07 +03:00
Naisila Puka	27e867afbc	Propagates column aliases (#6400 ) Propagates column aliases in the shard-level commands	2022-10-06 12:27:31 +03:00
Naisila Puka	b5cba3a3fe	Use original relation to retrieve column name because of syscache (#6387 ) During alter_distributed_table, we create a new table like the original table but with the altered options. To retrieve the name of the distribution column, we were using the attribute syscache of the new table, since we already created the new table as identical to the original table. However, the attribute syscaches of these two tables are not the same if the original table has dropped columns. The reason is that dropped columns are all still present in the cache. Hence, for example, the attnos would be different in the syscaches. So, let's use the attribute syscache of the original table.	2022-10-06 12:08:00 +03:00
Ahmet Gedemenli	e36890ce55	Add source_lsn and target_lsn fields into get_rebalance_progress (#6364 ) DESCRIPTION: Adds source_lsn and target_lsn fields into get_rebalance_progress Adding two fields named `source_lsn` and `target_lsn` to the function `get_rebalance_progress`. Target lsn data is fetched in `GetShardStatistics`, by expanding the query sent to workers (joining with pg_subscription_rel and pg_stat_subscription). Then put into the hashmap, for each shard. Source lsn data is fetched in `BuildWorkerShardStatististicsHash`, in the loop that iterate each node, by sending a pg_current_wal_lsn query to each node. Then put into the hashmap, for each node.	2022-10-05 11:12:24 +03:00
Ahmet Gedemenli	d0fa10a98c	Bump Citus to 11.2devel (#6385 )	2022-09-30 14:47:42 +03:00
Jelte Fennema	24e06af6d2	Reuse connections for Splits and Logical Replication (#6314 ) In Split, Logical replication logic and ShardCleaner we call `SendCommandListToWorkerOutsideTransaction` and `SendOptionalCommandListToWorkerOutsideTransaction` frequently. This opens new connection for each of those calls, even though we already have a perfectly good connection lying around. This PR adds two new APIs `SendCommandListToWorkerOutsideTransactionWithConnection` and `SendOptionalCommandListToWorkerOutsideTransactionWithConnection` that allow sending a list of queries in a transaction over an existing connection. We also update the callers (Split, ShardCleaner, Logical Replication) to use these new APIs instead. Co-authored-by: Nitish Upreti <niupre@microsoft.com> Co-authored-by: Onder Kalaci <onderkalaci@gmail.com>	2022-09-26 13:37:40 +02:00
Jelte Fennema	d9a9a3263b	Revert replica identity creation order for shard moves (#6367 ) In Citus 11.1.0 we changed the order of doing the initial data copy and the replica identity creation when doing a non blocking shard move. This was done to try and increase the speed with which shard moves could be done. But after doing more extensive performance testing this change turned out to have a negative impact on the speed of moves on the setups that I tested. Looking at the resource usage metrics of the VMs the reason for this seems to be that these shard moves were bottlenecked by disk bandwidth. While creating replica identities in bulk after the initial copy will reduce CPU usage a bit, it does require an additional sequence scan of the just written data. So when a VM is bottlenecked on disk, it makes sense to spend a little bit more CPU to avoid an additional scan. Since PKs are usually simple indexes that don't require lots of CPU to update, as opposed to e.g. GiST indexes. This reverts the order change to avoid a regression on shard move speed in these cases. For future releases we might consider re-evaluating our index creation order for other indexes too, and create "simple" indexes before the copy.	2022-09-23 14:55:25 +02:00
Onur Tirtir	a868cc049a	Not allow ON DELETE/UPDATE SET DEFAULT actions on columns that default to sequences (#6340 ) Given that we drop DEFAULT nextval('sequence') expressions from shard relation columns, allowing `ON DELETE/UPDATE SET DEFAULT` on such columns might cause inserting NULL values as a result of a delete/update operation. For this reason, we disallow ON DELETE/UPDATE SET DEFAULT actions on columns that default to sequences. DESCRIPTION: Disallows having ON DELETE/UPDATE SET DEFAULT actions on columns that default to sequences Fixes #6339.	2022-09-23 03:34:02 -07:00
Onur Tirtir	de24a3eda5	Not drop default col exprs from shard when adding local table to metadata (#6323 ) As we did for GENERATED STORED columns in #4613, we should not drop column default expressions that are not based on sequences from shard relation since such expressions need to exist e.g. for foreign key actions. For the column default expressions that are based on sequences we cannot do much, so we need to disallow having ON DELETE SET DEFAULT actions on such columns in a separate PR, see #6339. Fixes #6318. DESCRIPTION: Fixes a bug that might cause inserting incorrect DEFAULT values when applying foreign key actions	2022-09-23 03:05:08 -07:00
Ahmet Gedemenli	bae4b47c2f	Fix dropping replication slot (#6359 ) DESCRIPTION: Fixes dropping replication slots As detected by a flaky test, Citus sometimes fails to drop replication slots, possibly due to a race condition, at the end of a shard split. With this PR, we retry to drop them in case of an `OBJECT_IN_USE` error, consistently for 20 seconds. fixes: #6326	2022-09-21 16:29:56 +03:00
Nitish Upreti	e9508b2603	Shard Split : Add / Update logging (#6336 ) DESCRIPTION: Improve logging during shard split and resource cleanup ### DESCRIPTION This PR makes logging improvements to Shard Split : 1. Update confusing logging to fix #6312 2. Added new `ereport(LOG` to make debugging easier as part of telemetry review.	2022-09-16 09:39:08 -07:00
Marco Slot	8544346a78	Allow create_distributed_table_concurrently on an empty node (#6353 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-09-16 10:55:02 +02:00
Onder Kalaci	766f340ce0	Prevent failures on partitioned distributed tables with statistics objects on PG 15 Comment from the code is clear on this: /* * The statistics objects of the distributed table are not relevant * for the distributed planning, so we can override it. * * Normally, we should not need this. However, the combination of * Postgres commit 269b532aef55a579ae02a3e8e8df14101570dfd9 and * Citus function AdjustPartitioningForDistributedPlanning() * forces us to do this. The commit expects statistics objects * of partitions to have "inh" flag set properly. Whereas, the * function overrides "inh" flag. To avoid Postgres to throw error, * we override statlist such that Postgres does not try to process * any statistics objects during the standard_planner() on the * coordinator. In the end, we do not need the standard_planner() * on the coordinator to generate an optimized plan. We call * into standard_planner() for other purposes, such as generating the * relationRestrictionContext here. * * AdjustPartitioningForDistributedPlanning() is a hack that we use * to prevent Postgres' standard_planner() to expand all the partitions * for the distributed planning when a distributed partitioned table * is queried. It is required for both correctness and performance * reasons. Although we can eliminate the use of the function for * the correctness (e.g., make sure that rest of the planner can handle * partitions), it's performance implication is hard to avoid. Certain * planning logic of Citus (such as router or query pushdown) relies * heavily on the relationRestrictionList. If * AdjustPartitioningForDistributedPlanning() is removed, all the * partitions show up in the, causing high planning times for * such queries. */	2022-09-15 14:36:05 +03:00
aykut-bozkurt	739b91afa6	ensure we have more active nodes than replication factor. (#6341 ) DESCRIPTION: Fixes floating exception during create_distributed_table_concurrently. Fixes #6332. During create_distributed_table_concurrently, when there is no active primary node, it fails with floating exception. We added similar check with create_distributed_table. It will fail with proper message if current active node is less than replication factor.	2022-09-14 18:20:50 +03:00
Marco Slot	4ab415c43a	Fix escaping in sequence dependency queries (#6345 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-09-14 17:43:24 +03:00
Sameer Awasekar	4851b4e8f2	Introduce code changes to fix Issue:6303 (#6328 ) The PR introduces code changes to fix Issue [6303](https://github.com/citusdata/citus/issues/6303) `create_distributed_table_concurrently` following drop column, creates a buggy situation in split decoder. * Consider the below scenario: * Session1 : Drop column followed by create_distributed_table_concurrently * Session2 : Concurrent insert workload The child shards created by `create_distributed_table_concurrently` will have less columns than the source shard because some column were dropped. The incoming tuple from session2 will have more columns as the writes happened on source shard. But now the tuple needs to be applied on child shard. So we need to format existing tuple according to child schema and skip dropped column values. The PR fixes this by reformatting the tuple according the target child schema. Test: 1) isolation_create_distributed_concurrently_after_drop_column - Repros the issue and tests on the same.	2022-09-14 19:56:32 +05:30
Marco Slot	7a92d873b6	Fix bugs in CheckIfRelationWithSameNameExists (#6343 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-09-14 15:42:46 +02:00
Nils Dijk	da527951ca	Fix: rebalance stop non super user (#6334 ) No need for description, fixing issue introduced with new feature for 11.1 Fixes #6333 Due to Postgres' C api being o-indexed and postgres' attributes being 1-indexed, we were reading the wrong Datum as the Task owner when cancelling. Here we add a test to show the error and fix the off-by-one error.	2022-09-13 23:19:31 +02:00
Hanefi Onaldi	f34467dcb3	Remove missing declaration warning (#6330 ) When I built Citus on PG15beta4 locally, I get a warning message. ``` utils/background_jobs.c:902:5: warning: declaration does not declare anything [-Wmissing-declarations] __attribute__((fallthrough)); ^ 1 warning generated. ``` This is a hint to the compiler that we are deliberately falling through in a switch-case block.	2022-09-13 13:48:51 +03:00
Jelte Fennema	f13b140621	Show citus_copy_shard_placement progress in get_rebalance_progress (#6322 ) DESCRIPTION: Show citus_copy_shard_placement progress in get_rebalance_progress When rebalancing to a new node that does not have reference tables yet the rebalancer will first copy the reference tables to the nodes. Depending on the size of the reference tables, this might take a long time. However, there's no indication of what's happening at this stage of the rebalance. This PR improves this situation by also showing the progress of any citus_copy_shard_placement calls when calling get_rebalance_progress.	2022-09-13 08:59:52 +00:00
Naisila Puka	76ff4ab188	Adds support for unlogged distributed sequences (#6292 ) We can now do the following: - Distribute sequence with logged/unlogged option - ALTER TABLE my_sequence SET LOGGED/UNLOGGED - ALTER SEQUENCE my_sequence SET LOGGED/UNLOGGED Relevant PG commit `344d62fb9a`	2022-09-13 10:53:39 +03:00
Hanefi Onaldi	5cfcc63308	Add warning messages for cluster commands on partitioned tables (#6306 ) PG15 introduces `CLUSTER` commands for partitioned tables. Similar to a `CLUSTER` command with no supplied table names, these commands also can not be run inside transaction blocks and therefore can not be propagated in a distributed transaction block with ease. Therefore we raise warnings. Relevant PG commit: cfdd03f45e6afc632fbe70519250ec19167d6765	2022-09-13 00:05:58 +03:00

1 2 3 4 5 ...

2883 Commits (666696c01c307f9ea6a939294d9d2d313790f27c)