Commit Graph

128 Commits (906583f81151820e81ef25abd7fc2f95c5b120d4)

Author SHA1 Message Date
eaydingol ce9a7b7c79 Run citus tests on latest so, specified sql 2025-11-25 16:16:42 +03:00
Onur Tirtir 2e1de77744
Also use pid in valgrind logfile name (#8150)
Also use the pid in the valgrind logfile name to avoid overwriting the
valgrind logs when memory errors happen concurrently in different
processes:

(from https://valgrind.org/docs/manual/manual-core.html)
```
--log-file=<filename>
Specifies that Valgrind should send all of its messages to the specified file. If the file name is empty, it causes an abort. There are three special format specifiers that can be used in the file name.

%p is replaced with the current process ID. This is very useful for program that invoke multiple processes. WARNING: If you use --trace-children=yes and your program invokes multiple processes OR your program forks without calling exec afterwards, and you don't use this specifier (or the %q specifier below), the Valgrind output from all those processes will go into one file, possibly jumbled up, and possibly incomplete.
```

With this change, valgrind tests will generate many output files under
"src/test/regress" with the same prefix (citus_valgrind_test_log.txt by
default), which looks a bit ugly; but one can use
`cat src/test/regress/citus_valgrind_test_log.txt.[0-9]*` or similar to
combine them into a single valgrind log file later.
2025-08-27 14:01:25 +00:00
Muhammad Usama be6668e440
Snapshot-Based Node Split – Foundation and Core Implementation (#8122)
**DESCRIPTION:**
This pull request introduces the foundation and core logic for the
snapshot-based node split feature in Citus. This feature enables
promoting a streaming replica (referred to as a clone in this feature
and UI) to a primary node and rebalancing shards between the original
and the newly promoted node without requiring a full data copy.

This significantly reduces rebalance times for scale-out operations
where the new node already contains a full copy of the data via
streaming replication.

Key Highlights:
**1. Replica (Clone) Registration & Management Infrastructure**

Introduces a new set of UDFs to register and manage clone nodes:

- citus_add_clone_node()
- citus_add_clone_node_with_nodeid()
- citus_remove_clone_node()
- citus_remove_clone_node_with_nodeid()

These functions allow administrators to register a streaming replica of
an existing worker node as a clone, making it eligible for later
promotion via snapshot-based split.
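
For instance, a hedged sketch of registering a clone (host and port values are
hypothetical; the argument order follows the usage guide below):

```sql
-- Register the streaming replica at 127.0.0.1:5453 as a clone of the
-- worker at 127.0.0.1:5433
SELECT * FROM citus_add_clone_node('127.0.0.1', 5453, '127.0.0.1', 5433);
```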

**2. Snapshot-Based Node Split (Core Implementation)**
New core UDF: 

- citus_promote_clone_and_rebalance()

This function implements the full workflow to promote a clone and
rebalance shards between the old and new primaries. Steps include:

1. Ensuring Exclusivity – Blocks any concurrent placement-changing
operations.
2. Blocking Writes – Temporarily blocks writes on the primary to ensure
consistency.
3. Replica Catch-up – Waits for the replica to be fully in sync.
4. Promotion – Promotes the replica to a primary using pg_promote.
5. Metadata Update – Updates metadata to reflect the newly promoted
primary node.
6. Shard Rebalancing – Redistributes shards between the old and new
primary nodes.


**3. Split Plan Preview**
A new helper UDF get_snapshot_based_node_split_plan() provides a preview
of the shard distribution post-split, without executing the promotion.

**Example:**

```
reb 63796> select * from pg_catalog.get_snapshot_based_node_split_plan('127.0.0.1',5433,'127.0.0.1',5453);
  table_name  | shardid | shard_size | placement_node 
--------------+---------+------------+----------------
 companies    |  102008 |          0 | Primary Node
 campaigns    |  102010 |          0 | Primary Node
 ads          |  102012 |          0 | Primary Node
 mscompanies  |  102014 |          0 | Primary Node
 mscampaigns  |  102016 |          0 | Primary Node
 msads        |  102018 |          0 | Primary Node
 mscompanies2 |  102020 |          0 | Primary Node
 mscampaigns2 |  102022 |          0 | Primary Node
 msads2       |  102024 |          0 | Primary Node
 companies    |  102009 |          0 | Clone Node
 campaigns    |  102011 |          0 | Clone Node
 ads          |  102013 |          0 | Clone Node
 mscompanies  |  102015 |          0 | Clone Node
 mscampaigns  |  102017 |          0 | Clone Node
 msads        |  102019 |          0 | Clone Node
 mscompanies2 |  102021 |          0 | Clone Node
 mscampaigns2 |  102023 |          0 | Clone Node
 msads2       |  102025 |          0 | Clone Node
(18 rows)

```
**4. Test Infrastructure Enhancements**

- Added a new test case scheduler for snapshot-based split scenarios.
- Enhanced pg_regress_multi.pl to support creating node backups with
slightly modified options to simulate real-world backup-based clone
creation.

**5. Usage Guide**
The snapshot-based node split can be performed using the following
workflow:

**- Take a Backup of the Worker Node**
Run pg_basebackup (or an equivalent tool) against the existing worker
node to create a physical backup.

`pg_basebackup -h <primary_worker_host> -p <port> -D /path/to/replica/data --write-recovery-conf`

**- Start the Replica Node**
Start PostgreSQL on the replica using the backup data directory,
ensuring it is configured as a streaming replica of the original worker
node.

**- Register the Backup Node as a Clone**
Register the replica as a clone of its original worker node:

`SELECT * FROM citus_add_clone_node('<clone_host>', <clone_port>, '<primary_host>', <primary_port>);`

**- Promote and Rebalance the Clone**
Promote the clone to a primary and rebalance shards between it and the
original worker:

`SELECT * FROM citus_promote_clone_and_rebalance(<clone_node_id>);`

**- Drop Any Replication Slots from the Original Worker**
After promotion, clean up any unused replication slots from the original
worker:

`SELECT pg_drop_replication_slot('<slot_name>');`
2025-08-19 14:13:55 +03:00
SongYoungUk 743c9bbf87
fix #7715 - add assign hook for CDC library path adjustment (#8025)
DESCRIPTION: Automatically updates dynamic_library_path when CDC is
enabled

fix : #7715 

According to the documentation and `pg_settings`, the context of the
`citus.enable_change_data_capture` parameter is user.

However, changing this parameter — even as a superuser — doesn't work as
expected: while the initial copy phase works correctly, subsequent
change events are not propagated.

This appears to be due to the fact that `dynamic_library_path` is only
updated to `$libdir/citus_decoders:$libdir` when the server is restarted
and the `_PG_init` function is invoked.

To address this, I added an `EnableChangeDataCaptureAssignHook` that
automatically updates `dynamic_library_path` at runtime when
`citus.enable_change_data_capture` is enabled, ensuring that the CDC
decoder libraries are properly loaded.

Note that `dynamic_library_path` is already a `superuser`-context
parameter in base PostgreSQL, so updating it from within the assign hook
should be safe and consistent with PostgreSQL’s configuration model.
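
A minimal sketch of the resulting runtime behavior (the path value in the
comment is the one mentioned above):

```sql
SET citus.enable_change_data_capture = on;
-- dynamic_library_path is now expected to include $libdir/citus_decoders
SHOW dynamic_library_path;
```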

If there’s any reason this approach might be problematic or if there’s a
preferred alternative, I’d appreciate any feedback.




cc. @jy-min

---------

Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
Co-authored-by: ibrahim halatci <ihalatci@gmail.com>
2025-07-18 11:07:17 +03:00
Onur Tirtir 3d61c4dc71
Add citus_stat_counters view and citus_stat_counters_reset() function to reset it (#7917)
DESCRIPTION: Adds citus_stat_counters view that can be used to query
stat counters that Citus collects while the feature is enabled, which is
controlled by citus.enable_stat_counters. citus_stat_counters() can be
used to query the stat counters for the provided database oid and
citus_stat_counters_reset() can be used to reset them for the provided
database oid or for the current database if nothing or 0 is provided.
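
A short sketch of the interface described above (enabling the GUC per-session
is shown only for illustration; calling the reset function with no argument
targets the current database):

```sql
SET citus.enable_stat_counters TO on;
SELECT * FROM citus_stat_counters;     -- view over the collected counters
SELECT citus_stat_counters_reset();    -- reset counters for the current database
```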

Today we don't persist stat counters on server shutdown. In other words,
stat counters are automatically reset in case of a server restart.

Details on the underlying design can be found in header comment of
stat_counters.c and in the technical readme.

-------

Here are the details about what we track as of this PR:

For connection management, we have three statistics about the inter-node
connections initiated by the node itself:

* **connection_establishment_succeeded**
* **connection_establishment_failed**
* **connection_reused**

While the first two are relatively easier to understand, the third one
covers the case where a connection is reused. This can happen when a
connection was already established to the desired node, Citus decided to
cache it for some time (see citus.max_cached_conns_per_worker &
citus.max_cached_connection_lifetime), and then reused it for a new
remote operation. Here are the other important details about these
connection statistics:

1. connection_establishment_failed doesn't care about the connections
that we could establish but are lost later in the transaction. Plus, we
cannot guarantee that the connections that are counted in
connection_establishment_succeeded were not lost later.
2. connection_establishment_failed doesn't care about the optional
connections (see OPTIONAL_CONNECTION flag) that we gave up establishing
because of the connection throttling rules we follow (see
citus.max_shared_pool_size & citus.local_shared_pool_size). The reason
for this is that we didn't even try to establish these connections.
3. For the rest of the cases where a connection failed for some reason,
we always increment connection_establishment_failed even if the caller
was okay with the failure and knows how to recover from it (e.g., the
adaptive executor knows how to fall back to local execution when the target
node is the local node and it cannot establish a connection to the
local node). The reason is that even if it's likely that we can still
serve the operation, we still failed to establish the connection and we
want to track this.
4. Finally, the connection failures that we count in
connection_establishment_failed might be caused by any of the following
reasons and for now we prefer to _not_ further distinguish them for
simplicity:
a. remote node is down or cannot accept any more connections, or
overloaded such that citus.node_connection_timeout is not enough to
establish a connection
b. any internal Citus error that might result in preparing a bad
connection string so that libpq fails when parsing the connection string
even before actually trying to establish a connection via connect() call
c. broken citus.node_conninfo or such Citus configuration that was
incorrectly set by the user can also result in similar outcomes as in b
d. internal waitevent set / poll errors or OOM in local node

We also track two more statistics for query execution:

* **query_execution_single_shard**
* **query_execution_multi_shard**

And more importantly, both query_execution_single_shard and
query_execution_multi_shard are not only tracked for the top-level
queries but also for the subplans etc. The reason is that for some
queries, e.g., the ones that go through recursive planning, after Citus
performs the heavy work as part of subplans, the work that needs to be
done for the top-level query becomes quite straightforward. And for such
query types, it would be deceiving if we only incremented the query stat
counters for the top-level query. Similarly, for non-pushable INSERT ..
SELECT and MERGE queries, we perform separate counter increments for the
SELECT / source part of the query besides the final INSERT / MERGE
query.
2025-04-28 12:23:52 +00:00
eaydingol 117bd1d04f
Disable nonmaindb interface (#7905)
DESCRIPTION: The PR disables the non-main db related features. 

The non-main db related features were introduced in
https://github.com/citusdata/citus/pull/7203.
2025-02-21 13:36:19 +03:00
Jelte Fennema-Nio 8c9de08b76
Fix CI issues after Github Actions networking changes (#7624)
For some reason using localhost in our hba file doesn't have the
intended effect anymore in our Github Actions runners. Probably because
of some networking change (IPv6 maybe) or some change in the
`/etc/hosts` file.

Replacing localhost with the equivalent loopback IPv4 and IPv6 addresses
resolved this issue.
2024-06-14 16:20:23 +02:00
Karina 41d99249d9
Use expecteddir option when running vanilla tests (#7573)
In PostgreSQL 16 a new option expecteddir was introduced to pg_regress.
Together with the fix in
[196eeb6b](https://github.com/postgres/postgres/commit/196eeb6b) it
causes a check-vanilla failure if expecteddir is not specified.

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
2024-04-10 16:08:54 +00:00
Halil Ozan Akgül b877d606c7
Adds 2PC distributed commands from other databases (#7203)
DESCRIPTION: Adds support for 2PC from non-Citus main databases

This PR only adds support for `CREATE USER` queries; other queries need
to be added. But that should be simple because this PR creates the
underlying structure.

The Citus main database is the database where the Citus extension is
created. Non-main databases are all the other databases that are on the
same node as the Citus main database.

When a `CREATE USER` query is run on a non-main database we:

1. Run `start_management_transaction` on the main database. This
function saves the outer transaction's xid (the non-main database
query's transaction id) and marks the current query as main db command.
2. Run `execute_command_on_remote_nodes_as_user("CREATE USER
<username>", <username to run the command>)` on the main database. This
function creates the users in the rest of the cluster by running the
query on the other nodes. The user on the current node is created by the
outer, non-main db query itself, to make sure subsequent commands in the
same transaction can see this user.
3. Run `mark_object_distributed` on the main database. This function
adds the user to `pg_dist_object` in all of the nodes, including the
current one.

This PR also implements transaction recovery for the queries from
non-main databases.
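
From the client's perspective, the triggering statement is just a plain
command on the non-main database (the role name is hypothetical):

```sql
-- Run while connected to a non-main database (one without the Citus extension);
-- per the steps above it is forwarded through the main database and propagated
-- to all nodes with 2PC.
CREATE USER app_user;
```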
2023-12-22 19:19:41 +03:00
Gürkan İndibay 3b556cb5ed
Adds create / drop database propagation support (#7240)
DESCRIPTION: Adds support for propagating `CREATE`/`DROP` database

In this PR, create and drop database support is added.

For CREATE DATABASE:
* "oid" option is not supported
* specifying "strategy" to be different than "wal_log" is not supported
* specifying "template" to be different than "template1" is not
supported

The last two are because those are not saved in `pg_database` and when
activating a node, we cannot assume what parameters were provided when
creating the database.

And "oid" is not supported because whether user specified an arbitrary
oid when creating the database is not saved in pg_database and we want
to avoid from oid collisions that might arise from attempting to use an
auto-assigned oid on workers.

Finally, in case of node activation, GRANTs for the database are also
propagated.
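
Hypothetical statements illustrating the restrictions above (database and
template names are made up):

```sql
CREATE DATABASE analytics;                        -- propagated: default strategy and template
CREATE DATABASE analytics_copy TEMPLATE my_tmpl;  -- rejected: only template1 is supported
```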

---------

Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl>
Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2023-11-21 16:43:51 +03:00
Japin Li e14e8667cc
Fix redundant variable declaration (#7353)
The `$workerCount` variable was declared twice in
src/test/regress/pg_regress_multi.pl.
2023-11-17 13:01:23 +03:00
Naisila Puka 0d1f18862b
Propagates SECURITY LABEL ON ROLE stmt (#7304)
We propagate `SECURITY LABEL [for provider] ON ROLE rolename IS
labelname` to the worker nodes.
We also make sure to run the relevant `SecLabelStmt` commands on a
newly added node by looking at roles found in `pg_shseclabel`.

See official docs for explanation on how this command works:
https://www.postgresql.org/docs/current/sql-security-label.html
This command stores the role label in the `pg_shseclabel` catalog table.
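
An illustrative statement of the kind that is now propagated (the provider,
role, and label names are hypothetical and require a registered label
provider):

```sql
SECURITY LABEL FOR dummy_provider ON ROLE some_role IS 'classified';
```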

This commit also fixes the regex string in
`check_gucs_are_alphabetically_sorted.sh` script such that it escapes
the dot. Previously it was looking for all strings starting with "citus"
instead of "citus." as it should.

To test this feature, I currently make use of a special GUC to control
label provider registration in PG_init when creating the Citus extension.
2023-11-16 13:12:30 +03:00
Halil Ozan Akgül 2c7beee562
Fix citus.tenant_stats_limit test by setting it to 2 (#6899)
citus.tenant_stats_limit was set to 2 when we were adding tests for it.
Then we changed it to 10, making the tests incorrect.
This PR fixes that without breaking other tests.
2023-05-23 17:44:07 +03:00
Halil Ozan Akgül 52ad2d08c7
Multi tenant monitoring (#6725)
DESCRIPTION: Adds views that monitor statistics on tenant usages

This PR adds `citus_stats_tenants` view that monitors the tenants on the
cluster.

`citus_stats_tenants` shows the node id, colocation id, tenant
attribute, read count in this period and last period, and query count in
this period and last period of the tenant.
The tenant attribute currently is the tenant's distribution column value;
later, when schema based sharding is introduced, this meaning might
change.
A period is a time bucket the queries are counted by. Read and query
counts for this period can increase until the current period ends. After
that those counts are moved to last period's counts, which cannot
change. The period length can be set using 'citus.stats_tenants_period'.

`SELECT` queries are counted as _read_ queries; `INSERT`, `UPDATE` and
`DELETE` queries are counted as _write_ queries. So in the view, read
counts are `SELECT` counts and query counts are `SELECT`, `INSERT`,
`UPDATE` and `DELETE` counts.

The data is stored in shared memory, in a struct named
`MultiTenantMonitor`.

`citus_stats_tenants` shows the data from local tenants.

`citus_stats_tenants` shows up to `citus.stats_tenant_limit` tenants.
The tenants are scored based on the number of queries they run and the
recency of those queries. Every query run increases the score of the
tenant by `ONE_QUERY_SCORE`, and after every period ends the scores are
halved. Halving is done lazily.
To retain information longer, the monitor keeps up to 3 times
`citus.stats_tenant_limit` tenants. When the tenant count hits `3 *
citus.stats_tenant_limit`, the last `citus.stats_tenant_limit` tenants are
removed. To see all stored tenants you can use
`citus_stats_tenants(return_all_tenants := true)`
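
A short sketch of querying the monitor using the names given above:

```sql
SELECT * FROM citus_stats_tenants;                               -- top tenants on this node
SELECT * FROM citus_stats_tenants(return_all_tenants := true);   -- all stored tenants
```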

- [x] Create collector view that gets data from all nodes. #6761 
- [x] Add monitoring log #6762 
- [x] Create enable/disable GUC #6769 
- [x] Parse the annotation string correctly #6796 
- [x] Add local queries and prepared statements #6797
- [x] Rename to citus_stat_statements #6821 
- [x] Run pgbench
- [x] Fix role permissions #6812

---------

Co-authored-by: Gokhan Gulbiz <ggulbiz@gmail.com>
Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2023-04-05 17:44:17 +03:00
rajeshkt78 85b8a2c7a1
CDC implementation for Citus using Logical Replication (#6623)
Description:
Implements CDC using Logical Replication, avoiding re-publishing events
multiple times by setting up a replication origin session, which adds
"DoNotReplicateId" to every WAL entry generated during:
   - shard splits
   - shard moves
   - create distributed table
   - undistribute table
   - alter distributed tables (for some cases)
   - reference table operations
   

The citus decoder, which decodes WAL events for CDC clients, ignores any
WAL entry whose replication origin is not zero. It also maps shard names
to distributed table names.
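
As a rough illustration of consuming decoded WAL events through a logical
replication slot (the slot name is hypothetical, and the built-in
`test_decoding` plugin is only a stand-in for the Citus decoder shipped under
`citus_decoders`):

```sql
-- Not the Citus decoder itself: test_decoding stands in to show how a
-- CDC client creates a slot and reads decoded changes from it.
SELECT pg_create_logical_replication_slot('citus_cdc_demo', 'test_decoding');
SELECT * FROM pg_logical_slot_get_changes('citus_cdc_demo', NULL, NULL);
```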
2023-03-28 16:00:21 +05:30
aykut-bozkurt ea3093bdb6
Make workerCount configurable for regression tests (#6764)
Make worker count flexible in our regression tests instead of hardcoding
it to 2 workers.
2023-03-20 12:06:31 +03:00
Jelte Fennema aa9cd16d15
Use correct guc value to disable statistics collection (#6641)
The `citus.enable_statistics_collection` GUC is a boolean, not an integer
one. Setting it to `-1` showed errors in the logs.
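
For example, a sketch of the boolean form (applying it via `ALTER SYSTEM` is
just one way to set it):

```sql
-- Disable statistics collection with a boolean value instead of -1
ALTER SYSTEM SET citus.enable_statistics_collection = off;
SELECT pg_reload_conf();
```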
2023-01-24 15:32:50 +01:00
Jelte Fennema 93fcc5c5d8
Move tablespace directory creation to pg_regress_multi.pl (#6629)
Multiple `check-xxx` targets create tablespaces. If you run
two of these at the same time you would get an error like:

```diff
CREATE TABLESPACE test_tablespace LOCATION :'test_tablespace';
+ERROR:  directory "/home/rajesh/citus/citus/src/test/regress/tmp_check/ts0/PG_14_202107181" already in use as a tablespace
```

This fixes that by moving tablespace directory creation and removal to
pg_regress_multi.pl instead of doing it in the Makefile.
2023-01-20 12:34:33 +00:00
aykut-bozkurt 8be4ce546e
fix vanilla test status on CI (#6555)
- Because of the make command used for vanilla tests, test status is
always shown as success on CI. As a fix, I added `&& false` at the end of
the command that copies the diff file, to make the command fail when
check-vanilla fails.
```make
check-vanilla: all
	$(pg_regress_multi_check) --vanillatest || (cp $(vanilla_diffs_file) $(citus_abs_srcdir)/regression.diffs && false)
```

- I also fixed some vanilla tests that fail due to recently added
clock-related operators showing up in some queries.
2022-12-13 11:15:47 +03:00
aykut-bozkurt 442cdb2ea5
pg_regress needs the option dlpath for postgres tests to find regress.so (#6416)
When you run vanilla tests in your local environment, some of the tests
try to find regress.so, which is not in the default lib path. That is why
we need to specify the regress.so path via the dlpath option.

Example failure:
```
LOAD :'regresslib';
+ERROR:  could not access file "/home/aykutbozkurt/.pgenv/pgsql-15beta4/lib/regress.so": No such file or directory
```

It is actually in
`~/.pgenv/src/postgresql-15beta4/src/test/regress/regress.so` which is
found by `$regresslibdir`.
2022-10-11 14:43:06 +03:00
Jelte Fennema 5c64227223
Hopefully reduce flaky tests by disabling the maintenance daemon (#6252)
Sometimes our CI randomly fails on a test in a way similar to this:
```diff
 step s2-drop:
     DROP TABLE cancel_table;
-
+ <waiting ...>
+step s2-drop: <... completed>

 starting permutation: s1-timeout s1-begin s1-sleep10000 s1-rollback s1-reset s1-drop
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/26524/workflows/5415b84f-13a3-482f-bef9-648314c79a67/jobs/756377

Another example of a failure like this:
```diff
 stop_session_level_connection_to_node
 -------------------------------------
                                      
 (1 row)
 
 step s3-display: 
  SELECT * FROM ref_table ORDER BY id, value;
  SELECT * FROM dist_table ORDER BY id, value;
-
+ <waiting ...>
+step s3-display: <... completed>
 id|value
 --+-----
 ```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26551/workflows/91dca4b2-bb1c-4cae-b2ef-ce3f9c689ce5/jobs/757781

A step that shouldn't be blocked is detected as "waiting..." temporarily
and then gets unblocked automatically immediately after. I'm not
certain of the reason for this, but one explanation is that the
maintenance daemon is doing something that blocks the query. In the
shown case my hunch is that it could be the deferred shard deletion.

This PR disables all the features of the maintenance daemon during
isolation testing to try and prevent processes from randomly being
detected as blocking.

NOTE: I'm not certain that this will actually fix this issue. If the
issue persists even after this change, at least we know that it's not
the maintenance daemon that's blocking it.
2022-10-04 14:33:57 +03:00
Önder Kalacı 1df943e0d5
Use Posix locale in the tests (#6261)
Commit 9653a0065e changed it to C.UTF-8, which fails on macOS
2022-08-29 12:52:03 +02:00
Jelte Fennema ee5af1ab90
Use C.UTF-8 locale in tests (#6242)
I upgraded my OS to Ubuntu 22.04 a while back and since then some tests
order output slightly differently. I think it might be because of the
glibc upgrade that changed ordering for things like underscores and
spaces.

Changing the locale to C.UTF-8 solves this issue.
2022-08-25 13:10:49 +02:00
Jelte Fennema 1dd775fae8
Speed up logical replication tests to fix flakyness (#6229)
The isolation_tenant_isolation_nonblocking test would sometimes randomly
fail in CI, because we have a runtime limit of 2 minutes per test.
```
test isolation_tenant_isolation_nonblocking ... make: *** [Makefile:171: check-enterprise-isolation] Terminated

Too long with no output (exceeded 2m0s): context deadline exceeded
```

One solution would obviously be to increase the timeout, but instead I
spent some time to increase the speed of our tests by tweaking some
timings. On my local machine the time it took to run the
isolation_tenant_isolation_nonblocking test went from 75s to 15s.

So now we should easily stay within the 2 minute per test limit.

I also checked if the new settings improved other logical replication
tests, but the impact differs wildly per test. One other example of a
test that runs much quicker due to the change is
isolation_non_blocking_shard_split_fkey. But the shard move tests I
tried are impacted much less.

Example of failed tests: https://app.circleci.com/pipelines/github/citusdata/citus/26373/workflows/4fa660e4-63c8-4844-bef8-70a7bea902b7/jobs/748199
2022-08-23 17:37:31 +02:00
Jelte Fennema 25e5cf2e50
Fix flakyness in failure_setup (#6205)
In CI sometimes failure_setup will fail with the following error:
```diff
 SELECT master_add_node('localhost', :worker_2_proxy_port);  -- an mitmproxy which forwards to the second worker
- master_add_node
----------------------------------------------------------------------
-               2
-(1 row)
-
+ERROR:  connection to the remote node localhost:9060 failed with the following error: could not connect to server: Connection refused
+	Is the server running on host "localhost" (127.0.0.1) and accepting
+	TCP/IP connections on port 9060?
+could not connect to server: Connection refused
+	Is the server running on host "localhost" (127.0.0.1) and accepting
+	TCP/IP connections on port 9060?
+could not connect to server: Cannot assign requested address
+	Is the server running on host "localhost" (::1) and accepting
+	TCP/IP connections on port 9060?
diff -dU10 -w /home/circleci/project/src/test/regress/expected/failure_online_move_shard_placement.out /home/circleci/project/src/test/regress/results/failure_online_move_shard_placement.out
```

This then breaks all the tests run after it as well, because we're
missing one worker node.

Locally I was able to reproduce this error by sleeping for 10 seconds in
the forked process before actually starting mitmproxy. So I expect that
what's happening in CI is that, due to limited resources, mitmproxy is
not up yet when we try to add its port as a worker node.

This PR fixes this by waiting until mitmproxy is listening on its socket
before actually starting to run our tests. This fixed it locally for me
when I made the forked process sleep for 10 seconds before starting
mitmproxy.

In passing it also improves the detection and errors that we already
had for the case where something was already listening on the 
mitmproxy port.

Because both @gledis69 and I were changing things in our CI images
at the same time this also includes a bump of the style checker tools.
Closes #6200
2022-08-19 13:03:08 +00:00
Jelte Fennema 78a5013e24
Support changing CPU priorities for backends and shard moves (#6126)
**Intro**
This adds support to Citus to change the CPU priority values of
backends. This is created with two main usecases in mind:

1. Users might want to run the logical replication part of the shard moves
   or shard splits at a higher speed than it would run on its own.
   This might cause some small loss of DB performance for their regular 
   queries, but this is often worth it. During high load it's very possible
   that the logical replication WAL sender is not able to keep up with the
   WAL that is generated. This is especially a big problem when the
   machine is close to running out of disk when doing a rebalance.
2. Users might have certain long running queries that they want to
   deprioritize so they don't impact their regular workload too much.

**Be very careful!!!**
Using CPU priorities to control scheduling can be helpful in some cases
to control which processes are getting more CPU time than others. 
However, due to an issue called "[priority inversion][1]" it's possible that
using CPU priorities together with the many locks that are used within
Postgres cause the exact opposite behavior of what you intended. This
is why this PR only allows the PG superuser to change the CPU priority 
of its own processes. Currently it's not recommended to set `citus.cpu_priority`
directly. Currently the only recommended interface for users is the setting 
called `citus.cpu_priority_for_logical_replication_senders`. This setting
controls CPU priority for a very limited set of processes (the logical 
replication senders). So, the dangers of priority inversion are also limited
when using it for this usecase.

**Background**
Before reading the rest it's important to understand some basic
background regarding process CPU priorities, because they are a bit
counter intuitive. A lower priority value, means that the process will
be scheduled more and whatever it's doing will thus complete faster. The
default priority for processes is 0. Valid values are from -20 to 19
inclusive. On Linux a larger difference between values of two processes
will result in a bigger difference in percentage of scheduling.

**Handling the usecases**
Usecase 1 can be achieved by setting `citus.cpu_priority_for_logical_replication_senders`
to the priority value that you want it to have. It's necessary to set
this both on the workers and the coordinator. Example:
```
citus.cpu_priority_for_logical_replication_senders = -10
```

Usecase 2 can with this PR be achieved by running the following as
superuser. Note that this is only possible as superuser currently 
due to the dangers mentioned in the "Be very careful!!!" section.
And although this is possible it's **NOT** recommended:
```sql
ALTER USER background_job_user SET citus.cpu_priority = 5;
```

**OS configuration**
To actually make these settings work well it's important to run Postgres
with a more permissive value for the 'nice' resource limit than
Linux uses by default. By default Linux will not allow a process to
set its priority lower than it currently is, even if it was lower when
the process originally started. This capability is necessary to reset
the CPU priority to its original value after a transaction finishes.
Depending on how you run Postgres this needs to be done in one of two
ways:

If you use systemd to start Postgres, all you have to do is add a line
like this to the systemd service file:
```conf
LimitNice=+0 # the + is important, otherwise its interpreted incorrectly as 20
```

If that's not the case you'll have to configure `/etc/security/limits.conf` 
like so, assuming that you are running Postgres as the `postgres` OS user:
```
postgres            soft    nice            0
postgres            hard    nice            0
```
Finally you'd have to add the following line to `/etc/pam.d/common-session`:
```
session required pam_limits.so
```

These settings would allow you to change the priority back after setting it
to a higher value.

However, to actually allow you to set priorities even lower than the
default priority value you would need to change the values in the 
config to something lower than 0. So for example:
```conf
LimitNice=-10
```

or

```
postgres            soft    nice            -10
postgres            hard    nice            -10
```

If you use WSL2 you'll likely have to do one more thing. You have to
open a new shell, because PAM is only used during login, and
WSL2 doesn't actually log you in. You can force a login like this:
```
sudo su $USER --shell /bin/bash
```
Source: https://stackoverflow.com/a/68322992/2570866

[1]: https://en.wikipedia.org/wiki/Priority_inversion
2022-08-16 13:07:17 +03:00
aykut-bozkurt ccf1e0f584
Pg vanilla tests can be run with citus created. (#6018) 2022-08-11 12:53:22 +03:00
Hanefi Onaldi 4185543910
Pass source directory in env to regression tests
PostgreSQL 15 dropped usage of .source files that are used to generate
.sql and .out files by replacing some placeholders with the actual
values before test runs. Instead, the information is passed from
pg_regress to the .sql and .out files directly via env variables. Those
variables are read via \getenv psql command in relevant test files.

PostgreSQL 15 commit d1029bb5a26cb84b116b0dee4dde312291359f2a introduced
some changes to pg_regress binary that allowed this to happen. However
this change is not backported to earlier versions of PG, and thus we
come up with a similar mechanism in pg_regress_multi that works in all
available PG versions.
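
A small sketch of the mechanism (the table and file names are illustrative;
`PG_ABS_SRCDIR` is one of the variables pg_regress exports):

```sql
-- Read an env variable exported by the test harness and build a data file path
\getenv abs_srcdir PG_ABS_SRCDIR
\set datafile :abs_srcdir '/data/example.data'
COPY example_table FROM :'datafile';  -- example_table and the file are hypothetical
```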
2022-08-09 14:15:51 +03:00
aykut-bozkurt 3ddc089651
stop distributing views with no distributed dependency if GUC DistributeLocalViews is set false. (#6083) 2022-08-04 12:34:40 +03:00
aykut-bozkurt 5f27445b69
enable propagation warnings before postgres vanilla tests (#6081) 2022-07-27 10:34:41 +03:00
aykut-bozkurt 67ac3da2b0
added citus_depended_objects udf and HideCitusDependentObjects GUC to hide citus depended objects from pg meta queries (#6055)
use RecurseObjectDependencies api to find if an object is citus depended

make vanilla tests runnable to see if citus_depended function is working correctly
2022-07-25 16:43:34 +03:00
Jelte Fennema 184c7c0bce
Make enterprise features open source (#6008)
This PR makes all of the features open source that were previously only
available in Citus Enterprise.

Features that this adds:
1. Non blocking shard moves/shard rebalancer
   (`citus.logical_replication_timeout`)
2. Propagation of CREATE/DROP/ALTER ROLE statements
3. Propagation of GRANT statements
4. Propagation of CLUSTER statements
5. Propagation of ALTER DATABASE ... OWNER TO ...
6. Optimization for COPY when loading JSON to avoid double parsing of
   the JSON object (`citus.skip_jsonb_validation_in_copy`)
7. Support for row level security
8. Support for `pg_dist_authinfo`, which allows storing different
   authentication options for different users, e.g. you can store
   passwords or certificates here.
9. Support for `pg_dist_poolinfo`, which allows using connection poolers
   in between coordinator and workers
10. Tracking distributed query execution times using
   citus_stat_statements (`citus.stat_statements_max`,
   `citus.stat_statements_purge_interval`,
   `citus.stat_statements_track`). This is disabled by default.
11. Blocking tenant_isolation
12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`
2022-06-16 00:23:46 -07:00
gledis69 4731630741 Add distributing lock command support 2022-05-20 12:28:07 +03:00
Marco Slot ceb593c9da Convert citus.hide_shards_from_app_name_prefixes to citus.show_shards_for_app_name_prefixes 2022-05-03 14:22:13 +02:00
Jelte Fennema 68bfc8d1c0
Use good initdb options in arbitrary configs tests (#5802)
In `pg_regress_multi.pl` we're running `initdb` with some options that
the `common.py` `initdb` is currently not using. All these flags seem
reasonable, so this brings `common.py` in line with
`pg_regress_multi.pl`.

In passing change the `--nosync` flag to `--no-sync`, since that's what
the PG documentation lists as the official option name (but both work).
2022-03-17 13:22:23 +01:00
Marco Slot 33bfa0b191 Hide shards from application_name's with a specific prefix 2022-01-18 15:20:55 +04:00
Ahmet Gedemenli 042d45b263 Propagate foreign server ops 2021-12-23 17:54:04 +03:00
Marco Slot 78866df13c Remove master_append_table_to_shard UDF 2021-11-08 10:43:24 +01:00
Marco Slot 386d2567d4 Reduce reliance on append tables in regression tests 2021-10-08 21:27:14 +02:00
Jelte Fennema bb5c494104 Enable binary encoding by default on PG14
Since PG14 we can now use binary encoding for arrays and composite types
that contain user defined types. This was fixed in this commit in
Postgres: 670c0a1d47

This change starts using that knowledge, by not necessarily falling back
to text encoding anymore for those types.

While doing this and testing a bit more I found various cases where
binary encoding would fail that our checks didn't cover. This fixes
those cases and adds tests for those. It also fixes EXPLAIN ANALYZE
never using binary encoding, which was a leftover of a workaround that
was no longer necessary.

Finally, it changes the default for both `citus.enable_binary_protocol`
and `citus.binary_worker_copy_format` to `true` for PG14 and up. In our
cloud offering `binary_worker_copy_format` already was true by default.
`enable_binary_protocol` had a bug with MX and user defined types;
this bug was fixed by the above-mentioned fixes.
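
For reference, the new PG14+ defaults described above, written out explicitly
(a sketch only; normally no action is needed since these become the defaults):

```sql
ALTER SYSTEM SET citus.enable_binary_protocol = true;
ALTER SYSTEM SET citus.binary_worker_copy_format = true;
SELECT pg_reload_conf();
```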
2021-09-06 10:27:29 +02:00
Sait Talha Nisanci 2fa1e5ffe3 Use the default max_parallel_workers_per_gather for vanilla 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci d1c0403055 Disable Query Identifier calculation in tests
When queryId is not 0 and verbose is true, the query identifier is
emitted to the explain output. This is breaking Postgres outputs.
We disable the query identifier calculation in the tests.
Commit on PG that introduced the query identifier in the explain output:
4f0b0966c866ae9f0e15d7cc73ccf7ce4e1af84b
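
A sketch of one way to do this on PG14+; whether the tests use exactly this
GUC is an assumption based on the description:

```sql
-- Turn off query identifier computation so EXPLAIN output stays stable
SET compute_query_id = off;
```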
2021-09-03 15:41:28 +03:00
Ahmet Gedemenli 089ef35940 Disable dropping and truncating known shards
Add test for disabling dropping and truncating known shards
2021-06-02 14:30:27 +02:00
SaitTalhaNisanci 82f34a8d88
Enable citus.defer_drop_after_shard_move by default (#4961)
Enable citus.defer_drop_after_shard_move by default
2021-05-21 10:48:32 +03:00
Jelte Fennema 10f06ad753 Fetch shard size on the fly for the rebalance monitor
Without this change the rebalancer progress monitor gets the shard sizes
from the `shardlength` column in `pg_dist_placement`. This column needs to
be updated manually by calling `citus_update_table_statistics`.
However, `citus_update_table_statistics` could lead to distributed
deadlocks while database traffic is on-going (see #4752).

To work around this we don't use the `shardlength` column anymore. Instead,
for every rebalance we now fetch all shard sizes on the fly.

A few additional things this does are:
1. It adds tests for the rebalance progress function.
2. If a shard move cannot be done because a source or target node is
   unreachable, then we error out and stop the rebalance, instead of showing
   a warning and continuing. When using the by_disk_size rebalance
   strategy it's not safe to continue with other moves if a specific
   move failed. It's possible that the failed move would have made space
   for the next move, and because the failed move never happened this
   space does not exist.
3. Adds two new columns to the result of `get_rebalancer_progress` which
   shows the size of the shard on the source and target node.
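
A minimal sketch of inspecting that output (using the function name as given
above; the exact name and columns depend on the Citus version):

```sql
-- Shows per-shard progress, now including shard sizes on the source and target node
SELECT * FROM get_rebalancer_progress();
```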

Fixes #4930
2021-05-20 16:38:17 +02:00
Nils Dijk 1c1999ed7b
incorporate the fixopen fix for osx users on bigsur (#4837)
comparable to https://github.com/citusdata/tools/pull/88

this patch adds checks to the perl script running the Citus testing harness so that it starts the postgres instances via the fixopen binary, when present, to work around `Interrupted System Call` errors on macOS Big Sur.
2021-03-22 16:22:08 +01:00
Hadi Moshayedi 0e0fd6599a Faster logical replication tests.
Logical replication status can take wal_receiver_status_interval
seconds to get updated. Default is 10s, which means tests in
which logical replication is used can take a long time to finish.
We reduce it to 1 second to speed these tests up.

Logical replication apply launcher launches workers every
wal_retrieve_retry_interval, so if we have many shard moves with
logical replication consecutively, they will be throttled by this
parameter. Default is 5s, we reduce it to 1s so we finish tests
faster.
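
Expressed as settings (a sketch; the regression harness applies these through
its own configuration rather than via SQL):

```sql
ALTER SYSTEM SET wal_receiver_status_interval = '1s';
ALTER SYSTEM SET wal_retrieve_retry_interval = '1s';
SELECT pg_reload_conf();
```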
2021-01-19 07:48:47 -08:00
Marco Slot 48caca4084 Improve regression test settings 2020-11-30 20:34:03 +01:00
Onur Tirtir d912d4bc38
Print full file path in valgrind testing (#4299) 2020-11-06 10:26:53 +03:00
Sait Talha Nisanci 078dcae18c Write settings to postgres configuration file directly
In our test structure, we have been passing postgres configurations from
the terminal, which causes problems once it exceeds a certain length:
the server cannot start, and understanding why it failed is not easy
because there isn't a nice error message.

This commit changes this to write the settings directly to the postgres
configuration file. This way we can add as many postgres settings as we
want to without needing to worry about the length problem.
2020-10-05 22:09:08 +03:00