DESCRIPTION: Drops PG14 support
1. Remove "$version_num" != 'xx' from configure file
2. delete all PG_VERSION_NUM = PG_VERSION_XX references in the code
3. Look at pg_version_compat.h file, remove all _compat functions etc
defined specifically for PGXX differences
4. delete all PG_VERSION_NUM >= PG_VERSION_(XX+1), PG_VERSION_NUM <
PG_VERSION_(XX+1) ifs in the codebase
5. delete ruleutils_xx.c file
6. cleanup normalize.sed file from pg14 specific lines
7. delete all alternative output files for that particular PG version,
server_version_ge variable helps here
PostgreSQL 16 adds an extra condition (id IS NOT NULL) to the subquery.
This condition is likely used to ensure that no null values are
processed in the subquery. Instead of using the condition id IS NOT
NULL, PostgreSQL 17 generates the subplan with a trivial condition
(WHERE true), indicating that it does not need to explicitly check for
non-null values.
PostgreSQL 17 likely includes optimizations to handle null checks more
efficiently. The WHERE (id IS NOT NULL) condition that was present in
PostgreSQL 16 may now be considered redundant by the planner, as it is
implicitly handled by the query execution engine.
https://github.com/postgres/postgres/commit/b262ad44
```diff
SELECT
foo1.id
FROM
(SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo9,
(SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo8,
(SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo7,
(SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo6,
(SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo5,
(SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo4,
(SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo3,
(SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo2,
(SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo10,
(SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo1
WHERE
foo1.id = foo9.id AND
foo1.id = foo8.id AND
foo1.id = foo7.id AND
foo1.id = foo6.id AND
foo1.id = foo5.id AND
foo1.id = foo4.id AND
foo1.id = foo3.id AND
foo1.id = foo2.id AND
foo1.id = foo10.id AND
foo1.id = foo1.id
ORDER BY 1;
...
-DEBUG: generating subplan XXX_10 for subquery SELECT id FROM local_dist_join_mixed.local WHERE (id IS NOT NULL)
+DEBUG: generating subplan XXX_10 for subquery SELECT id FROM local_dist_join_mixed.local WHERE true
...
```
This fix ensures that the expected DEBUG error messages from the router
planner in `multi_router_planner`, `multi_router_planner_fast_path` and
`query_single_shard_table` are present with PG17.
In `query_single_shard_table` the diff:
```
SELECT COUNT(*) FROM citus_local_table t1
WHERE t1.b IN (
SELECT b+1 FROM nullkey_c1_t1 t2 WHERE t2.b = t1.a
);
-DEBUG: router planner does not support queries that reference non-colocated distributed tables
+DEBUG: Local tables cannot be used in distributed queries.
```
occurred because of[ this PG17
commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9f1337639)
which enables the optimizer to pull up a correlated ANY subquery to a
join. The fix inhibits subquery pull up by including a volatile function
in the predicate involving the ANY subquery, preserving the pre-PG17
optimizer treatment of the query.
In the case of `multi_router_planner` and
`multi_router_planner_fast_path` the diffs:
```
-- partition_column is null clause does not prune out any shards,
-- all shards remain after shard pruning, not router plannable
SELECT *
FROM articles_hash a
WHERE a.author_id is null;
-DEBUG: Router planner cannot handle multi-shard select queries
+DEBUG: Creating router plan
```
are because of [this PG17
commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b262ad440),
which enables the optimizer to detect and remove redundant IS (NOT) NULL
expressions. The fix is to adjust the table definition so the column
used for distribution is not marked NOT NULL, thus preserving the
pre-PG17 query planning behavior.
Finallly, a rule is added to `normalize.sed` to ignore DEBUG logging in CREATE MATERIALIZED
VIEW AS statements introduced by [this PG17
commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b4da732fd64);
_when creating materialized views, use REFRESH logic to load data_, a
consequence of which is that with `client_min_messages` at `DEBUG2`
Postgres emits extra detail for CREATE MATERIALIZED VIEW AS statements.
```
CREATE MATERIALIZED VIEW mv_articles_hash_empty AS
SELECT * FROM articles_hash WHERE author_id = 1;
DEBUG: Creating router plan
DEBUG: query has a single distribution column value: 1
+DEBUG: drop auto-cascades to type multi_router_planner.pg_temp_61391
+DEBUG: drop auto-cascades to type multi_router_planner.pg_temp_61391[]
```
The rule can be changed to a normalization, or possibly dropped, when 17 becomes the minimum supported version.
A recent Postgres commit (*) that refactored error messages is the cause
of the diffs in pg16 regress test when running Citus on Postgres 17. The
fix changes the pg16 goldfile and includes a normalization rule for the
error messages so pg16 will pass when running with version 16 of
Postgres.
(*)
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=498ee9ee2f
Test isolation_update_node fails on some systems with the following error:
```
-s2: WARNING: connection to the remote node non-existent:57637 failed with the following error: could not translate host name "non-existent" to address: Name or service not known
+s2: WARNING: connection to the remote node non-existent:57637 failed with the following error: could not translate host name "non-existent" to address: Temporary failure in name resolution
```
This slightly modifies an already existing [normalization
rule](739c6d26df/src/test/regress/bin/normalize.sed (L217-L218))
to fix it.
Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
PG16 compatibility - part 9
Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb
part 7 ee3153fe50
part 8 2c50b5f7ff
This commit is in the series of PG16 compatibility commits. It makes some changes
to our tests in order to be compatible with the following in PG16:
- Fix multi_subquery_in_where_reference_clause test
somehow PG got rid of the outer join
(e.g., explain doesn't show outer joins),
hence we can pushdown the subquery.
Changing to users_reference_table
- Fix unqualified column names for views in PG16
Relevant PG commit:
47bb9db759
47bb9db75996232ea71fc1e1888ffb0e70579b54
- Fix global_cancel test
Error wording and detail changed
Relevant PG commit:
2631ebab7b
2631ebab7b18bdc079fd86107c47d6104a6b3c6e
- Fix local_table_join_test with lateral subquery
Possible relevant PG commit:
ae89129aa3
ae89129aa3555c263b8c3ccc4c0f1ef7e46201aa
I removed the where clause and the limit count error was hit again.
With the where clause the query unexpectedly works.
- Fix test outputs
Relevant PG commits:
-- 1349d2790b
-- f4c7c410ee
For multi_explain and multi_complex_count_distinct there were too many places
touched so I just added an alternative test output.
For the other tests I modified the problematic parts.
More PG16 compatibility commits are coming soon ...
PG16 compatibility - part 7
Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb
part 7 ee3153fe50
This commit is in the series of PG16 compatibility commits. PG16 introduced a new entry
varnnullingrels to Var, which represents our partkey in pg_dist_partition.
This commit does the necessary changes in Citus to support this.
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d
More PG16 compatibility commits are coming soon ...
PG16 compatibility - part 7
Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb
This commit is in the series of PG16 compatibility commits. It makes some changes
to our tests in order to be compatible with the following in PG16:
- PG16 removed logic for converting a table to a view
Relevant PG commit:
b23cd185fd
b23cd185fd5410e5204683933f848d4583e34b35
- Fix changed error message in certificate verification
Relevant PG commit:
8eda731465
8eda7314652703a2ae30d6c4a69c378f6813a7f2
- Fix backend type order in tests
Relevant PG commit:
0c679464a8
0c679464a837079acc75ff1d45eaa83f79e05690
- Reduce log level to omit extra NOTICE in create collation in PG16
Relevant PG commit:
a14e75eb0b
a14e75eb0b6a73821e0d66c0d407372ec8376105
That commit made LOCALE parameter apply regardless of the
provider used, and it printed the following notice:
NOTICE: using standard form "und-u-ks-level2" for ICU locale "@colStrength=secondary"
We omit this notice to omit output change between pg versions.
- Fix columnar_memory test
TopMemoryContext now has more children contexts
Possible relevant PG commit:
9d3ebba729
9d3ebba729ebaf5882a92f0f5f662a3312037605
memusage is now around 8.5 MB, whereas it was less than 8MB before.
To avoid differences between PG versions, I changed the test to compare
to less than 9 MB. It still reflects very well the improvement from
28MB.
- Alternative test output for GRANTOR values in pg_auth_members
grantor changed in PG16
Relevant PG commit:
ce6b672e44
ce6b672e4455820a0348214be0da1a024c3f619f
- Remove redundant grouping columns from our tests
Relevant PG commit:
8d83a5d0a2
8d83a5d0a2673174dc478e707de1f502935391a5
- Fix tests with different order in Filters
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d
More PG16 compatibility commits are coming soon ...
PG16 compatibility - Part 4
Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
This commit is in the series of PG16 compatibility commits.
It adds some outer join checks to the planner,
the new password_required option to the subscription,
and a crash fix related to PGIOAlignedBlock, see below for more details:
- Fix PGIOAlignedBlock Assert crash in PG16
Relevant PG commit:
faeedbcefd
faeedbcefd40bfdf314e048c425b6d9208896d90
- Pass planner info as argument to make_simple_restrictinfo
Pre PG16 passing plannerInfo to make_simple_restrictinfo
was only needed for placeholder Vars, which is not the case
in this part of the codebase because we are building the
expression from shard intervals which don't have placeholder
vars.
However, PG16 is counting baserels appearing in clause_relids
and is deleting the rels mentioned in plannerinfo->outer_join_rels
Hence directly accessing plannerinfo.
We will crash if we leave it as NULL.
For reference
2489d76c49 (diff-e045c41eda9686451a7993e91518e40056b3739365e39eb1b70ae438dc1f7c76R207)
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d
- Add outer join checks, root->simple_rel_array
- fix rebalancer to include passwork_required option
Relevant PG commit:
c3afe8cf5a
c3afe8cf5a1e465bd71e48e4bc717f5bfdc7a7d6
More PG16 compatibility commits are coming soon ...
This commit is the second and last phase of dropping PG13 support.
It consists of the following:
- Removes all PG_VERSION_13 & PG_VERSION_14 from codepaths
- Removes pg_version_compat entries and columnar_version_compat entries
specific for PG13
- Removes alternative pg13 test outputs
- Removes PG13 normalize lines and fix the test outputs based on that
It is a continuation of 5bf163a27d
1) For distributed tables that are not colocated.
2) When joining on a non-distribution column for colocated tables.
3) When merging into a distributed table using reference or citus-local tables as the data source.
This is accomplished primarily through the implementation of the following two strategies.
Repartition: Plan the source query independently,
execute the results into intermediate files, and repartition the files to
co-locate them with the merge-target table. Subsequently, compile a final
merge query on the target table using the intermediate results as the data
source.
Pull-to-coordinator: Execute the plan that requires evaluation at the coordinator,
run the query on the coordinator, and redistribute the resulting rows to ensure
colocation with the target shards. Direct the MERGE SQL operation to the worker
nodes' target shards, using the intermediate files colocated with the data as the
data source.
DESCRIPTION: Drops PG13 Support
This commit is the first phase of dropping PG13 support.
It consists of the following:
- Removes pg13 from CI tests
Among other things, Citus upgrade tests should now use PG14.
Earliest Citus version supporting PG14 is 10.2.
We also pick 11.3 version for upgrade_pg_dist_cleanup tests.
Therefore, we run the citus upgrade tests with versions 10.2 and 11.3.
- Removes pg13 from configure script
- Remove upgrade_columnar_metapage upgrade tests
We populate first_row_number column of columnar.stripe table
during citus 10.1-10.2 upgrade. Given that we start from citus 10.2.0,
which is the oldest version supporting PG14, we don't have that
upgrade path anymore. Hence we remove these tests.
- Removes upgrade_pg_dist_object_test and upgrade_partition_constraints tests
These upgrade tests require the citus old version to be less than 10.0.
Given that we drop support for PG13, we run upgrade tests with PG14,
which starts with 10.2.
So we remove these upgrade tests.
- Documents that upgrade_post_11 should upgrade from version less than 11
In this way we make sure we run
citus_finalize_upgrade_to_citus11 script
- Adds needed alternative output for upgrade_citus_finish_citus_upgrade
Given that we use 11.3 as the citus old version as well,
we add this alternative output because pg_catalog.citus_finish_citus_upgrade()
makes sense if last_upgrade_major_version < 11. See below for reference:
pg_catalog.citus_finish_citus_upgrade():
...
IF last_upgrade_major_version < 11 THEN
PERFORM citus_finalize_upgrade_to_citus11();
performed_upgrade := true;
END IF;
IF NOT performed_upgrade THEN
RAISE NOTICE 'already at the latest distributed
schema version (%)', last_upgrade_version_string;
RETURN;
END IF;
...
And that's it :)
The second phase of dropping PG13 support will consist in removing
all the PG13 specific compilation paths/tests in the Citus repo.
Will be done soon.
DESCRIPTION: Enabling citus_stat_tenants to support schema-based
tenants.
This pull request modifies the existing logic to enable tenant
monitoring with schema-based tenants. The changes made are as follows:
- If a query has a partitionKeyValue (which serves as a tenant
key/identifier for distributed tables), Citus annotates the query with
both the partitionKeyValue and colocationId. This allows for accurate
tracking of the query.
- If a query does not have a partitionKeyValue, but its colocationId
belongs to a distributed schema, Citus annotates the query with only the
colocationId. The tenant monitor can then easily look up the schema to
determine if it's a distributed schema and make a decision on whether to
track the query.
---------
Co-authored-by: Jelte Fennema <jelte.fennema@microsoft.com>
Adds Support for Single Shard Tables in
`update_distributed_table_colocation`.
This PR changes checks that make sure tables should be hash distributed
table to hash or single shard distributed tables.
DESCRIPTION: Fixes a bug in background shard rebalancer where the
replicate reference tables task fails if the current user is not a
superuser.
This change is to be backported to earlier releases. We should fix the
permissions for replicate_reference_tables on main branch such that it
can be run by non-superuser roles.
Fixes#6925.
Fixes#6926.
When we bump columnar version, some tests fail because of the output
change. Instead of changing those lines every time, I think it is better
to normalize it in tests.
DESCRIPTION: Adds views that monitor statistics on tenant usages
This PR adds `citus_stats_tenants` view that monitors the tenants on the
cluster.
`citus_stats_tenants` shows the node id, colocation id, tenant
attribute, read count in this period and last period, and query count in
this period and last period of the tenant.
Tenant attribute currently is the tenant's distribution column value,
later when schema based sharding is introduced, this meaning might
change.
A period is a time bucket the queries are counted by. Read and query
counts for this period can increase until the current period ends. After
that those counts are moved to last period's counts, which cannot
change. The period length can be set using 'citus.stats_tenants_period'.
`SELECT` queries are counted as _read_ queries, `INSERT`, `UPDATE` and
`DELETE` queries are counted as _write_ queries. So in the view read
counts are `SELECT` counts and query counts are `SELECT`, `INSERT`,
`UPDATE` and `DELETE` count.
The data is stored in shared memory, in a struct named
`MultiTenantMonitor`.
`citus_stats_tenants` shows the data from local tenants.
`citus_stats_tenants` show up to `citus.stats_tenant_limit` number of
tenants.
The tenants are scored based on the number of queries they run and the
recency of those queries. Every query ran increases the score of tenant
by `ONE_QUERY_SCORE`, and after every period ends the scores are halved.
Halving is done lazily.
To retain information a longer the monitor keeps up to 3 times
`citus.stats_tenant_limit` tenants. When the tenant count hits `3 *
citus.stats_tenant_limit`, last `citus.stats_tenant_limit` tenants are
removed. To see all stored tenants you can use
`citus_stats_tenants(return_all_tenants := true)`
- [x] Create collector view that gets data from all nodes. #6761
- [x] Add monitoring log #6762
- [x] Create enable/disable GUC #6769
- [x] Parse the annotation string correctly #6796
- [x] Add local queries and prepared statements #6797
- [x] Rename to citus_stat_statements #6821
- [x] Run pgbench
- [x] Fix role permissions #6812
---------
Co-authored-by: Gokhan Gulbiz <ggulbiz@gmail.com>
Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
This implements the phase - II of MERGE sql support
Support routable query where all the tables in the merge-sql are distributed, co-located, and both the source and
target relations are joined on the distribution column with a constant qual. This should be a Citus single-task
query. Below is an example.
SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => ‘t1’);
MERGE INTO t1
USING s1 ON t1.id = s1.id AND t1.id = 100
WHEN MATCHED THEN
UPDATE SET val = s1.val + 10
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED THEN
INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src)
Basically, MERGE checks to see if
There are a minimum of two distributed tables (source and a target).
All the distributed tables are indeed colocated.
MERGE relations are joined on the distribution column
MERGE .. USING .. ON target.dist_key = source.dist_key
The query should touch only a single shard i.e. JOIN AND with a constant qual
MERGE .. USING .. ON target.dist_key = source.dist_key AND target.dist_key = <>
If any of the conditions are not met, it raises an exception.
(cherry picked from commit 44c387b978)
This implements MERGE phase3
Support pushdown query where all the tables in the merge-sql are Citus-distributed, co-located, and both
the source and target relations are joined on the distribution column. This will generate multiple tasks
which execute independently after pushdown.
SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => ‘t1’);
MERGE INTO t1
USING s1
ON t1.id = s1.id
WHEN MATCHED THEN
UPDATE SET val = s1.val + 10
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED THEN
INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src)
*The only exception for both the phases II and III is, UPDATEs and INSERTs must be done on the same shard-group
as the joined key; for example, below scenarios are NOT supported as the key-value to be inserted/updated is not
guaranteed to be on the same node as the id distribution-column.
MERGE INTO target t
USING source s ON (t.customer_id = s.customer_id)
WHEN NOT MATCHED THEN - -
INSERT(customer_id, …) VALUES (<non-local-constant-key-value>, ……);
OR this scenario where we update the distribution column itself
MERGE INTO target t
USING source s On (t.customer_id = s.customer_id)
WHEN MATCHED THEN
UPDATE SET customer_id = 100;
(cherry picked from commit fa7b8949a8)
DESCRIPTION: Drop `SHARD_STATE_TO_DELETE` and use the cleanup records
instead
Drops the shard state that is used to mark shards as orphaned. Now we
insert cleanup records into `pg_dist_cleanup` so "orphaned" shards will
be dropped either by maintenance daemon or internal cleanup calls. With
this PR, we make the "cleanup orphaned shards" functions to be no-op, as
they would not be needed anymore.
This PR includes some naming changes about placement functions. We don't
need functions that filter orphaned shards, as there will be no orphaned
shards anymore.
We will also be introducing a small script with this PR, for users with
orphaned shards. We'll basically delete the orphaned shard entries from
`pg_dist_placement` and insert cleanup records into `pg_dist_cleanup`
for each one of them, during Citus upgrade.
We also have a lot of flakiness fixes in this PR.
Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
Fixes task executor SIGTERM handling.
Problem:
When task executors are sent SIGTERM, their default handler
`bgworker_die`, which is set at worker startup, logs FATAL error. But
they do not release locks there before logging the error, which
sometimes causes hanging of the monitor. e.g. Monitor waits for the lock
forever at pg_stat flush after calling proc_exit.
Solution:
Because executors have connection to backend, they should handle SIGTERM
similar to normal backends. Normal backends uses `die` handler, in which
they set ProcDiePending flag and the next CHECK_FOR_INTERRUPTS call
handles it gracefully by releasing any lock before termination.
DESCRIPTION: Create replication artifacts with unique names
We're creating replication objects with generic names. This disallows us
to enable parallel shard moves, as two operations might use the same
objects. With this PR, we'll create below objects with operation
specific names, by appending OparationId to the names.
* Subscriptions
* Publications
* Replication Slots
* Users created for subscriptions
When debugging issues it's quite useful to see the originating gpid in
the application_name of a query on a worker. This already happens for
most queries, but not for queries created by the rebalancer or by
run_command_on_worker. This adds a gpid to those two application_names
too.
Note, that if the GPID of the new application_names is different than
the current GPID of the backend the backend will continue to keep
the old gpid as its actual GPID. This PR is just meant to make sure
that the application_name is as useful as it can be for users to
look at. Updating of gpids will be done in a follow-up PR, and
adding gpids to all internal connections will make this easier.
Fixes a bug that causes crash when using auto_explain extension with
ALTER TABLE...ADD FOREIGN KEY... queries.
Those queries trigger a SELECT query on the citus tables as part of the
foreign key constraint validation check. At the explain hook, workers
try to explain this SELECT query as a distributed query causing memory
corruption in the connection data structures. Hence, we will not explain
ALTER TABLE...ADD FOREIGN KEY... and the triggered queries on the
workers.
Fixes#6424.
increasing logical clock. Clock guarantees to never go back in value after restarts,
and makes best attempt to keep the value close to unix epoch time in milliseconds.
Also, introduces a new GUC "citus.enable_cluster_clock", when true, every
distributed transaction is stamped with logical causal clock and persisted
in a catalog pg_dist_commit_transaction.
DESCRIPTION: Add a rebalancer that uses background tasks for its
execution
Based on the baclground jobs and tasks introduced in #6296 we implement
a new rebalancer on top of the primitives of background execution. This
allows the user to initiate a rebalance and let Citus execute the long
running steps in the background until completion.
Users can invoke the new background rebalancer with `SELECT
citus_rebalance_start();`. It will output information on its job id and
how to track progress. Also it returns its job id for automation
purposes. If you simply want to wait till the rebalance is done you can
use `SELECT citus_rebalance_wait();`
A running rebalance can be canelled/stopped with `SELECT
citus_rebalance_stop();`.
* Adjust configure script to allow PG15
* Adds copy of ruleutils_14.c as ruleutils_15.c
* Uses get_namespace_name_or_temp in ruleutils_15.c
Relevant PG commit:
48c5c9068211e0a04fd9553c8714b2821ed3ad17
* Clean up code using "(expr) ? true : false" in ruleutils_15.c
Relevant PG commit:
fd0625c7a9c679c0c1e896014b8f49a489c3a245
* Change varno from Index (unsigned int) to int in ruleutils_15.c
Relevant PG commit:
e3ec3c00d85bd2844ffddee83df2bd67c4f8297f
* Adds find_recursive_union to ruleutils_15.c
Relevant PG commit:
3f50b82639637c9908afa2087de7588450aa866b
* Fix display of SQL-std func's args in INSERT/SELECT in ruleutils_15.c
Relevant PG commit:
a8d8445a7b2f80f6d0bfe97b19f90bd2cbef8759
* Fix ruleutils_15.c's dumping of whole-row Vars in more contexts
Relevant PG commit:
43c2175121c829c8591fc5117b725f1f22bfb670
* Fix assorted missing logic for GroupingFunc nodes in ruleutils_15.c
Relevant PG commit:
2591ee8ec44d8cbc8e1226550337a64c684746e4
* Adds grammar support for SQL/JSON clauses in ruleutils_15.c
Relevant PG commit:
f79b803dcc98d707450e158db3638dc67ff8380b
* Adds SQL/JSON constructors to ruleutils_15.c
Relevant PG commits:
f4fb45d15c59d7add2e1b81a9d477d0119a9691a
cc7401d5ca498a84d9b47fd2e01cebd8e830e558
* Adds support for MERGE in ruleutils_15.c
Relevant PG commit:
7103ebb7aae8ab8076b7e85f335ceb8fe799097c
* Add IS JSON predicate to ruleutils_15.c
Relevant PG commit:
33a377608fc29cdd1f6b63be561eab0aee5c81f0
* Add SQL/JSON query functions to ruleutils_15.c
Relevant PG commit:
1a36bc9dba8eae90963a586d37b6457b32b2fed4
* Adds three different SQL/JSON values to ruleutils_15.c
Relevant PG commits:
606948b058dc16bce494270eea577011a602810e
49082c2cc3d8167cca70cfe697afb064710828ca
* Adds JSON table functions in ruleutils_15.c
Relevant PG commit:
4e34747c88a03ede6e9d731727815e37273d4bc9
* Add PLAN function for JSON table in ruleutils_15.c
Relevant PG commit:
fadb48b00e02ccfd152baa80942de30205ab3c4f
* Remove extra blank lines before block-closing braces ruleutils_15.c
Relevant PG commit:
24d2b2680a8d0e01b30ce8a41c4eb3b47aca5031
* set_deparse_plan: Reuse variable to appease Coverity ruleutils_15.c
Relevant PG commit:
e70813fbc4aaca35ec012d5a426706bd54e4acab
* Mechanical code beautification ruleutils_15.c
Relevant PG commit:
23e7b38bfe396f919fdb66057174d29e17086418
* Rename value_type to item_type in ruleutils_15.c
Relevant PG commit:
3ab9a63cb638a1fd99475668e2da9c237495aeda
* Show 'AS "?column?"' explicitly when it's important in ruleutils_15.c
Relevant PG commit:
c7461fc25558832dd347a9c8150b0f1ed85e36e8
* Fix ruleutils_15.c issues with dropped cols in funcs-returning-composite
Relevant PG commit:
c1d1e8469c77ce6b8e5310955580b4a3eee7fe96
* Change comment regarding functions returning composite in ruleutils_15.c
Relevant PG commit:
c2fa113ddb1117b1f03e91960f65d5d7d8a90270
* Replace int nodes with bool nodes where needed
In PG15, Boolean nodes are added. Pre PG15, internal Boolean values
in Create Role commands were represented by Integer nodes. This
commit replaces int nodes logic with bool nodes logic where needed.
Mostly there are CREATE ROLE logic changes.
Relevant PG commit:
941460fcf731a32e6a90691508d5cfa3d1f8eeaf
* Handle new option colliculocale in CREATE COLLATION logic
In PG15, there is an added option to use ICU as global locale provider.
pg_collation has three locale-related fields: collcollate and collctype,
which are libc-related fields, and a new one colliculocale, which is the
ICU-related field. Only the libc-related fields or the ICU-related field
is set, never both.
Relevant PG commits:
f2553d43060edb210b36c63187d52a632448e1d2
54637508f87bd5f07fb9406bac6b08240283be3b
* Add PG15 tests to CI using test images that have 15beta2 (#6093)
* Change warning message in pg_signal_backend()
Relevant PG commit:
7fa945b857cc1b2964799411f1633468826861ff
* Revert "Add missing ifdef for PG 15"
This reverts commit c7b51025ab.
* Fixes tests for ALTER TRIGGER RENAME consistency for part. tables
Relevant PG commit:
80ba4bb383538a2ee846fece6a7b8da9518b6866
* Prevent creating child triggers on partitions when adding new node
Pre PG15, tgisinternal is true for a "child" trigger on a partition
cloned from the trigger on the parent.
In PG15, tgisinternal is false in that case. However, we don't want to
create this trigger on the partition since it will create a conflict
when we try to attach the partition to the parent table:
ERROR: trigger "..." for relation "{partition_name}" already exists
Relevant PG commit:
f4566345cf40b068368cb5617e61318da60676ec
* Fix tests for generated columns dependency changes
In PG15, For GENERATED columns, all dependencies of the generation
expression are recorded as NORMAL dependencies of the column itself.
This requires CASCADE to drop generated cols with the original col.
PRE PG15, dependencies were recorded as AUTO, with which
generated columns are silently dropped with the original column.
Relevant PG commit:
cb02fcb4c95bae08adaca1202c2081cfc81a28b5
* Explicitly cast catalog "char" column to text before concatenation
Relevant PG commit:
07eee5a0dc642d26f44d65c4e6263304208e8583
* Remove 'AS "?column?"' from test outputs
There were some instances in the following tst outputs
in planning debug outputs where AS "?column?" is added.
We add a normalization rule to remove it as it is not
important.
cte_inline.out
recursive_relation_planning_restriction_pushdown.out
Relevant PG commit:
c7461fc25558832dd347a9c8150b0f1ed85e36e8
* Use pg_backup_stop(PG15) instead of pg_stop_backup(PG<15)
Add an alternative test output because of the change in the
backup modes of Postgres. Specifically here, there is a renaming
issue: pg_stop_backup PRE PG15 vs pg_backup_stop PG15+
The alternative output can be deleted when we drop support for PG14
Relevant PG commit:
39969e2a1e4d7f5a37f3ef37d53bbfe171e7d77a
* Adds citus.mitmfifo GUC
Previously we setting this configuration parameter
in the fly for failure tests schedule.
However, PG15 doesn't allow that anymore: reserved prefixes
like "citus" cannot be used to set non-existing GUCs.
Relevant PG commit:
88103567cb8fa5be46dc9fac3e3b8774951a2be7
* Handles EXPLAIN output diffs in PG15 - Extra result lines
To handle extra "Result" lines in explain outputs, we add explain
method to multi_test_helpers.sql file
- plan_without_result_lines() is added for cases where we want the
whole explain output with only "Result" lines removed
* Handles EXPLAIN output diffs in PG15, Hash Agg/Join leverage
To handle differences in usage of GroupAggregate vs HashAggregate
or Merge Join vs Hash join in cases where this detail doesn't
seem to matter, we use coordinator_plan().
- coordinator_plan() is updated to remove "Result" lines
There are some cases where we have subplans so we add a new
function that prints all Task Count lines as well
- coordinator_plan_with_subplans()
Still not sure of the relevant PG commit
Could be db0d67db2401eb6238ccc04c6407a4fd4f985832
but disabling enable_group_by_reordering didn't help.
* Handles EXPLAIN output diffs in PG15: enable_group_by_reordering
Relevant PG commit
db0d67db2401eb6238ccc04c6407a4fd4f985832
* Normalizes Memory Usage, Buckets, Batches for PG15 explain diffs
We create a new function in multi_test_helpers, which is similar
to explain_merge function in PG15. This explain helper function
normalies Memory Usage, Buckets and Batches, and we use it in the
tests which give a different output for PG15.
* Bump test images to 15beta3 (#6172)
* Omit namespace in post-copy errmsg
Relevant PG commit:
069d33d0c5a021601245e44df77a0423ddd69359
* Handles EXPLAIN output diffs in PG15: extra arrows&result lines
To handle extra "->" arrows resulting from extra Result lines
in explain outputs, we add the following explain method to
multi_test_helpers.sql file
- plan_without_arrows() is added for cases where we want the
whole explain output without arrows and without Result lines
* Alters public schema's owner to pg_database_owner in PG15
In PG15, public schema is owned by pg_database_owner role.
In multi_extension, we drop and recreate the ppublic schema,
hence its owner become the default user in our tests, postgres.
Change that to pg_database_owner for PG15 consistency.
This results in alternative test output for public schema grants
in the following test:
grant_on_schema_propagation.sql
Relevant PG commit: b073c3ccd06e4cb845e121387a43faa8c68a7b62
* Add alternative test outputs for change in Insert Select display
citus_local_tables_queries.sql
coordinator_shouldhaveshards.sql
cte_inline.sql
insert_select_repartition.sql
intermediate_result_pruning.sql
local_shard_execution.sql
local_shard_execution_replicated.sql
multi_deparse_shard_query.sql
multi_insert_select.sql
multi_insert_select_conflict.sql
multi_mx_insert_select_repartition.sql
mx_coordinator_shouldhaveshards.sql
single_node.sql
Relevant PG commit:
a8d8445a7b2f80f6d0bfe97b19f90bd2cbef8759
* Fixes columnar tap tests for PG15
In PG15, Perl test modules have been moved to a new namespace.
Also, postgres node new() and get_new_node() methods have been
unified to one method: new()
We create separate tap tests for PG13/14 and PG15+
and update the Makefiles accordingly.
Relevant PG commits:
201a76183e2056c2217129e12d68c25ec9c559c8
b3b4d8e68ae83f432f43f035c7eb481ef93e1583
* Handles EXPLAIN output diffs in PG15: HashAgg Leverage,alt. output
Still not sure of the relevant PG commit
Could be db0d67db2401eb6238ccc04c6407a4fd4f985832
but disabling enable_group_by_reordering didn't help.
**Intro**
This adds support to Citus to change the CPU priority values of
backends. This is created with two main usecases in mind:
1. Users might want to run the logical replication part of the shard moves
or shard splits at a higher speed than they would do by themselves.
This might cause some small loss of DB performance for their regular
queries, but this is often worth it. During high load it's very possible
that the logical replication WAL sender is not able to keep up with the
WAL that is generated. This is especially a big problem when the
machine is close to running out of disk when doing a rebalance.
2. Users might have certain long running queries that they don't impact
their regular workload too much.
**Be very careful!!!**
Using CPU priorities to control scheduling can be helpful in some cases
to control which processes are getting more CPU time than others.
However, due to an issue called "[priority inversion][1]" it's possible that
using CPU priorities together with the many locks that are used within
Postgres cause the exact opposite behavior of what you intended. This
is why this PR only allows the PG superuser to change the CPU priority
of its own processes. Currently it's not recommended to set `citus.cpu_priority`
directly. Currently the only recommended interface for users is the setting
called `citus.cpu_priority_for_logical_replication_senders`. This setting
controls CPU priority for a very limited set of processes (the logical
replication senders). So, the dangers of priority inversion are also limited
with when using it for this usecase.
**Background**
Before reading the rest it's important to understand some basic
background regarding process CPU priorities, because they are a bit
counter intuitive. A lower priority value, means that the process will
be scheduled more and whatever it's doing will thus complete faster. The
default priority for processes is 0. Valid values are from -20 to 19
inclusive. On Linux a larger difference between values of two processes
will result in a bigger difference in percentage of scheduling.
**Handling the usecases**
Usecase 1 can be achieved by setting `citus.cpu_priority_for_logical_replication_senders`
to the priority value that you want it to have. It's necessary to set
this both on the workers and the coordinator. Example:
```
citus.cpu_priority_for_logical_replication_senders = -10
```
Usecase 2 can with this PR be achieved by running the following as
superuser. Note that this is only possible as superuser currently
due to the dangers mentioned in the "Be very carefull!!!" section.
And although this is possible it's **NOT** recommended:
```sql
ALTER USER background_job_user SET citus.cpu_priority = 5;
```
**OS configuration**
To actually make these settings work well it's important to run Postgres
with more a more permissive value for the 'nice' resource limit than
Linux will do by default. By default Linux will not allow a process to
set its priority lower than it currently is, even if it was lower when
the process originally started. This capability is necessary to reset
the CPU priority to its original value after a transaction finishes.
Depending on how you run Postgres this needs to be done in one of two
ways:
If you use systemd to start Postgres all you have to do is add a line
like this to the systemd service file:
```conf
LimitNice=+0 # the + is important, otherwise its interpreted incorrectly as 20
```
If that's not the case you'll have to configure `/etc/security/limits.conf`
like so, assuming that you are running Postgres as the `postgres` OS user:
```
postgres soft nice 0
postgres hard nice 0
```
Finally you'd have add the following line to `/etc/pam.d/common-session`
```
session required pam_limits.so
```
These settings would allow to change the priority back after setting it
to a higher value.
However, to actually allow you to set priorities even lower than the
default priority value you would need to change the values in the
config to something lower than 0. So for example:
```conf
LimitNice=-10
```
or
```
postgres soft nice -10
postgres hard nice -10
```
If you use WSL2 you'll likely have to do another thing. You have to
open a new shell, because when PAM is only used during login, and
WSL2 doesn't actually log you in. You can force a login like this:
```
sudo su $USER --shell /bin/bash
```
Source: https://stackoverflow.com/a/68322992/2570866
[1]: https://en.wikipedia.org/wiki/Priority_inversion
This creates consistent test output for isolation tests that involve
`CREATE INDEX CONCURRENTLY`. `CREATE INDEX CONCURRENTLY` is sometimes
temporarily detected as blocking, even though it will complete without any other
queries needing to be run. This change makes sure that we wait until that happens
without running any other queries in the meantime. This way we always get consistent
output. The way we do that is addressed by using an empty step in the same
session as the `CREATE INDEX CONCURRENLTY` command. Doing so forces
the isolation tester to wait until the command is finished and not continue with
steps from other sessions. This is [the recommended approach by Postgres][1].
There's two separate cases which are addressed in slightly different ways:
1. If `CREATE INDEX CONCURRENTLY` is actually blocked on another session: Add an
empty step right after the commit of blocking session.
e.g. `"s2-ddl-create-index-concurrently" "s1-commit" "s2-empty"`
2. If it's not actually blocked on another session: Add [an asterisk marker][2] to make
it look like it's blocked (because sometimes this happens randomly) and right
after that we add an empty step to trigger waiting.
e.g. `"s2-ddl-create-index-concurrently"(*) "s2-empty" "s1-commit"`
In passing this also enables isolation tests that were disabled due to a
bug that has already been fixed for a while.
Fixes#5993
Related to #5910 and #2966
[1]: 5f0adec253/src/test/isolation/README (L197-L204)
[2]: 5f0adec253/src/test/isolation/README (L174-L179)
Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
Historically we have been testing with the 'latest' version of libpq
when the CI images were build. This has the downside that rebuilding the
images often break our tests due to different errors returned from
libpq.
With this change we will actually test with a stable version of libpq
that is based on the postgres minor version that we test against.
This will make it easier to maintain postgres images over time, as well
as running _all_ tests locally, where we change libpq in sync with the
postgres server version.
* Remove if conditions with PG_VERSION_NUM < 13
* Remove server_above_twelve(&eleven) checks from tests
* Fix tests
* Remove pg12 and pg11 alternative test output files
* Remove pg12 specific normalization rules
* Some more if conditions in the code
* Change RemoteCollationIdExpression and some pg12/pg13 comments
* Remove some more normalization rules
This PR makes all of the features open source that were previously only
available in Citus Enterprise.
Features that this adds:
1. Non blocking shard moves/shard rebalancer
(`citus.logical_replication_timeout`)
2. Propagation of CREATE/DROP/ALTER ROLE statements
3. Propagation of GRANT statements
4. Propagation of CLUSTER statements
5. Propagation of ALTER DATABASE ... OWNER TO ...
6. Optimization for COPY when loading JSON to avoid double parsing of
the JSON object (`citus.skip_jsonb_validation_in_copy`)
7. Support for row level security
8. Support for `pg_dist_authinfo`, which allows storing different
authentication options for different users, e.g. you can store
passwords or certificates here.
9. Support for `pg_dist_poolinfo`, which allows using connection poolers
in between coordinator and workers
10. Tracking distributed query execution times using
citus_stat_statements (`citus.stat_statements_max`,
`citus.stat_statements_purge_interval`,
`citus.stat_statements_track`). This is disabled by default.
11. Blocking tenant_isolation
12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`
We remove `<waiting ...>` and `<... completed>` outputs for some CREATE
INDEX CONCURRENTLY commands since they can cause flakiness in some scenarios.
Postgres calls WaitForOlderSnapshots() and this can cause CREATE INDEX
CONCURRENTLY commands for shards to get blocked by each other for brief
periods of time. The extra waits can pop-up, or they can get completed
at different lines in the output files. To remedy that, we rename those
indexes to be captured by the new normalization rule.
We were already doing so for functions & types believing that
this cannot be the case for other object types.
However, as in #5830, we cannot distribute an object that user
attempts creating in temp schema. Even more, this doesn't only
apply to functions and types but also to many other object types.
So with this commit, we teach preprocess/postprocess functions
(that need to create dependencies on worker nodes) how to skip
trying to distribute such objects.
We also start identifying temp schemas as the objects that we
don't know how to propagate to worker nodes so that we can
simply create objects locally if user attempts creating them
in a temp schema.
There are 36 callers of `EnsureDependenciesExistOnAllNodes` in
the codebase atm and for the most we still need to throw a hard
error (i.e.: not use `DeferErrorIfHasUnsupportedDependency`
beforehand), such as:
i) user explicitly wants to create a distributed object
* CreateCitusLocalTable
* CreateDistributedTable
* master_create_worker_shards
* master_create_empty_shard
* create_distributed_function
* EnsureExtensionFunctionCanBeDistributed
ii) we don't want to skip altering distributed table on worker nodes
* PostprocessIndexStmt
* PostprocessCreateTriggerStmt
* PostprocessCreateStatisticsStmt
iii) object is already distributed / handled by Citus before, so we
aren't okay with not propagating the ALTER command
* PostprocessAlterTableSchemaStmt
* PostprocessAlterCollationOwnerStmt
* PostprocessAlterCollationSchemaStmt
* PostprocessAlterDatabaseOwnerStmt
* PostprocessAlterExtensionSchemaStmt
* PostprocessAlterFunctionOwnerStmt
* PostprocessAlterFunctionSchemaStmt
* PostprocessAlterSequenceOwnerStmt
* PostprocessAlterSequenceSchemaStmt
* PostprocessAlterStatisticsSchemaStmt
* PostprocessAlterStatisticsOwnerStmt
* PostprocessAlterTextSearchConfigurationSchemaStmt
* PostprocessAlterTextSearchDictionarySchemaStmt
* PostprocessAlterTextSearchConfigurationOwnerStmt
* PostprocessAlterTextSearchDictionaryOwnerStmt
* PostprocessAlterTypeSchemaStmt
* PostprocessAlterForeignServerOwnerStmt
iv) we already cannot create those objects in temp schemas, so skipping
for now
* PostprocessCreateExtensionStmt
* PostprocessCreateForeignServerStmt
Also note that there are 3 more callers of
`EnsureDependenciesExistOnAllNodes` in enterprise in addition to those
36 but we don't need to do anything specific about them due to the same
reasoning given in iii).
* Refactor some checks in citus local tables
* all existing citus local tables are auto converted after upgrade
* Update warning messages in CreateCitusLocalTable
* Hide notice msg for auto converting local tables
* Hide hint msg
Co-authored-by: Ahmet Gedemenli <afgedemenli@gmail.com>
With Citus 9.0, we introduced `citus.single_shard_commit_protocol` which
defaults to 2PC.
With this commit, we prevent any user to set it to 1PC and drop support
for `citus.single_shard_commit_protocol`.
Although this might add some overhead for users, it is already the default
behaviour (so less likely) and marking placements as INVALID is much
worse.
- get_missing_time_partition_ranges: Gets the ranges of missing partitions for the given table, interval and range unless any existing partition conflicts with calculated missing ranges.
- create_time_partitions: Creates partitions by getting range values from get_missing_time_partition_ranges.
- drop_old_time_partitions: Drops partitions of the table older than given threshold.