Compare commits

162 Commits

Author SHA1 Message Date
Hanefi Onaldi 7603fe510a
Bump Citus version to 11.0.8 2023-04-25 14:23:44 +03:00
Hanefi Onaldi 367c22ac11
Add changelog entries for 11.0.8
(cherry picked from commit 214bc39a5a)
2023-04-25 14:23:43 +03:00
Gürkan İndibay f9b02ae2c7
Fix packaging test pipelines
#6737 fixed the same issue on the main branch, but we need to fix it on
release branches as well. As the patch does not apply cleanly to earlier
release branches, we added the fix manually.

(cherry picked from commit 4f9a344085)
2023-04-25 14:23:43 +03:00
Jelte Fennema 360537bc42 Use pg_total_relation_size in citus_shards (#6748)
DESCRIPTION: Correctly report shard size in citus_shards view

When looking at citus_shards, people are interested in the actual size
that all the data related to the shard takes up on disk.
`pg_total_relation_size` is the function to use for that purpose. The
previously used `pg_relation_size` does not include indexes or TOAST.
Especially the missing TOAST data can have an enormous impact on the
reported size.
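
A minimal SQL illustration of the difference described above (the shard relation name is made up):

```sql
-- Hypothetical shard relation name, for illustration only.
-- pg_relation_size() measures just the main fork of the table, while
-- pg_total_relation_size() also counts indexes and TOAST data.
SELECT pg_relation_size('dist_table_102008')       AS main_fork_only,
       pg_total_relation_size('dist_table_102008') AS with_indexes_and_toast;
```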

(cherry picked from commit b489d763e1)
2023-03-06 11:39:21 +01:00
aykut-bozkurt 01506e8a57 fix single tuple result memory leak (#6724)
We should not forget to free the PGresult when we receive a single-tuple
result from an internal backend.
Single-tuple results are normally freed by our ReceiveResults for the
`tupleDescriptor != NULL` flow, but not for those with `tupleDescriptor
== NULL`. See PR #6722 for details.

DESCRIPTION: Fixes a memory leak issue with query results that return a
single row.

(cherry picked from commit 9e69dd0e7f)
2023-02-17 14:37:06 +03:00
Jelte Fennema 5729f9e690 Quote all identifiers that we use for logical replication (#6604)
In #6598 it was noticed that Citus could generate syntactically invalid
statements during logical replication. With #6603 we resolved the direct
issue by only generating valid subscription names. But there was also
the underlying problem that we did not escape certain identifier
strings. While in theory this should be okay, since we should only
generate names that are valid, this issue reiterated that we should not
take this for granted. As an extra line of defense, this change quotes all
identifiers we use during logical replication setup.
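
A rough sketch of the defensive-quoting idea in SQL terms; the actual change lives in the C code, and the names and connection string below are made up:

```sql
-- Build the command with %I (identifier) and %L (literal) so that unusual
-- characters in generated names are escaped instead of breaking the DDL.
SELECT format(
    'CREATE SUBSCRIPTION %I CONNECTION %L PUBLICATION %I',
    'citus_shard_move_subscription_10',      -- hypothetical generated name
    'host=worker-1 port=5432 dbname=citus',  -- hypothetical connection string
    'citus_shard_move_publication_10');      -- hypothetical generated name
```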

(cherry picked from commit c2b4087ff0)
2023-02-10 16:26:46 +01:00
Gürkan İndibay be8cd00d3f Fixes validate Output phase of packaging pipeline (#6678)
Pyenv is installed in our container images, but I found out that it is not
being activated, since it is activated from the ~/.bashrc script and in
GitHub Actions (GHA) this script is not executed.
Since pyenv is not activated, the default Python version that comes with the
Docker images is used, and in this case we get errors for Python version
3.11.
Additionally, the $HOME directory is /github/home for containers executed
under GHA, while our pyenv installation is under /root, which is normally
the home directory for our packaging containers.
This PR activates pyenv and additionally uses the pyenv virtualenv
feature to execute the validate_output function in isolation.

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
(cherry picked from commit d919506076)
2023-01-31 14:01:05 +03:00
Onur Tirtir b9e18406fa Fall-back to seq-scan when accessing columnar metadata if the index doesn't exist
Fixes #6570.

In the past, having columnar tables in the cluster was causing pg
upgrades to fail when attempting to access columnar metadata. This is
because pg_dump doesn't see the objects that we use for columnar-am related
bookkeeping as dependencies of the tables using columnar-am.
To fix that, in #5456 we inserted some "normal dependency" edges (from
those objects to columnar-am) into pg_depend.

This helped us ensure the existence of a class of metadata objects
--such as columnar.storageid_seq-- and helped fix #5437.

However, the normal-dependency edges that we added for indexes on
columnar metadata tables --such as columnar.stripe_pkey-- didn't help at
all because they were indeed causing dependency loops (#5510) and
pg_dump was not able to take those dependency edges into account.

For this reason, instead of inserting such dependency edges from indexes
to columnar-am, we allow columnar metadata accessors to fall back to
a sequential scan during pg upgrades.

(cherry picked from commit 1c51ddae49)
2023-01-30 19:13:58 +03:00
aykut-bozkurt 3eab345b67 fix dropping table_name option from foreign table (#6669)
We should disallow dropping the table_name option if the foreign table is in
metadata. Otherwise, we get a "table not found" error which contains the
shardid.
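
The kind of statement that is now rejected when the foreign table is tracked in Citus metadata; a sketch with a made-up table name:

```sql
-- Dropping table_name on a foreign table that is in Citus metadata
-- previously surfaced a confusing "table not found" error that referenced
-- the shard-suffixed name; it is now disallowed instead.
ALTER FOREIGN TABLE foreign_table_example
    OPTIONS (DROP table_name);
```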

DESCRIPTION: Fixes an unexpected foreign table error by disallowing dropping the table_name option.

Fixes #6663

(cherry picked from commit 8a9bb272e4)
2023-01-30 17:48:47 +03:00
Gokhan Gulbiz 9b441c21da Allow plain pg foreign tables without a table_name option (#6652)
(cherry picked from commit 4e26464969)
2023-01-30 17:48:07 +03:00
Ahmet Gedemenli 2027d592a1 Fix crash when trying to replicate a ref table that is actually dropped (#6595)
DESCRIPTION: Fix crash when trying to replicate a ref table that is actually dropped

see #6592
We should have a real solution for it.

(cherry picked from commit bc3383170e)
(cherry picked from commit 9e32e34313)
2023-01-10 14:50:01 +03:00
Ahmet Gedemenli 1f03f13665 Use %u instead of %i for naming subscriptions & roles 2023-01-06 17:01:39 +03:00
Gürkan İndibay db7bd8e1a3
Add jobs to test builds on different distros (release-11.0) (#6542)
With this PR, Citus code will be tested in all packaging environments.
Sometimes there can be compile errors which block packaging, and in
that case unplanned delays may occur.
By testing the code in packaging environments, I'm aiming to detect any
compilation errors before packaging.

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>

2022-12-05 14:23:42 +03:00
Jelte Fennema bd7ee60991 Correctly fix OpenSSL 3.0 warnings (#6502)
In #6038 I tried to fix OpenSSL 3.0 warnings with PG13, but I had made a
mistake when doing that. This actually fixes these warnings.

(cherry picked from commit a477ffdf4b)
2022-12-05 09:30:57 +01:00
Jelte Fennema a9a9aa1098 Fix compilation warning on PG13 + OpenSSL 3.0 (#6038)
This removes some warnings that are present when building on Ubuntu 22.04.
It removes warnings on PG13 + OpenSSL 3.0. OpenSSL 3.0 has marked some
functions that we use as deprecated, but we want to continue to support
OpenSSL 1.0.1 for the time being too. This change indicates that to OpenSSL
3.0, so it doesn't show warnings.

(cherry picked from commit 3fadb98380)
2022-12-05 09:30:54 +01:00
Teja Mupparti 9aea377ce5 Fix the dangling pointer bug in get_merged_argument_list()
(cherry picked from commit edaf88e0ff)
2022-11-22 10:49:04 -08:00
Onur Tirtir 96e20ee42e Fix dangling pointer warning in AnyTableReplicated (#6504)
DESCRIPTION: Fixes a potential dangling pointer issue

We need to backport this to 11.0 & 11.1 since we might want to release packages
for debian/bookworm based on those branches in the future.

(cherry picked from commit 80faf47ab5)
2022-11-21 16:43:28 +03:00
Hanefi Onaldi a005db871a
Bump Citus version to 11.0.7 2022-11-08 12:12:17 +03:00
Hanefi Onaldi 011f5cfa83
Add changelog entries for 11.0.7 2022-11-08 12:11:16 +03:00
Naisila Puka b1a72bb822 Fixes empty password issue (#6417)
(cherry picked from commit 89aa9a015f)
2022-10-11 15:59:12 +03:00
Onur Tirtir 39225963c7 Use memcpy instead of memcpy_s to avoid pointless limits in columnar (#6419)
DESCRIPTION: Raises memory limits in columnar from 256MB to 1GB for
reads and writes

This doesn't completely fix #5918 but at least increases the
buffer limits that might otherwise cause an error when reading
from or writing into columnar storage. A way better approach
to fix this is documented in #6420.

Replacing memcpy_s with memcpy is quite safe in those places
since we already make sure to allocate enough memory
before writing into the related buffers.

(cherry picked from commit 0b81f68def)
2022-10-11 15:01:12 +03:00
Onur Tirtir 99697fb1e5 Fix use-after-free in GetAlterTriggerStateCommand() (#6413)
Fix use-after-free in GetAlterTriggerStateCommand() introduced in #6398.

(cherry picked from commit 517b72a9d5)
2022-10-10 16:38:52 +03:00
Onur Tirtir 9929d9240e Retain trigger settings when re-creating the triggers (on shards) (#6398)
Fixes https://github.com/citusdata/citus/issues/6394.

DESCRIPTION: Fixes a bug that causes disabled triggers to be created as
enabled on shards

Since CREATE TRIGGER doesn't have syntax support to specify
whether the trigger should be enabled/disabled, the underlying
PG function (`pg_get_triggerdef()`) that we use to generate the
command to create the trigger is not enough. For this reason, we
append a second command to enable/disable the trigger right after
creating it.
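
A sketch of the resulting two-step command sequence for a shard, assuming illustrative table, trigger, and function names:

```sql
-- Shard-level recreation of a trigger that was disabled on the distributed
-- table; all names are made up.
CREATE TABLE example_table_102008 (a int);
CREATE FUNCTION example_trigger_func() RETURNS trigger AS
$$ BEGIN RETURN NEW; END; $$ LANGUAGE plpgsql;

-- CREATE TRIGGER cannot express the disabled state ...
CREATE TRIGGER example_trigger
    BEFORE INSERT ON example_table_102008
    FOR EACH ROW EXECUTE FUNCTION example_trigger_func();
-- ... so a second command restores the disabled state on the shard.
ALTER TABLE example_table_102008 DISABLE TRIGGER example_trigger;
```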

We also don't retain explicit extension dependencies set by using
`ALTER TRIGGER ... DEPENDS ON EXTENSION` commands, but apparently the
right fix for that is to throw an error as in
`PreprocessAlterTriggerDependsStmt()`; so we opened a separate PR,
#6399, to fix that.

(cherry picked from commit 86e186f671)
2022-10-10 11:24:01 +03:00
Ying Xu d3757ff15d
[Columnar] 11.0 Cherry-Pick Bugfix for Columnar: options ignored during ALTER TABLE (#6410)
DESCRIPTION: Fixes a bug that prevents retaining columnar table options
after a table rewrite.

A fix for this issue: Columnar: options ignored during ALTER TABLE
rewrite (#5927).
The OID for the temporary table created during ALTER TABLE was not the
same as the original table's OID, so the columnar options were not being
applied during the rewrite.

The change is that I applied the original table's columnar options to
the new table so that it has the correct options during the rewrite. I also
added a test.
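
A small scenario of the kind this fixes, assuming the columnar access method and the `alter_columnar_table_set` helper; the table name and option values are illustrative:

```sql
CREATE TABLE measurements (ts timestamptz, value float8) USING columnar;
SELECT alter_columnar_table_set('measurements', compression => 'pglz');
-- ALTER TABLE rewrites the table into a new relation; before this fix the
-- compression option set above was lost during that rewrite.
ALTER TABLE measurements ALTER COLUMN value TYPE numeric;
```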

cherry-pick from commit f21cbe68f8
2022-10-09 22:30:59 -07:00
Naisila Puka 51da46c021 Use original relation to retrieve column name because of syscache (#6387)
During alter_distributed_table, we create a new table like the
original table but with the altered options.

To retrieve the name of the distribution column, we were using
the attribute syscache of the new table, since we already created
the new table as identical to the original table.

However, the attribute syscaches of these two tables are not
the same if the original table has dropped columns. The reason
is that dropped columns are all still present in the cache.
Hence, for example, the attnos would be different in the syscaches.

So, let's use the attribute syscache of the original table.
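
A standalone illustration of the attnum mismatch, using made-up table names:

```sql
-- Dropped columns keep their attnum slot, so the "same" column ends up with
-- different attnums in the original table and in a freshly created copy.
CREATE TABLE original_table (a int, dropped_col int, dist_col int);
ALTER TABLE original_table DROP COLUMN dropped_col;
CREATE TABLE new_table (a int, dist_col int);

SELECT attrelid::regclass, attname, attnum
FROM pg_attribute
WHERE attrelid IN ('original_table'::regclass, 'new_table'::regclass)
  AND attname = 'dist_col';
-- dist_col has attnum 3 in original_table but attnum 2 in new_table.
```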
2022-10-06 12:11:35 +03:00
Hanefi Onaldi ff6358749d
Ensure no dependencies to index before drop
(cherry picked from commit 11a9a3771f)
2022-10-04 21:08:39 +03:00
Hanefi Onaldi 628908e990
Document failing downgrades from 10.2-4 to 10.2-2
(cherry picked from commit 5ddd4754a2)
2022-10-04 21:08:39 +03:00
Hanefi Onaldi 3efafe49ba
Fix tests for missing downgrades
(cherry picked from commit 0efd6f7829)
2022-10-04 21:08:39 +03:00
Onur Tirtir b14bae6311 Not allow ON DELETE/UPDATE SET DEFAULT actions on columns that default to sequences (#6340)
Given that we drop DEFAULT nextval('sequence') expressions from
shard relation columns, allowing `ON DELETE/UPDATE SET DEFAULT`
on such columns might cause inserting NULL values as a result
of a delete/update operation.

For this reason, we disallow ON DELETE/UPDATE SET DEFAULT actions
on columns that default to sequences.
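
A sketch of the schema shape that is now rejected; all names are illustrative:

```sql
CREATE SEQUENCE orders_id_seq;
CREATE TABLE orders (
    order_id bigint DEFAULT nextval('orders_id_seq') PRIMARY KEY
);
-- On the shards the nextval() default is dropped, so SET DEFAULT would
-- effectively write NULL into order_id; this is now rejected.
CREATE TABLE order_items (
    order_id bigint DEFAULT nextval('orders_id_seq')
        REFERENCES orders (order_id) ON DELETE SET DEFAULT
);
```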

DESCRIPTION: Disallows having ON DELETE/UPDATE SET DEFAULT actions on
columns that default to sequences

Fixes #6339.

(cherry picked from commit a868cc049a)

 Conflicts (dropped those changes since pg15 is not supported on 11.0):
	src/test/regress/expected/pg15.out
	src/test/regress/sql/pg15.sql
2022-09-23 14:24:29 +03:00
Onur Tirtir 308f9298d7 Include IntegerArrayTypeToList from worker_protocol.h instead of array_type.h
On Citus versions <= 11.0, IntegerArrayTypeToList() doesn't exist and
its helpers (DeconstructArrayObject() & ArrayObjectCount()) are defined
in worker_protocol.h. (See 9476f377b5).

So we add IntegerArrayTypeToList() into worker_protocol.c and include
IntegerArrayTypeToList from worker_protocol.h instead of array_type.h
in foreign_constraint.c.

This is needed to backport a868cc049a into
this (release-11.0) branch, see the next commit.
2022-09-23 14:24:22 +03:00
Onur Tirtir fcd0bdf370 Not drop default col exprs from shard when adding local table to metadata (#6323)
As we did for GENERATED STORED columns in #4613, we should not drop
column default expressions that are not based on sequences from the shard
relation, since such expressions need to exist, e.g., for foreign key actions.

For the column default expressions that are based on sequences we cannot
do much, so we need to disallow having ON DELETE SET DEFAULT actions on
such columns in a separate PR, see #6339.
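
A minimal sketch of the distinction, with made-up names:

```sql
CREATE SEQUENCE items_id_seq;
CREATE TABLE items (
    id bigint DEFAULT nextval('items_id_seq'),  -- sequence-based: still dropped on shards
    status text DEFAULT 'pending'               -- constant: now retained on shards
);
```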

Fixes #6318.

DESCRIPTION: Fixes a bug that might cause inserting incorrect DEFAULT
values when applying foreign key actions

(cherry picked from commit de24a3eda5)

 Conflicts (dropped those changes since pg15 is not supported on 11.0):
	src/test/regress/expected/pg15.out
	src/test/regress/sql/pg15.sql
2022-09-23 13:58:23 +03:00
Naisila Puka 498131b4f6 Use RelationGetPrimaryKeyIndex for citus catalog tables (#6262)
pg_dist_node and pg_dist_colocation have a primary key index, not a replica identity index.

Citus catalog tables are created in the public schema, where the primary key index serves
as the replica identity index by default. Later the Citus catalog tables are moved to the pg_catalog schema.

During pg_upgrade, all tables are recreated, and given that pg_dist_colocation is found in
the pg_catalog schema, it is recreated in that schema, and when it is recreated it doesn't
have a replica identity index, because catalog tables have no replica identity.

Further action:
Do we even need to acquire this lock on the primary key index?
Postgres doesn't acquire such locks on indexes before deleting catalog tuples.
Also, catalog tuples don't have replica identities by definition.
2022-09-22 12:50:11 +03:00
Jelte Fennema b2d0ac5a9c Revert to using old image tag for upgrade tests 2022-09-07 13:27:49 +02:00
Hanefi Onaldi 38f088d10c Normalize messages from different libpq versions
Historically we have been testing with the 'latest' version of libpq
when the CI images were built. This has the downside that rebuilding the
images often breaks our tests due to different errors returned from
libpq.

With this change we will actually test with a stable version of libpq
that is based on the postgres minor version that we test against.

This will make it easier to maintain postgres images over time, as well
as running _all_ tests locally, where we change libpq in sync with the
postgres server version.

(cherry picked from commit f944f97d01)
2022-09-07 13:27:49 +02:00
Hanefi Onaldi 0ae1c630cf Introduce one new alternative text output to fix flakiness (#5913)
Here is a flaky test output that is quite hard to fix:

```diff
diff -dU10 -w /home/circleci/project/src/test/regress/expected/isolation_master_update_node_1.out /home/circleci/project/src/test/regress/results/isolation_master_update_node.out
--- /home/circleci/project/src/test/regress/expected/isolation_master_update_node_1.out.modified	2022-03-21 19:03:54.237042562 +0000
+++ /home/circleci/project/src/test/regress/results/isolation_master_update_node.out.modified	2022-03-21 19:03:54.257043084 +0000
@@ -49,18 +49,20 @@
  <waiting ...>
 step s2-update-node-1-force: <... completed>
 master_update_node
 ------------------

 (1 row)

 step s2-abort: ABORT;
 step s1-abort: ABORT;
 FATAL:  terminating connection due to administrator command
-SSL connection has been closed unexpectedly
+server closed the connection unexpectedly
+	This probably means the server terminated abnormally
+	before or while processing the request.
```

I could not come up with a solution that would decrease the flakiness in the test outputs. We already have 3 output files for the same test and now I introduced a 4th one.

I can also add complex regular expressions that span multiple lines, and normalize these error messages. Feel free to suggest a normalized error message in a comment here.

## Current alternative file contents

`isolation_master_update_node.out`
```
step s1-abort: ABORT;
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
SSL connection has been closed unexpectedly
```

`isolation_master_update_node_0.out`
```
step s1-abort: ABORT;
WARNING: this step had a leftover error message
FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
```

`isolation_master_update_node_1.out`
```
step s1-abort: ABORT;
FATAL:  terminating connection due to administrator command
SSL connection has been closed unexpectedly
```

new file: `isolation_master_update_node_2.out`
```
step s1-abort: ABORT;
FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
```

(cherry picked from commit 518fb0873e)
2022-09-07 13:27:49 +02:00
Jelte Fennema 4d21903d3e Fix flakyness in failure_setup (#6205)
In CI sometimes failure_setup will fail with the following error:
```diff
 SELECT master_add_node('localhost', :worker_2_proxy_port);  -- an mitmproxy which forwards to the second worker
- master_add_node
----------------------------------------------------------------------
-               2
-(1 row)
-
+ERROR:  connection to the remote node localhost:9060 failed with the following error: could not connect to server: Connection refused
+	Is the server running on host "localhost" (127.0.0.1) and accepting
+	TCP/IP connections on port 9060?
+could not connect to server: Connection refused
+	Is the server running on host "localhost" (127.0.0.1) and accepting
+	TCP/IP connections on port 9060?
+could not connect to server: Cannot assign requested address
+	Is the server running on host "localhost" (::1) and accepting
+	TCP/IP connections on port 9060?
diff -dU10 -w /home/circleci/project/src/test/regress/expected/failure_online_move_shard_placement.out /home/circleci/project/src/test/regress/results/failure_online_move_shard_placement.out
```

This then breaks all the tests run after it as well, because we're
missing one worker node.

Locally I was able to reproduce this error by sleeping for 10 seconds in
the forked process before actually starting mitmproxy. So I'm
expecting that what's happening in CI is that, due to limited resources,
mitmproxy is not up yet when we try to add its port as a worker node.

This PR fixes this by waiting until mitmproxy is listening on its socket
before actually starting to run our tests. This fixed it locally for me
when I made the forked process sleep for 10 seconds before starting
mitmproxy.

In passing it also improves the detection and errors that we already
had for the case where something was already listening on the
mitmproxy port.

Because both @gledis69 and I were changing things in our CI images
at the same time, this also includes a bump of the style checker tools.
Closes #6200

(cherry picked from commit 25e5cf2e50)
2022-09-07 13:27:49 +02:00
Gledis Zeneli c2b584c4bf Update stylechecker version (#6194)
Update stylechecker image to include versions similar to the other test images.

(cherry picked from commit 2b74735496)
2022-09-07 13:27:49 +02:00
Jelte Fennema 5901b815a2 Fix flakyness in adaptive_executor (#6275)
Sometimes in CI our adaptive_executor test would fail randomly with the
following error:

```diff
 SELECT sum(result::bigint) FROM run_command_on_workers($$
   SELECT count(*) FROM pg_stat_activity
   WHERE pid <> pg_backend_pid() AND query LIKE '%8010090%'
 $$);
  sum
 -----
-   4
+   2
 (1 row)

 END;
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26665/workflows/40665680-0044-4852-8fe4-5fd628f9fb47/jobs/764371

This means that the low slow start interval did not have any effect on
the number of connections being opened. I could see two possibilities
for this to happen:
1. CI was slow and actually doing the start of the second connection. I
   tried to solve this by doubling the time a query to the worker takes.
2. The second option is that the shards were queried in the opposite
   order than we expect. This would mean that the first query to the
   worker completes quickly because there's no sleep, since it doesn't
   contain any rows. I tried to solve this option by adding a row to
   each shard.

After trying to reproduce the random failure in CI it turned out that I
needed both of these fixes to resolve the random failure.

(cherry picked from commit f22a47981a)
2022-09-07 13:27:49 +02:00
Jelte Fennema 8451dd3554 Hopefully fix flakyeness in drop_partitioned_table (#6270)
Sometimes in CI our drop_partitioned_table test would fail with the
following error:

```diff
 NOTICE:  issuing SELECT worker_drop_distributed_table('drop_partitioned_table.child1')
 NOTICE:  issuing SELECT worker_drop_distributed_table('drop_partitioned_table.child1')
 NOTICE:  issuing DROP TABLE IF EXISTS drop_partitioned_table.child1_727001 CASCADE
-NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100047)
-NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100047)
+NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100046)
+NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100046)
 ROLLBACK;
 NOTICE:  issuing ROLLBACK
 NOTICE:  issuing ROLLBACK
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26631/workflows/31536032-e1ba-493b-b12a-f40757f3a7d6/jobs/762170

For some reason the colocationid of the distributed partitioned table
would be one less than we expected. Why this happens I'm not sure, but
it seems fairly harmless that it does.

In an attempt to work around this flakiness I now reset the colocation
id sequence right before creating the table in question. This is good
practice in general, because it allows us to run the test successfully
using `check-minimal` and it also allows us to rerun it multiple times.
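
A sketch of that guard, assuming the colocation-id sequence name used by the Citus catalog; the restart value is illustrative:

```sql
-- Reset the colocation id counter to a known value right before creating the
-- table, so the ids in the expected output stay stable across reruns.
ALTER SEQUENCE pg_catalog.pg_dist_colocationid_seq RESTART 100047;
```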

(cherry picked from commit 895a484b39)
2022-09-07 13:27:49 +02:00
Jelte Fennema 33913bed37 Fix flakyness in failure_connection_establishment (#6251)
In CI sometimes failure_connection_establishment would fail with the
following error:
```diff
 -- cancel all connections to this node
 SELECT citus.mitmproxy('conn.onAuthenticationOk().cancel(' || pg_backend_pid() || ')');
- mitmproxy
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  canceling statement due to user request
+CONTEXT:  COPY mitmproxy_result, line 1: ""
+SQL statement "COPY mitmproxy_result FROM '/home/circleci/project/src/test/regress/tmp_check/mitmproxy.fifo'"
+PL/pgSQL function citus.mitmproxy(text) line 11 at EXECUTE
 SELECT * FROM citus_check_cluster_node_health();
```

The reason for this is that the mitm command that was used is very
broad and doesn't actually do what the comment says. What happens is
that if any connection is made, the current backend is cancelled, which
is not always the same as the backend that made the connection. My
assessment is that likely the maintenance daemon makes a connection to
the node while we are executing the mitmproxy command. The mitmproxy
command goes through, and then triggers a cancel of itself due to the
connection made by the maintenance daemon.

This PR simply removes this test, since it doesn't seem to test what it
intended to test anyway. There's also still the "kill" version of this
test, which does do the intended thing. So I don't think we lose
important coverage by removing this test.

(cherry picked from commit 2a0c0b3ba6)
2022-09-07 13:27:49 +02:00
Jelte Fennema 77596cd62f Fix flakyness in multi_transaction_recovery (#6249)
Sometimes in CI multi_transaction_recovery would fail with the following
error:
```diff
 SET LOCAL citus.defer_drop_after_shard_move TO OFF;
 SELECT citus_move_shard_placement((SELECT * FROM selected_shard), 'localhost', :worker_1_port, 'localhost', :worker_2_port, shard_transfer_mode := 'block_writes');
- citus_move_shard_placement
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  could not find placement matching "localhost:57637"
+HINT:  Confirm the placement still exists and try again.
 COMMIT;
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26510/workflows/8269ea93-d9b4-4376-ae0e-8332a5c15fc6/jobs/755548

The reason for this was that when choosing `selected_shard` we didn't
ensure that it was actually located on the node that we were moving it
from. Instead we simply picked the first shard for the table that was
returned by the query.

To fix this issue this PR adds a filter to only choose shards that are
located on the intended node.
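
A sketch of the stricter shard selection using the Citus metadata views; the test table name is made up:

```sql
-- Only pick a shard whose placement is on the source node of the upcoming
-- citus_move_shard_placement() call.
SELECT shardid INTO selected_shard
FROM pg_dist_shard_placement
WHERE nodeport = :worker_1_port
  AND shardid IN (SELECT shardid
                  FROM pg_dist_shard
                  WHERE logicalrelid = 'test_recovery_table'::regclass)
LIMIT 1;
```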

(cherry picked from commit 18015ca501)
2022-09-07 13:27:49 +02:00
Jelte Fennema bf63788b98 Fix flakyness in isolation_distributed_deadlock_detection (#6240)
Our isolation_distributed_deadlock_detection test would fail randomly in
CI in three different ways.

The first type of failure looked like this:

```diff
 check_distributed_deadlocks
 ---------------------------
 t
 (1 row)

-step s1-update-5: <... completed>
 step s5-update-1: <... completed>
 ERROR:  canceling the transaction since it was involved in a distributed deadlock
+step s1-update-5: <... completed>
 step s1-commit:
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26399/workflows/d213ee85-397a-467a-9ffb-39e4f44e6688/jobs/749533

This random change in output was harmless and happened because when the
deadlock detector cancelled a query, two queries would continue: The one
that was cancelled would throw an error (and thus complete), and the one
that was unblocked would now complete.

It was random which of the two the isolation tester would first detect
as completed. To resolve this, the PR starts using the ["marker" feature][1];
this allows us to make sure one of the steps won't be marked as
completed until the other one has completed first.

The second random failure was very similar:
```diff
 check_distributed_deadlocks
 ---------------------------
 t
 (1 row)

-step s2-update-2: <... completed>
-step s3-update-3: <... completed>
-ERROR:  canceling the transaction since it was involved in a distributed deadlock
 step s6-commit:
   COMMIT;

 step s5-update-6: <... completed>
+step s2-update-2: <... completed>
+step s3-update-3: <... completed>
+ERROR:  canceling the transaction since it was involved in a distributed deadlock
 step s5-commit:
```

Again a harmless difference in test output. In this case it's possible
that the deadlock detector would not detect the unblocked processes
right away, and would thus continue to the next step. This step was
a commit on a session that was not blocked, and which thus could
complete without issues.

To solve this I changed the order of the commits at the end of the
permutation, to always have the first session that would commit be the
session that would be unblocked the last. This ensures that no commit
will ever be executed before completing all the queries.

The third issue was different and looked like this:
```diff
 step s4-update-5: <... completed>
 step s4-commit:
   COMMIT;

+step s1-update-4: <... completed>
+isolationtester: canceling step s3-update-4 after 5 seconds
 step s3-update-4: <... completed>
+ERROR:  canceling statement due to user request
+step s2-update-2: <... completed>
 step s3-commit:
   COMMIT;

-step s2-update-2: <... completed>
-step s1-update-4: <... completed>
 step s1-commit:
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26411/workflows/9089beec-4f0f-4027-b4ce-0e84889afc06/jobs/750143

The reason for this failure is not entirely clear to me, but I was able
to remove the flakiness without impacting the goal of the test. What was
happening was that both `s1` and `s3` were waiting for `s4` to commit
and release its lock on row 4. For some reason it wasn't
deterministic which of the two sessions would be granted the lock on
row 4 after it was released. The test expected `s3` to be granted the lock,
but sometimes it would be granted to `s1` instead, which would in turn
cause `s3` to still be blocked.

To solve this I simply removed `s1` completely from this test. It wasn't
actually part of the cycle that the deadlock detector should detect and
was an unrelated appendage:

```mermaid
  graph TD;
      s2-->s3;
      s3-->s4;
      s1-->s4;
      s4-->s5;
      s5-->s6;
      s6-->s5;
```

By removing `s1` completely there was no contention for the lock and
`s3` could always acquire it.

[1]: a73d6c87f2/src/test/isolation/README (L163-L188)

(cherry picked from commit 9749622399)
2022-09-07 13:27:49 +02:00
Jelte Fennema 45838a0139 Fix flakyness in multi_replicate_reference_table (#6235)
In CI multi_replicate_reference_table would sometimes fail like this:

```diff
 -- detects correctly that referecence table doesn't have replica identity
 SELECT replicate_reference_tables();
-ERROR:  cannot use logical replication to transfer shards of the relation initially_not_replicated_reference_table since it doesn't have a REPLICA IDENTITY or PRIMARY KEY
+ERROR:  cannot use logical replication to transfer shards of the relation ref_table since it doesn't have a REPLICA IDENTITY or PRIMARY KEY
 DETAIL:  UPDATE and DELETE commands on the shard will error out during logical replication unless there is a REPLICA IDENTITY or PRIMARY KEY.
 HINT:  If you wish to continue without a replica identity set the shard_transfer_mode to 'force_logical' or 'block_writes'.
```

Because `CitusTableTypeIdList` returns tables in heap order, it's
a bit random which one is first in the list. And the test contained
multiple tables that didn't have a primary key or replica identity. So
it made sense that the error could be for either one of these tables.
This PR makes the test output consistent by changing one of the tables
to have a primary key.

Example of failing test: https://app.circleci.com/pipelines/github/citusdata/citus/26387/workflows/fc3196e7-ddf2-4000-a70b-5ac71c836321/jobs/748940

(cherry picked from commit 5c0205ce10)
2022-09-07 13:27:49 +02:00
Jelte Fennema f87940221f Fix flakyness in ch_benchmarks_1 (#6228)
One of our arbitrary config tests would sometimes fail like this in CI:
```diff
     su_nationkey,
     cust_nation,
     l_year;
- supp_nation | cust_nation | l_year | revenue
----------------------------------------------------------------------
-           9 | C           |   2008 |    3.00
-(1 row)
-
+ERROR:  cannot connect to localhost:10212 to fetch intermediate results
+CONTEXT:  while executing command on localhost:10211
```

When looking at the logs it seems like we were running out of
connections:
```
2022-08-23 14:03:52.856 UTC [28122] FATAL:  sorry, too many clients already
2022-08-23 14:03:52.860 UTC [21027] ERROR:  cannot connect to localhost:10212 to fetch intermediate results
```

This happened with `CitusThreeWorkersManyShards` config. This test on
purpose tries to push the limits of Citus quite far. And the
`ch_benchmarks_1` test is also run in parallel with a few other tests. So
it's not too weird that it ran out of connections. This doubles the
connection limit in the arbitrary config tests to hopefully not hit this
error again.

Example of failed test: https://app.circleci.com/pipelines/github/citusdata/citus/26365/workflows/7a1b5688-85cc-4bc3-ade5-9bd1d83cd0ed/jobs/747908/parallel-runs/1

(cherry picked from commit 21780b4f65)
2022-09-07 13:27:49 +02:00
Jelte Fennema 5d60bbf7f8 Better test failure debugging for arbitrary-configs (#5861)
This improves debugging of arbitrary configs in two ways:
1. Enable logging of distributed deadlock detection
2. Show output of `psql` commands

(cherry picked from commit a645cb4b94)
2022-09-07 13:27:49 +02:00
Jelte Fennema 0fd78b1bde Fix flakyness in failure_connection_establishment (#6226)
In CI our failure_connection_establishment sometimes failed randomly
with the following error:
```diff
 -- verify a connection attempt was made to the intercepted node, this would have cause the
 -- connection to have been delayed and thus caused a timeout
 SELECT * FROM citus.dump_network_traffic() WHERE conn=0;
  conn | source | message
 ------+--------+---------
-    0 | coordinator | [initial message]
-(1 row)
+(0 rows)

 SELECT citus.mitmproxy('conn.allow()');
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26318/workflows/d3354024-9a67-4b01-9416-5cf79aec6bd8/jobs/745558

The way I fixed this was by removing the dump_network_traffic call. This
might sound simple, but doing this while continuing to let the test
serve its intended purpose required quite some more changes.

This dump_network_traffic call was there because we didn't want to show
warnings in the queries above, because the exact warnings were not
reliable. The main reason this error was not reliable was because we
were using round-robin task assignment. We did the same query twice, so
that it would hit the node with the intercepted connection in one of
those connections. Instead of doing that I'm now using the
"first-replica" policy and do the queries only once. This works, because
the first placements by placementid for each of the used tables are on
the second node, so first-replica will cause the first connection to go
there.

This solved most of the flakiness, but when confirming that the
flakiness was fixed I found some additional errors:

```diff
 -- show that INSERT failed
 SELECT citus.mitmproxy('conn.allow()');
  mitmproxy
 -----------

 (1 row)

 SELECT count(*) FROM single_replicatated WHERE key = 100;
- count
----------------------------------------------------------------------
-     0
-(1 row)
-
+ERROR:  could not establish any connections to the node localhost:9060 after 400 ms
 RESET client_min_messages;
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26321/workflows/fd5f4622-400c-465e-8d82-83f5f55a87ec/jobs/745666

I addressed this with a combination of two things:
1. Only change citus.node_connection_timeout for the queries that we
   want to test timeout behaviour for. When those queries are done I
   reset the value to the default again.
2. Change our mitm framework to only delay the initial connection packet
   instead of all packets. I think sometimes a follow on packet of a previous
   connection attempt was causing the next connection attempt to be delayed
   even if `conn.allow()` was already called. For our tests we only care about
   connection timeouts, so there's no reason to delay any other packets than
   the initial connection packet.

Then there was some last flakiness in the exact error that was given:

```diff
 -- tests for connectivity checks
 SELECT name FROM r1 WHERE id = 2;
 WARNING:  could not establish any connections to the node localhost:9060 after 900 ms
+WARNING:  connection to the remote node localhost:9060 failed with the following error:
  name
 ------
  bar
 (1 row)
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26338/workflows/9610941c-4d01-4f62-84dc-b91abc56c252/jobs/746467

I don't have a good explanation for this slight change in error message, but
given that it is missing the actual error message I expected this to be related
to some small difference in timing: e.g. the server responding to the connection
attempt right after the coordinator determined that the connection timed out.
To solve this last flakiness I increased the connection timeouts and made the
difference between the timeout and the delay a bit bigger. With these tweaks
I wasn't able to reproduce this error on CI anymore.

Finally, I made most of the same changes to failure_failover_to_local_execution,
since it was using the `conn.delay()` mitm method too. The only change that
I left out was the timing increase, since it might not be strictly necessary and
increases the time it takes to run the test. If this test ever becomes flaky, the first
thing we should try is increasing its timeout.

(cherry picked from commit cc7e93a56a)
2022-09-07 13:27:49 +02:00
Jelte Fennema 583080b872 Fix flakyness in failure_single_select (#6223)
The failure_single_select test would sometimes fail with an error that's
similar to this:
```diff
 -- cancel after first SELECT; txn should fail and nothing should be marked as invalid
 SELECT citus.mitmproxy('conn.onQuery(query="^SELECT").cancel(' ||  pg_backend_pid() || ')');
- mitmproxy
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  canceling statement due to user request
+CONTEXT:  COPY mitmproxy_result, line 1: ""
+SQL statement "COPY mitmproxy_result FROM '/home/circleci/project/src/test/regress/tmp_check/mitmproxy.fifo'"
+PL/pgSQL function citus.mitmproxy(text) line 11 at EXECUTE
 BEGIN;
```

This error looked very similar to the one from #6217 and indeed the cause turned
out to be similar. Because we were canceling all SELECT queries, we
would actually sometimes cancel our mitmproxy SELECT queries themselves.

This puts some additional restrictions on the queries that we cancel,
most importantly it should contain the name of the table that we're
selecting from.

I was able to reproduce the original issue locally pretty reliably. With
the changes in this PR it didn't happen again.

In passing this also changes one other failure test that was cancelling
all selects and puts similar additional restrictions on those
cancellations.

Example of failed test in CI: https://app.circleci.com/pipelines/github/citusdata/citus/26305/workflows/4d942b91-f83c-453c-8d9a-ae22d608e756/jobs/745071

(cherry picked from commit 506c16efdf)
2022-09-07 13:27:49 +02:00
Jelte Fennema 9b5027917a Fix flakyness in failure_create_distributed_table_non_empty (#6217)
The failure_create_distributed_table_non_empty test would sometimes fail
like this:
```diff
 -- in the first test, cancel the first connection we sent from the coordinator
 SELECT citus.mitmproxy('conn.cancel(' ||  pg_backend_pid() || ')');
- mitmproxy
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  canceling statement due to user request
+CONTEXT:  COPY mitmproxy_result, line 1: ""
+SQL statement "COPY mitmproxy_result FROM '/home/circleci/project/src/test/regress/tmp_check/mitmproxy.fifo'"
+PL/pgSQL function citus.mitmproxy(text) line 11 at EXECUTE
 SELECT create_distributed_table('test_table', 'id');
```

Because the cancel command had no filter, it would actually sometimes
cancel the mitmproxy cancel command itself. This PR addresses that by
filtering on CREATE TABLE, which is one of the commands that
create_distributed_table will send to the workers.

Example of failing test: https://app.circleci.com/pipelines/github/citusdata/citus/26252/workflows/1b7e5464-cca4-4ec1-99b3-48ddf25c29fa/jobs/742829

(cherry picked from commit e2a24b921e)
2022-09-07 13:27:49 +02:00
Jelte Fennema a3325c1146 Fix flakyness in columnar_memory test (#6216)
Sometimes in CI the columnar_memory test was using slightly more memory
than expected.
```diff
 SELECT CASE WHEN 1.0 * TopMemoryContext / :top_post BETWEEN 0.98 AND 1.02 THEN 1 ELSE 1.0 * TopMemoryContext / :top_post END AS top_growth
 FROM columnar_test_helpers.columnar_store_memory_stats();
--[ RECORD 1 ]-
-top_growth | 1
+-[ RECORD 1 ]------------------
+top_growth | 1.0206132116232119

 -- before this change, max mem usage while executing inserts was 28MB and
```

This PR changes the expectation to be slightly higher, such that this
random increase in memory usage doesn't cause a flaky test.

Failing test: https://app.circleci.com/pipelines/github/citusdata/citus/26256/workflows/c0870f66-3346-4f8d-a1d3-36dfd7c98289/jobs/743028

(cherry picked from commit 4ce17f015b)
2022-09-07 13:27:49 +02:00
Jelte Fennema 242f40d640 Improve debugability for columnar_memory flakyness (#6203)
Sometimes the columnar_memory test fails in CI with the following error:
```diff
 SELECT 1.0 * TopMemoryContext / :top_post BETWEEN 0.98 AND 1.02 AS top_growth_ok
 FROM columnar_test_helpers.columnar_store_memory_stats();
 -[ RECORD 1 ]-+--
-top_growth_ok | t
+top_growth_ok | f

 -- before this change, max mem usage while executing inserts was 28MB and
```

This is almost certainly a harmless failure that simply requires bumping
the margin a little bit. However, it's impossible to say with the
current output. I was unable to reproduce this on-demand on my local
machine or even in CI. So this changes the test to include the actual
value difference in the size of TopMemoryContext when it's outside the
expected range. Then next time it fails we at least have some
information about why.

Example of failing test: https://app.circleci.com/pipelines/github/citusdata/citus/25966/workflows/d472a57b-419a-4f33-b8bc-2e174a98d4d6/jobs/730576

(cherry picked from commit e6a1a86db0)
2022-09-07 13:27:49 +02:00
Jelte Fennema c1f58fac6c Don't run any isolation tests in parallel (#6212)
By running isolation tests in parallel we're just asking for flaky
tasks. The first test might temporarily block one of the commands in the
second test, which we then detect as waiting like this:
```diff
 step s2-vacuum-analyze:
     VACUUM ANALYZE test_insert_vacuum;
-
+ <waiting ...>
 step s1-commit:
     COMMIT;

+step s2-vacuum-analyze: <... completed>
```

Debugging flaky tests is also much harder when they are run in parallel.
This PR starts running all our isolation tests sequentially.

The reason for opening this PR was me seeing this failing test:
https://app.circleci.com/pipelines/github/citusdata/citus/26194/workflows/ff57e2cf-8ac4-40fe-bc0c-74a7f8fecb53/jobs/740454

As well as having fixed a similar issue recently in #6122

(cherry picked from commit 85305b2773)
2022-09-07 13:27:49 +02:00
Jelte Fennema 3d67ae6497 Fix flakyness in failure_insert_select_repartition (#6202)
This fixes our most commonly randomly failing failure test. The failing
diff is as follows:

```diff
SELECT citus.mitmproxy('conn.onQuery(query="fetch_intermediate_results").kill()');
  mitmproxy
 -----------

 (1 row)

 INSERT INTO target_table SELECT * FROM source_table;
-ERROR:  connection to the remote node localhost:xxxxx failed with the following error: connection not open
+ERROR:  could not open file "base/pgsql_job_cache/10_0_40/repartitioned_results_20770193413_from_4213590_to_1.data": No such file or directory
+CONTEXT:  while executing command on localhost:9060
+while executing command on localhost:57637
 SELECT * FROM target_table ORDER BY a;
```

As far as I can tell, the cause of this is a race condition: After killing
fetch_intermediate_results on worker 9060, the previously created data
file gets cleaned up. The fetch_intermediate_results call that's sent
to worker 57637 will be cancelled and rolled back soon because of the
failure on the other connection. But if that fetch_intermediate_results
call is able to connect to 9060 before it is cancelled, it won't find
the file it's looking for there anymore. So while it's not the error we
expect, it does indicate that we succeeded.

To avoid this issue instead of killing the fetch_intermediate_results
call directly, we kill the COPY command that it uses to do the fetch.
This results in stable output as can be seen here, where 227 runs of
failure_insert_select_repartition succeeded:
https://app.circleci.com/pipelines/github/citusdata/citus/26168/workflows/9c64a3b6-f46c-4725-9fb4-8f6a2d00a023/jobs/739389

To be clear, this changes the test to affect the opposite
fetch_intermediate_results call. This kills the fetch_intermediate_results
call of worker 57637, instead of killing the fetch_intermediate_results call
on worker 9060.

Example of failing test: https://app.circleci.com/pipelines/github/citusdata/citus/26147/workflows/780e95ea-264a-4c9f-ad2e-cf11449a795e/jobs/738467

(cherry picked from commit 8ce12eb51f)
2022-09-07 13:27:49 +02:00
Önder Kalacı 173b7ed5ad Properly add / remove coordinator for isolation tests (#6181)
We used to rely on a separate session to add the coordinator.
However, that might prevent the existing sessions from getting
assigned proper gpids, which causes flaky tests.

(cherry picked from commit 961fcff5db)
2022-09-07 13:27:49 +02:00
Jelte Fennema eed4d6452d Fix flakyness in columnar_first_row_number test (#6192)
When running columnar_first_row_number in parallel with the
columnar_query test sometimes it would fail. This bug is tracked
in #6191. For now to make CI less flaky we simply don't run these tests
in parallel.

Example of failed test: https://app.circleci.com/pipelines/github/citusdata/citus/26106/workflows/75d00ea9-23f8-4bff-a927-bced19e1f81b/jobs/736713

Fixes #6184

(cherry picked from commit 0a045afd3a)
2022-09-07 13:27:49 +02:00
Jelte Fennema aa879108b7 Remove the flaky rollback_to_savepoint test (#6190)
This removes a flaky test that I introduced in #3868 after I fixed the
issue described in #3622. This test sometimes fails randomly in CI.
The way it fails indicates that there might be some bug: A connection
breaks after rolling back to a savepoint.

I tried reproducing this issue locally, but I wasn't able to. I don't
understand what causes the failure.

Things that I tried were:

1. Running the test with:
   ```sql
   SET citus.force_max_query_parallelization = true;
   ```
2. Running the test with:
   ```sql
   SET citus.max_adaptive_executor_pool_size = 1;
   ```
3. Running the test in parallel with the same tests that it is run in
   parallel with in multi_schedule.

None of these allowed me to reproduce the issue locally.

So I think it's time to give up on fixing this test and simply remove the
test. The regression that this test protects against seems very unlikely
to reappear, since in #3868 I also added a big comment about the need
for the newly added `UnclaimConnection` call. So, I think the need for
the test is quite small, and removing it will make our CI less flaky.

In case the cause of the bug ever gets found, I tracked the bug in #6189

Example of a failing CI run:
https://app.circleci.com/pipelines/github/citusdata/citus/26098/workflows/f84741d9-13b1-4ae7-9155-c21ed3466951/jobs/736424

For reference the unexpected diff is this (so both warnings and an error):
```diff
 INSERT INTO t SELECT i FROM generate_series(1, 100) i;
+WARNING:  connection to the remote node localhost:57638 failed with the following error:
+WARNING:
+CONTEXT:  while executing command on localhost:57638
+ERROR:  connection to the remote node localhost:57638 failed with the following error:
 ROLLBACK;
```

This test is also mentioned as the most failing regression test in #5975

(cherry picked from commit d16b458e2a)
2022-09-07 13:27:49 +02:00
Jelte Fennema ee887ef648 Fix flakyness in create index concurrently isolation tests (#6158)
This creates consistent test output for isolation tests that involve
`CREATE INDEX CONCURRENTLY`. `CREATE INDEX CONCURRENTLY` is sometimes
temporarily detected as blocking, even though it will complete without any other
queries needing to be run. This change makes sure that we wait until that happens
without running any other queries in the meantime. This way we always get consistent
output. The way we do that is by using an empty step in the same
session as the `CREATE INDEX CONCURRENTLY` command. Doing so forces
the isolation tester to wait until the command is finished and not continue with
steps from other sessions. This is [the recommended approach by Postgres][1].

There's two separate cases which are addressed in slightly different ways:
1. If `CREATE INDEX CONCURRENTLY` is actually blocked on another session: Add an
    empty step right after the commit of blocking session.
    e.g. `"s2-ddl-create-index-concurrently" "s1-commit" "s2-empty"`
2. If it's not actually blocked on another session: Add [an asterisk marker][2] to make
    it look like it's blocked (because sometimes this happens randomly) and right
    after that we add an empty step to trigger waiting.
    e.g. `"s2-ddl-create-index-concurrently"(*) "s2-empty" "s1-commit"`

In passing this also enables isolation tests that were disabled due to a
bug that has already been fixed for a while.

Fixes #5993
Related to #5910 and #2966

[1]: 5f0adec253/src/test/isolation/README (L197-L204)
[2]: 5f0adec253/src/test/isolation/README (L174-L179)

Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
(cherry picked from commit fd07cc9baf)
2022-09-07 13:27:49 +02:00
Jelte Fennema a9a114145a Fix flakyness in isolation_data_migration.spec (#6122)
The isolation_concurrent_dml and isolation_data_migration tests
were being run in parallel, but they were interfering with each other's
output. Sometimes queries from isolation_concurrent_dml were blocking
create_distributed_table in isolation_data_migration:

1. https://app.circleci.com/pipelines/github/citusdata/citus/25562/workflows/f9d0a6ff-bb7a-4b71-9fcf-1a3e46d54425/jobs/713270
2. https://app.circleci.com/pipelines/github/citusdata/citus/25562/workflows/1e22454c-1623-48a7-97fb-c6803c7959c7/jobs/713223
3. https://app.circleci.com/pipelines/github/citusdata/citus/25562/workflows/618c419e-eefb-4582-9482-322dbb9ac96d/jobs/713110

This fixes it by changing the schedule to not run these tests in parallel.

(cherry picked from commit dff71abc32)
2022-09-07 13:27:49 +02:00
Jelte Fennema 86614e3555 Fix flakyness in isolation_replicate_reference_tables_to_coordinator.spec (#6123)
When the deadlock detector kills s2-update-dist-table both sessions
finish at the same time. The order in which they are displayed can be
swapped. To counteract this we start using the ["marker" feature][1] of
the isolationtester framework to create consistent output.

In passing this also sets the next_shard_id to the expected value by
this test so it can be run using `make check-isolation-base`.

Failed CI test: https://app.circleci.com/pipelines/github/citusdata/citus/25562/workflows/dfe6f88a-c306-4d91-b771-d5d1deb1798d/jobs/713417

[1]: ec62ce55a8/src/test/isolation/README (L152)

(cherry picked from commit 8bbc1a45e1)
2022-09-07 13:27:49 +02:00
Hanefi Onaldi a29b689fc9 Replace isolation tester func only once on enterprise tests (#6064)
This is a continuation of a refactor (with commit sha
2b7cf0c097) that aimed to use Citus helper
UDFs by default in iso tests.

PostgreSQL isolation test infrastructure uses some UDFs to detect
whether concurrent sessions block each other. Citus implements
alternatives to that UDF so that we are able to detect and report
distributed transactions that get blocked on the worker nodes as well.

We needed to explicitly replace PG helper functions with Citus
implementations in each isolation file. Now we replace them by default.

(cherry picked from commit ae58ca5783)
2022-09-07 13:27:49 +02:00
Hanefi Onaldi b90523628f Replace iso tester func only once (#5964)
Use Citus helper UDFs by default in iso tests

PostgreSQL isolation test infrastructure uses some UDFs to detect
whether concurrent sessions block each other. Citus implements
alternatives to that UDF so that we are able to detect and report
distributed transactions that get blocked on the worker nodes as well.

We needed to explicitly replace PG helper functions with Citus
implementations in each isolation file. Now we replace them by default.

(cherry picked from commit 2b7cf0c097)
2022-09-07 13:27:49 +02:00
Jelte Fennema 75cf7a748d
Define symbols required for downgrade from 11.1 (#6301)
Since #6300/e29db74 changed the C symbol that our bigint overrides of
pg_cancel_backend and pg_terminate_backend call, we needed to do
something to keep these functions working after downgrading.

Recreating the old definition with a downgrade script is not really
possible, since people are expected to run the downgrade steps when
using the new .so file, which does not contain the old symbols.

So, the easiest way to solve it was also defining the new symbols in our
old Citus versions. Luckily our overrides haven't existed for long, so
these symbol definitions only needed to be backported to 11.0.
2022-09-07 12:18:39 +02:00
Marco Slot 5f57d77899 Allow citus_internal application_name with additional suffix (#6282)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-05 21:41:06 +02:00
Marco Slot 0a11da1291 Add an allow_unsafe_constraints flag for constraints without distribution column (#6237)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-25 16:13:07 +02:00
Gokhan Gulbiz 07143e7d12 Use the same colocation group for child and parent rels when altering a distributed table (#6225)
* Alter_distributed_table colocateWith:none bug fix for partitioned tables.

* Regression tests added for alter_distributed_table colocateWith:none for partitioned tables

* Update query comparison to be more accurate

(cherry picked from commit 69d2fcf5c0)
2022-08-25 11:47:06 +03:00
Marco Slot 006c8eacc0 Verify that we can replicate reference tables using rebalancer 2022-08-23 23:26:49 +02:00
Marco Slot 9dc6273b88 Set application_name to citus_rebalancer when copying reference tables 2022-08-23 23:26:49 +02:00
Onur Tirtir dbfdaca0f0 Add changelog entries for 11.0.6
(cherry picked from commit b6b8f198d9)
2022-08-19 11:09:46 +03:00
Onur Tirtir a5d6e841df Bump citus version to 11.0.6 2022-08-19 10:58:38 +03:00
Jelte Fennema 73e993e908
Fix flakyness in isolation_reference_table (#6193)
The newly introduced isolation_reference_table test had some flakiness,
because the assumption on how the arbitrary reference table gets chosen
was incorrect. This introduces a VACUUM FULL at the start of the test to
ensure the assumption actually holds.

Example of failed test: https://app.circleci.com/pipelines/github/citusdata/citus/26108/workflows/0a5cd526-006b-423e-8b67-7411b9c6be36/jobs/736802
2022-08-18 14:47:59 +02:00
Nils Dijk 08dee6fe08
Fix reference table lock contention (#6173)
DESCRIPTION: Fix reference table lock contention

Dropping and creating reference tables unintentionally blocked on each other due to the use of an ExclusiveLock for both the Drop and conditionally copying existing reference tables to (new) nodes.

The patch does the following:
 - Lower the lock level for dropping (reference) tables to `ShareLock` so they don't self-conflict
 - Treat reference tables and distributed tables equally and acquire the colocation lock when dropping any table that is in a colocation group
 - Perform the precondition check for copying reference tables twice, first time with a lower lock that doesn't conflict with anything. Could have been a NoLock, however, in preparation for dropping a colocation group, it is an `AccessShareLock`

During normal operation the first check will always pass and we don't have to escalate that lock, meaning we won't be blocked on adding and removing reference tables. Only after a node addition will the first `create_reference_table` still need to acquire an `ExclusiveLock` on the colocation group to perform the copy.
2022-08-18 13:22:31 +02:00
Onder Kalaci 87787dd146 Support Sequences owned by columns before distributing tables
There are 3 different ways that a sequence can interact
with tables. (1) and (2) are already supported. This commit adds
support for (3).

     (1) column DEFAULT nextval('seq'):

	The dependency is roughly like below,
	and ExpandCitusSupportedTypes() is responsible
	for finding the depending sequences.

        schema <--- table <--- column <---- default value
         ^                                     |
         |------------------ sequence <--------|

    (2) serial columns: Bigserial/small serial etc:

	The dependency is roughly like below,
	and ExpandCitusSupportedTypes() is responsible
	for finding the depending sequences.

        schema <--- table <--- column <---- default value
                                 ^             |
				 |             |
          		     sequence <--------|

   (3) Sequence OWNED BY table.column: Added support for
       this type of resolution in this commit.

       The dependency is almost like the following, and
       ExpandCitusSupportedTypes() is NOT responsible for finding
       the dependency.

        schema <--- table <--- column
                                 ^
				 |
          		     sequence
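
A minimal sketch of case (3), with made-up names:

```sql
-- No DEFAULT ties the column to the sequence; the only link is OWNED BY,
-- which is why ExpandCitusSupportedTypes() does not see this dependency.
CREATE TABLE events (event_id bigint NOT NULL);
CREATE SEQUENCE events_id_seq OWNED BY events.event_id;
```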

(cherry picked from commit 9ec8e627c1)
2022-08-18 11:22:25 +02:00
Marco Slot 56939f0d14
Fix relation access tracking for local only transactions on release-11.0 (#6182)
Co-authored-by: Onder Kalaci <onderkalaci@gmail.com>
2022-08-18 10:13:41 +02:00
Ahmet Gedemenli 7df8588107
Fix upgrade paths for 11.0 (#6171)
* Fix upgrade paths for 11.0
2022-08-17 21:34:23 +03:00
aykut-bozkurt e0b4455e45 sysid should be parsed as int. (#6150)
(cherry picked from commit 898801504e)
2022-08-11 11:03:41 +03:00
Onur Tirtir f8e3e8c444 Add CHANGELOG entries for 11.0.5 (#6108)
(cherry picked from commit 0a04b115aa)
2022-08-01 13:40:43 +03:00
Onur Tirtir a18f6c4e40 Bump citus version to 11.0.5 2022-08-01 10:56:31 +03:00
Jelte Fennema d6c885713e Work around flaky test related to search_path (#5894)
For some reason search_path is not always set correctly on the worker
when calling a distributed function, this shows up when calling
`insert_document` in our distributed_triggers test. The underlying
reason is currently unknown and warrants deeper investigation.

Currently this test is one of the main causes for random CI failures. So
this change sets the search_path of each function explicitly, to reduce
these failures. So other devs can be more efficient, while I continue
investigating the root cause of this issue.

Also changes explicit `SET citus.enable_unsafe_triggers = false` to
`RESET citus.enable_unsafe_triggers` in passing.

(cherry picked from commit 6d8c5931d6)
2022-08-01 10:17:26 +03:00
Ying Xu a8aa82a3ec Bugfix for IN clause to be considered during planner phase in Columnar (#6030)
Reported bug #5803 shows that we are currently not sending the IN clause to our planner for columnar. This PR fixes it by checking for ScalarArrayOpExpr in ExtractPushdownClause so that we do not skip it. Also added a test case for this new addition.
2022-07-29 17:56:24 +02:00
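A hedged sketch of the kind of query this affects, with hypothetical names; the IN predicate is a ScalarArrayOpExpr and is now considered for columnar pushdown:

```sql
CREATE TABLE measurements (device_id int, status text, value double precision)
  USING columnar;

-- the IN clause can now be used to skip non-matching chunk groups
SELECT count(*)
FROM measurements
WHERE device_id IN (1, 2, 3);
```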
Ahmet Gedemenli 2f1719c149
Do not create truncate triggers on foreign tables (#6103) 2022-07-29 16:43:09 +03:00
Marco Slot 4eb0749369 Avoid catalog read via superuser() call in DecrementSharedConnectionCounter 2022-07-29 14:22:51 +02:00
Marco Slot 4439124b6d Fix issues with insert..select casts and column ordering 2022-07-28 13:54:04 +02:00
Jelte Fennema 1cf079581f Avoid possible information leakage about existing users (#6090)
(cherry picked from commit 0f50bef696)
2022-07-27 17:58:24 +02:00
Ahmet Gedemenli 4d01af5160 Error out for views with circular dependencies (#6051)
Adds error check for views with circular dependencies

(cherry picked from commit 2b2a529653)
2022-07-27 17:59:49 +03:00
Marco Slot e45b6ece0d Allow WITH HOLD cursors with parameters 2022-07-27 14:08:18 +02:00
Onder Kalaci 9af736c7a6 Concurrent shard move/copy and colocated table creation fix
It turns out that create_distributed_table
and citus_move/copy_shard_placement do not
work well concurrently.

To fix that, we need to acquire a lock, which
sounds like a good use of colocation lock.

However, the current usage of colocation lock is
limited to higher level UDFs like rebalance_table_shards
etc. Those usages of the lock are still useful, but
we cannot acquire the same lock on citus_move_shard_placement
etc. because the coordinator connects to itself to acquire
the lock. Hence, the high level UDF blocks itself.

To fix that, we use one more colocation lock, with the placements
as the main objects to consider.

(cherry picked from commit 12fa3aaf6b)
2022-07-27 10:10:46 +02:00
Onder Kalaci a21a4e128c Optimize StringJoin() for when prefix-postfix is needed
Before this commit, we required multiple copies of the
same stringInfo if we needed to append/prepend data to
the stringInfo. Now, we optionally get prefix/postfix.

For large string operations, this can save up to 10% of
memory.

(cherry picked from commit 26fdcb68f0)
2022-07-27 10:02:32 +02:00
Onder Kalaci 2a684e426c Do not cache all the metadata during fix_all_partition_shard_index_names
(cherry picked from commit f076e81166)
2022-07-27 10:02:05 +02:00
Onder Kalaci 377375de2a Reduce memory consumption while adjusting partition index names
Previously, CreateFixPartitionShardIndexNames() created all
the relevant query strings for all the shards, and executed
the large query string. In terms of memory consumption,
this huge command (and the ExprContext generated while running
the command) was the main bottleneck.

With this change, we are reducing the total amount of memory
usage to almost 1/shard_count.

On my local machine, with a distributed partitioned table with 120 partitions,
each with 32 shards, the total memory consumption dropped from ~3GB
to ~0.1GB, while the total execution time increased from ~28 seconds
to ~30 seconds. This seems like a good trade-off.

(cherry picked from commit b8008999dc)
2022-07-27 10:02:00 +02:00
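A sketch of the setup described above, with hypothetical names, for observing the memory behaviour of the index-name fixing command:

```sql
SET citus.shard_count TO 32;

CREATE TABLE events (event_id bigint, created_at timestamptz NOT NULL)
  PARTITION BY RANGE (created_at);
CREATE INDEX ON events (created_at);
SELECT create_distributed_table('events', 'event_id');

SELECT create_time_partitions(
  table_name         := 'events',
  partition_interval := '1 day',
  end_at             := now() + '120 days'
);

-- the command whose memory usage this commit reduces to roughly 1/shard_count
SELECT fix_all_partition_shard_index_names();
```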
Nitish Upreti fcdf4434c6
Fix blocking shard moves failure due to constraint failure.
DESCRIPTION:
Fix Bug #4949 where blocking shard moves fail if there is a foreign key between partitioned distributed tables (from child to parent). This happens because we try to create constraints before attaching child partitions to the parent, which causes a constraint failure as the parent table will be empty. The fix is to reverse the order, i.e., attach partitions before we create constraints.

TESTING:
Added a new test 'shard_move_constraints_blocking', inspired by the existing 'shard_move_constraints', where we trigger the shard move with 'block_writes' instead of 'force_logical' to add coverage for this scenario.
2022-07-24 21:21:25 -07:00
Hanefi Onaldi 5ca792aef9
Bump Citus version to 11.0.4 2022-07-13 18:06:04 +03:00
Hanefi Onaldi 096047bfbc
Add changelog entry for 11.0.4 2022-07-13 18:05:19 +03:00
Onder Kalaci c51095c462 Add more generic read-replica tests
(cherry picked from commit 6cd7319f12)
2022-07-13 15:16:04 +02:00
Onder Kalaci 857a770b86 Add regression tests for LOCK command citus.use_secondary_nodes=always mode
(cherry picked from commit 3c343d4563)
2022-07-13 15:15:52 +02:00
Onder Kalaci 06e55df141 Make sure citus_is_coordinator works on read replicas
(cherry picked from commit b2e9a5baf1)
2022-07-13 15:15:46 +02:00
Onder Kalaci 06d6ffbb6e LOCK COMMAND does not require primaries at the start
(cherry picked from commit 8ab696f7e2)
2022-07-13 15:15:40 +02:00
Hanefi Onaldi 2bb106508a
Bump Citus version to 11.0.3 2022-07-05 13:19:10 +03:00
Hanefi Onaldi 5f46f2e9f7
Add changelog entry for 11.0.3
(cherry picked from commit c33915c3e6)
2022-07-05 13:19:10 +03:00
Ahmet Gedemenli ac7511de7d Fix matviews for citus_add_local_table_to_metadata (#6023)
(cherry picked from commit c8e1e243b8)
2022-07-04 17:01:40 +03:00
Hanefi Onaldi 0eee7fd9b8
Fix downgrade scripts from 11.0-2 to 11.0-1
(cherry picked from commit f60809a6c1)

Conflicts:
	src/test/regress/expected/multi_extension.out
	src/test/regress/sql/multi_extension.sql
2022-06-29 22:52:07 +03:00
Önder Kalacı 03a4305e06
Fixes a bug that prevents upgrades when there are no worker nodes (#6037)
(cherry picked from commit bab4c0a8c3)
2022-06-29 14:36:24 +03:00
Onder Kalaci d397dd0dfe Fixes a bug that prevents upgrades when there are COMPRESSION and DEFAULT columns 2022-06-29 10:45:33 +02:00
Hanefi Onaldi 9d05c30c13
Bump Citus version to 11.0.2 2022-06-16 16:54:47 +03:00
Hanefi Onaldi bd02bd2dda
Add changelog entries for 11.0.2 (#6007)
(cherry picked from commit 26172636c9)
2022-06-16 16:53:40 +03:00
Ahmet Gedemenli b559ae5813 Fix creating stats bug when CREATE TABLE LIKE (#6006)
(cherry picked from commit 1ee3e8b7f4)
2022-06-16 12:45:23 +03:00
Jelte Fennema a01e45f3df Make enterprise features open source
This PR makes all of the features open source that were previously only
available in Citus Enterprise.

Features that this adds:
1. Non blocking shard moves/shard rebalancer
   (`citus.logical_replication_timeout`)
2. Propagation of CREATE/DROP/ALTER ROLE statements
3. Propagation of GRANT statements
4. Propagation of CLUSTER statements
5. Propagation of ALTER DATABASE ... OWNER TO ...
6. Optimization for COPY when loading JSON to avoid double parsing of
   the JSON object (`citus.skip_jsonb_validation_in_copy`)
7. Support for row level security
8. Support for `pg_dist_authinfo`, which allows storing different
   authentication options for different users, e.g. you can store
   passwords or certificates here.
9. Support for `pg_dist_poolinfo`, which allows using connection poolers
   in between the coordinator and workers (see the sketch after this list)
10. Tracking distributed query execution times using
   citus_stat_statements (`citus.stat_statements_max`,
   `citus.stat_statements_purge_interval`,
   `citus.stat_statements_track`). This is disabled by default.
11. Blocking tenant_isolation
12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`
2022-06-16 08:09:45 +02:00
Marco Slot 0861c80c8b Fix bug in unqualified, non-existing DROP DOMAIN IF EXISTS
(cherry picked from commit ee34e1ed9d)
2022-06-15 16:53:25 +02:00
Burak Velioglu de6373b842 Fix dropping temporary view without specifying the explicit schema name
(cherry picked from commit 4d533c3c56)
2022-06-15 16:36:52 +02:00
Ahmet Gedemenli 4345627480 Fix materialized view intermediate result filename (#5982)
(cherry picked from commit 268d3fa3a6)
2022-06-14 15:43:18 +03:00
Onder Kalaci 978d31f330 Use citus_finish_citus_upgrade() in the tests
We already have tests relying on citus_finalize_upgrade_to_citus11().
Now, adjust those to rely on citus_finish_citus_upgrade() and
always call citus_finish_citus_upgrade().
2022-06-13 13:28:41 +02:00
Marco Slot 4bcffce036 Introduce a citus_finish_citus_upgrade() function 2022-06-13 13:28:31 +02:00
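A short usage sketch, assuming the upgrade helper is exposed as a procedure as in the Citus 11.0 upgrade flow:

```sql
-- on the coordinator, after running ALTER EXTENSION citus UPDATE on all nodes
CALL citus_finish_citus_upgrade();
```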
Halil Ozan Akgul 7166901492 Fixes the bug where undistribute can drop Citus extension
(cherry picked from commit b255706189)
2022-06-01 18:56:56 +03:00
Hanefi Onaldi 8ef705012a
Add normalization rules for flaky isolation tests
We remove `<waiting ...>` and `<... completed>` outputs for some CREATE
INDEX CONCURRENTLY commands since they can cause flakiness in some scenarios.

Postgres calls WaitForOlderSnapshots() and this can cause CREATE INDEX
CONCURRENTLY commands for shards to get blocked by each other for brief
periods of time. The extra waits can pop up, or they can get completed
at different lines in the output files. To remedy that, we rename those
indexes to be captured by the new normalization rule.

(cherry picked from commit 52541c5802)
2022-06-01 16:12:01 +03:00
Hanefi Onaldi 530aafd8ee
Grep logs for deterministic global_cancel test results (#5948)
(cherry picked from commit 313104ab9b)
2022-06-01 16:12:01 +03:00
Gledis Zeneli c440cbb643 Fix memory error with citus_add_node reported by valgrind test (#5967)
The error occurs because the jsonb datum in pg_dist_metadata_node.metadata is 0 in some scenarios. This is likely due to not copying the data when receiving a datum from a tuple, with pg deciding to deallocate that memory when the table that the tuple came from is closed.
Also fix another place in the code that might have been susceptible to this issue.
I tested on both multi-vg and multi-1-vg and the tests were successful.

(cherry picked from commit beef392f5a)
2022-06-01 13:06:54 +03:00
gledis69 a64e135a36 Revert "Copy data from heap tuples instead of using references"
This reverts commit 50e8638ede.
2022-06-01 13:06:38 +03:00
gledis69 50e8638ede Copy data from heap tuples instead of using references
The general rule is:
 - If the data is used within the bounds of table_open ... table_close -> no need to copy
 - If the data is required for use even after the table is closed -> copy

(cherry picked from commit dc9da7630f)
2022-06-01 12:27:11 +03:00
jeff-davis b34b1ce06b Columnar: fix wraparound bug. (#5962)
columnar_vacuum_rel() now advances relfrozenxid.

Fixes #5958.

(cherry picked from commit 74ce210f8b)
2022-05-31 07:46:12 -07:00
Onder Kalaci 0d0dd0af1c Show that no metadata is sent when disabled
(cherry picked from commit 89c1ccb7a5)
2022-05-30 17:01:49 +02:00
Onder Kalaci 3227d6551e Do not send metadata changes during add node if citus.enable_metadata_sync is set to false
(cherry picked from commit 7157152f6c)
2022-05-30 17:01:44 +02:00
Onder Kalaci d147d5d0c5 Avoid assertion failure on citus_add_node
(cherry picked from commit 010a2a408e)
2022-05-30 17:01:38 +02:00
Ahmet Gedemenli 4b5f749c23 Propagate dependent views upon distribution (#5950)
(cherry picked from commit 26d927178c)
2022-05-26 18:58:04 +03:00
Burak Velioglu 29c67c660d Create views and materialized views with the right schema and owner while
altering the distributed table.

To be able to alter a view's owner without enforcing sequential mode,
the ALTER VIEW processing functions have been updated to use the metadata
connection.
2022-05-25 10:42:54 +03:00
Gledis Zeneli 6da2d41e00 Do not obtain AccessShareLock before actual lock (#5965)
Do not obtain AccessShareLock before acquiring the distributed locks.

Acquiring an AccessShareLock ensures that the relations which we are trying to get a distributed lock on will not be dropped in the time between when the LOCK command is issued and the LOCK commands are sent to the workers. However, this also leads to distributed deadlocks in scenarios such as the following:

```sql
-- for dist lock acquiring order coor, w1, w2

-- on w2
LOCK t1 IN ACCESS EXCLUSIVE MODE;
-- acquire AccessShareLock locally on t1 to ensure it is not dropped while we get ready to distribute the lock

      -- concurrently on w1
      LOCK t1 IN ACCESS EXCLUSIVE MODE;
      -- acquire AccessShareLock locally on t1 to ensure it is not dropped while we get ready to distribute the lock
      -- acquire dist lock on coor, w1, gets blocked on local AccessShareLock on w2

-- on w2 continuation of the execution above
-- starts to acquire dist locks and gets blocked on the coor by the lock acquired by w1

-- distributed deadlock

```

We opt to avoid such deadlocks, at the cost of possibly running into errors when the relations on which we are trying to acquire locks get dropped.

(cherry picked from commit 27ddb4fc8e)
2022-05-23 17:28:37 +03:00
Onder Kalaci 2d5560537b Due to new commits in master branch, outputs diverged 2022-05-23 09:36:38 +02:00
Onder Kalaci 8b0499c91a Parallelize metadata syncing on node activate
It is often useful to be able to sync the metadata in parallel
across nodes.

Also citus_finalize_upgrade_to_citus11() uses
start_metadata_sync_to_primary_nodes() after this commit.

Note that this commit does not parallelize all pieces of node
activation or metadata syncing. Instead, it tries to parallelize
the potentially large parts of the metadata, namely the objects and
distributed tables (in general, Citus tables).

In the future, it would be nice to sync the reference tables
in parallel across nodes.

Create ~720 distributed tables / ~23450 shards
```SQL
-- declaratively partitioned table
CREATE TABLE github_events_looooooooooooooong_name (
  event_id bigint,
  event_type text,
  event_public boolean,
  repo_id bigint,
  payload jsonb,
  repo jsonb,
  actor jsonb,
  org jsonb,
  created_at timestamp
) PARTITION BY RANGE (created_at);

SELECT create_time_partitions(
  table_name         := 'github_events_looooooooooooooong_name',
  partition_interval := '1 day',
  end_at             := now() + '24 months'
);

CREATE INDEX ON github_events_looooooooooooooong_name USING btree (event_id, event_type, event_public, repo_id);
SELECT create_distributed_table('github_events_looooooooooooooong_name', 'repo_id');

SET client_min_messages TO ERROR;

```

across 1 node: almost same as expected
```SQL

SELECT start_metadata_sync_to_primary_nodes();
Time: 15664.418 ms (00:15.664)

select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node;
Time: 14284.069 ms (00:14.284)
```

across 7 nodes: ~3.5x improvement
```SQL

SELECT start_metadata_sync_to_primary_nodes();
┌──────────────────────────────────────┐
│ start_metadata_sync_to_primary_nodes │
├──────────────────────────────────────┤
│ t                                    │
└──────────────────────────────────────┘
(1 row)

Time: 25711.192 ms (00:25.711)

-- across 7 nodes
select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node;
Time: 82126.075 ms (01:22.126)
```

(cherry picked from commit dd02e1755f)
2022-05-23 09:25:31 +02:00
Onder Kalaci 513e073206 Fixes a bug that prevents dropping/altering indexes
There are two problems in this area. First, when the index is defined
on expressions, we should call `transformIndexExpression()` before
generating the index name. That is what Postgres does.

Second, because of 40c24bfef9
PG 13 and PG 14 generate different names for indexes with function calls, even for local PG tables.
Assume we have:
```SQL
create table t(id int);
select create_distributed_table('t', 'id');
create index ON t (my_very_boring_function(id));
```

On PG 13, the name of the index is `t_expr_idx`
```SQL
\d t
Table "public.t"
┌────────┬─────────┬───────────┬──────────┬─────────┐
│ Column │  Type   │ Collation │ Nullable │ Default │
├────────┼─────────┼───────────┼──────────┼─────────┤
│ id     │ integer │           │          │         │
└────────┴─────────┴───────────┴──────────┴─────────┘
Indexes:
    "t_expr_idx" btree (my_very_boring_function(id::bigint))
```

On PG 14, the name of the index is `t_my_very_boring_function_idx`
```SQL
\d t
 Table "public.t"
┌────────┬─────────┬───────────┬──────────┬─────────┐
│ Column │  Type   │ Collation │ Nullable │ Default │
├────────┼─────────┼───────────┼──────────┼─────────┤
│ id     │ integer │           │          │         │
└────────┴─────────┴───────────┴──────────┴─────────┘
Indexes:
    "t_my_very_boring_function_idx" btree (my_very_boring_function(id::bigint))

```

The second issue is not very critical. The important part is that
we adjust regression tests to drop all the indexes, which ensures
the index names are sane on any version.

(cherry picked from commit 2cc4053fc1)
2022-05-23 09:22:25 +02:00
Onder Kalaci 4b5cb7e2b9 Mark existing views as distributed when upgrade to 11.0+
We have a mechanism which ensures that newly distributed
objects are recorded during `alter extension citus update`.

However, the logic was missing views. With this commit, we make
sure that existing views are also marked as distributed during
upgrade.

(cherry picked from commit ee45e7bfbf)
2022-05-23 09:22:17 +02:00
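A hedged way to verify the result, assuming views are tracked in the pg_catalog.pg_dist_object catalog (its location in Citus 11.0):

```sql
-- list views recorded as distributed objects after the upgrade
SELECT c.oid::regclass AS view_name
FROM pg_catalog.pg_dist_object d
JOIN pg_class c ON d.classid = 'pg_class'::regclass AND d.objid = c.oid
WHERE c.relkind = 'v';
```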
Gledis Zeneli 97b453e679 Add TRUNCATE arbitrary config tests (#5848)
Adds TRUNCATE arbitrary config tests.
Also adds the ability to skip tests from particular configs.
2022-05-20 19:53:18 +02:00
Marco Slot 8c5035c0a5 Improve nested execution checks and add GUC to disable 2022-05-20 19:35:59 +02:00
Marco Slot 7c6784b1f4 Add caching for functions that check the backend type 2022-05-20 19:35:52 +02:00
Marco Slot 556f43f24a Fix prepared statement bug when switching from local to remote execution 2022-05-20 19:35:45 +02:00
gledis69 909b72b027 Add distributing lock command support
(cherry picked from commit 4731630741)
2022-05-20 18:02:34 +03:00
Gledis Zeneli 3f282c660b Switch to using LOCK instead of lock_relation_if_exists in TRUNCATE (#5930)
Breaking down #5899 into smaller PR-s

This particular PR changes the way TRUNCATE acquires distributed locks on the relations it is truncating to use the LOCK command instead of lock_relation_if_exists. This has the benefit of using pg's recursive locking logic it implements for the LOCK command instead of us having to resolve relation dependencies and lock them explicitly. While this does not directly affect truncate, it will allow us to generalize this locking logic to then lock different relations where pg's recursive locking will become useful (e.g. locking views).

This implementation is a bit more complex than it needs to be because pg does not support locking foreign tables. We can, however, still lock foreign tables with lock_relation_if_exists. So for a command:

TRUNCATE dist_table_1, dist_table_2, foreign_table_1, foreign_table_2, dist_table_3;

We generate and send the following command to all the workers in metadata:
```sql
SET citus.enable_ddl_propagation TO FALSE;
LOCK dist_table_1, dist_table_2 IN ACCESS EXCLUSIVE MODE;
SELECT lock_relation_if_exists('foreign_table_1', 'ACCESS EXCLUSIVE');
SELECT lock_relation_if_exists('foreign_table_2', 'ACCESS EXCLUSIVE');
LOCK dist_table_3 IN ACCESS EXCLUSIVE MODE;
SET citus.enable_ddl_propagation TO TRUE;
```

Note that we need to alternate between the LOCK command and lock_relation_if_exists in order to preserve the TRUNCATE order of relations.
When pg supports locking foreign tables, we will be able to massively simplify this logic and send a single LOCK command.

(cherry picked from commit 4c6f62efc6)
2022-05-20 17:24:44 +03:00
Marco Slot 73fd4f7ded Allow distributed execution from run_command_on_* functions 2022-05-20 15:42:50 +02:00
Burak Velioglu 8229d4b7ee Add ALTER VIEW support
Adds support for propagating ALTER VIEW commands to
- Change owner of view
- SET/RESET option
- Rename view and view's column name
- Change schema of the view

Since PG also supports targeting views with ALTER TABLE
commands, related code also added to direct such ALTER TABLE
commands to ALTER VIEW commands while sending them to workers.
2022-05-20 12:18:14 +03:00
Burak Velioglu 0cf769c43a Introduce CREATE/DROP VIEW
Adds support for propagating create/drop view commands and views to
worker nodes while scaling out the cluster. Since views are dropped while
converting the table type, the metadata connection will be used while
propagating view commands so as not to switch to sequential mode.
2022-05-20 12:18:02 +03:00
Burak Velioglu 591f2565cc Use object address instead of relation id on DDLJob to decide on syncing metadata 2022-05-20 12:17:56 +03:00
Ahmet Gedemenli ddfcbfdca1 Add tests for materialized views 2022-05-20 12:17:48 +03:00
Ahmet Gedemenli 16071fac1d Add view tests to arbitrary configs 2022-05-20 12:17:41 +03:00
Onder Kalaci 9c4e3329f6 Rename metadata sync to node metadata sync where applicable 2022-05-19 11:00:51 +02:00
Onder Kalaci 36f641c586 Serialize reference table modifications with node changes & restore point
With Citus MX enabled, when a reference table is modified, it does
some operations on the first worker node (e.g., acquires locks).

If node metadata is locked (via add node or create restore point),
the changes to the reference tables should be blocked.
2022-05-19 11:00:51 +02:00
Onder Kalaci 5fe384329e Adds "sync" option to citus_disable_node() UDF 2022-05-19 11:00:51 +02:00
Marco Slot c20732142e Add a run_command_on_coordinator function 2022-05-19 10:41:10 +02:00
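A minimal usage sketch; the command text is hypothetical and the shape of the result set is not shown here:

```sql
-- run a query on the coordinator, e.g. from a worker node with metadata
SELECT run_command_on_coordinator($$SELECT count(*) FROM pg_dist_node$$);
```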
Marco Slot 082a14656d Fix downgrade scripts and add new downgrade tests 2022-05-19 10:37:56 +02:00
Marco Slot 33dede5b75 Add a citus_is_coordinator function 2022-05-19 10:36:22 +02:00
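Usage is a simple boolean check:

```sql
SELECT citus_is_coordinator();  -- true on the coordinator, false on worker nodes
```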
Nils Dijk 5e4c0e4bea
Merge pull request #5931 from citusdata/refactor/dedupe-object-propagation
Refactor: reduce complexity and code duplication for Object Propagation
2022-05-18 18:06:24 +02:00
Ahmet Gedemenli c2d9e88bf5
Fix schema name bug for sequences (#5937) 2022-05-18 17:29:30 +02:00
Ahmet Gedemenli 88369b6b23
Merge pull request #5934 from citusdata/fix-alter-statistics-nspname
Fix alter statistics namespace name
2022-05-18 17:29:30 +02:00
Onder Kalaci b7a39a232d Refrain from reading the metadata cache for all tables during upgrade
First, it is not needed. Second, in the past we had issues regarding
this: https://github.com/citusdata/citus/pull/4344

When I create 10k tables, ~120K shards, this saves
40Mb of memory during ALTER EXTENSION citus UPDATE.

Before the change:  MetadataCacheMemoryContext: 41943040 ~ 40MB
After the change:  MetadataCacheMemoryContext: 8192

(cherry picked from commit f193e16a01)
2022-05-06 13:53:43 +02:00
Marco Slot e8b41d1e5b Convert citus.hide_shards_from_app_name_prefixes to citus.show_shards_for_app_name_prefixes 2022-05-05 13:24:23 +02:00
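A hedged sketch of the renamed GUC; the prefix value is hypothetical:

```sql
-- make shard relations visible to sessions whose application_name starts with 'psql'
SET citus.show_shards_for_app_name_prefixes TO 'psql';
```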
Onder Kalaci b4a65b9c45 Do not set coordinator's metadatasynced column to false
after a disable_node call

(cherry picked from commit 5fc7661169)
2022-04-25 09:35:00 +02:00
Onder Kalaci 6ca3478c8d Do not assign distributed transaction ids for local execution
In the past, for all modifications on the local execution,
we enabled 2PC (with 6a7ed7b309).

This also required us to enable coordinated transactions
via https://github.com/citusdata/citus/pull/4831 .

However, it does have a very substantial impact on the
distributed deadlock detection. The distributed deadlock
detection is designed to avoid single-statement transactions
because they cannot lead to any actual deadlocks.

The implementation is to skip backends that have no distributed
transaction assigned. Now that single statement local executions
are assigned distributed transaction ids and show up in the lock graphs, we are
conflicting with the design of distributed deadlock
detection.

In general, we should fix it. However, one might
think that it is not a big deal: even if the processes
show up in the lock graphs, the deadlock detection
should not be causing any false positives. That is
false, unless https://github.com/citusdata/citus/issues/1803
is fixed. Now that local processes are considered as a single
distributed backend, the lock graphs might find:

    local execution 1 [tx id: 1] -> any local process [tx id: 0]
    any local process [tx id: 0] -> local execution 2 [tx id: 2]

And, decides that there is a distributed deadlock.

This commit is:
   (a) the right thing to do, as local execution should not need any
       distributed tx id
   (b) eliminates performance issues that might come up when
       deadlock detection does a lot of unnecessary checks
   (c) after moving local execution after the remote execution
       via https://github.com/citusdata/citus/pull/4301, the
       vague requirement for assigning distributed tx ids is
       already gone.

(cherry picked from commit a2debe0f02)
2022-04-25 09:34:32 +02:00
Hanefi Onaldi 86df61cae8
Bump Citus to 11.0.1_beta 2022-04-11 16:09:11 +03:00
Hanefi Onaldi e20a6dcd78
Add changelog entries for 11.0.1_beta
(cherry picked from commit 3ec1fc48fc)
2022-04-11 16:08:16 +03:00
Burak Velioglu 6eed51b75c
Create function in transaction according to create object propagation guc
(cherry picked from commit 5d9599f964)
2022-04-11 13:01:14 +03:00
Nils Dijk 675ba65f22
Implement DOMAIN propagation for citus 2022-04-08 16:18:02 +02:00
Marco Slot d611a50a80 Allow adding a unique constraint with an index 2022-04-07 16:41:10 +02:00
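A sketch of the now-allowed pattern, with hypothetical names; on a distributed table the unique index still has to include the distribution column:

```sql
CREATE UNIQUE INDEX CONCURRENTLY products_pid_idx ON products (product_id);
ALTER TABLE products
  ADD CONSTRAINT products_pid_unique UNIQUE USING INDEX products_pid_idx;
```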
Marco Slot c5797030de Fix EXPLAIN ANALYZE JSON format for subplans 2022-04-07 16:00:12 +02:00
Marco Slot a74d991445 Handle user-defined type parameters in EXPLAIN ANALYZE 2022-04-07 11:37:43 +02:00
Marco Slot cb9e510e40 Add TABLESAMPLE support 2022-04-01 16:48:29 +02:00
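For illustration (hypothetical table name):

```sql
-- sample roughly 1% of the rows of a distributed table
SELECT count(*) FROM page_visits TABLESAMPLE BERNOULLI (1);
```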
Onder Kalaci e336b92552 Only hide shards from client backends and pg bg workers
The aim of hiding shards is to hide shards from client applications.

Certain bg workers (such as pg_cron or the Citus maintenance daemon)
should be treated like client applications because users can run
queries from such bg workers. And, these bg workers should follow
the same application_name checks as client backends.

Certain other bg workers, such as logical replication or postgres'
parallel workers, should never hide shards. They are internal
operations.

Similarly the other backend types like the walsender or
checkpointer or autovacuum should never hide shards.

(cherry picked from commit 9043a1ed3f)
2022-03-30 17:44:03 +02:00
Hanefi Onaldi 4784d5579b
Bump Citus to 11.0.0_beta 2022-03-24 16:17:47 +03:00
1934 changed files with 50805 additions and 232283 deletions

.circleci/config.yml Normal file

@ -0,0 +1,808 @@
version: 2.1
orbs:
codecov: codecov/codecov@1.1.1
azure-cli: circleci/azure-cli@1.0.0
parameters:
image_suffix:
type: string
default: '-vb4dd087'
pg13_version:
type: string
default: '13.4'
pg14_version:
type: string
default: '14.0'
upgrade_pg_versions:
type: string
default: '13.4-14.0'
style_checker_tools_version:
type: string
default: '0.8.18'
jobs:
build:
description: Build the citus extension
parameters:
pg_major:
description: postgres major version building citus for
type: integer
image:
description: docker image to use for the build
type: string
default: citus/extbuilder
image_tag:
description: tag to use for the docker image
type: string
docker:
- image: '<< parameters.image >>:<< parameters.image_tag >><< pipeline.parameters.image_suffix >>'
steps:
- checkout
- run:
name: 'Configure, Build, and Install'
command: |
./ci/build-citus.sh
- persist_to_workspace:
root: .
paths:
- build-<< parameters.pg_major >>/*
- install-<<parameters.pg_major >>.tar
check-style:
docker:
- image: 'citus/stylechecker:<< pipeline.parameters.style_checker_tools_version >><< pipeline.parameters.image_suffix >>'
steps:
- checkout
- run:
name: 'Check Style'
command: citus_indent --check
- run:
name: 'Fix whitespace'
command: ci/editorconfig.sh && git diff --exit-code
- run:
name: 'Remove useless declarations'
command: ci/remove_useless_declarations.sh && git diff --cached --exit-code
- run:
name: 'Normalize test output'
command: ci/normalize_expected.sh && git diff --exit-code
- run:
name: 'Check for C-style comments in migration files'
command: ci/disallow_c_comments_in_migrations.sh && git diff --exit-code
- run:
name: 'Check for comments that start with # character in spec files'
command: ci/disallow_hash_comments_in_spec_files.sh && git diff --exit-code
- run:
name: 'Check for gitignore entries for source files'
command: ci/fix_gitignore.sh && git diff --exit-code
- run:
name: 'Check for lengths of changelog entries'
command: ci/disallow_long_changelog_entries.sh
- run:
name: 'Check for banned C API usage'
command: ci/banned.h.sh
- run:
name: 'Check for tests missing in schedules'
command: ci/check_all_tests_are_run.sh
- run:
name: 'Check if all CI scripts are actually run'
command: ci/check_all_ci_scripts_are_run.sh
- run:
name: 'Check if all GUCs are sorted alphabetically'
command: ci/check_gucs_are_alphabetically_sorted.sh
check-sql-snapshots:
docker:
- image: 'citus/extbuilder:latest'
steps:
- checkout
- run:
name: 'Check Snapshots'
command: ci/check_sql_snapshots.sh
test-pg-upgrade:
description: Runs postgres upgrade tests
parameters:
old_pg_major:
description: 'postgres major version to use before the upgrade'
type: integer
new_pg_major:
description: 'postgres major version to upgrade to'
type: integer
image:
description: 'docker image to use as for the tests'
type: string
default: citus/pgupgradetester
image_tag:
description: 'docker image tag to use'
type: string
default: 12-13
docker:
- image: '<< parameters.image >>:<< parameters.image_tag >>-vabaecad'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install Extension'
command: |
tar xfv "${CIRCLE_WORKING_DIRECTORY}/install-<< parameters.old_pg_major >>.tar" --directory /
tar xfv "${CIRCLE_WORKING_DIRECTORY}/install-<< parameters.new_pg_major >>.tar" --directory /
- run:
name: 'Configure'
command: |
chown -R circleci .
gosu circleci ./configure
- run:
name: 'Enable core dumps'
command: |
ulimit -c unlimited
- run:
name: 'Install and test postgres upgrade'
command: |
gosu circleci \
make -C src/test/regress \
check-pg-upgrade \
old-bindir=/usr/lib/postgresql/<< parameters.old_pg_major >>/bin \
new-bindir=/usr/lib/postgresql/<< parameters.new_pg_major >>/bin
no_output_timeout: 2m
- run:
name: 'Regressions'
command: |
if [ -f "src/test/regress/regression.diffs" ]; then
cat src/test/regress/regression.diffs
exit 1
fi
when: on_fail
- run:
name: 'Copy coredumps'
command: |
mkdir -p /tmp/core_dumps
if ls core.* 1> /dev/null 2>&1; then
cp core.* /tmp/core_dumps
fi
when: on_fail
- run:
name: 'Copy pg_upgrade logs for newData dir'
command: |
mkdir -p /tmp/pg_upgrade_newData_logs
if ls src/test/regress/tmp_upgrade/newData/*.log 1> /dev/null 2>&1; then
cp src/test/regress/tmp_upgrade/newData/*.log /tmp/pg_upgrade_newData_logs
fi
when: on_fail
- store_artifacts:
name: 'Save regressions'
path: src/test/regress/regression.diffs
- store_artifacts:
name: 'Save core dumps'
path: /tmp/core_dumps
- store_artifacts:
name: 'Save pg_upgrade logs for newData dir'
path: /tmp/pg_upgrade_newData_logs
- codecov/upload:
flags: 'test_<< parameters.old_pg_major >>_<< parameters.new_pg_major >>,upgrade'
test-arbitrary-configs:
description: Runs tests on arbitrary configs
parallelism: 6
parameters:
pg_major:
description: 'postgres major version to use'
type: integer
image:
description: 'docker image to use as for the tests'
type: string
default: citus/failtester
image_tag:
description: 'docker image tag to use'
type: string
default: 12-13
docker:
- image: '<< parameters.image >>:<< parameters.image_tag >><< pipeline.parameters.image_suffix >>'
resource_class: xlarge
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install Extension'
command: |
tar xfv "${CIRCLE_WORKING_DIRECTORY}/install-<< parameters.pg_major >>.tar" --directory /
- run:
name: 'Configure'
command: |
chown -R circleci .
gosu circleci ./configure
- run:
name: 'Enable core dumps'
command: |
ulimit -c unlimited
- run:
name: 'Test arbitrary configs'
command: |
TESTS=$(src/test/regress/citus_tests/print_test_names.py | circleci tests split)
# Our test suite expects comma separated values
TESTS=$(echo $TESTS | tr ' ' ',')
# TESTS will contain subset of configs that will be run on a container and we use multiple containers
# to run the test suite
gosu circleci \
make -C src/test/regress \
check-arbitrary-configs parallel=4 CONFIGS=$TESTS
no_output_timeout: 2m
- run:
name: 'Show regressions'
command: |
find src/test/regress/tmp_citus_test/ -name "regression*.diffs" -exec cat {} +
lines=$(find src/test/regress/tmp_citus_test/ -name "regression*.diffs" | wc -l)
if [ $lines -ne 0 ]; then
exit 1
fi
when: on_fail
- run:
name: 'Copy logfiles'
command: |
mkdir src/test/regress/tmp_citus_test/logfiles
find src/test/regress/tmp_citus_test/ -name "logfile_*" -exec cp -t src/test/regress/tmp_citus_test/logfiles/ {} +
when: on_fail
- run:
name: 'Copy coredumps'
command: |
mkdir -p /tmp/core_dumps
if ls core.* 1> /dev/null 2>&1; then
cp core.* /tmp/core_dumps
fi
when: on_fail
- store_artifacts:
name: 'Save core dumps'
path: /tmp/core_dumps
- store_artifacts:
name: 'Save logfiles'
path: src/test/regress/tmp_citus_test/logfiles
- codecov/upload:
flags: 'test_<< parameters.pg_major >>,upgrade'
test-citus-upgrade:
description: Runs citus upgrade tests
parameters:
pg_major:
description: 'postgres major version'
type: integer
image:
description: 'docker image to use as for the tests'
type: string
default: citus/citusupgradetester
image_tag:
description: 'docker image tag to use'
type: string
docker:
- image: '<< parameters.image >>:<< parameters.image_tag >><< pipeline.parameters.image_suffix >>'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Configure'
command: |
chown -R circleci .
gosu circleci ./configure
- run:
name: 'Enable core dumps'
command: |
ulimit -c unlimited
- run:
name: 'Install and test citus upgrade'
command: |
# run make check-citus-upgrade for all citus versions
# the image has ${CITUS_VERSIONS} set with all versions it contains the binaries of
for citus_version in ${CITUS_VERSIONS}; do \
gosu circleci \
make -C src/test/regress \
check-citus-upgrade \
bindir=/usr/lib/postgresql/${PG_MAJOR}/bin \
citus-old-version=${citus_version} \
citus-pre-tar=/install-pg${PG_MAJOR}-citus${citus_version}.tar \
citus-post-tar=/home/circleci/project/install-$PG_MAJOR.tar; \
done;
# run make check-citus-upgrade-mixed for all citus versions
# the image has ${CITUS_VERSIONS} set with all versions it contains the binaries of
for citus_version in ${CITUS_VERSIONS}; do \
gosu circleci \
make -C src/test/regress \
check-citus-upgrade-mixed \
citus-old-version=${citus_version} \
bindir=/usr/lib/postgresql/${PG_MAJOR}/bin \
citus-pre-tar=/install-pg${PG_MAJOR}-citus${citus_version}.tar \
citus-post-tar=/home/circleci/project/install-$PG_MAJOR.tar; \
done;
no_output_timeout: 2m
- run:
name: 'Regressions'
command: |
if [ -f "src/test/regress/regression.diffs" ]; then
cat src/test/regress/regression.diffs
exit 1
fi
when: on_fail
- run:
name: 'Copy coredumps'
command: |
mkdir -p /tmp/core_dumps
if ls core.* 1> /dev/null 2>&1; then
cp core.* /tmp/core_dumps
fi
when: on_fail
- store_artifacts:
name: 'Save regressions'
path: src/test/regress/regression.diffs
- store_artifacts:
name: 'Save core dumps'
path: /tmp/core_dumps
- codecov/upload:
flags: 'test_<< parameters.pg_major >>,upgrade'
test-citus:
description: Runs the common tests of citus
parameters:
pg_major:
description: 'postgres major version'
type: integer
image:
description: 'docker image to use as for the tests'
type: string
default: citus/exttester
image_tag:
description: 'docker image tag to use'
type: string
make:
description: 'make target'
type: string
docker:
- image: '<< parameters.image >>:<< parameters.image_tag >><< pipeline.parameters.image_suffix >>'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install Extension'
command: |
tar xfv "${CIRCLE_WORKING_DIRECTORY}/install-${PG_MAJOR}.tar" --directory /
- run:
name: 'Configure'
command: |
chown -R circleci .
gosu circleci ./configure
- run:
name: 'Enable core dumps'
command: |
ulimit -c unlimited
- run:
name: 'Run Test'
command: |
gosu circleci make -C src/test/regress << parameters.make >>
no_output_timeout: 2m
- run:
name: 'Regressions'
command: |
if [ -f "src/test/regress/regression.diffs" ]; then
cat src/test/regress/regression.diffs
exit 1
fi
when: on_fail
- run:
name: 'Copy coredumps'
command: |
mkdir -p /tmp/core_dumps
if ls core.* 1> /dev/null 2>&1; then
cp core.* /tmp/core_dumps
fi
when: on_fail
- store_artifacts:
name: 'Save regressions'
path: src/test/regress/regression.diffs
- store_artifacts:
name: 'Save mitmproxy output (failure test specific)'
path: src/test/regress/proxy.output
- store_artifacts:
name: 'Save results'
path: src/test/regress/results/
- store_artifacts:
name: 'Save core dumps'
path: /tmp/core_dumps
- codecov/upload:
flags: 'test_<< parameters.pg_major >>,<< parameters.make >>'
when: always
tap-test-citus:
description: Runs tap tests for citus
parameters:
pg_major:
description: 'postgres major version'
type: integer
image:
description: 'docker image to use as for the tests'
type: string
default: citus/exttester
image_tag:
description: 'docker image tag to use'
type: string
suite:
description: 'name of the tap test suite to run'
type: string
make:
description: 'make target'
type: string
default: installcheck
docker:
- image: '<< parameters.image >>:<< parameters.image_tag >><< pipeline.parameters.image_suffix >>'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install Extension'
command: |
tar xfv "${CIRCLE_WORKING_DIRECTORY}/install-${PG_MAJOR}.tar" --directory /
- run:
name: 'Configure'
command: |
chown -R circleci .
gosu circleci ./configure
- run:
name: 'Enable core dumps'
command: |
ulimit -c unlimited
- run:
name: 'Run Test'
command: |
gosu circleci make -C src/test/<< parameters.suite >> << parameters.make >>
no_output_timeout: 2m
- run:
name: 'Copy coredumps'
command: |
mkdir -p /tmp/core_dumps
if ls core.* 1> /dev/null 2>&1; then
cp core.* /tmp/core_dumps
fi
when: on_fail
- store_artifacts:
name: 'Save tap logs'
path: /home/circleci/project/src/test/<< parameters.suite >>/tmp_check/log
- store_artifacts:
name: 'Save core dumps'
path: /tmp/core_dumps
- codecov/upload:
flags: 'test_<< parameters.pg_major >>,tap_<< parameters.suite >>_<< parameters.make >>'
when: always
check-merge-to-enterprise:
docker:
- image: citus/extbuilder:<< pipeline.parameters.pg13_version >>
working_directory: /home/circleci/project
steps:
- checkout
- run:
command: |
ci/check_enterprise_merge.sh
ch_benchmark:
docker:
- image: buildpack-deps:stretch
working_directory: /home/circleci/project
steps:
- checkout
- azure-cli/install
- azure-cli/login-with-service-principal
- run:
command: |
cd ./src/test/hammerdb
sh run_hammerdb.sh citusbot_ch_benchmark_rg
name: install dependencies and run ch_benchmark tests
no_output_timeout: 20m
tpcc_benchmark:
docker:
- image: buildpack-deps:stretch
working_directory: /home/circleci/project
steps:
- checkout
- azure-cli/install
- azure-cli/login-with-service-principal
- run:
command: |
cd ./src/test/hammerdb
sh run_hammerdb.sh citusbot_tpcc_benchmark_rg
name: install dependencies and run ch_benchmark tests
no_output_timeout: 20m
workflows:
version: 2
build_and_test:
jobs:
- build:
name: build-13
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
- build:
name: build-14
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
- check-style
- check-sql-snapshots
- test-citus:
name: 'test-13_check-multi'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-multi
requires: [build-13]
- test-citus:
name: 'test-13_check-multi-1'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-multi-1
requires: [build-13]
- test-citus:
name: 'test-13_check-mx'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-multi-mx
requires: [build-13]
- test-citus:
name: 'test-13_check-vanilla'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-vanilla
requires: [build-13]
- test-citus:
name: 'test-13_check-isolation'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-isolation
requires: [build-13]
- test-citus:
name: 'test-13_check-worker'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-worker
requires: [build-13]
- test-citus:
name: 'test-13_check-operations'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-operations
requires: [build-13]
- test-citus:
name: 'test-13_check-follower-cluster'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-follower-cluster
requires: [build-13]
- test-citus:
name: 'test-13_check-columnar'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-columnar
requires: [build-13]
- test-citus:
name: 'test-13_check-columnar-isolation'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-columnar-isolation
requires: [build-13]
- tap-test-citus:
name: 'test_13_tap-recovery'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
suite: recovery
requires: [build-13]
- tap-test-citus:
name: 'test-13_tap-columnar-freezing'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
suite: columnar_freezing
requires: [build-13]
- test-citus:
name: 'test-13_check-failure'
pg_major: 13
image: citus/failtester
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-failure
requires: [build-13]
- test-citus:
name: 'test-13_check-enterprise'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-enterprise
requires: [build-13]
- test-citus:
name: 'test-13_check-enterprise-isolation'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-enterprise-isolation
requires: [build-13]
- test-citus:
name: 'test-13_check-enterprise-isolation-logicalrep-1'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-enterprise-isolation-logicalrep-1
requires: [build-13]
- test-citus:
name: 'test-13_check-enterprise-isolation-logicalrep-2'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-enterprise-isolation-logicalrep-2
requires: [build-13]
- test-citus:
name: 'test-13_check-enterprise-isolation-logicalrep-3'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-enterprise-isolation-logicalrep-3
requires: [build-13]
- test-citus:
name: 'test-13_check-enterprise-failure'
pg_major: 13
image: citus/failtester
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-enterprise-failure
requires: [build-13]
- test-citus:
name: 'test-14_check-enterprise'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-enterprise
requires: [build-14]
- test-citus:
name: 'test-14_check-enterprise-isolation'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-enterprise-isolation
requires: [build-14]
- test-citus:
name: 'test-14_check-enterprise-isolation-logicalrep-1'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-enterprise-isolation-logicalrep-1
requires: [build-14]
- test-citus:
name: 'test-14_check-enterprise-isolation-logicalrep-2'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-enterprise-isolation-logicalrep-2
requires: [build-14]
- test-citus:
name: 'test-14_check-enterprise-isolation-logicalrep-3'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-enterprise-isolation-logicalrep-3
requires: [build-14]
- test-citus:
name: 'test-14_check-enterprise-failure'
pg_major: 14
image: citus/failtester
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-enterprise-failure
requires: [build-14]
- test-citus:
name: 'test-14_check-multi'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-multi
requires: [build-14]
- test-citus:
name: 'test-14_check-multi-1'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-multi-1
requires: [build-14]
- test-citus:
name: 'test-14_check-mx'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-multi-mx
requires: [build-14]
- test-citus:
name: 'test-14_check-vanilla'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-vanilla
requires: [build-14]
- test-citus:
name: 'test-14_check-isolation'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-isolation
requires: [build-14]
- test-citus:
name: 'test-14_check-worker'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-worker
requires: [build-14]
- test-citus:
name: 'test-14_check-operations'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-operations
requires: [build-14]
- test-citus:
name: 'test-14_check-follower-cluster'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-follower-cluster
requires: [build-14]
- test-citus:
name: 'test-14_check-columnar'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-columnar
requires: [build-14]
- test-citus:
name: 'test-14_check-columnar-isolation'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-columnar-isolation
requires: [build-14]
- tap-test-citus:
name: 'test_14_tap-recovery'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
suite: recovery
requires: [build-14]
- tap-test-citus:
name: 'test-14_tap-columnar-freezing'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
suite: columnar_freezing
requires: [build-14]
- test-citus:
name: 'test-14_check-failure'
pg_major: 14
image: citus/failtester
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-failure
requires: [build-14]
- test-arbitrary-configs:
name: 'test-13_check-arbitrary-configs'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
requires: [build-13]
- test-arbitrary-configs:
name: 'test-14_check-arbitrary-configs'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
requires: [build-14]
- test-pg-upgrade:
name: 'test-13-14_check-pg-upgrade'
old_pg_major: 13
new_pg_major: 14
image_tag: '<< pipeline.parameters.upgrade_pg_versions >>'
requires: [build-13, build-14]
- test-citus-upgrade:
name: test-13_check-citus-upgrade
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
requires: [build-13]
- ch_benchmark:
requires: [build-13]
filters:
branches:
only:
- /ch_benchmark\/.*/ # match with ch_benchmark/ prefix
- tpcc_benchmark:
requires: [build-13]
filters:
branches:
only:
- /tpcc_benchmark\/.*/ # match with tpcc_benchmark/ prefix


@ -1,7 +0,0 @@
exclude_patterns:
- "src/backend/distributed/utils/citus_outfuncs.c"
- "src/backend/distributed/deparser/ruleutils_*.c"
- "src/include/distributed/citus_nodes.h"
- "src/backend/distributed/safeclib"
- "src/backend/columnar/safeclib"
- "**/vendor/"


@ -1,33 +0,0 @@
# gdbpg.py contains scripts to nicely print the postgres datastructures
# while in a gdb session. Since the vscode debugger is based on gdb this
# actually also works when debugging with vscode. Providing nice tools
# to understand the internal datastructures we are working with.
source /root/gdbpg.py
# when debugging postgres it is convenient to _always_ have a breakpoint
# trigger when an error is logged. Because .gdbinit is sourced before gdb
# is fully attached and has the sources loaded, we temporarily set breakpoint
# pending to on to make sure the breakpoint is added when the library is loaded.
# After we have added our breakpoint we revert back to the default
# configuration for breakpoint pending.
# The breakpoint is hard to read, but at entry of the function we don't have
# the level loaded in elevel. Instead we hardcode the location where the
# level of the current error is stored. Also gdb doesn't understand the
# ERROR symbol so we hardcode this to the value of ERROR. It is very unlikely
# this value will ever change in postgres, but if it does we might need to
# find a way to conditionally load the correct breakpoint.
set breakpoint pending on
break elog.c:errfinish if errordata[errordata_stack_depth].elevel == 21
set breakpoint pending auto
echo \n
echo ----------------------------------------------------------------------------------\n
echo when attaching to a postgres backend a breakpoint will be set on elog.c:errfinish \n
echo it will only break on errors being raised in postgres \n
echo \n
echo to disable this breakpoint from vscode run `-exec disable 1` in the debug console \n
echo this assumes it's the first breakpoint loaded as it is loaded from .gdbinit \n
echo this can be verified with `-exec info break`, enabling can be done with \n
echo `-exec enable 1` \n
echo ----------------------------------------------------------------------------------\n
echo \n


@ -1 +0,0 @@
postgresql-*.tar.bz2


@ -1,7 +0,0 @@
\timing on
\pset linestyle unicode
\pset border 2
\setenv PAGER 'pspg --no-mouse -bX --no-commandbar --no-topbar'
\set HISTSIZE 100000
\set PROMPT1 '\n%[%033[1m%]%M %n@%/:%> (PID: %p)%R%[%033[0m%]%# '
\set PROMPT2 ' '


@ -1,12 +0,0 @@
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"
[packages]
docopt = "*"
[dev-packages]
[requires]
python_version = "3.9"

.devcontainer/.vscode/Pipfile.lock generated vendored

@ -1,28 +0,0 @@
{
"_meta": {
"hash": {
"sha256": "6956a6700ead5804aa56bd597c93bb4a13f208d2d49d3b5399365fd240ca0797"
},
"pipfile-spec": 6,
"requires": {
"python_version": "3.9"
},
"sources": [
{
"name": "pypi",
"url": "https://pypi.org/simple",
"verify_ssl": true
}
]
},
"default": {
"docopt": {
"hashes": [
"sha256:49b3a825280bd66b3aa83585ef59c4a8c82f2c8a522dbe754a8bc8d08c85c491"
],
"index": "pypi",
"version": "==0.6.2"
}
},
"develop": {}
}


@ -1,84 +0,0 @@
#! /usr/bin/env pipenv-shebang
"""Generate C/C++ properties file for VSCode.
Uses pgenv to iterate postgres versions and generate
a C/C++ properties file for VSCode containing the
include paths for the postgres headers.
Usage:
generate_c_cpp_properties-json.py <target_path>
generate_c_cpp_properties-json.py (-h | --help)
generate_c_cpp_properties-json.py --version
Options:
-h --help Show this screen.
--version Show version.
"""
import json
import subprocess
from docopt import docopt
def main(args):
    target_path = args['<target_path>']

    output = subprocess.check_output(['pgenv', 'versions'])
    # typical output is:
    # 14.8 pgsql-14.8
    # * 15.3 pgsql-15.3
    # 16beta2 pgsql-16beta2
    # where the line marked with a * is the currently active version
    #
    # we are only interested in the first word of each line, which is the version number
    # thus we strip the whitespace and the * from the line and split it into words
    # and take the first word
    versions = [line.strip('* ').split()[0] for line in output.decode('utf-8').splitlines()]

    # create the list of configurations per version
    configurations = []
    for version in versions:
        configurations.append(generate_configuration(version))

    # create the json file
    c_cpp_properties = {
        "configurations": configurations,
        "version": 4
    }

    # write the c_cpp_properties.json file
    with open(target_path, 'w') as f:
        json.dump(c_cpp_properties, f, indent=4)


def generate_configuration(version):
    """Returns a configuration for the given postgres version.

    >>> generate_configuration('14.8')
    {
        "name": "Citus Development Configuration - Postgres 14.8",
        "includePath": [
            "/usr/local/include",
            "/home/citus/.pgenv/src/postgresql-14.8/src/**",
            "${workspaceFolder}/**",
            "${workspaceFolder}/src/include/",
        ],
        "configurationProvider": "ms-vscode.makefile-tools"
    }
    """
    return {
        "name": f"Citus Development Configuration - Postgres {version}",
        "includePath": [
            "/usr/local/include",
            f"/home/citus/.pgenv/src/postgresql-{version}/src/**",
            "${workspaceFolder}/**",
            "${workspaceFolder}/src/include/",
        ],
        "configurationProvider": "ms-vscode.makefile-tools"
    }


if __name__ == '__main__':
    arguments = docopt(__doc__, version='0.1.0')
    main(arguments)


@ -1,40 +0,0 @@
{
"version": "0.2.0",
"configurations": [
{
"name": "Attach Citus (devcontainer)",
"type": "cppdbg",
"request": "attach",
"processId": "${command:pickProcess}",
"program": "/home/citus/.pgenv/pgsql/bin/postgres",
"additionalSOLibSearchPath": "/home/citus/.pgenv/pgsql/lib",
"setupCommands": [
{
"text": "handle SIGUSR1 noprint nostop pass",
"description": "let gdb not stop when SIGUSR1 is sent to process",
"ignoreFailures": true
}
],
},
{
"name": "Open core file",
"type": "cppdbg",
"request": "launch",
"program": "/home/citus/.pgenv/pgsql/bin/postgres",
"coreDumpPath": "${input:corefile}",
"cwd": "${workspaceFolder}",
"MIMode": "gdb",
}
],
"inputs": [
{
"id": "corefile",
"type": "command",
"command": "extension.commandvariable.file.pickFile",
"args": {
"dialogTitle": "Select core file",
"include": "**/core*",
},
},
],
}


@ -1,222 +0,0 @@
FROM ubuntu:22.04 AS base
# environment is to make python pass an interactive shell, probably not the best timezone given a wide variety of colleagues
ENV TZ=UTC
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
# install build tools
RUN apt update && apt install -y \
bison \
bzip2 \
cpanminus \
curl \
docbook-xml \
docbook-xsl \
flex \
gcc \
git \
libcurl4-gnutls-dev \
libicu-dev \
libkrb5-dev \
liblz4-dev \
libpam0g-dev \
libreadline-dev \
libselinux1-dev \
libssl-dev \
libxml2-utils \
libxslt-dev \
libzstd-dev \
locales \
make \
perl \
pkg-config \
python3 \
python3-pip \
software-properties-common \
sudo \
uuid-dev \
valgrind \
xsltproc \
zlib1g-dev \
&& add-apt-repository ppa:deadsnakes/ppa -y \
&& apt install -y \
python3.9-full \
# software properties pulls in pkexec, which makes the debugger unusable in vscode
&& apt purge -y \
software-properties-common \
&& apt autoremove -y \
&& apt clean
RUN sudo pip3 install pipenv pipenv-shebang
RUN cpanm install IPC::Run
RUN locale-gen en_US.UTF-8
# add the citus user to sudoers and allow all sudoers to login without a password prompt
RUN useradd -ms /bin/bash citus \
&& usermod -aG sudo citus \
&& echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
WORKDIR /home/citus
USER citus
# run all make commands with the number of cores available
RUN echo "export MAKEFLAGS=\"-j \$(nproc)\"" >> "/home/citus/.bashrc"
RUN git clone --branch v1.3.2 --depth 1 https://github.com/theory/pgenv.git .pgenv
COPY --chown=citus:citus pgenv/config/ .pgenv/config/
ENV PATH="/home/citus/.pgenv/bin:${PATH}"
ENV PATH="/home/citus/.pgenv/pgsql/bin:${PATH}"
USER citus
# build postgres versions separately for effective parallelism and caching of already built versions when changing only certain versions
FROM base AS pg15
RUN MAKEFLAGS="-j $(nproc)" pgenv build 15.13
RUN rm .pgenv/src/*.tar*
RUN make -C .pgenv/src/postgresql-*/ clean
RUN make -C .pgenv/src/postgresql-*/src/include install
# create a staging directory with all files we want to copy from our pgenv build
# we will copy the contents of the staged folder into the final image at once
RUN mkdir .pgenv-staging/
RUN cp -r .pgenv/src .pgenv/pgsql-* .pgenv/config .pgenv-staging/
RUN rm .pgenv-staging/config/default.conf
FROM base AS pg16
RUN MAKEFLAGS="-j $(nproc)" pgenv build 16.9
RUN rm .pgenv/src/*.tar*
RUN make -C .pgenv/src/postgresql-*/ clean
RUN make -C .pgenv/src/postgresql-*/src/include install
# create a staging directory with all files we want to copy from our pgenv build
# we will copy the contents of the staged folder into the final image at once
RUN mkdir .pgenv-staging/
RUN cp -r .pgenv/src .pgenv/pgsql-* .pgenv/config .pgenv-staging/
RUN rm .pgenv-staging/config/default.conf
FROM base AS pg17
RUN MAKEFLAGS="-j $(nproc)" pgenv build 17.5
RUN rm .pgenv/src/*.tar*
RUN make -C .pgenv/src/postgresql-*/ clean
RUN make -C .pgenv/src/postgresql-*/src/include install
# create a staging directory with all files we want to copy from our pgenv build
# we will copy the contents of the staged folder into the final image at once
RUN mkdir .pgenv-staging/
RUN cp -r .pgenv/src .pgenv/pgsql-* .pgenv/config .pgenv-staging/
RUN rm .pgenv-staging/config/default.conf
FROM base AS uncrustify-builder
RUN sudo apt update && sudo apt install -y cmake tree
WORKDIR /uncrustify
RUN curl -L https://github.com/uncrustify/uncrustify/archive/uncrustify-0.68.1.tar.gz | tar xz
WORKDIR /uncrustify/uncrustify-uncrustify-0.68.1/
RUN mkdir build
WORKDIR /uncrustify/uncrustify-uncrustify-0.68.1/build/
RUN cmake ..
RUN MAKEFLAGS="-j $(nproc)" make -s
RUN make install DESTDIR=/uncrustify
# builder for all pipenv's to get them contained in a single layer
FROM base AS pipenv
WORKDIR /workspaces/citus/
# tools to sync pgenv with vscode
COPY --chown=citus:citus .vscode/Pipfile .vscode/Pipfile.lock .devcontainer/.vscode/
RUN ( cd .devcontainer/.vscode && pipenv install )
# environment to run our failure tests
COPY --chown=citus:citus src/ src/
RUN ( cd src/test/regress && pipenv install )
# assemble the final container by copying over the artifacts from separately built containers
FROM base AS devcontainer
LABEL org.opencontainers.image.source=https://github.com/citusdata/citus
LABEL org.opencontainers.image.description="Development container for the Citus project"
LABEL org.opencontainers.image.licenses=AGPL-3.0-only
RUN yes | sudo unminimize
# install developer productivity tools
RUN sudo apt update \
&& sudo apt install -y \
autoconf2.69 \
bash-completion \
fswatch \
gdb \
htop \
libdbd-pg-perl \
libdbi-perl \
lsof \
man \
net-tools \
psmisc \
pspg \
tree \
vim \
&& sudo apt clean
# Since gdb will run in the context of the root user when debugging citus, we need to both
# download the gdbpg.py script as the root user into root's home directory, and add .gdbinit
# as a file owned by root.
# This ensures that as soon as the debugger attaches to a postgres backend (or frankly any other process)
# the gdbpg.py script is sourced and the developer can directly use it.
RUN sudo curl -o /root/gdbpg.py https://raw.githubusercontent.com/tvesely/gdbpg/6065eee7872457785f830925eac665aa535caf62/gdbpg.py
COPY --chown=root:root .gdbinit /root/
# install developer dependencies in the global environment
RUN --mount=type=bind,source=requirements.txt,target=requirements.txt pip install -r requirements.txt
# for persistent bash history across devcontainers we need to have
# a) a directory to store the history in
# b) a prompt command to append the history to the file
# c) specify the history file to store the history in
# b and c are done in the .bashrc to make it persistent across shells only
RUN sudo install -d -o citus -g citus /commandhistory \
&& echo "export PROMPT_COMMAND='history -a' && export HISTFILE=/commandhistory/.bash_history" >> "/home/citus/.bashrc"
# install citus-dev
RUN git clone --branch develop https://github.com/citusdata/tools.git citus-tools \
&& ( cd citus-tools/citus_dev && pipenv install ) \
&& mkdir -p ~/.local/bin \
&& ln -s /home/citus/citus-tools/citus_dev/citus_dev-pipenv .local/bin/citus_dev \
&& sudo make -C citus-tools/uncrustify install bindir=/usr/local/bin pkgsysconfdir=/usr/local/etc/ \
&& mkdir -p ~/.local/share/bash-completion/completions/ \
&& ln -s ~/citus-tools/citus_dev/bash_completion ~/.local/share/bash-completion/completions/citus_dev
# TODO some LC_ALL errors, possibly solved by locale-gen
RUN git clone https://github.com/so-fancy/diff-so-fancy.git \
&& mkdir -p ~/.local/bin \
&& ln -s /home/citus/diff-so-fancy/diff-so-fancy .local/bin/
COPY --link --from=uncrustify-builder /uncrustify/usr/ /usr/
COPY --link --from=pg15 /home/citus/.pgenv-staging/ /home/citus/.pgenv/
COPY --link --from=pg16 /home/citus/.pgenv-staging/ /home/citus/.pgenv/
COPY --link --from=pg17 /home/citus/.pgenv-staging/ /home/citus/.pgenv/
COPY --link --from=pipenv /home/citus/.local/share/virtualenvs/ /home/citus/.local/share/virtualenvs/
# place to run your cluster with citus_dev
VOLUME /data
RUN sudo mkdir /data \
&& sudo chown citus:citus /data
COPY --chown=citus:citus .psqlrc .
# with the copy linking of layers, GitHub Actions seems to misbehave with the ownership of the
# directories leading up to the link, hence a small patch layer to set the right ownerships
RUN sudo chown --from=root:root citus:citus -R ~
# sets default pg version
RUN pgenv switch 17.5
# make connecting to the coordinator easy
ENV PGPORT=9700


@ -1,11 +0,0 @@
init: ../.vscode/c_cpp_properties.json ../.vscode/launch.json
../.vscode:
mkdir -p ../.vscode
../.vscode/launch.json: ../.vscode .vscode/launch.json
cp .vscode/launch.json ../.vscode/launch.json
../.vscode/c_cpp_properties.json: ../.vscode
./.vscode/generate_c_cpp_properties-json.py ../.vscode/c_cpp_properties.json


@ -1,37 +0,0 @@
{
"image": "ghcr.io/citusdata/citus-devcontainer:main",
"runArgs": [
"--cap-add=SYS_PTRACE",
"--ulimit=core=-1",
],
"forwardPorts": [
9700
],
"customizations": {
"vscode": {
"extensions": [
"eamodio.gitlens",
"GitHub.copilot-chat",
"GitHub.copilot",
"github.vscode-github-actions",
"github.vscode-pull-request-github",
"ms-vscode.cpptools-extension-pack",
"ms-vsliveshare.vsliveshare",
"rioj7.command-variable",
],
"settings": {
"files.exclude": {
"**/*.o": true,
"**/.deps/": true,
}
},
}
},
"mounts": [
"type=volume,target=/data",
"source=citus-bashhistory,target=/commandhistory,type=volume",
],
"updateContentCommand": "./configure",
"postCreateCommand": "make -C .devcontainer/",
}


@ -1,15 +0,0 @@
PGENV_MAKE_OPTIONS=(-s)
PGENV_CONFIGURE_OPTIONS=(
--enable-debug
--enable-depend
--enable-cassert
--enable-tap-tests
'CFLAGS=-ggdb -Og -g3 -fno-omit-frame-pointer -DUSE_VALGRIND'
--with-openssl
--with-libxml
--with-libxslt
--with-uuid=e2fs
--with-icu
--with-lz4
)


@ -1,9 +0,0 @@
black==23.11.0
click==8.1.7
isort==5.12.0
mypy-extensions==1.0.0
packaging==23.2
pathspec==0.11.2
platformdirs==4.0.0
tomli==2.0.1
typing_extensions==4.8.0


@ -1,28 +0,0 @@
[[source]]
name = "pypi"
url = "https://pypi.python.org/simple"
verify_ssl = true
[packages]
mitmproxy = {editable = true, ref = "main", git = "https://github.com/citusdata/mitmproxy.git"}
construct = "*"
docopt = "==0.6.2"
cryptography = ">=41.0.4"
pytest = "*"
psycopg = "*"
filelock = "*"
pytest-asyncio = "*"
pytest-timeout = "*"
pytest-xdist = "*"
pytest-repeat = "*"
pyyaml = "*"
werkzeug = "==2.3.7"
[dev-packages]
black = "*"
isort = "*"
flake8 = "*"
flake8-bugbear = "*"
[requires]
python_version = "3.9"

File diff suppressed because it is too large.


@ -17,7 +17,13 @@ trim_trailing_whitespace = true
insert_final_newline = unset
trim_trailing_whitespace = unset
[*.{sql,sh,py,toml}]
# Don't change test/regress/output directory, this needs to be a separate rule
# for some reason
[/src/test/regress/output/**]
insert_final_newline = unset
trim_trailing_whitespace = unset
[*.{sql,sh,py}]
indent_style = space
indent_size = 4
tab_width = 4


@ -1,7 +0,0 @@
[flake8]
# E203 is ignored for black
extend-ignore = E203
# black will usually truncate lines to 88 characters, but it might keep long
# string literals. That's fine in most cases unless it gets really excessive.
max-line-length = 150
exclude = .git,__pycache__,vendor,tmp_*

.gitattributes vendored (9)

@ -16,6 +16,7 @@ README.* conflict-marker-size=32
# Test output files that contain extra whitespace
*.out -whitespace
src/test/regress/output/*.source -whitespace
# These files are maintained or generated elsewhere. We take them as is.
configure -whitespace
@ -25,9 +26,11 @@ configure -whitespace
# except these exceptions...
src/backend/distributed/utils/citus_outfuncs.c -citus-style
src/backend/distributed/deparser/ruleutils_15.c -citus-style
src/backend/distributed/deparser/ruleutils_16.c -citus-style
src/backend/distributed/deparser/ruleutils_17.c -citus-style
src/backend/distributed/utils/pg11_snprintf.c -citus-style
src/backend/distributed/deparser/ruleutils_11.c -citus-style
src/backend/distributed/deparser/ruleutils_12.c -citus-style
src/backend/distributed/deparser/ruleutils_13.c -citus-style
src/backend/distributed/deparser/ruleutils_14.c -citus-style
src/backend/distributed/commands/index_pg_source.c -citus-style
src/include/distributed/citus_nodes.h -citus-style


@ -1,23 +0,0 @@
name: 'Parallelization matrix'
inputs:
count:
required: false
default: 32
outputs:
json:
value: ${{ steps.generate_matrix.outputs.json }}
runs:
using: "composite"
steps:
- name: Generate parallelization matrix
id: generate_matrix
shell: bash
run: |-
json_array="{\"include\": ["
for ((i = 1; i <= ${{ inputs.count }}; i++)); do
json_array+="{\"id\":\"$i\"},"
done
json_array=${json_array%,}
json_array+=" ]}"
echo "json=$json_array" >> "$GITHUB_OUTPUT"
echo "json=$json_array"


@ -1,38 +0,0 @@
name: save_logs_and_results
inputs:
folder:
required: false
default: "log"
runs:
using: composite
steps:
- uses: actions/upload-artifact@v4.6.0
name: Upload logs
with:
name: ${{ inputs.folder }}
if-no-files-found: ignore
path: |
src/test/**/proxy.output
src/test/**/results/
src/test/**/tmp_check/master/log
src/test/**/tmp_check/worker.57638/log
src/test/**/tmp_check/worker.57637/log
src/test/**/*.diffs
src/test/**/out/ddls.sql
src/test/**/out/queries.sql
src/test/**/logfile_*
/tmp/pg_upgrade_newData_logs
- name: Publish regression.diffs
run: |-
diffs="$(find src/test/regress -name "*.diffs" -exec cat {} \;)"
if ! [ -z "$diffs" ]; then
echo '```diff' >> $GITHUB_STEP_SUMMARY
echo -E "$diffs" >> $GITHUB_STEP_SUMMARY
echo '```' >> $GITHUB_STEP_SUMMARY
echo -E $diffs
fi
shell: bash
- name: Print stack traces
run: "./ci/print_stack_trace.sh"
if: failure()
shell: bash


@ -1,35 +0,0 @@
name: setup_extension
inputs:
pg_major:
required: false
skip_installation:
required: false
default: false
type: boolean
runs:
using: composite
steps:
- name: Expose $PG_MAJOR to Github Env
run: |-
if [ -z "${{ inputs.pg_major }}" ]; then
echo "PG_MAJOR=${PG_MAJOR}" >> $GITHUB_ENV
else
echo "PG_MAJOR=${{ inputs.pg_major }}" >> $GITHUB_ENV
fi
shell: bash
- uses: actions/download-artifact@v4.1.8
with:
name: build-${{ env.PG_MAJOR }}
- name: Install Extension
if: ${{ inputs.skip_installation == 'false' }}
run: tar xfv "install-$PG_MAJOR.tar" --directory /
shell: bash
- name: Configure
run: |-
chown -R circleci .
git config --global --add safe.directory ${GITHUB_WORKSPACE}
gosu circleci ./configure --without-pg-version-check
shell: bash
- name: Enable core dumps
run: ulimit -c unlimited
shell: bash


@ -1,27 +0,0 @@
name: coverage
inputs:
flags:
required: false
codecov_token:
required: true
runs:
using: composite
steps:
- uses: codecov/codecov-action@v3
with:
flags: ${{ inputs.flags }}
token: ${{ inputs.codecov_token }}
verbose: true
gcov: true
- name: Create codeclimate coverage
run: |-
lcov --directory . --capture --output-file lcov.info
lcov --remove lcov.info -o lcov.info '/usr/*'
sed "s=^SF:$PWD/=SF:=g" -i lcov.info # relative pats are required by codeclimate
mkdir -p /tmp/codeclimate
cc-test-reporter format-coverage -t lcov -o /tmp/codeclimate/${{ inputs.flags }}.json lcov.info
shell: bash
- uses: actions/upload-artifact@v4.6.0
with:
path: "/tmp/codeclimate/*.json"
name: codeclimate-${{ inputs.flags }}


@ -1,18 +1,3 @@
#!/bin/bash
set -ex
# Function to get the OS version
get_rpm_os_version() {
if [[ -f /etc/centos-release ]]; then
cat /etc/centos-release | awk '{print $4}'
elif [[ -f /etc/oracle-release ]]; then
cat /etc/oracle-release | awk '{print $5}'
else
echo "Unknown"
fi
}
package_type=${1}
# Since $HOME is set in GH_Actions as /github/home, pyenv fails to create virtualenvs.
@ -25,27 +10,11 @@ pyenv versions
pyenv virtualenv ${PACKAGING_PYTHON_VERSION} packaging_env
pyenv activate packaging_env
git clone -b v0.8.27 --depth=1 https://github.com/citusdata/tools.git tools
git clone -b v0.8.25 --depth=1 https://github.com/citusdata/tools.git tools
python3 -m pip install -r tools/packaging_automation/requirements.txt
echo "Package type: ${package_type}"
echo "OS version: $(get_rpm_os_version)"
# For RHEL 7, we need to install urllib3<2 due to below execution error
# ImportError: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl'
# module is compiled with 'OpenSSL 1.0.2k-fips 26 Jan 2017'.
# See: https://github.com/urllib3/urllib3/issues/2168
if [[ ${package_type} == "rpm" && $(get_rpm_os_version) == 7* ]]; then
python3 -m pip uninstall -y urllib3
python3 -m pip install 'urllib3<2'
fi
python3 -m tools.packaging_automation.validate_build_output --output_file output.log \
--ignore_file .github/packaging/packaging_ignore.yml \
--package_type ${package_type}
pyenv deactivate
# Set $HOME back to /github/home
export HOME=${GITHUB_HOME}
# Print the output to the console
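The diff above boils down to a $HOME swap around the pyenv calls; a minimal sketch of that pattern (assuming, as the comments say, that pyenv lives under `/root` while GitHub Actions sets `$HOME` to `/github/home`):

```bash
#!/bin/bash
set -ex
# Remember the GitHub Actions home, then point $HOME at root's home so pyenv
# can find its installation and create virtualenvs.
GITHUB_HOME="${HOME}"
export HOME=/root

pyenv virtualenv "${PACKAGING_PYTHON_VERSION}" packaging_env
pyenv activate packaging_env
# ... run the packaging validation here ...
pyenv deactivate

# Restore $HOME for the rest of the job.
export HOME="${GITHUB_HOME}"
```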


@ -1,545 +0,0 @@
name: Build & Test
run-name: Build & Test - ${{ github.event.pull_request.title || github.ref_name }}
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
on:
workflow_dispatch:
inputs:
skip_test_flakyness:
required: false
default: false
type: boolean
push:
branches:
- "main"
- "release-*"
pull_request:
types: [opened, reopened,synchronize]
merge_group:
jobs:
# Since GHA does not interpolate env variables in matrix context, we need to
# define them in a separate job and use them in other jobs.
params:
runs-on: ubuntu-latest
name: Initialize parameters
outputs:
build_image_name: "ghcr.io/citusdata/extbuilder"
test_image_name: "ghcr.io/citusdata/exttester"
citusupgrade_image_name: "ghcr.io/citusdata/citusupgradetester"
fail_test_image_name: "ghcr.io/citusdata/failtester"
pgupgrade_image_name: "ghcr.io/citusdata/pgupgradetester"
style_checker_image_name: "ghcr.io/citusdata/stylechecker"
style_checker_tools_version: "0.8.18"
sql_snapshot_pg_version: "17.5"
image_suffix: "-dev-d28f316"
pg15_version: '{ "major": "15", "full": "15.13" }'
pg16_version: '{ "major": "16", "full": "16.9" }'
pg17_version: '{ "major": "17", "full": "17.5" }'
upgrade_pg_versions: "15.13-16.9-17.5"
steps:
# Since GHA jobs need at least one step we use a noop step here.
- name: Set up parameters
run: echo 'noop'
check-sql-snapshots:
needs: params
runs-on: ubuntu-latest
container:
image: ${{ needs.params.outputs.build_image_name }}:${{ needs.params.outputs.sql_snapshot_pg_version }}${{ needs.params.outputs.image_suffix }}
options: --user root
steps:
- uses: actions/checkout@v4
- name: Check Snapshots
run: |
git config --global --add safe.directory ${GITHUB_WORKSPACE}
ci/check_sql_snapshots.sh
check-style:
needs: params
runs-on: ubuntu-latest
container:
image: ${{ needs.params.outputs.style_checker_image_name }}:${{ needs.params.outputs.style_checker_tools_version }}${{ needs.params.outputs.image_suffix }}
steps:
- name: Check Snapshots
run: |
git config --global --add safe.directory ${GITHUB_WORKSPACE}
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Check C Style
run: citus_indent --check
- name: Check Python style
run: black --check .
- name: Check Python import order
run: isort --check .
- name: Check Python lints
run: flake8 .
- name: Fix whitespace
run: ci/editorconfig.sh && git diff --exit-code
- name: Remove useless declarations
run: ci/remove_useless_declarations.sh && git diff --cached --exit-code
- name: Sort and group includes
run: ci/sort_and_group_includes.sh && git diff --exit-code
- name: Normalize test output
run: ci/normalize_expected.sh && git diff --exit-code
- name: Check for C-style comments in migration files
run: ci/disallow_c_comments_in_migrations.sh && git diff --exit-code
- name: 'Check for comments that start with # character in spec files'
run: ci/disallow_hash_comments_in_spec_files.sh && git diff --exit-code
- name: Check for gitignore entries for source files
run: ci/fix_gitignore.sh && git diff --exit-code
- name: Check for lengths of changelog entries
run: ci/disallow_long_changelog_entries.sh
- name: Check for banned C API usage
run: ci/banned.h.sh
- name: Check for tests missing in schedules
run: ci/check_all_tests_are_run.sh
- name: Check if all CI scripts are actually run
run: ci/check_all_ci_scripts_are_run.sh
- name: Check if all GUCs are sorted alphabetically
run: ci/check_gucs_are_alphabetically_sorted.sh
- name: Check for missing downgrade scripts
run: ci/check_migration_files.sh
build:
needs: params
name: Build for PG${{ fromJson(matrix.pg_version).major }}
strategy:
fail-fast: false
matrix:
image_name:
- ${{ needs.params.outputs.build_image_name }}
image_suffix:
- ${{ needs.params.outputs.image_suffix}}
pg_version:
- ${{ needs.params.outputs.pg15_version }}
- ${{ needs.params.outputs.pg16_version }}
- ${{ needs.params.outputs.pg17_version }}
runs-on: ubuntu-latest
container:
image: "${{ matrix.image_name }}:${{ fromJson(matrix.pg_version).full }}${{ matrix.image_suffix }}"
options: --user root
steps:
- uses: actions/checkout@v4
- name: Expose $PG_MAJOR to Github Env
run: echo "PG_MAJOR=${PG_MAJOR}" >> $GITHUB_ENV
shell: bash
- name: Build
run: "./ci/build-citus.sh"
shell: bash
- uses: actions/upload-artifact@v4.6.0
with:
name: build-${{ env.PG_MAJOR }}
path: |-
./build-${{ env.PG_MAJOR }}/*
./install-${{ env.PG_MAJOR }}.tar
test-citus:
name: PG${{ fromJson(matrix.pg_version).major }} - ${{ matrix.make }}
strategy:
fail-fast: false
matrix:
suite:
- regress
image_name:
- ${{ needs.params.outputs.test_image_name }}
pg_version:
- ${{ needs.params.outputs.pg15_version }}
- ${{ needs.params.outputs.pg16_version }}
- ${{ needs.params.outputs.pg17_version }}
make:
- check-split
- check-multi
- check-multi-1
- check-multi-mx
- check-vanilla
- check-isolation
- check-operations
- check-follower-cluster
- check-columnar
- check-columnar-isolation
- check-enterprise
- check-enterprise-isolation
- check-enterprise-isolation-logicalrep-1
- check-enterprise-isolation-logicalrep-2
- check-enterprise-isolation-logicalrep-3
include:
- make: check-failure
pg_version: ${{ needs.params.outputs.pg15_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
- make: check-failure
pg_version: ${{ needs.params.outputs.pg16_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
- make: check-failure
pg_version: ${{ needs.params.outputs.pg17_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
- make: check-enterprise-failure
pg_version: ${{ needs.params.outputs.pg15_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
- make: check-enterprise-failure
pg_version: ${{ needs.params.outputs.pg16_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
- make: check-enterprise-failure
pg_version: ${{ needs.params.outputs.pg17_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
- make: check-pytest
pg_version: ${{ needs.params.outputs.pg15_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
- make: check-pytest
pg_version: ${{ needs.params.outputs.pg16_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
- make: check-pytest
pg_version: ${{ needs.params.outputs.pg17_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
- make: installcheck
suite: cdc
image_name: ${{ needs.params.outputs.test_image_name }}
pg_version: ${{ needs.params.outputs.pg15_version }}
- make: installcheck
suite: cdc
image_name: ${{ needs.params.outputs.test_image_name }}
pg_version: ${{ needs.params.outputs.pg16_version }}
- make: installcheck
suite: cdc
image_name: ${{ needs.params.outputs.test_image_name }}
pg_version: ${{ needs.params.outputs.pg17_version }}
- make: check-query-generator
pg_version: ${{ needs.params.outputs.pg15_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
- make: check-query-generator
pg_version: ${{ needs.params.outputs.pg16_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
- make: check-query-generator
pg_version: ${{ needs.params.outputs.pg17_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
runs-on: ubuntu-latest
container:
image: "${{ matrix.image_name }}:${{ fromJson(matrix.pg_version).full }}${{ needs.params.outputs.image_suffix }}"
options: --user root --dns=8.8.8.8
# Because GitHub creates a default network for each job, we need to use
# --dns= to have DNS settings similar to our other CI systems and local
# machines. Otherwise, we may see different results.
needs:
- params
- build
steps:
- uses: actions/checkout@v4
- uses: "./.github/actions/setup_extension"
- name: Run Test
run: gosu circleci make -C src/test/${{ matrix.suite }} ${{ matrix.make }}
timeout-minutes: 20
- uses: "./.github/actions/save_logs_and_results"
if: always()
with:
folder: ${{ fromJson(matrix.pg_version).major }}_${{ matrix.make }}
- uses: "./.github/actions/upload_coverage"
if: always()
with:
flags: ${{ env.PG_MAJOR }}_${{ matrix.suite }}_${{ matrix.make }}
codecov_token: ${{ secrets.CODECOV_TOKEN }}
test-arbitrary-configs:
name: PG${{ fromJson(matrix.pg_version).major }} - check-arbitrary-configs-${{ matrix.parallel }}
runs-on: ["self-hosted", "1ES.Pool=1es-gha-citusdata-pool"]
container:
image: "${{ matrix.image_name }}:${{ fromJson(matrix.pg_version).full }}${{ needs.params.outputs.image_suffix }}"
options: --user root
needs:
- params
- build
strategy:
fail-fast: false
matrix:
image_name:
- ${{ needs.params.outputs.fail_test_image_name }}
pg_version:
- ${{ needs.params.outputs.pg15_version }}
- ${{ needs.params.outputs.pg16_version }}
- ${{ needs.params.outputs.pg17_version }}
parallel: [0,1,2,3,4,5] # workaround for running 6 parallel jobs
steps:
- uses: actions/checkout@v4
- uses: "./.github/actions/setup_extension"
- name: Test arbitrary configs
run: |-
# we use parallel jobs to split the tests into 6 parts and run them in parallel
# the script below extracts the tests for the current job
N=6 # Total number of jobs (see matrix.parallel)
X=${{ matrix.parallel }} # Current job number
TESTS=$(src/test/regress/citus_tests/print_test_names.py |
tr '\n' ',' | awk -v N="$N" -v X="$X" -F, '{
split("", parts)
for (i = 1; i <= NF; i++) {
parts[i % N] = parts[i % N] $i ","
}
print substr(parts[X], 1, length(parts[X])-1)
}')
echo $TESTS
gosu circleci \
make -C src/test/regress \
check-arbitrary-configs parallel=4 CONFIGS=$TESTS
- uses: "./.github/actions/save_logs_and_results"
if: always()
with:
folder: ${{ env.PG_MAJOR }}_arbitrary_configs_${{ matrix.parallel }}
- uses: "./.github/actions/upload_coverage"
if: always()
with:
flags: ${{ env.PG_MAJOR }}_arbitrary_configs_${{ matrix.parallel }}
codecov_token: ${{ secrets.CODECOV_TOKEN }}
test-pg-upgrade:
name: PG${{ matrix.old_pg_major }}-PG${{ matrix.new_pg_major }} - check-pg-upgrade
runs-on: ubuntu-latest
container:
image: "${{ needs.params.outputs.pgupgrade_image_name }}:${{ needs.params.outputs.upgrade_pg_versions }}${{ needs.params.outputs.image_suffix }}"
options: --user root
needs:
- params
- build
strategy:
fail-fast: false
matrix:
include:
- old_pg_major: 15
new_pg_major: 16
- old_pg_major: 16
new_pg_major: 17
- old_pg_major: 15
new_pg_major: 17
env:
old_pg_major: ${{ matrix.old_pg_major }}
new_pg_major: ${{ matrix.new_pg_major }}
steps:
- uses: actions/checkout@v4
- uses: "./.github/actions/setup_extension"
with:
pg_major: "${{ env.old_pg_major }}"
- uses: "./.github/actions/setup_extension"
with:
pg_major: "${{ env.new_pg_major }}"
- name: Install and test postgres upgrade
run: |-
gosu circleci \
make -C src/test/regress \
check-pg-upgrade \
old-bindir=/usr/lib/postgresql/${{ env.old_pg_major }}/bin \
new-bindir=/usr/lib/postgresql/${{ env.new_pg_major }}/bin
- name: Copy pg_upgrade logs for newData dir
run: |-
mkdir -p /tmp/pg_upgrade_newData_logs
if ls src/test/regress/tmp_upgrade/newData/*.log 1> /dev/null 2>&1; then
cp src/test/regress/tmp_upgrade/newData/*.log /tmp/pg_upgrade_newData_logs
fi
if: failure()
- uses: "./.github/actions/save_logs_and_results"
if: always()
with:
folder: ${{ env.old_pg_major }}_${{ env.new_pg_major }}_upgrade
- uses: "./.github/actions/upload_coverage"
if: always()
with:
flags: ${{ env.old_pg_major }}_${{ env.new_pg_major }}_upgrade
codecov_token: ${{ secrets.CODECOV_TOKEN }}
test-citus-upgrade:
name: PG${{ fromJson(needs.params.outputs.pg15_version).major }} - check-citus-upgrade
runs-on: ubuntu-latest
container:
image: "${{ needs.params.outputs.citusupgrade_image_name }}:${{ fromJson(needs.params.outputs.pg15_version).full }}${{ needs.params.outputs.image_suffix }}"
options: --user root
needs:
- params
- build
steps:
- uses: actions/checkout@v4
- uses: "./.github/actions/setup_extension"
with:
skip_installation: true
- name: Install and test citus upgrade
run: |-
# run make check-citus-upgrade for all citus versions
# the image has ${CITUS_VERSIONS} set with all versions it contains the binaries of
for citus_version in ${CITUS_VERSIONS}; do \
gosu circleci \
make -C src/test/regress \
check-citus-upgrade \
bindir=/usr/lib/postgresql/${PG_MAJOR}/bin \
citus-old-version=${citus_version} \
citus-pre-tar=/install-pg${PG_MAJOR}-citus${citus_version}.tar \
citus-post-tar=${GITHUB_WORKSPACE}/install-$PG_MAJOR.tar; \
done;
# run make check-citus-upgrade-mixed for all citus versions
# the image has ${CITUS_VERSIONS} set with all versions it contains the binaries of
for citus_version in ${CITUS_VERSIONS}; do \
gosu circleci \
make -C src/test/regress \
check-citus-upgrade-mixed \
citus-old-version=${citus_version} \
bindir=/usr/lib/postgresql/${PG_MAJOR}/bin \
citus-pre-tar=/install-pg${PG_MAJOR}-citus${citus_version}.tar \
citus-post-tar=${GITHUB_WORKSPACE}/install-$PG_MAJOR.tar; \
done;
- uses: "./.github/actions/save_logs_and_results"
if: always()
with:
folder: ${{ env.PG_MAJOR }}_citus_upgrade
- uses: "./.github/actions/upload_coverage"
if: always()
with:
flags: ${{ env.PG_MAJOR }}_citus_upgrade
codecov_token: ${{ secrets.CODECOV_TOKEN }}
upload-coverage:
if: always()
env:
CC_TEST_REPORTER_ID: ${{ secrets.CC_TEST_REPORTER_ID }}
runs-on: ubuntu-latest
container:
image: ${{ needs.params.outputs.test_image_name }}:${{ fromJson(needs.params.outputs.pg17_version).full }}${{ needs.params.outputs.image_suffix }}
needs:
- params
- test-citus
- test-arbitrary-configs
- test-citus-upgrade
- test-pg-upgrade
steps:
- uses: actions/download-artifact@v4.1.8
with:
pattern: codeclimate*
path: codeclimate
merge-multiple: true
- name: Upload coverage results to Code Climate
run: |-
cc-test-reporter sum-coverage codeclimate/*.json -o total.json
cc-test-reporter upload-coverage -i total.json
ch_benchmark:
name: CH Benchmark
if: startsWith(github.ref, 'refs/heads/ch_benchmark/')
runs-on: ubuntu-latest
needs:
- build
steps:
- uses: actions/checkout@v4
- uses: azure/login@v1
with:
creds: ${{ secrets.AZURE_CREDENTIALS }}
- name: install dependencies and run ch_benchmark tests
uses: azure/CLI@v1
with:
inlineScript: |
cd ./src/test/hammerdb
chmod +x run_hammerdb.sh
run_hammerdb.sh citusbot_ch_benchmark_rg
tpcc_benchmark:
name: TPCC Benchmark
if: startsWith(github.ref, 'refs/heads/tpcc_benchmark/')
runs-on: ubuntu-latest
needs:
- build
steps:
- uses: actions/checkout@v4
- uses: azure/login@v1
with:
creds: ${{ secrets.AZURE_CREDENTIALS }}
- name: install dependencies and run tpcc_benchmark tests
uses: azure/CLI@v1
with:
inlineScript: |
cd ./src/test/hammerdb
chmod +x run_hammerdb.sh
run_hammerdb.sh citusbot_tpcc_benchmark_rg
prepare_parallelization_matrix_32:
name: Prepare parallelization matrix
if: ${{ needs.test-flakyness-pre.outputs.tests != ''}}
needs: test-flakyness-pre
runs-on: ubuntu-latest
outputs:
json: ${{ steps.parallelization.outputs.json }}
steps:
- uses: actions/checkout@v4
- uses: "./.github/actions/parallelization"
id: parallelization
with:
count: 32
test-flakyness-pre:
name: Detect regression tests that need to be run
if: ${{ !inputs.skip_test_flakyness }}
runs-on: ubuntu-latest
needs: build
outputs:
tests: ${{ steps.detect-regression-tests.outputs.tests }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Detect regression tests that need to be run
id: detect-regression-tests
run: |-
detected_changes=$(git diff origin/main... --name-only --diff-filter=AM | (grep 'src/test/regress/sql/.*\.sql\|src/test/regress/spec/.*\.spec\|src/test/regress/citus_tests/test/test_.*\.py' || true))
tests=${detected_changes}
# split the tests to be skipped --today we only skip upgrade tests
skipped_tests=""
not_skipped_tests=""
for test in $tests; do
if [[ $test =~ ^src/test/regress/sql/upgrade_ ]]; then
skipped_tests="$skipped_tests $test"
else
not_skipped_tests="$not_skipped_tests $test"
fi
done
if [ ! -z "$skipped_tests" ]; then
echo "Skipped tests " $skipped_tests
fi
if [ -z "$not_skipped_tests" ]; then
echo "Not detected any tests that flaky test detection should run"
else
echo "Detected tests " $not_skipped_tests
fi
echo 'tests<<EOF' >> $GITHUB_OUTPUT
echo "$not_skipped_tests" >> "$GITHUB_OUTPUT"
echo 'EOF' >> $GITHUB_OUTPUT
test-flakyness:
if: ${{ needs.test-flakyness-pre.outputs.tests != ''}}
name: Test flakyness
runs-on: ubuntu-latest
container:
image: ${{ needs.params.outputs.fail_test_image_name }}:${{ fromJson(needs.params.outputs.pg17_version).full }}${{ needs.params.outputs.image_suffix }}
options: --user root
env:
runs: 8
needs:
- params
- build
- test-flakyness-pre
- prepare_parallelization_matrix_32
strategy:
fail-fast: false
matrix: ${{ fromJson(needs.prepare_parallelization_matrix_32.outputs.json) }}
steps:
- uses: actions/checkout@v4
- uses: actions/download-artifact@v4.1.8
- uses: "./.github/actions/setup_extension"
- name: Run minimal tests
run: |-
tests="${{ needs.test-flakyness-pre.outputs.tests }}"
tests_array=($tests)
for test in "${tests_array[@]}"
do
test_name=$(echo "$test" | sed -r "s/.+\/(.+)\..+/\1/")
gosu circleci src/test/regress/citus_tests/run_test.py $test_name --repeat ${{ env.runs }} --use-whole-schedule-line
done
shell: bash
- uses: "./.github/actions/save_logs_and_results"
if: always()
with:
folder: test_flakyness_parallel_${{ matrix.id }}


@ -1,79 +0,0 @@
name: "CodeQL"
on:
schedule:
- cron: '59 23 * * 6'
workflow_dispatch:
jobs:
analyze:
name: Analyze
runs-on: ubuntu-22.04
permissions:
actions: read
contents: read
security-events: write
strategy:
fail-fast: false
matrix:
language: [ 'cpp', 'python']
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
with:
languages: ${{ matrix.language }}
- name: Install package dependencies
run: |
# Create the file repository configuration:
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main 15" > /etc/apt/sources.list.d/pgdg.list'
# Import the repository signing key:
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt-get update
sudo apt-get install -y --no-install-recommends \
autotools-dev \
build-essential \
ca-certificates \
curl \
debhelper \
devscripts \
fakeroot \
flex \
libcurl4-openssl-dev \
libdistro-info-perl \
libedit-dev \
libfile-fcntllock-perl \
libicu-dev \
libkrb5-dev \
liblz4-1 \
liblz4-dev \
libpam0g-dev \
libreadline-dev \
libselinux1-dev \
libssl-dev \
libxslt-dev \
libzstd-dev \
libzstd1 \
lintian \
postgresql-server-dev-15 \
postgresql-server-dev-all \
python3-pip \
python3-setuptools \
wget \
zlib1g-dev
- name: Configure, Build and Install Citus
if: matrix.language == 'cpp'
run: |
./configure
make -sj8
sudo make install-all
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3


@ -1,54 +0,0 @@
name: "Build devcontainer"
# Since building containers can be quite time consuming and takes up some storage,
# there is no need to finish a build for a tag if new changes are concurrently being made.
# This cancels any previous builds for the same tag, and only the latest one will be kept.
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
on:
push:
paths:
- ".devcontainer/**"
workflow_dispatch:
jobs:
docker:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
attestations: write
id-token: write
steps:
-
name: Docker meta
id: meta
uses: docker/metadata-action@v5
with:
images: |
ghcr.io/citusdata/citus-devcontainer
tags: |
type=ref,event=branch
type=sha
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
-
name: 'Login to GitHub Container Registry'
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{github.actor}}
password: ${{secrets.GITHUB_TOKEN}}
-
name: Build and push
uses: docker/build-push-action@v5
with:
context: "{{defaultContext}}:.devcontainer"
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha
cache-to: type=gha,mode=max


@ -1,79 +0,0 @@
name: Flaky test debugging
run-name: Flaky test debugging - ${{ inputs.flaky_test }} (${{ inputs.flaky_test_runs_per_job }}x${{ inputs.flaky_test_parallel_jobs }})
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
on:
workflow_dispatch:
inputs:
flaky_test:
required: true
type: string
description: Test to run
flaky_test_runs_per_job:
required: false
default: 8
type: number
description: Number of times to run the test
flaky_test_parallel_jobs:
required: false
default: 32
type: number
description: Number of parallel jobs to run
jobs:
build:
name: Build Citus
runs-on: ubuntu-latest
container:
image: ${{ vars.build_image_name }}:${{ vars.pg15_version }}${{ vars.image_suffix }}
options: --user root
steps:
- uses: actions/checkout@v4
- name: Configure, Build, and Install
run: |
echo "PG_MAJOR=${PG_MAJOR}" >> $GITHUB_ENV
./ci/build-citus.sh
shell: bash
- uses: actions/upload-artifact@v4.6.0
with:
name: build-${{ env.PG_MAJOR }}
path: |-
./build-${{ env.PG_MAJOR }}/*
./install-${{ env.PG_MAJOR }}.tar
prepare_parallelization_matrix:
name: Prepare parallelization matrix
runs-on: ubuntu-latest
outputs:
json: ${{ steps.parallelization.outputs.json }}
steps:
- uses: actions/checkout@v4
- uses: "./.github/actions/parallelization"
id: parallelization
with:
count: ${{ inputs.flaky_test_parallel_jobs }}
test_flakyness:
name: Test flakyness
runs-on: ubuntu-latest
container:
image: ${{ vars.fail_test_image_name }}:${{ vars.pg15_version }}${{ vars.image_suffix }}
options: --user root
needs:
[build, prepare_parallelization_matrix]
env:
test: "${{ inputs.flaky_test }}"
runs: "${{ inputs.flaky_test_runs_per_job }}"
skip: false
strategy:
fail-fast: false
matrix: ${{ fromJson(needs.prepare_parallelization_matrix.outputs.json) }}
steps:
- uses: actions/checkout@v4
- uses: "./.github/actions/setup_extension"
- name: Run minimal tests
run: |-
gosu circleci src/test/regress/citus_tests/run_test.py ${{ env.test }} --repeat ${{ env.runs }} --use-whole-schedule-line
shell: bash
- uses: "./.github/actions/save_logs_and_results"
if: always()
with:
folder: check_flakyness_parallel_${{ matrix.id }}


@ -1,16 +1,11 @@
name: Build tests in packaging images
on:
pull_request:
types: [opened, reopened,synchronize]
merge_group:
push:
branches: "**"
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
get_postgres_versions_from_file:
@ -19,22 +14,20 @@ jobs:
pg_versions: ${{ steps.get-postgres-versions.outputs.pg_versions }}
steps:
- name: Checkout
uses: actions/checkout@v4
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Get Postgres Versions
id: get-postgres-versions
run: |
set -euxo pipefail
# Postgres versions are stored in .github/workflows/build_and_test.yml
# file in json strings with major and full keys.
# Below command extracts the versions and gets the unique values.
pg_versions=$(cat .github/workflows/build_and_test.yml | grep -oE '"major": "[0-9]+", "full": "[0-9.]+"' | sed -E 's/"major": "([0-9]+)", "full": "([0-9.]+)"/\1/g' | sort | uniq | tr '\n', ',')
# Postgres versions are stored in .circleci/config.yml file in "build-[pg-version] format. Below command
# extracts the versions and get the unique values.
pg_versions=`grep -Eo 'build-[[:digit:]]{2}' .circleci/config.yml|sed -e "s/^build-//"|sort|uniq|tr '\n' ','| head -c -1`
pg_versions_array="[ ${pg_versions} ]"
echo "Supported PG Versions: ${pg_versions_array}"
# Below line is needed to set the output variable to be used in the next job
echo "pg_versions=${pg_versions_array}" >> $GITHUB_OUTPUT
shell: bash
rpm_build_tests:
name: rpm_build_tests
needs: get_postgres_versions_from_file
@ -47,8 +40,10 @@ jobs:
# For this reason, we need to use a "matrix" to generate names of
# rpm images, e.g. citus/packaging:centos-7-pg12
packaging_docker_image:
- oraclelinux-7
- oraclelinux-8
- almalinux-8
- centos-7
- centos-8
- almalinux-9
POSTGRES_VERSION: ${{ fromJson(needs.get_postgres_versions_from_file.outputs.pg_versions) }}
@ -58,7 +53,7 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@v3
- name: Set Postgres and python parameters for rpm based distros
run: |
@ -78,18 +73,8 @@ jobs:
- name: Make
run: |
git config --global --add safe.directory ${GITHUB_WORKSPACE}
make CFLAGS="-Wno-missing-braces" -sj$(cat /proc/cpuinfo | grep "core id" | wc -l) 2>&1 | tee -a output.log
# Check the exit code of the make command
make_exit_code=${PIPESTATUS[0]}
# If the make command returned a non-zero exit code, exit with the same code
if [[ $make_exit_code -ne 0 ]]; then
echo "make command failed with exit code $make_exit_code"
exit $make_exit_code
fi
- name: Make install
run: |
make CFLAGS="-Wno-missing-braces" install 2>&1 | tee -a output.log
@ -100,6 +85,11 @@ jobs:
PACKAGING_DOCKER_IMAGE: ${{ matrix.packaging_docker_image }}
run: |
echo "Postgres version: ${POSTGRES_VERSION}"
## Install required packages to execute packaging tools for rpm based distros
yum install python3-pip python3-devel postgresql-devel -y
python3 -m pip install wheel
./.github/packaging/validate_build_output.sh "rpm"
deb_build_tests:
@ -116,10 +106,13 @@ jobs:
# for each deb based image and we use POSTGRES_VERSION to set
# PG_CONFIG variable in each of those runs.
packaging_docker_image:
- debian-buster-all
- debian-bookworm-all
- debian-bullseye-all
- ubuntu-bionic-all
- ubuntu-focal-all
- ubuntu-jammy-all
- ubuntu-kinetic-all
POSTGRES_VERSION: ${{ fromJson(needs.get_postgres_versions_from_file.outputs.pg_versions) }}
@ -129,7 +122,7 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@v3
- name: Set pg_config path and python parameters for deb based distros
run: |
@ -148,22 +141,9 @@ jobs:
make clean
- name: Make
shell: bash
run: |
set -e
git config --global --add safe.directory ${GITHUB_WORKSPACE}
make -sj$(cat /proc/cpuinfo | grep "core id" | wc -l) 2>&1 | tee -a output.log
# Check the exit code of the make command
make_exit_code=${PIPESTATUS[0]}
# If the make command returned a non-zero exit code, exit with the same code
if [[ $make_exit_code -ne 0 ]]; then
echo "make command failed with exit code $make_exit_code"
exit $make_exit_code
fi
- name: Make install
run: |
make install 2>&1 | tee -a output.log
@ -174,4 +154,8 @@ jobs:
PACKAGING_DOCKER_IMAGE: ${{ matrix.packaging_docker_image }}
run: |
echo "Postgres version: ${POSTGRES_VERSION}"
sudo apt-get update -y
## Install required packages to execute packaging tools for deb based distros
sudo apt-get purge -y python3-yaml
./.github/packaging/validate_build_output.sh "deb"

.gitignore vendored (7)

@ -38,9 +38,6 @@ lib*.pc
/Makefile.global
/src/Makefile.custom
/compile_commands.json
/src/backend/distributed/cdc/build-cdc-*/*
/src/test/cdc/tmp_check/*
# temporary files vim creates
*.swp
@ -54,7 +51,3 @@ lib*.pc
# style related temporary outputs
*.uncrustify
.venv
# added output when modifying check_gucs_are_alphabetically_sorted.sh
guc.out

File diff suppressed because it is too large.


@ -1,9 +0,0 @@
# Microsoft Open Source Code of Conduct
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
Resources:
- [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/)
- [Microsoft Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/)
- Contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with questions or concerns


@ -11,65 +11,8 @@ sign a Contributor License Agreement (CLA). For an explanation of
why we ask this as well as instructions for how to proceed, see the
[Microsoft CLA](https://cla.opensource.microsoft.com/).
### Devcontainer / GitHub Codespaces
The easiest way to start contributing is via our devcontainer. This container works both locally in Visual Studio Code with Docker Desktop/Docker for Mac and in [GitHub Codespaces](https://github.com/features/codespaces). To open the project in vscode you will need the [Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers). For Codespaces you will need to [create a new codespace](https://codespace.new/citusdata/citus).
With the extension installed you can run the following from the command palette to get started
```
> Dev Containers: Clone Repository in Container Volume...
```
In the subsequent popup paste the URL of the repo and hit enter.
```
https://github.com/citusdata/citus
```
This will create an isolated Workspace in vscode, complete with all tools required to build, test and run the Citus extension. We keep this container up to date with the supported postgres versions as well as the exact versions of tooling we use.
To get started quickly we suggest splitting your terminal once to have two shells: the left one in `/workspaces/citus`, the right one changed to `/data`. The left terminal will be used to interact with the project, the right one with a testing cluster.
To get Citus installed from source we run `make install -s` in the first terminal. Once installed you can start a Citus cluster in the second terminal via `citus_dev make citus`. The cluster runs in the background and can be interacted with via `citus_dev`; run `citus_dev` to get an overview of the available commands.
With the Citus cluster running you can connect to the coordinator in the first terminal via `psql -p9700`. Because the coordinator is the most common entrypoint, the `PGPORT` environment variable is set accordingly, so a plain `psql` will connect directly to the coordinator.
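Put together, a typical first session looks roughly like this (a sketch; `citus` is just an example cluster name for `citus_dev`):

```bash
# left terminal: build and install the extension from source
cd /workspaces/citus
make install -s

# right terminal: create and start a development cluster under /data
cd /data
citus_dev make citus

# left terminal again: PGPORT=9700 is preset, so plain psql reaches the coordinator
psql
```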
### Debugging in VS Code
1. Start Debugging: Press F5 in VS Code to start debugging. When prompted, you'll need to attach the debugger to the appropriate PostgreSQL process.
2. Identify the Process: If you're running a psql command, take note of the PID that appears in your psql prompt. For example:
```
[local] citus@citus:9700 (PID: 5436)=#
```
This PID (5436 in this case) indicates the process that you should attach the debugger to.
If you are uncertain about which process to attach to, you can list all running PostgreSQL processes using the following command:
```
ps aux | grep postgres
```
Look for the process associated with the PID you noted. For example:
```
citus 5436 0.0 0.0 0 0 ? S 14:00 0:00 postgres: citus citus
```
3. Attach the Debugger: Once you've identified the correct PID, select that process when prompted in VS Code to attach the debugger. You should now be able to debug the PostgreSQL session tied to the psql command.
4. Set Breakpoints and Debug: With the debugger attached, you can set breakpoints within the code. This allows you to step through the code execution, inspect variables, and fully debug the PostgreSQL instance running in your container.
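As a side note, you can also ask the session for its backend PID directly instead of reading it from the prompt; `pg_backend_pid()` is a standard PostgreSQL function (the port below assumes the coordinator setup described earlier):

```bash
# -tA prints just the value, without headers or alignment
psql -p 9700 -tA -c 'SELECT pg_backend_pid();'
```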
### Getting and building
[PostgreSQL documentation](https://www.postgresql.org/support/versioning/) has a
section on upgrade policy.
We always recommend that all users run the latest available minor release [for PostgreSQL] for whatever major version is in use.
We expect Citus users to honor this recommendation and use latest available
PostgreSQL minor release. Failure to do so may result in failures in our test
suite. There are some known improvements in PG test architecture such as
[this commit](https://github.com/postgres/postgres/commit/3f323956128ff8589ce4d3a14e8b950837831803)
that are missing in earlier minor versions.
#### Mac
1. Install Xcode
@ -87,19 +30,11 @@ that are missing in earlier minor versions.
cd citus
./configure
# If you have already installed the project, you need to clean it first
make clean
make
make install
# Optionally, you might instead want to use `make install-all`
# since `multi_extension` regression test would fail due to missing downgrade scripts.
cd src/test/regress
pip install pipenv
pipenv --rm
pipenv install
pipenv shell
make check
```
@ -118,7 +53,7 @@ that are missing in earlier minor versions.
autoconf flex git libcurl4-gnutls-dev libicu-dev \
libkrb5-dev liblz4-dev libpam0g-dev libreadline-dev \
libselinux1-dev libssl-dev libxslt1-dev libzstd-dev \
make uuid-dev
make uuid-dev mitmproxy
```
2. Get, build, and test the code
@ -127,19 +62,11 @@ that are missing in earlier minor versions.
git clone https://github.com/citusdata/citus.git
cd citus
./configure
# If you have already installed the project previously, you need to clean it first
make clean
make
sudo make install
# Optionally, you might instead want to use `sudo make install-all`
# since `multi_extension` regression test would fail due to missing downgrade scripts.
cd src/test/regress
pip install pipenv
pipenv --rm
pipenv install
pipenv shell
make check
```
@ -179,25 +106,53 @@ that are missing in earlier minor versions.
git clone https://github.com/citusdata/citus.git
cd citus
PG_CONFIG=/usr/pgsql-14/bin/pg_config ./configure
# If you have already installed the project previously, you need to clean it first
make clean
make
sudo make install
# Optionally, you might instead want to use `sudo make install-all`
# since `multi_extension` regression test would fail due to missing downgrade scripts.
cd src/test/regress
pip install pipenv
pipenv --rm
pipenv install
pipenv shell
make check
```
### Following our coding conventions
Our coding conventions are documented in [STYLEGUIDE.md](STYLEGUIDE.md).
CircleCI will automatically reject any PRs which do not follow our coding
conventions. The easiest way to ensure your PR adheres to those conventions is
to use the [citus_indent](https://github.com/citusdata/tools/tree/develop/uncrustify)
tool. This tool uses `uncrustify` under the hood.
```bash
# Uncrustify changes the way it formats code every release a bit. To make sure
# everyone formats consistently we use version 0.68.1:
curl -L https://github.com/uncrustify/uncrustify/archive/uncrustify-0.68.1.tar.gz | tar xz
cd uncrustify-uncrustify-0.68.1/
mkdir build
cd build
cmake ..
make -j5
sudo make install
cd ../..
git clone https://github.com/citusdata/tools.git
cd tools
make uncrustify/.install
```
Once you've done that, you can run the `make reindent` command from the top
directory to recursively check and correct the style of any source files in the
current directory. Under the hood, `make reindent` will run `citus_indent` and
some other style corrections for you.
You can also run the following in the directory of this repository to
automatically format all the files that you have changed before committing:
```bash
cat > .git/hooks/pre-commit << __EOF__
#!/bin/bash
citus_indent --check --diff || { citus_indent --diff; exit 1; }
__EOF__
chmod +x .git/hooks/pre-commit
```
### Making SQL changes
@ -234,50 +189,3 @@ style `#include` statements like this:
Any other SQL you can put directly in the main sql file, e.g.
`src/backend/distributed/sql/citus--8.3-1--9.0-1.sql`.
### Backporting a commit to a release branch
1. Check out the release branch that you want to backport to `git checkout release-11.3`
2. Make sure you have the latest changes `git pull`
3. Create a new release branch with a unique name `git checkout -b release-11.3-<yourname>`
4. Cherry-pick the commit that you want to backport `git cherry-pick -x <sha>` (the `-x` is important)
5. Push the branch `git push`
6. Wait for tests to pass
7. If the cherry-pick required non-trivial merge conflicts, create a PR and ask
for a review.
8. After the tests pass on CI, fast-forward the release branch `git push origin release-11.3-<yourname>:release-11.3`
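Condensed into one sequence, the steps above look like this (`release-11.3`, `<yourname>`, and `<sha>` are placeholders):

```bash
git checkout release-11.3
git pull
git checkout -b release-11.3-<yourname>
git cherry-pick -x <sha>   # -x records the original commit sha in the message
git push
# once CI is green (and any non-trivial conflicts were reviewed in a PR):
git push origin release-11.3-<yourname>:release-11.3
```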
### Running tests
See [`src/test/regress/README.md`](https://github.com/citusdata/citus/blob/master/src/test/regress/README.md)
### Documentation
User-facing documentation is published on [docs.citusdata.com](https://docs.citusdata.com/). When adding a new feature, function, or setting, you can open a pull request or issue against the [Citus docs repo](https://github.com/citusdata/citus_docs/).
Detailed descriptions of the implementation for Citus developers are provided in the [Citus Technical Documentation](src/backend/distributed/README.md). It is currently a single file for ease of searching. Please update the documentation if you make any changes that affect the design or add major new features.
# Making a pull request ready for reviews
Asking for help and asking for reviews are two different things. When you're asking for help, you're asking for someone to help you with something that you're not expected to know.
But when you're asking for a review, you're asking for someone to review your work and provide feedback. So, when you're asking for a review, you're expected to make sure that:
* Your changes don't perform **unnecessary line addition / deletions / style changes on unrelated files / lines**.
* All CI jobs are **passing**, including **style checks** and **flaky test detection jobs**. Note that if you're an external contributor, you don't have to wait for CI jobs to run (and finish) because they don't get triggered automatically for external contributors.
* Your PR has necessary amount of **tests** and that they're passing.
* You separated as much work as possible into **separate PRs**, e.g., a prerequisite bugfix, a refactoring, etc.
* Your PR doesn't introduce a typo or something that you can easily fix yourself.
* After all CI jobs pass, the code-coverage measurement job (CodeCov as of today) kicks in. That's why it's important to get the **tests passing** first. At that point, you're expected to check the **CodeCov annotations** that can be seen in the **Files Changed** tab and make sure that it doesn't complain about any lines that are not covered. For example, it's ok if CodeCov complains about an `ereport()` call that you put in for an "unexpected-but-better-than-crashing" case, but it's not ok if it complains about an uncovered `if` branch that you added.
* And finally, perform a **self-review** to make sure that:
* Code and code comments reflect the idea **without requiring an extra explanation** via a chat message / email / PR comment.
This is important because we don't expect developers to reach out to the author / read the whole discussion in the PR to understand the idea behind a commit merged into the `main` branch.
* PR description is clear enough.
* If-and-only-if you're **introducing a user facing change / bugfix**, your PR has a line that starts with `DESCRIPTION: <Present simple tense word that starts with a capital letter, e.g., Adds support for / Fixes / Disallows>`.
* **Commit messages** are clear enough if the commits are doing logically different things.


@ -1,43 +0,0 @@
# Devcontainer
## Coredumps
When postgres/citus crashes, there is the option to create a coredump. This is useful for debugging the issue. Coredumps are enabled in the devcontainer by default. However, not all environments are configured correctly out of the box. The most important configuration that is not standardized is the `core_pattern`. The configuration can be verified from the container; however, you cannot change this setting from inside the container, as the filesystem containing it is mounted read-only there.
To verify if corefiles are written run the following command in a terminal. This shows the filename pattern with which the corefile will be written.
```bash
cat /proc/sys/kernel/core_pattern
```
This should be configured with a relative path or simply a plain filename, such as `core`. When your environment shows an absolute path you will need to change this setting. How to change it depends heavily on the underlying system, as the setting needs to be changed on the kernel of the host running the container.
You can put any pattern in `/proc/sys/kernel/core_pattern` as you see fit. For example, you can add the PID to the core pattern in one of two ways, as shown in the sketch after this list:
- You either include `%p` in the core_pattern. This gets substituted with the PID of the crashing process.
- Alternatively you could set `/proc/sys/kernel/core_uses_pid` to `1` in the same way as you set `core_pattern`. This will append the PID to the corefile if `%p` is not explicitly contained in the core_pattern.
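Both options, run as root on the host (a sketch; `core.%p` is just an example pattern):

```bash
# Option 1: embed the PID in the pattern itself
echo "core.%p" > /proc/sys/kernel/core_pattern

# Option 2: keep a plain name and let the kernel append the PID
echo "core" > /proc/sys/kernel/core_pattern
echo 1 > /proc/sys/kernel/core_uses_pid
```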
When a coredump is written you can use the debug/launch configuration `Open core file`, which is preconfigured in the devcontainer. This will open a file prompt that lists all coredumps found in your workspace. When you want to debug coredumps from `citus_dev` clusters that run in your `/data` directory, you can add the data directory to your workspace. In the command palette of vscode you can run `>Workspace: Add Folder to Workspace...` and select the `/data` directory. This will allow you to open the coredumps from the `/data` directory in the `Open core file` debug configuration.
### Windows (docker desktop)
When running in Docker Desktop on Windows you will most likely need to change this setting. The Linux guest in WSL2 that runs your container is the `docker-desktop` environment. The easiest way to get onto the host, where you can change this setting, is to open a PowerShell window and verify that the docker-desktop environment is listed.
```powershell
wsl --list
```
Among others this should list both `docker-desktop` and `docker-desktop-data`. You can then open a shell in the `docker-desktop` environment.
```powershell
wsl -d docker-desktop
```
Inside this shell you can verify that you have the right environment by running
```bash
cat /proc/sys/kernel/core_pattern
```
This should show the same configuration as the one you see inside the devcontainer. You can then change the setting by running the following command.
This will change the setting for the current session. If you want to make the change permanent you will need to add this to a startup script.
```bash
echo "core" > /proc/sys/kernel/core_pattern
```


@ -11,18 +11,12 @@ endif
include Makefile.global
all: extension
all: extension pg_send_cancellation
# build columnar only
columnar:
$(MAKE) -C src/backend/columnar all
# build extension
extension: $(citus_top_builddir)/src/include/citus_version.h columnar
extension: $(citus_top_builddir)/src/include/citus_version.h
$(MAKE) -C src/backend/distributed/ all
install-columnar: columnar
$(MAKE) -C src/backend/columnar install
install-extension: extension install-columnar
install-extension: extension
$(MAKE) -C src/backend/distributed/ install
install-headers: extension
$(MKDIR_P) '$(DESTDIR)$(includedir_server)/distributed/'
@ -33,35 +27,37 @@ install-headers: extension
clean-extension:
$(MAKE) -C src/backend/distributed/ clean
$(MAKE) -C src/backend/columnar/ clean
clean-full:
$(MAKE) -C src/backend/distributed/ clean-full
.PHONY: extension install-extension clean-extension clean-full
install-downgrades:
$(MAKE) -C src/backend/distributed/ install-downgrades
install-all: install-headers
$(MAKE) -C src/backend/columnar/ install-all
install-all: install-headers install-pg_send_cancellation
$(MAKE) -C src/backend/distributed/ install-all
# build citus_send_cancellation binary
pg_send_cancellation:
$(MAKE) -C src/bin/pg_send_cancellation/ all
install-pg_send_cancellation: pg_send_cancellation
$(MAKE) -C src/bin/pg_send_cancellation/ install
clean-pg_send_cancellation:
$(MAKE) -C src/bin/pg_send_cancellation/ clean
.PHONY: pg_send_cancellation install-pg_send_cancellation clean-pg_send_cancellation
# Add to generic targets
install: install-extension install-headers
clean: clean-extension
install: install-extension install-headers install-pg_send_cancellation
clean: clean-extension clean-pg_send_cancellation
# apply or check style
reindent:
${citus_abs_top_srcdir}/ci/fix_style.sh
check-style:
black . --check --quiet
isort . --check --quiet
flake8
cd ${citus_abs_top_srcdir} && citus_indent --quiet --check
.PHONY: reindent check-style
# depend on install-all so that downgrade scripts are installed as well
check: all install-all
# explicitly does not use $(MAKE) to avoid parallelism
make -C src/test/regress check
$(MAKE) -C src/test/regress check-full
.PHONY: all check clean install install-downgrades install-all


@ -64,8 +64,8 @@ $(citus_top_builddir)/Makefile.global: $(citus_abs_top_srcdir)/configure $(citus
$(citus_top_builddir)/config.status: $(citus_abs_top_srcdir)/configure $(citus_abs_top_srcdir)/src/backend/distributed/citus.control
cd @abs_top_builddir@ && ./config.status --recheck && ./config.status
# Regenerate configure if configure.ac changed
$(citus_abs_top_srcdir)/configure: $(citus_abs_top_srcdir)/configure.ac
# Regenerate configure if configure.in changed
$(citus_abs_top_srcdir)/configure: $(citus_abs_top_srcdir)/configure.in
cd ${citus_abs_top_srcdir} && ./autogen.sh
# If specified via configure, replace the default compiler. Normally

168
README.md
View File

@ -1,14 +1,8 @@
| **<br/>The Citus database is 100% open source.<br/><img width=1000/><br/>Learn what's new in the [Citus 13.0 release blog](https://www.citusdata.com/blog/2025/02/06/distribute-postgresql-17-with-citus-13/) and the [Citus Updates page](https://www.citusdata.com/updates/).<br/><br/>**|
|---|
<br/>
![Citus Banner](images/citus-readme-banner.png)
![Citus Banner](/citus-readme-banner.png)
[![Latest Docs](https://img.shields.io/badge/docs-latest-brightgreen.svg)](https://docs.citusdata.com/)
[![Stack Overflow](https://img.shields.io/badge/Stack%20Overflow-%20-545353?logo=Stack%20Overflow)](https://stackoverflow.com/questions/tagged/citus)
[![Slack](https://cituscdn.azureedge.net/images/social/slack-badge.svg)](https://slack.citusdata.com/)
[![Slack Status](http://slack.citusdata.com/badge.svg)](https://citus-public.slack.com/)
[![Code Coverage](https://codecov.io/gh/citusdata/citus/branch/master/graph/badge.svg)](https://app.codecov.io/gh/citusdata/citus)
[![Twitter](https://img.shields.io/twitter/follow/citusdata.svg?label=Follow%20@citusdata)](https://twitter.com/intent/follow?screen_name=citusdata)
@ -25,21 +19,18 @@ With Citus, you extend your PostgreSQL database with new superpowers:
- **References tables** are replicated to all nodes for joins and foreign keys from distributed tables and maximum read performance.
- **Distributed query engine** routes and parallelizes SELECT, DML, and other operations on distributed tables across the cluster.
- **Columnar storage** compresses data, speeds up scans, and supports fast projections, both on regular and distributed tables.
- **Query from any node** enables you to utilize the full capacity of your cluster for distributed queries.
You can use these Citus superpowers to make your Postgres database scale-out ready on a single Citus node. Or you can build a large cluster capable of handling **high transaction throughputs**, especially in **multi-tenant apps**, run **fast analytical queries**, and process large amounts of **time series** or **IoT data** for **real-time analytics**. When your data size and volume grow, you can easily add more worker nodes to the cluster and rebalance the shards.
Our [SIGMOD '21](https://2021.sigmod.org/) paper [Citus: Distributed PostgreSQL for Data-Intensive Applications](https://doi.org/10.1145/3448016.3457551) gives a more detailed look into what Citus is, how it works, and why it works that way.
![Citus scales out from a single node](images/citus-scale-out.png)
![Citus scales out from a single node](/citus-scale-out.png)
Since Citus is an extension to Postgres, you can use Citus with the latest Postgres versions. And Citus works seamlessly with the PostgreSQL tools and extensions you are already familiar with.
- [Why Citus?](#why-citus)
- [Getting Started](#getting-started)
- [Using Citus](#using-citus)
- [Schema-based sharding](#schema-based-sharding)
- [Setting up with High Availability](#setting-up-with-high-availability)
- [Documentation](#documentation)
- [Architecture](#architecture)
- [When to Use Citus](#when-to-use-citus)
@ -65,11 +56,11 @@ Developers choose Citus for two reasons:
## Getting Started
The quickest way to get started with Citus is to use the [Azure Cosmos DB for PostgreSQL](https://learn.microsoft.com/azure/cosmos-db/postgresql/quickstart-create-portal) managed service in the cloud—or [set up Citus locally](https://docs.citusdata.com/en/stable/installation/single_node.html).
The quickest way to get started with Citus is to use the [Hyperscale (Citus)](https://docs.microsoft.com/azure/postgresql/quickstart-create-hyperscale-portal) deployment option in the Azure Database for PostgreSQL managed service—or [set up Citus locally](https://docs.citusdata.com/en/stable/installation/single_node.html).
### Citus Managed Service on Azure
### Hyperscale (Citus) on Azure Database for PostgreSQL
You can get a fully-managed Citus cluster in minutes through the [Azure Cosmos DB for PostgreSQL portal](https://azure.microsoft.com/products/cosmos-db/). Azure will manage your backups, high availability through auto-failover, software updates, monitoring, and more for all of your servers. To get started with Citus on Azure, use the [Azure Cosmos DB for PostgreSQL Quickstart](https://learn.microsoft.com/azure/cosmos-db/postgresql/quickstart-create-portal).
You can get a fully-managed Citus cluster in minutes through the Hyperscale (Citus) deployment option in the [Azure Database for PostgreSQL](https://azure.microsoft.com/services/postgresql/) portal. Azure will manage your backups, high availability through auto-failover, software updates, monitoring, and more for all of your servers. To get started with Hyperscale (Citus), use the [Hyperscale (Citus) Quickstart](https://docs.microsoft.com/azure/postgresql/quickstart-create-hyperscale-portal) in the Azure docs.
### Running Citus using Docker
@ -95,14 +86,14 @@ Install packages on Ubuntu / Debian:
```bash
curl https://install.citusdata.com/community/deb.sh > add-citus-repo.sh
sudo bash add-citus-repo.sh
sudo apt-get -y install postgresql-17-citus-13.0
sudo apt-get -y install postgresql-14-citus-10.2
```
Install packages on Red Hat:
Install packages on CentOS / Fedora / Red Hat:
```bash
curl https://install.citusdata.com/community/rpm.sh > add-citus-repo.sh
sudo bash add-citus-repo.sh
sudo yum install -y citus130_17
sudo yum install -y citus102_14
```
To add Citus to your local PostgreSQL database, add the following to `postgresql.conf`:
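The snippet itself is elided by this diff hunk; for reference, the standard documented setting is to preload the Citus library (shown here only as a reminder, not as part of the diff):
```
shared_preload_libraries = 'citus'
```
After restarting PostgreSQL, you create the extension with `CREATE EXTENSION citus;` in each database where you want to use it.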
@ -124,7 +115,7 @@ If you want to set up a multi-node cluster, you can also set up additional Postg
```sql
-- before adding the first worker node, tell future worker nodes how to reach the coordinator
SELECT citus_set_coordinator_host('10.0.0.1', 5432);
-- SELECT citus_set_coordinator_host('10.0.0.1', 5432);
-- add worker nodes
SELECT citus_add_node('10.0.0.2', 5432);
@ -234,42 +225,7 @@ WHERE device_type_id = 55;
Time: 209.961 ms
```
Co-location also helps you scale [INSERT..SELECT](https://docs.citusdata.com/en/stable/articles/aggregation.html), [stored procedures](https://www.citusdata.com/blog/2020/11/21/making-postgres-stored-procedures-9x-faster-in-citus/), and [distributed transactions](https://www.citusdata.com/blog/2017/06/02/scaling-complex-sql-transactions/).
### Distributing Tables without interrupting the application
Some users start with Postgres and decide to distribute tables later on, while their application is already using the tables. In that case, you want to avoid downtime for both reads and writes. The `create_distributed_table` command blocks writes (e.g., DML commands) on the table until the command is finished. With the `create_distributed_table_concurrently` command instead, your application can continue to read and write the data even while the command runs.
```sql
CREATE TABLE device_logs (
device_id bigint primary key,
log text
);
-- insert device logs
INSERT INTO device_logs (device_id, log)
SELECT s, 'device log:'||s FROM generate_series(0, 99) s;
-- convert device_logs into a distributed table without interrupting the application
SELECT create_distributed_table_concurrently('device_logs', 'device_id', colocate_with := 'devices');
-- get the count of the logs, parallelized across shards
SELECT count(*) FROM device_logs;
┌───────┐
│ count │
├───────┤
│ 100 │
└───────┘
(1 row)
Time: 48.734 ms
```
Co-location also helps you scale [INSERT..SELECT]( https://docs.citusdata.com/en/stable/articles/aggregation.html), [stored procedures]( https://www.citusdata.com/blog/2020/11/21/making-postgres-stored-procedures-9x-faster-in-citus/), and [distributed transactions](https://www.citusdata.com/blog/2017/06/02/scaling-complex-sql-transactions/).
### Creating Reference Tables
@ -348,74 +304,11 @@ When using columnar storage, you should only load data in batch using `COPY` or
To learn more about columnar storage, check out the [columnar storage README](https://github.com/citusdata/citus/blob/master/src/backend/columnar/README.md).
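As a brief illustration of that batch-loading guidance (the table and file names below are hypothetical, not taken from this README), a columnar table uses the `columnar` access method and is then bulk-loaded with `COPY`:
```sql
-- store this table with Citus columnar storage
CREATE TABLE events_archive (
    event_id   bigint,
    event_time timestamptz,
    payload    jsonb
) USING columnar;

-- load data in batches rather than row-by-row
COPY events_archive FROM '/tmp/events.csv' WITH (FORMAT csv);
```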
## Schema-based sharding
Available since Citus 12.0, [schema-based sharding](https://docs.citusdata.com/en/stable/get_started/concepts.html#schema-based-sharding) is the shared-database, separate-schema model: the schema becomes the logical shard within the database. Multi-tenant apps can use a schema per tenant to easily shard along the tenant dimension. Query changes are not required; the application usually only needs a small modification to set the proper search_path when switching tenants. Schema-based sharding is an ideal solution for microservices, and for ISVs deploying applications that cannot undergo the changes required to onboard row-based sharding.
### Creating distributed schemas
You can turn an existing schema into a distributed schema by calling `citus_schema_distribute`:
```sql
SELECT citus_schema_distribute('user_service');
```
Alternatively, you can set `citus.enable_schema_based_sharding` to have all newly created schemas be automatically converted into distributed schemas:
```sql
SET citus.enable_schema_based_sharding TO ON;
CREATE SCHEMA AUTHORIZATION user_service;
CREATE SCHEMA AUTHORIZATION time_service;
CREATE SCHEMA AUTHORIZATION ping_service;
```
### Running queries
Queries will be properly routed to schemas based on `search_path` or by explicitly using the schema name in the query.
For [microservices](https://docs.citusdata.com/en/stable/get_started/tutorial_microservices.html) you would create a USER per service matching the schema name, so the default `search_path` contains the schema name. When connected, the user's queries are automatically routed and no changes to the microservice are required.
```sql
CREATE USER user_service;
CREATE SCHEMA AUTHORIZATION user_service;
```
For typical multi-tenant applications, you would set the search path to the tenant schema name in your application:
```sql
SET search_path = tenant_name, public;
```
## Setting up with High Availability
One of the most popular high availability solutions for PostgreSQL, [Patroni 3.0](https://github.com/zalando/patroni), has [first class support for Citus 10.0 and above](https://patroni.readthedocs.io/en/latest/citus.html#citus); additionally, Citus 11.2 and later ship with improvements for smoother node switchover in Patroni.
An example of `patronictl list` output for a Citus cluster:
```bash
postgres@coord1:~$ patronictl list demo
```
```text
+ Citus cluster: demo ----------+--------------+---------+----+-----------+
| Group | Member  | Host        | Role         | State   | TL | Lag in MB |
+-------+---------+-------------+--------------+---------+----+-----------+
|     0 | coord1  | 172.27.0.10 | Replica      | running |  1 |         0 |
|     0 | coord2  | 172.27.0.6  | Sync Standby | running |  1 |         0 |
|     0 | coord3  | 172.27.0.4  | Leader       | running |  1 |           |
|     1 | work1-1 | 172.27.0.8  | Sync Standby | running |  1 |         0 |
|     1 | work1-2 | 172.27.0.2  | Leader       | running |  1 |           |
|     2 | work2-1 | 172.27.0.5  | Sync Standby | running |  1 |         0 |
|     2 | work2-2 | 172.27.0.7  | Leader       | running |  1 |           |
+-------+---------+-------------+--------------+---------+----+-----------+
```
## Documentation
If you're ready to get started with Citus or want to know more, we recommend reading the [Citus open source documentation](https://docs.citusdata.com/en/stable/). Or, if you are using Citus on Azure, then the [Azure Cosmos DB for PostgreSQL](https://learn.microsoft.com/azure/cosmos-db/postgresql/introduction) is the place to start.
If you're ready to get started with Citus or want to know more, we recommend reading the [Citus open source documentation](https://docs.citusdata.com/en/stable/). Or, if you are using Citus on Azure, then the [Hyperscale (Citus) documentation](https://docs.microsoft.com/azure/postgresql/hyperscale/) is online and available as part of the Azure Database for PostgreSQL docs.
Our Citus docs contain comprehensive use case guides on how to build a [multi-tenant SaaS application](https://docs.citusdata.com/en/stable/use_cases/multi_tenant.html), [real-time analytics dashboard]( https://docs.citusdata.com/en/stable/use_cases/realtime_analytics.html), or work with [time series data](https://docs.citusdata.com/en/stable/use_cases/timeseries.html).
Our Citus docs contain comprehensive use case guides on how to build a [multi-tenant SaaS application]( https://docs.citusdata.com/en/stable/use_cases/multi_tenant.html), [real-time analytics dashboard]( https://docs.citusdata.com/en/stable/use_cases/realtime_analytics.html), or work with [time series data]( https://docs.citusdata.com/en/stable/use_cases/timeseries.html).
## Architecture
@ -423,13 +316,10 @@ A Citus database cluster grows from a single PostgreSQL node into a cluster by a
Data in distributed tables is stored in “shards”, which are actually just regular PostgreSQL tables on the worker nodes. When querying a distributed table on the coordinator node, Citus will send regular SQL queries to the worker nodes. That way, all the usual PostgreSQL optimizations and extensions can automatically be used with Citus.
![Citus architecture](images/citus-architecture.png)
![Citus architecture](/citus-architecture.png)
When you send a query in which all (co-located) distributed tables have the same filter on the distribution column, Citus will automatically detect that and send the whole query to the worker node that stores the data. That way, arbitrarily complex queries are supported with minimal routing overhead, which is especially useful for scaling transactional workloads. If queries do not have a specific filter, each shard is queried in parallel, which is especially useful in analytical workloads. The Citus distributed executor is adaptive and is designed to handle both query types at the same time on the same system under high concurrency, which enables large-scale mixed workloads.
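As a hedged sketch of those two execution paths (the `events` table and its `tenant_id` distribution column are hypothetical, not taken from this README):
```sql
-- filter on the distribution column: the whole query is routed to the
-- single worker node that stores this tenant's shard
SELECT count(*)
FROM events
WHERE tenant_id = 4127;

-- no distribution-column filter: every shard is scanned in parallel
SELECT event_type, count(*)
FROM events
GROUP BY event_type;
```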
The schema and metadata of distributed tables and reference tables are automatically synchronized to all the nodes in the cluster. That way, you can connect to any node to run distributed queries. Schema changes and cluster administration still need to go through the coordinator.
Detailed descriptions of the implementation for Citus developers are provided in the [Citus Technical Documentation](src/backend/distributed/README.md).
## When to use Citus
@ -440,56 +330,48 @@ Citus is uniquely capable of scaling both analytical and transactional workloads
The advanced parallel, distributed query engine in Citus combined with PostgreSQL features such as [array types](https://www.postgresql.org/docs/current/arrays.html), [JSONB](https://www.postgresql.org/docs/current/datatype-json.html), [lateral joins](https://heap.io/blog/engineering/postgresqls-powerful-new-join-type-lateral), and extensions like [HyperLogLog](https://github.com/citusdata/postgresql-hll) and [TopN](https://github.com/citusdata/postgresql-topn) allow you to build responsive analytics dashboards no matter how many customers or how much data you have.
Example real-time analytics users: [Algolia](https://www.citusdata.com/customers/algolia)
Example real-time analytics users: [Algolia](https://www.citusdata.com/customers/algolia), [Heap](https://www.citusdata.com/customers/heap)
- **[Time series data](http://docs.citusdata.com/en/stable/use_cases/timeseries.html)**:
Citus enables you to process and analyze very large amounts of time series data. The biggest Citus clusters store well over a petabyte of time series data and ingest terabytes per day.
Citus integrates seamlessly with [Postgres table partitioning](https://www.postgresql.org/docs/current/ddl-partitioning.html) and has [built-in functions for partitioning by time](https://www.citusdata.com/blog/2021/10/22/how-to-scale-postgres-for-time-series-data-with-citus/), which can speed up queries and writes on time series tables. You can take advantage of Citus's parallel, distributed query engine for fast analytical queries, and use the built-in *columnar storage* to compress old partitions.
Citus integrates seamlessly with [Postgres table partitioning](https://www.postgresql.org/docs/current/ddl-partitioning.html) and [pg_partman](https://www.citusdata.com/blog/2018/01/24/citus-and-pg-partman-creating-a-scalable-time-series-database-on-PostgreSQL/), which can speed up queries and writes on time series tables. You can take advantage of Citus's parallel, distributed query engine for fast analytical queries, and use the built-in *columnar storage* to compress old partitions.
Example users: [MixRank](https://www.citusdata.com/customers/mixrank)
Example users: [MixRank](https://www.citusdata.com/customers/mixrank), [Windows team](https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/architecting-petabyte-scale-analytics-by-scaling-out-postgres-on/ba-p/969685)
- **[Software-as-a-service (SaaS) applications](http://docs.citusdata.com/en/stable/use_cases/multi_tenant.html)**:
SaaS and other multi-tenant applications need to be able to scale their database as the number of tenants/customers grows. Citus enables you to transparently shard a complex data model by the tenant dimension, so your database can grow along with your business.
By distributing tables along a tenant ID column and co-locating data for the same tenant, Citus can horizontally scale complex (tenant-scoped) queries, transactions, and foreign key graphs. Reference tables and distributed DDL commands make database management a breeze compared to manual sharding. On top of that, you have a built-in distributed query engine for doing cross-tenant analytics inside the database.
Example multi-tenant SaaS users: [Salesloft](https://fivetran.com/case-studies/replicating-sharded-databases-a-case-study-of-salesloft-citus-data-and-fivetran), [ConvertFlow](https://www.citusdata.com/customers/convertflow)
- **[Microservices](https://docs.citusdata.com/en/stable/get_started/tutorial_microservices.html)**: Citus supports schema-based sharding, which allows distributing regular database schemas across many machines. This sharding methodology fits nicely with a typical microservices architecture, where storage is fully owned by the service and hence can't share the same schema definition with other tenants. Citus allows distributing horizontally scalable state across services, solving one of the [main problems](https://stackoverflow.blog/2020/11/23/the-macro-problem-with-microservices/) of microservices.
Example multi-tenant SaaS users: [Copper](https://www.citusdata.com/customers/copper), [Salesloft](https://fivetran.com/case-studies/replicating-sharded-databases-a-case-study-of-salesloft-citus-data-and-fivetran), [ConvertFlow](https://www.citusdata.com/customers/convertflow)
- **Geospatial**:
Because of the powerful [PostGIS](https://postgis.net/) extension, which adds support for geographic objects to Postgres, many people run spatial/GIS applications on top of Postgres. And since spatial location information has become part of our daily life, there are more geospatial applications than ever. When your Postgres database needs to scale out to handle an increased workload, Citus is a good fit.
Example geospatial users: [Helsinki Regional Transportation Authority (HSL)](https://customers.microsoft.com/story/845146-transit-authority-improves-traffic-monitoring-with-azure-database-for-postgresql-hyperscale), [MobilityDB](https://www.citusdata.com/blog/2020/11/09/analyzing-gps-trajectories-at-scale-with-postgres-mobilitydb/).
Example geospatial users: [Helsinki Regional Transportation Authority (HSL)](https://customers.microsoft.com/en-us/story/845146-transit-authority-improves-traffic-monitoring-with-azure-database-for-postgresql-hyperscale), [MobilityDB]( https://www.citusdata.com/blog/2020/11/09/analyzing-gps-trajectories-at-scale-with-postgres-mobilitydb/).
## Need Help?
- **Slack**: Ask questions in our Citus community [Slack channel](https://slack.citusdata.com).
- **GitHub issues**: Please submit issues via [GitHub issues](https://github.com/citusdata/citus/issues).
- **Documentation**: Our [Citus docs](https://docs.citusdata.com ) have a wealth of resources, including sections on [query performance tuning](https://docs.citusdata.com/en/stable/performance/performance_tuning.html), [useful diagnostic queries](https://docs.citusdata.com/en/stable/admin_guide/diagnostic_queries.html), and [common error messages](https://docs.citusdata.com/en/stable/reference/common_errors.html).
- **Docs issues**: You can also submit documentation issues via [GitHub issues for our Citus docs](https://github.com/citusdata/citus_docs/issues).
- **Updates & Release Notes**: Learn about what's new in each Citus version on the [Citus Updates page](https://www.citusdata.com/updates/).
- **Docs issues**: You can also submit documentation issues via [GitHub
issues for our Citus docs](https://github.com/citusdata/citus_docs/issues).
## Contributing
Citus is built on and of open source, and we welcome your contributions. The [CONTRIBUTING.md](CONTRIBUTING.md) file explains how to get started developing the Citus extension itself and our code quality guidelines.
## Code of Conduct
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
## Stay Connected
- **Twitter**: Follow us [@citusdata](https://twitter.com/citusdata) to track the latest posts & updates on what's happening.
- **Citus Blog**: Read our popular [Citus Open Source Blog](https://www.citusdata.com/blog/) for posts about PostgreSQL and Citus.
- **Citus Blog**: Read our popular [Citus Blog](https://www.citusdata.com/blog/) for useful & informative posts about PostgreSQL and Citus.
- **Citus Newsletter**: Subscribe to our monthly technical [Citus Newsletter](https://www.citusdata.com/join-newsletter) to get a curated collection of our favorite posts, videos, docs, talks, & other Postgres goodies.
- **Slack**: Our [Citus Public slack](https://slack.citusdata.com/) is a good way to stay connected, not just with us but with other Citus users.
- **Sister Blog**: Read the PostgreSQL posts on the [Azure Cosmos DB for PostgreSQL blog](https://devblogs.microsoft.com/cosmosdb/category/postgresql/) about our managed service on Azure.
- **Slack**: Our [Citus Public slack]( https://slack.citusdata.com/) is a good way to stay connected, not just with us but with other Citus users.
- **Sister Blog**: Read our Azure Database for PostgreSQL [sister blog on Microsoft TechCommunity](https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/bg-p/ADforPostgreSQL) for posts relating to Postgres (and Citus) on Azure.
- **Videos**: Check out this [YouTube playlist](https://www.youtube.com/playlist?list=PLixnExCn6lRq261O0iwo4ClYxHpM9qfVy) of some of our favorite Citus videos and demos. If you want to deep dive into how Citus extends PostgreSQL, you might want to check out Marco Slot's talk at Carnegie Mellon titled [Citus: Distributed PostgreSQL as an Extension](https://youtu.be/X-aAgXJZRqM) that was part of Andy Pavlo's Vaccination Database Talks series at CMUDB.
- **Our other Postgres projects**: Our team also works on other awesome PostgreSQL open source extensions & projects, including: [pg_cron](https://github.com/citusdata/pg_cron), [HyperLogLog](https://github.com/citusdata/postgresql-hll), [TopN](https://github.com/citusdata/postgresql-topn), [pg_auto_failover](https://github.com/citusdata/pg_auto_failover), [activerecord-multi-tenant](https://github.com/citusdata/activerecord-multi-tenant), and [django-multitenant](https://github.com/citusdata/django-multitenant).
- **Our other Postgres projects**: Our team also works on other awesome PostgreSQL open source extensions & projects, including: [pg_cron]( https://github.com/citusdata/pg_cron), [HyperLogLog](https://github.com/citusdata/postgresql-hll), [TopN](https://github.com/citusdata/postgresql-topn), [pg_auto_failover](https://github.com/citusdata/pg_auto_failover), [activerecord-multi-tenant](https://github.com/citusdata/activerecord-multi-tenant), and [django-multitenant](https://github.com/citusdata/django-multitenant).
___

View File

@ -1,41 +0,0 @@
<!-- BEGIN MICROSOFT SECURITY.MD V0.0.8 BLOCK -->
## Security
Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/).
If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://aka.ms/opensource/security/definition), please report it to us as described below.
## Reporting Security Issues
**Please do not report security vulnerabilities through public GitHub issues.**
Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://aka.ms/opensource/security/create-report).
If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://aka.ms/opensource/security/pgpkey).
You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://aka.ms/opensource/security/msrc).
Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:
* Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
* Full paths of source file(s) related to the manifestation of the issue
* The location of the affected source code (tag/branch/commit or direct URL)
* Any special configuration required to reproduce the issue
* Step-by-step instructions to reproduce the issue
* Proof-of-concept or exploit code (if possible)
* Impact of the issue, including how an attacker might exploit the issue
This information will help us triage your report more quickly.
If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://aka.ms/opensource/security/bounty) page for more details about our active programs.
## Preferred Languages
We prefer all communications to be in English.
## Policy
Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://aka.ms/opensource/security/cvd).
<!-- END MICROSOFT SECURITY.MD BLOCK -->

View File

@ -1,160 +0,0 @@
# Coding style
The existing code-style in our code-base is not super consistent. There are multiple reasons for that. One big reason is that our code-base is relatively old and our standards have changed over time. The second big reason is that our style-guide is different from the style-guide of Postgres, and some code is copied from the Postgres source code and slightly modified. The below rules are for new code. If you're changing existing code that uses a different style, use your best judgement to decide whether to use the rules here or to match the existing style.
## Using citus_indent
The CI pipeline will automatically reject any PRs that do not follow our coding
conventions. The easiest way to ensure your PR adheres to those conventions is
to use the [citus_indent](https://github.com/citusdata/tools/tree/develop/uncrustify)
tool. This tool uses `uncrustify` under the hood.
```bash
# Uncrustify changes the way it formats code every release a bit. To make sure
# everyone formats consistently we use version 0.68.1:
curl -L https://github.com/uncrustify/uncrustify/archive/uncrustify-0.68.1.tar.gz | tar xz
cd uncrustify-uncrustify-0.68.1/
mkdir build
cd build
cmake ..
make -j5
sudo make install
cd ../..
git clone https://github.com/citusdata/tools.git
cd tools
make uncrustify/.install
```
Once you've done that, you can run the `make reindent` command from the top
directory to recursively check and correct the style of any source files in the
current directory. Under the hood, `make reindent` will run `citus_indent` and
some other style corrections for you.
You can also run the following in the directory of this repository to
automatically format all the files that you have changed before committing:
```bash
cat > .git/hooks/pre-commit << __EOF__
#!/bin/bash
citus_indent --check --diff || { citus_indent --diff; exit 1; }
__EOF__
chmod +x .git/hooks/pre-commit
```
## Other rules we follow that citus_indent does not enforce
* We almost always use **CamelCase** when naming functions, variables, etc., **not snake_case**.
* We also have the habit of using **lowerCamelCase** for some variables named after their type or their function name, as shown in the examples:
```c
bool IsCitusExtensionLoaded = false;
bool
IsAlterTableRenameStmt(RenameStmt *renameStmt)
{
AlterTableCmd *alterTableCommand = NULL;
..
..
bool isAlterTableRenameStmt = false;
..
}
```
* We **start functions with a comment**:
```c
/*
* MyNiceFunction <something in present simple tense, e.g., processes / returns / checks / takes X as input / does Y> ..
* <some more nice words> ..
* <some more nice words> ..
*/
<static?> <return type>
MyNiceFunction(..)
{
..
..
}
```
* `#include`s need to be sorted based on the ordering below and then alphabetically, and we should not include what we don't need in a file (a short sketch follows this list):
  * System includes (eg. `#include <...>`)
  * Postgres.h (eg. `#include "postgres.h"`)
  * Toplevel imports from postgres, not contained in a directory (eg. `#include "miscadmin.h"`)
  * General postgres includes (eg. `#include "nodes/..."`)
  * Toplevel citus includes, not contained in a directory (eg. `#include "citus_version.h"`)
  * Columnar includes (eg. `#include "columnar/..."`)
  * Distributed includes (eg. `#include "distributed/..."`)
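A minimal sketch of what that ordering looks like in practice (the header names here are only examples, not a required set):
```c
/* system includes first */
#include <limits.h>

/* then postgres.h */
#include "postgres.h"

/* toplevel postgres includes */
#include "miscadmin.h"

/* general postgres includes */
#include "nodes/pg_list.h"

/* toplevel citus includes */
#include "citus_version.h"

/* columnar includes */
#include "columnar/columnar.h"

/* distributed includes */
#include "distributed/metadata_cache.h"
```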
* Comments:
```c
/* single line comments start with a lower-case */
/*
* We start multi-line comments with a capital letter
* and keep adding a star to the beginning of each line
* until we close the comment with a star and a slash.
*/
```
* Order of function implementations and their declarations in a file:
We define static functions after the functions that call them. For example:
```c
#include<..>
#include<..>
..
..
typedef struct
{
..
..
} MyNiceStruct;
..
..
PG_FUNCTION_INFO_V1(my_nice_udf1);
PG_FUNCTION_INFO_V1(my_nice_udf2);
..
..
// .. somewhere on top of the file …
static void MyNiceStaticlyDeclaredFunction1(…);
static void MyNiceStaticlyDeclaredFunction2(…);
..
..
void
MyNiceFunctionExternedViaHeaderFile(..)
{
..
..
MyNiceStaticlyDeclaredFunction1(..);
..
..
MyNiceStaticlyDeclaredFunction2(..);
..
}
..
..
// we define this first because it's called by MyNiceFunctionExternedViaHeaderFile()
// before MyNiceStaticlyDeclaredFunction2()
static void
MyNiceStaticlyDeclaredFunction1(…)
{
}
..
..
// then we define this
static void
MyNiceStaticlyDeclaredFunction2(…)
{
}
```

View File

@ -1,6 +1,6 @@
#!/bin/bash
#
# autogen.sh converts configure.ac to configure and creates
# autogen.sh converts configure.in to configure and creates
# citus_config.h.in. The resulting files are checked into
# the SCM, to avoid everyone needing autoconf installed.

View File

@ -283,14 +283,6 @@ actually run in CI. This is most commonly forgotten for newly added CI tests
that the developer only ran locally. It also checks that all CI scripts have a
section in this `README.md` file and that they include `ci/ci_helpers.sh`.
## `check_migration_files.sh`
A branch that touches a set of upgrade scripts is also expected to touch
corresponding downgrade scripts as well. If this script fails, read the output
and make sure you update the downgrade scripts in the printed list. If you
really don't need a downgrade to run any SQL, you can write a comment in the
file explaining why a downgrade step is not necessary.
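For example (the version numbers here are hypothetical), a change to an upgrade script is expected to come with the matching downgrade script that reverses it:
```
src/backend/distributed/sql/citus--11.0-1--11.0-2.sql
src/backend/distributed/sql/downgrades/citus--11.0-2--11.0-1.sql
```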
## `disallow_c_comments_in_migrations.sh`
We do not use C-style comments in migration files as the stripped
@ -371,8 +363,11 @@ This was deemed to be error prone and not worth the effort.
This script checks and fixes issues with `.gitignore` rules:
1. Makes sure git ignores the `.sql` files and expected output files that are generated
from `.source` template files. If you created or deleted a `.source` file in a commit,
git ignore rules should be updated to reflect this change.
1. Makes sure we do not commit any generated files that should be ignored. If there is an
2. Makes sure we do not commit any generated files that should be ignored. If there is an
ignored file in the git tree, the user is expected to review the files that are removed
from the git tree and commit them.
@ -381,22 +376,3 @@ This script checks and fixes issues with `.gitignore` rules:
This script checks the order of the GUCs defined in `shared_library_init.c`.
To solve this failure, please check `shared_library_init.c` and make sure that the GUC
definitions are in alphabetical order.
## `print_stack_trace.sh`
This script prints stack traces for failed tests, if they left core files.
## `sort_and_group_includes.sh`
This script checks and fixes issues with include grouping and sorting in C files.
Includes are grouped in the following groups:
- System includes (eg. `#include <math.h>`)
- Postgres.h include (eg. `#include "postgres.h"`)
- Toplevel postgres includes (includes not in a directory, eg. `#include "miscadmin.h"`)
- Postgres includes in a directory (eg. `#include "catalog/pg_type.h"`)
- Toplevel citus includes (includes not in a directory eg. `#include "pg_version_constants.h"`)
- Columnar includes (eg. `#include "columnar/columnar.h"`)
- Distributed includes (eg. `#include "distributed/maintenanced.h"`)
Within every group the include lines are sorted alphabetically.

View File

@ -15,6 +15,9 @@ PG_MAJOR=${PG_MAJOR:?please provide the postgres major version}
codename=${VERSION#*(}
codename=${codename%)*}
# get project from argument
project="${CIRCLE_PROJECT_REPONAME}"
# we'll do everything with absolute paths
basedir="$(pwd)"
@ -25,7 +28,7 @@ build_ext() {
pg_major="$1"
builddir="${basedir}/build-${pg_major}"
echo "Beginning build for PostgreSQL ${pg_major}..." >&2
echo "Beginning build of ${project} for PostgreSQL ${pg_major}..." >&2
# do everything in a subdirectory to avoid clutter in current directory
mkdir -p "${builddir}" && cd "${builddir}"

View File

@ -14,8 +14,8 @@ ci_scripts=$(
grep -v -E '^(ci_helpers.sh|fix_style.sh)$'
)
for script in $ci_scripts; do
if ! grep "\\bci/$script\\b" -r .github > /dev/null; then
echo "ERROR: CI script with name \"$script\" is not actually used in .github folder"
if ! grep "\\bci/$script\\b" .circleci/config.yml > /dev/null; then
echo "ERROR: CI script with name \"$script\" is not actually used in .circleci/config.yml"
exit 1
fi
if ! grep "^## \`$script\`\$" ci/README.md > /dev/null; then

View File

@ -7,12 +7,13 @@ source ci/ci_helpers.sh
cd src/test/regress
# 1. Find all *.sql and *.spec files in the sql, and spec directories
# 1. Find all *.sql *.spec and *.source files in the sql, spec and input
# directories
# 2. Strip the extension and the directory
# 3. Ignore names that end with .include, those files are meant to be in a C
# preprocessor #include statement. They should not be in schedules.
test_names=$(
find sql spec -iname "*.sql" -o -iname "*.spec" |
find sql spec input -iname "*.sql" -o -iname "*.spec" -o -iname "*.source" |
sed -E 's#^\w+/([^/]+)\.[^.]+$#\1#g' |
grep -v '.include$'
)

96
ci/check_enterprise_merge.sh Executable file
View File

@ -0,0 +1,96 @@
#!/bin/bash
# Testing this script locally requires you to set the following environment
# variables:
# CIRCLE_BRANCH, GIT_USERNAME and GIT_TOKEN
# fail if trying to reference a variable that is not set.
set -u
# exit immediately if a command fails
set -e
# Fail on pipe failures
set -o pipefail
PR_BRANCH="${CIRCLE_BRANCH}"
ENTERPRISE_REMOTE="https://${GIT_USERNAME}:${GIT_TOKEN}@github.com/citusdata/citus-enterprise"
# shellcheck disable=SC1091
source ci/ci_helpers.sh
# List executed commands. This is done so debugging this script is easier when
# it fails. It's explicitly done after git remote add so username and password
# are not shown in CI output (even though it's also filtered out by CircleCI)
set -x
check_compile () {
echo "INFO: checking if merged code can be compiled"
./configure --without-libcurl
make -j10
}
# Clone current git repo (which should be community) to a temporary working
# directory and go there
GIT_DIR_ROOT="$(git rev-parse --show-toplevel)"
TMP_GIT_DIR="$(mktemp --directory -t citus-merge-check.XXXXXXXXX)"
git clone "$GIT_DIR_ROOT" "$TMP_GIT_DIR"
cd "$TMP_GIT_DIR"
# Fails in CI without this
git config user.email "citus-bot@microsoft.com"
git config user.name "citus bot"
# Disable "set -x" temporarily, because $ENTERPRISE_REMOTE contains passwords
{ set +x ; } 2> /dev/null
git remote add enterprise "$ENTERPRISE_REMOTE"
set -x
git remote set-url --push enterprise no-pushing
# Fetch enterprise-master
git fetch enterprise enterprise-master
git checkout "enterprise/enterprise-master"
if git merge --no-commit "origin/$PR_BRANCH"; then
echo "INFO: community PR branch could be merged into enterprise-master"
# check that we can compile after the merge
if check_compile; then
exit 0
fi
echo "WARN: Failed to compile after community PR branch was merged into enterprise"
fi
# undo partial merge
git merge --abort
# If we have a conflict on enterprise merge on the master branch, we have a problem.
# Provide an error message to indicate that enterprise merge is needed to fix this check.
if [[ $PR_BRANCH = master ]]; then
echo "ERROR: Master branch has merge conflicts with enterprise-master."
echo "Try re-running this CI job after merging your changes into enterprise-master."
exit 1
fi
if ! git fetch enterprise "$PR_BRANCH" ; then
echo "ERROR: enterprise/$PR_BRANCH was not found and community PR branch could not be merged into enterprise-master"
exit 1
fi
# Show the top commit of the enterprise PR branch to make debugging easier
git log -n 1 "enterprise/$PR_BRANCH"
# Check that this branch contains the top commit of the current community PR
# branch. If it does not it means it's not up to date with the current PR, so
# the enterprise branch should be updated.
if ! git merge-base --is-ancestor "origin/$PR_BRANCH" "enterprise/$PR_BRANCH" ; then
echo "ERROR: enterprise/$PR_BRANCH is not up to date with community PR branch"
exit 1
fi
# Now check if we can merge the enterprise PR into enterprise-master without
# issues.
git merge --no-commit "enterprise/$PR_BRANCH"
# check that we can compile after the merge
check_compile

View File

@ -4,22 +4,7 @@ set -euo pipefail
# shellcheck disable=SC1091
source ci/ci_helpers.sh
# Find the line that exactly matches "RegisterCitusConfigVariables(void)" in
# shared_library_init.c. grep command returns something like
# "934:RegisterCitusConfigVariables(void)" and we extract the line number
# with cut.
RegisterCitusConfigVariables_begin_linenumber=$(grep -n "^RegisterCitusConfigVariables(void)$" src/backend/distributed/shared_library_init.c | cut -d: -f1)
# Consider the lines starting from $RegisterCitusConfigVariables_begin_linenumber,
# grep the first line that starts with "}" and extract the line number with cut
# as in the previous step.
RegisterCitusConfigVariables_length=$(tail -n +$RegisterCitusConfigVariables_begin_linenumber src/backend/distributed/shared_library_init.c | grep -n -m 1 "^}$" | cut -d: -f1)
# extract the function definition of RegisterCitusConfigVariables into a temp file
tail -n +$RegisterCitusConfigVariables_begin_linenumber src/backend/distributed/shared_library_init.c | head -n $(($RegisterCitusConfigVariables_length)) > RegisterCitusConfigVariables_func_def.out
# extract citus gucs in the form of <tab><tab>"citus.X"
grep -P "^[\t][\t]\"citus\.[a-zA-Z_0-9]+\"" RegisterCitusConfigVariables_func_def.out > gucs.out
LC_COLLATE=C sort -c gucs.out
# extract citus gucs in the form of "citus.X"
grep -o -E "(\.*\"citus.\w+\")," src/backend/distributed/shared_library_init.c > gucs.out
sort -c gucs.out
rm gucs.out
rm RegisterCitusConfigVariables_func_def.out

View File

@ -1,33 +0,0 @@
#! /bin/bash
set -euo pipefail
# shellcheck disable=SC1091
source ci/ci_helpers.sh
# This file checks for the existence of downgrade scripts for every upgrade script that is changed in the branch.
# create list of migration files for upgrades
upgrade_files=$(git diff --name-only origin/main | { grep "src/backend/distributed/sql/citus--.*sql" || exit 0 ; })
downgrade_files=$(git diff --name-only origin/main | { grep "src/backend/distributed/sql/downgrades/citus--.*sql" || exit 0 ; })
ret_value=0
for file in $upgrade_files
do
# There should always be 2 matches, and no need to avoid splitting here
# shellcheck disable=SC2207
versions=($(grep --only-matching --extended-regexp "[0-9]+\.[0-9]+[-.][0-9]+" <<< "$file"))
from_version=${versions[0]};
to_version=${versions[1]};
downgrade_migration_file="src/backend/distributed/sql/downgrades/citus--$to_version--$from_version.sql"
# check for the existence of migration scripts
if [[ $(grep --line-regexp --count "$downgrade_migration_file" <<< "$downgrade_files") == 0 ]]
then
echo "$file is updated, but $downgrade_migration_file is not updated in branch"
ret_value=1
fi
done
exit $ret_value;

View File

@ -1,12 +1,27 @@
#! /bin/bash
# shellcheck disable=SC2012
set -euo pipefail
# shellcheck disable=SC1091
source ci/ci_helpers.sh
# We list all the .source files in alphabetical order, and do a substitution
# before writing the resulting file names that are created by those templates in
# relevant .gitignore files
#
# 1. Capture the file name without the .source extension
# 2. Add the desired extension at the end
# 3. Add a / character at the beginning of each line to conform to .gitignore file format
#
# e.g. multi_copy.source -> /multi_copy.sql
ls -1 src/test/regress/input | sed -E "s#(.*)\.source#/\1.sql#" > src/test/regress/sql/.gitignore
# e.g. multi_copy.source -> /multi_copy.out
ls -1 src/test/regress/output | sed -E "s#(.*)\.source#/\1.out#" > src/test/regress/expected/.gitignore
# Remove all the ignored files from git tree, and error out
# find all ignored files in git tree, and use quotation marks to prevent word splitting on filenames with spaces in them
# NOTE: Option --cached is needed to avoid a bug in git ls-files command.
ignored_lines_in_git_tree=$(git ls-files --ignored --cached --exclude-standard | sed 's/.*/"&"/')
ignored_lines_in_git_tree=$(git ls-files --ignored --exclude-standard | sed 's/.*/"&"/')
if [[ -n $ignored_lines_in_git_tree ]]
then

View File

@ -9,8 +9,6 @@ cidir="${0%/*}"
cd ${cidir}/..
citus_indent . --quiet
black . --quiet
isort . --quiet
ci/editorconfig.sh
ci/remove_useless_declarations.sh
ci/disallow_c_comments_in_migrations.sh
@ -18,5 +16,3 @@ ci/disallow_hash_comments_in_spec_files.sh
ci/disallow_long_changelog_entries.sh
ci/normalize_expected.sh
ci/fix_gitignore.sh
ci/print_stack_trace.sh
ci/sort_and_group_includes.sh

View File

@ -1,157 +0,0 @@
#!/usr/bin/env python3
"""
easy command line to run against all citus-style checked files:
$ git ls-files \
| git check-attr --stdin citus-style \
| grep 'citus-style: set' \
| awk '{print $1}' \
| cut -d':' -f1 \
| xargs -n1 ./ci/include_grouping.py
"""
import collections
import os
import sys
def main(args):
if len(args) < 2:
print("Usage: include_grouping.py <file>")
return
file = args[1]
if not os.path.isfile(file):
sys.exit(f"File '{file}' does not exist")
with open(file, "r") as in_file:
with open(file + ".tmp", "w") as out_file:
includes = []
skipped_lines = []
# This calls print_sorted_includes on a set of consecutive #include lines.
# This implicitly keeps separation of any #include lines that are contained in
# an #ifdef, because it will order the #include lines inside and after the
# #ifdef completely separately.
for line in in_file:
# if a line starts with #include we don't want to print it yet, instead we
# want to collect all consecutive #include lines
if line.startswith("#include"):
includes.append(line)
skipped_lines = []
continue
# if we have collected any #include lines, we want to print them sorted
# before printing the current line. However, if the current line is empty
# we want to perform a lookahead to see if the next line is an #include.
# To maintain any separation between #include lines and their subsequent
# lines we keep track of all lines we have skipped in between.
if len(includes) > 0:
if len(line.strip()) == 0:
skipped_lines.append(line)
continue
# we have includes that need to be grouped before printing the current
# line.
print_sorted_includes(includes, file=out_file)
includes = []
# print any skipped lines
print("".join(skipped_lines), end="", file=out_file)
skipped_lines = []
print(line, end="", file=out_file)
# move out_file to file
os.rename(file + ".tmp", file)
def print_sorted_includes(includes, file=sys.stdout):
default_group_key = 1
groups = collections.defaultdict(set)
# define the groups that we separate correctly. The matchers are tested in the order
# of their priority field. The first matcher that matches the include is used to
# assign the include to a group.
# The groups are printed in the order of their group_key.
matchers = [
{
"name": "system includes",
"matcher": lambda x: x.startswith("<"),
"group_key": -2,
"priority": 0,
},
{
"name": "toplevel postgres includes",
"matcher": lambda x: "/" not in x,
"group_key": 0,
"priority": 9,
},
{
"name": "postgres.h",
"matcher": lambda x: x.strip() in ['"postgres.h"'],
"group_key": -1,
"priority": -1,
},
{
"name": "toplevel citus inlcudes",
"matcher": lambda x: x.strip()
in [
'"citus_version.h"',
'"pg_version_compat.h"',
'"pg_version_constants.h"',
],
"group_key": 3,
"priority": 0,
},
{
"name": "columnar includes",
"matcher": lambda x: x.startswith('"columnar/'),
"group_key": 4,
"priority": 1,
},
{
"name": "distributed includes",
"matcher": lambda x: x.startswith('"distributed/'),
"group_key": 5,
"priority": 1,
},
]
matchers.sort(key=lambda x: x["priority"])
# throughout our codebase we have some includes where either postgres or citus
# includes are wrongfully included with the syntax for system includes. Before we
# try to match those we will change the <> to "" to make them match our system. This
# will also rewrite the include to the correct syntax.
common_system_include_error_prefixes = ["<nodes/", "<distributed/"]
# assign every include to a group
for include in includes:
# extract the group key from the include
include_content = include.split(" ")[1]
# fix common system includes which are secretly postgres or citus includes
for common_prefix in common_system_include_error_prefixes:
if include_content.startswith(common_prefix):
include_content = '"' + include_content.strip()[1:-1] + '"'
include = include.split(" ")[0] + " " + include_content + "\n"
break
group_key = default_group_key
for matcher in matchers:
if matcher["matcher"](include_content):
group_key = matcher["group_key"]
break
groups[group_key].add(include)
# iterate over all groups in the natural order of its keys
for i, group in enumerate(sorted(groups.items())):
if i > 0:
print(file=file)
includes = group[1]
print("".join(sorted(includes)), end="", file=file)
if __name__ == "__main__":
main(sys.argv)

View File

@ -1,25 +0,0 @@
#!/bin/bash
set -euo pipefail
# shellcheck disable=SC1091
source ci/ci_helpers.sh
# find all core files
core_files=( $(find . -type f -regex .*core.*\d*.*postgres) )
if [ ${#core_files[@]} -gt 0 ]; then
# print stack traces for the core files
for core_file in "${core_files[@]}"
do
# set print frame-arguments all: show all scalars + structures in the frame
# set print pretty on: show structures in indented mode
# set print addr off: do not show pointer address
# thread apply all bt full: show stack traces for all threads
gdb --batch \
-ex "set print frame-arguments all" \
-ex "set print pretty on" \
-ex "set print addr off" \
-ex "thread apply all bt full" \
postgres "${core_file}"
done
fi

View File

@ -1,12 +0,0 @@
#!/bin/bash
set -euo pipefail
# shellcheck disable=SC1091
source ci/ci_helpers.sh
git ls-files \
| git check-attr --stdin citus-style \
| grep 'citus-style: set' \
| awk '{print $1}' \
| cut -d':' -f1 \
| xargs -n1 ./ci/include_grouping.py

View File

(three image files rendered by the compare viewer, unchanged in size: 94 KiB, 22 KiB, and 18 KiB)

View File

@ -10,7 +10,7 @@
# argument (other than "yes/no"), etc.
#
# The point of this implementation is to reduce code size and
# redundancy in configure.ac and to improve robustness and consistency
# redundancy in configure.in and to improve robustness and consistency
# in the option evaluation code.

53
configure vendored
View File

@ -1,6 +1,6 @@
#! /bin/sh
# Guess values for system-dependent variables and create Makefiles.
# Generated by GNU Autoconf 2.69 for Citus 13.2devel.
# Generated by GNU Autoconf 2.69 for Citus 11.0.8.
#
#
# Copyright (C) 1992-1996, 1998-2012 Free Software Foundation, Inc.
@ -579,8 +579,8 @@ MAKEFLAGS=
# Identity of this package.
PACKAGE_NAME='Citus'
PACKAGE_TARNAME='citus'
PACKAGE_VERSION='13.2devel'
PACKAGE_STRING='Citus 13.2devel'
PACKAGE_VERSION='11.0.8'
PACKAGE_STRING='Citus 11.0.8'
PACKAGE_BUGREPORT=''
PACKAGE_URL=''
@ -644,7 +644,6 @@ LDFLAGS
CFLAGS
CC
vpath_build
with_pg_version_check
PATH
PG_CONFIG
FLEX
@ -693,7 +692,6 @@ ac_subst_files=''
ac_user_opts='
enable_option_checking
with_extra_version
with_pg_version_check
enable_coverage
with_libcurl
with_reports_hostname
@ -1262,7 +1260,7 @@ if test "$ac_init_help" = "long"; then
# Omit some internal or obsolete options to make the list less imposing.
# This message is too long to be a string in the A/UX 3.1 sh.
cat <<_ACEOF
\`configure' configures Citus 13.2devel to adapt to many kinds of systems.
\`configure' configures Citus 11.0.8 to adapt to many kinds of systems.
Usage: $0 [OPTION]... [VAR=VALUE]...
@ -1324,7 +1322,7 @@ fi
if test -n "$ac_init_help"; then
case $ac_init_help in
short | recursive ) echo "Configuration of Citus 13.2devel:";;
short | recursive ) echo "Configuration of Citus 11.0.8:";;
esac
cat <<\_ACEOF
@ -1339,8 +1337,6 @@ Optional Packages:
--without-PACKAGE do not use PACKAGE (same as --with-PACKAGE=no)
--with-extra-version=STRING
append STRING to version
--without-pg-version-check
do not check postgres version during configure
--without-libcurl do not use libcurl for anonymous statistics
collection
--with-reports-hostname=HOSTNAME
@ -1429,7 +1425,7 @@ fi
test -n "$ac_init_help" && exit $ac_status
if $ac_init_version; then
cat <<\_ACEOF
Citus configure 13.2devel
Citus configure 11.0.8
generated by GNU Autoconf 2.69
Copyright (C) 2012 Free Software Foundation, Inc.
@ -1912,7 +1908,7 @@ cat >config.log <<_ACEOF
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by Citus $as_me 13.2devel, which was
It was created by Citus $as_me 11.0.8, which was
generated by GNU Autoconf 2.69. Invocation command line was
$ $0 $@
@ -2559,36 +2555,7 @@ if test -z "$version_num"; then
as_fn_error $? "Could not detect PostgreSQL version from pg_config." "$LINENO" 5
fi
# Check whether --with-pg-version-check was given.
if test "${with_pg_version_check+set}" = set; then :
withval=$with_pg_version_check;
case $withval in
yes)
:
;;
no)
:
;;
*)
as_fn_error $? "no argument expected for --with-pg-version-check option" "$LINENO" 5
;;
esac
else
with_pg_version_check=yes
fi
if test "$with_pg_version_check" = no; then
{ $as_echo "$as_me:${as_lineno-$LINENO}: building against PostgreSQL $version_num (skipped compatibility check)" >&5
$as_echo "$as_me: building against PostgreSQL $version_num (skipped compatibility check)" >&6;}
elif test "$version_num" != '15' -a "$version_num" != '16' -a "$version_num" != '17'; then
if test "$version_num" != '13' -a "$version_num" != '14'; then
as_fn_error $? "Citus is not compatible with the detected PostgreSQL version ${version_num}." "$LINENO" 5
else
{ $as_echo "$as_me:${as_lineno-$LINENO}: building against PostgreSQL $version_num" >&5
@ -5393,7 +5360,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
# report actual input values of CONFIG_FILES etc. instead of their
# values after options handling.
ac_log="
This file was extended by Citus $as_me 13.2devel, which was
This file was extended by Citus $as_me 11.0.8, which was
generated by GNU Autoconf 2.69. Invocation command line was
CONFIG_FILES = $CONFIG_FILES
@ -5455,7 +5422,7 @@ _ACEOF
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
ac_cs_version="\\
Citus config.status 13.2devel
Citus config.status 11.0.8
configured by $0, generated by GNU Autoconf 2.69,
with options \\"\$ac_cs_config\\"

View File

@ -5,7 +5,7 @@
# everyone needing autoconf installed, the resulting files are checked
# into the SCM.
AC_INIT([Citus], [13.2devel])
AC_INIT([Citus], [11.0.8])
AC_COPYRIGHT([Copyright (c) Citus Data, Inc.])
# we'll need sed and awk for some of the version commands
@ -74,13 +74,7 @@ if test -z "$version_num"; then
AC_MSG_ERROR([Could not detect PostgreSQL version from pg_config.])
fi
PGAC_ARG_BOOL(with, pg-version-check, yes,
[do not check postgres version during configure])
AC_SUBST(with_pg_version_check)
if test "$with_pg_version_check" = no; then
AC_MSG_NOTICE([building against PostgreSQL $version_num (skipped compatibility check)])
elif test "$version_num" != '15' -a "$version_num" != '16' -a "$version_num" != '17'; then
if test "$version_num" != '13' -a "$version_num" != '14'; then
AC_MSG_ERROR([Citus is not compatible with the detected PostgreSQL version ${version_num}.])
else
AC_MSG_NOTICE([building against PostgreSQL $version_num])

Binary image files not shown (eight files; sizes before: 95 KiB, 22 KiB, 102 KiB, 29 KiB, 69 KiB, 111 KiB, 12 KiB, and 168 KiB).

View File

@ -1,40 +0,0 @@
[tool.isort]
profile = 'black'
[tool.black]
include = '(src/test/regress/bin/diff-filter|\.pyi?|\.ipynb)$'
[tool.pytest.ini_options]
addopts = [
"--import-mode=importlib",
"--showlocals",
"--tb=short",
]
pythonpath = 'src/test/regress/citus_tests'
asyncio_mode = 'auto'
# Make test discovery quicker from the root dir of the repo
testpaths = ['src/test/regress/citus_tests/test']
# Make test discovery quicker from other directories than root directory
norecursedirs = [
'*.egg',
'.*',
'build',
'venv',
'ci',
'vendor',
'backend',
'bin',
'include',
'tmp_*',
'results',
'expected',
'sql',
'spec',
'data',
'__pycache__',
]
# Don't find files with test at the end such as run_test.py
python_files = ['test_*.py']

View File

@ -16,6 +16,7 @@ README.* conflict-marker-size=32
# Test output files that contain extra whitespace
*.out -whitespace
src/test/regress/output/*.source -whitespace
# These files are maintained or generated elsewhere. We take them as is.
configure -whitespace

View File

@ -1,3 +0,0 @@
# The directory used to store columnar sql files after pre-processing them
# with 'cpp' in build-time, see src/backend/columnar/Makefile.
/build/

View File

@ -1,60 +0,0 @@
citus_subdir = src/backend/columnar
citus_top_builddir = ../../..
safestringlib_srcdir = $(citus_abs_top_srcdir)/vendor/safestringlib
SUBDIRS = . safeclib
SUBDIRS +=
ENSURE_SUBDIRS_EXIST := $(shell mkdir -p $(SUBDIRS))
OBJS += \
$(patsubst $(citus_abs_srcdir)/%.c,%.o,$(foreach dir,$(SUBDIRS), $(sort $(wildcard $(citus_abs_srcdir)/$(dir)/*.c))))
MODULE_big = citus_columnar
EXTENSION = citus_columnar
template_sql_files = $(patsubst $(citus_abs_srcdir)/%,%,$(wildcard $(citus_abs_srcdir)/sql/*.sql))
template_downgrade_sql_files = $(patsubst $(citus_abs_srcdir)/sql/downgrades/%,%,$(wildcard $(citus_abs_srcdir)/sql/downgrades/*.sql))
generated_sql_files = $(patsubst %,$(citus_abs_srcdir)/build/%,$(template_sql_files))
generated_downgrade_sql_files += $(patsubst %,$(citus_abs_srcdir)/build/sql/%,$(template_downgrade_sql_files))
DATA_built = $(generated_sql_files)
PG_CPPFLAGS += -I$(libpq_srcdir) -I$(safestringlib_srcdir)/include
include $(citus_top_builddir)/Makefile.global
SQL_DEPDIR=.deps/sql
SQL_BUILDDIR=build/sql
$(generated_sql_files): $(citus_abs_srcdir)/build/%: %
@mkdir -p $(citus_abs_srcdir)/$(SQL_DEPDIR) $(citus_abs_srcdir)/$(SQL_BUILDDIR)
@# -MF is used to store dependency files(.Po) in another directory for separation
@# -MT is used to change the target of the rule emitted by dependency generation.
@# -P is used to inhibit generation of linemarkers in the output from the preprocessor.
@# -undef is used to not predefine any system-specific or GCC-specific macros.
@# `man cpp` for further information
cd $(citus_abs_srcdir) && cpp -undef -w -P -MMD -MP -MF$(SQL_DEPDIR)/$(*F).Po -MT$@ $< > $@
$(generated_downgrade_sql_files): $(citus_abs_srcdir)/build/sql/%: sql/downgrades/%
@mkdir -p $(citus_abs_srcdir)/$(SQL_DEPDIR) $(citus_abs_srcdir)/$(SQL_BUILDDIR)
@# -MF is used to store dependency files(.Po) in another directory for separation
@# -MT is used to change the target of the rule emitted by dependency generation.
@# -P is used to inhibit generation of linemarkers in the output from the preprocessor.
@# -undef is used to not predefine any system-specific or GCC-specific macros.
@# `man cpp` for further information
cd $(citus_abs_srcdir) && cpp -undef -w -P -MMD -MP -MF$(SQL_DEPDIR)/$(*F).Po -MT$@ $< > $@
.PHONY: install install-downgrades install-all
cleanup-before-install:
rm -f $(DESTDIR)$(datadir)/$(datamoduledir)/citus_columnar.control
rm -f $(DESTDIR)$(datadir)/$(datamoduledir)/columnar--*
rm -f $(DESTDIR)$(datadir)/$(datamoduledir)/citus_columnar--*
install: cleanup-before-install
# install and install-downgrades should be run sequentially
install-all: install
$(MAKE) install-downgrades
install-downgrades: $(generated_downgrade_sql_files)
$(INSTALL_DATA) $(generated_downgrade_sql_files) '$(DESTDIR)$(datadir)/$(datamoduledir)/'

View File

@ -52,7 +52,8 @@ Benefits of Citus Columnar over cstore_fdw:
... FOR UPDATE``)
* No support for serializable isolation level
* Support for PostgreSQL server versions 12+ only
* No support for foreign keys
* No support for foreign keys, unique constraints, or exclusion
constraints
* No support for logical decoding
* No support for intra-node parallel scans
* No support for ``AFTER ... FOR EACH ROW`` triggers
@ -89,25 +90,38 @@ data.
Set options using:
```sql
ALTER TABLE my_columnar_table SET
(columnar.compression = none, columnar.stripe_row_limit = 10000);
alter_columnar_table_set(
relid REGCLASS,
chunk_group_row_limit INT4 DEFAULT NULL,
stripe_row_limit INT4 DEFAULT NULL,
compression NAME DEFAULT NULL,
compression_level INT4)
```
For example:
```sql
SELECT alter_columnar_table_set(
'my_columnar_table',
compression => 'none',
stripe_row_limit => 10000);
```
The following options are available:
* **columnar.compression**: `[none|pglz|zstd|lz4|lz4hc]` - set the compression type
* **compression**: `[none|pglz|zstd|lz4|lz4hc]` - set the compression type
for _newly-inserted_ data. Existing data will not be
recompressed/decompressed. The default value is `zstd` (if support
has been compiled in).
* **columnar.compression_level**: ``<integer>`` - Sets compression level. Valid
* **compression_level**: ``<integer>`` - Sets compression level. Valid
settings are from 1 through 19. If the compression method does not
support the level chosen, the closest level will be selected
instead.
* **columnar.stripe_row_limit**: ``<integer>`` - the maximum number of rows per
* **stripe_row_limit**: ``<integer>`` - the maximum number of rows per
stripe for _newly-inserted_ data. Existing stripes of data will not
be changed and may have more rows than this maximum value. The
default value is `150000`.
* **columnar.chunk_group_row_limit**: ``<integer>`` - the maximum number of rows per
* **chunk_group_row_limit**: ``<integer>`` - the maximum number of rows per
chunk for _newly-inserted_ data. Existing chunks of data will not be
changed and may have more rows than this maximum value. The default
value is `10000`.
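As a hedged illustration (not part of the original README text), an option can be put back to its default with whichever form matches the API in use; both forms appear in this changeset, and `my_columnar_table` is a placeholder name:

```sql
-- Storage-parameter form (ALTER TABLE ... SET/RESET):
ALTER TABLE my_columnar_table SET (columnar.compression = pglz, columnar.compression_level = 3);
ALTER TABLE my_columnar_table RESET (columnar.stripe_row_limit);

-- UDF form (alter_columnar_table_reset); each boolean marks an option to reset:
SELECT alter_columnar_table_reset('my_columnar_table', stripe_row_limit => true);
```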
@ -233,14 +247,16 @@ CREATE TABLE perf_columnar(LIKE perf_row) USING COLUMNAR;
## Data
```sql
CREATE OR REPLACE FUNCTION random_words(n INT4) RETURNS TEXT LANGUAGE sql AS $$
WITH words(w) AS (
SELECT ARRAY['zero','one','two','three','four','five','six','seven','eight','nine','ten']
),
random (word) AS (
SELECT w[(random()*array_length(w, 1))::int] FROM generate_series(1, $1) AS i, words
)
SELECT string_agg(word, ' ') FROM random;
CREATE OR REPLACE FUNCTION random_words(n INT4) RETURNS TEXT LANGUAGE plpython2u AS $$
import random
t = ''
words = ['zero','one','two','three','four','five','six','seven','eight','nine','ten']
for i in xrange(0,n):
if (i != 0):
t += ' '
r = random.randint(0,len(words)-1)
t += words[r]
return t
$$;
```
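A short usage sketch (assumed, not taken from the original benchmark text) showing how the helper above can be exercised and how the columnar table is filled with the same rows for a like-for-like comparison:

```sql
-- Sanity-check the helper:
SELECT random_words(5);   -- e.g. 'three nine zero seven one'

-- After perf_row has been loaded, mirror its rows into the columnar table:
INSERT INTO perf_columnar SELECT * FROM perf_row;
```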

View File

@ -1,6 +0,0 @@
# Columnar extension
comment = 'Citus Columnar extension'
default_version = '12.2-1'
module_pathname = '$libdir/citus_columnar'
relocatable = false
schema = pg_catalog

View File

@ -11,20 +11,17 @@
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include <sys/stat.h>
#include <unistd.h>
#include "postgres.h"
#include "miscadmin.h"
#include "utils/guc.h"
#include "utils/rel.h"
#include "citus_version.h"
#include "columnar/columnar.h"
#include "columnar/columnar_tableam.h"
/* Default values for option parameters */
#define DEFAULT_STRIPE_ROW_COUNT 150000
@ -56,14 +53,6 @@ static const struct config_enum_entry columnar_compression_options[] =
{ NULL, 0, false }
};
void
columnar_init(void)
{
columnar_init_gucs();
columnar_tableam_init();
}
void
columnar_init_gucs()
{

View File

@ -13,22 +13,16 @@
*/
#include "postgres.h"
#include "citus_version.h"
#include "common/pg_lzcompress.h"
#include "lib/stringinfo.h"
#include "citus_version.h"
#include "pg_version_constants.h"
#include "columnar/columnar_compression.h"
#if HAVE_CITUS_LIBLZ4
#include <lz4.h>
#endif
#if PG_VERSION_NUM >= PG_VERSION_16
#include "varatt.h"
#endif
#if HAVE_LIBZSTD
#include <zstd.h>
#endif

View File

@ -10,17 +10,18 @@
*-------------------------------------------------------------------------
*/
#include <math.h>
#include "citus_version.h"
#include "postgres.h"
#include "miscadmin.h"
#include <math.h>
#include "access/amapi.h"
#include "access/skey.h"
#include "catalog/pg_am.h"
#include "catalog/pg_statistic.h"
#include "commands/defrem.h"
#include "miscadmin.h"
#include "nodes/extensible.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
@ -32,10 +33,6 @@
#include "optimizer/paths.h"
#include "optimizer/plancat.h"
#include "optimizer/restrictinfo.h"
#if PG_VERSION_NUM >= PG_VERSION_16
#include "parser/parse_relation.h"
#include "parser/parsetree.h"
#endif
#include "utils/builtins.h"
#include "utils/lsyscache.h"
#include "utils/relcache.h"
@ -43,13 +40,10 @@
#include "utils/selfuncs.h"
#include "utils/spccache.h"
#include "citus_version.h"
#include "columnar/columnar.h"
#include "columnar/columnar_customscan.h"
#include "columnar/columnar_metadata.h"
#include "columnar/columnar_tableam.h"
#include "distributed/listutils.h"
/*
@ -127,15 +121,14 @@ static void ColumnarScan_ExplainCustomScan(CustomScanState *node, List *ancestor
static const char * ColumnarPushdownClausesStr(List *context, List *clauses);
static const char * ColumnarProjectedColumnsStr(List *context,
List *projectedColumns);
#if PG_VERSION_NUM >= 130000
static List * set_deparse_context_planstate(List *dpcontext, Node *node,
List *ancestors);
#endif
/* other helpers */
static List * ColumnarVarNeeded(ColumnarScanState *columnarScanState);
static Bitmapset * ColumnarAttrNeeded(ScanState *ss);
#if PG_VERSION_NUM >= PG_VERSION_16
static Bitmapset * fixup_inherited_columns(Oid parentId, Oid childId, Bitmapset *columns);
#endif
/* saved hook value in case of unload */
static set_rel_pathlist_hook_type PreviousSetRelPathlistHook = NULL;
@ -207,7 +200,7 @@ columnar_customscan_init()
&EnableColumnarCustomScan,
true,
PGC_USERSET,
GUC_NO_SHOW_ALL | GUC_NOT_IN_SAMPLE,
GUC_NO_SHOW_ALL,
NULL, NULL, NULL);
DefineCustomBoolVariable(
"columnar.enable_qual_pushdown",
@ -217,7 +210,7 @@ columnar_customscan_init()
&EnableColumnarQualPushdown,
true,
PGC_USERSET,
GUC_NO_SHOW_ALL | GUC_NOT_IN_SAMPLE,
GUC_NO_SHOW_ALL,
NULL, NULL, NULL);
DefineCustomRealVariable(
"columnar.qual_pushdown_correlation_threshold",
@ -231,7 +224,7 @@ columnar_customscan_init()
0.0,
1.0,
PGC_USERSET,
GUC_NO_SHOW_ALL | GUC_NOT_IN_SAMPLE,
GUC_NO_SHOW_ALL,
NULL, NULL, NULL);
DefineCustomIntVariable(
"columnar.max_custom_scan_paths",
@ -243,7 +236,7 @@ columnar_customscan_init()
1,
1024,
PGC_USERSET,
GUC_NO_SHOW_ALL | GUC_NOT_IN_SAMPLE,
GUC_NO_SHOW_ALL,
NULL, NULL, NULL);
DefineCustomEnumVariable(
"columnar.planner_debug_level",
@ -284,11 +277,6 @@ ColumnarSetRelPathlistHook(PlannerInfo *root, RelOptInfo *rel, Index rti,
* into the scan of the table to minimize the data read.
*/
Relation relation = RelationIdGetRelation(rte->relid);
if (!RelationIsValid(relation))
{
ereport(ERROR, (errmsg("could not open relation with OID %u", rte->relid)));
}
if (relation->rd_tableam == GetColumnarTableAmRoutine())
{
if (rte->tablesample != NULL)
@ -363,7 +351,7 @@ ColumnarGetRelationInfoHook(PlannerInfo *root, Oid relationObjectId,
/* disable index-only scan */
IndexOptInfo *indexOptInfo = NULL;
foreach_declared_ptr(indexOptInfo, rel->indexlist)
foreach_ptr(indexOptInfo, rel->indexlist)
{
memset(indexOptInfo->canreturn, false, indexOptInfo->ncolumns * sizeof(bool));
}
@ -381,7 +369,7 @@ RemovePathsByPredicate(RelOptInfo *rel, PathPredicate removePathPredicate)
List *filteredPathList = NIL;
Path *path = NULL;
foreach_declared_ptr(path, rel->pathlist)
foreach_ptr(path, rel->pathlist)
{
if (!removePathPredicate(path))
{
@ -428,7 +416,7 @@ static void
CostColumnarPaths(PlannerInfo *root, RelOptInfo *rel, Oid relationId)
{
Path *path = NULL;
foreach_declared_ptr(path, rel->pathlist)
foreach_ptr(path, rel->pathlist)
{
if (IsA(path, IndexPath))
{
@ -513,11 +501,6 @@ ColumnarIndexScanAdditionalCost(PlannerInfo *root, RelOptInfo *rel,
&indexCorrelation, &fakeIndexPages);
Relation relation = RelationIdGetRelation(relationId);
if (!RelationIsValid(relation))
{
ereport(ERROR, (errmsg("could not open relation with OID %u", relationId)));
}
uint64 rowCount = ColumnarTableRowCount(relation);
RelationClose(relation);
double estimatedRows = rowCount * indexSelectivity;
@ -544,7 +527,7 @@ ColumnarIndexScanAdditionalCost(PlannerInfo *root, RelOptInfo *rel,
* "anti-correlated" (-1) since both help us avoiding from reading the
* same stripe again and again.
*/
double absIndexCorrelation = float_abs(indexCorrelation);
double absIndexCorrelation = Abs(indexCorrelation);
/*
* To estimate the number of stripes that we need to read, we do linear
@ -613,11 +596,6 @@ static int
RelationIdGetNumberOfAttributes(Oid relationId)
{
Relation relation = RelationIdGetRelation(relationId);
if (!RelationIsValid(relation))
{
ereport(ERROR, (errmsg("could not open relation with OID %u", relationId)));
}
int nattrs = relation->rd_att->natts;
RelationClose(relation);
return nattrs;
@ -663,7 +641,7 @@ CheckVarStats(PlannerInfo *root, Var *var, Oid sortop, float4 *absVarCorrelation
* If the Var is not highly correlated, then the chunk's min/max bounds
* will be nearly useless.
*/
if (float_abs(varCorrelation) < ColumnarQualPushdownCorrelationThreshold)
if (Abs(varCorrelation) < ColumnarQualPushdownCorrelationThreshold)
{
if (absVarCorrelation)
{
@ -671,7 +649,7 @@ CheckVarStats(PlannerInfo *root, Var *var, Oid sortop, float4 *absVarCorrelation
* Report absVarCorrelation if caller wants to know why given
* var is rejected.
*/
*absVarCorrelation = float_abs(varCorrelation);
*absVarCorrelation = Abs(varCorrelation);
}
return false;
}
@ -783,7 +761,7 @@ ExtractPushdownClause(PlannerInfo *root, RelOptInfo *rel, Node *node)
List *pushdownableArgs = NIL;
Node *boolExprArg = NULL;
foreach_declared_ptr(boolExprArg, boolExpr->args)
foreach_ptr(boolExprArg, boolExpr->args)
{
Expr *pushdownableArg = ExtractPushdownClause(root, rel,
(Node *) boolExprArg);
@ -1051,15 +1029,6 @@ FindCandidateRelids(PlannerInfo *root, RelOptInfo *rel, List *joinClauses)
candidateRelids = bms_del_members(candidateRelids, rel->relids);
candidateRelids = bms_del_members(candidateRelids, rel->lateral_relids);
/*
* For the relevant PG16 commit requiring this addition:
* postgres/postgres@2489d76
*/
#if PG_VERSION_NUM >= PG_VERSION_16
candidateRelids = bms_del_members(candidateRelids, root->outer_join_rels);
#endif
return candidateRelids;
}
@ -1321,9 +1290,6 @@ AddColumnarScanPath(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte,
cpath->methods = &ColumnarScanPathMethods;
/* necessary to avoid extra Result node in PG15 */
cpath->flags = CUSTOMPATH_SUPPORT_PROJECTION;
/*
* populate generic path information
*/
@ -1386,43 +1352,7 @@ AddColumnarScanPath(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte,
cpath->custom_private = list_make2(NIL, NIL);
}
int numberOfColumnsRead = 0;
#if PG_VERSION_NUM >= PG_VERSION_16
if (rte->perminfoindex > 0)
{
/*
* If perminfoindex > 0, that means that this relation's permission info
* is directly found in the list of rteperminfos of the Query(root->parse)
* So, all we have to do here is retrieve that info.
*/
RTEPermissionInfo *perminfo = getRTEPermissionInfo(root->parse->rteperminfos,
rte);
numberOfColumnsRead = bms_num_members(perminfo->selectedCols);
}
else
{
/*
* If perminfoindex = 0, that means we are skipping the check for permission info
* for this relation, which means that it's either a partition or an inheritance child.
* In these cases, we need to access the permission info of the top parent of this relation.
* After thorough checking, we found that the index of the top parent pointing to the correct
* range table entry in Query's range tables (root->parse->rtable) is found under
* RelOptInfo rel->top_parent->relid.
* For reference, check expand_partitioned_rtentry and expand_inherited_rtentry PG functions
*/
Assert(rel->top_parent);
RangeTblEntry *parent_rte = rt_fetch(rel->top_parent->relid, root->parse->rtable);
RTEPermissionInfo *perminfo = getRTEPermissionInfo(root->parse->rteperminfos,
parent_rte);
numberOfColumnsRead = bms_num_members(fixup_inherited_columns(perminfo->relid,
rte->relid,
perminfo->
selectedCols));
}
#else
numberOfColumnsRead = bms_num_members(rte->selectedCols);
#endif
int numberOfColumnsRead = bms_num_members(rte->selectedCols);
int numberOfClausesPushed = list_length(allClauses);
CostColumnarScan(root, rel, rte->relid, cpath, numberOfColumnsRead,
@ -1442,69 +1372,6 @@ AddColumnarScanPath(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte,
}
#if PG_VERSION_NUM >= PG_VERSION_16
/*
* fixup_inherited_columns
*
* Exact function Copied from PG16 as it's static.
*
* When user is querying on a table with children, it implicitly accesses
* child tables also. So, we also need to check security label of child
* tables and columns, but there is no guarantee attribute numbers are
* same between the parent and children.
* It returns a bitmapset which contains attribute number of the child
* table based on the given bitmapset of the parent.
*/
static Bitmapset *
fixup_inherited_columns(Oid parentId, Oid childId, Bitmapset *columns)
{
Bitmapset *result = NULL;
/*
* obviously, no need to do anything here
*/
if (parentId == childId)
{
return columns;
}
int index = -1;
while ((index = bms_next_member(columns, index)) >= 0)
{
/* bit numbers are offset by FirstLowInvalidHeapAttributeNumber */
AttrNumber attno = index + FirstLowInvalidHeapAttributeNumber;
/*
* whole-row-reference shall be fixed-up later
*/
if (attno == InvalidAttrNumber)
{
result = bms_add_member(result, index);
continue;
}
char *attname = get_attname(parentId, attno, false);
attno = get_attnum(childId, attname);
if (attno == InvalidAttrNumber)
{
elog(ERROR, "cache lookup failed for attribute %s of relation %u",
attname, childId);
}
result = bms_add_member(result,
attno - FirstLowInvalidHeapAttributeNumber);
pfree(attname);
}
return result;
}
#endif
/*
* CostColumnarScan calculates the cost of scanning the columnar table. The
* cost is estimated by using all stripe metadata to estimate based on the
@ -1544,19 +1411,13 @@ static Cost
ColumnarPerStripeScanCost(RelOptInfo *rel, Oid relationId, int numberOfColumnsRead)
{
Relation relation = RelationIdGetRelation(relationId);
if (!RelationIsValid(relation))
{
ereport(ERROR, (errmsg("could not open relation with OID %u", relationId)));
}
List *stripeList = StripesForRelfilelocator(RelationPhysicalIdentifier_compat(
relation));
List *stripeList = StripesForRelfilenode(relation->rd_node);
RelationClose(relation);
uint32 maxColumnCount = 0;
uint64 totalStripeSize = 0;
StripeMetadata *stripeMetadata = NULL;
foreach_declared_ptr(stripeMetadata, stripeList)
foreach_ptr(stripeMetadata, stripeList)
{
totalStripeSize += stripeMetadata->dataLength;
maxColumnCount = Max(maxColumnCount, stripeMetadata->columnCount);
@ -1602,13 +1463,7 @@ static uint64
ColumnarTableStripeCount(Oid relationId)
{
Relation relation = RelationIdGetRelation(relationId);
if (!RelationIsValid(relation))
{
ereport(ERROR, (errmsg("could not open relation with OID %u", relationId)));
}
List *stripeList = StripesForRelfilelocator(RelationPhysicalIdentifier_compat(
relation));
List *stripeList = StripesForRelfilenode(relation->rd_node);
int stripeCount = list_length(stripeList);
RelationClose(relation);
@ -1667,12 +1522,6 @@ ColumnarScanPath_PlanCustomPath(PlannerInfo *root,
cscan->scan.plan.targetlist = list_copy(tlist);
cscan->scan.scanrelid = best_path->path.parent->relid;
#if (PG_VERSION_NUM >= 150000)
/* necessary to avoid extra Result node in PG15 */
cscan->flags = CUSTOMPATH_SUPPORT_PROJECTION;
#endif
return (Plan *) cscan;
}
@ -1930,6 +1779,11 @@ ColumnarScan_EndCustomScan(CustomScanState *node)
*/
TableScanDesc scanDesc = node->ss.ss_currentScanDesc;
/*
* Free the exprcontext
*/
ExecFreeExprContext(&node->ss.ps);
/*
* clean out the tuple table
*/
@ -2119,6 +1973,8 @@ ColumnarVarNeeded(ColumnarScanState *columnarScanState)
}
#if PG_VERSION_NUM >= 130000
/*
* set_deparse_context_planstate is a compatibility wrapper for versions 13+.
*/
@ -2128,3 +1984,6 @@ set_deparse_context_planstate(List *dpcontext, Node *node, List *ancestors)
PlanState *ps = (PlanState *) node;
return set_deparse_context_plan(dpcontext, ps->plan, ancestors);
}
#endif

View File

@ -11,12 +11,13 @@
#include "postgres.h"
#include "funcapi.h"
#include "miscadmin.h"
#include "pg_config.h"
#include "access/nbtree.h"
#include "access/table.h"
#include "catalog/pg_am.h"
#include "catalog/pg_type.h"
#include "distributed/pg_version_constants.h"
#include "miscadmin.h"
#include "storage/fd.h"
#include "storage/smgr.h"
#include "utils/guc.h"
@ -25,8 +26,6 @@
#include "utils/tuplestore.h"
#include "pg_version_compat.h"
#include "pg_version_constants.h"
#include "columnar/columnar.h"
#include "columnar/columnar_storage.h"
#include "columnar/columnar_version_compat.h"
@ -117,6 +116,8 @@ columnar_storage_info(PG_FUNCTION_ARGS)
RelationGetRelationName(rel))));
}
RelationOpenSmgr(rel);
Datum values[STORAGE_INFO_NATTS] = { 0 };
bool nulls[STORAGE_INFO_NATTS] = { 0 };
@ -161,5 +162,5 @@ MemoryContextTotals(MemoryContext context, MemoryContextCounters *counters)
MemoryContextTotals(child, counters);
}
context->methods->stats(context, NULL, NULL, counters, true);
context->methods->stats_compat(context, NULL, NULL, counters, true);
}

View File

@ -19,58 +19,46 @@
*/
#include <sys/stat.h>
#include "postgres.h"
#include "miscadmin.h"
#include "port.h"
#include "safe_lib.h"
#include "citus_version.h"
#include "columnar/columnar.h"
#include "columnar/columnar_storage.h"
#include "columnar/columnar_version_compat.h"
#include "distributed/listutils.h"
#include <sys/stat.h>
#include "access/heapam.h"
#include "access/htup_details.h"
#include "access/nbtree.h"
#include "access/xact.h"
#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_type.h"
#include "catalog/namespace.h"
#include "commands/defrem.h"
#include "commands/sequence.h"
#include "commands/trigger.h"
#include "executor/executor.h"
#include "executor/spi.h"
#include "lib/stringinfo.h"
#include "miscadmin.h"
#include "nodes/execnodes.h"
#include "lib/stringinfo.h"
#include "port.h"
#include "storage/fd.h"
#include "storage/lmgr.h"
#include "storage/procarray.h"
#include "storage/smgr.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
#include "citus_version.h"
#include "pg_version_constants.h"
#include "columnar/columnar.h"
#include "columnar/columnar_storage.h"
#include "columnar/columnar_version_compat.h"
#include "distributed/listutils.h"
#if PG_VERSION_NUM >= PG_VERSION_16
#include "parser/parse_relation.h"
#include "storage/relfilelocator.h"
#include "utils/relfilenumbermap.h"
#else
#include "utils/relfilenodemap.h"
#endif
#define COLUMNAR_RELOPTION_NAMESPACE "columnar"
#define SLOW_METADATA_ACCESS_WARNING \
"Metadata index %s is not available, this might mean slower read/writes " \
"on columnar tables. This is expected during Postgres upgrades and not " \
@ -98,7 +86,6 @@ typedef enum RowNumberLookupMode
FIND_GREATER
} RowNumberLookupMode;
static void ParseColumnarRelOptions(List *reloptions, ColumnarOptions *options);
static void InsertEmptyStripeMetadataRow(uint64 storageId, uint64 stripeId,
uint32 columnCount, uint32 chunkGroupRowCount,
uint64 firstRowNumber);
@ -123,7 +110,7 @@ static Oid ColumnarChunkGroupRelationId(void);
static Oid ColumnarChunkIndexRelationId(void);
static Oid ColumnarChunkGroupIndexRelationId(void);
static Oid ColumnarNamespaceId(void);
static uint64 LookupStorageId(RelFileLocator relfilelocator);
static uint64 LookupStorageId(RelFileNode relfilenode);
static uint64 GetHighestUsedRowNumber(uint64 storageId);
static void DeleteStorageFromColumnarMetadataTable(Oid metadataTableId,
AttrNumber storageIdAtrrNumber,
@ -235,154 +222,6 @@ InitColumnarOptions(Oid regclass)
}
/*
* ParseColumnarRelOptions - update the given 'options' using the given list
* of DefElem.
*/
static void
ParseColumnarRelOptions(List *reloptions, ColumnarOptions *options)
{
ListCell *lc = NULL;
foreach(lc, reloptions)
{
DefElem *elem = castNode(DefElem, lfirst(lc));
if (elem->defnamespace == NULL ||
strcmp(elem->defnamespace, COLUMNAR_RELOPTION_NAMESPACE) != 0)
{
ereport(ERROR, (errmsg("columnar options must have the prefix \"%s\"",
COLUMNAR_RELOPTION_NAMESPACE)));
}
if (strcmp(elem->defname, "chunk_group_row_limit") == 0)
{
options->chunkRowCount = (elem->arg == NULL) ?
columnar_chunk_group_row_limit : defGetInt64(elem);
if (options->chunkRowCount < CHUNK_ROW_COUNT_MINIMUM ||
options->chunkRowCount > CHUNK_ROW_COUNT_MAXIMUM)
{
ereport(ERROR, (errmsg("chunk group row count limit out of range"),
errhint("chunk group row count limit must be between "
UINT64_FORMAT " and " UINT64_FORMAT,
(uint64) CHUNK_ROW_COUNT_MINIMUM,
(uint64) CHUNK_ROW_COUNT_MAXIMUM)));
}
}
else if (strcmp(elem->defname, "stripe_row_limit") == 0)
{
options->stripeRowCount = (elem->arg == NULL) ?
columnar_stripe_row_limit : defGetInt64(elem);
if (options->stripeRowCount < STRIPE_ROW_COUNT_MINIMUM ||
options->stripeRowCount > STRIPE_ROW_COUNT_MAXIMUM)
{
ereport(ERROR, (errmsg("stripe row count limit out of range"),
errhint("stripe row count limit must be between "
UINT64_FORMAT " and " UINT64_FORMAT,
(uint64) STRIPE_ROW_COUNT_MINIMUM,
(uint64) STRIPE_ROW_COUNT_MAXIMUM)));
}
}
else if (strcmp(elem->defname, "compression") == 0)
{
options->compressionType = (elem->arg == NULL) ?
columnar_compression : ParseCompressionType(
defGetString(elem));
if (options->compressionType == COMPRESSION_TYPE_INVALID)
{
ereport(ERROR, (errmsg("unknown compression type for columnar table: %s",
quote_identifier(defGetString(elem)))));
}
}
else if (strcmp(elem->defname, "compression_level") == 0)
{
options->compressionLevel = (elem->arg == NULL) ?
columnar_compression_level : defGetInt64(elem);
if (options->compressionLevel < COMPRESSION_LEVEL_MIN ||
options->compressionLevel > COMPRESSION_LEVEL_MAX)
{
ereport(ERROR, (errmsg("compression level out of range"),
errhint("compression level must be between %d and %d",
COMPRESSION_LEVEL_MIN,
COMPRESSION_LEVEL_MAX)));
}
}
else
{
ereport(ERROR, (errmsg("unrecognized columnar storage parameter \"%s\"",
elem->defname)));
}
}
}
/*
* ExtractColumnarRelOptions - extract columnar options from inOptions, appending
* to inoutColumnarOptions. Return the remaining (non-columnar) options.
*/
List *
ExtractColumnarRelOptions(List *inOptions, List **inoutColumnarOptions)
{
List *otherOptions = NIL;
ListCell *lc = NULL;
foreach(lc, inOptions)
{
DefElem *elem = castNode(DefElem, lfirst(lc));
if (elem->defnamespace != NULL &&
strcmp(elem->defnamespace, COLUMNAR_RELOPTION_NAMESPACE) == 0)
{
*inoutColumnarOptions = lappend(*inoutColumnarOptions, elem);
}
else
{
otherOptions = lappend(otherOptions, elem);
}
}
/* validate options */
ColumnarOptions dummy = { 0 };
ParseColumnarRelOptions(*inoutColumnarOptions, &dummy);
return otherOptions;
}
/*
* SetColumnarRelOptions - apply the list of DefElem options to the
* relation. If there are duplicates, the last one in the list takes effect.
*/
void
SetColumnarRelOptions(RangeVar *rv, List *reloptions)
{
ColumnarOptions options = { 0 };
if (reloptions == NIL)
{
return;
}
Relation rel = relation_openrv(rv, AccessShareLock);
Oid relid = RelationGetRelid(rel);
relation_close(rel, NoLock);
/* get existing or default options */
if (!ReadColumnarOptions(relid, &options))
{
/* if extension doesn't exist, just return */
return;
}
ParseColumnarRelOptions(reloptions, &options);
SetColumnarOptions(relid, &options);
}
/*
* SetColumnarOptions writes the passed table options as the authoritative options to the
* table, regardless of whether options already exist or not. This can be used to put a
@ -602,15 +441,14 @@ ReadColumnarOptions(Oid regclass, ColumnarOptions *options)
* of columnar.chunk.
*/
void
SaveStripeSkipList(RelFileLocator relfilelocator, uint64 stripe,
StripeSkipList *chunkList,
SaveStripeSkipList(RelFileNode relfilenode, uint64 stripe, StripeSkipList *chunkList,
TupleDesc tupleDescriptor)
{
uint32 columnIndex = 0;
uint32 chunkIndex = 0;
uint32 columnCount = chunkList->columnCount;
uint64 storageId = LookupStorageId(relfilelocator);
uint64 storageId = LookupStorageId(relfilenode);
Oid columnarChunkOid = ColumnarChunkRelationId();
Relation columnarChunk = table_open(columnarChunkOid, RowExclusiveLock);
ModifyState *modifyState = StartModifyRelation(columnarChunk);
@ -669,10 +507,10 @@ SaveStripeSkipList(RelFileLocator relfilelocator, uint64 stripe,
* SaveChunkGroups saves the metadata for given chunk groups in columnar.chunk_group.
*/
void
SaveChunkGroups(RelFileLocator relfilelocator, uint64 stripe,
SaveChunkGroups(RelFileNode relfilenode, uint64 stripe,
List *chunkGroupRowCounts)
{
uint64 storageId = LookupStorageId(relfilelocator);
uint64 storageId = LookupStorageId(relfilenode);
Oid columnarChunkGroupOid = ColumnarChunkGroupRelationId();
Relation columnarChunkGroup = table_open(columnarChunkGroupOid, RowExclusiveLock);
ModifyState *modifyState = StartModifyRelation(columnarChunkGroup);
@ -705,8 +543,7 @@ SaveChunkGroups(RelFileLocator relfilelocator, uint64 stripe,
* ReadStripeSkipList fetches chunk metadata for a given stripe.
*/
StripeSkipList *
ReadStripeSkipList(RelFileLocator relfilelocator, uint64 stripe,
TupleDesc tupleDescriptor,
ReadStripeSkipList(RelFileNode relfilenode, uint64 stripe, TupleDesc tupleDescriptor,
uint32 chunkCount, Snapshot snapshot)
{
int32 columnIndex = 0;
@ -714,15 +551,15 @@ ReadStripeSkipList(RelFileLocator relfilelocator, uint64 stripe,
uint32 columnCount = tupleDescriptor->natts;
ScanKeyData scanKey[2];
uint64 storageId = LookupStorageId(relfilelocator);
uint64 storageId = LookupStorageId(relfilenode);
Oid columnarChunkOid = ColumnarChunkRelationId();
Relation columnarChunk = table_open(columnarChunkOid, AccessShareLock);
ScanKeyInit(&scanKey[0], Anum_columnar_chunk_storageid,
BTEqualStrategyNumber, F_INT8EQ, Int64GetDatum(storageId));
BTEqualStrategyNumber, F_OIDEQ, UInt64GetDatum(storageId));
ScanKeyInit(&scanKey[1], Anum_columnar_chunk_stripe,
BTEqualStrategyNumber, F_INT8EQ, Int64GetDatum(stripe));
BTEqualStrategyNumber, F_OIDEQ, Int32GetDatum(stripe));
Oid indexId = ColumnarChunkIndexRelationId();
bool indexOk = OidIsValid(indexId);
@ -928,7 +765,7 @@ StripeMetadataLookupRowNumber(Relation relation, uint64 rowNumber, Snapshot snap
uint64 storageId = ColumnarStorageGetStorageId(relation, false);
ScanKeyData scanKey[2];
ScanKeyInit(&scanKey[0], Anum_columnar_stripe_storageid,
BTEqualStrategyNumber, F_INT8EQ, Int64GetDatum(storageId));
BTEqualStrategyNumber, F_OIDEQ, Int32GetDatum(storageId));
StrategyNumber strategyNumber = InvalidStrategy;
RegProcedure procedure = InvalidOid;
@ -943,7 +780,7 @@ StripeMetadataLookupRowNumber(Relation relation, uint64 rowNumber, Snapshot snap
procedure = F_INT8GT;
}
ScanKeyInit(&scanKey[1], Anum_columnar_stripe_first_row_number,
strategyNumber, procedure, Int64GetDatum(rowNumber));
strategyNumber, procedure, UInt64GetDatum(rowNumber));
Relation columnarStripes = table_open(ColumnarStripeRelationId(), AccessShareLock);
@ -1094,7 +931,7 @@ FindStripeWithHighestRowNumber(Relation relation, Snapshot snapshot)
uint64 storageId = ColumnarStorageGetStorageId(relation, false);
ScanKeyData scanKey[1];
ScanKeyInit(&scanKey[0], Anum_columnar_stripe_storageid,
BTEqualStrategyNumber, F_INT8EQ, Int64GetDatum(storageId));
BTEqualStrategyNumber, F_OIDEQ, Int32GetDatum(storageId));
Relation columnarStripes = table_open(ColumnarStripeRelationId(), AccessShareLock);
@ -1156,9 +993,9 @@ ReadChunkGroupRowCounts(uint64 storageId, uint64 stripe, uint32 chunkGroupCount,
ScanKeyData scanKey[2];
ScanKeyInit(&scanKey[0], Anum_columnar_chunkgroup_storageid,
BTEqualStrategyNumber, F_INT8EQ, Int64GetDatum(storageId));
BTEqualStrategyNumber, F_OIDEQ, UInt64GetDatum(storageId));
ScanKeyInit(&scanKey[1], Anum_columnar_chunkgroup_stripe,
BTEqualStrategyNumber, F_INT8EQ, Int64GetDatum(stripe));
BTEqualStrategyNumber, F_OIDEQ, Int32GetDatum(stripe));
Oid indexId = ColumnarChunkGroupIndexRelationId();
bool indexOk = OidIsValid(indexId);
@ -1248,13 +1085,13 @@ InsertEmptyStripeMetadataRow(uint64 storageId, uint64 stripeId, uint32 columnCou
/*
* StripesForRelfilelocator returns a list of StripeMetadata for stripes
* StripesForRelfilenode returns a list of StripeMetadata for stripes
* of the given relfilenode.
*/
List *
StripesForRelfilelocator(RelFileLocator relfilelocator)
StripesForRelfilenode(RelFileNode relfilenode)
{
uint64 storageId = LookupStorageId(relfilelocator);
uint64 storageId = LookupStorageId(relfilenode);
return ReadDataFileStripeList(storageId, GetTransactionSnapshot());
}
@ -1269,9 +1106,9 @@ StripesForRelfilelocator(RelFileLocator relfilelocator)
* returns 0.
*/
uint64
GetHighestUsedAddress(RelFileLocator relfilelocator)
GetHighestUsedAddress(RelFileNode relfilenode)
{
uint64 storageId = LookupStorageId(relfilelocator);
uint64 storageId = LookupStorageId(relfilenode);
uint64 highestUsedAddress = 0;
uint64 highestUsedId = 0;
@ -1385,9 +1222,9 @@ UpdateStripeMetadataRow(uint64 storageId, uint64 stripeId, bool *update,
ScanKeyData scanKey[2];
ScanKeyInit(&scanKey[0], Anum_columnar_stripe_storageid,
BTEqualStrategyNumber, F_INT8EQ, Int64GetDatum(storageId));
BTEqualStrategyNumber, F_OIDEQ, Int32GetDatum(storageId));
ScanKeyInit(&scanKey[1], Anum_columnar_stripe_stripe,
BTEqualStrategyNumber, F_INT8EQ, Int64GetDatum(stripeId));
BTEqualStrategyNumber, F_OIDEQ, Int32GetDatum(stripeId));
Oid columnarStripesOid = ColumnarStripeRelationId();
@ -1464,7 +1301,7 @@ ReadDataFileStripeList(uint64 storageId, Snapshot snapshot)
HeapTuple heapTuple;
ScanKeyInit(&scanKey[0], Anum_columnar_stripe_storageid,
BTEqualStrategyNumber, F_INT8EQ, Int64GetDatum(storageId));
BTEqualStrategyNumber, F_OIDEQ, Int32GetDatum(storageId));
Oid columnarStripesOid = ColumnarStripeRelationId();
@ -1552,7 +1389,7 @@ BuildStripeMetadata(Relation columnarStripes, HeapTuple heapTuple)
* metadata tables.
*/
void
DeleteMetadataRows(RelFileLocator relfilelocator)
DeleteMetadataRows(RelFileNode relfilenode)
{
/*
* During a restore for binary upgrade, metadata tables and indexes may or
@ -1563,7 +1400,7 @@ DeleteMetadataRows(RelFileLocator relfilelocator)
return;
}
uint64 storageId = LookupStorageId(relfilelocator);
uint64 storageId = LookupStorageId(relfilenode);
DeleteStorageFromColumnarMetadataTable(ColumnarStripeRelationId(),
Anum_columnar_stripe_storageid,
@ -1591,7 +1428,7 @@ DeleteStorageFromColumnarMetadataTable(Oid metadataTableId,
{
ScanKeyData scanKey[1];
ScanKeyInit(&scanKey[0], storageIdAtrrNumber, BTEqualStrategyNumber,
F_INT8EQ, Int64GetDatum(storageId));
F_INT8EQ, UInt64GetDatum(storageId));
Relation metadataTable = try_relation_open(metadataTableId, AccessShareLock);
if (metadataTable == NULL)
@ -1636,8 +1473,12 @@ StartModifyRelation(Relation rel)
{
EState *estate = create_estate_for_relation(rel);
#if PG_VERSION_NUM >= PG_VERSION_14
ResultRelInfo *resultRelInfo = makeNode(ResultRelInfo);
InitResultRelInfo(resultRelInfo, rel, 1, NULL, 0);
#else
ResultRelInfo *resultRelInfo = estate->es_result_relation_info;
#endif
/* ExecSimpleRelationInsert, ... require caller to open indexes */
ExecOpenIndices(resultRelInfo, false);
@ -1667,7 +1508,7 @@ InsertTupleAndEnforceConstraints(ModifyState *state, Datum *values, bool *nulls)
ExecStoreHeapTuple(tuple, slot, false);
/* use ExecSimpleRelationInsert to enforce constraints */
ExecSimpleRelationInsert(state->resultRelInfo, state->estate, slot);
ExecSimpleRelationInsert_compat(state->resultRelInfo, state->estate, slot);
}
@ -1685,7 +1526,7 @@ DeleteTupleAndEnforceConstraints(ModifyState *state, HeapTuple heapTuple)
simple_heap_delete(state->rel, tid);
/* execute AFTER ROW DELETE Triggers to enforce constraints */
ExecARDeleteTriggers(estate, resultRelInfo, tid, NULL, NULL, false);
ExecARDeleteTriggers(estate, resultRelInfo, tid, NULL, NULL);
}
@ -1698,8 +1539,12 @@ FinishModifyRelation(ModifyState *state)
ExecCloseIndices(state->resultRelInfo);
AfterTriggerEndQuery(state->estate);
#if PG_VERSION_NUM >= PG_VERSION_14
ExecCloseResultRelations(state->estate);
ExecCloseRangeTableRelations(state->estate);
#else
ExecCleanUpTriggerState(state->estate);
#endif
ExecResetTupleTable(state->estate->es_tupleTable, false);
FreeExecutorState(state->estate);
@ -1726,13 +1571,15 @@ create_estate_for_relation(Relation rel)
rte->relid = RelationGetRelid(rel);
rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
#if PG_VERSION_NUM >= PG_VERSION_16
List *perminfos = NIL;
addRTEPermissionInfo(&perminfos, rte);
ExecInitRangeTable(estate, list_make1(rte), perminfos);
#else
ExecInitRangeTable(estate, list_make1(rte));
#if PG_VERSION_NUM < PG_VERSION_14
ResultRelInfo *resultRelInfo = makeNode(ResultRelInfo);
InitResultRelInfo(resultRelInfo, rel, 1, NULL, 0);
estate->es_result_relations = resultRelInfo;
estate->es_num_result_relations = 1;
estate->es_result_relation_info = resultRelInfo;
#endif
estate->es_output_cid = GetCurrentCommandId(true);
@ -1920,15 +1767,7 @@ ColumnarChunkGroupIndexRelationId(void)
static Oid
ColumnarNamespaceId(void)
{
Oid namespace = get_namespace_oid("columnar_internal", true);
/* if schema is earlier than 11.1-1 */
if (!OidIsValid(namespace))
{
namespace = get_namespace_oid("columnar", false);
}
return namespace;
return get_namespace_oid("columnar", false);
}
@ -1937,11 +1776,10 @@ ColumnarNamespaceId(void)
* false if the relation doesn't have a meta page yet.
*/
static uint64
LookupStorageId(RelFileLocator relfilelocator)
LookupStorageId(RelFileNode relfilenode)
{
Oid relationId = RelidByRelfilenumber(RelationTablespace_compat(relfilelocator),
RelationPhysicalIdentifierNumber_compat(
relfilelocator));
Oid relationId = RelidByRelfilenode(relfilenode.spcNode,
relfilenode.relNode);
Relation relation = relation_open(relationId, AccessShareLock);
uint64 storageId = ColumnarStorageGetStorageId(relation, false);
@ -1971,13 +1809,6 @@ columnar_relation_storageid(PG_FUNCTION_ARGS)
{
Oid relationId = PG_GETARG_OID(0);
Relation relation = relation_open(relationId, AccessShareLock);
if (!object_ownercheck(RelationRelationId, relationId, GetUserId()))
{
aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_TABLE,
get_rel_name(relationId));
}
if (!IsColumnarTableAmTable(relationId))
{
elog(ERROR, "relation \"%s\" is not a columnar table",
@ -2004,10 +1835,11 @@ ColumnarStorageUpdateIfNeeded(Relation rel, bool isUpgrade)
return;
}
BlockNumber nblocks = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
RelationOpenSmgr(rel);
BlockNumber nblocks = smgrnblocks(rel->rd_smgr, MAIN_FORKNUM);
if (nblocks < 2)
{
ColumnarStorageInit(RelationGetSmgr(rel), ColumnarMetadataNewStorageId());
ColumnarStorageInit(rel->rd_smgr, ColumnarMetadataNewStorageId());
return;
}
@ -2041,7 +1873,7 @@ GetHighestUsedRowNumber(uint64 storageId)
List *stripeMetadataList = ReadDataFileStripeList(storageId,
GetTransactionSnapshot());
StripeMetadata *stripeMetadata = NULL;
foreach_declared_ptr(stripeMetadata, stripeMetadataList)
foreach_ptr(stripeMetadata, stripeMetadataList)
{
highestRowNumber = Max(highestRowNumber,
StripeGetHighestRowNumber(stripeMetadata));

View File

@ -22,15 +22,16 @@
#include "access/xact.h"
#include "catalog/pg_am.h"
#include "commands/defrem.h"
#include "distributed/listutils.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/optimizer.h"
#include "optimizer/clauses.h"
#include "optimizer/restrictinfo.h"
#include "storage/fd.h"
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
#include "columnar/columnar.h"
@ -38,8 +39,6 @@
#include "columnar/columnar_tableam.h"
#include "columnar/columnar_version_compat.h"
#include "distributed/listutils.h"
#define UNEXPECTED_STRIPE_READ_ERR_MSG \
"attempted to read an unexpected stripe while reading columnar " \
"table %s, stripe with id=" UINT64_FORMAT " is not flushed"
@ -255,9 +254,8 @@ ColumnarReadFlushPendingWrites(ColumnarReadState *readState)
{
Assert(!readState->snapshotRegisteredByUs);
RelFileNumber relfilenumber = RelationPhysicalIdentifierNumber_compat(
RelationPhysicalIdentifier_compat(readState->relation));
FlushWriteStateForRelfilenumber(relfilenumber, GetCurrentSubTransactionId());
Oid relfilenode = readState->relation->rd_node.relNode;
FlushWriteStateForRelfilenode(relfilenode, GetCurrentSubTransactionId());
if (readState->snapshot == InvalidSnapshot || !IsMVCCSnapshot(readState->snapshot))
{
@ -880,7 +878,7 @@ ReadChunkGroupNextRow(ChunkGroupReadState *chunkGroupReadState, Datum *columnVal
memset(columnNulls, true, sizeof(bool) * chunkGroupReadState->columnCount);
int attno;
foreach_declared_int(attno, chunkGroupReadState->projectedColumnList)
foreach_int(attno, chunkGroupReadState->projectedColumnList)
{
const ChunkData *chunkGroupData = chunkGroupReadState->chunkGroupData;
const int rowIndex = chunkGroupReadState->currentRow;
@ -986,8 +984,7 @@ ColumnarTableRowCount(Relation relation)
{
ListCell *stripeMetadataCell = NULL;
uint64 totalRowCount = 0;
List *stripeList = StripesForRelfilelocator(RelationPhysicalIdentifier_compat(
relation));
List *stripeList = StripesForRelfilenode(relation->rd_node);
foreach(stripeMetadataCell, stripeList)
{
@ -1015,8 +1012,7 @@ LoadFilteredStripeBuffers(Relation relation, StripeMetadata *stripeMetadata,
bool *projectedColumnMask = ProjectedColumnMask(columnCount, projectedColumnList);
StripeSkipList *stripeSkipList = ReadStripeSkipList(RelationPhysicalIdentifier_compat(
relation),
StripeSkipList *stripeSkipList = ReadStripeSkipList(relation->rd_node,
stripeMetadata->id,
tupleDescriptor,
stripeMetadata->chunkCount,
@ -1489,7 +1485,7 @@ ProjectedColumnMask(uint32 columnCount, List *projectedColumnList)
bool *projectedColumnMask = palloc0(columnCount * sizeof(bool));
int attno;
foreach_declared_int(attno, projectedColumnList)
foreach_int(attno, projectedColumnList)
{
/* attno is 1-indexed; projectedColumnMask is 0-indexed */
int columnIndex = attno - 1;
@ -1561,7 +1557,7 @@ DeserializeDatumArray(StringInfo datumBuffer, bool *existsArray, uint32 datumCou
datumTypeLength);
currentDatumDataOffset = att_addlength_datum(currentDatumDataOffset,
datumTypeLength,
datumArray[datumIndex]);
currentDatumDataPointer);
currentDatumDataOffset = att_align_nominal(currentDatumDataOffset,
datumTypeAlign);

View File

@ -36,16 +36,14 @@
#include "postgres.h"
#include "miscadmin.h"
#include "safe_lib.h"
#include "access/generic_xlog.h"
#include "catalog/storage.h"
#include "miscadmin.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
#include "pg_version_compat.h"
#include "columnar/columnar.h"
#include "columnar/columnar_storage.h"
@ -169,11 +167,7 @@ ColumnarStorageInit(SMgrRelation srel, uint64 storageId)
}
/* create two pages */
#if PG_VERSION_NUM >= PG_VERSION_16
PGIOAlignedBlock block;
#else
PGAlignedBlock block;
#endif
Page page = block.data;
/* write metapage */
@ -192,7 +186,7 @@ ColumnarStorageInit(SMgrRelation srel, uint64 storageId)
(char *) &metapage, sizeof(ColumnarMetapage));
phdr->pd_lower += sizeof(ColumnarMetapage);
log_newpage(RelationPhysicalIdentifierBackend_compat(&srel), MAIN_FORKNUM,
log_newpage(&srel->smgr_rnode.node, MAIN_FORKNUM,
COLUMNAR_METAPAGE_BLOCKNO, page, true);
PageSetChecksumInplace(page, COLUMNAR_METAPAGE_BLOCKNO);
smgrextend(srel, MAIN_FORKNUM, COLUMNAR_METAPAGE_BLOCKNO, page, true);
@ -200,7 +194,7 @@ ColumnarStorageInit(SMgrRelation srel, uint64 storageId)
/* write empty page */
PageInit(page, BLCKSZ, 0);
log_newpage(RelationPhysicalIdentifierBackend_compat(&srel), MAIN_FORKNUM,
log_newpage(&srel->smgr_rnode.node, MAIN_FORKNUM,
COLUMNAR_EMPTY_BLOCKNO, page, true);
PageSetChecksumInplace(page, COLUMNAR_EMPTY_BLOCKNO);
smgrextend(srel, MAIN_FORKNUM, COLUMNAR_EMPTY_BLOCKNO, page, true);
@ -360,7 +354,8 @@ ColumnarStorageGetReservedOffset(Relation rel, bool force)
bool
ColumnarStorageIsCurrent(Relation rel)
{
BlockNumber nblocks = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
RelationOpenSmgr(rel);
BlockNumber nblocks = smgrnblocks(rel->rd_smgr, MAIN_FORKNUM);
if (nblocks < 2)
{
@ -444,7 +439,8 @@ ColumnarStorageReserveData(Relation rel, uint64 amount)
PhysicalAddr final = LogicalToPhysical(nextReservation - 1);
/* extend with new pages */
BlockNumber nblocks = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
RelationOpenSmgr(rel);
BlockNumber nblocks = smgrnblocks(rel->rd_smgr, MAIN_FORKNUM);
while (nblocks <= final.blockno)
{
@ -551,7 +547,8 @@ ColumnarStorageTruncate(Relation rel, uint64 newDataReservation)
rel->rd_id, newDataReservation);
}
BlockNumber old_rel_pages = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
RelationOpenSmgr(rel);
BlockNumber old_rel_pages = smgrnblocks(rel->rd_smgr, MAIN_FORKNUM);
if (old_rel_pages == 0)
{
/* nothing to do */
@ -630,7 +627,8 @@ ColumnarOverwriteMetapage(Relation relation, ColumnarMetapage columnarMetapage)
static ColumnarMetapage
ColumnarMetapageRead(Relation rel, bool force)
{
BlockNumber nblocks = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
RelationOpenSmgr(rel);
BlockNumber nblocks = smgrnblocks(rel->rd_smgr, MAIN_FORKNUM);
if (nblocks == 0)
{
/*

File diff suppressed because it is too large.

View File

@ -16,37 +16,28 @@
#include "postgres.h"
#include "miscadmin.h"
#include "safe_lib.h"
#include "access/heapam.h"
#include "access/nbtree.h"
#include "catalog/pg_am.h"
#include "miscadmin.h"
#include "storage/fd.h"
#include "storage/smgr.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/rel.h"
#include "pg_version_compat.h"
#include "pg_version_constants.h"
#include "utils/relfilenodemap.h"
#include "columnar/columnar.h"
#include "columnar/columnar_storage.h"
#include "columnar/columnar_version_compat.h"
#if PG_VERSION_NUM >= PG_VERSION_16
#include "storage/relfilelocator.h"
#include "utils/relfilenumbermap.h"
#else
#include "utils/relfilenodemap.h"
#endif
struct ColumnarWriteState
{
TupleDesc tupleDescriptor;
FmgrInfo **comparisonFunctionArray;
RelFileLocator relfilelocator;
RelFileNode relfilenode;
MemoryContext stripeWriteContext;
MemoryContext perTupleContext;
@ -93,7 +84,7 @@ static StringInfo CopyStringInfo(StringInfo sourceString);
* data load operation.
*/
ColumnarWriteState *
ColumnarBeginWrite(RelFileLocator relfilelocator,
ColumnarBeginWrite(RelFileNode relfilenode,
ColumnarOptions options,
TupleDesc tupleDescriptor)
{
@ -133,7 +124,7 @@ ColumnarBeginWrite(RelFileLocator relfilelocator,
options.chunkRowCount);
ColumnarWriteState *writeState = palloc0(sizeof(ColumnarWriteState));
writeState->relfilelocator = relfilelocator;
writeState->relfilenode = relfilenode;
writeState->options = options;
writeState->tupleDescriptor = CreateTupleDescCopy(tupleDescriptor);
writeState->comparisonFunctionArray = comparisonFunctionArray;
@ -183,10 +174,8 @@ ColumnarWriteRow(ColumnarWriteState *writeState, Datum *columnValues, bool *colu
writeState->stripeSkipList = stripeSkipList;
writeState->compressionBuffer = makeStringInfo();
Oid relationId = RelidByRelfilenumber(RelationTablespace_compat(
writeState->relfilelocator),
RelationPhysicalIdentifierNumber_compat(
writeState->relfilelocator));
Oid relationId = RelidByRelfilenode(writeState->relfilenode.spcNode,
writeState->relfilenode.relNode);
Relation relation = relation_open(relationId, NoLock);
writeState->emptyStripeReservation =
ReserveEmptyStripe(relation, columnCount, chunkRowCount,
@ -404,10 +393,8 @@ FlushStripe(ColumnarWriteState *writeState)
elog(DEBUG1, "Flushing Stripe of size %d", stripeBuffers->rowCount);
Oid relationId = RelidByRelfilenumber(RelationTablespace_compat(
writeState->relfilelocator),
RelationPhysicalIdentifierNumber_compat(
writeState->relfilelocator));
Oid relationId = RelidByRelfilenode(writeState->relfilenode.spcNode,
writeState->relfilenode.relNode);
Relation relation = relation_open(relationId, NoLock);
/*
@ -499,10 +486,10 @@ FlushStripe(ColumnarWriteState *writeState)
}
}
SaveChunkGroups(writeState->relfilelocator,
SaveChunkGroups(writeState->relfilenode,
stripeMetadata->id,
writeState->chunkGroupRowCounts);
SaveStripeSkipList(writeState->relfilelocator,
SaveStripeSkipList(writeState->relfilenode,
stripeMetadata->id,
stripeSkipList, tupleDescriptor);

View File

@ -18,15 +18,13 @@
#include "citus_version.h"
#include "columnar/columnar.h"
#include "columnar/mod.h"
#include "columnar/columnar_tableam.h"
PG_MODULE_MAGIC;
void _PG_init(void);
void
_PG_init(void)
columnar_init(void)
{
columnar_init();
columnar_init_gucs();
columnar_tableam_init();
}

View File

@ -1 +0,0 @@
../../../vendor/safestringlib/safeclib/

View File

@ -1,32 +0,0 @@
-- add columnar objects back
ALTER EXTENSION citus_columnar ADD SCHEMA columnar;
ALTER EXTENSION citus_columnar ADD SCHEMA columnar_internal;
ALTER EXTENSION citus_columnar ADD SEQUENCE columnar_internal.storageid_seq;
ALTER EXTENSION citus_columnar ADD TABLE columnar_internal.options;
ALTER EXTENSION citus_columnar ADD TABLE columnar_internal.stripe;
ALTER EXTENSION citus_columnar ADD TABLE columnar_internal.chunk_group;
ALTER EXTENSION citus_columnar ADD TABLE columnar_internal.chunk;
ALTER EXTENSION citus_columnar ADD FUNCTION columnar_internal.columnar_handler;
ALTER EXTENSION citus_columnar ADD ACCESS METHOD columnar;
ALTER EXTENSION citus_columnar ADD FUNCTION pg_catalog.alter_columnar_table_set;
ALTER EXTENSION citus_columnar ADD FUNCTION pg_catalog.alter_columnar_table_reset;
ALTER EXTENSION citus_columnar ADD FUNCTION citus_internal.upgrade_columnar_storage;
ALTER EXTENSION citus_columnar ADD FUNCTION citus_internal.downgrade_columnar_storage;
ALTER EXTENSION citus_columnar ADD FUNCTION citus_internal.columnar_ensure_am_depends_catalog;
ALTER EXTENSION citus_columnar ADD FUNCTION columnar.get_storage_id;
ALTER EXTENSION citus_columnar ADD VIEW columnar.storage;
ALTER EXTENSION citus_columnar ADD VIEW columnar.options;
ALTER EXTENSION citus_columnar ADD VIEW columnar.stripe;
ALTER EXTENSION citus_columnar ADD VIEW columnar.chunk_group;
ALTER EXTENSION citus_columnar ADD VIEW columnar.chunk;
-- move citus_internal functions to columnar_internal
ALTER FUNCTION citus_internal.upgrade_columnar_storage(regclass) SET SCHEMA columnar_internal;
ALTER FUNCTION citus_internal.downgrade_columnar_storage(regclass) SET SCHEMA columnar_internal;
ALTER FUNCTION citus_internal.columnar_ensure_am_depends_catalog() SET SCHEMA columnar_internal;

View File

@ -1 +0,0 @@
-- fake sql file 'Y'

View File

@ -1,19 +0,0 @@
-- citus_columnar--11.1-1--11.2-1
#include "udfs/columnar_ensure_am_depends_catalog/11.2-1.sql"
DELETE FROM pg_depend
WHERE classid = 'pg_am'::regclass::oid
AND objid IN (select oid from pg_am where amname = 'columnar')
AND objsubid = 0
AND refclassid = 'pg_class'::regclass::oid
AND refobjid IN (
'columnar_internal.stripe_first_row_number_idx'::regclass::oid,
'columnar_internal.chunk_group_pkey'::regclass::oid,
'columnar_internal.chunk_pkey'::regclass::oid,
'columnar_internal.options_pkey'::regclass::oid,
'columnar_internal.stripe_first_row_number_idx'::regclass::oid,
'columnar_internal.stripe_pkey'::regclass::oid
)
AND refobjsubid = 0
AND deptype = 'n';

View File

@ -1,435 +0,0 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION citus_columnar" to load this file. \quit
-- columnar--9.5-1--10.0-1.sql
CREATE SCHEMA IF NOT EXISTS columnar;
SET search_path TO columnar;
CREATE SEQUENCE IF NOT EXISTS storageid_seq MINVALUE 10000000000 NO CYCLE;
CREATE TABLE IF NOT EXISTS options (
regclass regclass NOT NULL PRIMARY KEY,
chunk_group_row_limit int NOT NULL,
stripe_row_limit int NOT NULL,
compression_level int NOT NULL,
compression name NOT NULL
) WITH (user_catalog_table = true);
COMMENT ON TABLE options IS 'columnar table specific options, maintained by alter_columnar_table_set';
CREATE TABLE IF NOT EXISTS stripe (
storage_id bigint NOT NULL,
stripe_num bigint NOT NULL,
file_offset bigint NOT NULL,
data_length bigint NOT NULL,
column_count int NOT NULL,
chunk_row_count int NOT NULL,
row_count bigint NOT NULL,
chunk_group_count int NOT NULL,
first_row_number bigint NOT NULL,
PRIMARY KEY (storage_id, stripe_num),
CONSTRAINT stripe_first_row_number_idx UNIQUE (storage_id, first_row_number)
) WITH (user_catalog_table = true);
COMMENT ON TABLE stripe IS 'Columnar per stripe metadata';
CREATE TABLE IF NOT EXISTS chunk_group (
storage_id bigint NOT NULL,
stripe_num bigint NOT NULL,
chunk_group_num int NOT NULL,
row_count bigint NOT NULL,
PRIMARY KEY (storage_id, stripe_num, chunk_group_num)
);
COMMENT ON TABLE chunk_group IS 'Columnar chunk group metadata';
CREATE TABLE IF NOT EXISTS chunk (
storage_id bigint NOT NULL,
stripe_num bigint NOT NULL,
attr_num int NOT NULL,
chunk_group_num int NOT NULL,
minimum_value bytea,
maximum_value bytea,
value_stream_offset bigint NOT NULL,
value_stream_length bigint NOT NULL,
exists_stream_offset bigint NOT NULL,
exists_stream_length bigint NOT NULL,
value_compression_type int NOT NULL,
value_compression_level int NOT NULL,
value_decompressed_length bigint NOT NULL,
value_count bigint NOT NULL,
PRIMARY KEY (storage_id, stripe_num, attr_num, chunk_group_num)
) WITH (user_catalog_table = true);
COMMENT ON TABLE chunk IS 'Columnar per chunk metadata';
DO $proc$
BEGIN
-- from version 12 and up we have support for tableam's; if installed on pg11 we can't
-- create the objects here. Instead we rely on citus_finish_pg_upgrade to be called by the
-- user instead to add the missing objects
IF substring(current_Setting('server_version'), '\d+')::int >= 12 THEN
EXECUTE $$
--#include "udfs/columnar_handler/10.0-1.sql"
CREATE OR REPLACE FUNCTION columnar.columnar_handler(internal)
RETURNS table_am_handler
LANGUAGE C
AS 'MODULE_PATHNAME', 'columnar_handler';
COMMENT ON FUNCTION columnar.columnar_handler(internal)
IS 'internal function returning the handler for columnar tables';
-- postgres 11.8 does not support the syntax for table am, also it is seemingly trying
-- to parse the upgrade file and erroring on unknown syntax.
-- normally this section would not execute on postgres 11 anyway. To trick it to pass on
-- 11.8 we wrap the statement in a plpgsql block together with an EXECUTE. This is valid
-- syntax on 11.8 and will execute correctly in 12
DO $create_table_am$
BEGIN
EXECUTE 'CREATE ACCESS METHOD columnar TYPE TABLE HANDLER columnar.columnar_handler';
END $create_table_am$;
--#include "udfs/alter_columnar_table_set/10.0-1.sql"
CREATE OR REPLACE FUNCTION pg_catalog.alter_columnar_table_set(
table_name regclass,
chunk_group_row_limit int DEFAULT NULL,
stripe_row_limit int DEFAULT NULL,
compression name DEFAULT null,
compression_level int DEFAULT NULL)
RETURNS void
LANGUAGE C
AS 'MODULE_PATHNAME', 'alter_columnar_table_set';
COMMENT ON FUNCTION pg_catalog.alter_columnar_table_set(
table_name regclass,
chunk_group_row_limit int,
stripe_row_limit int,
compression name,
compression_level int)
IS 'set one or more options on a columnar table, when set to NULL no change is made';
--#include "udfs/alter_columnar_table_reset/10.0-1.sql"
CREATE OR REPLACE FUNCTION pg_catalog.alter_columnar_table_reset(
table_name regclass,
chunk_group_row_limit bool DEFAULT false,
stripe_row_limit bool DEFAULT false,
compression bool DEFAULT false,
compression_level bool DEFAULT false)
RETURNS void
LANGUAGE C
AS 'MODULE_PATHNAME', 'alter_columnar_table_reset';
COMMENT ON FUNCTION pg_catalog.alter_columnar_table_reset(
table_name regclass,
chunk_group_row_limit bool,
stripe_row_limit bool,
compression bool,
compression_level bool)
IS 'reset one or more options on a columnar table to the system defaults';
$$;
END IF;
END$proc$;
-- (this function being dropped in 10.0.3)->#include "udfs/columnar_ensure_objects_exist/10.0-1.sql"
RESET search_path;
-- columnar--10.0-1--10.0-2
GRANT USAGE ON SCHEMA columnar TO PUBLIC;
GRANT SELECT ON ALL tables IN SCHEMA columnar TO PUBLIC ;
-- columnar--10.0-3--10.1-1.sql
-- Drop foreign keys between columnar metadata tables.
-- columnar--10.1-1--10.2-1.sql
-- For a proper mapping between tid & (stripe, row_num), add a new column to
-- columnar.stripe and define a BTREE index on this column.
-- Also include storage_id column for per-relation scans.
-- Populate first_row_number column of columnar.stripe table.
--
-- For simplicity, we calculate MAX(row_count) value across all the stripes
-- of all the columnar tables and then use it to populate first_row_number
-- column. This would introduce some gaps however we are okay with that since
-- it's already the case with regular INSERT/COPY's.
DO $$
DECLARE
max_row_count bigint;
-- this should be equal to columnar_storage.h/COLUMNAR_FIRST_ROW_NUMBER
COLUMNAR_FIRST_ROW_NUMBER constant bigint := 1;
BEGIN
SELECT MAX(row_count) INTO max_row_count FROM columnar.stripe;
UPDATE columnar.stripe SET first_row_number = COLUMNAR_FIRST_ROW_NUMBER +
(stripe_num - 1) * max_row_count;
END;
$$;
-- columnar--10.2-1--10.2-2.sql
-- revoke read access for columnar.chunk from unprivileged
-- user as it contains chunk min/max values
REVOKE SELECT ON columnar.chunk FROM PUBLIC;
-- columnar--10.2-2--10.2-3.sql
-- Since stripe_first_row_number_idx is required to scan a columnar table, we
-- need to make sure that it is created before doing anything with columnar
-- tables during pg upgrades.
--
-- However, a plain btree index is not a dependency of a table, so pg_upgrade
-- cannot guarantee that stripe_first_row_number_idx gets created when
-- creating columnar.stripe, unless we make it a unique "constraint".
--
-- To do that, drop stripe_first_row_number_idx and create a unique
-- constraint with the same name to keep the code change at minimum.
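-- A hedged sketch of that change (the actual statements belong to the
-- columnar--10.2-2--10.2-3 upgrade step referenced above, not to this
-- consolidated script):
--   DROP INDEX columnar.stripe_first_row_number_idx;
--   ALTER TABLE columnar.stripe
--     ADD CONSTRAINT stripe_first_row_number_idx
--     UNIQUE (storage_id, first_row_number);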
-- columnar--10.2-3--10.2-4.sql
-- columnar--11.0-2--11.1-1.sql
CREATE OR REPLACE FUNCTION pg_catalog.alter_columnar_table_set(
table_name regclass,
chunk_group_row_limit int DEFAULT NULL,
stripe_row_limit int DEFAULT NULL,
compression name DEFAULT null,
compression_level int DEFAULT NULL)
RETURNS void
LANGUAGE plpgsql AS
$alter_columnar_table_set$
declare
noop BOOLEAN := true;
cmd TEXT := 'ALTER TABLE ' || table_name::text || ' SET (';
begin
if (chunk_group_row_limit is not null) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.chunk_group_row_limit=' || chunk_group_row_limit;
noop := false;
end if;
if (stripe_row_limit is not null) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.stripe_row_limit=' || stripe_row_limit;
noop := false;
end if;
if (compression is not null) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.compression=' || compression;
noop := false;
end if;
if (compression_level is not null) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.compression_level=' || compression_level;
noop := false;
end if;
cmd := cmd || ')';
if (not noop) then
execute cmd;
end if;
return;
end;
$alter_columnar_table_set$;
COMMENT ON FUNCTION pg_catalog.alter_columnar_table_set(
table_name regclass,
chunk_group_row_limit int,
stripe_row_limit int,
compression name,
compression_level int)
IS 'set one or more options on a columnar table, when set to NULL no change is made';
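-- Example call (editorial illustration; 'my_columnar_table' is a placeholder name
-- and 'zstd' assumes the server was built with zstd support):
SELECT alter_columnar_table_set('my_columnar_table',
                                compression => 'zstd',
                                compression_level => 3);
-- Parameters left at their NULL defaults are untouched, so only compression settings change.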
CREATE OR REPLACE FUNCTION pg_catalog.alter_columnar_table_reset(
table_name regclass,
chunk_group_row_limit bool DEFAULT false,
stripe_row_limit bool DEFAULT false,
compression bool DEFAULT false,
compression_level bool DEFAULT false)
RETURNS void
LANGUAGE plpgsql AS
$alter_columnar_table_reset$
declare
noop BOOLEAN := true;
cmd TEXT := 'ALTER TABLE ' || table_name::text || ' RESET (';
begin
if (chunk_group_row_limit) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.chunk_group_row_limit';
noop := false;
end if;
if (stripe_row_limit) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.stripe_row_limit';
noop := false;
end if;
if (compression) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.compression';
noop := false;
end if;
if (compression_level) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.compression_level';
noop := false;
end if;
cmd := cmd || ')';
if (not noop) then
execute cmd;
end if;
return;
end;
$alter_columnar_table_reset$;
COMMENT ON FUNCTION pg_catalog.alter_columnar_table_reset(
table_name regclass,
chunk_group_row_limit bool,
stripe_row_limit bool,
compression bool,
compression_level bool)
IS 'reset one or more options on a columnar table to the system defaults';
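-- Example call (editorial illustration; placeholder table name as above):
SELECT alter_columnar_table_reset('my_columnar_table', compression => true);
-- Only the options passed as true are reset to the defaults; the others keep their values.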
-- rename columnar schema to columnar_internal and tighten security
REVOKE ALL PRIVILEGES ON ALL TABLES IN SCHEMA columnar FROM PUBLIC;
ALTER SCHEMA columnar RENAME TO columnar_internal;
REVOKE ALL PRIVILEGES ON SCHEMA columnar_internal FROM PUBLIC;
-- create columnar schema with public usage privileges
CREATE SCHEMA columnar;
GRANT USAGE ON SCHEMA columnar TO PUBLIC;
--#include "udfs/upgrade_columnar_storage/10.2-1.sql"
CREATE OR REPLACE FUNCTION columnar_internal.upgrade_columnar_storage(rel regclass)
RETURNS VOID
STRICT
LANGUAGE c AS 'MODULE_PATHNAME', $$upgrade_columnar_storage$$;
COMMENT ON FUNCTION columnar_internal.upgrade_columnar_storage(regclass)
IS 'function to upgrade the columnar storage, if necessary';
--#include "udfs/downgrade_columnar_storage/10.2-1.sql"
CREATE OR REPLACE FUNCTION columnar_internal.downgrade_columnar_storage(rel regclass)
RETURNS VOID
STRICT
LANGUAGE c AS 'MODULE_PATHNAME', $$downgrade_columnar_storage$$;
COMMENT ON FUNCTION columnar_internal.downgrade_columnar_storage(regclass)
IS 'function to downgrade the columnar storage, if necessary';
-- update UDF to account for columnar_internal schema
CREATE OR REPLACE FUNCTION columnar_internal.columnar_ensure_am_depends_catalog()
RETURNS void
LANGUAGE plpgsql
SET search_path = pg_catalog
AS $func$
BEGIN
INSERT INTO pg_depend
WITH columnar_schema_members(relid) AS (
SELECT pg_class.oid AS relid FROM pg_class
WHERE relnamespace =
COALESCE(
(SELECT pg_namespace.oid FROM pg_namespace WHERE nspname = 'columnar_internal'),
(SELECT pg_namespace.oid FROM pg_namespace WHERE nspname = 'columnar')
)
AND relname IN ('chunk',
'chunk_group',
'chunk_group_pkey',
'chunk_pkey',
'options',
'options_pkey',
'storageid_seq',
'stripe',
'stripe_first_row_number_idx',
'stripe_pkey')
)
SELECT -- Define a dependency edge from "columnar table access method" ..
'pg_am'::regclass::oid as classid,
(select oid from pg_am where amname = 'columnar') as objid,
0 as objsubid,
-- ... to each object that is registered to pg_class and that lives
-- in "columnar" schema. That contains catalog tables, indexes
-- created on them and the sequences created in "columnar" schema.
--
-- Given the possibility that users might have created their own objects
-- in the columnar schema, we explicitly specify the list of objects that
-- we are interested in.
'pg_class'::regclass::oid as refclassid,
columnar_schema_members.relid as refobjid,
0 as refobjsubid,
'n' as deptype
FROM columnar_schema_members
-- Avoid inserting duplicate entries into pg_depend.
EXCEPT TABLE pg_depend;
END;
$func$;
COMMENT ON FUNCTION columnar_internal.columnar_ensure_am_depends_catalog()
IS 'internal function responsible for creating dependencies from columnar '
'table access method to the rel objects in columnar schema';
SELECT columnar_internal.columnar_ensure_am_depends_catalog();
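-- Illustrative verification query (editorial sketch, not part of this migration):
-- list the pg_depend edges that the call above should have registered from the
-- columnar access method to its catalog relations.
SELECT refobjid::regclass AS columnar_catalog_object
FROM pg_depend
WHERE classid = 'pg_am'::regclass
  AND objid = (SELECT oid FROM pg_am WHERE amname = 'columnar')
  AND refclassid = 'pg_class'::regclass
  AND deptype = 'n';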
-- add utility function
CREATE FUNCTION columnar.get_storage_id(regclass) RETURNS bigint
LANGUAGE C STRICT
AS 'citus_columnar', $$columnar_relation_storageid$$;
-- create views for columnar table information
CREATE VIEW columnar.storage WITH (security_barrier) AS
SELECT c.oid::regclass AS relation,
columnar.get_storage_id(c.oid) AS storage_id
FROM pg_class c, pg_am am
WHERE c.relam = am.oid AND am.amname = 'columnar'
AND pg_has_role(c.relowner, 'USAGE');
COMMENT ON VIEW columnar.storage IS 'Columnar relation ID to storage ID mapping.';
GRANT SELECT ON columnar.storage TO PUBLIC;
CREATE VIEW columnar.options WITH (security_barrier) AS
SELECT regclass AS relation, chunk_group_row_limit,
stripe_row_limit, compression, compression_level
FROM columnar_internal.options o, pg_class c
WHERE o.regclass = c.oid
AND pg_has_role(c.relowner, 'USAGE');
COMMENT ON VIEW columnar.options
IS 'Columnar options for tables on which the current user has ownership privileges.';
GRANT SELECT ON columnar.options TO PUBLIC;
CREATE VIEW columnar.stripe WITH (security_barrier) AS
SELECT relation, storage.storage_id, stripe_num, file_offset, data_length,
column_count, chunk_row_count, row_count, chunk_group_count, first_row_number
FROM columnar_internal.stripe stripe, columnar.storage storage
WHERE stripe.storage_id = storage.storage_id;
COMMENT ON VIEW columnar.stripe
IS 'Columnar stripe information for tables on which the current user has ownership privileges.';
GRANT SELECT ON columnar.stripe TO PUBLIC;
CREATE VIEW columnar.chunk_group WITH (security_barrier) AS
SELECT relation, storage.storage_id, stripe_num, chunk_group_num, row_count
FROM columnar_internal.chunk_group cg, columnar.storage storage
WHERE cg.storage_id = storage.storage_id;
COMMENT ON VIEW columnar.chunk_group
IS 'Columnar chunk group information for tables on which the current user has ownership privileges.';
GRANT SELECT ON columnar.chunk_group TO PUBLIC;
CREATE VIEW columnar.chunk WITH (security_barrier) AS
SELECT relation, storage.storage_id, stripe_num, attr_num, chunk_group_num,
minimum_value, maximum_value, value_stream_offset, value_stream_length,
exists_stream_offset, exists_stream_length, value_compression_type,
value_compression_level, value_decompressed_length, value_count
FROM columnar_internal.chunk chunk, columnar.storage storage
WHERE chunk.storage_id = storage.storage_id;
COMMENT ON VIEW columnar.chunk
IS 'Columnar chunk information for tables on which the current user has ownership privileges.';
GRANT SELECT ON columnar.chunk TO PUBLIC;
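-- Example of using the views above for introspection (editorial illustration;
-- 'my_columnar_table' is a placeholder name):
SELECT s.stripe_num, s.row_count, s.first_row_number, o.compression
FROM columnar.stripe s
JOIN columnar.options o USING (relation)
WHERE s.relation = 'my_columnar_table'::regclass
ORDER BY s.stripe_num;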

View File

@ -1 +0,0 @@
-- citus_columnar--11.2-1--11.3-1

View File

@ -1 +0,0 @@
-- citus_columnar--11.3-1--12.2-1

View File

@ -28,5 +28,5 @@ $$;
#include "udfs/downgrade_columnar_storage/10.2-1.sql"
-- upgrade storage for all columnar relations
PERFORM citus_internal.upgrade_columnar_storage(c.oid) FROM pg_class c, pg_am a
SELECT citus_internal.upgrade_columnar_storage(c.oid) FROM pg_class c, pg_am a
WHERE c.relam = a.oid AND amname = 'columnar';

View File

@ -2,4 +2,4 @@
#include "udfs/columnar_ensure_am_depends_catalog/10.2-4.sql"
PERFORM citus_internal.columnar_ensure_am_depends_catalog();
SELECT citus_internal.columnar_ensure_am_depends_catalog();

View File

@ -1,71 +0,0 @@
#include "udfs/alter_columnar_table_set/11.1-1.sql"
#include "udfs/alter_columnar_table_reset/11.1-1.sql"
-- rename columnar schema to columnar_internal and tighten security
REVOKE ALL PRIVILEGES ON ALL TABLES IN SCHEMA columnar FROM PUBLIC;
ALTER SCHEMA columnar RENAME TO columnar_internal;
REVOKE ALL PRIVILEGES ON SCHEMA columnar_internal FROM PUBLIC;
-- create columnar schema with public usage privileges
CREATE SCHEMA columnar;
GRANT USAGE ON SCHEMA columnar TO PUBLIC;
-- update UDF to account for columnar_internal schema
#include "udfs/columnar_ensure_am_depends_catalog/11.1-1.sql"
-- add utility function
CREATE FUNCTION columnar.get_storage_id(regclass) RETURNS bigint
LANGUAGE C STRICT
AS 'citus_columnar', $$columnar_relation_storageid$$;
-- create views for columnar table information
CREATE VIEW columnar.storage WITH (security_barrier) AS
SELECT c.oid::regclass AS relation,
columnar.get_storage_id(c.oid) AS storage_id
FROM pg_class c, pg_am am
WHERE c.relam = am.oid AND am.amname = 'columnar'
AND pg_has_role(c.relowner, 'USAGE');
COMMENT ON VIEW columnar.storage IS 'Columnar relation ID to storage ID mapping.';
GRANT SELECT ON columnar.storage TO PUBLIC;
CREATE VIEW columnar.options WITH (security_barrier) AS
SELECT regclass AS relation, chunk_group_row_limit,
stripe_row_limit, compression, compression_level
FROM columnar_internal.options o, pg_class c
WHERE o.regclass = c.oid
AND pg_has_role(c.relowner, 'USAGE');
COMMENT ON VIEW columnar.options
IS 'Columnar options for tables on which the current user has ownership privileges.';
GRANT SELECT ON columnar.options TO PUBLIC;
CREATE VIEW columnar.stripe WITH (security_barrier) AS
SELECT relation, storage.storage_id, stripe_num, file_offset, data_length,
column_count, chunk_row_count, row_count, chunk_group_count, first_row_number
FROM columnar_internal.stripe stripe, columnar.storage storage
WHERE stripe.storage_id = storage.storage_id;
COMMENT ON VIEW columnar.stripe
IS 'Columnar stripe information for tables on which the current user has ownership privileges.';
GRANT SELECT ON columnar.stripe TO PUBLIC;
CREATE VIEW columnar.chunk_group WITH (security_barrier) AS
SELECT relation, storage.storage_id, stripe_num, chunk_group_num, row_count
FROM columnar_internal.chunk_group cg, columnar.storage storage
WHERE cg.storage_id = storage.storage_id;
COMMENT ON VIEW columnar.chunk_group
IS 'Columnar chunk group information for tables on which the current user has ownership privileges.';
GRANT SELECT ON columnar.chunk_group TO PUBLIC;
CREATE VIEW columnar.chunk WITH (security_barrier) AS
SELECT relation, storage.storage_id, stripe_num, attr_num, chunk_group_num,
minimum_value, maximum_value, value_stream_offset, value_stream_length,
exists_stream_offset, exists_stream_length, value_compression_type,
value_compression_level, value_decompressed_length, value_count
FROM columnar_internal.chunk chunk, columnar.storage storage
WHERE chunk.storage_id = storage.storage_id;
COMMENT ON VIEW columnar.chunk
IS 'Columnar chunk information for tables on which the current user has ownership privileges.';
GRANT SELECT ON columnar.chunk TO PUBLIC;

View File

@ -1,116 +0,0 @@
CREATE OR REPLACE FUNCTION pg_catalog.alter_columnar_table_set(
table_name regclass,
chunk_group_row_limit int DEFAULT NULL,
stripe_row_limit int DEFAULT NULL,
compression name DEFAULT null,
compression_level int DEFAULT NULL)
RETURNS void
LANGUAGE C
AS 'MODULE_PATHNAME', 'alter_columnar_table_set';
COMMENT ON FUNCTION pg_catalog.alter_columnar_table_set(
table_name regclass,
chunk_group_row_limit int,
stripe_row_limit int,
compression name,
compression_level int)
IS 'set one or more options on a columnar table, when set to NULL no change is made';
CREATE OR REPLACE FUNCTION pg_catalog.alter_columnar_table_reset(
table_name regclass,
chunk_group_row_limit bool DEFAULT false,
stripe_row_limit bool DEFAULT false,
compression bool DEFAULT false,
compression_level bool DEFAULT false)
RETURNS void
LANGUAGE C
AS 'MODULE_PATHNAME', 'alter_columnar_table_reset';
COMMENT ON FUNCTION pg_catalog.alter_columnar_table_reset(
table_name regclass,
chunk_group_row_limit bool,
stripe_row_limit bool,
compression bool,
compression_level bool)
IS 'reset one or more options on a columnar table to the system defaults';
CREATE OR REPLACE FUNCTION columnar_internal.columnar_ensure_am_depends_catalog()
RETURNS void
LANGUAGE plpgsql
SET search_path = pg_catalog
AS $func$
BEGIN
INSERT INTO pg_depend
SELECT -- Define a dependency edge from "columnar table access method" ..
'pg_am'::regclass::oid as classid,
(select oid from pg_am where amname = 'columnar') as objid,
0 as objsubid,
-- ... to each object that is registered to pg_class and that lives
-- in "columnar" schema. That contains catalog tables, indexes
-- created on them and the sequences created in "columnar" schema.
--
-- Given the possibility that users might have created their own objects
-- in the columnar schema, we explicitly specify the list of objects that
-- we are interested in.
'pg_class'::regclass::oid as refclassid,
columnar_schema_members.relname::regclass::oid as refobjid,
0 as refobjsubid,
'n' as deptype
FROM (VALUES ('columnar.chunk'),
('columnar.chunk_group'),
('columnar.chunk_group_pkey'),
('columnar.chunk_pkey'),
('columnar.options'),
('columnar.options_pkey'),
('columnar.storageid_seq'),
('columnar.stripe'),
('columnar.stripe_first_row_number_idx'),
('columnar.stripe_pkey')
) columnar_schema_members(relname)
-- Avoid inserting duplicate entries into pg_depend.
EXCEPT TABLE pg_depend;
END;
$func$;
COMMENT ON FUNCTION columnar_internal.columnar_ensure_am_depends_catalog()
IS 'internal function responsible for creating dependencies from columnar '
'table access method to the rel objects in columnar schema';
DROP VIEW columnar.options;
DROP VIEW columnar.stripe;
DROP VIEW columnar.chunk_group;
DROP VIEW columnar.chunk;
DROP VIEW columnar.storage;
DROP FUNCTION columnar.get_storage_id(regclass);
DROP SCHEMA columnar;
-- move columnar_internal functions back to citus_internal
ALTER FUNCTION columnar_internal.upgrade_columnar_storage(regclass) SET SCHEMA citus_internal;
ALTER FUNCTION columnar_internal.downgrade_columnar_storage(regclass) SET SCHEMA citus_internal;
ALTER FUNCTION columnar_internal.columnar_ensure_am_depends_catalog() SET SCHEMA citus_internal;
ALTER SCHEMA columnar_internal RENAME TO columnar;
GRANT USAGE ON SCHEMA columnar TO PUBLIC;
GRANT SELECT ON columnar.options TO PUBLIC;
GRANT SELECT ON columnar.stripe TO PUBLIC;
GRANT SELECT ON columnar.chunk_group TO PUBLIC;
-- detach relations from citus_columnar
ALTER EXTENSION citus_columnar DROP SCHEMA columnar;
ALTER EXTENSION citus_columnar DROP SEQUENCE columnar.storageid_seq;
-- columnar tables
ALTER EXTENSION citus_columnar DROP TABLE columnar.options;
ALTER EXTENSION citus_columnar DROP TABLE columnar.stripe;
ALTER EXTENSION citus_columnar DROP TABLE columnar.chunk_group;
ALTER EXTENSION citus_columnar DROP TABLE columnar.chunk;
ALTER EXTENSION citus_columnar DROP FUNCTION columnar.columnar_handler;
ALTER EXTENSION citus_columnar DROP ACCESS METHOD columnar;
ALTER EXTENSION citus_columnar DROP FUNCTION pg_catalog.alter_columnar_table_set;
ALTER EXTENSION citus_columnar DROP FUNCTION pg_catalog.alter_columnar_table_reset;
-- functions under citus_internal for columnar
ALTER EXTENSION citus_columnar DROP FUNCTION citus_internal.upgrade_columnar_storage;
ALTER EXTENSION citus_columnar DROP FUNCTION citus_internal.downgrade_columnar_storage;
ALTER EXTENSION citus_columnar DROP FUNCTION citus_internal.columnar_ensure_am_depends_catalog;

View File

@ -1,4 +0,0 @@
-- citus_columnar--11.2-1--11.1-1
-- Note that we intentionally do not re-insert the pg_depend records that we
-- deleted via citus_columnar--11.1-1--11.2-1.sql.

View File

@ -1 +0,0 @@
-- citus_columnar--11.3-1--11.2-1

View File

@ -1 +0,0 @@
-- citus_columnar--12.2-1--11.3-1

View File

@ -1,19 +0,0 @@
#include "../udfs/alter_columnar_table_set/10.0-1.sql"
#include "../udfs/alter_columnar_table_reset/10.0-1.sql"
#include "../udfs/columnar_ensure_am_depends_catalog/10.2-4.sql"
DROP VIEW columnar.options;
DROP VIEW columnar.stripe;
DROP VIEW columnar.chunk_group;
DROP VIEW columnar.chunk;
DROP VIEW columnar.storage;
DROP FUNCTION columnar.get_storage_id(regclass);
DROP SCHEMA columnar;
ALTER SCHEMA columnar_internal RENAME TO columnar;
GRANT USAGE ON SCHEMA columnar TO PUBLIC;
GRANT SELECT ON columnar.options TO PUBLIC;
GRANT SELECT ON columnar.stripe TO PUBLIC;
GRANT SELECT ON columnar.chunk_group TO PUBLIC;

View File

@ -1,48 +0,0 @@
CREATE OR REPLACE FUNCTION pg_catalog.alter_columnar_table_reset(
table_name regclass,
chunk_group_row_limit bool DEFAULT false,
stripe_row_limit bool DEFAULT false,
compression bool DEFAULT false,
compression_level bool DEFAULT false)
RETURNS void
LANGUAGE plpgsql AS
$alter_columnar_table_reset$
declare
noop BOOLEAN := true;
cmd TEXT := 'ALTER TABLE ' || table_name::text || ' RESET (';
begin
if (chunk_group_row_limit) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.chunk_group_row_limit';
noop := false;
end if;
if (stripe_row_limit) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.stripe_row_limit';
noop := false;
end if;
if (compression) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.compression';
noop := false;
end if;
if (compression_level) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.compression_level';
noop := false;
end if;
cmd := cmd || ')';
if (not noop) then
execute cmd;
end if;
return;
end;
$alter_columnar_table_reset$;
COMMENT ON FUNCTION pg_catalog.alter_columnar_table_reset(
table_name regclass,
chunk_group_row_limit bool,
stripe_row_limit bool,
compression bool,
compression_level bool)
IS 'reset one or more options on a columnar table to the system defaults';

View File

@ -5,39 +5,8 @@ CREATE OR REPLACE FUNCTION pg_catalog.alter_columnar_table_reset(
compression bool DEFAULT false,
compression_level bool DEFAULT false)
RETURNS void
LANGUAGE plpgsql AS
$alter_columnar_table_reset$
declare
noop BOOLEAN := true;
cmd TEXT := 'ALTER TABLE ' || table_name::text || ' RESET (';
begin
if (chunk_group_row_limit) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.chunk_group_row_limit';
noop := false;
end if;
if (stripe_row_limit) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.stripe_row_limit';
noop := false;
end if;
if (compression) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.compression';
noop := false;
end if;
if (compression_level) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.compression_level';
noop := false;
end if;
cmd := cmd || ')';
if (not noop) then
execute cmd;
end if;
return;
end;
$alter_columnar_table_reset$;
LANGUAGE C
AS 'MODULE_PATHNAME', 'alter_columnar_table_reset';
COMMENT ON FUNCTION pg_catalog.alter_columnar_table_reset(
table_name regclass,

View File

@ -1,48 +0,0 @@
CREATE OR REPLACE FUNCTION pg_catalog.alter_columnar_table_set(
table_name regclass,
chunk_group_row_limit int DEFAULT NULL,
stripe_row_limit int DEFAULT NULL,
compression name DEFAULT null,
compression_level int DEFAULT NULL)
RETURNS void
LANGUAGE plpgsql AS
$alter_columnar_table_set$
declare
noop BOOLEAN := true;
cmd TEXT := 'ALTER TABLE ' || table_name::text || ' SET (';
begin
if (chunk_group_row_limit is not null) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.chunk_group_row_limit=' || chunk_group_row_limit;
noop := false;
end if;
if (stripe_row_limit is not null) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.stripe_row_limit=' || stripe_row_limit;
noop := false;
end if;
if (compression is not null) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.compression=' || compression;
noop := false;
end if;
if (compression_level is not null) then
if (not noop) then cmd := cmd || ', '; end if;
cmd := cmd || 'columnar.compression_level=' || compression_level;
noop := false;
end if;
cmd := cmd || ')';
if (not noop) then
execute cmd;
end if;
return;
end;
$alter_columnar_table_set$;
COMMENT ON FUNCTION pg_catalog.alter_columnar_table_set(
table_name regclass,
chunk_group_row_limit int,
stripe_row_limit int,
compression name,
compression_level int)
IS 'set one or more options on a columnar table, when set to NULL no change is made';

Some files were not shown because too many files have changed in this diff.