Commit Graph

6763 Commits (d9c1e992b20113c78871ab7e80145a84d70382db)

Author SHA1 Message Date
Onur Tirtir d9c1e992b2 remove WorkerNodeListGetNodeWithGroupId 2023-10-24 18:34:04 +03:00
Onur Tirtir 6af8a51065
Merge branch 'main' into tenant-schema-isolation 2023-10-24 14:27:45 +03:00
Naisila Puka 10198b18e8
Technical readme small fixes (#7261) 2023-10-23 13:43:43 +03:00
Naisila Puka 1fe16fa746
Remove unnecessary pre-fastpath code (#7262)
This code was here because we first implemented `fast path planner` via
[#2606](https://github.com/citusdata/citus/pull/2606) and then later
`deferred pruning` via
[#3369](https://github.com/citusdata/citus/pull/3369).
So, for some years, this code was useful.
2023-10-23 13:01:48 +03:00
zhjwpku 2d1444188c
Fix wrong comments around HasDistributionKey() (#7223)
HasDistributionKey & HasDistributionKeyCacheEntry return true when the
corresponding table has a distribution key, but the comments state the
opposite, which should be fixed.

Signed-off-by: Zhao Junwang <zhjwpku@gmail.com>
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2023-10-18 10:53:00 +02:00
Onur Tirtir db13afaa7b
Fix flaky columnar_create.sql test (#7266) 2023-10-17 16:58:17 +03:00
Gürkan İndibay 71a4633dad
Fixes typo and renames multi_process_utility (#7259) 2023-10-17 16:39:37 +03:00
Onur Tirtir aa8733faa7 comment 2023-10-16 16:47:11 +03:00
Onur Tirtir db43b6fdce improve & comment 2023-10-16 16:42:59 +03:00
Onur Tirtir ac97d54515 fix 2023-10-16 16:13:31 +03:00
Onur Tirtir 01b2bf5e3c fix 2023-10-16 14:55:29 +03:00
Onur Tirtir 26c27a9fbf fix tests 2023-10-16 14:46:25 +03:00
Onur Tirtir 09f0003ae1 Merge remote-tracking branch 'origin/main' into tenant-schema-isolation 2023-10-16 14:29:10 +03:00
Onur Tirtir 2d16b0fd9e address feedback 2023-10-16 14:28:25 +03:00
Onur Tirtir 5eaf6c221e
Fix flaky test detection job (#7256)
We were getting errors like the following in the flaky-test detection job:
```
Unable to process file command 'output' successfully
```

Even though we don't seem to be writing multiple lines to
$GITHUB_OUTPUT, this seems to be the right fix.

https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions#multiline-strings
2023-10-16 14:20:55 +03:00
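The multiline-string syntax from the linked GitHub documentation can be sketched as follows. This is a minimal Python illustration, not the code from the fix itself: the helper names `format_github_output` and `set_output` and the output name `tests` are hypothetical. It assumes only the documented `name<<DELIMITER` / value / `DELIMITER` layout of the `$GITHUB_OUTPUT` file.

```python
import os
import uuid


def format_github_output(name: str, value: str) -> str:
    """Render one output entry using the delimiter (multiline) syntax.

    The layout is: NAME<<DELIMITER, the value lines, then DELIMITER.
    """
    # A random delimiter makes an accidental clash with the value unlikely.
    delimiter = f"ghadelimiter_{uuid.uuid4()}"
    if delimiter in value:
        raise ValueError("delimiter must not occur in the value")
    return f"{name}<<{delimiter}\n{value}\n{delimiter}\n"


def set_output(name: str, value: str) -> None:
    """Append the entry to the file that $GITHUB_OUTPUT points at."""
    path = os.environ.get("GITHUB_OUTPUT")
    if path is None:
        raise RuntimeError("GITHUB_OUTPUT is not set (not running in a job?)")
    with open(path, "a", encoding="utf-8") as output_file:
        output_file.write(format_github_output(name, value))


if __name__ == "__main__":
    # "tests" is a hypothetical output name used only for illustration.
    print(format_github_output("tests", "columnar_create\nmulti_utility"), end="")
```

Using the delimiter form for every value, including single-line ones, avoids having to know in advance whether a value contains a newline.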
Jelte Fennema-Nio 788e09a39a
Add a test for citus_shards where table names have spaces (#7224)
There was a bug reported for previous versions of Citus where
shard_size was returning NULL for tables with spaces in their names. It
works fine on the main branch, but I'm still adding a test for this to
the main branch because it seems a good test to have.
2023-10-16 11:38:24 +02:00
Onur Tirtir b0fa3d91bc fix flaky test detection job 2023-10-13 15:01:25 +03:00
Onur Tirtir a2fb92d8dd actually use the index 2023-10-13 15:01:25 +03:00
Onur Tirtir e9c05647af
Merge branch 'main' into tenant-schema-isolation 2023-10-13 14:04:21 +03:00
Nils Dijk fb08f9b198
Remove software-properties-common from dev container after use (#7255)
During the creation of the devcontainer we need to add a ppa repository,
which is easiest done via software-properties-common. As it turns out, this
installs pkexec into the container as a side effect.

When vscode tries to attach a debugger it first checks if pkexec is
installed as this gives a nicer popup asking for elevation of rights to
attach to the process. However, since dev containers don't have a
windowing system running, pkexec doesn't work as expected and thus
prevents the debugger from attaching.

Without pkexec in the container vscode 'falls back' to plain old sudo
which we can run passwordless in the container.

For pkexec to be removed we need to first purge
software-properties-common as well as autoremove all packages that were
installed due to the installation of said package. By performing this
all in one step we minimize the size of the layer we are creating.
2023-10-12 17:47:44 +02:00
Onur Tirtir 89f13f038e
Merge branch 'main' into tenant-schema-isolation 2023-10-11 14:21:31 +03:00
Gokhan Gulbiz e0b0cdbb87
CircleCI to GHA migration (#7154)
Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
2023-10-10 16:58:50 +03:00
Onur Tirtir 9ea89a0063 Update src/backend/distributed/operations/rebalancer_placement_isolation.c 2023-10-10 11:28:20 +03:00
Onur Tirtir 2fc1411da4 properly assign conflicting placements 2023-10-10 11:28:20 +03:00
Onur Tirtir 7e9a186fa2 comment 2023-10-10 11:28:20 +03:00
Onur Tirtir 77c5c882de properly handle the cases where rebalancer is called for specific table 2023-10-10 11:28:20 +03:00
Onur Tirtir 6e9fc45b97 improve 2023-10-10 11:28:20 +03:00
Onur Tirtir d541f64e3c improve test 2023-10-10 11:28:20 +03:00
Onur Tirtir faffeccc76 take shardAllowedOnNode udf into account when planning 2023-10-10 11:28:20 +03:00
Onur Tirtir d1a1ad0147 improve citus_shards 2023-10-10 11:28:20 +03:00
Onur Tirtir cc587101ed rename needsisolatednode to needsseparatenode 2023-10-10 11:28:19 +03:00
Onur Tirtir a58442d411 rename to has_separate_node 2023-10-10 11:28:02 +03:00
Onur Tirtir 7d68e655bc add own_node to citus_shards 2023-10-10 11:28:01 +03:00
Onur Tirtir 785296406f reindent 2023-10-10 11:27:19 +03:00
Onur Tirtir bbf8d9c994 err msg update 2023-10-10 11:27:19 +03:00
Onur Tirtir d95ca2d63b some checks & tests 2023-10-10 11:27:19 +03:00
Onur Tirtir 3b767211cc improve code and one more test 2023-10-10 11:27:19 +03:00
Onur Tirtir 83dd504a64 rename to citus_shard_property_set 2023-10-10 11:27:19 +03:00
Onur Tirtir 51c3ed8dfd store needsisolatednode in pg_dist_shard 2023-10-10 11:27:18 +03:00
Onur Tirtir 518227de38 Allow isolating shard placement groups on individual nodes 2023-10-10 11:26:49 +03:00
Emel Şimşek e9035f6d32
Send keepalive messages in split decoder periodically to avoid wal receiver timeouts during large shard splits. (#7229)
DESCRIPTION: Send keepalive messages during the logical replication
phase of large shard splits to avoid timeouts.

During the logical replication part of the shard split process, the split
decoder filters out the wal records produced by the initial copy. If the
number of wal records is big, the split decoder ends up processing for
a long time before sending out any wal records through pgoutput. Hence
the wal receiver may time out and restart repeatedly, causing the
catch-up logic in our split driver code to fail.

Notes: 

1. If the wal_receiver_timeout is set to a very small number e.g. 600ms,
it may time out before receiving the keepalives. My tests show that this
code works best when the `wal_receiver_timeout` is set to 1 minute, which
is the default value.

2. Once a logical replication worker times out, a new one gets launched.
The new logical replication worker sets the pg_stat_subscription columns
to initial values. E.g. the latest_end_lsn is set to 0. Our driver logic
in `WaitForGroupedLogicalRepTargetsToCatchUp` cannot handle the LSN value
going backwards. This is the main reason for it to get stuck in an
infinite loop.
2023-10-09 22:33:08 +03:00
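The idea of sending periodic keepalives while filtering records can be sketched with a small Python model. This is not the split decoder code, only an illustration under stated assumptions: `plan_messages`, the `(timestamp, from_initial_copy)` record shape, and the interval value are all hypothetical.

```python
from typing import Iterable, List, Tuple

# A record is (timestamp_in_seconds, from_initial_copy). Records produced by
# the initial copy are filtered out and never forwarded downstream.
Record = Tuple[float, bool]
Message = Tuple[str, float]


def plan_messages(records: Iterable[Record],
                  keepalive_interval: float) -> List[Message]:
    """Return the messages a decoder would send for the given records.

    Forwarded records become ("data", ts). While records are being filtered,
    a ("keepalive", ts) is emitted whenever at least keepalive_interval
    seconds have passed since anything was last sent.
    """
    messages: List[Message] = []
    last_sent = None
    for timestamp, from_initial_copy in records:
        if last_sent is None:
            # Start the timer at the first record we see.
            last_sent = timestamp
        if not from_initial_copy:
            messages.append(("data", timestamp))
            last_sent = timestamp
        elif timestamp - last_sent >= keepalive_interval:
            messages.append(("keepalive", timestamp))
            last_sent = timestamp
    return messages


if __name__ == "__main__":
    sample = [(0, True), (10, True), (20, True), (30, True), (35, False)]
    print(plan_messages(sample, keepalive_interval=20))
```

In this model the receiver never waits longer than roughly `keepalive_interval` plus the gap between two records, so the interval has to stay well below the receiver's timeout, which matches note 1 above.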
Nils Dijk 76fdfa3c0f
Add devcontainer for development purposes (#7102)
This change adds a devcontainer configuration to the Citus project. This
devcontainer allows for quick generation of isolated development
environments, either locally on a developer's machine or in the cloud,
for example GitHub Codespaces.

The devcontainer is updated automatically by GitHub Actions when its
configuration changes.

For more detailed instructions on how to quickstart development in a
container see CONTRIBUTING.md
2023-10-09 15:37:21 +02:00
Nils Dijk 6d8725efb0
Fix leaking of memory and memory contexts in Foreign Constraint Graphs (#7236)
DESCRIPTION: Fix leaking of memory and memory contexts in Foreign
Constraint Graphs

Previously, every time we (re)created the Foreign Constraint
Relationship Graph, we created a new Memory Context while losing the
reference to the previous context. This old context could still hold
leftover memory, causing a memory leak.

With this patch we statically have one memory context that we lazily
initialize the first time we create our foreign constraint relationship
graph. On every subsequent creation, besides destroying our previous
hashmap we also reset our memory context to remove any leftover
references.
2023-10-09 13:05:51 +02:00
Onur Tirtir 858d99be33
Take improvement_threshold into account in citus_add_rebalance_strategy() (#7247)
DESCRIPTION: Makes sure to take improvement_threshold into account
in `citus_add_rebalance_strategy()`.

Fixes https://github.com/citusdata/citus/issues/7188.
2023-10-09 13:13:08 +03:00
Önder Kalacı 7d6c401dd3
Update technical readme (#7248)
Fix a wrong query, reported by @naisila
2023-10-06 13:37:37 +03:00
Önder Kalacı 0dca65c84d
Add missing image to Technical Readme (#7243)
DESCRIPTION: PR description that will go into the change log, up to 78
characters
2023-09-29 22:24:10 +02:00
Önder Kalacı 185ac5e01e
Citus Technical Readme (#7207)
This commit aims to add a comprehensive guide that covers all essential
aspects of Citus, including planning, execution, locking mechanisms,
shard moves, 2PC, and many other major components of Citus.

Co-authored-by: Marco Slot <marco.slot@gmail.com>
2023-09-29 16:50:52 +03:00
dependabot[bot] c323f49e83
Bump cryptography from 41.0.3 to 41.0.4 in /src/test/regress (#7231)
Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.3
to 41.0.4.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nils Dijk <nils@citusdata.com>
2023-09-27 15:36:58 +02:00
Onur Tirtir 27ac44eb2a
Fix mixed Citus upgrade tests (#7218)
When testing rolling Citus upgrades, the coordinator should not be
upgraded until we upgrade all the workers.

---------

Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl>
2023-09-26 17:52:52 +03:00
Nils Dijk b87fbcbf79
Shard moves/isolate report LSN's in lsn format (#7227)
DESCRIPTION: Shard moves/isolate report LSN's in lsn format

While investigating an issue with our catchup mechanism on certain
postgres versions we noticed we print LSNs in the format of the native
long type. This is an uncommon representation for LSNs in postgres
logs.

This patch changes the output of our log message to go from the long
type representation to the native LSN type representation, making it
easier for postgres users to recognize and compare LSNs with other
related reports.

example of new output:
```
2023-09-25 17:28:47.544 CEST [11345] LOG:  The LSN of the target subscriptions on node localhost:9701 have increased from 0/0 to 0/E1ED20F8 at 2023-09-25 17:28:47.544165+02 where the source LSN is 1/415DCAD0
```
2023-09-26 13:47:50 +02:00
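The difference between the two representations can be illustrated with a short Python sketch. It assumes the conventional PostgreSQL text form of an LSN: the upper and lower 32 bits of the 64-bit value, each in hexadecimal, separated by a slash. `format_lsn` and `parse_lsn` are illustrative names, not Citus or PostgreSQL functions.

```python
def format_lsn(lsn: int) -> str:
    """Render a 64-bit LSN in the 'high/low' hexadecimal text form."""
    if not 0 <= lsn < 2 ** 64:
        raise ValueError("LSN must fit in an unsigned 64-bit integer")
    high = lsn >> 32          # upper 32 bits
    low = lsn & 0xFFFFFFFF    # lower 32 bits
    return f"{high:X}/{low:X}"


def parse_lsn(text: str) -> int:
    """Inverse of format_lsn: turn 'high/low' back into an integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


if __name__ == "__main__":
    # Values taken from the example log line above.
    print(format_lsn(0xE1ED20F8))     # target LSN
    print(format_lsn(0x1415DCAD0))    # source LSN
    print(parse_lsn("1/415DCAD0"))    # the integer form printed previously
```

With both values in the same text form, the target and source LSNs in the log line can be compared directly against what views such as `pg_stat_subscription` report.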