Citus Technical Readme (#7207)
This commit aims to add a comprehensive guide that covers all essential aspects of Citus, including planning, execution, locking mechanisms, shard moves, 2PC, and many other major components of Citus. Co-authored-by: Marco Slot <marco.slot@gmail.com>pull/7243/head
|
@ -240,3 +240,9 @@ Any other SQL you can put directly in the main sql file, e.g.
|
||||||
### Running tests
|
### Running tests
|
||||||
|
|
||||||
See [`src/test/regress/README.md`](https://github.com/citusdata/citus/blob/master/src/test/regress/README.md)
|
See [`src/test/regress/README.md`](https://github.com/citusdata/citus/blob/master/src/test/regress/README.md)
|
||||||
|
|
||||||
|
### Documentation
|
||||||
|
|
||||||
|
User-facing documentation is published on [docs.citusdata.com](https://docs.citusdata.com/). When adding a new feature, function, or setting, you can open a pull request or issue against the [Citus docs repo](https://github.com/citusdata/citus_docs/).
|
||||||
|
|
||||||
|
Detailed descriptions of the implementation for Citus developers are provided in the [Citus Technical Documentation](src/backend/distributed/README.md). It is currently a single file for ease of searching. Please update the documentation if you make any changes that affect the design or add major new features.
|
||||||
|
|
|
@ -4,7 +4,7 @@
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
[](https://docs.citusdata.com/)
|
[](https://docs.citusdata.com/)
|
||||||
[](https://stackoverflow.com/questions/tagged/citus)
|
[](https://stackoverflow.com/questions/tagged/citus)
|
||||||
|
@ -31,7 +31,7 @@ You can use these Citus superpowers to make your Postgres database scale-out rea
|
||||||
|
|
||||||
Our [SIGMOD '21](https://2021.sigmod.org/) paper [Citus: Distributed PostgreSQL for Data-Intensive Applications](https://doi.org/10.1145/3448016.3457551) gives a more detailed look into what Citus is, how it works, and why it works that way.
|
Our [SIGMOD '21](https://2021.sigmod.org/) paper [Citus: Distributed PostgreSQL for Data-Intensive Applications](https://doi.org/10.1145/3448016.3457551) gives a more detailed look into what Citus is, how it works, and why it works that way.
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
Since Citus is an extension to Postgres, you can use Citus with the latest Postgres versions. And Citus works seamlessly with the PostgreSQL tools and extensions you are already familiar with.
|
Since Citus is an extension to Postgres, you can use Citus with the latest Postgres versions. And Citus works seamlessly with the PostgreSQL tools and extensions you are already familiar with.
|
||||||
|
|
||||||
|
@ -423,12 +423,14 @@ A Citus database cluster grows from a single PostgreSQL node into a cluster by a
|
||||||
|
|
||||||
Data in distributed tables is stored in “shards”, which are actually just regular PostgreSQL tables on the worker nodes. When querying a distributed table on the coordinator node, Citus will send regular SQL queries to the worker nodes. That way, all the usual PostgreSQL optimizations and extensions can automatically be used with Citus.
|
Data in distributed tables is stored in “shards”, which are actually just regular PostgreSQL tables on the worker nodes. When querying a distributed table on the coordinator node, Citus will send regular SQL queries to the worker nodes. That way, all the usual PostgreSQL optimizations and extensions can automatically be used with Citus.
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
When you send a query in which all (co-located) distributed tables have the same filter on the distribution column, Citus will automatically detect that and send the whole query to the worker node that stores the data. That way, arbitrarily complex queries are supported with minimal routing overhead, which is especially useful for scaling transactional workloads. If queries do not have a specific filter, each shard is queried in parallel, which is especially useful in analytical workloads. The Citus distributed executor is adaptive and is designed to handle both query types at the same time on the same system under high concurrency, which enables large-scale mixed workloads.
|
When you send a query in which all (co-located) distributed tables have the same filter on the distribution column, Citus will automatically detect that and send the whole query to the worker node that stores the data. That way, arbitrarily complex queries are supported with minimal routing overhead, which is especially useful for scaling transactional workloads. If queries do not have a specific filter, each shard is queried in parallel, which is especially useful in analytical workloads. The Citus distributed executor is adaptive and is designed to handle both query types at the same time on the same system under high concurrency, which enables large-scale mixed workloads.
|
||||||
|
|
||||||
The schema and metadata of distributed tables and reference tables are automatically synchronized to all the nodes in the cluster. That way, you can connect to any node to run distributed queries. Schema changes and cluster administration still need to go through the coordinator.
|
The schema and metadata of distributed tables and reference tables are automatically synchronized to all the nodes in the cluster. That way, you can connect to any node to run distributed queries. Schema changes and cluster administration still need to go through the coordinator.
|
||||||
|
|
||||||
|
Detailed descriptions of the implementation for Citus developers are provided in the [Citus Technical Documentation](src/backend/distributed/README.md).
|
||||||
|
|
||||||
## When to use Citus
|
## When to use Citus
|
||||||
|
|
||||||
Citus is uniquely capable of scaling both analytical and transactional workloads with up to petabytes of data. Use cases in which Citus is commonly used:
|
Citus is uniquely capable of scaling both analytical and transactional workloads with up to petabytes of data. Use cases in which Citus is commonly used:
|
||||||
|
|
After Width: | Height: | Size: 95 KiB |
Before Width: | Height: | Size: 94 KiB After Width: | Height: | Size: 94 KiB |
Before Width: | Height: | Size: 22 KiB After Width: | Height: | Size: 22 KiB |
Before Width: | Height: | Size: 18 KiB After Width: | Height: | Size: 18 KiB |
After Width: | Height: | Size: 102 KiB |
After Width: | Height: | Size: 29 KiB |
After Width: | Height: | Size: 69 KiB |
After Width: | Height: | Size: 111 KiB |
After Width: | Height: | Size: 12 KiB |
After Width: | Height: | Size: 168 KiB |