Adds support for VACUUM and ANALYZE commands which target a specific distributed table.
After grabbing the appropriate locks, this implementation sends VACUUM commands to each placement (using one connection per placement). These commands are sent in parallel, so users with large tables will benefit from sharding. Except for VERBOSE, all VACUUM and ANALYZE options are supported, including the explicit column list used by ANALYZE.
As with many of our utility commands, the local command also runs. In the VACUUM/ANALYZE case, the local command is executed before any remote propagation. Because error handling is managed after local processing, this can result in a VACUUM completing locally but erroring out when distributed processing commences: a minor technicality in all cases, as there isn't really much reason to ever roll back a VACUUM (an impossibility in any case, as VACUUM cannot run within a transaction).
Remote propagation of targeted VACUUM/ANALYZE is controlled by the enable_ddl_propagation setting; warnings are emitted if such a command is attempted when DDL propagation is disabled.
Unqualified VACUUM or ANALYZE is not handled, but a warning message informs the user of this.
Implementation note: this commit adds a "BARE" value to MultiShardCommitProtocol. When active, no BEGIN command is ever sent to remote nodes, useful for commands such as VACUUM/ANALYZE which must not run in a transaction block. This value is not user-facing and is reset at transaction end.
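The propagation flow described in this commit message can be illustrated with a small Python sketch. It is not the Citus implementation (which is written in C); it only shows the general shape of building one command per shard placement and dispatching the commands in parallel without a surrounding BEGIN. The `Placement` type, the `send_to_placement` stand-in, the host names, and the `table_shardid` naming used here are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    node: str      # hypothetical worker host name
    shard_id: int  # hypothetical shard identifier


def shard_command(command, table, shard_id, options=(), columns=()):
    """Build the command text for one shard (assumes table_shardid naming)."""
    if command not in ("VACUUM", "ANALYZE"):
        raise ValueError("only VACUUM and ANALYZE are handled")
    if "VERBOSE" in options:
        raise ValueError("VERBOSE is not supported in this sketch")
    if command == "ANALYZE" and options:
        raise ValueError("this sketch accepts options for VACUUM only")
    parts = [command]
    if options:
        parts.append("(" + ", ".join(options) + ")")
    parts.append(f"{table}_{shard_id}")
    if columns:
        parts.append("(" + ", ".join(columns) + ")")
    return " ".join(parts)


def send_to_placement(placement, statement):
    # Stand-in for opening one connection to placement.node and running the
    # statement on it. No BEGIN is sent first, mirroring the "BARE" idea.
    return (placement.node, statement)


def propagate(command, table, placements, options=(), columns=()):
    """Send one command per placement in parallel; results keep input order."""
    workers = max(1, len(placements))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(
                send_to_placement,
                p,
                shard_command(command, table, p.shard_id, options, columns),
            )
            for p in placements
        ]
        return [f.result() for f in futures]


if __name__ == "__main__":
    placements = [Placement("worker-1", 102008), Placement("worker-2", 102009)]
    for node, stmt in propagate("VACUUM", "events", placements,
                                options=("ANALYZE",), columns=("user_id",)):
        print(node, stmt)
```

A real implementation would also take the locks mentioned above, run the local command first, and report errors from each connection; those steps are left out here to keep the sketch short.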
README.md
What is Citus?
- Open-source PostgreSQL extension (not a fork)
- Scalable across multiple hosts through sharding and replication
- Distributed engine for query parallelization
- Highly available in the face of host failures
Citus horizontally scales PostgreSQL across commodity servers using sharding and replication. Its query engine parallelizes incoming SQL queries across these servers to enable real-time responses on large datasets.
Citus extends the underlying database rather than forking it, which gives developers and enterprises the power and familiarity of a traditional relational database. As an extension, Citus supports new PostgreSQL releases, allowing users to benefit from new features while maintaining compatibility with existing PostgreSQL tools. Note that Citus supports many (but not all) SQL commands; see the FAQ for more details.
Common Use-Cases:
- Powering real-time analytic dashboards
- Exploratory queries on events as they happen
- Large dataset archival and reporting
- Session analytics (funnels, segmentation, and cohorts)
To learn more, visit citusdata.com and join the mailing list to stay on top of the latest developments.
Quickstart
Local Citus Cluster
- (Mac only) Connect to the Docker VM:
    eval $(docker-machine env default)
- Pull and start the Docker images:
    wget https://raw.githubusercontent.com/citusdata/docker/master/docker-compose.yml
    docker-compose -p citus up -d
- Connect to the master database:
    docker exec -it citus_master psql -U postgres -d postgres
- Follow the first tutorial instructions
- To shut the cluster down, run:
    docker-compose -p citus down
Talk to Contributors and Learn More
Documentation | Try the Citus tutorials for a hands-on introduction or the documentation for a more comprehensive reference.
Google Groups | The Citus Google Group is our place for detailed questions and discussions.
Slack | Chat with us in our community Slack channel.
GitHub Issues | We track specific bug reports and feature requests on our project issues.
Twitter | Follow @citusdata for general updates and PostgreSQL scaling tips.
Training and Support | See our support page for training and dedicated support options.
Contributing
Citus is built on and of open source. We welcome your contributions, and have added a helpwanted label to issues which are accessible to new contributors. The CONTRIBUTING.md file explains how to get started developing the Citus extension itself and our code quality guidelines.
Who is Using Citus?
Citus is deployed in production by many customers, ranging from technology start-ups to large enterprises. Here are some examples:
- CloudFlare uses Citus to provide real-time analytics on 100 TBs of data from over 4 million customer websites. Case Study
- MixRank uses Citus to efficiently collect and analyze vast amounts of data to allow inside B2B sales teams to find new customers. Case Study
- Neustar builds and maintains scalable ad-tech infrastructure that counts billions of events per day using Citus and HyperLogLog.
- Agari uses Citus to secure more than 85 percent of U.S. consumer emails on two 6-8 TB clusters. Case Study
- Heap uses Citus to run dynamic funnel, segmentation, and cohort queries across billions of users and tens of billions of events. Watch Video
Copyright © 2012–2016 Citus Data, Inc.