Commit Graph

9 Commits (4923a85aba1e2a13b89342e2cec3393903958cbd)

Author SHA1 Message Date
Yuanhao Luo 4923a85aba Bulkload copy for citus
Profiling with gprof, I found that the master node is a CPU bottleneck
and that most of the time is spent in the function NextCopyFrom().
To improve ingestion performance, I assign this time-consuming
function to each worker node. The benchmark results show that this
works: ingestion is nearly #workers times as fast as before.

This bulkload feature works as follows:
1. Issue a bulkload copy command on any node (master or worker), such
   as "copy tb1 from 'tb1.csv' with(format csv, method 'bulkload');" on
   node host:port.
2. On host:port, the copy command is rebuilt as "copy tb1 from program
   'bload' with(format csv, bulkload_host host, bulkload_port port,
   method 'bulkload')", and this rebuilt copy command is assigned to
   each worker asynchronously. In addition, a zeromq server is created,
   which reads records from the file 'tb1.csv' and delivers them to the
   zeromq clients (the program 'bload' on each worker node).
3. Each worker node simply executes the copy command assigned in
   step 2. The records for the copy command come from the zeromq client
   bload, which pulls them from the zeromq server.
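
The command rewrite in step 2 can be illustrated with a minimal Python
sketch. It is not the code from this commit (the extension itself is written
in C); the function name rebuild_copy_command and its parameters are
hypothetical, and the output simply mirrors the command text quoted above,
including the unquoted host and port values.

```python
# Hypothetical helper, for illustration only: it mirrors the rebuilt
# command text from the commit message and is not the Citus implementation.
SUPPORTED_FORMATS = ("csv", "text")  # binary is stated as unsupported


def rebuild_copy_command(table, fmt, host, port):
    """Return the per-worker COPY command described in step 2.

    The original file source is replaced by "program 'bload'", and the
    address of the node that received the command is passed along so that
    bload on each worker knows where to pull records from.
    """
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError("bulkload copy supports only csv and text formats")
    return (
        f"copy {table} from program 'bload' "
        f"with(format {fmt}, bulkload_host {host}, "
        f"bulkload_port {port}, method 'bulkload')"
    )


if __name__ == "__main__":
    # Example with placeholder values for the coordinating node's address.
    print(rebuild_copy_command("tb1", "csv", "host", "port"))
```

Note that the file name 'tb1.csv' does not appear in the rebuilt command:
per step 2, only the node that received the original command reads the file,
and the workers receive the records through bload.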

To enable this feature, you must have zeromq installed. After compiling
and installing the citus extension, just add the copy option
"method 'bulkload'" to use bulkload ingestion.

For now, bulkload copy supports copy from a file or program with
(format csv) or (format text), for append and hash distributed tables.
Note: copy from stdin supports only format csv and text; format binary
is not supported.

TODO: better support for transaction and error handling.
2017-02-16 20:53:58 +08:00
Brian Cloutier 4ecd6b58fb Remove csql, \stage is no longer needed 2016-08-26 10:41:59 +03:00
Jason Petersen a53fb90ef9
Fix various build issues
I came across several places we weren't as flexible or resilient as we
should have been in our build logic. They include:

  * Not using `DESTDIR` in the install-header destination
  * Allowing callers to specify `VPATH` or `srcdir` (which breaks)
  * Using absolute path for SCRIPTS (9.5 prepends srcdir)
  * Including libpq-int in a confusing way (extracted this function)
  * Having server includes come first during csql build (client must)

In particular, I hit all of these attempting to build with pg_buildext
in Debian. It passes in an explicit VPATH, as well as srcdir (breaking
all recursive make invocations), and also uses DESTDIR during install.

In addition, a PGDG-enabled Debian box will have the latest libpq-dev
headers (e.g. 9.5) even when building against an older server version
(e.g. 9.4). This leads to problems when including e.g. `c.h`, which
is ambiguous. While compiling more client-side code (csql), we need to
ensure the newer libpq headers are included _first_, so I fixed that.
2016-03-11 13:38:47 -07:00
Andres Freund 93caee2929
Add deps from toplevel install to build targets
Otherwise a parallel 'make install' can end up trying to build the same
targets twice, once via the normal build rules ('all') and once via
install.
2016-02-17 16:51:01 -07:00
Murat Tuncer df5851366c Fixed merge leftovers 2016-02-17 15:44:24 +02:00
Murat Tuncer 3528d7ce85 Merge from master branch into feature/citusdb-to-citus 2016-02-17 14:49:01 +02:00
Jason Petersen 622eb29996
Add make targets for applying and checking style
Need to change to the project's top srcdir, as citus_indent expects to
be able to find styled files using git ls-files, and VPATH builds would
otherwise not return any results.
2016-02-16 12:04:06 -07:00
Murat Tuncer 55c44b48dd Changed product name to citus
All citusdb references in
- extension, binary names
- file headers
- all configuration name prefixes
- error/warning messages
- some function names
- regression tests

are changed to be citus.
2016-02-15 16:04:31 +02:00
Onder Kalaci 136306a1fe Initial commit of Citus 5.0 2016-02-11 04:05:32 +02:00