Recent changes to DDL and transaction logic resulted in a "regression"
from the viewpoint of users. Previously, DDL commands were allowed in
multi-command transaction blocks, though they were not processed in any
actual transactional manner. We improved the atomicity of our DDL code,
but added a restriction that DDL commands themselves must not occur in
any BEGIN/END transaction block.
To give users back the original functionality (and improved atomicity)
we now keep track of whether a multi-command transaction has modified
data (DML) or schema (DDL). Interleaving the two modification types in
a single transaction is disallowed.
This first step simply permits a single DDL command in such a block,
admittedly an incomplete solution, but one which will permit us to add
full multi-DDL command support in a subsequent commit.
"Staging table" will be the only valid use of 'stage' from now on, we
will now say "load" when talking about data ingestion. If creation of
shards is its own step, we'll just say "shard creation".
Fixes#547
This change removes all references to \stage in the regression tests
and puts \COPY instead. Doing so changed shard counts, min/max
values on some test tables (lineitem, orders, etc.).
I've been seeing warnings on OS X/clang for a while about these lines
and finally got tired of it. The main problem is that PRIu64 expects a
uint64_t but we were passing a uint64 (a PostgreSQL-defined type). In
PostgreSQL 9.5, we now have INT64_MODIFIER, so can build our own zero-
padded unsigned 64-bit int format modifier that expects a PostgreSQL-
provided uint64 type.
This simplifies the code slightly (no more ifdefs) and gets rid of the
warning that's been annoying me since April (my TODO creation time).
A recent change to the image used in Travis causes some problems for
the code we use here to ensure the local replica is first. Since this
code is essentially dead in a post-stage world anyhow, we're OK with
ripping out the tests to placate Travis.
PostgreSQL 9.5.4 stopped calling planner for materialized view create
command when NO DATA option is provided.
This causes our test to behave differently between pre-9.5.4 and 9.5.4.
When an unreferenced prepared statement parameter does not explicitly
have a type assigned, we cannot deserialize it, to send to the remote
side. That commonly happens inside plpgsql functions, where local
variables are passed in as unused prepared statement parameters.
A recent change generates a "dummy" shard placement with its identifier
set to INVALID_SHARD_ID for SELECT queries against distributed tables
with no shards. Normally, no lock is acquired for SELECT statements,
but if all_modifications_commutative is set to true, we will acquire a
shared lock, triggering an assertion failure within LockShardResource
in the above case.
The "dummy" shard placement is actually necessary to ensure such empty
queries have somewhere to execute, and INVALID_SHARD_ID seems the most
appropriate value for the dummy's shard identifier field, so the most
straightforward fix is to just avoid locking invalid shard identifiers.
This bumps the Citus tools to 0.4.0, which include support for adding
a recent (0.6.3) uncrustify to Travis in addition to support for fully
installing Travis scripts to system locations. For brevity, suffixes
have been removed from Travis shell scripts.
The main additional logic here is just ensuring Travis CI gets a newer
uncrustify install and that `citus_indent` is called with the `--check`
flag, which exits with a nonzero status if any files don't comply.
Text datums can't be directly accessed via the struct equivalence trick
used to access catalogs. That's because, as an optimization, they're
sometimes aligned to 1 byte ("text"'s alignment), and sometimes to 4
bytes. That depends on it being a short
varlena (cf. VARATT_NOT_PAD_BYTE) or not.
In the case at hand here, partkey became longer than 127 characters -
the boundary for short varlenas (cf. VARATT_CAN_MAKE_SHORT()). Thus it
became 4 byte/int aligned. Which lead to the direct struct access
accessing the wrong data.
The fix is simply to never access partkey that way - to enforce that,
hide partkey ehind the usual ifdef.
Fixes: #674
Fixes#679
This change sets the default commit protocol for distributed DDL
commands to '1pc'. If the user issues a distributed DDL command with
this default setting, then once in a session, a NOTICE message is
shown about using '2pc' being extra safe.
This adds support for SERIAL/BIGSERIAL column types. Because we now can
evaluate functions on the master (during execution), adding this is a
matter of ensuring the table creation step works properly.
To accomplish this, I've added some logic to detect sequences owned by
a table (i.e. those related to its columns). Simply creating a sequence
and using it in a default value is insufficient; users who do so must
ensure the sequence is owned by the column using it.
Fortunately, this is exactly what SERIAL and BIGSERIAL do, which is the
use case we're targeting with this feature. While testing this, I found
that worker_apply_shard_ddl_command actually adds shard identifiers to
sequence names, though I found no places that use or test this path. I
removed that code so that sequence names are not mutated and will match
those used by a SERIAL default value expression.
Our use of the new-to-9.5 CREATE SEQUENCE IF NOT EXISTS syntax means we
are dropping support for 9.4 (which is being done regardless, but makes
this change simpler). I've removed 9.4 from the Travis build matrix.
Some edge cases are possible in ALTER SEQUENCE, COPY FROM (on workers),
and CREATE SEQUENCE OWNED BY. I've added errors for each so that users
understand when and why certain operations are prohibited.
We remove schema name parameter from worker_fetch_foreign_file and
worker_fetch_regular_table functions. We now send schema name
concatanated with table name.
Fixes#676
We added old versions (i.e. without schema name) of worker_apply_shard_ddl_command,
worker_fetch_foreign_file and worker_fetch_regular_table back. During function call
of one of these functions, we set schema name as public schema and call the newer
version of the functions.
This change removes AllFinalizedPlacementsAccessible function since,
we open connections to all shard placements before any command is
sent so we immediately error out if a shard placement is not accessible.
This change allows users to interrupt long running DDL commands.
Interrupt requests are handled after each DDL command being propagated
to a shard placement, which means that generally the cancel request will
be processed right after the execution of the DDL is finished in the
current placement.
We can now support richer set of queries in router planner.
This allow us to support CTEs, joins, window function, subqueries
if they are known to be executed at a single worker with a single
task (all tables are filtered down to a single shard and a single
worker contains all table shards referenced in the query).
Fixes : #501