Commit Graph

302 Commits (69bd7fc8dea8758e6bd31ba2a88e959a7103002b)

Author SHA1 Message Date
Metin Doslu f1bab608c1 Return false in MultiClientQueryResult() on failing query 2016-08-29 17:05:35 +03:00
Brian Cloutier a76041106e Remove csql, \stage is no longer needed 2016-08-26 10:41:59 +03:00
Brian Cloutier 24bd08d287 Remove check-multi-fdw tests, nobody uses Citus with fdws 2016-08-26 10:41:33 +03:00
Jason Petersen 1ed9871ead Rename test files with 'stage' in name
Ignored FDW files as those test are being removed entirely, I believe.
2016-08-22 13:32:53 -06:00
Jason Petersen 945ce21320 Replace verb 'stage' with 'load' in test comments
"Staging table" will be the only valid use of 'stage' from now on, we
will now say "load" when talking about data ingestion. If creation of
shards is its own step, we'll just say "shard creation".
2016-08-22 13:24:18 -06:00
Jason Petersen 1500438313 Replace verb 'stage' with 'load' in schedules
"Staging table" will be the only valid use of 'stage' from now on.
2016-08-22 11:48:41 -06:00
Eren Başak 7e511d278e Lowercase \copy to match PostgreSQL's style for local/psql-level functions 2016-08-22 11:31:26 -06:00
Eren Basak d030eb63d1 Replace \stage With \copy on Regression Tests
Fixes #547

This change removes all references to \stage in the regression tests
and puts \COPY instead. Doing so changed shard counts, min/max
values on some test tables (lineitem, orders, etc.).
2016-08-22 11:31:26 -06:00
Robin Thomas 475a6245bf Remove all usage of pg_dist_shard.shardalias in extension code. (#739)
Remove regression test of non-null shardalias.
2016-08-19 17:06:22 +03:00
Jason Petersen eb039bd58e Remove HAVE_INTTYPES_H ifdefs
I've been seeing warnings on OS X/clang for a while about these lines
and finally got tired of it. The main problem is that PRIu64 expects a
uint64_t but we were passing a uint64 (a PostgreSQL-defined type). In
PostgreSQL 9.5, we now have INT64_MODIFIER, so can build our own zero-
padded unsigned 64-bit int format modifier that expects a PostgreSQL-
provided uint64 type.

This simplifies the code slightly (no more ifdefs) and gets rid of the
warning that's been annoying me since April (my TODO creation time).
2016-08-18 15:19:53 -06:00
Jason Petersen 8e7a567827 Fix Travis local_first_candidate_nodes failures
A recent change to the image used in Travis causes some problems for
the code we use here to ensure the local replica is first. Since this
code is essentially dead in a post-stage world anyhow, we're OK with
ripping out the tests to placate Travis.
2016-08-14 23:12:10 -06:00
Murat Tuncer ed3a442cda Remove a router planner test for materialized view
PostgreSQL 9.5.4 stopped calling planner for materialized view create
command when NO DATA option is provided.

This causes our test to behave differently between pre-9.5.4 and 9.5.4.
2016-08-14 22:57:09 -06:00
Andres Freund d6f447884f Skip over unreferenced parameters when router executing prepared statement.
When an unreferenced prepared statement parameter does not explicitly
have a type assigned, we cannot deserialize it, to send to the remote
side.  That commonly happens inside plpgsql functions, where local
variables are passed in as unused prepared statement parameters.
2016-08-05 14:12:06 -07:00
Jason Petersen 2d7355ffe0 Avoid attempting to lock invalid shard identifier
A recent change generates a "dummy" shard placement with its identifier
set to INVALID_SHARD_ID for SELECT queries against distributed tables
with no shards. Normally, no lock is acquired for SELECT statements,
but if all_modifications_commutative is set to true, we will acquire a
shared lock, triggering an assertion failure within LockShardResource
in the above case.

The "dummy" shard placement is actually necessary to ensure such empty
queries have somewhere to execute, and INVALID_SHARD_ID seems the most
appropriate value for the dummy's shard identifier field, so the most
straightforward fix is to just avoid locking invalid shard identifiers.
2016-08-04 13:49:51 -07:00
Metin Doslu 16e02775ca Bump version numbers for 5.2 release 2016-08-01 13:48:24 -07:00
Marco Slot f538ab7f62 Rewrite WorkerShardStats to avoid invalid value bugs 2016-07-29 20:11:18 +02:00
Marco Slot 061662a51f Add MultiClientExecute and MultiClientValueIsNull for simple remote query execution 2016-07-29 20:07:18 +02:00
Andres Freund 0670ba6376 Don't access pg_dist_partition->partkey directly, use heap_getattr().
Text datums can't be directly accessed via the struct equivalence trick
used to access catalogs. That's because, as an optimization, they're
sometimes aligned to 1 byte ("text"'s alignment), and sometimes to 4
bytes. That depends on it being a short
varlena (cf. VARATT_NOT_PAD_BYTE) or not.

In the case at hand here, partkey became longer than 127 characters -
the boundary for short varlenas (cf. VARATT_CAN_MAKE_SHORT()). Thus it
became 4 byte/int aligned. Which lead to the direct struct access
accessing the wrong data.

The fix is simply to never access partkey that way - to enforce that,
hide partkey ehind the usual ifdef.

Fixes: #674
2016-07-29 10:02:36 -07:00
Eren Başak bb4f4e25b5 Set 1PC as the Default Commit Protocol for DDL Commands
Fixes #679

This change sets the default commit protocol for distributed DDL
commands to '1pc'. If the user issues a distributed DDL command with
this default setting, then once in a session, a NOTICE message is
shown about using '2pc' being extra safe.
2016-07-29 16:42:55 +03:00
Jason Petersen d43578c557 Quick fix for possible segfault in PurgeConnection
Now that connections can be acquired without going through the cache,
we have to handle cases where functions assume the cache has been ini-
tialized.
2016-07-29 00:12:56 -06:00
Jason Petersen f19779b0ce Support SERIAL/BIGSERIAL non-partition columns
This adds support for SERIAL/BIGSERIAL column types. Because we now can
evaluate functions on the master (during execution), adding this is a
matter of ensuring the table creation step works properly.

To accomplish this, I've added some logic to detect sequences owned by
a table (i.e. those related to its columns). Simply creating a sequence
and using it in a default value is insufficient; users who do so must
ensure the sequence is owned by the column using it.

Fortunately, this is exactly what SERIAL and BIGSERIAL do, which is the
use case we're targeting with this feature. While testing this, I found
that worker_apply_shard_ddl_command actually adds shard identifiers to
sequence names, though I found no places that use or test this path. I
removed that code so that sequence names are not mutated and will match
those used by a SERIAL default value expression.

Our use of the new-to-9.5 CREATE SEQUENCE IF NOT EXISTS syntax means we
are dropping support for 9.4 (which is being done regardless, but makes
this change simpler). I've removed 9.4 from the Travis build matrix.

Some edge cases are possible in ALTER SEQUENCE, COPY FROM (on workers),
and CREATE SEQUENCE OWNED BY. I've added errors for each so that users
understand when and why certain operations are prohibited.
2016-07-28 23:55:40 -06:00
Burak Yucesoy 0a2c940ae5 Remove schema name parameter from API functions
We remove schema name parameter from worker_fetch_foreign_file and
worker_fetch_regular_table functions. We now send schema name
concatanated with table name.
2016-07-28 20:41:05 +03:00
Burak Yucesoy 98025110f0 Add old version(without schema name parameter) of api functions back
Fixes #676

We added old versions (i.e. without schema name) of worker_apply_shard_ddl_command,
worker_fetch_foreign_file and worker_fetch_regular_table back. During function call
of one of these functions, we set schema name as  public schema and call the newer
version of the functions.
2016-07-28 20:40:38 +03:00
Eren Başak 6f91d30e1c Remove AllFinalizedlacementsAccessible Function
This change removes AllFinalizedPlacementsAccessible function since,
we open connections to all shard placements before any command is
sent so we immediately error out if a shard placement is not accessible.
2016-07-28 17:24:37 +03:00
Eren Başak 90a1d8f67a Allow Cancellation During Distributed DDL Commands
This change allows users to interrupt long running DDL commands.
Interrupt requests are handled after each DDL command being propagated
to a shard placement, which means that generally the cancel request will
be processed right after the execution of the DDL is finished in the
current placement.
2016-07-28 17:12:07 +03:00
Murat Tuncer 992997b8ad Expand router planner coverage
We can now support richer set of queries in router planner.
This allow us to support CTEs, joins, window function, subqueries
if they are known to be executed at a single worker with a single
task (all tables are filtered down to a single shard and a single
worker contains all table shards referenced in the query).

Fixes : #501
2016-07-27 23:35:38 +03:00
Murat Tuncer 719e44d1f4 Remove PostgreSQL 9.4 support 2016-07-26 20:16:09 +03:00
Onder Kalaci 0d3e85ce11 Fix bug related to poll timeout
This commit fixes a bug on setting polling timeout. The code
updated to comform to the comment that is already placed.
2016-07-26 09:55:47 +03:00
Burak Yucesoy 770ecffc8f Remove warnings on schema creation
Since now we support schema related operations, there is no need to warn user about
schema usage.
2016-07-22 18:24:23 +03:00
Burak Yucesoy 9df8300efa Fix ALTER TABLE SET SCHEMA
Fixes #132

We hook into ALTER ... SET SCHEMA and warn out if user tries to change schema of a
distributed table.

We also hook into ALTER TABLE ALL IN TABLE SPACE statements and warn out if citus has
been loaded.
2016-07-22 17:52:40 +03:00
Murat Tuncer 461fefbdb2 Fix outer join crash when subquery is flatten 2016-07-22 17:01:19 +03:00
Burak Yucesoy 444d4eb558 Fix worker_fetch_regular_table with schema
Fixes #504
Fixes #646

We changed signature of worker_fetch_regular_table to accept schema name as parameter to
make it work with schemas.
2016-07-22 00:44:02 -06:00
Jason Petersen 44e444ac6a Permit "single-shard" transactions
Allows the use of modification commands (INSERT/UPDATE/DELETE) within
transaction blocks (delimited by BEGIN and ROLLBACK/COMMIT), so long as
all modifications hit a subset of nodes involved in the first such com-
mand in the transaction. This does not circumvent the requirement that
each individual modification command must still target a single shard.

For instance, after sending BEGIN, a user might INSERT some rows to a
shard replicated on two nodes. Subsequent modifications can hit other
shards, so long as they are on one or both of these nodes.

SAVEPOINTs are supported, though if the user actually attempts to send
a ROLLBACK command that specifies a SAVEPOINT they will receive an
ERROR at the end of the topmost transaction.

Placements are only marked inactive if at least one replica succeeds
in a transaction where others fail. Non-atomic behavior is possible if
the shard targeted by the initial modification within a transaction has
a higher replication factor than another shard within the same block
and a node with the latter shard has a failure during the COMMIT phase.

Other methods of denoting transaction blocks (multi-statement commands
sent all at once and functions written in e.g. PL/pgSQL or other such
languages) are not presently supported; their treatment remains the
same as before.
2016-07-21 15:57:22 -06:00
Burak Yucesoy 7df5a265c7 Fix COUNT DISTINCT approximation with schema
Fixes #555

Before this change, we were resolving HLL function and type Oid without qualified name.
Now we find the schema name where HLL objects are stored and generate qualified names for
each objects.

Similar fix is also applied for cstore_table_size function call.
2016-07-21 17:29:18 +03:00
Burak Yucesoy 5a93a70e2d Fix master_apply_delete_command with schema
Fixes #73
2016-07-21 15:09:20 +03:00
Burak Yucesoy d0beacc4e1 Change worker_apply_shard_ddl_command to accept schema name as parameter
Fixes #565
Fixes #626

To add schema support to citus, we need to schema-prefix all table names, object names etc.
in the queries sent to worker nodes. However; query deparsing is not available for most of
DDL commands, therefore it is not easy to generate worker query in the master node.

As a solution we are sending schema names along with shard id and query to run to worker
nodes with worker_apply_shard_ddl_command.

To not break \STAGE command we pass public schema as paramater while calling
worker_apply_shard_ddl_command from there. This will not cause problem if user uses \STAGE
in different schema because passes schema name is used only if there is no schema name is
given in the query.
2016-07-21 14:17:26 +03:00
Metin Doslu 28000a8203 Add support for prepared statements with parameterized non-partition columns in router executor 2016-07-21 11:09:28 +03:00
Marco Slot 7c093c5cef Move CompleteShardPlacementTransactions to multi_shard_transaction.c 2016-07-20 12:10:46 +02:00
Burak Yucesoy 71bb558641 Always schema-prefix worker queries
Fixes #215
Fixes #267
Fixes #502
Fixes #556
Fixes #557
Fixes #560
Fixes #568
Fixes #623
Fixes #624

With this change we schema-prefix table names, operator names and composite types.
2016-07-20 10:42:24 +03:00
Eren Başak c559592da0 Fix Unused Parameter isTopLevel in ExecuteDistributedDDLCommand
This change fixes the unused variable problem in
`ExecuteDistributedDDLCommand` function (multi_utility.c). The
parameter is meant to be used in PreventTransactionChain call.
2016-07-19 14:14:02 +03:00
Eren 692ef0964a Propagate DDL Commands with 2PC
Fixes #513

This change modifies the DDL Propagation logic so that DDL queries
are propagated via 2-Phase Commit protocol. This way, failures during
the execution of distributed DDL commands will not leave the table in
an intermediate state and the pending prepared transactions can be
commited manually.

DDL commands are not allowed inside other transaction blocks or functions.

DDL commands are performed with 2PC regardless of the value of
`citus.multi_shard_commit_protocol` parameter.

The workflow of the successful case is this:
1. Open individual connections to all shard placements and send `BEGIN`
2. Send `SELECT worker_apply_shard_ddl_command(<shardId>, <DDL Command>)`
to all connections, one by one, in a serial manner.
3. Send `PREPARE TRANSCATION <transaction_id>` to all connections.
4. Sedn `COMMIT` to all connections.

Failure cases:
- If a worker problem occurs before sending of all DDL commands is finished, then
all changes are rolled back.
- If a worker problem occurs after all DDL commands are sent but not after
`PREPARE TRANSACTION` commands are finished, then all changes are rolled back.
However, if a worker node is failed, then the prepared transactions in that worker
should be rolled back manually.
- If a worker problem occurs during `COMMIT PREPARED` statements are being sent,
then the prepared transactions on the failed workers should be commited manually.
- If master fails before the first 'PREPARE TRANSACTION' is sent, then nothing is
changed on workers.
- If master fails during `PREPARE TRANSACTION` commands are being sent, then the
prepared transactions on workers should be rolled back manually.
- If master fails during `COMMIT PREPARED` or `ROLLBACK PREPARED` commands are being
sent, then the remaining prepared transactions on the workers should be handled manually.

This change also helps with #480, since failed DDL changes no longer mark
failed placements as inactive.
2016-07-19 10:44:11 +03:00
Murat Tuncer eae7f79a8b Make router planner use original query 2016-07-18 18:23:04 +03:00
Eren c92c81b550 Add LIMIT/OFFSET Support
Fixes #394

This change adds LIMIT/OFFSET support for non router-plannable
distributed queries.

In cases that we can push the LIMIT down, we add the OFFSET value to
that LIMIT in the worker queries. When a query with LIMIT x OFFSET y is issued,
the query is propagated to the workers as LIMIT (x+y) OFFSET 0, and on the
master table, the original LIMIT and OFFSET values are used. With this change,
we can use OFFSET wherever we can use LIMIT.
2016-07-18 12:00:24 +03:00
Andres Freund bafafcd1bf citus_indent fixups 2016-07-13 11:45:51 -07:00
Brian Cloutier 728eefcf2b Simplify code and fix include guards in citus_clauses 2016-07-13 11:45:51 -07:00
Brian Cloutier 9a5e529f6f cosmetic changes 2016-07-13 11:45:51 -07:00
Brian Cloutier c46cb19cda Only reparse queries if the planner flags them for reparsing 2016-07-13 11:45:51 -07:00
Brian Cloutier d792c0af4d citus_indent and some renaming 2016-07-13 11:45:51 -07:00
Brian Cloutier e73b4ac026 Evaluate functions on the master
- Enables using VOLATILE functions (like nextval()) in INSERT queries
- Enables using STABLE functions (like now()) targetLists and joinTrees

UPDATE and INSERT can now contain non-immutable functions. INSERT can contain any kind of
expression, while UPDATE can contain any STABLE function, so long as a Var is not passed
into the STABLE function, even indirectly. UPDATE TagetEntry's can now also include Vars.

There's an exception, CASE/COALESCE statements may not contain mutable functions.

Functions calls in master_modify_multiple_shards are also evaluated.
2016-07-13 11:45:51 -07:00
Burak Yucesoy 7cb92b8bb1 Fix COPY produces error when using array of user-defined types
Fixes #463

OID of user-defined types may be different in master and worker nodes. This causes errors
while sending data between nodes with binary nodes. Because binary copy format adds OID
of the element if it is in an array. The code adding OID is in PostgreSQL code, therefore
we cannot change it. Instead we decided to use text format if we try to send array of
user-defined type.
2016-07-13 11:12:24 +03:00
Jason Petersen 9157ac9f10 Remove hash-pruning logic for NULL values
It turns out some tests exercised this behavior, but removing it should
have no ill effects. Besides, both copy and INSERT disallow NULLs in a
table's partition column.

Fixes a bug where anti-joins on hash-partitioned distributed tables
would incorrectly prune shards early, result in incorrect results (test
included).
2016-07-06 17:04:21 -06:00
Andres Freund c945ea310b Add regression tests for RETURNING. 2016-07-01 13:07:12 -07:00
Andres Freund 586f738bc7 Support RETURNING for modification commands.
Fixes: #242
2016-07-01 13:07:12 -07:00
Andres Freund 610e17d94a Combine router executor paths for select and modify commands.
The upcoming RETURNING support would otherwise require too much
duplication.  This contains most of the pieces required for RETURNING
support, except removing the planner checks and adjusting regression
test output.
2016-07-01 13:07:12 -07:00
Andres Freund c9505a47ab Remember original targetlist in MultiQueryContainerNode().
The old targetlist wasn't used so far, but the upcoming RETURNING
support relies on it.

This also allows to get rid of some crufty code in
multi_executor.c:multi_ExecutorStart(), which used the worker query's
targetlist instead of the main statement's (which didn't have one up to
now).
2016-07-01 12:50:12 -07:00
Andres Freund 63fcd4a505 Fix definition of faux targetlist element inserted to prevent backward scans.
The targetlist contains TargetEntrys containing expressions, not
expressions directly. That didn't matter so far, but with the upcoming
RETURNING support, the targetlist is inspected to build a TupleDesc.
ExecCleanTypeFromTL hits an assert when looking at something that's not
a TargetEntry.

Mark the entry as resjunk, so it's not actually used.
2016-07-01 12:50:12 -07:00
Andres Freund 3201ef8764 Add tests verifying that updates return correct tuple counts.
This unfortunately requires adding a new table, triggering renumbering
of a number of shard ids.
2016-07-01 12:50:12 -07:00
Metin Doslu 85db53c8fe Add null check to SqlStateMatchesCategory()
Fixes #634
2016-07-01 12:28:46 -07:00
Jason Petersen 518adff539 Minor formatting fix
Noticed that uncrustify doesn't like the array-of-struct literals, so
omitting them from formatting (at least here).
2016-06-28 13:09:57 -06:00
Jason Petersen f132dbad1a Use literal instead of constant to fix 9.4 build
PG_UINT32_MAX doesn't exist before 9.5. Missed this because I removed
my assert-enabled builds during packaging work.

Fixes #619
2016-06-28 12:36:14 -06:00
Andres Freund 891dd366d2 Provide our own psqlscan.l->psqlscan.l rule.
As postgres's generic .l -> .c Makefile rule uses ifdef - which is
evaluated early, not during rule evaluation - we have to override the
rule, in addition to the detection of FLEX in the previous commit.

Fixes: #439
2016-06-22 11:03:23 -07:00
Jason Petersen 56f145e7fa Purge connection if re-raising error
The only way we re-raise an error is if the raiseError flag is true, so
might as well purge connection in that block rather than independently
checking errorLevel.
2016-06-21 09:51:12 -06:00
Murat Tuncer e86b4b397c Refactor multi_planner to create router plan directly
If router plan creation fails, it falls back to normal planner
2016-06-21 12:50:21 +03:00
Burak Yucesoy 2da5ae240e Fix master_append_table_to_shard to work with schemas
Fixes #78

With this change, it is possible to append a table in any schema to shard. The function
master_append_table_to_shard now supports schema names.
2016-06-17 04:35:00 +03:00
Andres Freund acb36b4505 Store ShardInterval instead of shardId in RangeTableFragments.
For CITUS_RTE_RELATION type fragments, reloading shardIntervals from the
database is rather expensive. So store a pointer to the full shard
interval, instead of just the shard id.  There's no new memory lifetime
hazards here, because we already passed a pointer to the shardInterval's
->shardId field around.

The plan time for the query in issue #607 goes from 2889 ms to 106 ms.
with this change.
2016-06-16 17:31:35 -07:00
Andres Freund 1e07a94435 Use cached comparator in ShardIntervalsOverlap().
By far the most expensive part of ShardIntervalsOverlap() is computing
the function to use to determine overlap. Luckily we already have that
computed and cached.

The plan time for the query in issue #607 goes from 8764 ms to 2889 ms
with this change.
2016-06-16 17:21:19 -07:00
Andres Freund 1c24d703d7 Add tests for LEFT JOIN ON clauses preventing matches left/right. 2016-06-16 16:53:02 -07:00
Marco Slot f15ec5554c Do not copy outer join clauses into WHERE 2016-06-16 16:42:32 -07:00
Metin Doslu 7ede3db4f5 Drop function from public and create in pg_catalog
Fixes #600
2016-06-16 14:08:40 -07:00
Murat Tuncer aeb6443898 Reduce regression test runtime
-Added 2 more schedules for task-tracker and multi-binary
 instead of running multi_schedule 3 times
-set task-tracker-delay for each long running schedule
2016-06-15 16:35:07 +03:00
Burak Yucesoy bdf9ca2466 Append shardId before escaping the table name
Fixes #550, fixes #545

If table name contains special characters, it needs to be escaped. However in some cases,
we escape table name before appending shardId, which causes syntax error in the queries
sent to worker nodes. With this change we now append shardId before escaping table names.
2016-06-15 04:15:40 +03:00
Murat Tuncer 5fddb9c34e Remove variant files
This checkin removes variant files we needed
due to differences in outputs of pg94 and pg95 runs.
However, variant file for test multi_upsert stays
since this file tests for a feature that does not
exist in pg94, and outputs are drastically different.
2016-06-13 12:12:06 +03:00
Eren ae5687e726 Eliminate compile time warnings in multi_logical_optimizer.c
This change removes some issues about mixed declarations
and code in TablePartitioningSupportsDistinct() and
WorkerExtendedOpNode() functions.
2016-06-10 12:27:12 +03:00
Murat Tuncer 0de0e7c3d1 Refactor task tracker cleanup to enable workers receive cleanup jobs
Long sleep is replaced by multiple small sleeps. Maximum timeout
is also increased since we do not have to wait for that long most
of the cases.
2016-06-09 17:03:54 +03:00
Murat Tuncer 315b7f3e4c Fix crash in count distinct with filters in repartition subqueries
now copies all column references in count distinct aggreagete
to worker target list and group by. Master target list is
also updated to reflect changes in attribute order.

Fixes 569
2016-06-09 11:47:24 +03:00
Jason Petersen 369ab7664c Minor formatting/comment fixes 2016-06-08 10:34:07 -06:00
Amos Bird b58dfd93ae Add overflow checks. 2016-06-08 10:30:03 +08:00
Amos Bird ad42423c24 Eliminates the possibilities of counter overflows.
This patch uses scanint8 instead of pg_atoi to make sure the affected
tuples counter never gets overflow.
2016-06-08 10:30:03 +08:00
Burak Yücesoy 54c0d827d8 Fix wrong storage type for foreign tables
Fixes #496

Previously we do not check whether table is foreign or not while creating empty
shards, and set storage type to 't'(Standard table) or 'c'(Columnar table). Now
if the table is foreign table(but not CStore foreign table) we set storage
type to 'f'(Foreign table). If it is CStore foreign table, we set its storage
type to 'c', i.e. columnar table have priority over foreign table.

Please note that 'c' is only used for CStore tables not for other possible
columnar stores at the moment. Possible improvement could be checking for other
columnar stores, though I am not sure if there is a way to check it for all
other columnar stores.
2016-06-08 04:12:01 +03:00
Jason Petersen 6e9a2869e0 Add back test for INSERT where all placements fail
Since we now short-circuit on certain remote errors, we want to ensure
we preserve the old behavior of not modifying any placement states if
a non-short-circuiting error occurs on all placements.
2016-06-07 13:21:23 -06:00
Jason Petersen 281d93b3c6 Make ReportRemoteError's CONTEXT style-compliant
There's not a ton of documentation about what CONTEXT lines should look
like, but this seems like the most dominant pattern. Similarly, users
should expect lowercase, non-period strings.
2016-06-07 12:47:16 -06:00
Jason Petersen 8efb504d1a Refactor ReportRemoteError to remove boolean arg
Broke it into two explicitly-named functions instead: WarnRemoteError
and ReraiseRemoteError.
2016-06-07 12:38:32 -06:00
Metin Doslu dfc7dd8d87 Fail fast on constraint violations in router executor 2016-06-07 18:11:17 +03:00
Metin Doslu 6195535906 Update ereport format 2016-06-07 15:58:32 +03:00
Metin Doslu 5eb2e76296 Update only shard length on statistics update for hash-partitioned
Update only the shard length on master_update_shard_statistics() call for
hash-partitioned tables.

Fixes #519.
2016-06-07 15:04:29 +03:00
Eren 0645cba428 Set Explicit ShardId/JobId In Regression Tests
Fixes #271

This change sets ShardIds and JobIds for each test case. Before this change,
when a new test that somehow increments Job or Shard IDs is added, then
the tests after the new test should be updated.

ShardID and JobID sequences are set at the beginning of each file with the
following commands:

```
ALTER SEQUENCE pg_catalog.pg_dist_shardid_seq RESTART 290000;
ALTER SEQUENCE pg_catalog.pg_dist_jobid_seq RESTART 290000;
```

ShardIds and JobIds are multiples of 10000. Exceptions are:
- multi_large_shardid: shardid and jobid sequences are set to much larger values
- multi_fdw_large_shardid: same as above
- multi_join_pruning: Causes a race condition with multi_hash_pruning since
they are run in parallel.
2016-06-07 14:32:44 +03:00
Murat Tuncer fcd4248f6a Add enable_ddl_propagation flag to control automatic ddl propagation 2016-06-06 13:42:46 +03:00
Murat Tuncer 41096f2076 Change equality operator check for operator expressions 2016-06-06 12:34:16 +03:00
Burak Yücesoy 658fa600d2 Update regression tests where metadata edited manually
Fixes #302

Since our previous syntax did not allow creating hash partitioned tables,
some of the previous tests manually changed partition method to hash to
be able to test it. With this change we remove unnecessary workaround and
create hash distributed tables instead. Also in some tests metadata was
created manually. With this change we also fixed this issue.
2016-06-04 13:50:42 +00:00
Burak Yucesoy 15f55cb675 Remove ONLY clause from worker queries
Fixes #475

With this change we prevent addition of ONLY clause to queries prepared for
worker nodes. When we add ONLY clause we may miss the inherited tables in
worker nodes created by users manually.
2016-06-03 11:42:43 +03:00
Andres Freund ee5bb2297b Rely less on remote_task_check_interval.
When executing queries with citus.task_executor = 'real-time', query
execution could, so far, spend a significant amount of time
sleeping. That's because we were
a) sleeping after several phases of query execution, even if we're not
   waiting for network IO
b) sleeping for a fixed amount of time when waiting for network IO;
   often a lot longer than actually required.
Just reducing the amount of time slept isn't a real solution, because
that just increases CPU usage.

Instead have the real-time executor's ManageTaskExecution return whether
a task is currently being processed, waiting for reads or writes, or
failed. When all tasks are waiting for IO use poll() to wait for IO
readyness.

That requires to slightly redefine how connection timeouts are handled:
before we counted the number of times ManageTaskExecution() was called,
and compared that with the timeout divided by the task check
interval. That, if processing of tasks took a while, could significantly
increase the time till a timeout occurred. Because it was based on the
ManageTaskExecution() being called on a constant interval, this approach
isn't feasible anymore.  Instead measure the actual time since
connection establishment was started. That could in theory, if task
processing takes a very long time, lead to few passes over
PQconnectPoll().

The problem of sleeping too much also exists for the 'task-tracker'
executor, but is generally less problematic there, as processing the
individual tasks usually will take longer. That said, for e.g. the
regression tests it'd be helpful to use a similar approach.
2016-06-02 12:11:16 -06:00
Metin Doslu c094104d9e Move master_update_shard_statistics() to pg_catalog
Fixes #546
2016-06-02 10:52:47 +03:00
Jason Petersen cc46222e35 Fix formatting
Checking in citus_indent output.
2016-05-27 15:13:28 -06:00
Amos Bird ed0002f28e Remove redundant implementations of error funcs.
This patch does some basic cleaning jobs. It removes duplicated
implementations of ReportRemoteError() and related ones and adjusts
regression tests.
2016-05-27 15:12:59 -06:00
Jason Petersen f9f17cd1ba Merge branch credativ:reproducible
cr: @jasonmp85
2016-05-27 12:45:55 -06:00
Matthew Seaman 62bf21de5d Add inet includes for htonl and htons funtions
Needed to fix FreeBSD builds.
2016-05-27 12:36:12 -06:00
Murat Tuncer 9167373f54 Add complex distinct count support for repartitioned subqueries
Single table repartition subqueries now support count(distinct column)
and count(distinct (case when ...)) expressions. Repartition query
extracts column used in aggregate expression and adds them to target
list and group by list, master query stays the same (count (distinct ...))
but attribute numbers inside the aggregate expression is modified to
reflect changes in repartition query.
2016-05-27 15:43:05 +03:00
Metin Doslu a82efa6613 Make master_create_empty_shard() aware of the shard placement policy
Now, master_create_empty_shard() will create shards according to the
value of citus.shard_placement_policy which also makes default round-robin
instead of random.
2016-05-27 15:05:53 +03:00
eren 793cb2d004 ADD master_modify_multiple_shards UDF
Fixes #10

This change creates a new UDF: master_modify_multiple_shards
Parameters:
  modify_query: A simple DELETE or UPDATE query as a string.

The UDF is similar to the existing master_apply_delete_command UDF.
Basically, given the modify query, it prunes the shard list, re-constructs
the query for each shard and sends the query to the placements.

Depending on the value of citus.multi_shard_commit_protocol, the commit
can be done in one-phase or two-phase manner.

Limitations:
* It cannot be called inside a transaction block
* It only be called with simple operator expressions (like Single Shard Modify)

Sample Usage:
```
SELECT master_modify_multiple_shards(
  'DELETE FROM customer_delete_protocol WHERE c_custkey > 500 AND c_custkey < 500');
```
2016-05-26 17:30:35 +03:00
Burak Yucesoy 31b0423f1f Fix #469
This change renames one of the ReceiveRegularFile functions with
more descriptive name.
2016-05-26 12:03:36 +03:00
Christoph Berg 2d56be6983 Sort list of objects in src/backend/distributed/Makefile
Make's $(wildcard) does not sort the glob result, but returns filenames
in filesystem ordering. This makes the build result vary and hence
unreproducible on the binary level. Fix by adding $(sort).

Spotted by Debian's reproducible builds project.
2016-05-18 10:42:20 +02:00
Jason Petersen 6998e4a423 Add multi_copy test outputs to gitignore 2016-05-10 13:36:56 -06:00
Jason Petersen 60b7cdfa7c Add gitignore rules for latest install files
Got tired of dirty git tree.
2016-05-10 11:57:11 -06:00
Marco Slot d333c49280 Add JSON/XML validation to EXPLAIN regression tests and fix issues 2016-05-06 11:30:07 +02:00
Lukas Fittl 19e71b5271 Distributed EXPLAIN: Generate valid JSON output.
This modifies the EXPLAIN output functions to actually generate
valid JSON output when (FORMAT JSON) is being used.

Fixes #494.
2016-05-05 12:48:01 +02:00
Onder Kalaci 0a740c0bdc Fix check-full failures
This commit fixes failures happen during check-full. The change does make
clean seperation of executor types in certain places to keep the outputs
stable.
2016-05-05 12:28:22 +03:00
Andres Freund 812a930f6c Stamp 5.1 release. 2016-05-04 18:05:41 -07:00
Andres Freund e28ce607d2 Generate extension versions from the previous one. 2016-05-04 18:05:41 -07:00
Onder Kalaci 38a1092687 Fix compile time warning
This change fixes a compile time warning related to definition/declaration order
of the code.
2016-05-04 09:42:10 +03:00
Marco Slot 206912bda4 Remove costs from explain regression tests 2016-05-03 22:11:23 +02:00
Metin Doslu fb6b6daf9d Add COPY support on worker nodes for append partitioned relations
Now, we can copy to an append-partitioned distributed relation from
any worker node by providing master options such as;

COPY relation_name FROM file_path WITH (delimiter '|', master_host 'localhost', master_port 5432);

where master_port is optional and default is 5432.
2016-05-03 16:00:00 +03:00
Marco Slot 27a551fedc Add deprecation warning to copy_to_distributed_table 2016-05-03 14:08:42 +02:00
Brian Cloutier 5962c9b7c8 Query Planning Performance Improvments (#474)
- Only look at pruned shards when determining AnchorTable
- Use cached shardIntervalCompareFunction during copartition check
2016-05-03 10:48:46 +03:00
Marco Slot 1bfd124da8 Remove spurious intermediate regression test files 2016-05-02 12:30:15 +02:00
Jason Petersen 37103eb92f Force bad connections in tests by closing sockets
Based on Andres' suggestion, I removed SetConnectionStatus, moving its
functionality directly into set_connection_status_bad, which now simply
shuts down the socket underlying a particular connection.

This keeps the functionality as-is while removing our questionable use
of internal libpq headers.
2016-04-29 15:56:04 -07:00
Marco Slot cfbdbe29a9 Add EXPLAIN for simple distributed queries 2016-04-30 00:11:02 +02:00
eren 9dc6f6b2e2 FIX "mixed declarations and code" Warning in multi_physical_planner.c
Fixes #477

This change fixes the compile time warning message in BuildMapMergeJob in
multi_physical_planner.c about mixed declarations and code. Basically, the
problematic declaration is moved up so that no expression is before it.
2016-04-29 11:18:04 +03:00
Brian Cloutier 38fdb01b91 Allow references to columns in UPDATE statements (#472)
Allow references to columns in UPDATE statements

Queries like "UPDATE tbl SET column = column + 1" are now allowed, so long as you don't use any IMMUTABLE functions.
2016-04-28 05:45:16 -07:00
eren 888457bb7f Rename copy_transaction_manager
This change renames the distributed transaction manager parameter from
citus.copy_transaction_manager to citus.multi_shard_commit_protocol.

Distributed transaction manager has been used only by the COPY on hash
partitioned tables but it can be used by upcoming features so, we needed
to rename so that its name do not contain a reference to COPY.

The change also includes renames like transaction_manager_options to
commit_protocol_options and TRANSACTION_MANAGER_1PC to COMMIT_PROTOCOL_1PC.

With this change, declaration of MultiShardCommitProtocol (was
CopyTransactionManager) is moved from multi_copy.c to multi_transaction.c.
2016-04-28 15:12:50 +03:00
Andres Freund a9d7f62cad Perform permission checks on operations re-implemented by citus.
Currently that's just COPY FROM.  There's other places where we could
check for permissions earlier (to fail less verbosely), but since
there's other pending changes in the whole DDL area, which is affected
by this, I'm just adding a note to those places.
2016-04-27 10:28:36 -07:00
Andres Freund 63998786ba Create new shards as owned the distributed table's owner.
That's important because ownership of relations implies special
privileges. Without this change, a distributed table can be accessible
by a table's owner, but a shard created by another user might not.
2016-04-27 10:28:33 -07:00
Andres Freund c45b94e88a Add ReplicateGrantStmt().
This is the basis for coordinating GRANT/REVOKE across nodes.
2016-04-27 10:28:25 -07:00
Andres Freund ee6ef363c0 Add pg_get_table_grants() function and support extending GRANTs. 2016-04-27 10:28:25 -07:00
Andres Freund c181ccf6ff Grant SELECT for pg_catalog.pg_dist* to PUBLIC.
Given pg_class et al. are readable by everyone there's little point in
restricting read only access to citus catalogs.
2016-04-27 10:28:25 -07:00
Andres Freund 99e983433f Run some commands as superuser to allow normal users to execute queries.
Some small parts of citus currently require superuser privileges; which
is obviously not desirable for production scenarios. Run these small
parts under superuser privileges (we use the extension owner) to avoid
that.

This does not yet coordinate grants between master and workers. Thus it
allows to create shards, load data, and run queries as a non-superuser,
but it is not easily possible to allow differentiated accesses to
several users.
2016-04-27 10:28:22 -07:00
Andres Freund a0058023bf Add CitusExtensionOwner(), to execute some priviledged operations under.
There exist some operations we have to execute with elevated
privileges. The most expedient user for that is the user owning the
citusdb extension.
2016-04-27 10:26:08 -07:00
Andres Freund e1fc079d07 Replace direct inserts in csql's \stage by serverside functions.
\stage so far directly inserted into pg_dist_shard and
pg_dist_shard_placement. That makes it hard to do effective permission
checks.  Thus move the inserts into two C functions.

These two new functions aren't the nicest abstraction. But as we are
planning to obsolete \stage, it doesn't seem worthwhile to refactor the
client-side code of \stage to allow the use of
master_create_empty_shard() et al.
2016-04-27 10:23:35 -07:00
Andres Freund 22ea434cef Perform permission checks in functions manipulating distributed tables.
Previously several commands, amongst them commands like
master_create_distributed_table(), were allowed for everyone. That's not
good: Even though citus currently requires superuser permissions, we
shouldn't allow non-superusers to perform actions as sensitive as making
a table distributed.

There's no checks on the worker_* functions, as these usually just punt
the action to underlying postgres functionality, which then perform the
necessary checks.
2016-04-27 10:22:20 -07:00
Andres Freund 6080ab4441 Add very basic infrastructure for schema upgrade scripts.
Citus' extension version now has a -$schemaversion appendix.  When the
schema is changed, a new schema version has to be added; changes to the
same schema version several commits inside a single pull request are ok.

Schema migration scripts between each schema version have to be
added. To ensure upgrade scripts work correctly a new regression test
ensures that all steps work.

The extension scripts to-be-used for CREATE EXTENSION (i.e. not
extension updates) are generated by concatenating citus.sql and the
relevant migration scripts.
2016-04-27 10:00:08 -07:00
Andres Freund e99ae630a0 Always create database for regression tests with a fixed username.
Otherwise the owner of relations and such will depend on the username of
the user running the regression tests. As "postgres" is the most common
username for that purpose, hardcode that in pg_regress_multi.pl.
2016-04-27 10:00:08 -07:00
Andres Freund 3dae284bbe Use the current session's username when connecting to worker nodes.
So far we've always used libpq defaults when connecting to workers; bar
special environment variables being set that'll always be the user that
started the server.  That's not desirable because it prevents using
users with fewer privileges.

Thus change the various APIs creating connections to workers to always
use usernames. That means:
1) MultiClientConnect() needs to, optionally, accept a username
2) GetOrEstablishConnection(), including the underlying cache, need to
   use the current user as part of the connection cache key. That way
   connections for separate users are distinct, and we always use one
   with the correct authorization.
3) The task tracker needs to keep track of the username associated with
   a task, so it can use it when establishing connections outside the
   originating session.
2016-04-27 10:00:08 -07:00
Onder Kalaci c763d7492c Apply final code review feedback
- Fix o(n^2) loop to o(n)
- Collapse two if statements into a single one
- Some coding conventions feedback
2016-04-27 10:36:03 +03:00
Onder Kalaci 876730ad73 Fix Merge Conflict
This commit fixes merge conflicts.
2016-04-26 11:18:47 +03:00
Onder Kalaci 16425e9054 Add fast shard pruning path for INSERTs on hash partitioned tables
This commit adds a fast shard pruning path for INSERTs on
hash-partitioned tables. The rationale behind this change is
that if there exists a sorted shard interval array, a single
index lookup on the array allows us to find the corresponding
shard interval. As mentioned above, we need a sorted
(wrt shardminvalue) shard interval array. Thus, this commit
updates shardIntervalArray to sortedShardIntervalArray in the
metadata cache. Then uses the low-level API that is defined in
multi_copy to handle the fast shard pruning.

The performance impact of this change is more apparent as more
shards exist for a distributed table. Previous implementation
was relying on linear search through the shard intervals. However,
this commit relies on constant lookup time on shard interval
array. Thus, the shard pruning becomes less dependent on the
shard count.
2016-04-26 11:16:00 +03:00
Brian Cloutier 1f5379457a Clear metadata_cache upon DROP EXTENSION
When we notice that pg_dist_partition is being invalidated we assume
that the citus extension is being dropped and drop state such as
extensionLoaded and the cached oids of all the metadata tables.

This frees the user from needing to reconnect after running DROP
EXTENSION, so we also no longer send a warning message.
2016-04-22 07:25:49 -07:00
Murat Tuncer 2bc96fabe5 Add dynamic executor selection
- non-router plannable queries can be executed
  by router executor if they satisfy the criteria
- router executor is removed from configuration,
  now task executor can not be set to router
- removed some tests that error out for router executor
2016-04-21 09:15:33 +03:00
Murat Tuncer 68cbf8a482 Add router plannable check and router planning logic
for single shard select queries
2016-04-21 09:15:33 +03:00
Brian Cloutier c6135fe0dc Support count(distinct) on hash partitioned tables
Also add test to ensure we get the same results when running
count(distinct) on range and hash partitioned tables.
2016-04-20 04:54:07 -07:00
eren 33b96dfb7f FIX Warning Message in multi_logical_optimizer.c
With #426, some new warning messages started to arise, because of
cross assignment of Node and Expr pointers. This change fixes the
warnings with type casts.
2016-04-20 11:33:29 +03:00
eren f77cff3fb6 Fix JOINs on varchar columns with subquery pushdown
Fixes #379

Varchar VAR struct is wrapped in RELABELTYPE struct inside PostgreSQL code and
IsPartitionColumnRecursive function considers only VAR types so returning false
for varchar.

This change adds strip_implicit_coercions() call to the columnExpression in
IsPartitionColumnRecursive function so that we get rid of implicit coercions like
RELABELTYPE are stripped to VAR.
2016-04-19 21:55:50 -06:00
eren e786cbed0f Fix Join Problem With VARCHAR Partition Columns
This change fixes the problem with joins with VARCHAR columns. Prior to
this change, when we tried to do large table joins on varchar columns, we got
an error of the form:
ERROR: cannot perform local joins that involve expressions
DETAIL: local joins can be performed between columns only.

This is because we have a check in CheckJoinBetweenColumns() which requires the
join clause to have only 'Var' nodes (i.e. columns). Postgres adds a relabel t
ype cast to cast the varchar to text; hence the type of the node is not T_Var
and the join fails.

The fix involves calling strip_implicit_coercions() to the left and right
arguments so that RELABELTYPE is stripped to VAR.

Fixes #76.
2016-04-19 21:55:50 -06:00
eren f53057c7dd Fix Shard Pruning Problem With Subqueries on VARCHAR Partition Columns
Fixes #375

Prior to this change, shard pruning couldn't be done if:
- Table is hash-distributed
- Partition column of is VARCHAR
- Query to be pruned is a subquery

There were two problems:
- A bug in left-side/right-side checks for the partition column
- We were not considering relabeled types (VARCHAR was relabeled as TEXT)
2016-04-19 21:55:50 -06:00
Metin Doslu 4e20753003 Add COPY support on master node for append partitioned relations 2016-04-19 21:57:59 +03:00
Andres Freund c53fcd8042 Remove wholly unused variable.
This avoids a -Wunused warning.
2016-04-19 12:31:13 -06:00
Andres Freund 926534bbc2 Annotate variables only used for asserts with PG_USED_FOR_ASSERTS_ONLY.
This avoids '-Wunused-but-set-variable' type warnings when compiling
without assertions, e.g. against a system postgres.
2016-04-19 12:31:12 -06:00
Jason Petersen 37d7f98f50 Add clarifying comment in HashableClauseMutator
While reading this code last week, it appeared as though there was no
place we ensured that the partition clause actually used equality ops.
As such, I was worried that we might transform a clause such as id < 5
into a constraint like hash(id) = hash(5) when doing shard pruning. The
relevant code seemed to just ensure:

  1. The node is an OpExpr
  2. With a related hash function
  3. It compares the partition column
  4. Against a constant

A superficial reading implied we didn't actually make sure the original
op was equality-related, but it turns out the hash lookup function DOES
ensure that for us. So I added a comment.
2016-04-19 12:21:11 -06:00
Murat Tuncer c19af52f9c Merge pull request #410 from citusdata/350-error-during-duplicate-index-creation
Error out earlier when creating an index with a name collision.
2016-04-19 07:26:31 +03:00
Brian Cloutier 301ffd64f2 Better error on "CREATE INDEX already_exists ..."
Previously (if you're creating the index with the same name on different
tables) we successfully ran the command on the workers before failing it
on the master and leaving no record of the index.

Now we check whether the index exists on the master before sending
commands to the workers.

--

Also make the error better when user attampts to create an index without
a name. Previously those statements returned:

brian=# create index on c (b);
WARNING:  could not receive query results from localhost:9700
DETAIL:  Client error: cannot extend name for null index name
ERROR:  could not execute DDL command on worker node shards

They now return

brian=# create index on c (b);
ERROR:  creating index without a name on a distributed table is
currently unsupported
2016-04-18 13:33:53 -07:00
Jason Petersen dbf4351de2 Merge branch 'master' into 422-incorrect-node-port-type 2016-04-15 12:30:50 -06:00
Brian Cloutier 356f6f6cd7 Treat nodePort as the 64bit integer it is 2016-04-15 11:29:36 -07:00