Commit Graph

30 Commits (aa0aea9840986104e25d3c1cbba557e94cbce2b3)

Author SHA1 Message Date
velioglu 72dfe4a289 Adds colocation check to local join 2018-04-04 22:49:27 +03:00
Dimitri Fontaine c9760fbb64 Fix CREATE INDEX with storage options on distributed tables.
By sharing the implementation of the function AppendOptionListToString on
three call sites, we would expand an extra OPTIONS keyword in a create index
statement, and omit other bits of the specific syntax here.

This patch introduces an AppendStorageParametersToString() function that is
very similar to AppendOptionListToString() but handles WITH(a="foo",...)
syntax that is used in reloptions (aka Storage Parameters).

Fixes #1747.
2018-01-17 21:56:40 +01:00
Dimitri Fontaine 952da72c55 Implement ALTER TABLE|INDEX ... SET|RESET ().
PostgreSQL implements support for several relation kinds in a single
statement, such as in the AlterTableStmt case, which supports both tables
and indexes and more (see ATExecSetRelOptions in PostgreSQL source code file
src/backend/commands/tablecmds.c for an example of that).

As a consequence, this patch implements support for setting and resetting
storage parameters on both relation kinds.
2018-01-17 21:56:40 +01:00
Dimitri Fontaine 17266e3301 Implement ALTER INDEX ... RENAME TO ...
The command is now distributed among the shards when the table is
distributed. To that effect, we fill in the DDLJob's targetRelationId with
the OID of the table for which the index is defined, rather than the OID of
the index itself.
2018-01-17 21:56:40 +01:00
Dimitri Fontaine e010238280 Implement ALTER TABLE ... RENAME TO ...
The implementation was already mostly in place, but the code was protected
by a principled check against the operation. Turns out there's a nasty
concurrency bug though with long identifier names, much as in #1664.

To prevent deadlocks from happening, we could either review the DDL
transaction management in shards and placements, or we can simply reject
names with (NAMEDATALEN - 1) chars or more — that's because of the
PostgreSQL array types being created with a one-char prefix: '_'.
2018-01-11 13:21:24 +01:00
Marco Slot a9933deac6 Make real time executor work in transactions 2017-11-30 09:59:32 +03:00
Marco Slot f4ceea5a3d Enable 2PC by default 2017-11-22 11:26:58 +01:00
mehmet furkan şahin 314fc09d90 regression test shard_count is changed from 32 to 4 2017-11-20 12:47:49 +03:00
mehmet furkan şahin 61ae33dc7f ALTER TABLE .. REPLICA IDENTITY support is implemented 2017-10-26 13:44:28 +03:00
Hadi Moshayedi e5fbcf37dd Add Savepoint Support (#1539)
This change adds support for SAVEPOINT, ROLLBACK TO SAVEPOINT, and RELEASE SAVEPOINT.

When transaction connections are not established yet, savepoints are kept in a stack and sent to the worker when the connection is later established. After establishing connections, savepoint commands are sent as they arrive.

This change fixes #1493 .
2017-08-15 13:02:28 -04:00
Hadi Moshayedi 8229a64fe8 Remove distributed tables' dependency on distribution key columns. (#1527)
This change removes distributed tables' dependency on distribution key columns. We already check that we cannot drop distribution key columns in ErrorIfUnsupportedAlterTableStmt() at multi_utility.c, so we don't need to have distributed table to distribution key column dependency to avoid dropping of distribution key column.

Furthermore, having this dependency causes some warnings in pg_dump --schema-only (See #866), which are not desirable.

This change also adds check to disallow drop of distribution keys when citus.enable_ddl_propagation is set to false. Regression tests are updated accordingly.
2017-08-03 10:07:04 -04:00
Jason Petersen 2204da19f0 Support PostgreSQL 10 (#1379)
Adds support for PostgreSQL 10 by copying in the requisite ruleutils
and updating all API usages to conform with changes in PostgreSQL 10.
Most changes are fairly minor but they are numerous. One particular
obstacle was the change in \d behavior in PostgreSQL 10's psql; I had
to add SQL implementations (views, mostly) to mimic the pre-10 output.
2017-06-26 02:35:46 -06:00
velioglu a1ea29ec2b Use placement connection to drop shards instead of node connection 2017-06-14 14:14:59 +03:00
Jason Petersen 5272c2c44b
Enable distributed ALTER TABLE ... RENAME COLUMN
Pretty straightforward. Had some concerns about locking, but due to the
fact that all distributed operations use either some level of deparsing
or need to enumerate column names, they all block during any concurrent
column renames (due to the AccessExclusive lock).

In addition, I had some misgivings about permitting renames of the dis-
tribution column, but nothing bad comes from just allowing them.

Finally, I tried to trigger any sort of error using prepared statements
and could not trigger any errors not also exhibited by plain PostgreSQL
tables.
2017-04-18 22:47:48 -06:00
Marco Slot f838c83809 Remove redundant pg_dist_jobid_seq restarts in tests 2017-04-18 11:42:32 +02:00
Metin Doslu 54a277ff01 Add disable/enable trigger all support 2017-03-29 22:00:14 +03:00
Andres Freund 52358fe891 Initial temp table removal implementation 2017-03-14 12:09:49 +02:00
Murat Tuncer 72027f2eba Remove default clause from shard DDL when sequences are used 2017-03-01 17:32:48 +03:00
Onder Kalaci 9f0bd4cb36 Reference Table Support - Phase 1
With this commit, we implemented some basic features of reference tables.

To start with, a reference table is
  * a distributed table whithout a distribution column defined on it
  * the distributed table is single sharded
  * and the shard is replicated to all nodes

Reference tables follows the same code-path with a single sharded
tables. Thus, broadcast JOINs are applicable to reference tables.
But, since the table is replicated to all nodes, table fetching is
not required any more.

Reference tables support the uniqueness constraints for any column.

Reference tables can be used in INSERT INTO .. SELECT queries with
the following rules:
  * If a reference table is in the SELECT part of the query, it is
    safe join with another reference table and/or hash partitioned
    tables.
  * If a reference table is in the INSERT part of the query, all
    other participating tables should be reference tables.

Reference tables follow the regular co-location structure. Since
all reference tables are single sharded and replicated to all nodes,
they are always co-located with each other.

Queries involving only reference tables always follows router planner
and executor.

Reference tables can have composite typed columns and there is no need
to create/define the necessary support functions.

All modification queries, master_* UDFs, EXPLAIN, DDLs, TRUNCATE,
sequences, transactions, COPY, schema support works on reference
tables as expected. Plus, all the pre-requisites associated with
distribution columns are dismissed.
2016-12-20 14:09:35 +02:00
Jason Petersen 74f4e0003b
Permit multiple DDL commands in a transaction
Three changes here to get to true multi-statement, multi-relation DDL
transactions (same functionality pre-5.2, with benefits of atomicity):

    1. Changed the multi-shard utility hook to always run (consistency
       with router executor hook, removes ad-hoc "installed" boolean)

    2. Change the global connection list in multi_shard_transaction to
       instead be a hash; update related functions to operate on global
       hash instead of local hash/global list

    3. Remove check within DDL code to prevent subsequent DDL commands;
       place unset/reset guard around call to ConnectToNode to permit
       connecting to additional nodes after DDL transaction has begun

In addition, code has been added to raise an error if a ROLLBACK TO
SAVEPOINT is attempted (similar to router executor), and comprehensive
tests execute all multi-DDL scenarios (full success, user ROLLBACK, any
actual errors (say, duplicate index), partial failure (duplicate index
on one node but not others), partial COMMIT (one node fails), and 2PC
partial PREPARE (one node fails)). Interleavings with other commands
(DML, \copy) are similarly all covered.
2016-09-08 22:35:55 -05:00
Jason Petersen 850c51947a
Re-permit DDL in transactions, selectively
Recent changes to DDL and transaction logic resulted in a "regression"
from the viewpoint of users. Previously, DDL commands were allowed in
multi-command transaction blocks, though they were not processed in any
actual transactional manner. We improved the atomicity of our DDL code,
but added a restriction that DDL commands themselves must not occur in
any BEGIN/END transaction block.

To give users back the original functionality (and improved atomicity)
we now keep track of whether a multi-command transaction has modified
data (DML) or schema (DDL). Interleaving the two modification types in
a single transaction is disallowed.

This first step simply permits a single DDL command in such a block,
admittedly an incomplete solution, but one which will permit us to add
full multi-DDL command support in a subsequent commit.
2016-08-30 20:37:19 -06:00
Eren Başak 0322916700
Lowercase \copy to match PostgreSQL's style for local/psql-level functions 2016-08-22 11:31:26 -06:00
Eren Basak b513f1c911
Replace \stage With \copy on Regression Tests
Fixes #547

This change removes all references to \stage in the regression tests
and puts \COPY instead. Doing so changed shard counts, min/max
values on some test tables (lineitem, orders, etc.).
2016-08-22 11:31:26 -06:00
Eren Başak bb3893d0d8 Set 1PC as the Default Commit Protocol for DDL Commands
Fixes #679

This change sets the default commit protocol for distributed DDL
commands to '1pc'. If the user issues a distributed DDL command with
this default setting, then once in a session, a NOTICE message is
shown about using '2pc' being extra safe.
2016-07-29 16:42:55 +03:00
Burak Yucesoy bdff72ed75 Fix ALTER TABLE SET SCHEMA
Fixes #132

We hook into ALTER ... SET SCHEMA and warn out if user tries to change schema of a
distributed table.

We also hook into ALTER TABLE ALL IN TABLE SPACE statements and warn out if citus has
been loaded.
2016-07-22 17:52:40 +03:00
Eren 3eaff48114 Propagate DDL Commands with 2PC
Fixes #513

This change modifies the DDL Propagation logic so that DDL queries
are propagated via 2-Phase Commit protocol. This way, failures during
the execution of distributed DDL commands will not leave the table in
an intermediate state and the pending prepared transactions can be
commited manually.

DDL commands are not allowed inside other transaction blocks or functions.

DDL commands are performed with 2PC regardless of the value of
`citus.multi_shard_commit_protocol` parameter.

The workflow of the successful case is this:
1. Open individual connections to all shard placements and send `BEGIN`
2. Send `SELECT worker_apply_shard_ddl_command(<shardId>, <DDL Command>)`
to all connections, one by one, in a serial manner.
3. Send `PREPARE TRANSCATION <transaction_id>` to all connections.
4. Sedn `COMMIT` to all connections.

Failure cases:
- If a worker problem occurs before sending of all DDL commands is finished, then
all changes are rolled back.
- If a worker problem occurs after all DDL commands are sent but not after
`PREPARE TRANSACTION` commands are finished, then all changes are rolled back.
However, if a worker node is failed, then the prepared transactions in that worker
should be rolled back manually.
- If a worker problem occurs during `COMMIT PREPARED` statements are being sent,
then the prepared transactions on the failed workers should be commited manually.
- If master fails before the first 'PREPARE TRANSACTION' is sent, then nothing is
changed on workers.
- If master fails during `PREPARE TRANSACTION` commands are being sent, then the
prepared transactions on workers should be rolled back manually.
- If master fails during `COMMIT PREPARED` or `ROLLBACK PREPARED` commands are being
sent, then the remaining prepared transactions on the workers should be handled manually.

This change also helps with #480, since failed DDL changes no longer mark
failed placements as inactive.
2016-07-19 10:44:11 +03:00
Eren 5512bb359a Set Explicit ShardId/JobId In Regression Tests
Fixes #271

This change sets ShardIds and JobIds for each test case. Before this change,
when a new test that somehow increments Job or Shard IDs is added, then
the tests after the new test should be updated.

ShardID and JobID sequences are set at the beginning of each file with the
following commands:

```
ALTER SEQUENCE pg_catalog.pg_dist_shardid_seq RESTART 290000;
ALTER SEQUENCE pg_catalog.pg_dist_jobid_seq RESTART 290000;
```

ShardIds and JobIds are multiples of 10000. Exceptions are:
- multi_large_shardid: shardid and jobid sequences are set to much larger values
- multi_fdw_large_shardid: same as above
- multi_join_pruning: Causes a race condition with multi_hash_pruning since
they are run in parallel.
2016-06-07 14:32:44 +03:00
Murat Tuncer 360e884de1 Add enable_ddl_propagation flag to control automatic ddl propagation 2016-06-06 13:42:46 +03:00
Andres Freund 53309461cb Improve DDL replication related regression tests.
The previous form of the test, utilizing DEBUG2, included too much
output dependent on the specifc system and version. Reformulate it to
explicitly connect to workers and show the schema there, when necessary.

The only remaining difference in some of the remaining alternate
regression test files was due to an older minor version release
change. Remove those as well.
2016-03-17 16:05:54 -07:00
Onder Kalaci 136306a1fe Initial commit of Citus 5.0 2016-02-11 04:05:32 +02:00