citus

Commit Graph

Author	SHA1	Message	Date
mehmet furkan şahin	61ae33dc7f	ALTER TABLE .. REPLICA IDENTITY support is implemented	2017-10-26 13:44:28 +03:00
Hadi Moshayedi	e5fbcf37dd	Add Savepoint Support (#1539) This change adds support for SAVEPOINT, ROLLBACK TO SAVEPOINT, and RELEASE SAVEPOINT. When transaction connections are not established yet, savepoints are kept in a stack and sent to the worker when the connection is later established. After establishing connections, savepoint commands are sent as they arrive. This change fixes #1493.	2017-08-15 13:02:28 -04:00
Burak Yucesoy	fddf9b3fcc	Add distributed partitioned table support distributed table creation With this PR, Citus starts to support all possible ways to create distributed partitioned tables. These are: - Distributing already created partitioning hierarchy - CREATE TABLE ... PARTITION OF a distributed_table - ALTER TABLE distributed_table ATTACH PARTITION non_distributed_table - ALTER TABLE distributed_table ATTACH PARTITION distributed_table We also support DETACHing partitions from partitioned tables and propagating TRUNCATE and DDL commands to distributed partitioned tables. This PR also refactors some parts of distributed table creation logic.	2017-08-09 10:01:35 +03:00
Hadi Moshayedi	8229a64fe8	Remove distributed tables' dependency on distribution key columns. (#1527) This change removes distributed tables' dependency on distribution key columns. We already check that we cannot drop distribution key columns in ErrorIfUnsupportedAlterTableStmt() at multi_utility.c, so we don't need to have distributed table to distribution key column dependency to avoid dropping of distribution key column. Furthermore, having this dependency causes some warnings in pg_dump --schema-only (See #866), which are not desirable. This change also adds a check to disallow drop of distribution keys when citus.enable_ddl_propagation is set to false. Regression tests are updated accordingly.	2017-08-03 10:07:04 -04:00
Jason Petersen	2204da19f0	Support PostgreSQL 10 (#1379) Adds support for PostgreSQL 10 by copying in the requisite ruleutils and updating all API usages to conform with changes in PostgreSQL 10. Most changes are fairly minor but they are numerous. One particular obstacle was the change in \d behavior in PostgreSQL 10's psql; I had to add SQL implementations (views, mostly) to mimic the pre-10 output.	2017-06-26 02:35:46 -06:00
velioglu	a1ea29ec2b	Use placement connection to drop shards instead of node connection	2017-06-14 14:14:59 +03:00
velioglu	24d24db25c	Implement ALTER TABLE ADD CONSTRAINT command	2017-04-20 15:02:33 +03:00
Jason Petersen	5272c2c44b	Enable distributed ALTER TABLE ... RENAME COLUMN Pretty straightforward. Had some concerns about locking, but due to the fact that all distributed operations use either some level of deparsing or need to enumerate column names, they all block during any concurrent column renames (due to the AccessExclusive lock). In addition, I had some misgivings about permitting renames of the distribution column, but nothing bad comes from just allowing them. Finally, I tried to trigger any sort of error using prepared statements and could not trigger any errors not also exhibited by plain PostgreSQL tables.	2017-04-18 22:47:48 -06:00
Marco Slot	f838c83809	Remove redundant pg_dist_jobid_seq restarts in tests	2017-04-18 11:42:32 +02:00
Metin Doslu	54a277ff01	Add disable/enable trigger all support	2017-03-29 22:00:14 +03:00
Jason Petersen	f181b24859	Move worker execution to after master, fix tests Some tests relied on worker errors though local commands were invalid. Fixed those by ensuring preconditions were met to have command work correctly. Otherwise most test changes are related to slight changes in local/remote error ordering.	2017-03-22 17:21:49 -06:00
Andres Freund	52358fe891	Initial temp table removal implementation	2017-03-14 12:09:49 +02:00
Murat Tuncer	72027f2eba	Remove default clause from shard DDL when sequences are used	2017-03-01 17:32:48 +03:00
Onder Kalaci	9f0bd4cb36	Reference Table Support - Phase 1 With this commit, we implemented some basic features of reference tables. To start with, a reference table is * a distributed table without a distribution column defined on it * the distributed table is single sharded * and the shard is replicated to all nodes Reference tables follow the same code-path as single sharded tables. Thus, broadcast JOINs are applicable to reference tables. But, since the table is replicated to all nodes, table fetching is not required any more. Reference tables support the uniqueness constraints for any column. Reference tables can be used in INSERT INTO .. SELECT queries with the following rules: * If a reference table is in the SELECT part of the query, it is safe to join with another reference table and/or hash partitioned tables. * If a reference table is in the INSERT part of the query, all other participating tables should be reference tables. Reference tables follow the regular co-location structure. Since all reference tables are single sharded and replicated to all nodes, they are always co-located with each other. Queries involving only reference tables always follow the router planner and executor. Reference tables can have composite typed columns and there is no need to create/define the necessary support functions. All modification queries, master_* UDFs, EXPLAIN, DDLs, TRUNCATE, sequences, transactions, COPY, schema support work on reference tables as expected. Plus, all the pre-requisites associated with distribution columns are dismissed.	2016-12-20 14:09:35 +02:00
Andres Freund	fa5e202403	Convert multi_shard_transaction.[ch] to new framework.	2016-12-12 15:18:12 -08:00
Burak Yucesoy	8d7cd4d746	Add Foreign Key Support to ALTER TABLE commands With this PR, we add foreign key support to ALTER TABLE commands. For now, we only support foreign constraint creation via ALTER TABLE query, if it is only subcommand in ALTER TABLE subcommand list. We also only allow foreign key creation if replication factor is 1.	2016-12-08 15:03:25 +02:00
Sumedh Pathak	0a0d4784b9	Change DDL error message to say "unsupported" instead of "supported"	2016-11-26 10:30:09 +01:00
Marco Slot	271b20a23e	Parallelise DDL commands	2016-10-24 12:39:08 +02:00
Jason Petersen	74f4e0003b	Permit multiple DDL commands in a transaction Three changes here to get to true multi-statement, multi-relation DDL transactions (same functionality pre-5.2, with benefits of atomicity): 1. Changed the multi-shard utility hook to always run (consistency with router executor hook, removes ad-hoc "installed" boolean) 2. Change the global connection list in multi_shard_transaction to instead be a hash; update related functions to operate on global hash instead of local hash/global list 3. Remove check within DDL code to prevent subsequent DDL commands; place unset/reset guard around call to ConnectToNode to permit connecting to additional nodes after DDL transaction has begun In addition, code has been added to raise an error if a ROLLBACK TO SAVEPOINT is attempted (similar to router executor), and comprehensive tests execute all multi-DDL scenarios (full success, user ROLLBACK, any actual errors (say, duplicate index), partial failure (duplicate index on one node but not others), partial COMMIT (one node fails), and 2PC partial PREPARE (one node fails)). Interleavings with other commands (DML, \copy) are similarly all covered.	2016-09-08 22:35:55 -05:00
Jason Petersen	850c51947a	Re-permit DDL in transactions, selectively Recent changes to DDL and transaction logic resulted in a "regression" from the viewpoint of users. Previously, DDL commands were allowed in multi-command transaction blocks, though they were not processed in any actual transactional manner. We improved the atomicity of our DDL code, but added a restriction that DDL commands themselves must not occur in any BEGIN/END transaction block. To give users back the original functionality (and improved atomicity) we now keep track of whether a multi-command transaction has modified data (DML) or schema (DDL). Interleaving the two modification types in a single transaction is disallowed. This first step simply permits a single DDL command in such a block, admittedly an incomplete solution, but one which will permit us to add full multi-DDL command support in a subsequent commit.	2016-08-30 20:37:19 -06:00
Eren Başak	0322916700	Lowercase \copy to match PostgreSQL's style for local/psql-level functions	2016-08-22 11:31:26 -06:00
Eren Basak	b513f1c911	Replace \stage With \copy on Regression Tests Fixes #547 This change removes all references to \stage in the regression tests and puts \COPY instead. Doing so changed shard counts, min/max values on some test tables (lineitem, orders, etc.).	2016-08-22 11:31:26 -06:00
Eren Başak	bb3893d0d8	Set 1PC as the Default Commit Protocol for DDL Commands Fixes #679 This change sets the default commit protocol for distributed DDL commands to '1pc'. If the user issues a distributed DDL command with this default setting, then once in a session, a NOTICE message is shown about using '2pc' being extra safe.	2016-07-29 16:42:55 +03:00
Burak Yucesoy	bdff72ed75	Fix ALTER TABLE SET SCHEMA Fixes #132 We hook into ALTER ... SET SCHEMA and warn out if user tries to change schema of a distributed table. We also hook into ALTER TABLE ALL IN TABLE SPACE statements and warn out if citus has been loaded.	2016-07-22 17:52:40 +03:00
Eren	3eaff48114	Propagate DDL Commands with 2PC Fixes #513 This change modifies the DDL Propagation logic so that DDL queries are propagated via 2-Phase Commit protocol. This way, failures during the execution of distributed DDL commands will not leave the table in an intermediate state and the pending prepared transactions can be committed manually. DDL commands are not allowed inside other transaction blocks or functions. DDL commands are performed with 2PC regardless of the value of `citus.multi_shard_commit_protocol` parameter. The workflow of the successful case is this: 1. Open individual connections to all shard placements and send `BEGIN` 2. Send `SELECT worker_apply_shard_ddl_command(<shardId>, <DDL Command>)` to all connections, one by one, in a serial manner. 3. Send `PREPARE TRANSACTION <transaction_id>` to all connections. 4. Send `COMMIT` to all connections. Failure cases: - If a worker problem occurs before sending of all DDL commands is finished, then all changes are rolled back. - If a worker problem occurs after all DDL commands are sent but not after `PREPARE TRANSACTION` commands are finished, then all changes are rolled back. However, if a worker node has failed, then the prepared transactions in that worker should be rolled back manually. - If a worker problem occurs while `COMMIT PREPARED` statements are being sent, then the prepared transactions on the failed workers should be committed manually. - If master fails before the first 'PREPARE TRANSACTION' is sent, then nothing is changed on workers. - If master fails while `PREPARE TRANSACTION` commands are being sent, then the prepared transactions on workers should be rolled back manually. - If master fails while `COMMIT PREPARED` or `ROLLBACK PREPARED` commands are being sent, then the remaining prepared transactions on the workers should be handled manually. This change also helps with #480, since failed DDL changes no longer mark failed placements as inactive.	2016-07-19 10:44:11 +03:00
Jason Petersen	48f4e5d1a5	Make ReportRemoteError's CONTEXT style-compliant There's not a ton of documentation about what CONTEXT lines should look like, but this seems like the most dominant pattern. Similarly, users should expect lowercase, non-period strings.	2016-06-07 12:47:16 -06:00
Metin Doslu	15eed396b3	Update ereport format	2016-06-07 15:58:32 +03:00
Eren	5512bb359a	Set Explicit ShardId/JobId In Regression Tests Fixes #271 This change sets ShardIds and JobIds for each test case. Before this change, when a new test that somehow increments Job or Shard IDs is added, then the tests after the new test should be updated. ShardID and JobID sequences are set at the beginning of each file with the following commands: ``` ALTER SEQUENCE pg_catalog.pg_dist_shardid_seq RESTART 290000; ALTER SEQUENCE pg_catalog.pg_dist_jobid_seq RESTART 290000; ``` ShardIds and JobIds are multiples of 10000. Exceptions are: - multi_large_shardid: shardid and jobid sequences are set to much larger values - multi_fdw_large_shardid: same as above - multi_join_pruning: Causes a race condition with multi_hash_pruning since they are run in parallel.	2016-06-07 14:32:44 +03:00
Murat Tuncer	360e884de1	Add enable_ddl_propagation flag to control automatic ddl propagation	2016-06-06 13:42:46 +03:00
Amos Bird	92788a0d9c	Remove redundant implementations of error funcs. This patch does some basic cleaning jobs. It removes duplicated implementations of ReportRemoteError() and related ones and adjusts regression tests.	2016-05-27 15:12:59 -06:00
Onder Kalaci	6c7abc2ba5	Add fast shard pruning path for INSERTs on hash partitioned tables This commit adds a fast shard pruning path for INSERTs on hash-partitioned tables. The rationale behind this change is that if there exists a sorted shard interval array, a single index lookup on the array allows us to find the corresponding shard interval. As mentioned above, we need a sorted (wrt shardminvalue) shard interval array. Thus, this commit updates shardIntervalArray to sortedShardIntervalArray in the metadata cache. Then uses the low-level API that is defined in multi_copy to handle the fast shard pruning. The performance impact of this change is more apparent as more shards exist for a distributed table. Previous implementation was relying on linear search through the shard intervals. However, this commit relies on constant lookup time on shard interval array. Thus, the shard pruning becomes less dependent on the shard count.	2016-04-26 11:16:00 +03:00
Andres Freund	53309461cb	Improve DDL replication related regression tests. The previous form of the test, utilizing DEBUG2, included too much output dependent on the specific system and version. Reformulate it to explicitly connect to workers and show the schema there, when necessary. The only remaining difference in some of the remaining alternate regression test files was due to an older minor version release change. Remove those as well.	2016-03-17 16:05:54 -07:00
Marco Slot	52f11223e5	Drop shards when a distributed table is dropped After this change, shards and associated metadata are automatically dropped when running DROP TABLE on a distributed table, which fixes #230. It also adds schema support for master_apply_delete_command, which fixes #73. Dropping the shards happens in the master_drop_all_shards UDF, which is called from the SQL_DROP trigger. Inside the trigger, the table is no longer visible and calling master_apply_delete_command directly wouldn't work and oid <-> name mappings are not available. The master_drop_all_shards function therefore takes the relation id, schema name, and table name as parameters, which can be obtained from pg_event_trigger_dropped_objects() in the SQL_DROP trigger. If the user calls master_drop_all_shards while the table still exists, the schema name and table name are ignored. Author: Marco Slot Reviewed-By: Andres Freund	2016-02-16 10:54:29 +01:00
Onder Kalaci	136306a1fe	Initial commit of Citus 5.0	2016-02-11 04:05:32 +02:00

34 Commits (61ae33dc7f43e0417129ebe89ba499d63ca6fd3a)
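
The successful-case workflow described in commit 3eaff48114 (Propagate DDL Commands with 2PC) can be sketched roughly as below. This is a minimal illustration, not the Citus implementation: `FakeConnection`, `propagate_ddl`, and the `citus_sketch` transaction-id prefix are hypothetical names, and the connections only record the SQL they would send. The final step uses `COMMIT PREPARED`, the statement named in the failure cases of that commit message, since that is how PostgreSQL finishes a prepared transaction.

```python
from dataclasses import dataclass, field


@dataclass
class FakeConnection:
    """Hypothetical stand-in for a connection to one shard placement.

    It does not talk to a database; it only records the SQL it is given.
    """
    shard_id: int
    sent: list = field(default_factory=list)

    def execute(self, sql: str) -> None:
        self.sent.append(sql)


def propagate_ddl(connections, ddl_command, prefix="citus_sketch"):
    """Send one DDL command to every placement in four serial phases."""
    escaped = ddl_command.replace("'", "''")  # escape quotes for the SQL literal

    # Step 1: open a transaction on every placement connection.
    for conn in connections:
        conn.execute("BEGIN")

    # Step 2: apply the DDL on each shard, one connection at a time.
    for conn in connections:
        conn.execute(
            f"SELECT worker_apply_shard_ddl_command({conn.shard_id}, '{escaped}')"
        )

    # Step 3: prepare every transaction (first phase of 2PC).
    for conn in connections:
        conn.execute(f"PREPARE TRANSACTION '{prefix}_{conn.shard_id}'")

    # Step 4: commit every prepared transaction (second phase of 2PC).
    for conn in connections:
        conn.execute(f"COMMIT PREPARED '{prefix}_{conn.shard_id}'")


if __name__ == "__main__":
    placements = [FakeConnection(290000), FakeConnection(290001)]
    propagate_ddl(placements, "ALTER TABLE lineitem ADD COLUMN note text")
    for statement in placements[0].sent:
        print(statement)
```

Each phase finishes on all connections before the next one starts, which is what lets a failure before step 3 completes be handled by rolling everything back, as the failure cases in the commit message describe.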