citus

Commit Graph

Author	SHA1	Message	Date
Marco Slot	8971b3ed75	Short circuit in multi_ProcessUtility on ABORT/COMMIT	2017-01-25 11:57:00 +01:00
Marco Slot	5f61d8ea5a	Always skip foreign key validation when enable_ddl_propagation is off	2017-01-25 11:56:59 +01:00
Marco Slot	8adb9c3ec1	Use coordinator instead of schema node in terminology	2017-01-25 11:07:23 +01:00
Marco Slot	cff95c310e	Use placement connection API for multi-shard transactions	2017-01-23 18:34:50 +01:00
Andres Freund	a596858463	Remove connection_cache.[ch].	2017-01-21 09:01:15 -08:00
Andres Freund	0bdb22268f	Remove remnants of commit_protocol.[ch].	2017-01-21 09:01:15 -08:00
Onder Kalaci	0a05f12475	Use 2PC for reference table modification With this commit, we ensure that the router executor always uses 2PC for reference table modifications and never marks their placements as INVALID.	2017-01-04 12:46:35 +02:00
Eren Basak	da3ce88091	Error on Unsupported Features on Workers This change makes the metadata workers error out on unsupported commands.	2017-01-02 16:03:45 +03:00
Burak Yucesoy	4b3757c9d1	Error out on foreign keys with reference tables We have one replica of each reference table on every node. Therefore all problems with replication factor > 1 also apply to reference tables. As a solution we will not allow foreign keys on reference tables. It is not possible to define a foreign key from, to or between reference tables.	2016-12-28 10:58:26 +03:00
Eren Basak	9876e253b7	Propagate DDL commands to metadata workers for MX tables	2016-12-23 15:43:32 +03:00
Onder Kalaci	807fc1cc28	Reference Table Support - Phase 1 With this commit, we implemented some basic features of reference tables. To start with, a reference table is * a distributed table without a distribution column defined on it * the distributed table is single sharded * and the shard is replicated to all nodes Reference tables follow the same code path as single-sharded tables. Thus, broadcast JOINs are applicable to reference tables. But, since the table is replicated to all nodes, table fetching is not required any more. Reference tables support uniqueness constraints on any column. Reference tables can be used in INSERT INTO .. SELECT queries with the following rules: * If a reference table is in the SELECT part of the query, it is safe to join it with another reference table and/or hash partitioned tables. * If a reference table is in the INSERT part of the query, all other participating tables should be reference tables. Reference tables follow the regular co-location structure. Since all reference tables are single sharded and replicated to all nodes, they are always co-located with each other. Queries involving only reference tables always follow the router planner and executor. Reference tables can have composite typed columns and there is no need to create/define the necessary support functions. All modification queries, master_* UDFs, EXPLAIN, DDLs, TRUNCATE, sequences, transactions, COPY, and schema support work on reference tables as expected. Plus, all the pre-requisites associated with distribution columns are dismissed.	2016-12-20 14:09:35 +02:00
Jason Petersen	54c2efccc1	Add targeted VACUUM/ANALYZE support Adds support for VACUUM and ANALYZE commands which target a specific distributed table. After grabbing the appropriate locks, this implementation sends VACUUM commands to each placement (using one connection per placement). These commands are sent in parallel, so users with large tables will benefit from sharding. Except for VERBOSE, all VACUUM and ANALYZE options are supported, including the explicit column list used by ANALYZE. As with many of our utility commands, the local command also runs. In the VACUUM/ANALYZE case, the local command is executed before any remote propagation. Because error handling is managed after local processing, this can result in a VACUUM completing locally but erroring out when distributed processing commences: a minor technicality in all cases, as there isn't really much reason to ever roll back a VACUUM (an impossibility in any case, as VACUUM cannot run within a transaction). Remote propagation of targeted VACUUM/ANALYZE is controlled by the enable_ddl_propagation setting; warnings are emitted if such a command is attempted when DDL propagation is disabled. Unqualified VACUUM or ANALYZE is not handled, but a warning message informs the user of this. Implementation note: this commit adds a "BARE" value to MultiShardCommitProtocol. When active, no BEGIN command is ever sent to remote nodes, useful for commands such as VACUUM/ANALYZE which must not run in a transaction block. This value is not user-facing and is reset at transaction end.	2016-12-16 16:59:06 -07:00
Burak Yucesoy	cdb725a561	Add Foreign Key Support to ALTER TABLE commands With this PR, we add foreign key support to ALTER TABLE commands. For now, we only support foreign key constraint creation via an ALTER TABLE query if it is the only subcommand in the ALTER TABLE subcommand list. We also only allow foreign key creation if the replication factor is 1.	2016-12-08 15:03:25 +02:00
Sumedh Pathak	b9cbb52c06	Change DDL error message to say "unsupported" instead of "supported"	2016-11-26 10:30:09 +01:00
Samay Sharma	69bd7fc8de	Avoid error during CREATE INDEX IF NOT EXISTS Previously, we threw an error when we ran CREATE INDEX IF NOT EXISTS with an already existing index. This change enables expected behavior by checking if the statement has IF NOT EXISTS before throwing the error. We also ensure that we don't execute the command on the workers, if an index already exists on the master.	2016-11-01 14:51:19 -07:00
Brian Cloutier	762468f0e3	Copy raw_parse_tree before using it Address citusdata/citus#922. Fixes a segfault in PG's installcheck caused by our reuse of raw_parse_tree when handling EXPLAIN EXECUTE.	2016-10-27 18:25:49 +03:00
Brian Cloutier	c7e722e624	Fix segfault during EXPLAIN EXECUTE Fix citusdata/citus#886 The way postgres' explain hook is designed means that our hook is never called during EXPLAIN EXECUTE. So, we special-case EXPLAIN EXECUTE by catching it in the utility hook. We then replace the EXECUTE with the original query and pass it back to Citus.	2016-10-26 15:18:42 +03:00
Marco Slot	914100cfbe	Parallelise DDL commands	2016-10-24 12:39:08 +02:00
Marco Slot	06e790f420	Parallelise master_modify_multiple_shards	2016-10-19 08:33:08 +02:00
Murat Tuncer	4416b77088	Add support for truncate statement	2016-09-26 18:23:42 -06:00
Jason Petersen	b00b15a718	Permit multiple DDL commands in a transaction Three changes here to get to true multi-statement, multi-relation DDL transactions (same functionality pre-5.2, with benefits of atomicity): 1. Changed the multi-shard utility hook to always run (consistency with router executor hook, removes ad-hoc "installed" boolean) 2. Change the global connection list in multi_shard_transaction to instead be a hash; update related functions to operate on global hash instead of local hash/global list 3. Remove check within DDL code to prevent subsequent DDL commands; place unset/reset guard around call to ConnectToNode to permit connecting to additional nodes after DDL transaction has begun In addition, code has been added to raise an error if a ROLLBACK TO SAVEPOINT is attempted (similar to router executor), and comprehensive tests execute all multi-DDL scenarios (full success, user ROLLBACK, any actual errors (say, duplicate index), partial failure (duplicate index on one node but not others), partial COMMIT (one node fails), and 2PC partial PREPARE (one node fails)). Interleavings with other commands (DML, \copy) are similarly all covered.	2016-09-08 22:35:55 -05:00
Jason Petersen	c684ac11ec	Re-permit DDL in transactions, selectively Recent changes to DDL and transaction logic resulted in a "regression" from the viewpoint of users. Previously, DDL commands were allowed in multi-command transaction blocks, though they were not processed in any actual transactional manner. We improved the atomicity of our DDL code, but added a restriction that DDL commands themselves must not occur in any BEGIN/END transaction block. To give users back the original functionality (and improved atomicity) we now keep track of whether a multi-command transaction has modified data (DML) or schema (DDL). Interleaving the two modification types in a single transaction is disallowed. This first step simply permits a single DDL command in such a block, admittedly an incomplete solution, but one which will permit us to add full multi-DDL command support in a subsequent commit.	2016-08-30 20:37:19 -06:00
Eren Başak	bb4f4e25b5	Set 1PC as the Default Commit Protocol for DDL Commands Fixes #679 This change sets the default commit protocol for distributed DDL commands to '1pc'. If the user issues a distributed DDL command with this default setting, then once in a session, a NOTICE message is shown about using '2pc' being extra safe.	2016-07-29 16:42:55 +03:00
Jason Petersen	f19779b0ce	Support SERIAL/BIGSERIAL non-partition columns This adds support for SERIAL/BIGSERIAL column types. Because we now can evaluate functions on the master (during execution), adding this is a matter of ensuring the table creation step works properly. To accomplish this, I've added some logic to detect sequences owned by a table (i.e. those related to its columns). Simply creating a sequence and using it in a default value is insufficient; users who do so must ensure the sequence is owned by the column using it. Fortunately, this is exactly what SERIAL and BIGSERIAL do, which is the use case we're targeting with this feature. While testing this, I found that worker_apply_shard_ddl_command actually adds shard identifiers to sequence names, though I found no places that use or test this path. I removed that code so that sequence names are not mutated and will match those used by a SERIAL default value expression. Our use of the new-to-9.5 CREATE SEQUENCE IF NOT EXISTS syntax means we are dropping support for 9.4 (which is being done regardless, but makes this change simpler). I've removed 9.4 from the Travis build matrix. Some edge cases are possible in ALTER SEQUENCE, COPY FROM (on workers), and CREATE SEQUENCE OWNED BY. I've added errors for each so that users understand when and why certain operations are prohibited.	2016-07-28 23:55:40 -06:00
Burak Yucesoy	0a2c940ae5	Remove schema name parameter from API functions We remove the schema name parameter from the worker_fetch_foreign_file and worker_fetch_regular_table functions. We now send the schema name concatenated with the table name.	2016-07-28 20:41:05 +03:00
Eren Başak	6f91d30e1c	Remove AllFinalizedPlacementsAccessible Function This change removes the AllFinalizedPlacementsAccessible function since we open connections to all shard placements before any command is sent, so we immediately error out if a shard placement is not accessible.	2016-07-28 17:24:37 +03:00
Eren Başak	90a1d8f67a	Allow Cancellation During Distributed DDL Commands This change allows users to interrupt long running DDL commands. Interrupt requests are handled after each DDL command being propagated to a shard placement, which means that generally the cancel request will be processed right after the execution of the DDL is finished in the current placement.	2016-07-28 17:12:07 +03:00
Murat Tuncer	719e44d1f4	Remove PostgreSQL 9.4 support	2016-07-26 20:16:09 +03:00
Burak Yucesoy	770ecffc8f	Remove warnings on schema creation Since now we support schema related operations, there is no need to warn user about schema usage.	2016-07-22 18:24:23 +03:00
Burak Yucesoy	9df8300efa	Fix ALTER TABLE SET SCHEMA Fixes #132 We hook into ALTER ... SET SCHEMA and emit a warning if the user tries to change the schema of a distributed table. We also hook into ALTER TABLE ALL IN TABLESPACE statements and emit a warning if citus has been loaded.	2016-07-22 17:52:40 +03:00
Burak Yucesoy	444d4eb558	Fix worker_fetch_regular_table with schema Fixes #504 Fixes #646 We changed signature of worker_fetch_regular_table to accept schema name as parameter to make it work with schemas.	2016-07-22 00:44:02 -06:00
Burak Yucesoy	d0beacc4e1	Change worker_apply_shard_ddl_command to accept schema name as parameter Fixes #565 Fixes #626 To add schema support to citus, we need to schema-prefix all table names, object names etc. in the queries sent to worker nodes. However, query deparsing is not available for most DDL commands, so it is not easy to generate the worker query on the master node. As a solution we send schema names along with the shard id and the query to run to worker nodes with worker_apply_shard_ddl_command. To not break the \STAGE command we pass the public schema as the parameter when calling worker_apply_shard_ddl_command from there. This will not cause a problem if the user uses \STAGE in a different schema, because the passed schema name is used only if no schema name is given in the query.	2016-07-21 14:17:26 +03:00
Eren Başak	c559592da0	Fix Unused Parameter isTopLevel in ExecuteDistributedDDLCommand This change fixes the unused variable problem in `ExecuteDistributedDDLCommand` function (multi_utility.c). The parameter is meant to be used in PreventTransactionChain call.	2016-07-19 14:14:02 +03:00
Eren	692ef0964a	Propagate DDL Commands with 2PC Fixes #513 This change modifies the DDL Propagation logic so that DDL queries are propagated via 2-Phase Commit protocol. This way, failures during the execution of distributed DDL commands will not leave the table in an intermediate state and the pending prepared transactions can be committed manually. DDL commands are not allowed inside other transaction blocks or functions. DDL commands are performed with 2PC regardless of the value of `citus.multi_shard_commit_protocol` parameter. The workflow of the successful case is this: 1. Open individual connections to all shard placements and send `BEGIN` 2. Send `SELECT worker_apply_shard_ddl_command(<shardId>, <DDL Command>)` to all connections, one by one, in a serial manner. 3. Send `PREPARE TRANSACTION <transaction_id>` to all connections. 4. Send `COMMIT` to all connections. Failure cases: - If a worker problem occurs before sending of all DDL commands is finished, then all changes are rolled back. - If a worker problem occurs after all DDL commands are sent but before all `PREPARE TRANSACTION` commands are finished, then all changes are rolled back. However, if a worker node has failed, then the prepared transactions in that worker should be rolled back manually. - If a worker problem occurs while `COMMIT PREPARED` statements are being sent, then the prepared transactions on the failed workers should be committed manually. - If master fails before the first 'PREPARE TRANSACTION' is sent, then nothing is changed on workers. - If master fails while `PREPARE TRANSACTION` commands are being sent, then the prepared transactions on workers should be rolled back manually. - If master fails while `COMMIT PREPARED` or `ROLLBACK PREPARED` commands are being sent, then the remaining prepared transactions on the workers should be handled manually. This change also helps with #480, since failed DDL changes no longer mark failed placements as inactive.	2016-07-19 10:44:11 +03:00
Murat Tuncer	fcd4248f6a	Add enable_ddl_propagation flag to control automatic ddl propagation	2016-06-06 13:42:46 +03:00
Burak Yucesoy	31b0423f1f	Fix #469 This change renames one of the ReceiveRegularFile functions with more descriptive name.	2016-05-26 12:03:36 +03:00
Metin Doslu	fb6b6daf9d	Add COPY support on worker nodes for append partitioned relations Now, we can copy to an append-partitioned distributed relation from any worker node by providing master options such as: COPY relation_name FROM file_path WITH (delimiter '|', master_host 'localhost', master_port 5432); where master_port is optional and defaults to 5432.	2016-05-03 16:00:00 +03:00
Andres Freund	a9d7f62cad	Perform permission checks on operations re-implemented by citus. Currently that's just COPY FROM. There's other places where we could check for permissions earlier (to fail less verbosely), but since there's other pending changes in the whole DDL area, which is affected by this, I'm just adding a note to those places.	2016-04-27 10:28:36 -07:00
Andres Freund	63998786ba	Create new shards as owned by the distributed table's owner. That's important because ownership of relations implies special privileges. Without this change, a distributed table can be accessible by a table's owner, but a shard created by another user might not.	2016-04-27 10:28:33 -07:00
Andres Freund	c45b94e88a	Add ReplicateGrantStmt(). This is the basis for coordinating GRANT/REVOKE across nodes.	2016-04-27 10:28:25 -07:00
Andres Freund	99e983433f	Run some commands as superuser to allow normal users to execute queries. Some small parts of citus currently require superuser privileges; which is obviously not desirable for production scenarios. Run these small parts under superuser privileges (we use the extension owner) to avoid that. This does not yet coordinate grants between master and workers. Thus it allows to create shards, load data, and run queries as a non-superuser, but it is not easily possible to allow differentiated accesses to several users.	2016-04-27 10:28:22 -07:00
Brian Cloutier	1f5379457a	Clear metadata_cache upon DROP EXTENSION When we notice that pg_dist_partition is being invalidated we assume that the citus extension is being dropped and drop state such as extensionLoaded and the cached oids of all the metadata tables. This frees the user from needing to reconnect after running DROP EXTENSION, so we also no longer send a warning message.	2016-04-22 07:25:49 -07:00
Murat Tuncer	2bc96fabe5	Add dynamic executor selection - non-router plannable queries can be executed by router executor if they satisfy the criteria - router executor is removed from configuration, now task executor can not be set to router - removed some tests that error out for router executor	2016-04-21 09:15:33 +03:00
Brian Cloutier	301ffd64f2	Better error on "CREATE INDEX already_exists ..." Previously (if you're creating the index with the same name on different tables) we successfully ran the command on the workers before failing it on the master and leaving no record of the index. Now we check whether the index exists on the master before sending commands to the workers. -- Also make the error better when the user attempts to create an index without a name. Previously those statements returned: brian=# create index on c (b); WARNING: could not receive query results from localhost:9700 DETAIL: Client error: cannot extend name for null index name ERROR: could not execute DDL command on worker node shards They now return brian=# create index on c (b); ERROR: creating index without a name on a distributed table is currently unsupported	2016-04-18 13:33:53 -07:00
Metin Doslu	ce0721fdf8	Send COPY rows in binary format	2016-04-12 20:22:31 +02:00
Marco Slot	690252b222	Support for COPY FROM, based on pg_shard PR by Postgres Pro	2016-04-12 20:22:31 +02:00
Murat Tuncer	3f27c627ad	Allow users to create unique indexes Users can now create unique indexes on partition columns for hash and range distributed tables.	2016-03-24 09:31:06 +02:00
Jason Petersen	a95c9da472	Update copyright dates Fixed configure variable and updated all end dates to 2016.	2016-03-23 17:14:37 -06:00
Murat Tuncer	00b10e5a93	Merge from master branch into feature/citusdb-to-citus	2016-02-17 14:49:01 +02:00
Jason Petersen	1ea3f46194	Omit RangeVarCallbackForDropIndex from formatting I removed two braces to have this function remain more similar to the original PostgreSQL function and added uncrustify commands to disable formatting of its contents.	2016-02-15 23:29:33 -07:00


53 Commits (8971b3ed75c60aed6afaad0e8949c704a03b479d)