citus

Commit Graph

Author	SHA1	Message	Date
Jason Petersen	b391abda3d	Replace verb 'stage' with 'load' in test comments "Staging table" will be the only valid use of 'stage' from now on, we will now say "load" when talking about data ingestion. If creation of shards is its own step, we'll just say "shard creation".	2016-08-22 13:24:18 -06:00
Jason Petersen	35e9f51348	Replace verb 'stage' with 'load' in schedules "Staging table" will be the only valid use of 'stage' from now on.	2016-08-22 11:48:41 -06:00
Eren Başak	0322916700	Lowercase \copy to match PostgreSQL's style for local/psql-level functions	2016-08-22 11:31:26 -06:00
Eren Basak	b513f1c911	Replace \stage With \copy on Regression Tests Fixes #547 This change removes all references to \stage in the regression tests and puts \COPY instead. Doing so changed shard counts, min/max values on some test tables (lineitem, orders, etc.).	2016-08-22 11:31:26 -06:00
Robin Thomas	010cbf16fc	Remove all usage of pg_dist_shard.shardalias in extension code. (#739) Remove regression test of non-null shardalias.	2016-08-19 17:06:22 +03:00
Jason Petersen	900f7590ab	Fix Travis local_first_candidate_nodes failures A recent change to the image used in Travis causes some problems for the code we use here to ensure the local replica is first. Since this code is essentially dead in a post-stage world anyhow, we're OK with ripping out the tests to placate Travis.	2016-08-14 23:12:10 -06:00
Murat Tuncer	3a49cf830e	Remove a router planner test for materialized view PostgreSQL 9.5.4 stopped calling planner for materialized view create command when NO DATA option is provided. This causes our test to behave differently between pre-9.5.4 and 9.5.4.	2016-08-14 22:57:09 -06:00
Metin Doslu	3ff1877108	Bump version numbers for 5.2 release	2016-08-01 13:48:24 -07:00
Marco Slot	9705cbcdf8	Rewrite WorkerShardStats to avoid invalid value bugs	2016-07-29 20:11:18 +02:00
Eren Başak	bb3893d0d8	Set 1PC as the Default Commit Protocol for DDL Commands Fixes #679 This change sets the default commit protocol for distributed DDL commands to '1pc'. If the user issues a distributed DDL command with this default setting, then once in a session, a NOTICE message is shown about using '2pc' being extra safe.	2016-07-29 16:42:55 +03:00
Jason Petersen	bedf53d566	Quick fix for possible segfault in PurgeConnection Now that connections can be acquired without going through the cache, we have to handle cases where functions assume the cache has been initialized.	2016-07-29 00:12:56 -06:00
Jason Petersen	abe7304898	Support SERIAL/BIGSERIAL non-partition columns This adds support for SERIAL/BIGSERIAL column types. Because we now can evaluate functions on the master (during execution), adding this is a matter of ensuring the table creation step works properly. To accomplish this, I've added some logic to detect sequences owned by a table (i.e. those related to its columns). Simply creating a sequence and using it in a default value is insufficient; users who do so must ensure the sequence is owned by the column using it. Fortunately, this is exactly what SERIAL and BIGSERIAL do, which is the use case we're targeting with this feature. While testing this, I found that worker_apply_shard_ddl_command actually adds shard identifiers to sequence names, though I found no places that use or test this path. I removed that code so that sequence names are not mutated and will match those used by a SERIAL default value expression. Our use of the new-to-9.5 CREATE SEQUENCE IF NOT EXISTS syntax means we are dropping support for 9.4 (which is being done regardless, but makes this change simpler). I've removed 9.4 from the Travis build matrix. Some edge cases are possible in ALTER SEQUENCE, COPY FROM (on workers), and CREATE SEQUENCE OWNED BY. I've added errors for each so that users understand when and why certain operations are prohibited.	2016-07-28 23:55:40 -06:00
Burak Yucesoy	6f20af9e38	Remove schema name parameter from API functions We remove schema name parameter from worker_fetch_foreign_file and worker_fetch_regular_table functions. We now send schema name concatenated with table name.	2016-07-28 20:41:05 +03:00
Burak Yucesoy	a649b47bac	Add old version(without schema name parameter) of api functions back Fixes #676 We added old versions (i.e. without schema name) of worker_apply_shard_ddl_command, worker_fetch_foreign_file and worker_fetch_regular_table back. During function call of one of these functions, we set schema name as public schema and call the newer version of the functions.	2016-07-28 20:40:38 +03:00
Murat Tuncer	cc33a450c4	Expand router planner coverage We can now support a richer set of queries in the router planner. This allows us to support CTEs, joins, window functions, and subqueries if they are known to be executed at a single worker with a single task (all tables are filtered down to a single shard and a single worker contains all table shards referenced in the query). Fixes : #501	2016-07-27 23:35:38 +03:00
Murat Tuncer	c20080992d	Remove PostgreSQL 9.4 support	2016-07-26 20:16:09 +03:00
Burak Yucesoy	c1a3478c3b	Remove warnings on schema creation Since now we support schema related operations, there is no need to warn user about schema usage.	2016-07-22 18:24:23 +03:00
Burak Yucesoy	bdff72ed75	Fix ALTER TABLE SET SCHEMA Fixes #132 We hook into ALTER ... SET SCHEMA and warn out if user tries to change schema of a distributed table. We also hook into ALTER TABLE ALL IN TABLESPACE statements and warn out if citus has been loaded.	2016-07-22 17:52:40 +03:00
Murat Tuncer	5d996a6891	Fix outer join crash when subquery is flattened	2016-07-22 17:01:19 +03:00
Burak Yucesoy	b58872b441	Fix worker_fetch_regular_table with schema Fixes #504 Fixes #646 We changed signature of worker_fetch_regular_table to accept schema name as parameter to make it work with schemas.	2016-07-22 00:44:02 -06:00
Jason Petersen	5d525fba24	Permit "single-shard" transactions Allows the use of modification commands (INSERT/UPDATE/DELETE) within transaction blocks (delimited by BEGIN and ROLLBACK/COMMIT), so long as all modifications hit a subset of nodes involved in the first such command in the transaction. This does not circumvent the requirement that each individual modification command must still target a single shard. For instance, after sending BEGIN, a user might INSERT some rows to a shard replicated on two nodes. Subsequent modifications can hit other shards, so long as they are on one or both of these nodes. SAVEPOINTs are supported, though if the user actually attempts to send a ROLLBACK command that specifies a SAVEPOINT they will receive an ERROR at the end of the topmost transaction. Placements are only marked inactive if at least one replica succeeds in a transaction where others fail. Non-atomic behavior is possible if the shard targeted by the initial modification within a transaction has a higher replication factor than another shard within the same block and a node with the latter shard has a failure during the COMMIT phase. Other methods of denoting transaction blocks (multi-statement commands sent all at once and functions written in e.g. PL/pgSQL or other such languages) are not presently supported; their treatment remains the same as before.	2016-07-21 15:57:22 -06:00
Burak Yucesoy	20debfc0ee	Fix COUNT DISTINCT approximation with schema Fixes #555 Before this change, we were resolving HLL function and type Oid without qualified name. Now we find the schema name where HLL objects are stored and generate qualified names for each object. Similar fix is also applied for cstore_table_size function call.	2016-07-21 17:29:18 +03:00
Burak Yucesoy	bca672e0a4	Fix master_apply_delete_command with schema Fixes #73	2016-07-21 15:09:20 +03:00
Burak Yucesoy	2f0158dde1	Change worker_apply_shard_ddl_command to accept schema name as parameter Fixes #565 Fixes #626 To add schema support to citus, we need to schema-prefix all table names, object names etc. in the queries sent to worker nodes. However, query deparsing is not available for most DDL commands, so it is not easy to generate the worker query on the master node. As a solution we are sending schema names along with shard id and query to run to worker nodes with worker_apply_shard_ddl_command. To not break the \STAGE command we pass the public schema as parameter while calling worker_apply_shard_ddl_command from there. This will not cause problems if the user uses \STAGE in a different schema, because the passed schema name is used only if no schema name is given in the query.	2016-07-21 14:17:26 +03:00
Metin Doslu	a811e09dd4	Add support for prepared statements with parameterized non-partition columns in router executor	2016-07-21 11:09:28 +03:00
Burak Yucesoy	a0e8f9eb64	Always schema-prefix worker queries Fixes #215 Fixes #267 Fixes #502 Fixes #556 Fixes #557 Fixes #560 Fixes #568 Fixes #623 Fixes #624 With this change we schema-prefix table names, operator names and composite types.	2016-07-20 10:42:24 +03:00
Eren	3eaff48114	Propagate DDL Commands with 2PC Fixes #513 This change modifies the DDL Propagation logic so that DDL queries are propagated via 2-Phase Commit protocol. This way, failures during the execution of distributed DDL commands will not leave the table in an intermediate state and the pending prepared transactions can be committed manually. DDL commands are not allowed inside other transaction blocks or functions. DDL commands are performed with 2PC regardless of the value of `citus.multi_shard_commit_protocol` parameter. The workflow of the successful case is this: 1. Open individual connections to all shard placements and send `BEGIN` 2. Send `SELECT worker_apply_shard_ddl_command(<shardId>, <DDL Command>)` to all connections, one by one, in a serial manner. 3. Send `PREPARE TRANSACTION <transaction_id>` to all connections. 4. Send `COMMIT PREPARED <transaction_id>` to all connections. Failure cases: - If a worker problem occurs before sending of all DDL commands is finished, then all changes are rolled back. - If a worker problem occurs after all DDL commands are sent but before `PREPARE TRANSACTION` commands are finished, then all changes are rolled back. However, if a worker node has failed, then the prepared transactions on that worker should be rolled back manually. - If a worker problem occurs while `COMMIT PREPARED` statements are being sent, then the prepared transactions on the failed workers should be committed manually. - If master fails before the first `PREPARE TRANSACTION` is sent, then nothing is changed on workers. - If master fails while `PREPARE TRANSACTION` commands are being sent, then the prepared transactions on workers should be rolled back manually. - If master fails while `COMMIT PREPARED` or `ROLLBACK PREPARED` commands are being sent, then the remaining prepared transactions on the workers should be handled manually. This change also helps with #480, since failed DDL changes no longer mark failed placements as inactive.	2016-07-19 10:44:11 +03:00
Murat Tuncer	4d992c8143	Make router planner use original query	2016-07-18 18:23:04 +03:00
Eren	5b54e28f93	Add LIMIT/OFFSET Support Fixes #394 This change adds LIMIT/OFFSET support for non router-plannable distributed queries. In cases that we can push the LIMIT down, we add the OFFSET value to that LIMIT in the worker queries. When a query with LIMIT x OFFSET y is issued, the query is propagated to the workers as LIMIT (x+y) OFFSET 0, and on the master table, the original LIMIT and OFFSET values are used. With this change, we can use OFFSET wherever we can use LIMIT.	2016-07-18 12:00:24 +03:00
Brian Cloutier	ae91768c96	Evaluate functions on the master - Enables using VOLATILE functions (like nextval()) in INSERT queries - Enables using STABLE functions (like now()) in targetLists and joinTrees UPDATE and INSERT can now contain non-immutable functions. INSERT can contain any kind of expression, while UPDATE can contain any STABLE function, so long as a Var is not passed into the STABLE function, even indirectly. UPDATE TargetEntry's can now also include Vars. There's an exception: CASE/COALESCE statements may not contain mutable functions. Function calls in master_modify_multiple_shards are also evaluated.	2016-07-13 11:45:51 -07:00
Burak Yucesoy	cab03a6274	Fix COPY produces error when using array of user-defined types Fixes #463 OID of user-defined types may be different in master and worker nodes. This causes errors while sending data between nodes in binary format, because the binary copy format adds the OID of the element if it is in an array. The code adding the OID is in PostgreSQL code, therefore we cannot change it. Instead we decided to use text format if we try to send an array of a user-defined type.	2016-07-13 11:12:24 +03:00
Jason Petersen	41ed433b0e	Remove hash-pruning logic for NULL values It turns out some tests exercised this behavior, but removing it should have no ill effects. Besides, both copy and INSERT disallow NULLs in a table's partition column. Fixes a bug where anti-joins on hash-partitioned distributed tables would incorrectly prune shards early, resulting in incorrect results (test included).	2016-07-06 17:04:21 -06:00
Andres Freund	4549e06884	Add regression tests for RETURNING.	2016-07-01 13:07:12 -07:00
Andres Freund	cccba66f24	Support RETURNING for modification commands. Fixes: #242	2016-07-01 13:07:12 -07:00
Andres Freund	d5ad8d7db9	Add tests verifying that updates return correct tuple counts. This unfortunately requires adding a new table, triggering renumbering of a number of shard ids.	2016-07-01 12:50:12 -07:00
Burak Yucesoy	78aaad2738	Fix master_append_table_to_shard to work with schemas Fixes #78 With this change, it is possible to append a table in any schema to shard. The function master_append_table_to_shard now supports schema names.	2016-06-17 04:35:00 +03:00
Andres Freund	38f4722f6f	Add tests for LEFT JOIN ON clauses preventing matches left/right.	2016-06-16 16:53:02 -07:00
Marco Slot	52bc209c37	Do not copy outer join clauses into WHERE	2016-06-16 16:42:32 -07:00
Murat Tuncer	9cb6a022a5	Reduce regression test runtime -Added 2 more schedules for task-tracker and multi-binary instead of running multi_schedule 3 times -set task-tracker-delay for each long running schedule	2016-06-15 16:35:07 +03:00
Burak Yucesoy	4a718d293b	Append shardId before escaping the table name Fixes #550, fixes #545 If table name contains special characters, it needs to be escaped. However in some cases, we escape table name before appending shardId, which causes syntax error in the queries sent to worker nodes. With this change we now append shardId before escaping table names.	2016-06-15 04:15:40 +03:00
Murat Tuncer	31df82ba7a	Remove variant files This checkin removes variant files we needed due to differences in outputs of pg94 and pg95 runs. However, variant file for test multi_upsert stays since this file tests for a feature that does not exist in pg94, and outputs are drastically different.	2016-06-13 12:12:06 +03:00
Murat Tuncer	bb3eee63e7	Refactor task tracker cleanup to enable workers to receive cleanup jobs Long sleep is replaced by multiple small sleeps. Maximum timeout is also increased since we do not have to wait for that long in most cases.	2016-06-09 17:03:54 +03:00
Murat Tuncer	0db413491c	Fix crash in count distinct with filters in repartition subqueries Now copies all column references in count distinct aggregate to worker target list and group by. Master target list is also updated to reflect changes in attribute order. Fixes 569	2016-06-09 11:47:24 +03:00
Burak Yücesoy	323f1151e0	Fix wrong storage type for foreign tables Fixes #496 Previously we did not check whether the table is foreign or not while creating empty shards, and set storage type to 't' (Standard table) or 'c' (Columnar table). Now if the table is a foreign table (but not a CStore foreign table) we set storage type to 'f' (Foreign table). If it is a CStore foreign table, we set its storage type to 'c', i.e. columnar table has priority over foreign table. Please note that 'c' is only used for CStore tables, not for other possible columnar stores at the moment. Possible improvement could be checking for other columnar stores, though I am not sure if there is a way to check it for all other columnar stores.	2016-06-08 04:12:01 +03:00
Jason Petersen	a19520b9bd	Add back test for INSERT where all placements fail Since we now short-circuit on certain remote errors, we want to ensure we preserve the old behavior of not modifying any placement states if a non-short-circuiting error occurs on all placements.	2016-06-07 13:21:23 -06:00
Jason Petersen	48f4e5d1a5	Make ReportRemoteError's CONTEXT style-compliant There's not a ton of documentation about what CONTEXT lines should look like, but this seems like the most dominant pattern. Similarly, users should expect lowercase, non-period strings.	2016-06-07 12:47:16 -06:00
Metin Doslu	7d0c90b398	Fail fast on constraint violations in router executor	2016-06-07 18:11:17 +03:00
Metin Doslu	15eed396b3	Update ereport format	2016-06-07 15:58:32 +03:00
Metin Doslu	28a16beba7	Update only shard length on statistics update for hash-partitioned Update only the shard length on master_update_shard_statistics() call for hash-partitioned tables. Fixes #519.	2016-06-07 15:04:29 +03:00
Eren	5512bb359a	Set Explicit ShardId/JobId In Regression Tests Fixes #271 This change sets ShardIds and JobIds for each test case. Before this change, when a new test that somehow increments Job or Shard IDs is added, then the tests after the new test should be updated. ShardID and JobID sequences are set at the beginning of each file with the following commands: ``` ALTER SEQUENCE pg_catalog.pg_dist_shardid_seq RESTART 290000; ALTER SEQUENCE pg_catalog.pg_dist_jobid_seq RESTART 290000; ``` ShardIds and JobIds are multiples of 10000. Exceptions are: - multi_large_shardid: shardid and jobid sequences are set to much larger values - multi_fdw_large_shardid: same as above - multi_join_pruning: Causes a race condition with multi_hash_pruning since they are run in parallel.	2016-06-07 14:32:44 +03:00
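
The LIMIT/OFFSET rewrite described in commit 5b54e28f93 can be illustrated with a small Python sketch. This is not Citus code: the function names `worker_query` and `master_query`, the in-memory lists standing in for shards, and the assumption that the query orders rows by their value are all hypothetical, chosen only to show why each worker must return `LIMIT (x+y) OFFSET 0` rows before the master applies the original `LIMIT x OFFSET y`.

```python
from heapq import merge


def worker_query(rows, limit, offset):
    """Hypothetical worker: evaluates ORDER BY value LIMIT (limit + offset) OFFSET 0."""
    # The worker cannot skip rows itself, because it does not know how many
    # rows from other shards sort ahead of its own rows.
    return sorted(rows)[: limit + offset]


def master_query(shards, limit, offset):
    """Hypothetical master: merges worker results, then applies the original LIMIT/OFFSET."""
    worker_results = [worker_query(rows, limit, offset) for rows in shards]
    merged = list(merge(*worker_results))  # each worker result is already sorted
    return merged[offset : offset + limit]


# Three shards of one distributed table (illustrative data only).
shards = [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
result = master_query(shards, limit=2, offset=1)

# Reference answer computed on the undistributed data.
expected = sorted(row for shard in shards for row in shard)[1:3]
```

Under these assumptions `result` equals `expected`. Pushing `OFFSET 1` down to the workers instead would drop the first row of every shard and give a different answer.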

103 Commits (b391abda3d6ea70a3a884665079813b4bddb777c)