citus

Commit Graph

Author	SHA1	Message	Date
Andres Freund	3dae284bbe	Use the current session's username when connecting to worker nodes. So far we've always used libpq defaults when connecting to workers; bar special environment variables being set that'll always be the user that started the server. That's not desirable because it prevents using users with fewer privileges. Thus change the various APIs creating connections to workers to always use usernames. That means: 1) MultiClientConnect() needs to, optionally, accept a username 2) GetOrEstablishConnection(), including the underlying cache, need to use the current user as part of the connection cache key. That way connections for separate users are distinct, and we always use one with the correct authorization. 3) The task tracker needs to keep track of the username associated with a task, so it can use it when establishing connections outside the originating session.	2016-04-27 10:00:08 -07:00
Onder Kalaci	c763d7492c	Apply final code review feedback - Fix O(n^2) loop to O(n) - Collapse two if statements into a single one - Some coding conventions feedback	2016-04-27 10:36:03 +03:00
Onder Kalaci	876730ad73	Fix Merge Conflict This commit fixes merge conflicts.	2016-04-26 11:18:47 +03:00
Onder Kalaci	16425e9054	Add fast shard pruning path for INSERTs on hash partitioned tables This commit adds a fast shard pruning path for INSERTs on hash-partitioned tables. The rationale behind this change is that if there exists a sorted shard interval array, a single index lookup on the array allows us to find the corresponding shard interval. As mentioned above, we need a sorted (wrt shardminvalue) shard interval array. Thus, this commit updates shardIntervalArray to sortedShardIntervalArray in the metadata cache. It then uses the low-level API defined in multi_copy to handle the fast shard pruning. The performance impact of this change is more apparent as more shards exist for a distributed table. The previous implementation relied on a linear search through the shard intervals. This commit instead relies on a constant-time lookup on the shard interval array. Thus, the shard pruning becomes less dependent on the shard count.	2016-04-26 11:16:00 +03:00
Brian Cloutier	1f5379457a	Clear metadata_cache upon DROP EXTENSION When we notice that pg_dist_partition is being invalidated we assume that the citus extension is being dropped and drop state such as extensionLoaded and the cached oids of all the metadata tables. This frees the user from needing to reconnect after running DROP EXTENSION, so we also no longer send a warning message.	2016-04-22 07:25:49 -07:00
Murat Tuncer	2bc96fabe5	Add dynamic executor selection - non-router plannable queries can be executed by router executor if they satisfy the criteria - router executor is removed from configuration, now task executor can not be set to router - removed some tests that error out for router executor	2016-04-21 09:15:33 +03:00
Murat Tuncer	68cbf8a482	Add router plannable check and router planning logic for single shard select queries	2016-04-21 09:15:33 +03:00
Brian Cloutier	c6135fe0dc	Support count(distinct) on hash partitioned tables Also add test to ensure we get the same results when running count(distinct) on range and hash partitioned tables.	2016-04-20 04:54:07 -07:00
eren	33b96dfb7f	FIX Warning Message in multi_logical_optimizer.c With #426, some new warning messages started to arise, because of cross assignment of Node and Expr pointers. This change fixes the warnings with type casts.	2016-04-20 11:33:29 +03:00
eren	f77cff3fb6	Fix JOINs on varchar columns with subquery pushdown Fixes #379 Varchar VAR struct is wrapped in RELABELTYPE struct inside PostgreSQL code and IsPartitionColumnRecursive function considers only VAR types so returning false for varchar. This change adds strip_implicit_coercions() call to the columnExpression in IsPartitionColumnRecursive function so that implicit coercions like RELABELTYPE are stripped to VAR.	2016-04-19 21:55:50 -06:00
eren	e786cbed0f	Fix Join Problem With VARCHAR Partition Columns This change fixes the problem with joins with VARCHAR columns. Prior to this change, when we tried to do large table joins on varchar columns, we got an error of the form: ERROR: cannot perform local joins that involve expressions DETAIL: local joins can be performed between columns only. This is because we have a check in CheckJoinBetweenColumns() which requires the join clause to have only 'Var' nodes (i.e. columns). Postgres adds a relabel type cast to cast the varchar to text; hence the type of the node is not T_Var and the join fails. The fix involves calling strip_implicit_coercions() on the left and right arguments so that RELABELTYPE is stripped to VAR. Fixes #76.	2016-04-19 21:55:50 -06:00
eren	f53057c7dd	Fix Shard Pruning Problem With Subqueries on VARCHAR Partition Columns Fixes #375 Prior to this change, shard pruning couldn't be done if: - Table is hash-distributed - Partition column is VARCHAR - Query to be pruned is a subquery There were two problems: - A bug in left-side/right-side checks for the partition column - We were not considering relabeled types (VARCHAR was relabeled as TEXT)	2016-04-19 21:55:50 -06:00
Metin Doslu	4e20753003	Add COPY support on master node for append partitioned relations	2016-04-19 21:57:59 +03:00
Andres Freund	c53fcd8042	Remove wholly unused variable. This avoids a -Wunused warning.	2016-04-19 12:31:13 -06:00
Andres Freund	926534bbc2	Annotate variables only used for asserts with PG_USED_FOR_ASSERTS_ONLY. This avoids '-Wunused-but-set-variable' type warnings when compiling without assertions, e.g. against a system postgres.	2016-04-19 12:31:12 -06:00
Jason Petersen	37d7f98f50	Add clarifying comment in HashableClauseMutator While reading this code last week, it appeared as though there was no place we ensured that the partition clause actually used equality ops. As such, I was worried that we might transform a clause such as id < 5 into a constraint like hash(id) = hash(5) when doing shard pruning. The relevant code seemed to just ensure: 1. The node is an OpExpr 2. With a related hash function 3. It compares the partition column 4. Against a constant A superficial reading implied we didn't actually make sure the original op was equality-related, but it turns out the hash lookup function DOES ensure that for us. So I added a comment.	2016-04-19 12:21:11 -06:00
Murat Tuncer	c19af52f9c	Merge pull request #410 from citusdata/350-error-during-duplicate-index-creation Error out earlier when creating an index with a name collision.	2016-04-19 07:26:31 +03:00
Brian Cloutier	301ffd64f2	Better error on "CREATE INDEX already_exists ..." Previously (if you're creating the index with the same name on different tables) we successfully ran the command on the workers before failing it on the master and leaving no record of the index. Now we check whether the index exists on the master before sending commands to the workers. -- Also make the error better when user attempts to create an index without a name. Previously those statements returned: brian=# create index on c (b); WARNING: could not receive query results from localhost:9700 DETAIL: Client error: cannot extend name for null index name ERROR: could not execute DDL command on worker node shards They now return brian=# create index on c (b); ERROR: creating index without a name on a distributed table is currently unsupported	2016-04-18 13:33:53 -07:00
Brian Cloutier	356f6f6cd7	Treat nodePort as the 64bit integer it is	2016-04-15 11:29:36 -07:00
Matthew Seaman	17ba0de333	Regularize include paths for some postgresql headers. Addresses #411	2016-04-15 09:37:22 -07:00
Matthew Seaman	08752ecf41	Include appropriate headers for htons() and htonl().	2016-04-15 09:37:08 -07:00
Matthew Seaman	e47778d4df	Add sys/stat.h include to files using S_IRUSR and S_IWUSR macros.	2016-04-15 09:34:22 -07:00
eren	e98c35bb8a	Clarify Error Message Related to shared_preload_libraries Fixes #363 This change modifies the error message given when Citus is loaded by any means other than shared_preload_libraries. The explanation now notes that the shared_preload_libraries parameter is in postgresql.conf and that citus should be at the beginning.	2016-04-13 12:12:21 +03:00
eren	662f13a4d4	Fix SELECT problem with no target list Prior to this change, performing a SELECT query without a target list caused backend to crash. Sample Query: SELECT FROM github_events; (without any * before FROM) PostgreSQL: ``` -- (39599 rows) ``` Citus: ``` server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. !> ``` The problem was an unnecessary Assert on column list in SetRangeTblExtraData(citus_nodefuncs.c)	2016-04-13 11:08:14 +03:00
Metin Doslu	ce0721fdf8	Send COPY rows in binary format	2016-04-12 20:22:31 +02:00
Marco Slot	690252b222	Support for COPY FROM, based on pg_shard PR by Postgres Pro	2016-04-12 20:22:31 +02:00
Onder Kalaci	2eabf3fcfa	Allow all types of nodes in the WHERE clauses This change removes the whitelisting check on the WHERE clauses. Note that, before this change, citus was already allowing all types of nodes with the following format (i.e., wrap with a boolean test): * SELECT col FROM table WHERE (ANY EXPRESSION) is TRUE; Thus, this change is mostly useful for allowing the expressions in the WHERE clause directly and avoiding "unsupported clause type" errors.	2016-03-30 16:39:58 +03:00
eren	a4750e2e61	Fixes issue #313 Prior to this change, it was not possible to use UDFs in repartitioned subqueries. The reason is that we were setting the search path explicitly and omitting public schema from that path. This change adds the public schema to the explicitly set search path.	2016-03-30 15:39:12 +03:00
eren	3a9a01557f	Fix spurious NOTICE messages with ANY/ALL Fixes issue #258 Prior to this change, Citus gives a deceptive NOTICE message when a query including ANY or ALL on a non-partition column is issued on a hash partitioned table. Let the github_events table be hash-distributed on repo_id column. Then, issuing this query: SELECT count(*) FROM github_events WHERE event_id = ANY ('{1,2,3}') Gives this message: NOTICE: cannot use shard pruning with ANY (array expression) HINT: Consider rewriting the expression with OR clauses. Note that since event_id is not the partition column, shard pruning would not be applied in any case. However, the NOTICE message would be valid and be given if the ANY clause would have been applied on repo_id column. Reviewer: Murat Tuncer	2016-03-25 14:30:02 +02:00
Murat Tuncer	3f27c627ad	Allow users to create unique indexes Users can now create unique indexes on partition columns for hash and range distributed tables.	2016-03-24 09:31:06 +02:00
Jason Petersen	388968c761	Merge latest 5.0 release fixes	2016-03-23 17:43:34 -06:00
Jason Petersen	ade2d5bd77	Fix strlcpy off-by-one error WORKER_LENGTH + 1 is too large. Fixing this has no impact on the string that is ultimately copied, as it's impossible for the source string to be any larger to begin with.	2016-03-23 17:34:34 -06:00
Jason Petersen	a95c9da472	Update copyright dates Fixed configure variable and updated all end dates to 2016.	2016-03-23 17:14:37 -06:00
Andres Freund	eed14b1224	Copy toplevel queryId to citus' master statement. multi_ExecutorStart() replaces the original planned statement with the master select statement. As that hasn't gone through the parse analysis hooks, it'll not have an associated queryId. This prevents extensions such as pg_stat_statements from showing useful data associated with the query.	2016-03-14 17:27:52 -07:00
Jason Petersen	b73c3b1604	Fix various build issues I came across several places we weren't as flexible or resilient as we should have been in our build logic. They include: * Not using `DESTDIR` in the install-header destination * Allowing callers to specify `VPATH` or `srcdir` (which breaks) * Using absolute path for SCRIPTS (9.5 prepends srcdir) * Including libpq-int in a confusing way (extracted this function) * Having server includes come first during csql build (client must) In particular, I hit all of these attempting to build with pg_buildext in Debian. It passes in an explicit VPATH, as well as srcdir (breaking all recursive make invocations), and also uses DESTDIR during install. In addition, a PGDG-enabled Debian box will have the latest libpq-dev headers (e.g. 9.5) even when building against an older server version (e.g. 9.4). This leads to problems when including e.g. `c.h`, which is ambiguous. While compiling more client-side code (csql), we need to ensure the newer libpq headers are included _first_, so I fixed that.	2016-03-11 13:38:47 -07:00
Jason Petersen	0f62a5197b	Final formatting fixes	2016-02-17 17:20:14 -07:00
Andres Freund	0c57a1a04e	Make 'all' default src/backend/distributed target Otherwise typing 'make' will just build citusdb--5.0.sql, not particularly helpful.	2016-02-17 16:51:55 -07:00
Andres Freund	caed118e7f	Fix make install for VPATH builds. copy_to_distributed_table is in the source, not the build directory. As there might be scripts in either at some point, install scripts from both.	2016-02-17 16:48:06 -07:00
Marco Slot	58351fb128	Merge remote-tracking branch 'origin/master' into feature/drop_shards_on_drop_table	2016-02-17 22:52:58 +01:00
Marco Slot	9aa1f1e1e7	Rename topLevel variable to isTopLevel	2016-02-17 22:52:35 +01:00
Murat Tuncer	00b10e5a93	Merge from master branch into feature/citusdb-to-citus	2016-02-17 14:49:01 +02:00
Metin Doslu	87ff558c1c	Add check for count distinct on single table subqueries Fixes #314	2016-02-17 14:24:07 +02:00
Murat Tuncer	db8330ee81	Merge pull request #334 from citusdata/feature/append_table_to_shard Add support for appending to cstore table shards	2016-02-17 09:19:33 +02:00
Jason Petersen	27edf02484	Merge pull request #344 from citusdata/fix_shard_lock_acquisition#342 Ensure router executor acquires proper shard lock cr: @onderkalaci	2016-02-16 16:43:39 -07:00
Jason Petersen	130e65f5be	Ensure router executor acquires proper shard lock Though Citus' Task struct has a shardId field, it doesn't have the same semantics as the one previously used in pg_shard code. The analogous field in the Citus Task is anchorShardId. I've also added an argument check to the relevant locking function to catch future locking attempts which pass an invalid argument.	2016-02-16 11:20:18 -07:00
Marco Slot	37f580f9c7	Trim comment about invalidating dropped relations	2016-02-16 14:04:12 +01:00
Marco Slot	2af6797c04	Perform relcache invalidation in CitusInvalidateRelcacheByRelid	2016-02-16 12:59:38 +01:00
Murat Tuncer	44d7721b4c	Add support for appending to cstore table shards - Relaxed the check which prevented append operations on cstore tables since their storage type is not SHARD_STORAGE_TABLE. - Used process utility function to perform copy operation in worker_append_table_to_shard() instead of directly calling postgresql DoCopy(). - Removed the additional check in master_create_empty_shard() function. This check was redundant and erroneous since it was called after CheckDistributedTable() call. - Modified WorkerTableSize() function to retrieve cstore table shard size correctly.	2016-02-16 13:58:39 +02:00
Marco Slot	52f11223e5	Drop shards when a distributed table is dropped After this change, shards and associated metadata are automatically dropped when running DROP TABLE on a distributed table, which fixes #230. It also adds schema support for master_apply_delete_command, which fixes #73. Dropping the shards happens in the master_drop_all_shards UDF, which is called from the SQL_DROP trigger. Inside the trigger, the table is no longer visible and calling master_apply_delete_command directly wouldn't work and oid <-> name mappings are not available. The master_drop_all_shards function therefore takes the relation id, schema name, and table name as parameters, which can be obtained from pg_event_trigger_dropped_objects() in the SQL_DROP trigger. If the user calls master_drop_all_shards while the table still exists, the schema name and table name are ignored. Author: Marco Slot Reviewed-By: Andres Freund	2016-02-16 10:54:29 +01:00
Jason Petersen	20ba6c659e	Omit backend/copy.c-inspired parts from formatting I think we need to assess whether this function is still as in-sync with upstream as we believe, but for now I'm omitting it from formatting.	2016-02-15 23:29:33 -07:00
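
Note on 16425e9054 (fast shard pruning): the idea of replacing a linear scan with a single index computation on a sorted shard interval array can be sketched as below. This is only an illustration, not the Citus code: it assumes the int32 hash space is split into equally sized intervals, and every name in it (make_uniform_intervals, find_shard_index, find_shard_index_linear) is hypothetical.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1
HASH_TOKEN_COUNT = 2**32  # number of distinct int32 hash values


def make_uniform_intervals(shard_count):
    """Build (min, max) hash intervals sorted by min value.

    Assumption: shards split the int32 hash space evenly; the last
    interval absorbs any remainder so the whole space is covered.
    """
    increment = HASH_TOKEN_COUNT // shard_count
    intervals = []
    for i in range(shard_count):
        low = INT32_MIN + i * increment
        high = INT32_MAX if i == shard_count - 1 else low + increment - 1
        intervals.append((low, high))
    return intervals


def find_shard_index_linear(hashed_value, intervals):
    """Linear search: cost grows with the number of shards."""
    for index, (low, high) in enumerate(intervals):
        if low <= hashed_value <= high:
            return index
    raise ValueError("hash value outside all shard intervals")


def find_shard_index(hashed_value, shard_count):
    """Constant-time lookup: compute the array index arithmetically."""
    increment = HASH_TOKEN_COUNT // shard_count
    index = (hashed_value - INT32_MIN) // increment
    # Values in the remainder at the top of the space belong to the last shard.
    return min(index, shard_count - 1)
```

Both functions return the same index for any int32 hash value; only the second stays independent of the shard count, which is the property the commit message describes.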

62 Commits (3dae284bbeb49175cda97c9ee6a76afefef6281f)