citus

Commit Graph

Author	SHA1	Message	Date
Onder Kalaci	48f96bf3e5	Enable non equi joins in subquery pushdown Subquery pushdown planning is based on relation restriction equivalnce. This brings us the opportuneatly to allow any other joins as long as there is an already equi join between the distributed tables. We already allow that for joins with reference tables and this commit allows that for joins among distributed tables.	2017-11-23 16:13:46 +02:00
Onder Kalaci	16421f089f	Register citus custom scan nodes	2017-11-23 11:38:33 +02:00
Onder Kalaci	83c1143505	Refactor custom scan related codes In this commit, we don't change any codes, only create a new file and move the related functions and types there.	2017-11-23 11:38:12 +02:00
Marco Slot	20a526d5c4	Fix memory leak in ListToHashSet	2017-11-22 11:26:58 +01:00
Marco Slot	f4ceea5a3d	Enable 2PC by default	2017-11-22 11:26:58 +01:00
Marco Slot	8486f76e15	Auto-recover 2PC transactions	2017-11-22 11:26:58 +01:00
Marco Slot	6ba3f42d23	Rename MultiPlan to DistributedPlan	2017-11-22 09:36:24 +01:00
Marco Slot	0ad39b36fe	Treat immutable table functions and constant subqueries as reference tables	2017-11-21 14:15:22 +01:00
Onder Kalaci	d558ebb923	Relax the checks on ensuring distribution columns for target entries With this commit, we allow pushing down subqueries with only reference tables where GROUP BY or DISTINCT clause or Window functions include only columns from reference tables.	2017-11-21 12:28:14 +02:00
Andres Freund	d063658d6d	Protect some initializations from being called during backend startup. On EXEC_BACKEND builds these functions shouldn't be called at every backend start.	2017-11-20 15:29:51 -08:00
Brian Cloutier	d267e0f9fa	EXEC_BACKEND: don't put pointers to shared hashes into shared memory Store pointers to shared hashes in process-local variables. Previously pointers to shared hashes were put into shared memory. This causes problems on EXEC_BACKEND because everybody calls execve and receives a brand new address space; the shared hash will be in a different place for every backend. (normally we call fork, which gives you a copy of the address space, so these pointers remain constant)	2017-11-20 15:29:51 -08:00
Brian Cloutier	30a2365d81	Rename CreateDirectory to CitusCreateDirectory	2017-11-20 14:38:26 -08:00
Brian Cloutier	aa2ab023a2	Rename RemoveDirectory -> CitusRemoveDirectory	2017-11-20 14:21:52 -08:00
Brian Cloutier	06f756b0a1	Rename DeleteFile -> CitusDeleteFile	2017-11-20 13:30:11 -08:00
Marco Slot	9793218122	Do not commit already-committed prepared transactions in recovery	2017-11-20 13:18:48 +01:00
Marco Slot	ae47df01ea	Observe prepared xacts twice in RecoverWorkerTransactions to avoid race condition	2017-11-20 11:44:08 +01:00
Marco Slot	2410c2e450	Rewrite recover_prepared_transactions to be fast, non-blocking	2017-11-20 11:27:40 +01:00
Onder Kalaci	5bea95009b	Skip autovacuum processes for distributed deadlock detection Autovacuum process cancels itself if any modification starts on the table in order to avoid blocking your regular Postgres sessions. That's normal and expected. Thus, any locks held by autovacuum process cannot involve in a distributed deadlock since it'll be released if needed.	2017-11-15 14:32:16 +02:00
Onder Kalaci	c65c153a46	Skip speculative locks for distributed deadlock detection These locks are held for a very short duration time and cannot contribute to a deadlock. Speculative locks are used by Postgres for internal notification mechanism among transactions.	2017-11-15 12:43:45 +02:00
Marco Slot	bbbadd6d1b	Bump Citus version to 7.2devel	2017-11-15 10:32:49 +01:00
Marco Slot	d3b634b301	Allow generating placement IDs without using the sequence	2017-11-15 10:12:06 +01:00
Marco Slot	c24a0875a5	Allow generating shard IDs without using the sequence	2017-11-15 10:12:05 +01:00
Brian Cloutier	0f3230170f	Pull in INT32_MAXINT and INT32_MININT	2017-11-14 14:03:46 -08:00
Brian Cloutier	0db8277266	remove unused errno import	2017-11-14 13:09:34 -08:00
Brian Cloutier	5d9f3ae7fd	Remove unused poll import from multi_real_time_executor	2017-11-14 13:09:34 -08:00
Marco Slot	533a533565	Only drop sequences on workers with metadata	2017-11-14 16:01:56 +01:00
velioglu	be28ba8e70	Add stub UDF to run pg_upgrade flawlessly	2017-11-13 16:14:45 +02:00
metdos	111c04c2bd	Warn on CLUSTER command for distributed tables	2017-11-10 12:14:45 +02:00
Burak Yücesoy	863df0b874	Merge branch 'master' into fix_partitioning_in_schema	2017-11-09 12:49:35 +02:00
Burak Yucesoy	17229ed7bd	Fix attaching partition to a distributed table in schema While attaching a partition to a distributed table in schema, we mistakenly used unqualified name to find partitioned table's oid. This caused problems while using partitioned tables with schemas. We are fixing this issue in this PR.	2017-11-09 13:20:29 +03:00
Onder Kalaci	94921a2be1	Skip page-level locks on distributed deadlock detection Short-term share/exclusive page-level locks are used for read/write access. Locks are released immediately after each index row is fetched or inserted. Since those locks may not lead to any deadlocks, it's safe to ignore them in the distributed deadlock detection.	2017-11-09 10:37:23 +02:00
Marco Slot	f71728f634	Add GUC for specifying sslmode in connections to workers	2017-11-08 14:15:58 +01:00
Murat Tuncer	4e3d633ebf	Add check for connection failures during multishard update (#1765 )	2017-11-07 12:33:25 +02:00
Hadi Moshayedi	6d79d25101	Fix a relcache reference leak in stats collection. In DistributedTablesSize() we didn't close the relations that had replication factor > 2. This caused relcache reference leaks, and warning messages like following in logs: WARNING: relcache reference leak: relation "researchers" not closed	2017-11-06 23:16:43 -05:00
metdos	c83edc36b5	Check connection status before using it	2017-11-06 14:53:35 +02:00
Brian Cloutier	7be1545843	Support implicit casts during INSERT/SELECT It's possible to build INSERT SELECT queries which include implicit casts, currently we attempt to support these by adding explicit casts to the SELECT query, but this sometimes crashes because we don't update all nodes with the new types. (SortClauses, for instance) This commit removes those explicit casts and passes an unmodified SELECT query to the COPY executor (how we implement INSERT SELECT under the scenes). In lieu of those cases, COPY has been given some extra logic to inspect queries, notice that the types don't line up with the table it's supposed to be inserting into, and "manually" casting every tuple before sending them to workers.	2017-11-03 22:27:15 -07:00
Marco Slot	6883a09cdd	Allow distributed partitioned table creation in Cloud	2017-11-03 10:09:18 +01:00
Marco Slot	6219186683	Allow distributed INSERT...SELECT via worker nodes in MX	2017-11-02 14:38:39 +01:00
Hadi Moshayedi	7280774cf4	Use list_length() != 1 in SingleReplicatedTable(). ShardPlacementList's implementation can return NIL. In previous implementation we got a segmentation fault in this case. The relation can be dropped after getting distributed table list but before calling SingleReplicatedTable().	2017-10-31 21:51:43 -04:00
Hadi Moshayedi	7691991cb5	Do PG_TRY() inside a subtransaction block. If we don't propagate the errors we are catching in PG_CATCH(), database's internal state might not be clean. So we do PG_TRY() inside a subtransaction so we can rollback to it after catching errors.	2017-10-31 21:51:43 -04:00
Hadi Moshayedi	9bfbbf8a04	Make reports hostname configurable and enable stats collection in tests. This patch adds --with-reports-host configure option, which sets the REPORTS_BASE_URL constant. The default is reports.citusdata.com. It also enables stats collection in tests.	2017-10-31 21:51:43 -04:00
Hadi Moshayedi	acaf085a80	Add callback function for request by CollectBasicUsageStatistics(). Curl writes the received response to stdout if we don't specify a response callback or an output file. This can pollute the PostgreSQL log. In this change we add a callback function so the response messages aren't added to the log file.	2017-10-31 21:51:43 -04:00
Hadi Moshayedi	747e439601	Limit number of stats collection retries to once a day.	2017-10-31 21:51:43 -04:00
Hadi Moshayedi	78a2cd9052	Check for Citus updates. Sends a request to /v1/releases/latest?flavor=$CITUS_EDITION once a day, which returns a response similar to {"version": "7.1.0", "major": 7, "minor": 1, "patch": 0}. Then compares it with current Citus version, and if the latest release is newer, logs a LOG message.	2017-10-31 21:51:43 -04:00
Hadi Moshayedi	34f3ec0961	Call FlushDistTableCache() before stats collection.	2017-10-31 21:51:43 -04:00
Hadi Moshayedi	c18c6625d9	Lock relations before calling citus_table_size(). This is to make sure they don't get dropped.	2017-10-31 21:51:43 -04:00
Hadi Moshayedi	97d544b75c	Follow the patterns used in Deadlock Detection in Stats Collection. This includes: (1) Wrap everything inside a StartTransactionCommand()/CommitTransactionCommand(). This is so we can access the database. This also switches to a new memory context and releases it, so we don't have to do our own memory management. (2) LockCitusExtension() so the extension cannot be dropped or created concurrently. (3) Check CitusHasBeenLoaded() && CheckCitusVersion() before doing any work. (4) Do not PG_TRY() inside a loop.	2017-10-31 21:51:43 -04:00
Marco Slot	100aaeb3f5	Fix typo in distributed deadlock error message	2017-10-31 19:39:32 +01:00
metdos	8c356b2bc8	Don't try to add restrictions for reference tables in insert into select	2017-10-31 19:44:10 +02:00
mehmet furkan şahin	32fb19911c	Add Constraint %s Add Primary Key Using index %s support This commit makes a change in relay_event_utility.c to check if the Alter Table command adds a constraint using index. If this is the case, it appends the shard id to the index name.	2017-10-31 16:03:56 +03:00
Marco Slot	7e34348334	Add shard transfer mode parameter to shard copy functions	2017-10-31 13:30:48 +01:00
Marco Slot	2bb46bb5ee	Reset connectionReady flag after moving a connection in WaitForAllConnections	2017-10-31 12:06:53 +01:00
Marco Slot	e6e6897499	Defer initial PQflush to main loop in WaitForAllConnections	2017-10-31 12:06:53 +01:00
Marco Slot	d6dadb1b25	Use correct index for ModifyWaitEvent in WaitForAllConnections	2017-10-31 12:06:53 +01:00
Furkan Sahin	2b39c52f0b	Replica identity on create_distributed_table By this commit, citus minds the replica identity of the table when we distribute the table. So the shards of the distributed table have the same replica identity with the local table.	2017-10-31 13:08:36 +03:00
Marco Slot	7f68f78ee9	Omit public schema from shard_name output	2017-10-31 00:22:07 +01:00
Murat Tuncer	e16805215d	Support count(distinct) for non-partition columns (#1692 ) Expands count distinct coverage by allowing more cases. We used to support count distinct only if we can push down distinct aggregate to worker query i.e. the count distinct clause was on the partition column of the table, or there was a grouping on the partition column. Now we can support - non-partition columns, with or without grouping on partition column - partition, and non partition column in the same query - having clause - single table subqueries - insert into select queries - join queries where count distinct is on partition, or non-partition column - filters on count distinct clauses (extends existing support) We first try to push down aggregate to worker query (original case), if we can't then we modify worker query to return distinct columns to coordinator node. We do that by adding distinct column targets to group by clauses. Then we perform count distinct operation on the coordinator node. This work should reduce the cases where HLL is used as it can address anything that HLL can. However, if we start having performance issues due to very large number rows, then we can recommend hll use.	2017-10-30 13:12:24 +02:00
Marco Slot	be46661bf7	Block only 2PCs instead of all writes in citus_create_restore_point	2017-10-27 00:07:32 +02:00
mehmet furkan şahin	61ae33dc7f	ALTER TABLE .. REPLICA IDENTITY support is implemented	2017-10-26 13:44:28 +03:00
Brian Cloutier	4a17d12d74	Replace uint with uint32	2017-10-25 19:32:12 -07:00
velioglu	0b5db5d826	Support multi shard update/delete queries	2017-10-25 15:52:38 +03:00
Marco Slot	4bde83e1d2	Relay error message if DML fails on worker	2017-10-25 14:23:21 +02:00
Hadi Moshayedi	9a04b78980	Send server_id for statistics reports. (#1698 ) This change introduces the `pg_dist_node_metadata` which has a single jsonb value. When creating the extension, a random server id is generated and stored in there. Everything in the metadata table is added as a nested objected to the json payload that is sent to the reports server.	2017-10-18 21:20:32 -04:00
Hadi Moshayedi	86bcd93a4a	Don't collect stats when there is a version mismatch. (#1712 ) The following scenario can cause an Assert() crash if we don't do this: - Install Citus v7.0-15 - Restart server & run a query to start maintenanced. - Install Citus v7.1 - Restart server & run a query. This will tell user to upgrade. - Type "UPDATE EXTENSION c" & press tab. maintenanced will start and crash with Assert(CitusHasBeenLoaded() && CheckCitusVersion(WARNING)); This change checks Citus version before calling metadata functions so the crash doesn't happen.	2017-10-17 14:01:14 -04:00
Jason Petersen	8544878c4b	Add citus_version(), analogous to PG's version() This will provide the full project name (i.e. Citus/Citus Enterprise), and the host system, compiler, and architecture word size. I wanted to limit the number of copied files in 'config', so I added only config.guess and call it manually, rather than using the macro AC_CANONICAL_HOST, which requires several other files.	2017-10-16 18:09:29 -06:00
Brian Cloutier	91ff8cd2d5	{*,}create_distributed_table doesn't emit OID (#1710 )	2017-10-16 18:08:51 -06:00
Brian Cloutier	ebcb2b65e9	Add master_move_node function	2017-10-16 10:51:28 -07:00
Brian Cloutier	58cf15ceca	DistributedTableSize doesn't emit oid when erring out	2017-10-14 02:42:57 +03:00
Hadi Moshayedi	2aec6eda49	Properly use #ifdef HAVE_LIBCURL.	2017-10-13 12:04:36 -06:00
Jason Petersen	01353cb7cb	Use header define rather than -D flag Eclipse apparently doesn't scan build output looking for -D flags, so having the value actually appear in a header is nicer for those of us using IDEs.	2017-10-13 11:00:09 -04:00
Hadi Moshayedi	946659aebe	Delete StatsCollection memory context after we are done with stats reporting. Previously we left the memory context untouched, which overtime leaked memory.	2017-10-13 11:00:09 -04:00
Hadi Moshayedi	873fd1e7ff	Fix compiling --without-libcurl. Previously <curl/curl.h> was included even if compiled --without-libcurl. This can fail when libcurl headers are not there. This commit guards this include by checks for HAVE_LIBCURL.	2017-10-13 11:00:09 -04:00
Murat Tuncer	4832abc7cb	Make multi_master_planner.c coding convention compliant Changed order of function definitions and added declarations in the beginning of the file	2017-10-13 14:59:48 +03:00
Murat Tuncer	f7ab901766	Add select distinct, and distinct on support Distinct, and distinct on() clauses are supported in simple selects, joins, subqueries, and insert into select queries.	2017-10-13 14:59:48 +03:00
Hadi Moshayedi	6879f92e23	Fix out of bound memeory access when getting HTTP response code. (#1699 )	2017-10-12 12:51:42 -04:00
Hadi Moshayedi	a1387f4aa8	Basic usage statistics collection. (#1656 ) Adds ```citus.enable_statistics_collection``` GUC variable, which ```true``` by default, unless built without libcurl. If statistics collection is enabled, sends basic usage data to Citus servers every 24 hours. The data that is collected consists of: - Citus version - OS name & release - Hardware Id - Number of tables, rounded to next power of 2 - Size of data, rounded to next power of 2 - Number of workers	2017-10-11 09:55:15 -04:00
Onder Kalaci	498ac80d8b	Add window function support for SUBQUERY PUSHDOWN and INSERT INTO SELECT This commit provides the support for window functions in subquery and insert into select queries. Note that our support for window functions is still limited because it must have a partition by clause on the distribution key. This commit makes changes in the files insert_select_planner and multi_logical_planner. The required tests are also added with files multi_subquery_window_functions.out and multi_insert_select_window.out.	2017-10-04 15:33:07 +03:00
Marco Slot	9e516513fc	Use local group ID when querying for prepared transactions	2017-10-03 16:36:53 +02:00
Hadi Moshayedi	11adb9b034	Push down LIMIT and HAVING when grouped by partition key. (#1641 ) We can do this because all rows belonging to a group are in the same shard when grouping by distribution column on a range/hash distributed table.	2017-10-02 20:17:51 -04:00
Marco Slot	394918f9d0	Invalidate worker and group ID cache in maintenance daemon	2017-10-02 18:14:29 +02:00
Marco Slot	43d5e79eaa	Execute transmit commands as superuser during task-tracker queries	2017-09-28 15:27:25 +02:00
Marco Slot	306c58d59b	Check for absolute paths in COPY with format transmit	2017-09-28 15:27:11 +02:00
Marco Slot	cb6b0e820c	Allow read-only users to run task-tracker queries	2017-09-28 13:52:36 +02:00
Marco Slot	da6b42a3e2	Use unique constraint index for transaction record deletion	2017-09-28 12:04:56 +02:00
Onder Kalaci	68ca8cb7f0	Skip relation extension locks We should skip if the process blocked on the relation extension since those locks are hold for a short duration while the relation is actually extended on the disk and released as soon as the extension is done. Thus, recording such waits on our lock graphs could yield detecting wrong distributed deadlocks.	2017-09-28 10:09:09 +03:00
Murat Tuncer	4676c4f7a5	Prevent crash when remote transaction start fails (#1662 ) We sent multiple commands to worker when starting a transaction. Previously we only checked the result of the first command that is transaction 'BEGIN' which always succeeds. Any failure on following commands were not checked. With this commit, we make sure all command results are checked. If there is any error we report the first error found.	2017-09-26 17:25:46 -07:00
Jason Petersen	b4d53423fa	Add adapter functions for OpenFile changes	2017-09-25 17:20:24 -07:00
Jason Petersen	d686123dae	Omit now-public Explain methods from PG11 build This copy-pasted code is no longer needed in PG11.	2017-09-25 17:20:24 -07:00
Jason Petersen	89d02c6115	Add ruleutils file for PostgreSQL 11	2017-09-25 17:20:24 -07:00
Jason Petersen	bbc15e0598	Handle HASHPROC changes PostgreSQL 11 now has "standard" and "extended" (64-bit) versions of hash functions.	2017-09-25 17:20:24 -07:00
Jason Petersen	6c9b19a954	Add version-compat header For polyfill macros, etc.	2017-09-25 17:20:23 -07:00
Jason Petersen	fbeaa2f9d0	Remove direct access to tupleDesc->attrs A level of indirection was removed from this field for PostgreSQL 11. By using the handy provided macro, we can be version agnostic.	2017-09-25 17:20:23 -07:00
Jason Petersen	6a020b5adc	Update CopyGetAttnums with latest from PostgreSQL This function was recently modified to use the TupleDescAttr wrapper, which abstracts away recent changes to TupleDesc.	2017-09-25 17:20:23 -07:00
Andres Freund	78716e5546	Fix possible shard cache incoherency. When a table and it's shards are dropped, and afterwards the same shard identifiers are reused, e.g. due to a DROP & CREATE EXTENSION, the old entry in the shard cache and the required entry in the shard cache might be for different tables. Force invalidation for both old and new table to fix.	2017-09-25 13:05:09 -07:00
velioglu	0a56ed910b	Change error message of queries with distributed and local table Citus can handle INSERT INTO ... SELECT queries if the query inserts into local table by reading data from distributed table. The opposite way is not correct. With this commit we warn the user if the latter option is used.	2017-09-22 13:46:19 -07:00
Onder Kalaci	867224bdd7	Make the tests produce more consistent outputs	2017-09-22 20:38:56 +03:00
Onder Kalaci	4782f9f98a	Properly copy and trim the error messages that come from pg_conn When a NULL connection is provided to PQerrorMessage(), the returned error message is a static text. Modifying that static text, which doesn't necessarly be in a writeable memory, is dangreous and might cause a segfault.	2017-09-22 19:43:09 +03:00
Onder Kalaci	6736fd1682	Remove two obsolete functions Namely GetConnectionFromPGconn() and CloseConnectionByPGconn()	2017-09-21 00:36:23 -06:00
Onder Kalaci	33ec33c5b3	Ensure schema exists on reference table creation If the schema doesn't exists on the workers, create it.	2017-09-18 23:50:47 +03:00
Onder Kalaci	6116c8e93d	Allow pushing down GROUP BYs when at least there is one distribution column in the target list	2017-09-15 19:15:06 +03:00

1 2 3 4 5 ...

827 Commits (0d5a4b9c7247a8044f0ce26bc42e541cd84a3806)