citus

Commit Graph

Author	SHA1	Message	Date
Hadi Moshayedi	9bfbbf8a04	Make reports hostname configurable and enable stats collection in tests. This patch adds --with-reports-host configure option, which sets the REPORTS_BASE_URL constant. The default is reports.citusdata.com. It also enables stats collection in tests.	2017-10-31 21:51:43 -04:00
Hadi Moshayedi	78a2cd9052	Check for Citus updates. Sends a request to /v1/releases/latest?flavor=$CITUS_EDITION once a day, which returns a response similar to {"version": "7.1.0", "major": 7, "minor": 1, "patch": 0}. Then compares it with current Citus version, and if the latest release is newer, logs a LOG message.	2017-10-31 21:51:43 -04:00
Hadi Moshayedi	34f3ec0961	Call FlushDistTableCache() before stats collection.	2017-10-31 21:51:43 -04:00
Hadi Moshayedi	97d544b75c	Follow the patterns used in Deadlock Detection in Stats Collection. This includes: (1) Wrap everything inside a StartTransactionCommand()/CommitTransactionCommand(). This is so we can access the database. This also switches to a new memory context and releases it, so we don't have to do our own memory management. (2) LockCitusExtension() so the extension cannot be dropped or created concurrently. (3) Check CitusHasBeenLoaded() && CheckCitusVersion() before doing any work. (4) Do not PG_TRY() inside a loop.	2017-10-31 21:51:43 -04:00
Furkan Sahin	2b39c52f0b	Replica identity on create_distributed_table By this commit, citus minds the replica identity of the table when we distribute the table. So the shards of the distributed table have the same replica identity with the local table.	2017-10-31 13:08:36 +03:00
velioglu	0b5db5d826	Support multi shard update/delete queries	2017-10-25 15:52:38 +03:00
Hadi Moshayedi	9a04b78980	Send server_id for statistics reports. (#1698 ) This change introduces the `pg_dist_node_metadata` which has a single jsonb value. When creating the extension, a random server id is generated and stored in there. Everything in the metadata table is added as a nested objected to the json payload that is sent to the reports server.	2017-10-18 21:20:32 -04:00
Jason Petersen	f2c593b25c	Add CITUS_NAME and CITUS_EDITION Unambiguous places to check whether we're running simply Citus or Citus Enterprise, or to check for 'community' or 'enterprise'.	2017-10-16 18:09:29 -06:00
Jason Petersen	8544878c4b	Add citus_version(), analogous to PG's version() This will provide the full project name (i.e. Citus/Citus Enterprise), and the host system, compiler, and architecture word size. I wanted to limit the number of copied files in 'config', so I added only config.guess and call it manually, rather than using the macro AC_CANONICAL_HOST, which requires several other files.	2017-10-16 18:09:29 -06:00
Brian Cloutier	ebcb2b65e9	Add master_move_node function	2017-10-16 10:51:28 -07:00
Hadi Moshayedi	2aec6eda49	Properly use #ifdef HAVE_LIBCURL.	2017-10-13 12:04:36 -06:00
Jason Petersen	01353cb7cb	Use header define rather than -D flag Eclipse apparently doesn't scan build output looking for -D flags, so having the value actually appear in a header is nicer for those of us using IDEs.	2017-10-13 11:00:09 -04:00
Murat Tuncer	f7ab901766	Add select distinct, and distinct on support Distinct, and distinct on() clauses are supported in simple selects, joins, subqueries, and insert into select queries.	2017-10-13 14:59:48 +03:00
Hadi Moshayedi	a1387f4aa8	Basic usage statistics collection. (#1656 ) Adds ```citus.enable_statistics_collection``` GUC variable, which ```true``` by default, unless built without libcurl. If statistics collection is enabled, sends basic usage data to Citus servers every 24 hours. The data that is collected consists of: - Citus version - OS name & release - Hardware Id - Number of tables, rounded to next power of 2 - Size of data, rounded to next power of 2 - Number of workers	2017-10-11 09:55:15 -04:00
Onder Kalaci	498ac80d8b	Add window function support for SUBQUERY PUSHDOWN and INSERT INTO SELECT This commit provides the support for window functions in subquery and insert into select queries. Note that our support for window functions is still limited because it must have a partition by clause on the distribution key. This commit makes changes in the files insert_select_planner and multi_logical_planner. The required tests are also added with files multi_subquery_window_functions.out and multi_insert_select_window.out.	2017-10-04 15:33:07 +03:00
Marco Slot	394918f9d0	Invalidate worker and group ID cache in maintenance daemon	2017-10-02 18:14:29 +02:00
Marco Slot	43d5e79eaa	Execute transmit commands as superuser during task-tracker queries	2017-09-28 15:27:25 +02:00
Marco Slot	da6b42a3e2	Use unique constraint index for transaction record deletion	2017-09-28 12:04:56 +02:00
Murat Tuncer	4676c4f7a5	Prevent crash when remote transaction start fails (#1662 ) We sent multiple commands to worker when starting a transaction. Previously we only checked the result of the first command that is transaction 'BEGIN' which always succeeds. Any failure on following commands were not checked. With this commit, we make sure all command results are checked. If there is any error we report the first error found.	2017-09-26 17:25:46 -07:00
Jason Petersen	b4d53423fa	Add adapter functions for OpenFile changes	2017-09-25 17:20:24 -07:00
Jason Petersen	bbc15e0598	Handle HASHPROC changes PostgreSQL 11 now has "standard" and "extended" (64-bit) versions of hash functions.	2017-09-25 17:20:24 -07:00
Jason Petersen	6c9b19a954	Add version-compat header For polyfill macros, etc.	2017-09-25 17:20:23 -07:00
velioglu	0a56ed910b	Change error message of queries with distributed and local table Citus can handle INSERT INTO ... SELECT queries if the query inserts into local table by reading data from distributed table. The opposite way is not correct. With this commit we warn the user if the latter option is used.	2017-09-22 13:46:19 -07:00
Onder Kalaci	4782f9f98a	Properly copy and trim the error messages that come from pg_conn When a NULL connection is provided to PQerrorMessage(), the returned error message is a static text. Modifying that static text, which doesn't necessarly be in a writeable memory, is dangreous and might cause a segfault.	2017-09-22 19:43:09 +03:00
Onder Kalaci	6736fd1682	Remove two obsolete functions Namely GetConnectionFromPGconn() and CloseConnectionByPGconn()	2017-09-21 00:36:23 -06:00
Marco Slot	0aadbb1760	Convert multi-row INSERT target list to Vars	2017-08-25 10:55:56 +02:00
Marco Slot	ae00795dab	Allow default columns in multi-row INSERTs	2017-08-25 10:55:56 +02:00
Marco Slot	c97692f382	Fix multi-row INSERT with RETURNING on reference tables	2017-08-24 10:42:12 +02:00
Marco Slot	4d7927b672	Execute multi-row INSERTs sequentially	2017-08-23 10:04:57 +02:00
Marco Slot	cf375d6a66	Consider dropped columns that precede the partition column in COPY	2017-08-22 13:02:35 +02:00
Onder Kalaci	6532b69873	Kill the maintenance daemon on DROP DATABASE	2017-08-18 16:03:08 +03:00
Marco Slot	7523753a73	Clear metadata OID cache prior to deadlock detection	2017-08-18 11:20:24 +02:00
Hadi Moshayedi	e5fbcf37dd	Add Savepoint Support (#1539 ) This change adds support for SAVEPOINT, ROLLBACK TO SAVEPOINT, and RELEASE SAVEPOINT. When transaction connections are not established yet, savepoints are kept in a stack and sent to the worker when the connection is later established. After establishing connections, savepoint commands are sent as they arrive. This change fixes #1493 .	2017-08-15 13:02:28 -04:00
Burak Yucesoy	dfdfb44ebf	Acquire shard resource locks on parent tables while operating on partitions	2017-08-14 14:44:30 +03:00
Burak Yucesoy	a321e750c0	Acquire relation locks on partitions while operation on parent table	2017-08-14 14:44:30 +03:00
Burak Yucesoy	52b9e35d50	Add relationIdList field to the Job struct	2017-08-14 14:06:22 +03:00
Onder Kalaci	5b48de7430	Improve deadlock detection for MX We added a new field to the transaction id that is set to true only for the transactions initialized on the coordinator. This is only useful for MX in order to distinguish the transaction that started the distributed transaction on the coordinator where we could have the same transactions' worker queries on the same node.	2017-08-12 13:28:37 +03:00
Onder Kalaci	e5d5bdff51	Enable distributed deadlock detection on the maintenance deamon With this commit, the maintenance deamon starts to check for distributed deadlocks. We also introduced a GUC variable (distributed_deadlock_detection_factor) whose value is multiplied with Postgres' deadlock_timeout. Setting it to -1 disables the distributed deadlock detection.	2017-08-12 13:28:37 +03:00
Onder Kalaci	a333c9f16c	Add infrastructure for distributed deadlock detection This commit adds all the necessary pieces to do the distributed deadlock detection. Each distributed transaction is already assigned with distributed transaction ids introduced with `3369f3486f`. The dependency among the distributed transactions are gathered with `80ea233ec1`. With this commit, we implement a DFS (depth first seach) on the dependency graph and search for cycles. Finding a cycle reveals a distributed deadlock. Once we find the deadlock, we examine the path that the cycle exists and cancel the youngest distributed transaction. Note that, we're not yet enabling the deadlock detection by default with this commit.	2017-08-12 13:28:37 +03:00
velioglu	b0efffae1c	Correct planner and add more tests	2017-08-11 10:16:13 +03:00
velioglu	ceba81ce35	Move physical planner checks to logical planner	2017-08-11 10:09:47 +03:00
velioglu	c4e3b8b5e1	Add planner changes and tests for subquery on reference tables	2017-08-11 10:09:47 +03:00
Marco Slot	0ae265c436	Add citus_create_restore_point for distributed snapshots	2017-08-11 07:36:20 +02:00
Marco Slot	fca986f214	Add API for waiting for multiple connections	2017-08-11 00:03:06 +02:00
Brian Cloutier	9d93fb5551	Create citus.use_secondary_nodes GUC This GUC has two settings, 'always' and 'never'. When it's set to 'never' all behavior stays exactly as it was prior to this commit. When it's set to 'always' only SELECT queries are allowed to run, and only secondary nodes are used when processing those queries. Add some helper functions: - WorkerNodeIsSecondary(), checks the noderole of the worker node - WorkerNodeIsReadable(), returns whether we're currently allowed to read from this node - ActiveReadableNodeList(), some functions (namely, the ones on the SELECT path) don't require working with Primary Nodes. They should call this function instead of ActivePrimaryNodeList(), because the latter will error out in contexts where we're not allowed to write to nodes. - ActiveReadableNodeCount(), like the above, replaces ActivePrimaryNodeCount(). - EnsureModificationsCanRun(), error out if we're not currently allowed to run queries which modify data. (Either we're in read-only mode or use_secondary_nodes is set) Some parts of the code were switched over to use readable nodes instead of primary nodes: - Deadlock detection - DistributedTableSize, - the router, real-time, and task tracker executors - ShardPlacement resolution	2017-08-10 17:37:17 +03:00
Brian Cloutier	3fc87a7a29	Metadata sync also syncs nodes in other clusters	2017-08-10 16:55:55 +03:00
Eren Başak	deb89cb9ce	Delete tesh_helper_functions.h	2017-08-10 14:00:44 +03:00
Eren Başak	3061737712	Define Some Utility Functions This change declares two new functions: `master_update_table_statistics` updates the statistics of shards belong to the given table as well as its colocated tables. `get_colocated_shard_array` returns the ids of colocated shards of a given shard.	2017-08-10 12:42:46 +03:00
Jason Petersen	6a35c2937c	Enable multi-row INSERTs This is a pretty substantial refactoring of the existing modify path within the router executor and planner. In particular, we now hunt for all VALUES range table entries in INSERT statements and group the rows contained therein by shard identifier. These rows are stashed away for later in "ModifyRoute" elements. During deparse, the appropriate RTE is extracted from the Query and its values list is replaced by these rows before any SQL is generated. In this way, we can create multiple Tasks, but only one per shard, to piecemeal execute a multi-row INSERT. The execution of jobs containing such tasks now exclusively go through the "multi-router executor" which was previously used for e.g. INSERT INTO ... SELECT. By piggybacking onto that executor, we participate in ongoing trans- actions, get rollback-ability, etc. In short order, the only remaining use of the "single modify" router executor will be for bare single- row INSERT statements (i.e. those not in a transaction). This change appropriately handles deferred pruning as well as master- evaluated functions.	2017-08-10 00:32:46 -07:00
Onder Kalaci	b5ea3ab6a3	Improve locking semantics for backend management We use the backend shared memory lock for preventing new backends to be part of a new distributed transaction or an existing backend to leave a distributed transaction while we're reading the all backends' data. The primary goal is to provide consistent view of the current distributed transactions while doing the deadlock detection.	2017-08-09 17:17:12 +03:00

1 2 3 4 5 ...

319 Commits (3cf1ef06c0044fbe3111972cf440d2ab320786f1)