Commit Graph

420 Commits (c1c8c38dc9ea7b4f052af1ed4d13cd75cb13a171)

Author SHA1 Message Date
Onder Kalaci 1c930c96a3 Support non-co-located joins between subqueries
With #1804 (and related PRs), Citus gained the ability to
plan subqueries that are not safe to pushdown.

There are two high-level requirements for pushing down subqueries:

   * Individual subqueries that require a merge step (i.e., GROUP BY
     on non-distribution key, or LIMIT in the subquery etc). We've
     handled such subqueries via #1876.

    * Combination of subqueries that are not joined on distribution keys.
      This commit aims to recursively plan some of such subqueries to make
      the whole query safe to pushdown.

The main logic behind non colocated subquery joins is that we pick
an anchor range table entry and check for distribution key equality
of any  other subqueries in the given query. If for a given subquery,
we cannot find distribution key equality with the anchor rte, we
recursively plan that subquery.

We also used a hacky solution for picking relations as the anchor range
table entries. The hack is that we wrap them into a subquery. This is only
necessary since some of the attribute equivalance checks are based on
queries rather than range table entries.
2018-02-26 13:50:37 +02:00
Onder Kalaci 7b57e0562a Add infrastructure for detecting non-colocated subqueries 2018-02-26 13:28:25 +02:00
Onder Kalaci e8aa532a90 Refactor checks for distribution key equality
Change some function names, ensure we stick to Citus'
function order rules etc.
2018-02-26 13:28:24 +02:00
Markus Sintonen 6202e80d06 Implemented jsonb_agg, json_agg, jsonb_object_agg, json_object_agg 2018-02-18 00:19:18 +02:00
velioglu 195ac948d2 Recursively plan subqueries in WHERE clause when FROM recurs 2018-02-13 19:52:12 +03:00
Marco Slot 6051aae56e Handle errors that are discovered during abort 2018-02-12 16:45:02 +01:00
Onder Kalaci 94c5ac6ebb Remove duplicate join restrictions
We use PostgreSQL hooks to accumulate the join restrictions
and PostgreSQL gives us all the join paths it tries while
deciding on the join order. Thus, for queries that have many
joins, this function is likely to remove lots of duplicate join
restrictions. This becomes relevant for Citus on query pushdown
check peformance.
2018-02-12 18:35:05 +02:00
Onder Kalaci c228d8ff3d Refactor equivalance generation related codes
This commit changes the APIs for restriction generation to make future
changes simpler.
2018-02-12 18:35:04 +02:00
Onder Kalaci 2f2d350924 Refactor relation restriction related codes
This commit moves some of the functions to a more relevant
source file.
2018-02-12 18:35:04 +02:00
Marco Slot 6f7c3bd73b Skip JSON validation on coordinator during COPY 2018-02-02 15:33:27 +01:00
Brian Cloutier 15511f6ba1 Dynamically allocate connection metadata in WaitForAllConnections 2018-02-01 10:30:41 -08:00
Brian Cloutier a2ed45e206 Remove variable length arrays
VLAs aren't supported by Visual Studio.

- Remove all existing instances of VLAs.
- Add a flag, -Werror=vla, which makes gcc refuse to compile if we add
  VLAs in the future.
2018-02-01 10:30:41 -08:00
Brian Cloutier 76d1edc3fd
Don't rely on gcc-specific features (#1963)
* Don't use expressions inside compound statements
* Don't depend on __builtin_constant_p
* Remove reliance on S_ISLNK
* Replace use of __func__: older mcvs doesn't support this builtin
2018-01-23 17:03:29 -08:00
Marco Slot 3fd65cb91b Do not raise errors in the real-time executor (#1903) 2018-01-01 22:26:31 -05:00
Marco Slot 09c09f650f Recursively plan set operations when leaf nodes recur 2017-12-26 13:46:55 +02:00
mehmet furkan şahin fd546cf322 Intermediate result size limitation
This commit introduces a new GUC to limit the intermediate
result size which we handle when we use read_intermediate_result
function for CTEs and complex subqueries.
2017-12-21 14:26:56 +03:00
Onder Kalaci 0d5a4b9c72 Recursively plan subqueries that are not safe to pushdown
With this commit, Citus recursively plans subqueries that
are not safe to pushdown, in other words, requires a merge
step.

The algorithm is simple: Recursively traverse the query from bottom
up (i.e., bottom meaning the leaf queries). On each level, check
whether the query is safe to pushdown (or a single repartition
subquery). If the answer is yes, do not touch that subquery. If the
answer is no, plan the subquery seperately (i.e., create a subPlan
for it) and replace the subquery with a call to
`read_intermediate_results(planId, subPlanId)`. During the the
execution, run the subPlans first, and make them avaliable to the
next query executions.

Some of the queries hat this change allows us:

   * Subqueries with LIMIT
   * Subqueries with GROUP BY/DISTINCT on non-partition keys
   * Subqueries involving re-partition joins, router queries
   * Mixed usage of subqueries and CTEs (i.e., use CTEs in
     subqueries as well). Nested subqueries as long as we
     support the subquery inside the nested subquery.
   * Subqueries with local tables (i.e., those subqueries
     has the limitation that they have to be leaf subqueries)

   * VIEWs on the distributed tables just works (i.e., the
     limitations mentioned below still applies to views)

Some of the queries that is still NOT supported:

  * Corrolated subqueries that are not safe to pushdown
  * Window function on non-partition keys
  * Recursively planned subqueries or CTEs on the outer
    side of an outer join
  * Only recursively planned subqueries and CTEs in the FROM
    (i.e., not any distributed tables in the FROM) and subqueries
    in WHERE clause
  * Subquery joins that are not on the partition columns (i.e., each
    subquery is individually joined on partition keys but not the upper
    level subquery.)
  * Any limitation that logical planner applies such as aggregate
    distincts (except for count) when GROUP BY is on non-partition key,
    or array_agg with ORDER BY
2017-12-21 08:37:40 +02:00
Onder Kalaci e12ea914b9 Refactor ErrorIfQueryNotSupported to defer errors 2017-12-20 09:03:49 +02:00
Onder Kalaci 71ce42b936 Refactor RecursivelyPlanSubqueriesAndCTEs() to make it ready
to work with subqueries
2017-12-20 09:03:47 +02:00
Marco Slot 7dab078e67 Set cost estimates for read_intermediate_result 2017-12-18 16:23:44 +01:00
Marco Slot a64f0060ba Reduce the frequency of FinishConnectionIO calls during COPY (#1864) 2017-12-14 13:21:59 -05:00
Marco Slot 2e2b4e81fa Add support for CTEs in distributed queries 2017-12-14 09:32:55 +01:00
Marco Slot cbbd418af2 Add citus.copy_format OIDs to metadata cache 2017-12-14 09:32:55 +01:00
Marco Slot 66f9f1d6cd Make some intermediate results functions public 2017-12-14 09:32:55 +01:00
Marco Slot 36ee21c323 Make CanUseBinaryCopyFormatForType public 2017-12-14 09:32:55 +01:00
Marco Slot 7d1191954d Add DistributedSubPlan node 2017-12-14 09:32:55 +01:00
mehmet furkan şahin 3c941aedf1 adds citus.enable_repartition_joins GUC
The new GUC allows Citus to switch between task executors
when necessary
2017-12-11 09:36:37 +03:00
Marco Slot 60a1e31671 Allow queries with local tables in NeedsDistributedPlanning 2017-12-07 16:20:23 +01:00
Marco Slot f8550b8c85 Fix issues with read_intermediate_result signature 2017-12-07 13:47:56 +01:00
Marco Slot d8fea4efb8 Revert "Allow queries with local tables in NeedsDistributedPlanning"
This reverts commit d2bac081e8.
2017-12-07 11:19:11 +01:00
Marco Slot d2bac081e8 Allow queries with local tables in NeedsDistributedPlanning 2017-12-07 11:02:16 +01:00
Marco Slot 7279d42849 Treat read_intermediate_result as recurring tuples 2017-12-04 14:50:11 +01:00
Marco Slot 4cdadfcab6 Add intermediate results infrastructure 2017-12-04 14:50:11 +01:00
Marco Slot bfcc76df69 Make several COPY-related functions public 2017-12-04 13:12:03 +01:00
Marco Slot 73989b07eb Refactor query execution functions 2017-12-04 13:12:03 +01:00
Murat Tuncer 2d66bf5f16
Fix hard coded formatting strings for 64 bit numbers (#1831)
Postgres provides OS agnosting formatting macros for
formatting 64 bit numbers. Replaced %ld %lu with
INT64_FORMAT and UINT64_FORMAT respectively.

Also found some incorrect usages of formatting
flags and fixed them.
2017-12-04 14:11:06 +03:00
Marco Slot d6dd0b3a81 Send BEGIN in the real-time executor when in a transaction 2017-11-30 12:59:09 +01:00
Marco Slot 3a4d5f8182 Remove filter checks on leaf queries 2017-11-30 12:25:14 +01:00
Marco Slot 3f03cb6a6a Support UNION with joins in the subqueries 2017-11-30 10:37:56 +01:00
Marco Slot a9933deac6 Make real time executor work in transactions 2017-11-30 09:59:32 +03:00
Onder Kalaci 05fb0dd020 Add infrastructure for filtering restriction contexts based on the input query
In subquery pushdown, we first ensure that each relation is joined with at least
on another relation on the partition keys. That's fine given that the decision
is binary: pushdown the query at all or not.

With recursive planning, we'd want to check whether any specific part
of the query can be pushded down or not. Thus, we need the ability to
understand which part(s) of the subquery is safe to pushdown. This commit
adds the infrastructure for doing that.
2017-11-28 09:58:21 +02:00
Onder Kalaci 16421f089f Register citus custom scan nodes 2017-11-23 11:38:33 +02:00
Onder Kalaci 83c1143505 Refactor custom scan related codes
In this commit, we don't change any codes, only create a new
file and move the related functions and types there.
2017-11-23 11:38:12 +02:00
Marco Slot 8486f76e15 Auto-recover 2PC transactions 2017-11-22 11:26:58 +01:00
Marco Slot 6ba3f42d23 Rename MultiPlan to DistributedPlan 2017-11-22 09:36:24 +01:00
Marco Slot 0ad39b36fe Treat immutable table functions and constant subqueries as reference tables 2017-11-21 14:15:22 +01:00
Brian Cloutier d267e0f9fa EXEC_BACKEND: don't put pointers to shared hashes into shared memory
Store pointers to shared hashes in process-local variables. Previously
pointers to shared hashes were put into shared memory. This causes
problems on EXEC_BACKEND because everybody calls execve and receives a
brand new address space; the shared hash will be in a different place
for every backend. (normally we call fork, which gives you a copy of the
address space, so these pointers remain constant)
2017-11-20 15:29:51 -08:00
Brian Cloutier 30a2365d81 Rename CreateDirectory to CitusCreateDirectory 2017-11-20 14:38:26 -08:00
Brian Cloutier aa2ab023a2 Rename RemoveDirectory -> CitusRemoveDirectory 2017-11-20 14:21:52 -08:00
Marco Slot 2410c2e450 Rewrite recover_prepared_transactions to be fast, non-blocking 2017-11-20 11:27:40 +01:00
Marco Slot d3b634b301 Allow generating placement IDs without using the sequence 2017-11-15 10:12:06 +01:00
Marco Slot c24a0875a5 Allow generating shard IDs without using the sequence 2017-11-15 10:12:05 +01:00
velioglu be28ba8e70 Add stub UDF to run pg_upgrade flawlessly 2017-11-13 16:14:45 +02:00
Marco Slot f71728f634 Add GUC for specifying sslmode in connections to workers 2017-11-08 14:15:58 +01:00
Brian Cloutier 7be1545843 Support implicit casts during INSERT/SELECT
It's possible to build INSERT SELECT queries which include implicit
casts, currently we attempt to support these by adding explicit casts to
the SELECT query, but this sometimes crashes because we don't update all
nodes with the new types. (SortClauses, for instance)

This commit removes those explicit casts and passes an unmodified SELECT
query to the COPY executor (how we implement INSERT SELECT under the
scenes). In lieu of those cases, COPY has been given some extra logic to
inspect queries, notice that the types don't line up with the table it's
supposed to be inserting into, and "manually" casting every tuple before
sending them to workers.
2017-11-03 22:27:15 -07:00
Marco Slot 6883a09cdd Allow distributed partitioned table creation in Cloud 2017-11-03 10:09:18 +01:00
Hadi Moshayedi 9bfbbf8a04 Make reports hostname configurable and enable stats collection in tests.
This patch adds --with-reports-host configure option, which sets the
REPORTS_BASE_URL constant. The default is reports.citusdata.com.

It also enables stats collection in tests.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 78a2cd9052 Check for Citus updates.
Sends a request to /v1/releases/latest?flavor=$CITUS_EDITION once a day,
which returns a response similar to {"version": "7.1.0", "major": 7,
"minor": 1, "patch": 0}. Then compares it with current Citus version,
and if the latest release is newer, logs a LOG message.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 34f3ec0961 Call FlushDistTableCache() before stats collection. 2017-10-31 21:51:43 -04:00
Hadi Moshayedi 97d544b75c Follow the patterns used in Deadlock Detection in Stats Collection.
This includes:

(1) Wrap everything inside a StartTransactionCommand()/CommitTransactionCommand().
This is so we can access the database. This also switches to a new memory context
and releases it, so we don't have to do our own memory management.

(2) LockCitusExtension() so the extension cannot be dropped or created concurrently.

(3) Check CitusHasBeenLoaded() && CheckCitusVersion() before doing any work.

(4) Do not PG_TRY() inside a loop.
2017-10-31 21:51:43 -04:00
Furkan Sahin 2b39c52f0b Replica identity on create_distributed_table
By this commit, citus minds the replica identity of the table when
we distribute the table. So the shards of the distributed table
have the same replica identity with the local table.
2017-10-31 13:08:36 +03:00
velioglu 0b5db5d826 Support multi shard update/delete queries 2017-10-25 15:52:38 +03:00
Hadi Moshayedi 9a04b78980 Send server_id for statistics reports. (#1698)
This change introduces the `pg_dist_node_metadata` which has a single jsonb value. When creating
the extension, a random server id is generated and stored in there. Everything in the metadata table
is added as a nested objected to the json payload that is sent to the reports server.
2017-10-18 21:20:32 -04:00
Brian Cloutier ebcb2b65e9 Add master_move_node function 2017-10-16 10:51:28 -07:00
Hadi Moshayedi 2aec6eda49 Properly use #ifdef HAVE_LIBCURL. 2017-10-13 12:04:36 -06:00
Murat Tuncer f7ab901766 Add select distinct, and distinct on support
Distinct, and distinct on() clauses are supported
in simple selects, joins, subqueries, and insert into select
queries.
2017-10-13 14:59:48 +03:00
Hadi Moshayedi a1387f4aa8 Basic usage statistics collection. (#1656)
Adds ```citus.enable_statistics_collection``` GUC variable, which ```true``` by default, unless built without libcurl. If statistics collection is enabled, sends basic usage data to Citus servers every 24 hours.

The data that is collected consists of:
- Citus version
- OS name & release
- Hardware Id
- Number of tables, rounded to next power of 2
- Size of data, rounded to next power of 2
- Number of workers
2017-10-11 09:55:15 -04:00
Onder Kalaci 498ac80d8b Add window function support for SUBQUERY PUSHDOWN and INSERT INTO SELECT
This commit provides the support for window functions in subquery and insert
into select queries. Note that our support for window functions is still limited
because it must have a partition by clause on the distribution key. This commit
makes changes in the files insert_select_planner and multi_logical_planner. The
required tests are also added with files multi_subquery_window_functions.out
and multi_insert_select_window.out.
2017-10-04 15:33:07 +03:00
Marco Slot 394918f9d0 Invalidate worker and group ID cache in maintenance daemon 2017-10-02 18:14:29 +02:00
Marco Slot 43d5e79eaa Execute transmit commands as superuser during task-tracker queries 2017-09-28 15:27:25 +02:00
Marco Slot da6b42a3e2 Use unique constraint index for transaction record deletion 2017-09-28 12:04:56 +02:00
Murat Tuncer 4676c4f7a5 Prevent crash when remote transaction start fails (#1662)
We sent multiple commands to worker when starting a transaction.
Previously we only checked the result of the first command that
is transaction 'BEGIN' which always succeeds. Any failure on
following commands were not checked.

With this commit, we make sure all command results are checked.
If there is any error we report the first error found.
2017-09-26 17:25:46 -07:00
Jason Petersen b4d53423fa
Add adapter functions for OpenFile changes 2017-09-25 17:20:24 -07:00
Jason Petersen bbc15e0598
Handle HASHPROC changes
PostgreSQL 11 now has "standard" and "extended" (64-bit) versions of
hash functions.
2017-09-25 17:20:24 -07:00
Jason Petersen 6c9b19a954
Add version-compat header
For polyfill macros, etc.
2017-09-25 17:20:23 -07:00
velioglu 0a56ed910b Change error message of queries with distributed and local table
Citus can handle INSERT INTO ... SELECT queries if the query inserts
into local table by reading data from distributed table. The opposite
way is not correct. With this commit we warn the user if the latter
option is used.
2017-09-22 13:46:19 -07:00
Onder Kalaci 4782f9f98a Properly copy and trim the error messages that come from pg_conn
When a NULL connection is provided to PQerrorMessage(), the
returned error message is a static text. Modifying that static
text, which doesn't necessarly be in a writeable memory, is
dangreous and might cause a segfault.
2017-09-22 19:43:09 +03:00
Onder Kalaci 6736fd1682 Remove two obsolete functions
Namely GetConnectionFromPGconn() and CloseConnectionByPGconn()
2017-09-21 00:36:23 -06:00
Marco Slot 0aadbb1760 Convert multi-row INSERT target list to Vars 2017-08-25 10:55:56 +02:00
Marco Slot ae00795dab Allow default columns in multi-row INSERTs 2017-08-25 10:55:56 +02:00
Marco Slot c97692f382 Fix multi-row INSERT with RETURNING on reference tables 2017-08-24 10:42:12 +02:00
Marco Slot 4d7927b672 Execute multi-row INSERTs sequentially 2017-08-23 10:04:57 +02:00
Marco Slot cf375d6a66 Consider dropped columns that precede the partition column in COPY 2017-08-22 13:02:35 +02:00
Onder Kalaci 6532b69873 Kill the maintenance daemon on DROP DATABASE 2017-08-18 16:03:08 +03:00
Marco Slot 7523753a73 Clear metadata OID cache prior to deadlock detection 2017-08-18 11:20:24 +02:00
Hadi Moshayedi e5fbcf37dd Add Savepoint Support (#1539)
This change adds support for SAVEPOINT, ROLLBACK TO SAVEPOINT, and RELEASE SAVEPOINT.

When transaction connections are not established yet, savepoints are kept in a stack and sent to the worker when the connection is later established. After establishing connections, savepoint commands are sent as they arrive.

This change fixes #1493 .
2017-08-15 13:02:28 -04:00
Burak Yucesoy dfdfb44ebf Acquire shard resource locks on parent tables while operating on partitions 2017-08-14 14:44:30 +03:00
Burak Yucesoy a321e750c0 Acquire relation locks on partitions while operation on parent table 2017-08-14 14:44:30 +03:00
Burak Yucesoy 52b9e35d50 Add relationIdList field to the Job struct 2017-08-14 14:06:22 +03:00
Onder Kalaci 5b48de7430 Improve deadlock detection for MX
We added a new field to the transaction id that is set to true only
for the transactions initialized on the coordinator. This is only
useful for MX in order to distinguish the transaction that started
the distributed transaction on the coordinator where we could
 have the same transactions' worker queries on the same node.
2017-08-12 13:28:37 +03:00
Onder Kalaci e5d5bdff51 Enable distributed deadlock detection on the maintenance deamon
With this commit, the maintenance deamon starts to check for
distributed deadlocks.

We also introduced a GUC variable (distributed_deadlock_detection_factor)
whose value is multiplied with Postgres' deadlock_timeout. Setting
it to -1 disables the distributed deadlock detection.
2017-08-12 13:28:37 +03:00
Onder Kalaci a333c9f16c Add infrastructure for distributed deadlock detection
This commit adds all the necessary pieces to do the distributed
deadlock detection.

Each distributed transaction is already assigned with distributed
transaction ids introduced with
3369f3486f. The dependency among the
distributed transactions are gathered with
80ea233ec1.

With this commit, we implement a DFS (depth first seach) on the
dependency graph and search for cycles. Finding a cycle reveals
a distributed deadlock.

Once we find the deadlock, we examine the path that the cycle exists
and cancel the youngest distributed transaction.

Note that, we're not yet enabling the deadlock detection by default
with this commit.
2017-08-12 13:28:37 +03:00
velioglu b0efffae1c Correct planner and add more tests 2017-08-11 10:16:13 +03:00
velioglu ceba81ce35 Move physical planner checks to logical planner 2017-08-11 10:09:47 +03:00
velioglu c4e3b8b5e1 Add planner changes and tests for subquery on reference tables 2017-08-11 10:09:47 +03:00
Marco Slot 0ae265c436 Add citus_create_restore_point for distributed snapshots 2017-08-11 07:36:20 +02:00
Marco Slot fca986f214 Add API for waiting for multiple connections 2017-08-11 00:03:06 +02:00
Brian Cloutier 9d93fb5551 Create citus.use_secondary_nodes GUC
This GUC has two settings, 'always' and 'never'. When it's set to
'never' all behavior stays exactly as it was prior to this commit. When
it's set to 'always' only SELECT queries are allowed to run, and only
secondary nodes are used when processing those queries.

Add some helper functions:
- WorkerNodeIsSecondary(), checks the noderole of the worker node
- WorkerNodeIsReadable(), returns whether we're currently allowed to
  read from this node
- ActiveReadableNodeList(), some functions (namely, the ones on the
  SELECT path) don't require working with Primary Nodes. They should call
  this function instead of ActivePrimaryNodeList(), because the latter
  will error out in contexts where we're not allowed to write to nodes.
- ActiveReadableNodeCount(), like the above, replaces
  ActivePrimaryNodeCount().
- EnsureModificationsCanRun(), error out if we're not currently allowed
  to run queries which modify data. (Either we're in read-only mode or
  use_secondary_nodes is set)

Some parts of the code were switched over to use readable nodes instead
of primary nodes:
- Deadlock detection
- DistributedTableSize,
- the router, real-time, and task tracker executors
- ShardPlacement resolution
2017-08-10 17:37:17 +03:00
Brian Cloutier 3fc87a7a29 Metadata sync also syncs nodes in other clusters 2017-08-10 16:55:55 +03:00
Eren Başak deb89cb9ce Delete tesh_helper_functions.h 2017-08-10 14:00:44 +03:00