1147 Commits (004f28e18cbe4c7353f6a31785bf60b15c0d8de7)

Author SHA1 Message Date
Burak Yucesoy 0c283fa8a3 Add partitioning support to MX tables
Previously, we prevented creation of partitioned tables on Citus MX.
We decided to not focus on this feature until there is a need. Since
now there are requests for this feature, we are implementing support
for partitioned tables on Citus MX.
2018-04-06 12:47:06 +03:00
velioglu 72dfe4a289 Adds colocation check to local join 2018-04-04 22:49:27 +03:00
velioglu 698d585fb5 Remove broadcast join logic
After this change, all logic related to fetching shard data is removed.
The planner no longer plans any ShardFetchTask, and the shard-fetch
steps in the real-time and task-tracker executors have been removed.
2018-03-30 11:45:19 +03:00
Matthew Wozniczka 4582a4b398 Fixed a typo 2018-03-27 22:51:36 -06:00
Brian Cloutier f8f0d4aedc Add Windows replacement for uname 2018-03-21 20:35:56 -07:00
Brian Cloutier 98ffafe16e Fix error handling in connection_management 2018-03-21 20:05:00 -07:00
Murat Tuncer 224b0a8c14 Replace poll with select/poll
Windows does not have poll(), so fall back to select()
2018-03-21 20:05:00 -07:00
Metin Doslu 3b7b64a8b6 Remove skip_jsonb_validation_in_copy GUC 2018-03-13 10:33:27 +02:00
Murat Tuncer 1440caeef2
Fix incorrect limit pushdown when distinct clause is not superset of group by (#2035)
Pushing down limit and order by into workers may produce
wrong output when distinct on() clause has expressions,
aggregates, or window functions.

This check allows pushing down limits only if the distinct clause
is a superset of the group by clause, i.e., it contains all clauses in the group by.
2018-03-07 13:24:56 +03:00
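The superset rule described in this commit can be illustrated with a minimal sketch. This is not the Citus planner code: the function name `can_push_down_limit` and the representation of clauses as lists of column names are assumptions made for illustration, and the extra checks for expressions, aggregates, and window functions are left out.

```python
def can_push_down_limit(distinct_clause, group_by_clause):
    """Return True when every GROUP BY entry also appears in DISTINCT.

    Hypothetical helper: clauses are modeled as plain lists of column
    names, which is a simplification of real planner target entries.
    """
    return set(group_by_clause) <= set(distinct_clause)


# DISTINCT (a, b) covers GROUP BY a, so the limit may be pushed down.
allowed = can_push_down_limit(["a", "b"], ["a"])
# DISTINCT (a) does not cover GROUP BY a, b, so it may not.
rejected = can_push_down_limit(["a"], ["a", "b"])
```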
Metin Doslu e86d34256c Change default to false for citus.skip_jsonb_validation_in_copy 2018-03-06 13:19:47 +02:00
Onder Kalaci 40b898b59f Improve error messages for INSERT queries that have subqueries 2018-03-05 14:46:47 +02:00
Onder Kalaci 7dc9589b56 Handle failures during I/O
This commit checks the connection status right after any IO happens
on the socket.

This is necessary since before this commit we didn't pass any information
to the higher level functions whether we're done with the connection
(e.g., no IO required anymore) or an error happened during the IO.
2018-03-02 08:33:53 +02:00
Onder Kalaci da0048e0b7 ForgetResults() becomes a wrapper for ClearResults()
ClearResults() is able to handle failures properly by
checking the result status. So, relying on it makes
error handling more generic in Citus.
2018-03-02 08:33:53 +02:00
Murat Tuncer 76f6883d5d
Add support for window functions that can be pushed down to worker (#2008)
This is the first in a series of window function changes.

We can now support window functions that can be pushed down to workers.
A window function must have the distribution column in its partition
clause to be pushed down.
2018-03-01 19:07:07 +03:00
Marco Slot e79db17b91 Update comment in WorkerAggregateExpressionList 2018-02-27 23:48:25 +01:00
Murat Tuncer e13c5beced
Fix worker query when order by avg aggregate is used (#2024)
We push down order by to worker query when limit is specified
(with some other additional checks). If the query has an expression
on an aggregate or avg aggregate by itself, and there is an order
by on this particular target we may send wrong order by to worker
query with potential to affect query result.

The fix creates an auxiliary target entry in the worker query and
uses that target entry for sorting.
2018-02-28 12:12:54 +03:00
Metin Doslu bcf660475a Add support for modifying CTEs 2018-02-27 15:08:32 +02:00
velioglu 78e6d990a2 Fix master plan of the query with distinct, aggregate and group by clauses.
Before this PR, we relied on the group by columns to guarantee the
uniqueness of the results. However, this assumption is correct only if
the group by columns are a subset of the columns in the distinct
clause. It can be wrong if the distinct clause contains only part of
the group by columns plus some aggregate columns. With this PR, we add
a distinct plan on top of the aggregate plan when necessary.
2018-02-26 15:30:15 +03:00
Onder Kalaci 1c930c96a3 Support non-co-located joins between subqueries
With #1804 (and related PRs), Citus gained the ability to
plan subqueries that are not safe to pushdown.

There are two high-level requirements for pushing down subqueries:

   * Individual subqueries that require a merge step (i.e., GROUP BY
     on non-distribution key, or LIMIT in the subquery etc). We've
     handled such subqueries via #1876.

   * Combination of subqueries that are not joined on distribution keys.
     This commit aims to recursively plan some of such subqueries to make
     the whole query safe to pushdown.

The main logic behind non colocated subquery joins is that we pick
an anchor range table entry and check for distribution key equality
of any other subqueries in the given query. If for a given subquery,
we cannot find distribution key equality with the anchor rte, we
recursively plan that subquery.

We also used a hacky solution for picking relations as the anchor range
table entries. The hack is that we wrap them into a subquery. This is only
necessary since some of the attribute equivalence checks are based on
queries rather than range table entries.
2018-02-26 13:50:37 +02:00
Onder Kalaci 7b57e0562a Add infrastructure for detecting non-colocated subqueries 2018-02-26 13:28:25 +02:00
Onder Kalaci 4d70c86645 Leaf level recursive planning for non colocated subqueries
With this commit, we enable recursive planning for the subqueries
that are not joined on the distribution keys.
2018-02-26 13:28:24 +02:00
Onder Kalaci e998703ff8 Enable restriction eq. checks for top level set operations
We used to only support pushdownable set operations inside a
subquery, however, we could easily expand the restriction
checks to cover top level set operations as well.
2018-02-26 13:28:24 +02:00
Onder Kalaci e8aa532a90 Refactor checks for distribution key equality
Change some function names, ensure we stick to Citus'
function order rules etc.
2018-02-26 13:28:24 +02:00
Marco Slot 1e9186a3b5 Do not use new connection in table size functions 2018-02-23 07:07:55 +01:00
Markus Sintonen 6202e80d06 Implemented jsonb_agg, json_agg, jsonb_object_agg, json_object_agg 2018-02-18 00:19:18 +02:00
velioglu 195ac948d2 Recursively plan subqueries in WHERE clause when FROM recurs 2018-02-13 19:52:12 +03:00
Marco Slot 0cba4ab588 Refactor worker node hash initialisation 2018-02-12 23:36:43 +01:00
Marco Slot 40d715d494 Cache worker node array for faster iteration 2018-02-12 23:36:43 +01:00
Marco Slot 6e79a34c97 Do not check for cancellation in ClearResultsIfReady 2018-02-12 16:45:02 +01:00
Marco Slot 6051aae56e Handle errors that are discovered during abort 2018-02-12 16:45:02 +01:00
Marco Slot ee6a751798 Only copy distributed plan when modifying it 2018-02-12 16:30:55 +01:00
Onder Kalaci 94c5ac6ebb Remove duplicate join restrictions
We use PostgreSQL hooks to accumulate the join restrictions
and PostgreSQL gives us all the join paths it tries while
deciding on the join order. Thus, for queries that have many
joins, this function is likely to remove lots of duplicate join
restrictions. This matters for the performance of Citus' query
pushdown checks.
2018-02-12 18:35:05 +02:00
Onder Kalaci c228d8ff3d Refactor equivalence generation related codes
This commit changes the APIs for restriction generation to make future
changes simpler.
2018-02-12 18:35:04 +02:00
Onder Kalaci 2f2d350924 Refactor relation restriction related codes
This commit moves some of the functions to a more relevant
source file.
2018-02-12 18:35:04 +02:00
Murat Tuncer 901b543e20 Fix count distinct using field select on top level query
We were allowing count distinct queries even if they were
not directly on columns if the query is grouped on
distribution column.

When performing these checks we were skipping subqueries
because they also perform this check in a more concise manner.
We relied on oid SUBQUERY_RELATION_ID (10000) to decide if
a given RTE relation id denotes a subquery, however, we also
use SUBQUERY_PUSHDOWN_RELATION_ID (10001) for some subqueries.

We skip both types of subqueries with this change.
2018-02-06 13:16:10 +03:00
metdos 35f864bcaf Respect enable_hashagg in the master planner 2018-02-05 15:06:00 +02:00
metdos 3d540d961c Fix typo in grouping_is_sortable() 2018-02-05 12:10:19 +02:00
Marco Slot 6f7c3bd73b Skip JSON validation on coordinator during COPY 2018-02-02 15:33:27 +01:00
Brian Cloutier 15511f6ba1 Dynamically allocate connection metadata in WaitForAllConnections 2018-02-01 10:30:41 -08:00
Brian Cloutier e6ebfc1f53 Remove VLA from UpdateNodeLocation 2018-02-01 10:30:41 -08:00
Brian Cloutier a2ed45e206 Remove variable length arrays
VLAs aren't supported by Visual Studio.

- Remove all existing instances of VLAs.
- Add a flag, -Werror=vla, which makes gcc refuse to compile if we add
  VLAs in the future.
2018-02-01 10:30:41 -08:00
Brian Cloutier 2efe80ce55 CheckForDistributedDeadlocks no longer uses a VLA
- variable length arrays (VLAs) do not work with Visual Studio
- fix an off-by-one error. We incorrectly assumed there would always be
  at least as many edges as there were nodes.
- refactor: reduce scope of transactionNodeStack by moving it into the
  function which uses it.
- refactor: break up the distinct uses of currentStackDepth into
  separate variables.
2018-02-01 10:30:41 -08:00
Brian Cloutier 097fd15a89 small refactor, CheckDeadlockForTransactionNode builds its own array 2018-02-01 10:30:41 -08:00
Brian Cloutier 457f570b77 Small refactor, we were using incompatible types 2018-01-31 11:05:59 -08:00
Brian Cloutier b864d014ab
GetNextNodeId() incorrectly called PG_RETURN_DATUM
- Also stabilize the output of a multi_router_planner test
2018-01-29 15:32:36 -08:00
Brian Cloutier 61a6b846b9 Refactor: use a temporary timestamp variable
It's against our coding convention to call functions inside parameter
lists; when single-stepping with a debugger it's difficult to determine
what the function returned.

That wouldn't be a good enough reason to change this code but while
porting Citus to Windows I ran into this line of code.
assign_distributed_transaction_id was called with a weird timestamp and
I wasn't able to find the problem without first making this change.
2018-01-29 11:20:13 -08:00
Marco Slot bd0ebac865 Skip call to ActiveReadableNodeList when there are no subplans 2018-01-29 16:05:10 +01:00
Hadi Moshayedi ff26bcd5a5
Include sys/stat.h for S_IRUSR and S_IWUSR. (#1977) 2018-01-26 16:21:48 -05:00
Brian Cloutier 76d1edc3fd
Don't rely on gcc-specific features (#1963)
* Don't use expressions inside compound statements
* Don't depend on __builtin_constant_p
* Remove reliance on S_ISLNK
* Replace use of __func__: older MSVC doesn't support this builtin
2018-01-23 17:03:29 -08:00
Onder Kalaci fbde87d2d0 Allocate enough space for transaction nodes
This fix prevents any potential memory access that might occur
while forming the deadlock path.
2018-01-22 08:45:48 +02:00
Onder Kalaci 9a89c0b425 Fix bug while traversing the distributed deadlock graph
With this fix, we traverse the graph with DFS, as was originally
intended. Note that, before the fix, we traversed the graph with BFS,
which might lead to killing some unrelated backend that is not
involved in the distributed deadlock.
2018-01-22 08:45:48 +02:00
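Why the traversal order matters for the reported deadlock path can be shown with a small sketch. This is an illustrative depth-first search over a wait-for graph, not the Citus implementation: the function name `find_deadlock_path` and the dictionary-based graph are assumptions. Each stack entry carries the path that led to it, so the returned path contains only transactions that are actually part of the cycle.

```python
def find_deadlock_path(wait_graph, start):
    """Depth-first search for a cycle that leads back to `start`.

    `wait_graph` maps a transaction id to the ids it waits for
    (hypothetical representation). Returns the list of transactions on
    the cycle, or None when `start` is not part of any cycle.
    """
    stack = [(start, [start])]
    visited = set()
    while stack:
        node, path = stack.pop()
        for waited_on in wait_graph.get(node, ()):
            if waited_on == start:
                # Closed the loop: `path` holds only cycle members.
                return path
            if waited_on not in visited:
                visited.add(waited_on)
                stack.append((waited_on, path + [waited_on]))
    return None


# Transaction 4 is reachable from 1 but is not part of the deadlock.
graph = {1: [2, 4], 2: [3], 3: [1], 4: []}
cycle = find_deadlock_path(graph, 1)
```

Here `cycle` contains transactions 1, 2, and 3 only, so a victim chosen from it is always a participant in the deadlock.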
Dimitri Fontaine c9760fbb64 Fix CREATE INDEX with storage options on distributed tables.
By sharing the implementation of the function AppendOptionListToString on
three call sites, we would expand an extra OPTIONS keyword in a create index
statement, and omit other bits of the specific syntax here.

This patch introduces an AppendStorageParametersToString() function that is
very similar to AppendOptionListToString() but handles WITH(a="foo",...)
syntax that is used in reloptions (aka Storage Parameters).

Fixes #1747.
2018-01-17 21:56:40 +01:00
Dimitri Fontaine 952da72c55 Implement ALTER TABLE|INDEX ... SET|RESET ().
PostgreSQL implements support for several relation kinds in a single
statement, such as in the AlterTableStmt case, which supports both tables
and indexes and more (see ATExecSetRelOptions in PostgreSQL source code file
src/backend/commands/tablecmds.c for an example of that).

As a consequence, this patch implements support for setting and resetting
storage parameters on both relation kinds.
2018-01-17 21:56:40 +01:00
Dimitri Fontaine 17266e3301 Implement ALTER INDEX ... RENAME TO ...
The command is now distributed among the shards when the table is
distributed. To that effect, we fill in the DDLJob's targetRelationId with
the OID of the table for which the index is defined, rather than the OID of
the index itself.
2018-01-17 21:56:40 +01:00
velioglu d357d2fccd Bump citus version to 7.3devel 2018-01-16 11:50:28 +03:00
Dimitri Fontaine e010238280 Implement ALTER TABLE ... RENAME TO ...
The implementation was already mostly in place, but the code was protected
by a principled check against the operation. Turns out there's a nasty
concurrency bug though with long identifier names, much as in #1664.

To prevent deadlocks from happening, we could either review the DDL
transaction management in shards and placements, or we can simply reject
names with (NAMEDATALEN - 1) chars or more — that's because of the
PostgreSQL array types being created with a one-char prefix: '_'.
2018-01-11 13:21:24 +01:00
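The length rule from this commit can be sketched as follows. It assumes PostgreSQL's default build-time value of NAMEDATALEN (64) and measures the name in UTF-8 bytes; the helper name `is_rename_target_allowed` is hypothetical and does not come from the Citus source.

```python
# Assumption: PostgreSQL's default compile-time value.
NAMEDATALEN = 64


def is_rename_target_allowed(new_name):
    """Reject names of (NAMEDATALEN - 1) bytes or more.

    The array type created for a table gets a one-character '_' prefix,
    so the table name needs one byte of headroom below the usual
    identifier limit of NAMEDATALEN - 1 bytes.
    """
    return len(new_name.encode("utf-8")) < NAMEDATALEN - 1


short_ok = is_rename_target_allowed("a" * 62)
too_long = is_rename_target_allowed("a" * 63)
```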
Hadi Moshayedi 5d7c52ffa6
Don't return in PG_TRY() block when cancellations happen in WaitForConnections(). (#1923)
We shouldn't return in the middle of a PG_TRY() block because if we do, we won't reset PG_exception_stack, and later when a re-throw tries to jump to the jump-point which was active in this PG_TRY() block, it seg-faults.

We used to return in the middle of a PG_TRY() block in WaitForConnections() where we checked for cancellations. Whenever cancellations were caught here, Citus crashed. An example was reported by @onderkalaci at #1903.
2018-01-03 09:54:03 -05:00
Marco Slot 8f69973411 Fix cancellation issues in the real-time executor (#1905) 2018-01-01 23:10:29 -05:00
Marco Slot 3fd65cb91b Do not raise errors in the real-time executor (#1903) 2018-01-01 22:26:31 -05:00
Onder Kalaci a1bbdf2d44 Outer joins should also use subquery pushdown planner if join
clause is not supported

This change allows unsupported clauses to go through query pushdown
planner instead of erroring out as we already do for non-outer joins.
2017-12-29 16:40:47 +02:00
Marco Slot 09c09f650f Recursively plan set operations when leaf nodes recur 2017-12-26 13:46:55 +02:00
mehmet furkan şahin 446893234a unsupported subquery error messages are fixed 2017-12-25 15:10:59 +03:00
mehmet furkan şahin 57bc86e23d new debug output for subplans 2017-12-25 09:50:51 +03:00
Marco Slot fa7fa2734b Log remote commands sent via MultiClientSendQuery 2017-12-22 16:18:40 +01:00
Murat Tuncer 87c6f306f1
Fix join clause eq restrictions (#1884)
We used to error out if the join clause includes filters like
t1.a < t2.a even if other filter like t1.key = t2.key exists.

Recently we lifted that restriction in subquery planning by
focusing on the equivalence classes provided by postgres.

This checkin forwards real-time queries that previously errored out
due to join clauses to the subquery planner and lets it handle the
join even if the query does not have a subquery.

We are now pushing down queries that do not have any
subqueries in it. Error message looked misleading, changed to a more descriptive one.
2017-12-22 12:16:14 +03:00
metdos 32b7e152a3 Get shard resource locks for only DMLs 2017-12-22 10:30:41 +02:00
Murat Tuncer a9cf0c3e66
Fix CTE column alias issue (#1893)
We were creating intermediate query result's target
names from subquery target list. Now we also check
if cte re-defines its column name aliases, and create
intermediate result query accordingly.
2017-12-22 09:39:40 +03:00
Brian Cloutier 377b31dcf7 Remove enable_deadlock_prevention prevention warning 2017-12-21 14:47:52 +01:00
Brian Cloutier fb7b86fa14 Replace strtoull with pg_strtouint64
The macro we were using to detect strtoull isn't set on Windows, and
just in case there are differences use a portable function from PG
instead of calling strtoull directly.
2017-12-21 14:28:51 +01:00
mehmet furkan şahin fd546cf322 Intermediate result size limitation
This commit introduces a new GUC to limit the intermediate
result size which we handle when we use read_intermediate_result
function for CTEs and complex subqueries.
2017-12-21 14:26:56 +03:00
Onder Kalaci 0d5a4b9c72 Recursively plan subqueries that are not safe to pushdown
With this commit, Citus recursively plans subqueries that
are not safe to push down, in other words, those that require a merge
step.

The algorithm is simple: Recursively traverse the query from bottom
up (i.e., bottom meaning the leaf queries). On each level, check
whether the query is safe to pushdown (or a single repartition
subquery). If the answer is yes, do not touch that subquery. If the
answer is no, plan the subquery separately (i.e., create a subPlan
for it) and replace the subquery with a call to
`read_intermediate_results(planId, subPlanId)`. During
execution, run the subPlans first, and make them available to the
next query executions.

Some of the queries that this change allows:

   * Subqueries with LIMIT
   * Subqueries with GROUP BY/DISTINCT on non-partition keys
   * Subqueries involving re-partition joins, router queries
   * Mixed usage of subqueries and CTEs (i.e., use CTEs in
     subqueries as well). Nested subqueries as long as we
     support the subquery inside the nested subquery.
   * Subqueries with local tables (i.e., those subqueries
     have the limitation that they have to be leaf subqueries)
   * VIEWs on the distributed tables just work (i.e., the
     limitations mentioned below still apply to views)

Some of the queries that are still NOT supported:

  * Correlated subqueries that are not safe to pushdown
  * Window function on non-partition keys
  * Recursively planned subqueries or CTEs on the outer
    side of an outer join
  * Only recursively planned subqueries and CTEs in the FROM
    (i.e., not any distributed tables in the FROM) and subqueries
    in WHERE clause
  * Subquery joins that are not on the partition columns (i.e., each
    subquery is individually joined on partition keys but not the upper
    level subquery.)
  * Any limitation that logical planner applies such as aggregate
    distincts (except for count) when GROUP BY is on non-partition key,
    or array_agg with ORDER BY
2017-12-21 08:37:40 +02:00
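The bottom-up algorithm described in this commit can be sketched roughly as below. The `Query` class, the fixed `safe_to_pushdown` flag, and the function `recursively_plan` are hypothetical simplifications; in the real planner, safety is computed from the query itself rather than stored as a flag. The call name `read_intermediate_results(planId, subPlanId)` is taken from the commit message.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    name: str
    safe_to_pushdown: bool  # assumption: precomputed for illustration
    subqueries: list = field(default_factory=list)


def recursively_plan(query, plan_id, subplans):
    """Visit leaf subqueries first; turn each unsafe one into a subplan."""
    for index, subquery in enumerate(query.subqueries):
        recursively_plan(subquery, plan_id, subplans)
        if not subquery.safe_to_pushdown:
            subplans.append(subquery)
            subplan_id = len(subplans)
            # Replace the subquery with a read of its materialized result.
            query.subqueries[index] = Query(
                name=f"read_intermediate_results({plan_id}, {subplan_id})",
                safe_to_pushdown=True,
            )
    return subplans


inner = Query("c", safe_to_pushdown=False)
middle = Query("b", safe_to_pushdown=False, subqueries=[inner])
top = Query("top", True, [Query("a", True), middle])
subplans = recursively_plan(top, plan_id=1, subplans=[])
```

The subplans come out in the order they must be executed: the innermost unsafe subquery first, then the one that reads its result.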
Onder Kalaci e12ea914b9 Refactor ErrorIfQueryNotSupported to defer errors 2017-12-20 09:03:49 +02:00
Onder Kalaci 71ce42b936 Refactor RecursivelyPlanSubqueriesAndCTEs() to make it ready
to work with subqueries
2017-12-20 09:03:47 +02:00
Marco Slot 5e0539efa3 Plan CTEs when subquery pushdown is on 2017-12-19 16:34:56 +01:00
Marco Slot 44a1ea631a Show distributed subplan ID in EXPLAIN output 2017-12-19 16:34:56 +01:00
Marco Slot 35dbacdb69 Do not reinitialise MyBackendData 2017-12-19 15:56:26 +01:00
Marco Slot af201a2f6d Allow intermediate results to be used in parallel workers 2017-12-18 19:05:08 +01:00
Marco Slot 7dab078e67 Set cost estimates for read_intermediate_result 2017-12-18 16:23:44 +01:00
Marco Slot 74bd33d0cc Revert "Plan CTEs when subquery pushdown is on"
This reverts commit e3b953b8e3.
2017-12-17 22:34:20 +01:00
Marco Slot aca5f35ab9 Revert "Show distributed subplan ID in EXPLAIN output"
This reverts commit 686b079272.
2017-12-17 22:34:04 +01:00
Marco Slot e3b953b8e3 Plan CTEs when subquery pushdown is on 2017-12-17 21:49:36 +01:00
Marco Slot 686b079272 Show distributed subplan ID in EXPLAIN output 2017-12-16 11:32:01 +01:00
Marco Slot ea6b98fda4 Allow count(distinct) in queries with a subquery 2017-12-15 15:24:26 +01:00
Marco Slot 9ee0e68882 Do not take extra access exclusive lock partitioned tables 2017-12-15 13:02:31 +01:00
Marco Slot 5a69fc1b17 Relax checks on recurring tuples in FROM with sublinks 2017-12-15 11:56:06 +01:00
Marco Slot a64f0060ba Reduce the frequency of FinishConnectionIO calls during COPY (#1864) 2017-12-14 13:21:59 -05:00
Marco Slot 2e2b4e81fa Add support for CTEs in distributed queries 2017-12-14 09:32:55 +01:00
Marco Slot d0335ec818 Send BEGIN for SELECTs in the router executor 2017-12-14 09:32:55 +01:00
Marco Slot cbbd418af2 Add citus.copy_format OIDs to metadata cache 2017-12-14 09:32:55 +01:00
Marco Slot 66f9f1d6cd Make some intermediate results functions public 2017-12-14 09:32:55 +01:00
Marco Slot 36ee21c323 Make CanUseBinaryCopyFormatForType public 2017-12-14 09:32:55 +01:00
Marco Slot 7d1191954d Add DistributedSubPlan node 2017-12-14 09:32:55 +01:00
Onder Kalaci 86b2d9420c Treat recurring tuples as reference table for GROUP BY checks
read_intermediate_results() and immutable functions are implemented.
Empty join trees do not seem applicable here.
2017-12-13 14:55:42 +02:00
Marco Slot d1a470a52e Fix issue with multiple ANALYZE in transaction block 2017-12-12 10:28:48 +01:00
mehmet furkan şahin 3c941aedf1 adds citus.enable_repartition_joins GUC
The new GUC allows Citus to switch between task executors
when necessary
2017-12-11 09:36:37 +03:00
Marco Slot 60a1e31671 Allow queries with local tables in NeedsDistributedPlanning 2017-12-07 16:20:23 +01:00
Marco Slot f8550b8c85 Fix issues with read_intermediate_result signature 2017-12-07 13:47:56 +01:00
Marco Slot d8fea4efb8 Revert "Allow queries with local tables in NeedsDistributedPlanning"
This reverts commit d2bac081e8.
2017-12-07 11:19:11 +01:00
Marco Slot d2bac081e8 Allow queries with local tables in NeedsDistributedPlanning 2017-12-07 11:02:16 +01:00
Onder Kalaci c42a92afd2 Fix bug related to incrementing an index not properly 2017-12-07 08:50:57 +02:00
Marco Slot eab15aa035 Avoid deadlock in ColocatedTableId 2017-12-06 11:49:34 +01:00
Marco Slot 7279d42849 Treat read_intermediate_result as recurring tuples 2017-12-04 14:50:11 +01:00
Marco Slot 4cdadfcab6 Add intermediate results infrastructure 2017-12-04 14:50:11 +01:00
Marco Slot bfcc76df69 Make several COPY-related functions public 2017-12-04 13:12:03 +01:00
Marco Slot 73989b07eb Refactor query execution functions 2017-12-04 13:12:03 +01:00
Murat Tuncer 2d66bf5f16
Fix hard coded formatting strings for 64 bit numbers (#1831)
Postgres provides OS-agnostic formatting macros for
formatting 64 bit numbers. Replaced %ld %lu with
INT64_FORMAT and UINT64_FORMAT respectively.

Also found some incorrect usages of formatting
flags and fixed them.
2017-12-04 14:11:06 +03:00
Hadi Moshayedi ff706cf556 Test that COPY blocks UPDATE/DELETE/INSERT...SELECT when rep factor 2. 2017-11-30 14:52:29 -05:00
Marco Slot acbc0fe0de Use RowExclusiveLock shard resource lock in COPY 2017-11-30 09:15:45 -05:00
Onder Kalaci a273711500 The common attribute equivalence class always includes the input relations
We added the ability to filter out the planner restriction information
for specific parts of the query. This might lead to situations where
the common restriction includes some other relations that we're searching
for. The reason is that while filtering for join restrictions, we add the
restriction as soon as we find the relation.

With this commit we make sure that the common attribute
equivalence class always includes the input relations.
2017-11-30 16:00:26 +02:00
Marco Slot d6dd0b3a81 Send BEGIN in the real-time executor when in a transaction 2017-11-30 12:59:09 +01:00
Marco Slot 3a4d5f8182 Remove filter checks on leaf queries 2017-11-30 12:25:14 +01:00
Marco Slot 3f03cb6a6a Support UNION with joins in the subqueries 2017-11-30 10:37:56 +01:00
Marco Slot a9933deac6 Make real time executor work in transactions 2017-11-30 09:59:32 +03:00
Jason Petersen 0eacf6bd95
Refactor VacuumStmt checker to be single-return
Decided this would be safer for the future (defaults to unsupported).
2017-11-29 16:06:50 -07:00
Jason Petersen b12e77ab0e
Ensure unsupported VACUUMs don't go to workers
Apparently these two blocks have been incorrect for nearly a year…
2017-11-29 16:06:50 -07:00
Marco Slot 7ea718fd8d Round-robin over worker nodes for 0-shard router queries 2017-11-29 15:52:22 +01:00
Onder Kalaci 05fb0dd020 Add infrastructure for filtering restriction contexts based on the input query
In subquery pushdown, we first ensure that each relation is joined with at least
one other relation on the partition keys. That's fine given that the decision
is binary: pushdown the query at all or not.

With recursive planning, we'd want to check whether any specific part
of the query can be pushed down or not. Thus, we need the ability to
understand which part(s) of the subquery is safe to pushdown. This commit
adds the infrastructure for doing that.
2017-11-28 09:58:21 +02:00
Onder Kalaci 26d9b58e9e Make sure that ExtractRangeTableRelationWalker never misses RTE_RELATION 2017-11-28 09:27:34 +02:00
Onder Kalaci 32def06ebd Split assigning RTE identities and partitioning related query modifications
Note that we used to iterate over the RTEs once for performance reasons.
However, keeping an extra copy of original query seems more costly and
hard to maintain/explain.
2017-11-28 09:27:34 +02:00
Marco Slot feffe86440 Subqueries containing functions go through subquery pushdown 2017-11-27 22:13:02 +01:00
Onder Kalaci 48f96bf3e5 Enable non equi joins in subquery pushdown
Subquery pushdown planning is based on relation restriction
equivalence. This gives us the opportunity to allow any
other joins as long as there is already an equi join between
the distributed tables.

We already allow that for joins with reference tables and
this commit allows that for joins among distributed tables.
2017-11-23 16:13:46 +02:00
Onder Kalaci 16421f089f Register citus custom scan nodes 2017-11-23 11:38:33 +02:00
Onder Kalaci 83c1143505 Refactor custom scan related codes
In this commit, we don't change any code, only create a new
file and move the related functions and types there.
2017-11-23 11:38:12 +02:00
Marco Slot 20a526d5c4 Fix memory leak in ListToHashSet 2017-11-22 11:26:58 +01:00
Marco Slot f4ceea5a3d Enable 2PC by default 2017-11-22 11:26:58 +01:00
Marco Slot 8486f76e15 Auto-recover 2PC transactions 2017-11-22 11:26:58 +01:00
Marco Slot 6ba3f42d23 Rename MultiPlan to DistributedPlan 2017-11-22 09:36:24 +01:00
Marco Slot 0ad39b36fe Treat immutable table functions and constant subqueries as reference tables 2017-11-21 14:15:22 +01:00
Onder Kalaci d558ebb923 Relax the checks on ensuring distribution columns for target entries
With this commit, we allow pushing down subqueries with only
reference tables where GROUP BY or DISTINCT clause or Window
functions include only columns from reference tables.
2017-11-21 12:28:14 +02:00
Andres Freund d063658d6d Protect some initializations from being called during backend startup.
On EXEC_BACKEND builds these functions shouldn't be called at every
backend start.
2017-11-20 15:29:51 -08:00
Brian Cloutier d267e0f9fa EXEC_BACKEND: don't put pointers to shared hashes into shared memory
Store pointers to shared hashes in process-local variables. Previously
pointers to shared hashes were put into shared memory. This causes
problems on EXEC_BACKEND because everybody calls execve and receives a
brand new address space; the shared hash will be in a different place
for every backend. (normally we call fork, which gives you a copy of the
address space, so these pointers remain constant)
2017-11-20 15:29:51 -08:00
Brian Cloutier 30a2365d81 Rename CreateDirectory to CitusCreateDirectory 2017-11-20 14:38:26 -08:00
Brian Cloutier aa2ab023a2 Rename RemoveDirectory -> CitusRemoveDirectory 2017-11-20 14:21:52 -08:00
Brian Cloutier 06f756b0a1 Rename DeleteFile -> CitusDeleteFile 2017-11-20 13:30:11 -08:00
Marco Slot 9793218122 Do not commit already-committed prepared transactions in recovery 2017-11-20 13:18:48 +01:00
Marco Slot ae47df01ea Observe prepared xacts twice in RecoverWorkerTransactions to avoid race condition 2017-11-20 11:44:08 +01:00
Marco Slot 2410c2e450 Rewrite recover_prepared_transactions to be fast, non-blocking 2017-11-20 11:27:40 +01:00
Onder Kalaci 5bea95009b Skip autovacuum processes for distributed deadlock detection
Autovacuum process cancels itself if any modification starts
on the table in order to avoid blocking your regular Postgres
sessions. That's normal and expected. Thus, any locks held by
an autovacuum process cannot be involved in a distributed deadlock
since they'll be released if needed.
2017-11-15 14:32:16 +02:00
Onder Kalaci c65c153a46 Skip speculative locks for distributed deadlock detection
These locks are held for a very short duration and cannot
contribute to a deadlock. Speculative locks are used by Postgres
as an internal notification mechanism among transactions.
2017-11-15 12:43:45 +02:00
Marco Slot bbbadd6d1b Bump Citus version to 7.2devel 2017-11-15 10:32:49 +01:00
Marco Slot d3b634b301 Allow generating placement IDs without using the sequence 2017-11-15 10:12:06 +01:00
Marco Slot c24a0875a5 Allow generating shard IDs without using the sequence 2017-11-15 10:12:05 +01:00
Brian Cloutier 0f3230170f Pull in INT32_MAXINT and INT32_MININT 2017-11-14 14:03:46 -08:00
Brian Cloutier 0db8277266 remove unused errno import 2017-11-14 13:09:34 -08:00
Brian Cloutier 5d9f3ae7fd Remove unused poll import from multi_real_time_executor 2017-11-14 13:09:34 -08:00
Marco Slot 533a533565 Only drop sequences on workers with metadata 2017-11-14 16:01:56 +01:00
velioglu be28ba8e70 Add stub UDF to run pg_upgrade flawlessly 2017-11-13 16:14:45 +02:00
metdos 111c04c2bd Warn on CLUSTER command for distributed tables 2017-11-10 12:14:45 +02:00
Burak Yücesoy 863df0b874
Merge branch 'master' into fix_partitioning_in_schema 2017-11-09 12:49:35 +02:00
Burak Yucesoy 17229ed7bd Fix attaching partition to a distributed table in schema
While attaching a partition to a distributed table in schema, we mistakenly
used unqualified name to find partitioned table's oid. This caused problems
while using partitioned tables with schemas. We are fixing this issue in
this PR.
2017-11-09 13:20:29 +03:00
Onder Kalaci 94921a2be1 Skip page-level locks on distributed deadlock detection
Short-term share/exclusive page-level locks are used for
read/write access. Locks are released immediately after
each index row is fetched or inserted.

Since those locks cannot lead to any deadlocks, it's safe
to ignore them in the distributed deadlock detection.
2017-11-09 10:37:23 +02:00
Marco Slot f71728f634 Add GUC for specifying sslmode in connections to workers 2017-11-08 14:15:58 +01:00
Murat Tuncer 4e3d633ebf
Add check for connection failures during multishard update (#1765) 2017-11-07 12:33:25 +02:00
Hadi Moshayedi 6d79d25101 Fix a relcache reference leak in stats collection.
In DistributedTablesSize() we didn't close the relations that had
replication factor > 2. This caused relcache reference leaks, and
warning messages like the following in logs:

    WARNING:  relcache reference leak: relation "researchers" not closed
2017-11-06 23:16:43 -05:00
metdos c83edc36b5 Check connection status before using it 2017-11-06 14:53:35 +02:00
Brian Cloutier 7be1545843 Support implicit casts during INSERT/SELECT
It's possible to build INSERT SELECT queries which include implicit
casts. Currently we attempt to support these by adding explicit casts to
the SELECT query, but this sometimes crashes because we don't update all
nodes with the new types (SortClauses, for instance).

This commit removes those explicit casts and passes an unmodified SELECT
query to the COPY executor (how we implement INSERT SELECT behind the
scenes). In lieu of those casts, COPY has been given some extra logic to
inspect queries, notice that the types don't line up with the table it's
supposed to be inserting into, and "manually" cast every tuple before
sending them to workers.
2017-11-03 22:27:15 -07:00
Marco Slot 6883a09cdd Allow distributed partitioned table creation in Cloud 2017-11-03 10:09:18 +01:00
Marco Slot 6219186683 Allow distributed INSERT...SELECT via worker nodes in MX 2017-11-02 14:38:39 +01:00
Hadi Moshayedi 7280774cf4 Use list_length() != 1 in SingleReplicatedTable().
ShardPlacementList's implementation can return NIL. In the previous
implementation we got a segmentation fault in this case. The relation can be
dropped after getting the distributed table list but before calling
SingleReplicatedTable().
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 7691991cb5 Do PG_TRY() inside a subtransaction block.
If we don't propagate the errors we are catching in PG_CATCH(), database's
internal state might not be clean. So we do PG_TRY() inside a subtransaction
so we can rollback to it after catching errors.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 9bfbbf8a04 Make reports hostname configurable and enable stats collection in tests.
This patch adds --with-reports-host configure option, which sets the
REPORTS_BASE_URL constant. The default is reports.citusdata.com.

It also enables stats collection in tests.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi acaf085a80 Add callback function for request by CollectBasicUsageStatistics().
Curl writes the received response to stdout if we don't specify a response
callback or an output file. This can pollute the PostgreSQL log. In this change
we add a callback function so the response messages aren't added to the log file.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 747e439601 Limit number of stats collection retries to once a day. 2017-10-31 21:51:43 -04:00
Hadi Moshayedi 78a2cd9052 Check for Citus updates.
Sends a request to /v1/releases/latest?flavor=$CITUS_EDITION once a day,
which returns a response similar to {"version": "7.1.0", "major": 7,
"minor": 1, "patch": 0}. Then compares it with current Citus version,
and if the latest release is newer, logs a LOG message.
2017-10-31 21:51:43 -04:00
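A minimal sketch of the comparison described in 78a2cd9052, assuming the response body has exactly the shape quoted in the commit message. The function name `newer_release_available` and the hard-coded current version are illustrative and are not taken from the Citus source; the HTTP request itself is left out.

```python
import json


def newer_release_available(response_body: str, current: tuple) -> bool:
    """Return True if the release in the JSON body is newer than `current`.

    `current` is a (major, minor, patch) tuple for the running version.
    """
    release = json.loads(response_body)
    latest = (release["major"], release["minor"], release["patch"])
    # Tuples compare element by element, which matches version ordering.
    return latest > current


# Response shape copied from the commit message above.
body = '{"version": "7.1.0", "major": 7, "minor": 1, "patch": 0}'

if newer_release_available(body, (7, 0, 3)):
    print("a newer Citus release is available")
```

Comparing the numeric fields instead of the "version" string avoids ordering mistakes such as "7.10.0" sorting before "7.9.0".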
Hadi Moshayedi 34f3ec0961 Call FlushDistTableCache() before stats collection. 2017-10-31 21:51:43 -04:00
Hadi Moshayedi c18c6625d9 Lock relations before calling citus_table_size().
This is to make sure they don't get dropped.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 97d544b75c Follow the patterns used in Deadlock Detection in Stats Collection.
This includes:

(1) Wrap everything inside a StartTransactionCommand()/CommitTransactionCommand().
This is so we can access the database. This also switches to a new memory context
and releases it, so we don't have to do our own memory management.

(2) LockCitusExtension() so the extension cannot be dropped or created concurrently.

(3) Check CitusHasBeenLoaded() && CheckCitusVersion() before doing any work.

(4) Do not PG_TRY() inside a loop.
2017-10-31 21:51:43 -04:00
Marco Slot 100aaeb3f5 Fix typo in distributed deadlock error message 2017-10-31 19:39:32 +01:00
metdos 8c356b2bc8 Don't try to add restrictions for reference tables in insert into select 2017-10-31 19:44:10 +02:00
mehmet furkan şahin 32fb19911c Add Constraint %s Add Primary Key Using index %s support
This commit makes a change in relay_event_utility.c to check if the
Alter Table command adds a constraint using index. If this is the
case, it appends the shard id to the index name.
2017-10-31 16:03:56 +03:00
Marco Slot 7e34348334 Add shard transfer mode parameter to shard copy functions 2017-10-31 13:30:48 +01:00
Marco Slot 2bb46bb5ee Reset connectionReady flag after moving a connection in WaitForAllConnections 2017-10-31 12:06:53 +01:00
Marco Slot e6e6897499 Defer initial PQflush to main loop in WaitForAllConnections 2017-10-31 12:06:53 +01:00
Marco Slot d6dadb1b25 Use correct index for ModifyWaitEvent in WaitForAllConnections 2017-10-31 12:06:53 +01:00
Furkan Sahin 2b39c52f0b Replica identity on create_distributed_table
With this commit, Citus respects the replica identity of the table when
we distribute it, so the shards of the distributed table have the same
replica identity as the local table.
2017-10-31 13:08:36 +03:00
Marco Slot 7f68f78ee9 Omit public schema from shard_name output 2017-10-31 00:22:07 +01:00
Murat Tuncer e16805215d
Support count(distinct) for non-partition columns (#1692)
Expands count distinct coverage by allowing more cases. We used to support
count distinct only if we can push down distinct aggregate to worker query
i.e. the count distinct clause was on the partition column of the table,
or there was a grouping on the partition column.

Now we can support
- non-partition columns, with or without grouping on partition column
- partition, and non partition column in the same query
- having clause
- single table subqueries
- insert into select queries
- join queries where count distinct is on partition, or non-partition column
- filters on count distinct clauses (extends existing support)

We first try to push down aggregate to worker query (original case), if we
can't then we modify worker query to return distinct columns to coordinator
node. We do that by adding distinct column targets to group by clauses. Then
we perform count distinct operation on the coordinator node.

This work should reduce the cases where HLL is used as it can address anything
that HLL can. However, if we start having performance issues due to a very
large number of rows, then we can recommend hll use.
2017-10-30 13:12:24 +02:00
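The fallback path described in e16805215d (workers return distinct values, the coordinator counts them) can be sketched roughly as follows. This is an illustration only: shards are modelled as in-memory lists of dicts, and the function names are hypothetical, not Citus functions.

```python
from typing import Dict, Iterable, List, Set

Row = Dict[str, object]


def worker_distinct_values(shard_rows: Iterable[Row], column: str) -> Set[object]:
    """What one worker sends back: the distinct values of `column` in its shard."""
    return {row[column] for row in shard_rows}


def coordinator_count_distinct(shards: List[List[Row]], column: str) -> int:
    """Merge the per-shard distinct sets and count them once on the coordinator."""
    seen: Set[object] = set()
    for shard_rows in shards:
        seen |= worker_distinct_values(shard_rows, column)
    return len(seen)


# Two shards of a table distributed by user_id; "city" is a non-partition column.
shards = [
    [{"user_id": 1, "city": "Ankara"}, {"user_id": 1, "city": "Izmir"}],
    [{"user_id": 2, "city": "Ankara"}],
]
print(coordinator_count_distinct(shards, "city"))
```

Summing per-shard counts would count "Ankara" twice here, which is why the distinct values themselves have to travel to the coordinator when the column is not the partition column.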
Marco Slot be46661bf7 Block only 2PCs instead of all writes in citus_create_restore_point 2017-10-27 00:07:32 +02:00
mehmet furkan şahin 61ae33dc7f ALTER TABLE .. REPLICA IDENTITY support is implemented 2017-10-26 13:44:28 +03:00
Brian Cloutier 4a17d12d74 Replace uint with uint32 2017-10-25 19:32:12 -07:00
velioglu 0b5db5d826 Support multi shard update/delete queries 2017-10-25 15:52:38 +03:00
Marco Slot 4bde83e1d2 Relay error message if DML fails on worker 2017-10-25 14:23:21 +02:00
Hadi Moshayedi 9a04b78980 Send server_id for statistics reports. (#1698)
This change introduces the `pg_dist_node_metadata` table, which has a single jsonb value. When creating
the extension, a random server id is generated and stored in there. Everything in the metadata table
is added as a nested object to the json payload that is sent to the reports server.
2017-10-18 21:20:32 -04:00
Hadi Moshayedi 86bcd93a4a Don't collect stats when there is a version mismatch. (#1712)
The following scenario can cause an Assert() crash if we don't do this:
- Install Citus v7.0-15
- Restart server & run a query to start maintenanced.
- Install Citus v7.1
- Restart server & run a query. This will tell user to upgrade.
- Type "UPDATE EXTENSION c" & press tab. maintenanced will start and crash
  with Assert(CitusHasBeenLoaded() && CheckCitusVersion(WARNING));

This change checks Citus version before calling metadata functions so the
crash doesn't happen.
2017-10-17 14:01:14 -04:00
Jason Petersen 8544878c4b
Add citus_version(), analogous to PG's version()
This will provide the full project name (i.e. Citus/Citus Enterprise),
and the host system, compiler, and architecture word size.

I wanted to limit the number of copied files in 'config', so I added
only config.guess and call it manually, rather than using the macro
AC_CANONICAL_HOST, which requires several other files.
2017-10-16 18:09:29 -06:00
Brian Cloutier 91ff8cd2d5 {*,}create_distributed_table doesn't emit OID (#1710) 2017-10-16 18:08:51 -06:00
Brian Cloutier ebcb2b65e9 Add master_move_node function 2017-10-16 10:51:28 -07:00
Brian Cloutier 58cf15ceca DistributedTableSize doesn't emit oid when erring out 2017-10-14 02:42:57 +03:00
Hadi Moshayedi 2aec6eda49 Properly use #ifdef HAVE_LIBCURL. 2017-10-13 12:04:36 -06:00
Jason Petersen 01353cb7cb Use header define rather than -D flag
Eclipse apparently doesn't scan build output looking for -D flags, so
having the value actually appear in a header is nicer for those of us
using IDEs.
2017-10-13 11:00:09 -04:00
Hadi Moshayedi 946659aebe Delete StatsCollection memory context after we are done with stats reporting.
Previously we left the memory context untouched, which leaked memory over time.
2017-10-13 11:00:09 -04:00
Hadi Moshayedi 873fd1e7ff Fix compiling --without-libcurl.
Previously <curl/curl.h> was included even if compiled --without-libcurl.
This can fail when libcurl headers are not there. This commit guards this
include by checks for HAVE_LIBCURL.
2017-10-13 11:00:09 -04:00
Murat Tuncer 4832abc7cb Make multi_master_planner.c coding convention compliant
Changed order of function definitions and added
declarations in the beginning of the file
2017-10-13 14:59:48 +03:00
Murat Tuncer f7ab901766 Add select distinct, and distinct on support
Distinct, and distinct on() clauses are supported
in simple selects, joins, subqueries, and insert into select
queries.
2017-10-13 14:59:48 +03:00
Hadi Moshayedi 6879f92e23 Fix out of bound memory access when getting HTTP response code. (#1699) 2017-10-12 12:51:42 -04:00
Hadi Moshayedi a1387f4aa8 Basic usage statistics collection. (#1656)
Adds ```citus.enable_statistics_collection``` GUC variable, which is ```true``` by default, unless built without libcurl. If statistics collection is enabled, sends basic usage data to Citus servers every 24 hours.

The data that is collected consists of:
- Citus version
- OS name & release
- Hardware Id
- Number of tables, rounded to next power of 2
- Size of data, rounded to next power of 2
- Number of workers
2017-10-11 09:55:15 -04:00
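The "rounded to next power of 2" step in a1387f4aa8 can be illustrated with the sketch below. It assumes that values which are already a power of 2 are left unchanged and that 0 stays 0; the commit message does not say how those edge cases are handled, and `next_power_of_two` is an illustrative name.

```python
def next_power_of_two(n: int) -> int:
    """Round a non-negative count up to a power of 2 (assumed edge cases: 0 -> 0, 8 -> 8)."""
    if n <= 0:
        return 0
    # (n - 1).bit_length() is the number of bits needed to represent n - 1,
    # so shifting 1 by that amount gives the smallest power of 2 that is >= n.
    return 1 << (n - 1).bit_length()


for tables in (0, 1, 5, 8, 9, 1000):
    print(tables, "->", next_power_of_two(tables))
```

Rounding like this coarsens the reported table count and data size, so the exact figures never leave the cluster.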
Onder Kalaci 498ac80d8b Add window function support for SUBQUERY PUSHDOWN and INSERT INTO SELECT
This commit provides the support for window functions in subquery and insert
into select queries. Note that our support for window functions is still limited
because it must have a partition by clause on the distribution key. This commit
makes changes in the files insert_select_planner and multi_logical_planner. The
required tests are also added with files multi_subquery_window_functions.out
and multi_insert_select_window.out.
2017-10-04 15:33:07 +03:00
Marco Slot 9e516513fc Use local group ID when querying for prepared transactions 2017-10-03 16:36:53 +02:00
Hadi Moshayedi 11adb9b034 Push down LIMIT and HAVING when grouped by partition key. (#1641)
We can do this because all rows belonging to a group are in the same shard when grouping by distribution column on a range/hash distributed table.
2017-10-02 20:17:51 -04:00
Marco Slot 394918f9d0 Invalidate worker and group ID cache in maintenance daemon 2017-10-02 18:14:29 +02:00
Marco Slot 43d5e79eaa Execute transmit commands as superuser during task-tracker queries 2017-09-28 15:27:25 +02:00
Marco Slot 306c58d59b Check for absolute paths in COPY with format transmit 2017-09-28 15:27:11 +02:00
Marco Slot cb6b0e820c Allow read-only users to run task-tracker queries 2017-09-28 13:52:36 +02:00
Marco Slot da6b42a3e2 Use unique constraint index for transaction record deletion 2017-09-28 12:04:56 +02:00
Onder Kalaci 68ca8cb7f0 Skip relation extension locks
We should skip processes that are blocked on a relation
extension lock, since those locks are held for a short duration
while the relation is actually extended on the disk and
released as soon as the extension is done. Thus, recording
such waits on our lock graphs could lead to detecting false
distributed deadlocks.
2017-09-28 10:09:09 +03:00
Murat Tuncer 4676c4f7a5 Prevent crash when remote transaction start fails (#1662)
We send multiple commands to the worker when starting a transaction.
Previously we only checked the result of the first command, which
is the transaction 'BEGIN' that always succeeds. Failures of the
following commands were not checked.

With this commit, we make sure all command results are checked.
If there is any error we report the first error found.
2017-09-26 17:25:46 -07:00
Jason Petersen b4d53423fa
Add adapter functions for OpenFile changes 2017-09-25 17:20:24 -07:00
Jason Petersen d686123dae
Omit now-public Explain methods from PG11 build
This copy-pasted code is no longer needed in PG11.
2017-09-25 17:20:24 -07:00
Jason Petersen 89d02c6115
Add ruleutils file for PostgreSQL 11 2017-09-25 17:20:24 -07:00
Jason Petersen bbc15e0598
Handle HASHPROC changes
PostgreSQL 11 now has "standard" and "extended" (64-bit) versions of
hash functions.
2017-09-25 17:20:24 -07:00
Jason Petersen 6c9b19a954
Add version-compat header
For polyfill macros, etc.
2017-09-25 17:20:23 -07:00
Jason Petersen fbeaa2f9d0
Remove direct access to tupleDesc->attrs
A level of indirection was removed from this field for PostgreSQL 11.
By using the handy provided macro, we can be version agnostic.
2017-09-25 17:20:23 -07:00
Jason Petersen 6a020b5adc
Update CopyGetAttnums with latest from PostgreSQL
This function was recently modified to use the TupleDescAttr wrapper,
which abstracts away recent changes to TupleDesc.
2017-09-25 17:20:23 -07:00
Andres Freund 78716e5546 Fix possible shard cache incoherency.
When a table and its shards are dropped, and afterwards the same
shard identifiers are reused, e.g. due to a DROP & CREATE EXTENSION,
the old entry in the shard cache and the required entry in the shard
cache might be for different tables.

Force invalidation for both old and new table to fix.
2017-09-25 13:05:09 -07:00
velioglu 0a56ed910b Change error message of queries with distributed and local table
Citus can handle INSERT INTO ... SELECT queries if the query inserts
into local table by reading data from distributed table. The opposite
way is not correct. With this commit we warn the user if the latter
option is used.
2017-09-22 13:46:19 -07:00
Onder Kalaci 867224bdd7 Make the tests produce more consistent outputs 2017-09-22 20:38:56 +03:00
Onder Kalaci 4782f9f98a Properly copy and trim the error messages that come from pg_conn
When a NULL connection is provided to PQerrorMessage(), the
returned error message is a static text. Modifying that static
text, which is not necessarily in writeable memory, is
dangerous and might cause a segfault.
2017-09-22 19:43:09 +03:00
Onder Kalaci 6736fd1682 Remove two obsolete functions
Namely GetConnectionFromPGconn() and CloseConnectionByPGconn()
2017-09-21 00:36:23 -06:00
Onder Kalaci 33ec33c5b3 Ensure schema exists on reference table creation
If the schema doesn't exist on the workers, create it.
2017-09-18 23:50:47 +03:00
Onder Kalaci 6116c8e93d Allow pushing down GROUP BYs when at least there is one distribution
column in the target list
2017-09-15 19:15:06 +03:00
Onder Kalaci a5b66912d4 Expand reference table support in subquery pushdown
With this commit, we relax the restrictions put on the reference
tables with subquery pushdown.

We did three notable improvements:

1) Relax equi-join restrictions

 Previously, we always expected that the non-reference tables are
 equi joined with reference tables on the partition key of the
 non-reference table.

 With this commit, we allow any column of non-reference tables
 joined using non-equi joins as well.

2) Relax OUTER JOIN restrictions

 Previously Citus errored out if any reference table existed at
 any point of the outer part of an outer join. For instance,
 see the sketch below, where (h) denotes a hash distributed relation,
 (r) denotes a reference table, (L) denotes LEFT JOIN and
 (I) denotes INNER JOIN.

             (L)
             /  \
           (I)     h
          /  \
        r      h

 Before this commit Citus would error out since a reference table
 appears on the leftmost part of a left join. However, that was
 too restrictive, so now we only error out if the reference table
 is directly below and in the outer part of an outer join.

3) Bug fixes

 We've done some minor bugfixes in the existing implementation.
2017-09-14 20:59:22 +03:00
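The relaxed OUTER JOIN rule from a5b66912d4 can be sketched as a walk over a join tree. This is a simplified model, not the planner code: only LEFT and INNER joins are represented, the outer side of a LEFT JOIN is taken to be its left child, and the `Table`, `Join` and `has_unsupported_outer_join` names are made up for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Table:
    name: str
    is_reference: bool


@dataclass
class Join:
    kind: str  # "inner" or "left"
    left: "Node"
    right: "Node"


Node = Union[Table, Join]


def has_unsupported_outer_join(node: Node) -> bool:
    """True if a reference table sits directly on the outer side of a LEFT JOIN."""
    if isinstance(node, Table):
        return False
    if node.kind == "left" and isinstance(node.left, Table) and node.left.is_reference:
        return True
    return has_unsupported_outer_join(node.left) or has_unsupported_outer_join(node.right)


r = Table("r", is_reference=True)
h1 = Table("h1", is_reference=False)
h2 = Table("h2", is_reference=False)

# The tree from the commit message: (r INNER JOIN h) LEFT JOIN h.
sketch = Join("left", Join("inner", r, h1), h2)
print(has_unsupported_outer_join(sketch))               # reference table is not directly below (L)
print(has_unsupported_outer_join(Join("left", r, h1)))  # reference table directly on the outer side
```

Under the older, stricter rule both trees would have been rejected, because `r` appears somewhere in the outer part of the LEFT JOIN in each of them.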
Marco Slot d1befa4df9 Wait for I/O to finish after PQputCopyData 2017-09-12 16:18:42 -07:00
Marco Slot cbe16169b4 Free per-tuple COPY memory in INSERT...SELECT 2017-09-12 15:35:53 -07:00
Marco Slot 5fe0845d7e Always copy MultiPlan in GetMultiPlan 2017-09-12 11:38:52 -07:00
Jason Petersen 8b2c3fcc15
Add clarifying comment to RngVarCallbackForDropIdx
We don't need the PARTITION-related logic recently added in PostgreSQL.
2017-09-01 15:57:30 -06:00
Jason Petersen ec30ad38ba
Update ruleutils_10 with latest PostgreSQL changes
See:
	postgres/postgres@21d304dfed
	postgres/postgres@bb5d6e80b1
	postgres/postgres@d363d42bb9
	postgres/postgres@eb145fdfea
	postgres/postgres@decb08ebdf
	postgres/postgres@a3ca72ae9a
	postgres/postgres@bc2d716ad0
	postgres/postgres@382ceffdf7
	postgres/postgres@c7b8998ebb
	postgres/postgres@e3860ffa4d
	postgres/postgres@76a3df6e5e
2017-09-01 14:26:59 -06:00
Jason Petersen ebecde8f6e
Update ruleutils_96 with latest PostgreSQL changes
See:
	postgres/postgres@41ada83774
	postgres/postgres@3b0c2dbed0
	postgres/postgres@ff2d537223
2017-09-01 14:26:53 -06:00
Marco Slot 0aadbb1760 Convert multi-row INSERT target list to Vars 2017-08-25 10:55:56 +02:00
Marco Slot ae00795dab Allow default columns in multi-row INSERTs 2017-08-25 10:55:56 +02:00
Marco Slot c97692f382 Fix multi-row INSERT with RETURNING on reference tables 2017-08-24 10:42:12 +02:00
Marco Slot dbf18df995 Don't error out if BuildGlobalWaitGraph fails to connect 2017-08-23 19:08:03 +02:00
Onder Kalaci c7bb29b69e Prevent maintenance daemon crashes due to dead processes
If the backend has been terminated/killed/cancelled externally after
the distributed deadlock detection decides to cancel it, we might
access a NULL pointer. This commit prevents that case by ignoring
the current distributed deadlock.
2017-08-23 15:44:09 +03:00
Marco Slot 641420d79f Remove source node argument from dump_local_wait_edges 2017-08-23 13:14:00 +02:00
Jason Petersen 8cb69e3a14 Add alias for target in multi-row INSERTs
This is necessary for multi-row INSERTs for the same reasons we use it
in e.g. UPSERTs: if the range table list has more than one entry, then
PostgreSQL's deparse logic requires that vars be prefixed by the name
of their corresponding range table entry. This of course doesn't affect
single-row INSERTs, but since multi-row INSERTs have a VALUE RTE, they
were affected.

The piece of ruleutils which builds range table names wasn't modified
to handle shard extension; instead UPSERT/INSERT INTO ... SELECT added
an alias to the RTE. When present, this alias is favored. Doing the
same in the multi-row INSERT case fixes RETURNING for such commands.
2017-08-23 10:24:00 +02:00
Marco Slot 4d7927b672 Execute multi-row INSERTs sequentially 2017-08-23 10:04:57 +02:00
Marco Slot cf375d6a66 Consider dropped columns that precede the partition column in COPY 2017-08-22 13:02:35 +02:00
Marco Slot bd6bf29983 Don't add procs multiple times in BuildWaitGraphForSourceNode 2017-08-21 16:48:30 +02:00
Onder Kalaci 6532b69873 Kill the maintenance daemon on DROP DATABASE 2017-08-18 16:03:08 +03:00
Metin Doslu 0d052e9864 Fix a crash on zero-shard tables 2017-08-18 13:53:59 +03:00
Önder Kalacı b82f886ad3 Merge branch 'master' into improve_deadlock_detection 2017-08-18 13:07:18 +03:00
Marco Slot 7523753a73 Clear metadata OID cache prior to deadlock detection 2017-08-18 11:20:24 +02:00
Andres Freund b936bde936 Take AccessShareLock on the extension prior to running deadlock detection 2017-08-18 11:20:24 +02:00
Onder Kalaci 20679c9e8b Relax assertion on deadlock detection considering
self deadlocks.
2017-08-18 11:16:38 +03:00
Onder Kalaci 550a5578d8 Skip deadlock detection on the workers
Do not run distributed deadlock detection
on the worker nodes to prevent erroneous
decisions to kill the deadlocks.
2017-08-17 19:43:38 +03:00
Marco Slot 1eca53ad40 Exit maintenanced on database crash 2017-08-16 18:29:44 +02:00
Marco Slot 9e7b1fb858 Return readable nodes in master_get_active_worker_nodes 2017-08-16 11:28:47 +02:00
Hadi Moshayedi e5fbcf37dd Add Savepoint Support (#1539)
This change adds support for SAVEPOINT, ROLLBACK TO SAVEPOINT, and RELEASE SAVEPOINT.

When transaction connections are not established yet, savepoints are kept in a stack and sent to the worker when the connection is later established. After establishing connections, savepoint commands are sent as they arrive.

This change fixes #1493 .
2017-08-15 13:02:28 -04:00
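The buffering behaviour described in e5fbcf37dd can be modelled with a small sketch. Everything here is an assumption made for illustration: `WorkerSession` is a made-up class, "sending" a command just appends it to a list, the `BEGIN` on connect is assumed, and the handling of a rollback before the connection exists (dropping later savepoints from the stack) is one plausible reading, not confirmed by the commit message.

```python
class WorkerSession:
    """Toy model of one worker connection inside a coordinator transaction."""

    def __init__(self):
        self.connected = False
        self.pending = []  # stack of savepoint names not yet sent
        self.sent = []     # commands that would have gone to the worker

    def savepoint(self, name):
        if self.connected:
            self.sent.append(f"SAVEPOINT {name}")
        else:
            self.pending.append(name)

    def rollback_to(self, name):
        if self.connected:
            self.sent.append(f"ROLLBACK TO SAVEPOINT {name}")
        else:
            # Nothing ran on the worker yet: forget savepoints created after `name`.
            last = len(self.pending) - 1 - self.pending[::-1].index(name)
            del self.pending[last + 1:]

    def connect(self):
        self.connected = True
        self.sent.append("BEGIN")
        # Replay the savepoints that were taken before the connection existed.
        self.sent.extend(f"SAVEPOINT {name}" for name in self.pending)
        self.pending.clear()


session = WorkerSession()
session.savepoint("a")
session.savepoint("b")
session.rollback_to("a")
session.connect()
session.savepoint("c")
print(session.sent)
```

Deferring the savepoints keeps transactions that never touch a given worker from opening a connection to it just to record a savepoint.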
Onder Kalaci 205501532a Add version check to the maintenance daemon
We should prevent running the deadlock detection if
there is a major version change. Otherwise, the daemon
may access obsolete metadata catalog tables.
2017-08-15 18:47:13 +03:00
Marco Slot 4614814de1 Enable 2PC for INSERT...SELECT via coordinator 2017-08-15 13:44:20 +02:00
Marco Slot fa70089766 Enable 2PC during distributed table creation 2017-08-15 13:44:20 +02:00
Marco Slot 9232823070 Abort on failure on master connection during copy from worker 2017-08-15 13:44:20 +02:00
Marco Slot df7723cde5 Should not commit on aborted non-critical connections 2017-08-15 13:44:20 +02:00
Eren Başak 77626c4238 Fix NULL nodeClusterString crash on pg_worker_list.conf migrations 2017-08-14 18:13:53 +03:00
Eren Başak b3d2f9ba71 Fix pg_worker_list use-after-free bug
This change fixes a use-after-free bug while renaming obsolete
`pg_worker_list.conf` file, which causes Citus to crash during upgrade
(or even extension creation) if `pg_worker_list.conf` exists.
2017-08-14 18:13:53 +03:00
Burak Yucesoy dfdfb44ebf Acquire shard resource locks on parent tables while operating on partitions 2017-08-14 14:44:30 +03:00
Burak Yucesoy a321e750c0 Acquire relation locks on partitions while operating on parent table 2017-08-14 14:44:30 +03:00
Burak Yucesoy 52b9e35d50 Add relationIdList field to the Job struct 2017-08-14 14:06:22 +03:00
Onder Kalaci 5b48de7430 Improve deadlock detection for MX
We added a new field to the transaction id that is set to true only
for the transactions initialized on the coordinator. This is only
useful for MX, in order to distinguish the backend that started
the distributed transaction on the coordinator from the same
transaction's worker queries, which could run on the same node.
2017-08-12 13:28:37 +03:00
Onder Kalaci 59133415b0 Add logging infrastructure for distributed deadlock detection
We added a new GUC citus.log_distributed_deadlock_detection
which is off by default. When set to on, we log some debug messages
related to the distributed deadlock to the server logs.
2017-08-12 13:28:37 +03:00
Onder Kalaci e5d5bdff51 Enable distributed deadlock detection on the maintenance daemon
With this commit, the maintenance daemon starts to check for
distributed deadlocks.

We also introduced a GUC variable (distributed_deadlock_detection_factor)
whose value is multiplied with Postgres' deadlock_timeout. Setting
it to -1 disables the distributed deadlock detection.
2017-08-12 13:28:37 +03:00
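The arithmetic behind the GUC introduced in e5d5bdff51 is small enough to sketch. The function name `detection_delay_ms` is illustrative, and how the daemon uses the product is not stated in the commit message; the sketch only shows the multiplication and the `-1` switch.

```python
from typing import Optional


def detection_delay_ms(deadlock_timeout_ms: float, factor: float) -> Optional[float]:
    """Multiply Postgres' deadlock_timeout by the factor; -1 disables detection."""
    if factor == -1:
        return None  # distributed deadlock detection is turned off
    return deadlock_timeout_ms * factor


# PostgreSQL's default deadlock_timeout is 1s (1000 ms).
print(detection_delay_ms(1000, 2))
print(detection_delay_ms(1000, -1))
```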
Onder Kalaci 66936053a0 Improve error messages when a backend is cancelled by deadlock detection
We send SIGINT to a backend that is cancelled due to a deadlock. That
approach results in a very confusing error message.

With this commit we intercept the error messages and show a more
meaningful error message to the user.
2017-08-12 13:28:37 +03:00
Onder Kalaci be4fc45c03 Deprecate enable_deadlock_prevention flag
Now that we have the necessary infrastructure for detecting
distributed deadlocks, we don't need enable_deadlock_prevention,
which is purely intended for preventing some forms of distributed
deadlocks.
2017-08-12 13:28:37 +03:00
Onder Kalaci a333c9f16c Add infrastructure for distributed deadlock detection
This commit adds all the necessary pieces to do the distributed
deadlock detection.

Each distributed transaction is already assigned with distributed
transaction ids introduced with
3369f3486f. The dependency among the
distributed transactions are gathered with
80ea233ec1.

With this commit, we implement a DFS (depth-first search) on the
dependency graph and search for cycles. Finding a cycle reveals
a distributed deadlock.

Once we find the deadlock, we examine the path on which the cycle
exists and cancel the youngest distributed transaction.

Note that, we're not yet enabling the deadlock detection by default
with this commit.
2017-08-12 13:28:37 +03:00
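The approach in a333c9f16c (DFS over the wait graph, then cancel the youngest transaction on the cycle) can be sketched as below. This is not the Citus implementation: the graph is a plain dict mapping a waiting transaction id to the ids it waits for, "youngest" is taken to mean the latest start timestamp, and `find_cycle` and `pick_victim` are illustrative names.

```python
from typing import Dict, List, Optional

WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, fully explored


def find_cycle(wait_graph: Dict[int, List[int]]) -> Optional[List[int]]:
    """Return the transaction ids on one cycle, or None if the graph is acyclic."""
    color = {node: WHITE for node in wait_graph}
    path: List[int] = []

    def visit(node: int) -> Optional[List[int]]:
        color[node] = GREY
        path.append(node)
        for blocker in wait_graph.get(node, []):
            state = color.get(blocker, WHITE)
            if state == GREY:
                # Back edge: everything from `blocker` to the path end is a cycle.
                return path[path.index(blocker):]
            if state == WHITE:
                found = visit(blocker)
                if found is not None:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(wait_graph):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle is not None:
                return cycle
    return None


def pick_victim(cycle: List[int], start_time: Dict[int, int]) -> int:
    """Choose the youngest transaction on the cycle (latest start time)."""
    return max(cycle, key=lambda txn: start_time[txn])


# 1 waits for 2, 2 waits for 3, 3 waits for 1; 4 waits for 1 but is not on the cycle.
graph = {1: [2], 2: [3], 3: [1], 4: [1]}
started = {1: 100, 2: 250, 3: 180, 4: 90}

cycle = find_cycle(graph)
print(cycle, "-> cancel", pick_victim(cycle, started))
```

Cancelling the youngest participant is a common choice because it tends to throw away the least work, and cancelling any single transaction on the cycle is enough to break it.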
Marco Slot 55992d4bc0 Disallow task-tracker queries on follower clusters 2017-08-12 11:47:31 +02:00
velioglu 100739f62a Change citus subversion 2017-08-11 11:57:57 +03:00
Marco Slot 53584affa8 Fix locking in create_distributed_table 2017-08-11 11:34:33 +03:00
velioglu 7c65001e23 Do not delete row from colocation table within drop table 2017-08-11 11:34:33 +03:00
velioglu b0efffae1c Correct planner and add more tests 2017-08-11 10:16:13 +03:00
velioglu 7550b8ad52 Fix anchor shard id selection when reference table exists 2017-08-11 10:09:47 +03:00
velioglu ceba81ce35 Move physical planner checks to logical planner 2017-08-11 10:09:47 +03:00
velioglu 0359d03530 Add set operation check for reference tables 2017-08-11 10:09:47 +03:00
velioglu c4e3b8b5e1 Add planner changes and tests for subquery on reference tables 2017-08-11 10:09:47 +03:00
velioglu 45717dd013 Check equivalence on reference tables for subquery pushdown 2017-08-11 10:09:47 +03:00
Marco Slot 0ae265c436 Add citus_create_restore_point for distributed snapshots 2017-08-11 07:36:20 +02:00
Marco Slot fdff210ef7 Wait for commit/abort/prepare results asynchronously 2017-08-11 00:03:06 +02:00
Marco Slot fca986f214 Add API for waiting for multiple connections 2017-08-11 00:03:06 +02:00
Brian Cloutier 9d93fb5551 Create citus.use_secondary_nodes GUC
This GUC has two settings, 'always' and 'never'. When it's set to
'never' all behavior stays exactly as it was prior to this commit. When
it's set to 'always' only SELECT queries are allowed to run, and only
secondary nodes are used when processing those queries.

Add some helper functions:
- WorkerNodeIsSecondary(), checks the noderole of the worker node
- WorkerNodeIsReadable(), returns whether we're currently allowed to
  read from this node
- ActiveReadableNodeList(), some functions (namely, the ones on the
  SELECT path) don't require working with Primary Nodes. They should call
  this function instead of ActivePrimaryNodeList(), because the latter
  will error out in contexts where we're not allowed to write to nodes.
- ActiveReadableNodeCount(), like the above, replaces
  ActivePrimaryNodeCount().
- EnsureModificationsCanRun(), error out if we're not currently allowed
  to run queries which modify data. (Either we're in read-only mode or
  use_secondary_nodes is set)

Some parts of the code were switched over to use readable nodes instead
of primary nodes:
- Deadlock detection
- DistributedTableSize,
- the router, real-time, and task tracker executors
- ShardPlacement resolution
2017-08-10 17:37:17 +03:00
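The helper functions listed in 9d93fb5551 can be mirrored in a short sketch to show how the two GUC settings change which nodes are used. The node dictionaries, the `isactive` field and the snake_case function names are assumptions for the example; the real helpers are C functions with different signatures.

```python
def worker_node_is_readable(node, use_secondary_nodes):
    """'always' reads only from secondaries; 'never' keeps the old primary-only behaviour."""
    if use_secondary_nodes == "always":
        return node["noderole"] == "secondary"
    return node["noderole"] == "primary"


def active_readable_node_list(nodes, use_secondary_nodes):
    """Nodes a SELECT may be sent to under the current setting."""
    return [
        node for node in nodes
        if node["isactive"] and worker_node_is_readable(node, use_secondary_nodes)
    ]


def ensure_modifications_can_run(use_secondary_nodes, read_only=False):
    """Raise if queries which modify data are not currently allowed."""
    if read_only or use_secondary_nodes == "always":
        raise RuntimeError("writing to worker nodes is not currently allowed")


nodes = [
    {"nodename": "w1", "noderole": "primary", "isactive": True},
    {"nodename": "w1-replica", "noderole": "secondary", "isactive": True},
    {"nodename": "w2", "noderole": "primary", "isactive": False},
]

print([n["nodename"] for n in active_readable_node_list(nodes, "never")])
print([n["nodename"] for n in active_readable_node_list(nodes, "always")])
```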
Brian Cloutier 3fc87a7a29 Metadata sync also syncs nodes in other clusters 2017-08-10 16:55:55 +03:00
Brian Cloutier 0dee4f8418 Metadata sync syncs all nodes, not just primaries 2017-08-10 16:55:55 +03:00
Eren Başak f9470329e5 Remove test_helper_functions.h inclusions 2017-08-10 12:42:46 +03:00
Eren Başak 3061737712 Define Some Utility Functions
This change declares two new functions:

`master_update_table_statistics` updates the statistics of shards belonging
to the given table as well as its colocated tables.

`get_colocated_shard_array` returns the ids of colocated shards of a
given shard.
2017-08-10 12:42:46 +03:00
Brian Cloutier 1961add6f9 Improve error message when there are no nodes for a placement 2017-08-10 12:38:51 +03:00
Jason Petersen dee66e3959
Final review feedback 2017-08-10 01:10:09 -07:00
Jason Petersen 6a35c2937c
Enable multi-row INSERTs
This is a pretty substantial refactoring of the existing modify path
within the router executor and planner. In particular, we now hunt for
all VALUES range table entries in INSERT statements and group the rows
contained therein by shard identifier. These rows are stashed away for
later in "ModifyRoute" elements. During deparse, the appropriate RTE
is extracted from the Query and its values list is replaced by these
rows before any SQL is generated.

In this way, we can create multiple Tasks, but only one per shard, to
piecemeal execute a multi-row INSERT. The execution of jobs containing
such tasks now exclusively goes through the "multi-router executor" which
was previously used for e.g. INSERT INTO ... SELECT.

By piggybacking onto that executor, we participate in ongoing
transactions, get rollback-ability, etc. In short order, the only
remaining use of the "single modify" router executor will be for bare
single-row INSERT statements (i.e. those not in a transaction).

This change appropriately handles deferred pruning as well as master-
evaluated functions.
2017-08-10 00:32:46 -07:00
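The grouping step described in 6a35c2937c (rows of a multi-row INSERT grouped by shard, one task per shard) can be sketched as follows. The `shard_for_value` callback stands in for the real shard pruning logic, the modulo mapping and the shard ids are invented for the example, and a task is modelled as a simple `(shard_id, rows)` pair.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Row = Tuple[object, ...]


def group_rows_by_shard(
    rows: List[Row], dist_col_index: int, shard_for_value: Callable[[object], int]
) -> Dict[int, List[Row]]:
    """Bucket the VALUES rows by the shard their distribution column maps to."""
    groups: Dict[int, List[Row]] = defaultdict(list)
    for row in rows:
        groups[shard_for_value(row[dist_col_index])].append(row)
    return dict(groups)


def build_tasks(
    rows: List[Row], dist_col_index: int, shard_for_value: Callable[[object], int]
) -> List[Tuple[int, List[Row]]]:
    """One task per shard, each carrying only the rows that belong to that shard."""
    groups = group_rows_by_shard(rows, dist_col_index, shard_for_value)
    return sorted(groups.items())


# Hypothetical mapping: two shards, chosen by the parity of the key.
def shard_for_key(key):
    return 102008 + key % 2


rows = [(1, "a"), (2, "b"), (3, "c"), (5, "d")]
for shard_id, shard_rows in build_tasks(rows, 0, shard_for_key):
    print(shard_id, shard_rows)
```

Keeping one task per shard means each shard receives a single INSERT with all of its rows, rather than one statement per row.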
velioglu 7e436c0277 Add bool expression to pruning instance with a function 2017-08-10 08:56:36 +03:00
Andres Freund e8b793c454 Support for IN (const, list) and = ANY(const, b, c) pruning. 2017-08-10 08:56:36 +03:00
Onder Kalaci b5ea3ab6a3 Improve locking semantics for backend management
We use the backend shared memory lock to prevent
new backends from becoming part of a new distributed transaction,
or an existing backend from leaving a distributed transaction,
while we're reading all backends' data.

The primary goal is to provide a consistent view of the
current distributed transactions while doing the
deadlock detection.
2017-08-09 17:17:12 +03:00
Brian Cloutier 2e0916e15a Add master_add_secondary_node() UDF 2017-08-09 17:10:48 +03:00
Marco Slot 08ed6d8269 Prevent pg_dist_node changes during master_create_empty_shard 2017-08-09 14:22:09 +02:00
Marco Slot 3a0571e69b Remove LockMetadataSnapshot 2017-08-09 14:09:54 +02:00
Marco Slot c2f8bafa05 Fix shard creation vs. pg_dist_node change locking 2017-08-09 14:09:54 +02:00
Marco Slot 868ee6be83 Fix and simplify pg_dist_node locking 2017-08-09 14:09:54 +02:00
Burak Yucesoy 8455d1a4ef Ensure we are allowing partitioned tables at all appropriate places 2017-08-09 10:01:35 +03:00
Burak Yucesoy 2eee556738 Add distributed partitioned table support for COPY
For partitioned tables, PostgreSQL opens the partitioned table and its
partitions in BeginCopyFrom and expects its caller to close those relations.
However, we do not have quick access to the opened relations, and performing
special operations for partitioned tables isn't necessary on the coordinator
node. Therefore, before calling BeginCopyFrom, we change the relkind of those
partitioned tables to RELKIND_RELATION. This prevents PostgreSQL from opening
their partitions as well.
2017-08-09 10:01:35 +03:00
Burak Yucesoy 31f3221342 Add distributed partitioned table support to router plannable queries
In standard_planner, PostgreSQL expands partitioned tables to their
partitions and calls our restriction hook for each partition. It also,
for some queries, skips the partitioned table itself completely. This
behaviour makes it difficult to prune shards and decide whether a query
is router plannable or not. To prevent this behaviour, we change the inh
flag of partitioned tables to false in the query tree. In this case,
PostgreSQL treats those partitioned tables as regular relations and
does not expand them.

This behaviour is in line with our expectations, because we do not want
to treat partitioned tables differently on the coordinator. Although we are
not entirely comfortable with modifying the query tree, other solutions to
this problem are overly complicated.
2017-08-09 10:01:35 +03:00
Burak Yucesoy fddf9b3fcc Add distributed partitioned table support to distributed table creation
With this PR, Citus starts to support all possible ways to create
distributed partitioned tables. These are:

- Distributing already created partitioning hierarchy
- CREATE TABLE ... PARTITION OF a distributed_table
- ALTER TABLE distributed_table ATTACH PARTITION non_distributed_table
- ALTER TABLE distributed_table ATTACH PARTITION distributed_table

We also support DETACHing partitions from partitioned tables and propagating
TRUNCATE and DDL commands to distributed partitioned tables.

This PR also refactors some parts of distributed table creation logic.
2017-08-09 10:01:35 +03:00
Metin Doslu b8a9e7c1bf Add support for UPDATE/DELETE with subqueries 2017-08-08 21:35:08 +03:00
Marco Slot d3e9746236 Avoid connections that accessed non-colocated placements in multi-shard commands 2017-08-08 18:32:34 +02:00
Brian Cloutier 7060ade6fe GetNodeTuple returns NULL if node does not exist
It never throws an error.
2017-08-08 13:12:06 +03:00
Brian Cloutier a3e9bef685 All users of WorkerNodeHash take an AccessShareLock
The metadata cache simulates a SELECT on pg_dist_node. Now the locks it
takes also simulate that SELECT.
2017-08-08 13:12:06 +03:00