citus

Commit Graph

Author	SHA1	Message	Date
Marco Slot	1e9186a3b5	Do not use new connection in table size functions	2018-02-23 07:07:55 +01:00
Markus Sintonen	6202e80d06	Implemented jsonb_agg, json_agg, jsonb_object_agg, json_object_agg	2018-02-18 00:19:18 +02:00
velioglu	195ac948d2	Recursively plan subqueries in WHERE clause when FROM recurs	2018-02-13 19:52:12 +03:00
Marco Slot	0cba4ab588	Refactor worker node hash initialisation	2018-02-12 23:36:43 +01:00
Marco Slot	40d715d494	Cache worker node array for faster iteration	2018-02-12 23:36:43 +01:00
Marco Slot	6e79a34c97	Do not check for cancellation in ClearResultsIfReady	2018-02-12 16:45:02 +01:00
Marco Slot	6051aae56e	Handle errors that are discovered during abort	2018-02-12 16:45:02 +01:00
Marco Slot	ee6a751798	Only copy distributed plan when modifying it	2018-02-12 16:30:55 +01:00
Onder Kalaci	94c5ac6ebb	Remove duplicate join restrictions We use PostgreSQL hooks to accumulate the join restrictions and PostgreSQL gives us all the join paths it tries while deciding on the join order. Thus, for queries that have many joins, this function is likely to remove lots of duplicate join restrictions. This becomes relevant for Citus on query pushdown check performance.	2018-02-12 18:35:05 +02:00
Onder Kalaci	c228d8ff3d	Refactor equivalence generation related code This commit changes the APIs for restriction generation to make future changes simpler.	2018-02-12 18:35:04 +02:00
Onder Kalaci	2f2d350924	Refactor relation restriction related code This commit moves some of the functions to a more relevant source file.	2018-02-12 18:35:04 +02:00
Murat Tuncer	901b543e20	Fix count distinct using field select on top level query We were allowing count distinct queries even if they were not directly on columns if the query is grouped on distribution column. When performing these checks we were skipping subqueries because they also perform this check in a more concise manner. We relied on oid SUBQUERY_RELATION_ID (10000) to decide if a given RTE relation id denotes a subquery, however, we also use SUBQUERY_PUSHDOWN_RELATION_ID (10001) for some subqueries. We skip both types of subqueries with this change.	2018-02-06 13:16:10 +03:00
metdos	35f864bcaf	Respect enable_hashagg in the master planner	2018-02-05 15:06:00 +02:00
metdos	3d540d961c	Fix typo in grouping_is_sortable()	2018-02-05 12:10:19 +02:00
Marco Slot	6f7c3bd73b	Skip JSON validation on coordinator during COPY	2018-02-02 15:33:27 +01:00
Brian Cloutier	15511f6ba1	Dynamically allocate connection metadata in WaitForAllConnections	2018-02-01 10:30:41 -08:00
Brian Cloutier	e6ebfc1f53	Remove VLA from UpdateNodeLocation	2018-02-01 10:30:41 -08:00
Brian Cloutier	a2ed45e206	Remove variable length arrays VLAs aren't supported by Visual Studio. - Remove all existing instances of VLAs. - Add a flag, -Werror=vla, which makes gcc refuse to compile if we add VLAs in the future.	2018-02-01 10:30:41 -08:00
Brian Cloutier	2efe80ce55	CheckForDistributedDeadlocks no longer uses a VLA - variable length arrays (VLAs) do not work with Visual Studio - fix an off-by-one error. We incorrectly assumed there would always be at least as many edges as there were nodes. - refactor: reduce scope of transactionNodeStack by moving it into the function which uses it. - refactor: break up the distinct uses of currentStackDepth into separate variables.	2018-02-01 10:30:41 -08:00
Brian Cloutier	097fd15a89	small refactor, CheckDeadlockForTransactionNode builds its own array	2018-02-01 10:30:41 -08:00
Brian Cloutier	457f570b77	Small refactor, we were using incompatible types	2018-01-31 11:05:59 -08:00
Brian Cloutier	b864d014ab	GetNextNodeId() incorrectly called PG_RETURN_DATUM - Also stabilize the output of a multi_router_planner test	2018-01-29 15:32:36 -08:00
Brian Cloutier	61a6b846b9	Refactor: use a temporary timestamp variable It's against our coding convention to call functions inside parameter lists; when single-stepping with a debugger it's difficult to determine what the function returned. That wouldn't be good enough reason to change this code but while porting Citus to Windows I ran into this line of code. assign_distributed_transaction_id was called with a weird timestamp and I wasn't able to find the problem without first making this change.	2018-01-29 11:20:13 -08:00
Marco Slot	bd0ebac865	Skip call to ActiveReadableNodeList when there are no subplans	2018-01-29 16:05:10 +01:00
Hadi Moshayedi	ff26bcd5a5	Include sys/stat.h for S_IRUSR and S_IWUSR. (#1977)	2018-01-26 16:21:48 -05:00
Brian Cloutier	76d1edc3fd	Don't rely on gcc-specific features (#1963) * Don't use expressions inside compound statements * Don't depend on __builtin_constant_p * Remove reliance on S_ISLNK * Replace use of __func__: older MSVC doesn't support this builtin	2018-01-23 17:03:29 -08:00
Onder Kalaci	fbde87d2d0	Allocate enough space for transaction nodes This fix prevents any potential memory access that might occur while forming the deadlock path.	2018-01-22 08:45:48 +02:00
Onder Kalaci	9a89c0b425	Fix bug while traversing the distributed deadlock graph With this fix, we traverse the graph with DFS, which was originally intended. Note that, before the fix, we traversed the graph with BFS, which might lead to killing some unrelated backend that is not involved in the distributed deadlock.	2018-01-22 08:45:48 +02:00
Dimitri Fontaine	c9760fbb64	Fix CREATE INDEX with storage options on distributed tables. By sharing the implementation of the function AppendOptionListToString on three call sites, we would expand an extra OPTIONS keyword in a create index statement, and omit other bits of the specific syntax here. This patch introduces an AppendStorageParametersToString() function that is very similar to AppendOptionListToString() but handles WITH(a="foo",...) syntax that is used in reloptions (aka Storage Parameters). Fixes #1747.	2018-01-17 21:56:40 +01:00
Dimitri Fontaine	952da72c55	Implement ALTER TABLE|INDEX ... SET|RESET (). PostgreSQL implements support for several relation kinds in a single statement, such as in the AlterTableStmt case, which supports both tables and indexes and more (see ATExecSetRelOptions in PostgreSQL source code file src/backend/commands/tablecmds.c for an example of that). As a consequence, this patch implements support for setting and resetting storage parameters on both relation kinds.	2018-01-17 21:56:40 +01:00
Dimitri Fontaine	17266e3301	Implement ALTER INDEX ... RENAME TO ... The command is now distributed among the shards when the table is distributed. To that effect, we fill in the DDLJob's targetRelationId with the OID of the table for which the index is defined, rather than the OID of the index itself.	2018-01-17 21:56:40 +01:00
velioglu	d357d2fccd	Bump citus version to 7.3devel	2018-01-16 11:50:28 +03:00
Dimitri Fontaine	e010238280	Implement ALTER TABLE ... RENAME TO ... The implementation was already mostly in place, but the code was protected by a principled check against the operation. Turns out there's a nasty concurrency bug though with long identifier names, much as in #1664. To prevent deadlocks from happening, we could either review the DDL transaction management in shards and placements, or we can simply reject names with (NAMEDATALEN - 1) chars or more — that's because of the PostgreSQL array types being created with a one-char prefix: '_'.	2018-01-11 13:21:24 +01:00
Hadi Moshayedi	5d7c52ffa6	Don't return in PG_TRY() block when cancellations happen in WaitForConnections(). (#1923) We shouldn't return in the middle of a PG_TRY() block because if we do, we won't reset PG_exception_stack, and later when a re-throw tries to jump to the jump-point which was active in this PG_TRY() block, it seg-faults. We used to return in the middle of a PG_TRY() block in WaitForConnections() where we checked for cancellations. Whenever cancellations were caught here, Citus crashed. An example was reported by @onderkalaci at #1903.	2018-01-03 09:54:03 -05:00
Marco Slot	8f69973411	Fix cancellation issues in the real-time executor (#1905)	2018-01-01 23:10:29 -05:00
Marco Slot	3fd65cb91b	Do not raise errors in the real-time executor (#1903)	2018-01-01 22:26:31 -05:00
Onder Kalaci	a1bbdf2d44	Outer joins should also use subquery pushdown planner if join clause is not supported This change allows unsupported clauses to go through query pushdown planner instead of erroring out as we already do for non-outer joins.	2017-12-29 16:40:47 +02:00
Marco Slot	09c09f650f	Recursively plan set operations when leaf nodes recur	2017-12-26 13:46:55 +02:00
mehmet furkan şahin	446893234a	unsupported subquery error messages are fixed	2017-12-25 15:10:59 +03:00
mehmet furkan şahin	57bc86e23d	new debug output for subplans	2017-12-25 09:50:51 +03:00
Marco Slot	fa7fa2734b	Log remote commands sent via MultiClientSendQuery	2017-12-22 16:18:40 +01:00
Murat Tuncer	87c6f306f1	Fix join clause eq restrictions (#1884) We used to error out if the join clause includes filters like t1.a < t2.a even if another filter like t1.key = t2.key exists. Recently we lifted that restriction in subquery planning by focusing on equivalence classes provided by postgres. This checkin forwards real-time queries that previously errored out due to join clauses to the subquery planner and lets it handle the join even if the query does not have a subquery. We are now pushing down queries that do not have any subqueries in them. Error message looked misleading, changed to a more descriptive one.	2017-12-22 12:16:14 +03:00
metdos	32b7e152a3	Get shard resource locks for only DMLs	2017-12-22 10:30:41 +02:00
Murat Tuncer	a9cf0c3e66	Fix CTE column alias issue (#1893) We were creating intermediate query result's target names from subquery target list. Now we also check if cte re-defines its column name aliases, and create intermediate result query accordingly.	2017-12-22 09:39:40 +03:00
Brian Cloutier	377b31dcf7	Remove enable_deadlock_prevention prevention warning	2017-12-21 14:47:52 +01:00
Brian Cloutier	fb7b86fa14	Replace strtoull with pg_strtouint64 The macro we were using to detect strtoull isn't set on Windows, and just in case there are differences use a portable function from PG instead of calling strtoull directly.	2017-12-21 14:28:51 +01:00
mehmet furkan şahin	fd546cf322	Intermediate result size limitation This commit introduces a new GUC to limit the intermediate result size which we handle when we use read_intermediate_result function for CTEs and complex subqueries.	2017-12-21 14:26:56 +03:00
Onder Kalaci	0d5a4b9c72	Recursively plan subqueries that are not safe to pushdown With this commit, Citus recursively plans subqueries that are not safe to pushdown, in other words, those that require a merge step. The algorithm is simple: Recursively traverse the query from bottom up (i.e., bottom meaning the leaf queries). On each level, check whether the query is safe to pushdown (or a single repartition subquery). If the answer is yes, do not touch that subquery. If the answer is no, plan the subquery separately (i.e., create a subPlan for it) and replace the subquery with a call to `read_intermediate_results(planId, subPlanId)`. During the execution, run the subPlans first, and make them available to the next query executions. Some of the queries that this change allows: * Subqueries with LIMIT * Subqueries with GROUP BY/DISTINCT on non-partition keys * Subqueries involving re-partition joins, router queries * Mixed usage of subqueries and CTEs (i.e., use CTEs in subqueries as well). Nested subqueries as long as we support the subquery inside the nested subquery. * Subqueries with local tables (i.e., those subqueries have the limitation that they have to be leaf subqueries) * VIEWs on the distributed tables just work (i.e., the limitations mentioned below still apply to views) Some of the queries that are still NOT supported: * Correlated subqueries that are not safe to pushdown * Window function on non-partition keys * Recursively planned subqueries or CTEs on the outer side of an outer join * Only recursively planned subqueries and CTEs in the FROM (i.e., not any distributed tables in the FROM) and subqueries in WHERE clause * Subquery joins that are not on the partition columns (i.e., each subquery is individually joined on partition keys but not the upper level subquery.) * Any limitation that logical planner applies such as aggregate distincts (except for count) when GROUP BY is on non-partition key, or array_agg with ORDER BY	2017-12-21 08:37:40 +02:00
Onder Kalaci	e12ea914b9	Refactor ErrorIfQueryNotSupported to defer errors	2017-12-20 09:03:49 +02:00
Onder Kalaci	71ce42b936	Refactor RecursivelyPlanSubqueriesAndCTEs() to make it ready to work with subqueries	2017-12-20 09:03:47 +02:00
Marco Slot	5e0539efa3	Plan CTEs when subquery pushdown is on	2017-12-19 16:34:56 +01:00
Marco Slot	44a1ea631a	Show distributed subplan ID in EXPLAIN output	2017-12-19 16:34:56 +01:00
Marco Slot	35dbacdb69	Do not reinitialise MyBackendData	2017-12-19 15:56:26 +01:00
Marco Slot	af201a2f6d	Allow intermediate results to be used in parallel workers	2017-12-18 19:05:08 +01:00
Marco Slot	7dab078e67	Set cost estimates for read_intermediate_result	2017-12-18 16:23:44 +01:00
Marco Slot	74bd33d0cc	Revert "Plan CTEs when subquery pushdown is on" This reverts commit `e3b953b8e3`.	2017-12-17 22:34:20 +01:00
Marco Slot	aca5f35ab9	Revert "Show distributed subplan ID in EXPLAIN output" This reverts commit `686b079272`.	2017-12-17 22:34:04 +01:00
Marco Slot	e3b953b8e3	Plan CTEs when subquery pushdown is on	2017-12-17 21:49:36 +01:00
Marco Slot	686b079272	Show distributed subplan ID in EXPLAIN output	2017-12-16 11:32:01 +01:00
Marco Slot	ea6b98fda4	Allow count(distinct) in queries with a subquery	2017-12-15 15:24:26 +01:00
Marco Slot	9ee0e68882	Do not take extra access exclusive lock on partitioned tables	2017-12-15 13:02:31 +01:00
Marco Slot	5a69fc1b17	Relax checks on recurring tuples in FROM with sublinks	2017-12-15 11:56:06 +01:00
Marco Slot	a64f0060ba	Reduce the frequency of FinishConnectionIO calls during COPY (#1864)	2017-12-14 13:21:59 -05:00
Marco Slot	2e2b4e81fa	Add support for CTEs in distributed queries	2017-12-14 09:32:55 +01:00
Marco Slot	d0335ec818	Send BEGIN for SELECTs in the router executor	2017-12-14 09:32:55 +01:00
Marco Slot	cbbd418af2	Add citus.copy_format OIDs to metadata cache	2017-12-14 09:32:55 +01:00
Marco Slot	66f9f1d6cd	Make some intermediate results functions public	2017-12-14 09:32:55 +01:00
Marco Slot	36ee21c323	Make CanUseBinaryCopyFormatForType public	2017-12-14 09:32:55 +01:00
Marco Slot	7d1191954d	Add DistributedSubPlan node	2017-12-14 09:32:55 +01:00
Onder Kalaci	86b2d9420c	Treat recurring tuples as reference table for GROUP BY checks read_intermediate_results() and immutable functions are implemented. Empty join trees seem not applicable here.	2017-12-13 14:55:42 +02:00
Marco Slot	d1a470a52e	Fix issue with multiple ANALYZE in transaction block	2017-12-12 10:28:48 +01:00
mehmet furkan şahin	3c941aedf1	adds citus.enable_repartition_joins GUC The new GUC allows Citus to switch between task executors when necessary	2017-12-11 09:36:37 +03:00
Marco Slot	60a1e31671	Allow queries with local tables in NeedsDistributedPlanning	2017-12-07 16:20:23 +01:00
Marco Slot	f8550b8c85	Fix issues with read_intermediate_result signature	2017-12-07 13:47:56 +01:00
Marco Slot	d8fea4efb8	Revert "Allow queries with local tables in NeedsDistributedPlanning" This reverts commit `d2bac081e8`.	2017-12-07 11:19:11 +01:00
Marco Slot	d2bac081e8	Allow queries with local tables in NeedsDistributedPlanning	2017-12-07 11:02:16 +01:00
Onder Kalaci	c42a92afd2	Fix bug related to not incrementing an index properly	2017-12-07 08:50:57 +02:00
Marco Slot	eab15aa035	Avoid deadlock in ColocatedTableId	2017-12-06 11:49:34 +01:00
Marco Slot	7279d42849	Treat read_intermediate_result as recurring tuples	2017-12-04 14:50:11 +01:00
Marco Slot	4cdadfcab6	Add intermediate results infrastructure	2017-12-04 14:50:11 +01:00
Marco Slot	bfcc76df69	Make several COPY-related functions public	2017-12-04 13:12:03 +01:00
Marco Slot	73989b07eb	Refactor query execution functions	2017-12-04 13:12:03 +01:00
Murat Tuncer	2d66bf5f16	Fix hard coded formatting strings for 64 bit numbers (#1831) Postgres provides OS-agnostic formatting macros for formatting 64 bit numbers. Replaced %ld %lu with INT64_FORMAT and UINT64_FORMAT respectively. Also found some incorrect usages of formatting flags and fixed them.	2017-12-04 14:11:06 +03:00
Hadi Moshayedi	ff706cf556	Test that COPY blocks UPDATE/DELETE/INSERT...SELECT when rep factor 2.	2017-11-30 14:52:29 -05:00
Marco Slot	acbc0fe0de	Use RowExclusiveLock shard resource lock in COPY	2017-11-30 09:15:45 -05:00
Onder Kalaci	a273711500	The common attribute equivalence class always includes the input relations We added the ability to filter out the planner restriction information for specific parts of the query. This might lead to situations where the common restriction includes some other relations that we're searching for. The reason is that while filtering for join restrictions, we add the restriction as soon as we find the relation. With this commit we make sure that the common attribute equivalence class always includes the input relations.	2017-11-30 16:00:26 +02:00
Marco Slot	d6dd0b3a81	Send BEGIN in the real-time executor when in a transaction	2017-11-30 12:59:09 +01:00
Marco Slot	3a4d5f8182	Remove filter checks on leaf queries	2017-11-30 12:25:14 +01:00
Marco Slot	3f03cb6a6a	Support UNION with joins in the subqueries	2017-11-30 10:37:56 +01:00
Marco Slot	a9933deac6	Make real time executor work in transactions	2017-11-30 09:59:32 +03:00
Jason Petersen	0eacf6bd95	Refactor VacuumStmt checker to be single-return Decided this would be safer for the future (defaults to unsupported).	2017-11-29 16:06:50 -07:00
Jason Petersen	b12e77ab0e	Ensure unsupported VACUUMs don't go to workers Apparently these two blocks have been incorrect for nearly a year…	2017-11-29 16:06:50 -07:00
Marco Slot	7ea718fd8d	Round-robin over worker nodes for 0-shard router queries	2017-11-29 15:52:22 +01:00
Onder Kalaci	05fb0dd020	Add infrastructure for filtering restriction contexts based on the input query In subquery pushdown, we first ensure that each relation is joined with at least one other relation on the partition keys. That's fine given that the decision is binary: pushdown the query at all or not. With recursive planning, we'd want to check whether any specific part of the query can be pushed down or not. Thus, we need the ability to understand which part(s) of the subquery are safe to pushdown. This commit adds the infrastructure for doing that.	2017-11-28 09:58:21 +02:00
Onder Kalaci	26d9b58e9e	Make sure that ExtractRangeTableRelationWalker never misses RTE_RELATION	2017-11-28 09:27:34 +02:00
Onder Kalaci	32def06ebd	Split assigning RTE identities and partitioning related query modifications Note that we used to iterate over the RTEs once for performance reasons. However, keeping an extra copy of original query seems more costly and hard to maintain/explain.	2017-11-28 09:27:34 +02:00
Marco Slot	feffe86440	Subqueries containing functions go through subquery pushdown	2017-11-27 22:13:02 +01:00
Onder Kalaci	48f96bf3e5	Enable non equi joins in subquery pushdown Subquery pushdown planning is based on relation restriction equivalence. This brings us the opportunity to allow any other joins as long as there is already an equi join between the distributed tables. We already allow that for joins with reference tables and this commit allows that for joins among distributed tables.	2017-11-23 16:13:46 +02:00
Onder Kalaci	16421f089f	Register citus custom scan nodes	2017-11-23 11:38:33 +02:00
Onder Kalaci	83c1143505	Refactor custom scan related code In this commit, we don't change any code, only create a new file and move the related functions and types there.	2017-11-23 11:38:12 +02:00
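
The bottom-up traversal described in commit 0d5a4b9c72 can be illustrated with a small sketch. This is not the Citus implementation (which is C inside the planner); the `Query` and `IntermediateResultRef` classes, the fixed `safe_to_pushdown` flag, and the `recursively_plan` function are all hypothetical names introduced here, and the real pushdown-safety check is far more involved than a boolean.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class IntermediateResultRef:
    """Hypothetical stand-in for a read_intermediate_results(planId, subPlanId) call."""
    plan_id: int
    sub_plan_id: int


@dataclass
class Query:
    """Hypothetical query node; safe_to_pushdown is a fixed flag in this sketch."""
    name: str
    safe_to_pushdown: bool
    subqueries: List[Union["Query", IntermediateResultRef]] = field(default_factory=list)


def recursively_plan(query: Query, plan_id: int, sub_plans: List[Query]) -> List[Query]:
    """Walk the query tree bottom-up and plan unsafe subqueries separately.

    Each subquery that is not safe to push down is appended to sub_plans and
    replaced in its parent by a reference to its intermediate result.
    """
    for index, child in enumerate(query.subqueries):
        if not isinstance(child, Query):
            continue  # already replaced by an intermediate result reference
        # Visit the leaves first so inner subqueries are planned before outer ones.
        recursively_plan(child, plan_id, sub_plans)
        if not child.safe_to_pushdown:
            sub_plans.append(child)
            query.subqueries[index] = IntermediateResultRef(plan_id, len(sub_plans))
    return sub_plans


# Usage: an outer query with one safe subquery and one nested unsafe chain.
inner = Query("limit_subquery", safe_to_pushdown=False)
middle = Query("group_by_non_partition_key", safe_to_pushdown=False, subqueries=[inner])
safe = Query("colocated_join", safe_to_pushdown=True)
outer = Query("top_level", safe_to_pushdown=True, subqueries=[safe, middle])

plans = recursively_plan(outer, plan_id=1, sub_plans=[])
```

Because children are appended before their parents, running `plans` in list order executes each subplan only after the subplans it reads from, matching the "run the subPlans first" step in the commit message.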

924 Commits (b1e66363982b883d4b3be1208746b5ff99105f10)