Support outer joins where the outer rel is a recurring one and the inner one is a non-recurring one (#6512)

DESCRIPTION: Adds support for outer joins having a recurring rel in the
outer side of the join (e.g., \<reference table\> LEFT JOIN
\<distributed table\>)

Closes #6219.
Closes #521

If the outer part of an outer join is a recurring rel (i.e., reference
table
or an intermediate_result injected into the query during the earlier
stages
of the recursive planning), Citus cannot run the join query if the other
side
of the join is not a recurring rel (i.e., distributed table).
See DeferredErrorIfUnsupportedRecurringTuplesJoin for the reasoning.

And to support such joins, now we start recursively planning distributed
side
of such joins so that non-recurring rel becomes an intermediate result
(and
hence a recurring rel) since Citus already knows how to compute an outer
join
between two recurring rels already. In the simplest scenario, this means
to
convert
  _"\<reference\> LEFT JOIN \<distributed\>"_ to
  _"\<reference\> LEFT JOIN  \<intermediate_result\>"_
by wrapping the distributed table into a subquery.

- [x] Add support for outer joins having a recurring rel in the outer
side and a "distributed table" (*) in the inner side of the join
- [x] Expand "distributed table" concept to "distributed rel" in first
item.
      This means that;
- [x] Currently RecursivelyPlanNonRecurringJoinNode doesn't know how to
wrap a sub join tree that constitutes a recurring rel, e.g., rhs clause
of the following join: `reference LEFT OUTER <distributed INNER JOIN
distributed>`; fix this.
- [x] Similar to previous item, currently
RecursivelyPlanNonRecurringJoinNode doesn't know how to handle
subqueries constituting a distributed rel, e.g., `SELECT * FROM ref LEFT
JOIN (SELECT * FROM dist_1) u1 ON (ref.a = u1.a);`; fix this.
- [x] Add lateral join checks for now-supported outer joins into
recursive planner
- [x] Fix regressions tests
- [x] Verified each test output file by first un-distributing Citus
tables involved in related queries and re-running the test file.
- [x] Some of the tests --that were not supposed to return any data
before but this PR adds support for-- were likely to get flaky, so added
some "ORDER BY"s to them.
- [x] Continue doing manual testing and start writing a test file for
the join clauses that this PR adds support for --not only rely on
existing tests

See https://github.com/citusdata/citus/issues/6546 for what we could do
further.
pull/6551/head
Onur Tirtir 2022-12-07 18:44:00 +03:00 committed by GitHub
commit 1e415bb1c3
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
59 changed files with 4830 additions and 547 deletions

View File

@ -1845,6 +1845,8 @@ multi_join_restriction_hook(PlannerInfo *root,
*/
joinRestrictionContext->hasSemiJoin = joinRestrictionContext->hasSemiJoin ||
extra->sjinfo->jointype == JOIN_SEMI;
joinRestrictionContext->hasOuterJoin = joinRestrictionContext->hasOuterJoin ||
IS_OUTER_JOIN(extra->sjinfo->jointype);
MemoryContextSwitchTo(oldMemoryContext);
}

View File

@ -173,9 +173,6 @@ typedef enum ConversionChoice
static bool HasConstantFilterOnUniqueColumn(RangeTblEntry *rangeTableEntry,
RelationRestriction *relationRestriction);
static List * RequiredAttrNumbersForRelation(RangeTblEntry *relationRte,
PlannerRestrictionContext *
plannerRestrictionContext);
static ConversionCandidates * CreateConversionCandidates(PlannerRestrictionContext *
plannerRestrictionContext,
List *rangeTableList,
@ -474,7 +471,7 @@ AppendUniqueIndexColumnsToList(Form_pg_index indexForm, List **uniqueIndexGroups
* The function could be optimized by not adding the columns that only appear
* WHERE clause as a filter (e.g., not a join clause).
*/
static List *
List *
RequiredAttrNumbersForRelation(RangeTblEntry *rangeTableEntry,
PlannerRestrictionContext *plannerRestrictionContext)
{

View File

@ -105,6 +105,7 @@ static List * CreateSubqueryTargetListAndAdjustVars(List *columnList);
static AttrNumber FindResnoForVarInTargetList(List *targetList, int varno, int varattno);
static bool RelationInfoContainsOnlyRecurringTuples(PlannerInfo *plannerInfo,
Relids relids);
static char * RecurringTypeDescription(RecurringTuplesType recurType);
static DeferredErrorMessage * DeferredErrorIfUnsupportedLateralSubquery(
PlannerInfo *plannerInfo, Relids recurringRelIds, Relids nonRecurringRelIds);
static Var * PartitionColumnForPushedDownSubquery(Query *query);
@ -603,7 +604,6 @@ DeferErrorIfUnsupportedSubqueryPushdown(Query *originalQuery,
return error;
}
/* we shouldn't allow reference tables in the outer part of outer joins */
error = DeferredErrorIfUnsupportedRecurringTuplesJoin(plannerRestrictionContext);
if (error)
{
@ -643,7 +643,8 @@ DeferErrorIfUnsupportedSubqueryPushdown(Query *originalQuery,
* sublinks into joins.
*
* In some cases, sublinks are pulled up and converted into outer joins. Those cases
* are already handled with DeferredErrorIfUnsupportedRecurringTuplesJoin().
* are already handled with RecursivelyPlanRecurringTupleOuterJoinWalker() or thrown
* an error for in DeferredErrorIfUnsupportedRecurringTuplesJoin().
*
* If the sublinks are not pulled up, we should still error out in if the expression
* in the FROM clause would recur for every shard in a subquery on the WHERE clause.
@ -751,20 +752,11 @@ FromClauseRecurringTupleType(Query *queryTree)
/*
* DeferredErrorIfUnsupportedRecurringTuplesJoin returns true if there exists a outer join
* between reference table and distributed tables which does not follow
* the rules :
* - Reference tables can not be located in the outer part of the semi join or the
* anti join. Otherwise, we may have duplicate results. Although getting duplicate
* results is not possible by checking the equality on the column of the reference
* table and partition column of distributed table, we still keep these checks.
* Because, using the reference table in the outer part of the semi join or anti
* join is not very common.
* - Reference tables can not be located in the outer part of the left join
* (Note that PostgreSQL converts right joins to left joins. While converting
* join types, innerrel and outerrel are also switched.) Otherwise we will
* definitely have duplicate rows. Beside, reference tables can not be used
* with full outer joins because of the same reason.
* DeferredErrorIfUnsupportedRecurringTuplesJoin returns a DeferredError if
* there exists a join between a recurring rel (such as reference tables
* and intermediate_results) and a non-recurring rel (such as distributed tables
* and subqueries that we can push-down to worker nodes) that can return an
* incorrect result set due to recurring tuples coming from the recurring rel.
*/
static DeferredErrorMessage *
DeferredErrorIfUnsupportedRecurringTuplesJoin(
@ -783,31 +775,48 @@ DeferredErrorIfUnsupportedRecurringTuplesJoin(
Relids innerrelRelids = joinRestriction->innerrelRelids;
Relids outerrelRelids = joinRestriction->outerrelRelids;
if (joinType == JOIN_SEMI || joinType == JOIN_ANTI || joinType == JOIN_LEFT)
/*
* This loop aims to determine whether this join is between a recurring
* rel and a non-recurring rel, and if so, whether it can yield an incorrect
* result set due to recurring tuples.
*
* For outer joins, this can only happen if it's a lateral outer join
* where the inner distributed subquery references the recurring outer
* rel. This because, such outer joins should not appear here because
* the recursive planner (RecursivelyPlanRecurringTupleOuterJoinWalker)
* should have already planned the non-recurring side if it wasn't a
* lateral join. For this reason, if the outer join is between a recurring
* rel --on the outer side-- and a non-recurring rel --on the other side--,
* we throw an error assuming that it's a lateral outer join.
* Also note that; in the context of outer joins, we only check left outer
* and full outer joins because PostgreSQL converts right joins to left
* joins before passing them through "set_join_pathlist_hook"s.
*
* For semi / anti joins, we anyway throw an error when the inner
* side is a distributed subquery that references a recurring outer rel
* (in the FROM clause) thanks to DeferErrorIfFromClauseRecurs. And when
* the inner side is a recurring rel and the outer side a non-recurring
* one, then the non-recurring side can't reference the recurring side
* anyway.
*
* For those reasons, here we perform below lateral join checks only for
* outer (except anti) / inner joins but not for anti / semi joins.
*/
if (joinType == JOIN_LEFT)
{
/*
* If there are only recurring tuples on the inner side of a join then
* we can push it down, regardless of whether the outer side is
* recurring or not. Otherwise, we check the outer side for recurring
* tuples.
*/
if (RelationInfoContainsOnlyRecurringTuples(plannerInfo, innerrelRelids))
{
/* inner side only contains recurring rels */
continue;
}
/*
* If the outer side of the join doesn't have any distributed tables
* (e.g., contains only recurring tuples), Citus should not pushdown
* the query. The reason is that recurring tuples on every shard would
* be added to the result, which is wrong.
*/
if (RelationInfoContainsOnlyRecurringTuples(plannerInfo, outerrelRelids))
{
/*
* Find the first (or only) recurring RTE to give a meaningful
* error to the user.
* Inner side contains distributed rels but the outer side only
* contains recurring rels, must be an unsupported lateral outer
* join.
*/
recurType = FetchFirstRecurType(plannerInfo, outerrelRelids);
@ -816,11 +825,6 @@ DeferredErrorIfUnsupportedRecurringTuplesJoin(
}
else if (joinType == JOIN_FULL)
{
/*
* If one of the outer or inner side contains recurring tuples and the other side
* contains nonrecurring tuples, then duplicate results can exist in the result.
* Thus, Citus should not pushdown the query.
*/
bool innerContainOnlyRecurring =
RelationInfoContainsOnlyRecurringTuples(plannerInfo, innerrelRelids);
bool outerContainOnlyRecurring =
@ -829,8 +833,9 @@ DeferredErrorIfUnsupportedRecurringTuplesJoin(
if (innerContainOnlyRecurring && !outerContainOnlyRecurring)
{
/*
* Find the first (or only) recurring RTE to give a meaningful
* error to the user.
* Right side contains distributed rels but the left side only
* contains recurring rels, must be an unsupported lateral outer
* join.
*/
recurType = FetchFirstRecurType(plannerInfo, innerrelRelids);
@ -840,8 +845,9 @@ DeferredErrorIfUnsupportedRecurringTuplesJoin(
if (!innerContainOnlyRecurring && outerContainOnlyRecurring)
{
/*
* Find the first (or only) recurring RTE to give a meaningful
* error to the user.
* Left side contains distributed rels but the right side only
* contains recurring rels, must be an unsupported lateral outer
* join.
*/
recurType = FetchFirstRecurType(plannerInfo, outerrelRelids);
@ -856,7 +862,7 @@ DeferredErrorIfUnsupportedRecurringTuplesJoin(
* See comment on DeferredErrorIfUnsupportedLateralSubquery for
* details.
*
* When planning inner joins postgres can move RTEs from left to
* When planning inner joins, postgres can move RTEs from left to
* right and from right to left. So we don't know on which side the
* lateral join wil appear. Thus we try to find a side of the join
* that only contains recurring tuples. And then we check the other
@ -893,41 +899,13 @@ DeferredErrorIfUnsupportedRecurringTuplesJoin(
}
}
if (recurType == RECURRING_TUPLES_REFERENCE_TABLE)
if (recurType != RECURRING_TUPLES_INVALID)
{
char *errmsg = psprintf("cannot perform a lateral outer join when "
"a distributed subquery references %s",
RecurringTypeDescription(recurType));
return DeferredError(ERRCODE_FEATURE_NOT_SUPPORTED,
"cannot pushdown the subquery",
"There exist a reference table in the outer "
"part of the outer join", NULL);
}
else if (recurType == RECURRING_TUPLES_FUNCTION)
{
return DeferredError(ERRCODE_FEATURE_NOT_SUPPORTED,
"cannot pushdown the subquery",
"There exist a table function in the outer "
"part of the outer join", NULL);
}
else if (recurType == RECURRING_TUPLES_EMPTY_JOIN_TREE)
{
return DeferredError(ERRCODE_FEATURE_NOT_SUPPORTED,
"cannot pushdown the subquery",
"There exist a subquery without FROM in the outer "
"part of the outer join", NULL);
}
else if (recurType == RECURRING_TUPLES_RESULT_FUNCTION)
{
return DeferredError(ERRCODE_FEATURE_NOT_SUPPORTED,
"cannot pushdown the subquery",
"Complex subqueries, CTEs and local tables cannot be in "
"the outer part of an outer join with a distributed table",
NULL);
}
else if (recurType == RECURRING_TUPLES_VALUES)
{
return DeferredError(ERRCODE_FEATURE_NOT_SUPPORTED,
"cannot pushdown the subquery",
"There exist a VALUES clause in the outer "
"part of the outer join", NULL);
errmsg, NULL, NULL);
}
return NULL;

View File

@ -137,6 +137,7 @@ static bool ShouldRecursivelyPlanNonColocatedSubqueries(Query *subquery,
RecursivePlanningContext *
context);
static bool ContainsSubquery(Query *query);
static bool ShouldRecursivelyPlanOuterJoins(RecursivePlanningContext *context);
static void RecursivelyPlanNonColocatedSubqueries(Query *subquery,
RecursivePlanningContext *context);
static void RecursivelyPlanNonColocatedJoinWalker(Node *joinNode,
@ -149,6 +150,11 @@ static void RecursivelyPlanNonColocatedSubqueriesInWhere(Query *query,
colocatedJoinChecker,
RecursivePlanningContext *
recursivePlanningContext);
static bool RecursivelyPlanRecurringTupleOuterJoinWalker(Node *node, Query *query,
RecursivePlanningContext *context);
static void RecursivelyPlanDistributedJoinNode(Node *node, Query *query,
RecursivePlanningContext *context);
static bool IsRTERefRecurring(RangeTblRef *rangeTableRef, Query *query);
static List * SublinkListFromWhere(Query *originalQuery);
static bool ExtractSublinkWalker(Node *node, List **sublinkList);
static bool ShouldRecursivelyPlanSublinks(Query *query);
@ -359,6 +365,19 @@ RecursivelyPlanSubqueriesAndCTEs(Query *query, RecursivePlanningContext *context
RecursivelyPlanLocalTableJoins(query, context);
}
/*
* Similarly, logical planner cannot handle outer joins when the outer rel
* is recurring, such as "<recurring> LEFT JOIN <distributed>". In that case,
* we convert distributed table into a subquery and recursively plan inner
* side of the outer join. That way, inner rel gets converted into an intermediate
* result and logical planner can handle the new query since it's of the from
* "<recurring> LEFT JOIN <recurring>".
*/
if (ShouldRecursivelyPlanOuterJoins(context))
{
RecursivelyPlanRecurringTupleOuterJoinWalker((Node *) query->jointree,
query, context);
}
return NULL;
}
@ -440,6 +459,25 @@ ContainsSubquery(Query *query)
}
/*
* ShouldRecursivelyPlanOuterJoins returns true if the JoinRestrictionContext
* that given RecursivePlanningContext holds implies that the query has outer
* join(s) that might need to be recursively planned.
*/
static bool
ShouldRecursivelyPlanOuterJoins(RecursivePlanningContext *context)
{
if (!context || !context->plannerRestrictionContext ||
!context->plannerRestrictionContext->joinRestrictionContext)
{
ereport(ERROR, (errmsg("unexpectedly got NULL pointer in recursive "
"planning context")));
}
return context->plannerRestrictionContext->joinRestrictionContext->hasOuterJoin;
}
/*
* RecursivelyPlanNonColocatedSubqueries gets a query which includes one or more
* other subqueries that are not joined on their distribution keys. The function
@ -599,6 +637,333 @@ RecursivelyPlanNonColocatedSubqueriesInWhere(Query *query,
}
/*
* RecursivelyPlanRecurringTupleOuterJoinWalker descends into a join tree and
* recursively plans all non-recurring (i.e., distributed) rels that that
* participate in an outer join expression together with a recurring rel,
* such as <distributed> in "<recurring> LEFT JOIN <distributed>", i.e.,
* where the recurring rel causes returning recurring tuples from the worker
* nodes.
*
* Returns true if given node is recurring.
*
* See RecursivelyPlanDistributedJoinNode() function for the explanation on
* what does it mean for a node to be "recurring" or "distributed".
*/
static bool
RecursivelyPlanRecurringTupleOuterJoinWalker(Node *node, Query *query,
RecursivePlanningContext *
recursivePlanningContext)
{
if (node == NULL)
{
return false;
}
else if (IsA(node, FromExpr))
{
FromExpr *fromExpr = (FromExpr *) node;
ListCell *fromExprCell;
/* search for join trees in each FROM element */
foreach(fromExprCell, fromExpr->fromlist)
{
Node *fromElement = (Node *) lfirst(fromExprCell);
RecursivelyPlanRecurringTupleOuterJoinWalker(fromElement, query,
recursivePlanningContext);
}
/*
* Can only appear during the top-level call and top-level callers
* are not interested in the return value. Even more, we can't tell
* whether a FromExpr is recurring or not.
*/
return false;
}
else if (IsA(node, JoinExpr))
{
JoinExpr *joinExpr = (JoinExpr *) node;
Node *leftNode = joinExpr->larg;
Node *rightNode = joinExpr->rarg;
/*
* There may be recursively plannable outer joins deeper in the join tree.
*
* We first handle the sub join trees and then the top level one since the
* top level join expression might not require recursive planning after
* handling the sub join trees.
*/
bool leftNodeRecurs =
RecursivelyPlanRecurringTupleOuterJoinWalker(leftNode, query,
recursivePlanningContext);
bool rightNodeRecurs =
RecursivelyPlanRecurringTupleOuterJoinWalker(rightNode, query,
recursivePlanningContext);
switch (joinExpr->jointype)
{
case JOIN_LEFT:
{
/* <recurring> left join <distributed> */
if (leftNodeRecurs && !rightNodeRecurs)
{
ereport(DEBUG1, (errmsg("recursively planning right side of "
"the left join since the outer side "
"is a recurring rel")));
RecursivelyPlanDistributedJoinNode(rightNode, query,
recursivePlanningContext);
}
/*
* A LEFT JOIN is recurring if the lhs is recurring.
* Note that we might have converted the rhs into a recurring
* one too if the lhs is recurring, but this anyway has no
* effects when deciding whether a LEFT JOIN is recurring.
*/
return leftNodeRecurs;
}
case JOIN_RIGHT:
{
/* <distributed> right join <recurring> */
if (!leftNodeRecurs && rightNodeRecurs)
{
ereport(DEBUG1, (errmsg("recursively planning left side of "
"the right join since the outer side "
"is a recurring rel")));
RecursivelyPlanDistributedJoinNode(leftNode, query,
recursivePlanningContext);
}
/*
* Similar to LEFT JOINs, a RIGHT JOIN is recurring if the rhs
* is recurring.
*/
return rightNodeRecurs;
}
case JOIN_FULL:
{
/*
* <recurring> full join <distributed>
* <distributed> full join <recurring>
*/
if (leftNodeRecurs && !rightNodeRecurs)
{
ereport(DEBUG1, (errmsg("recursively planning right side of "
"the full join since the other side "
"is a recurring rel")));
RecursivelyPlanDistributedJoinNode(rightNode, query,
recursivePlanningContext);
}
else if (!leftNodeRecurs && rightNodeRecurs)
{
ereport(DEBUG1, (errmsg("recursively planning left side of "
"the full join since the other side "
"is a recurring rel")));
RecursivelyPlanDistributedJoinNode(leftNode, query,
recursivePlanningContext);
}
/*
* An OUTER JOIN is recurring if any sides of the join is
* recurring. As in other outer join types, it doesn't matter
* whether the other side was / became recurring or not.
*/
return leftNodeRecurs || rightNodeRecurs;
}
case JOIN_INNER:
{
/*
* We don't need to recursively plan non-outer joins and we
* already descended into sub join trees to handle outer joins
* buried in them.
*/
return leftNodeRecurs && rightNodeRecurs;
}
default:
{
ereport(ERROR, (errmsg("got unexpected join type (%d) when recursively "
"planning a join",
joinExpr->jointype)));
}
}
}
else if (IsA(node, RangeTblRef))
{
return IsRTERefRecurring((RangeTblRef *) node, query);
}
else
{
ereport(ERROR, errmsg("got unexpected node type (%d) when recursively "
"planning a join",
nodeTag(node)));
}
}
/*
* RecursivelyPlanDistributedJoinNode is a helper function for
* RecursivelyPlanRecurringTupleOuterJoinWalker that recursively plans given
* distributed node that is known to be inner side of an outer join.
*
* Fails to do so if the distributed join node references the recurring one.
* In that case, we don't throw an error here but instead we let
* DeferredErrorIfUnsupportedRecurringTuplesJoin to so for a better error
* message.
*
* We call a node "distributed" if it points to a distributed table or a
* more complex object (i.e., a join tree or a subquery) that can be pushed
* down to the worker nodes directly. For a join, this means that it's either
* an INNER join where any side of it is a distributed table / a distributed
* sub join tree, or an OUTER join where the outer side is a distributed table
* / a distributed sub join tree.
*/
static void
RecursivelyPlanDistributedJoinNode(Node *node, Query *query,
RecursivePlanningContext *recursivePlanningContext)
{
if (IsA(node, JoinExpr))
{
/*
* This, for example, means that RecursivelyPlanRecurringTupleOuterJoinWalker
* needs to plan inner side, i.e., "<distributed> INNER JOIN <distributed>",
* of the following join:
* <recurring> LEFT JOIN (<distributed> JOIN <distributed>)
*
* XXX: Ideally, we should handle such a sub join tree by moving
* it into a subquery "as a whole" but this implies that we need to
* rebuild the rtable and re-point all the Vars to the new rtable
* indexes, so we've not implemented that yet.
*
* Instead, we recursively plan all the distributed tables in that
* sub join tree. This is much more inefficient than the other
* approach (since we lose the opportunity to push-down the whole
* sub join tree into the workers) but is easier to implement.
*/
RecursivelyPlanDistributedJoinNode(((JoinExpr *) node)->larg,
query, recursivePlanningContext);
RecursivelyPlanDistributedJoinNode(((JoinExpr *) node)->rarg,
query, recursivePlanningContext);
return;
}
if (!IsA(node, RangeTblRef))
{
ereport(ERROR, (errmsg("unexpected join node type (%d)",
nodeTag(node))));
}
RangeTblRef *rangeTableRef = (RangeTblRef *) node;
if (IsRTERefRecurring(rangeTableRef, query))
{
/*
* Not the top-level callers but RecursivelyPlanDistributedJoinNode
* might call itself for recurring nodes and need to skip them.
*/
return;
}
RangeTblEntry *distributedRte = rt_fetch(rangeTableRef->rtindex,
query->rtable);
if (distributedRte->rtekind == RTE_RELATION)
{
ereport(DEBUG1, (errmsg("recursively planning distributed relation %s "
"since it is part of a distributed join node "
"that is outer joined with a recurring rel",
GetRelationNameAndAliasName(distributedRte))));
PlannerRestrictionContext *restrictionContext =
GetPlannerRestrictionContext(recursivePlanningContext);
List *requiredAttributes =
RequiredAttrNumbersForRelation(distributedRte, restrictionContext);
ReplaceRTERelationWithRteSubquery(distributedRte, requiredAttributes,
recursivePlanningContext);
}
else if (distributedRte->rtekind == RTE_SUBQUERY)
{
/*
* We don't try logging the subquery here because RecursivelyPlanSubquery
* will anyway do so if the query doesn't reference the outer query.
*/
ereport(DEBUG1, (errmsg("recursively planning the distributed subquery "
"since it is part of a distributed join node "
"that is outer joined with a recurring rel")));
bool recursivelyPlanned = RecursivelyPlanSubquery(distributedRte->subquery,
recursivePlanningContext);
if (!recursivelyPlanned)
{
/*
* RecursivelyPlanSubquery fails to plan a subquery only if it
* contains references to the outer query. This means that, we can't
* plan such outer joins (like <recurring LEFT OUTER distributed>)
* if it's a LATERAL join where the distributed side is a subquery that
* references the outer side, as in,
*
* SELECT * FROM reference
* LEFT JOIN LATERAL
* (SELECT * FROM distributed WHERE reference.b > distributed.b) q
* USING (a);
*/
Assert(ContainsReferencesToOuterQuery(distributedRte->subquery));
}
}
else
{
/*
* We don't expect RecursivelyPlanRecurringTupleOuterJoinWalker to try recursively
* plan such an RTE.
*/
ereport(ERROR, errmsg("got unexpected RTE type (%d) when recursively "
"planning a join",
distributedRte->rtekind));
}
}
/*
* IsRTERefRecurring returns true if given rte reference points to a recurring
* rte.
*
* If an rte points to a table, then we call it recurring if the table is not
* a distributed table. Otherwise, e.g., if it points a query, then we call it
* recurring if none of the rtes that belongs to the query point to a distributed
* table.
*
* Note that it's safe to assume a subquery is not recurring if we have a rte reference
* to a distributed table somewhere in the query tree. For example, considering
* the subquery (q) of the the following query:
* SELECT * FROM ref LEFT JOIN (SELECT * FROM ref LEFT dist) q,
* one might think that it's not appropriate to call IsRTERefRecurring for subquery
* (q). However, this is already not the case because this function is called
* in the context of recursive planning and hence any query that contains
* rtes pointing to distributed tables and that cannot be pushed down to worker
* nodes should've been recursively planned already. This is because, the recursive
* planner processes the queries in bottom-up fashion. For this reason, the subquery
* in the example should've already be converted to the following before we check
* the rte reference that points to the subquery (q):
* SELECT * FROM ref LEFT JOIN (SELECT * FROM ref LEFT (SELECT * FROM read_intermediate_result()) dist_1)
* That way, we wouldn't incorrectly say that (SELECT * FROM ref LEFT dist) is a
* distributed subquery (due to having a reference to a distributed table).
*/
static bool
IsRTERefRecurring(RangeTblRef *rangeTableRef, Query *query)
{
int rangeTableIndex = rangeTableRef->rtindex;
List *rangeTableList = query->rtable;
RangeTblEntry *rangeTableEntry = rt_fetch(rangeTableIndex, rangeTableList);
return !FindNodeMatchingCheckFunctionInRangeTableList(list_make1(rangeTableEntry),
IsDistributedTableRTE);
}
/*
* SublinkListFromWhere finds the subquery nodes in the where clause of the given query. Note
* that the function should be called on the original query given that postgres

View File

@ -2230,6 +2230,7 @@ FilterJoinRestrictionContext(JoinRestrictionContext *joinRestrictionContext, Rel
* the same query and as these values are calculated per-query basis.
*/
filtererdJoinRestrictionContext->hasSemiJoin = joinRestrictionContext->hasSemiJoin;
filtererdJoinRestrictionContext->hasOuterJoin = joinRestrictionContext->hasOuterJoin;
return filtererdJoinRestrictionContext;
}

View File

@ -72,6 +72,7 @@ typedef struct JoinRestrictionContext
{
List *joinRestrictionList;
bool hasSemiJoin;
bool hasOuterJoin;
} JoinRestrictionContext;
typedef struct JoinRestriction

View File

@ -30,5 +30,8 @@ extern int LocalTableJoinPolicy;
extern bool ShouldConvertLocalTableJoinsToSubqueries(List *rangeTableList);
extern void RecursivelyPlanLocalTableJoins(Query *query,
RecursivePlanningContext *context);
extern List * RequiredAttrNumbersForRelation(RangeTblEntry *relationRte,
PlannerRestrictionContext *
plannerRestrictionContext);
#endif /* LOCAL_DISTRIBUTED_JOIN_PLANNER_H */

View File

@ -0,0 +1,119 @@
SET search_path TO local_dist_join_mixed;
SELECT COUNT(*) FROM reference LEFT JOIN distributed USING (id);
count
---------------------------------------------------------------------
101
(1 row)
SELECT COUNT(*) FROM distributed RIGHT JOIN reference USING (id);
count
---------------------------------------------------------------------
101
(1 row)
SELECT COUNT(*) FROM reference FULL JOIN distributed USING (id);
count
---------------------------------------------------------------------
101
(1 row)
SELECT COUNT(*) FROM distributed FULL JOIN reference USING (id);
count
---------------------------------------------------------------------
101
(1 row)
SELECT COUNT(*) FROM distributed FULL JOIN reference USING (id);
count
---------------------------------------------------------------------
101
(1 row)
-- distributed side is a subquery
SELECT COUNT(*) FROM reference LEFT JOIN (SELECT * FROM distributed) q USING (id);
count
---------------------------------------------------------------------
101
(1 row)
-- distributed side is a join tree
SELECT COUNT(*) FROM reference LEFT JOIN (distributed t1 JOIN distributed t2 USING (id)) q USING (id);
count
---------------------------------------------------------------------
101
(1 row)
SELECT COUNT(*) FROM reference LEFT JOIN (distributed t1 LEFT JOIN distributed t2 USING (id)) q USING (id);
count
---------------------------------------------------------------------
101
(1 row)
SELECT COUNT(*) FROM
-- 2) right side is distributed but t1 is recurring, hence what
-- makes the right side distributed (t4) is recursively planned
reference t1
LEFT JOIN
(
distributed t4
JOIN
-- 1) t6 is recursively planned since the outer side is recurring
(SELECT t6.id FROM distributed t6 RIGHT JOIN reference t7 USING(id)) t5
USING(id)
) q
USING(id)
-- 3) outer side of the join tree became recurring, hence t8 is
-- recursively planned too
LEFT JOIN
distributed t8
USING (id)
WHERE t8.id IS NULL;
count
---------------------------------------------------------------------
0
(1 row)
SELECT COUNT(*) FROM
local t1
LEFT JOIN
-- 2) t6 subquery is distributed so needs to be recursively planned
-- because t1 is first recursively planned
(
SELECT * FROM
(SELECT * FROM reference t2 JOIN distributed t3 USING (id)) p
JOIN
-- 1) t5 is recursively planned since the outer side is recurring
(SELECT * FROM reference t4 LEFT JOIN distributed t5 USING (id)) q
USING(id)
) t6
USING (id);
count
---------------------------------------------------------------------
101
(1 row)
BEGIN;
WITH cte AS (
DELETE FROM distributed
USING (
SELECT t1.id, t1.id*3 FROM reference t1
LEFT JOIN
(
SELECT * FROM distributed t2 WHERE EXISTS (
SELECT * FROM distributed t4
WHERE t4.id = t2.id
)
) t3
USING (id)
) q
WHERE distributed.id = q.id AND
distributed.id > 65
RETURNING *
)
SELECT COUNT(*) FROM cte;
count
---------------------------------------------------------------------
35
(1 row)
ROLLBACK;

View File

@ -497,9 +497,14 @@ NOTICE: executing the command locally: SELECT count(*) AS count FROM (((citus_l
1296
(1 row)
-- not supported direct outer join
SELECT count(*) FROM citus_local_table LEFT JOIN distributed_table ON (true);
ERROR: cannot pushdown the subquery
NOTICE: executing the command locally: SELECT NULL::integer AS "dummy-1" FROM citus_local_table_queries.citus_local_table_1509001 citus_local_table WHERE true
NOTICE: executing the command locally: SELECT count(*) AS count FROM ((SELECT NULL::integer AS a, NULL::integer AS b FROM (SELECT intermediate_result."dummy-1" FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result("dummy-1" integer)) citus_local_table_1) citus_local_table LEFT JOIN (SELECT NULL::integer AS a, NULL::integer AS b FROM (SELECT intermediate_result."dummy-1" FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result("dummy-1" integer)) distributed_table_1) distributed_table ON (true))
count
---------------------------------------------------------------------
36
(1 row)
-- distinct in subquery on CTE
WITH one_row AS (
SELECT a from citus_local_table WHERE b = 1

View File

@ -497,9 +497,14 @@ NOTICE: executing the command locally: SELECT count(*) AS count FROM (((citus_l
1296
(1 row)
-- not supported direct outer join
SELECT count(*) FROM citus_local_table LEFT JOIN distributed_table ON (true);
ERROR: cannot pushdown the subquery
NOTICE: executing the command locally: SELECT NULL::integer AS "dummy-1" FROM citus_local_table_queries.citus_local_table_1509001 citus_local_table WHERE true
NOTICE: executing the command locally: SELECT count(*) AS count FROM ((SELECT NULL::integer AS a, NULL::integer AS b FROM (SELECT intermediate_result."dummy-1" FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result("dummy-1" integer)) citus_local_table_1) citus_local_table LEFT JOIN (SELECT NULL::integer AS a, NULL::integer AS b FROM (SELECT intermediate_result."dummy-1" FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result("dummy-1" integer)) distributed_table_1) distributed_table ON (true))
count
---------------------------------------------------------------------
36
(1 row)
-- distinct in subquery on CTE
WITH one_row AS (
SELECT a from citus_local_table WHERE b = 1

View File

@ -430,9 +430,13 @@ SELECT count(*) FROM reference_table
1296
(1 row)
-- not supported direct outer join
-- supported outer join
SELECT count(*) FROM citus_local_table LEFT JOIN distributed_table ON (true);
ERROR: cannot pushdown the subquery
count
---------------------------------------------------------------------
36
(1 row)
-- distinct in subquery on CTE
WITH one_row AS (
SELECT a from citus_local_table WHERE b = 1

View File

@ -90,28 +90,49 @@ SELECT count(*) FROM users_ref_test_table ref1 CROSS JOIN users_ref_test_table r
-- two reference tables CROSS JOINNed, and later JOINED with distributed tables
-- but the reference table CROSS JOIN is in the outer side of the JOIN with the distributed table
-- so we cannot pushdown
-- so this is supported by recursively planning the distributed table (users_table)
SELECT count(*) FROM users_ref_test_table ref1 CROSS JOIN users_ref_test_table ref2 LEFT JOIN users_table ON (ref1.id = users_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
606
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 CROSS JOIN users_ref_test_table ref2 FULL JOIN users_table ON (ref1.id = users_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
606
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 CROSS JOIN users_ref_test_table ref2 LEFT JOIN users_table ON (ref1.id != users_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
3030
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 CROSS JOIN users_ref_test_table ref2 LEFT JOIN users_table ON (ref1.id > 0);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
3636
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 CROSS JOIN users_ref_test_table ref2 LEFT JOIN users_table ON (users_table.user_id > 0);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
3636
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 CROSS JOIN users_ref_test_table ref2 LEFT JOIN users_table ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
3636
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 CROSS JOIN users_ref_test_table ref2 LEFT JOIN users_table ON false;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
36
(1 row)
-- a reference tables CROSS JOINed with a distribted table, and later JOINED with distributed tables on distribution keys
-- so safe to pushdown
SELECT count(*) FROM users_table u1 CROSS JOIN users_ref_test_table ref2 JOIN users_table u2 ON (u1.user_id = u2.user_id);
@ -167,8 +188,11 @@ SELECT count(*) FROM users_ref_test_table ref1 CROSS JOIN users_ref_test_table r
-- this is the same query as the above, but this time the outer query is also LEFT JOIN, meaning that Postgres
-- cannot eliminate the outer join
SELECT count(*) FROM users_ref_test_table ref1 CROSS JOIN users_ref_test_table ref2 LEFT JOIN users_table ON (ref1.id = users_table.user_id) LEFT JOIN users_table u2 ON (u2.user_id = users_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
11802
(1 row)
-- cross join that goes through non-colocated subquery logic
-- for the "events_table" subquery as both distributed tables
-- do not have JOIN on the distribution key

View File

@ -1357,22 +1357,17 @@ DEBUG: Router planner cannot handle multi-shard select queries
{ "8" : "test18", "8" : "test28", "8" : "test38", "8" : "test48", "8" : "test58", "8" : "test68", "8" : "test78", "8" : "test8", "8" : "test88", "8" : "test98" }
(1 row)
-- this test can only work if the CTE is recursively
-- planned
WITH b AS (SELECT * FROM test_table)
SELECT count(*) FROM (SELECT key as x FROM test_table OFFSET 0) as ref LEFT JOIN b ON (ref.x = b.key);
DEBUG: CTE b is going to be inlined via distributed planning
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_1 for subquery SELECT key AS x FROM cte_inline.test_table OFFSET 0
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM ((SELECT intermediate_result.x FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(x integer)) ref LEFT JOIN (SELECT test_table.key, test_table.value, test_table.other_value FROM cte_inline.test_table) b ON ((ref.x OPERATOR(pg_catalog.=) b.key)))
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning the distributed subquery since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_1 for CTE b: SELECT key, value, other_value FROM cte_inline.test_table
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_2 for subquery SELECT key AS x FROM cte_inline.test_table OFFSET 0
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM ((SELECT intermediate_result.x FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(x integer)) ref LEFT JOIN (SELECT intermediate_result.key, intermediate_result.value, intermediate_result.other_value FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(key integer, value text, other_value jsonb)) b ON ((ref.x OPERATOR(pg_catalog.=) b.key)))
DEBUG: generating subplan XXX_2 for subquery SELECT key, value, other_value FROM cte_inline.test_table
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM ((SELECT intermediate_result.x FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(x integer)) ref LEFT JOIN (SELECT intermediate_result.key, intermediate_result.value, intermediate_result.other_value FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(key integer, value text, other_value jsonb)) b ON ((ref.x OPERATOR(pg_catalog.=) b.key)))
DEBUG: Creating router plan
count
---------------------------------------------------------------------
@ -1397,8 +1392,6 @@ DEBUG: Router planner cannot handle multi-shard select queries
480
(1 row)
-- cte a has to be recursively planned because of OFFSET 0
-- after that, cte b also requires recursive planning
WITH a AS (SELECT * FROM test_table OFFSET 0),
b AS (SELECT * FROM test_table)
SELECT min(a.key) FROM a LEFT JOIN b ON (a.value = b.value);
@ -1407,13 +1400,10 @@ DEBUG: CTE b is going to be inlined via distributed planning
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_1 for subquery SELECT key, value, other_value FROM cte_inline.test_table OFFSET 0
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT min(a.key) AS min FROM ((SELECT intermediate_result.key, intermediate_result.value, intermediate_result.other_value FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(key integer, value text, other_value jsonb)) a LEFT JOIN (SELECT test_table.key, test_table.value, test_table.other_value FROM cte_inline.test_table) b ON ((a.value OPERATOR(pg_catalog.=) b.value)))
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_1 for CTE a: SELECT key, value, other_value FROM cte_inline.test_table OFFSET 0
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_2 for CTE b: SELECT key, value, other_value FROM cte_inline.test_table
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning the distributed subquery since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_2 for subquery SELECT key, value, other_value FROM cte_inline.test_table
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT min(a.key) AS min FROM ((SELECT intermediate_result.key, intermediate_result.value, intermediate_result.other_value FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(key integer, value text, other_value jsonb)) a LEFT JOIN (SELECT intermediate_result.key, intermediate_result.value, intermediate_result.other_value FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(key integer, value text, other_value jsonb)) b ON ((a.value OPERATOR(pg_catalog.=) b.value)))
DEBUG: Creating router plan
min
@ -1450,14 +1440,11 @@ DEBUG: CTE cte_2 is going to be inlined via distributed planning
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_1 for subquery SELECT value FROM cte_inline.test_table WHERE (key OPERATOR(pg_catalog.>) 3)
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT value FROM ((SELECT test_table.value FROM cte_inline.test_table WHERE (test_table.key OPERATOR(pg_catalog.>) 1)) cte_1 FULL JOIN (SELECT intermediate_result.value FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(value text)) cte_2 USING (value)) ORDER BY value DESC LIMIT 3
DEBUG: recursively planning left side of the full join since the other side is a recurring rel
DEBUG: recursively planning the distributed subquery since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_1 for CTE cte_1: SELECT value FROM cte_inline.test_table WHERE (key OPERATOR(pg_catalog.>) 1)
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_2 for CTE cte_2: SELECT value FROM cte_inline.test_table WHERE (key OPERATOR(pg_catalog.>) 3)
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT value FROM ((SELECT intermediate_result.value FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(value text)) cte_1 FULL JOIN (SELECT intermediate_result.value FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(value text)) cte_2 USING (value)) ORDER BY value DESC LIMIT 3
DEBUG: generating subplan XXX_2 for subquery SELECT value FROM cte_inline.test_table WHERE (key OPERATOR(pg_catalog.>) 1)
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT value FROM ((SELECT intermediate_result.value FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(value text)) cte_1 FULL JOIN (SELECT intermediate_result.value FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(value text)) cte_2 USING (value)) ORDER BY value DESC LIMIT 3
DEBUG: Creating router plan
value
---------------------------------------------------------------------

View File

@ -1357,22 +1357,17 @@ DEBUG: Router planner cannot handle multi-shard select queries
{ "8" : "test18", "8" : "test28", "8" : "test38", "8" : "test48", "8" : "test58", "8" : "test68", "8" : "test78", "8" : "test8", "8" : "test88", "8" : "test98" }
(1 row)
-- this test can only work if the CTE is recursively
-- planned
WITH b AS (SELECT * FROM test_table)
SELECT count(*) FROM (SELECT key as x FROM test_table OFFSET 0) as ref LEFT JOIN b ON (ref.x = b.key);
DEBUG: CTE b is going to be inlined via distributed planning
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_1 for subquery SELECT key AS x FROM cte_inline.test_table OFFSET 0
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM ((SELECT intermediate_result.x FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(x integer)) ref LEFT JOIN (SELECT test_table.key, test_table.value, test_table.other_value FROM cte_inline.test_table) b ON ((ref.x OPERATOR(pg_catalog.=) b.key)))
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning the distributed subquery since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_1 for CTE b: SELECT key, value, other_value FROM cte_inline.test_table
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_2 for subquery SELECT key AS x FROM cte_inline.test_table OFFSET 0
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM ((SELECT intermediate_result.x FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(x integer)) ref LEFT JOIN (SELECT intermediate_result.key, intermediate_result.value, intermediate_result.other_value FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(key integer, value text, other_value jsonb)) b ON ((ref.x OPERATOR(pg_catalog.=) b.key)))
DEBUG: generating subplan XXX_2 for subquery SELECT key, value, other_value FROM cte_inline.test_table
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM ((SELECT intermediate_result.x FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(x integer)) ref LEFT JOIN (SELECT intermediate_result.key, intermediate_result.value, intermediate_result.other_value FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(key integer, value text, other_value jsonb)) b ON ((ref.x OPERATOR(pg_catalog.=) b.key)))
DEBUG: Creating router plan
count
---------------------------------------------------------------------
@ -1397,8 +1392,6 @@ DEBUG: Router planner cannot handle multi-shard select queries
480
(1 row)
-- cte a has to be recursively planned because of OFFSET 0
-- after that, cte b also requires recursive planning
WITH a AS (SELECT * FROM test_table OFFSET 0),
b AS (SELECT * FROM test_table)
SELECT min(a.key) FROM a LEFT JOIN b ON (a.value = b.value);
@ -1407,13 +1400,10 @@ DEBUG: CTE b is going to be inlined via distributed planning
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_1 for subquery SELECT key, value, other_value FROM cte_inline.test_table OFFSET 0
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT min(a.key) AS min FROM ((SELECT intermediate_result.key, intermediate_result.value, intermediate_result.other_value FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(key integer, value text, other_value jsonb)) a LEFT JOIN (SELECT test_table.key, test_table.value, test_table.other_value FROM cte_inline.test_table) b ON ((a.value OPERATOR(pg_catalog.=) b.value)))
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_1 for CTE a: SELECT key, value, other_value FROM cte_inline.test_table OFFSET 0
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_2 for CTE b: SELECT key, value, other_value FROM cte_inline.test_table
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning the distributed subquery since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_2 for subquery SELECT key, value, other_value FROM cte_inline.test_table
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT min(a.key) AS min FROM ((SELECT intermediate_result.key, intermediate_result.value, intermediate_result.other_value FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(key integer, value text, other_value jsonb)) a LEFT JOIN (SELECT intermediate_result.key, intermediate_result.value, intermediate_result.other_value FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(key integer, value text, other_value jsonb)) b ON ((a.value OPERATOR(pg_catalog.=) b.value)))
DEBUG: Creating router plan
min
@ -1450,14 +1440,11 @@ DEBUG: CTE cte_2 is going to be inlined via distributed planning
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_1 for subquery SELECT value FROM cte_inline.test_table WHERE (key OPERATOR(pg_catalog.>) 3)
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT value FROM ((SELECT test_table.value FROM cte_inline.test_table WHERE (test_table.key OPERATOR(pg_catalog.>) 1)) cte_1 FULL JOIN (SELECT intermediate_result.value FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(value text)) cte_2 USING (value)) ORDER BY value DESC LIMIT 3
DEBUG: recursively planning left side of the full join since the other side is a recurring rel
DEBUG: recursively planning the distributed subquery since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_1 for CTE cte_1: SELECT value FROM cte_inline.test_table WHERE (key OPERATOR(pg_catalog.>) 1)
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_2 for CTE cte_2: SELECT value FROM cte_inline.test_table WHERE (key OPERATOR(pg_catalog.>) 3)
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT value FROM ((SELECT intermediate_result.value FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(value text)) cte_1 FULL JOIN (SELECT intermediate_result.value FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(value text)) cte_2 USING (value)) ORDER BY value DESC LIMIT 3
DEBUG: generating subplan XXX_2 for subquery SELECT value FROM cte_inline.test_table WHERE (key OPERATOR(pg_catalog.>) 1)
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT value FROM ((SELECT intermediate_result.value FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(value text)) cte_1 FULL JOIN (SELECT intermediate_result.value FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(value text)) cte_2 USING (value)) ORDER BY value DESC LIMIT 3
DEBUG: Creating router plan
value
---------------------------------------------------------------------

View File

@ -1212,9 +1212,16 @@ DEBUG: push down of limit count: 1
SELECT id, name FROM local LEFT JOIN distributed USING (id) ORDER BY 1 LIMIT 1;
DEBUG: Wrapping relation "local" to a subquery
DEBUG: generating subplan XXX_1 for subquery SELECT id FROM local_dist_join_mixed.local WHERE true
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT local.id, distributed.name FROM ((SELECT NULL::integer AS "dummy-1", local_1.id, NULL::integer AS "dummy-3", NULL::text AS title, NULL::integer AS "dummy-5" FROM (SELECT intermediate_result.id FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(id bigint)) local_1) local LEFT JOIN local_dist_join_mixed.distributed USING (id)) ORDER BY local.id LIMIT 1
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning distributed relation "distributed" since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Wrapping relation "distributed" to a subquery
DEBUG: generating subplan XXX_2 for subquery SELECT id, name FROM local_dist_join_mixed.distributed WHERE true
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT local.id, distributed.name FROM ((SELECT NULL::integer AS "dummy-1", local_1.id, NULL::integer AS "dummy-3", NULL::text AS title, NULL::integer AS "dummy-5" FROM (SELECT intermediate_result.id FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(id bigint)) local_1) local LEFT JOIN (SELECT NULL::integer AS "dummy-1", distributed_1.id, distributed_1.name, NULL::timestamp with time zone AS created_at FROM (SELECT intermediate_result.id, intermediate_result.name FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(id bigint, name text)) distributed_1) distributed USING (id)) ORDER BY local.id LIMIT 1
id | name
---------------------------------------------------------------------
0 | 0
(1 row)
SELECT
foo1.id
FROM

View File

@ -1577,9 +1577,16 @@ SELECT 1 AS res FROM table2 RIGHT JOIN (SELECT 1 FROM table1, table2) AS sub1 ON
DEBUG: Wrapping relation "table1" to a subquery
DEBUG: generating subplan XXX_1 for subquery SELECT NULL::integer AS "dummy-1" FROM local_table_join.table1 WHERE true
DEBUG: generating subplan XXX_2 for subquery SELECT 1 FROM (SELECT NULL::integer AS a FROM (SELECT intermediate_result."dummy-1" FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result("dummy-1" integer)) table1_1) table1, local_table_join.table2
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT 1 AS res FROM (local_table_join.table2 RIGHT JOIN (SELECT intermediate_result."?column?" FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result("?column?" integer)) sub1("?column?") ON (false))
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
DEBUG: recursively planning left side of the right join since the outer side is a recurring rel
DEBUG: recursively planning distributed relation "table2" since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Wrapping relation "table2" to a subquery
DEBUG: generating subplan XXX_3 for subquery SELECT NULL::integer AS "dummy-1" FROM local_table_join.table2 WHERE true
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT 1 AS res FROM ((SELECT NULL::integer AS a FROM (SELECT intermediate_result."dummy-1" FROM read_intermediate_result('XXX_3'::text, 'binary'::citus_copy_format) intermediate_result("dummy-1" integer)) table2_1) table2 RIGHT JOIN (SELECT intermediate_result."?column?" FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result("?column?" integer)) sub1("?column?") ON (false))
res
---------------------------------------------------------------------
1
(1 row)
ROLLBACK;
BEGIN;
SELECT create_reference_table('table1');

View File

@ -349,10 +349,17 @@ DEBUG: Wrapping relation "citus_local_table" to a subquery
DEBUG: generating subplan XXX_1 for subquery SELECT NULL::integer AS "dummy-1" FROM mixed_relkind_tests.citus_local_table WHERE true
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT foo.a FROM (mixed_relkind_tests.partitioned_distributed_table foo JOIN (SELECT NULL::integer AS a FROM (SELECT intermediate_result."dummy-1" FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result("dummy-1" integer)) citus_local_table_1) citus_local_table ON (true))
DEBUG: performing repartitioned INSERT ... SELECT
-- should fail
SELECT COUNT(*) FROM reference_table LEFT JOIN partitioned_distributed_table ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning distributed relation "partitioned_distributed_table" since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Wrapping relation "partitioned_distributed_table" to a subquery
DEBUG: generating subplan XXX_1 for subquery SELECT NULL::integer AS "dummy-1" FROM mixed_relkind_tests.partitioned_distributed_table WHERE true
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM (mixed_relkind_tests.reference_table LEFT JOIN (SELECT NULL::integer AS a, NULL::integer AS b FROM (SELECT intermediate_result."dummy-1" FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result("dummy-1" integer)) partitioned_distributed_table_1) partitioned_distributed_table ON (true))
count
---------------------------------------------------------------------
468
(1 row)
-- non-colocated subquery should work
SELECT COUNT(*) FROM
(SELECT *, random() FROM partitioned_distributed_table) AS foo,

View File

@ -51,12 +51,24 @@ SELECT count(*) FROM customer, orders WHERE c_custkey = o_custkey;
1956
(1 row)
-- Test joinExpr aliases by performing an outer-join. This code path is
-- currently not exercised, but we are adding this test to catch this bug when
-- we start supporting outer joins.
-- Test joinExpr aliases by performing an outer-join.
SELECT c_custkey
FROM (customer LEFT OUTER JOIN orders ON (c_custkey = o_custkey)) AS
test(c_custkey, c_nationkey)
INNER JOIN lineitem ON (test.c_custkey = l_orderkey)
ORDER BY 1
LIMIT 10;
ERROR: complex joins are only supported when all distributed tables are co-located and joined on their distribution columns
c_custkey
---------------------------------------------------------------------
1
1
1
1
1
1
1
1
1
1
(10 rows)

View File

@ -1839,8 +1839,6 @@ FROM (SELECT SUM(raw_events_second.value_4) AS v4,
ON (f.id = f2.id)
WHERE f.id IN (SELECT user_id
FROM raw_events_second));
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
-- cannot push down since join is not equi join (f.id > f2.id)
INSERT INTO raw_events_second
(user_id)
@ -1869,8 +1867,6 @@ FROM (SELECT SUM(raw_events_second.value_4) AS v4,
ON (f.id > f2.id)
WHERE f.id IN (SELECT user_id
FROM raw_events_second));
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
-- we currently not support grouping sets
INSERT INTO agg_events
(user_id,
@ -1978,7 +1974,7 @@ INSERT INTO raw_events_first (user_id, time, value_1, value_2, value_3, value_4)
SELECT count(*) FROM raw_events_second;
count
---------------------------------------------------------------------
36
45
(1 row)
INSERT INTO raw_events_second SELECT * FROM test_view;
@ -1988,7 +1984,7 @@ INSERT INTO raw_events_second SELECT * FROM test_view WHERE user_id = 17 GROUP B
SELECT count(*) FROM raw_events_second;
count
---------------------------------------------------------------------
38
47
(1 row)
-- intermediate results (CTEs) should be allowed when doing INSERT...SELECT within a CTE

View File

@ -1839,8 +1839,6 @@ FROM (SELECT SUM(raw_events_second.value_4) AS v4,
ON (f.id = f2.id)
WHERE f.id IN (SELECT user_id
FROM raw_events_second));
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
-- cannot push down since join is not equi join (f.id > f2.id)
INSERT INTO raw_events_second
(user_id)
@ -1869,8 +1867,6 @@ FROM (SELECT SUM(raw_events_second.value_4) AS v4,
ON (f.id > f2.id)
WHERE f.id IN (SELECT user_id
FROM raw_events_second));
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
-- we currently not support grouping sets
INSERT INTO agg_events
(user_id,
@ -1978,7 +1974,7 @@ INSERT INTO raw_events_first (user_id, time, value_1, value_2, value_3, value_4)
SELECT count(*) FROM raw_events_second;
count
---------------------------------------------------------------------
36
45
(1 row)
INSERT INTO raw_events_second SELECT * FROM test_view;
@ -1988,7 +1984,7 @@ INSERT INTO raw_events_second SELECT * FROM test_view WHERE user_id = 17 GROUP B
SELECT count(*) FROM raw_events_second;
count
---------------------------------------------------------------------
38
47
(1 row)
-- intermediate results (CTEs) should be allowed when doing INSERT...SELECT within a CTE

View File

@ -107,7 +107,7 @@ FROM (
ERROR: the query contains a join that requires repartitioning
HINT: Set citus.enable_repartition_joins to on to enable repartitioning
-- the LEFT JOIN conditon is not on the partition column (i.e., is it part_key divided by 2)
-- still, recursive planning will kick in to plan some part of the query
-- but, we can plan the query thanks to recursive planning
SET client_min_messages TO DEBUG1;
INSERT INTO agg_results_third (user_id, value_1_agg, value_2_agg )
SELECT user_id, sum(array_length(events_table, 1)), length(hasdone_event)
@ -145,14 +145,22 @@ FROM (
AND e.event_type IN (106, 107, 108)
) t2 ON (t1.user_id = (t2.user_id)/2)
GROUP BY t1.user_id, hasdone_event
) t GROUP BY user_id, hasdone_event;
) t GROUP BY user_id, hasdone_event
RETURNING user_id, value_1_agg, value_2_agg;
DEBUG: Set operations are not allowed in distributed INSERT ... SELECT queries
DEBUG: generating subplan XXX_1 for subquery SELECT u.user_id, 'step=>1'::text AS event, e."time" FROM public.users_table u, public.events_table e WHERE ((u.user_id OPERATOR(pg_catalog.=) e.user_id) AND (u.user_id OPERATOR(pg_catalog.>=) 10) AND (u.user_id OPERATOR(pg_catalog.<=) 25) AND (e.event_type OPERATOR(pg_catalog.=) ANY (ARRAY[100, 101, 102])))
DEBUG: generating subplan XXX_2 for subquery SELECT u.user_id, 'step=>2'::text AS event, e."time" FROM public.users_table u, public.events_table e WHERE ((u.user_id OPERATOR(pg_catalog.=) e.user_id) AND (u.user_id OPERATOR(pg_catalog.>=) 10) AND (u.user_id OPERATOR(pg_catalog.<=) 25) AND (e.event_type OPERATOR(pg_catalog.=) ANY (ARRAY[103, 104, 105])))
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT intermediate_result.user_id, intermediate_result.event, intermediate_result."time" FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, event text, "time" timestamp without time zone) UNION SELECT intermediate_result.user_id, intermediate_result.event, intermediate_result."time" FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, event text, "time" timestamp without time zone)
DEBUG: generating subplan XXX_1 for subquery SELECT u.user_id, 'step=>1'::text AS event, e."time" FROM public.users_table u, public.events_table e WHERE ((u.user_id OPERATOR(pg_catalog.=) e.user_id) AND (u.user_id OPERATOR(pg_catalog.>=) 10) AND (u.user_id OPERATOR(pg_catalog.<=) 25) AND (e.event_type OPERATOR(pg_catalog.=) ANY (ARRAY[100, 101, 102]))) UNION SELECT u.user_id, 'step=>2'::text AS event, e."time" FROM public.users_table u, public.events_table e WHERE ((u.user_id OPERATOR(pg_catalog.=) e.user_id) AND (u.user_id OPERATOR(pg_catalog.>=) 10) AND (u.user_id OPERATOR(pg_catalog.<=) 25) AND (e.event_type OPERATOR(pg_catalog.=) ANY (ARRAY[103, 104, 105])))
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning the distributed subquery since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: generating subplan XXX_2 for subquery SELECT DISTINCT user_id, 'Has done event'::text AS hasdone_event FROM public.events_table e WHERE ((user_id OPERATOR(pg_catalog.>=) 10) AND (user_id OPERATOR(pg_catalog.<=) 25) AND (event_type OPERATOR(pg_catalog.=) ANY (ARRAY[106, 107, 108])))
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT user_id, int4(sum(array_length(events_table, 1))) AS value_1_agg, length(hasdone_event) AS value_2_agg FROM (SELECT t1.user_id, array_agg(t1.event ORDER BY t1."time") AS events_table, COALESCE(t2.hasdone_event, 'Has not done event'::text) AS hasdone_event FROM ((SELECT intermediate_result.user_id, intermediate_result.event, intermediate_result."time" FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, event text, "time" timestamp without time zone)) t1 LEFT JOIN (SELECT intermediate_result.user_id, intermediate_result.hasdone_event FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, hasdone_event text)) t2 ON ((t1.user_id OPERATOR(pg_catalog.=) (t2.user_id OPERATOR(pg_catalog./) 2)))) GROUP BY t1.user_id, t2.hasdone_event) t GROUP BY user_id, hasdone_event
DEBUG: Collecting INSERT ... SELECT results on coordinator
user_id | value_1_agg | value_2_agg
---------------------------------------------------------------------
(0 rows)
RESET client_min_messages;
---------------------------------------------------------------------
---------------------------------------------------------------------
@ -229,7 +237,7 @@ ORDER BY
ERROR: complex joins are only supported when all distributed tables are joined on their distribution columns with equal operator
-- not pushable since the JOIN condition is not equi JOIN
-- (subquery_1 JOIN subquery_2)
-- still, recursive planning will kick in
-- but, we can plan the query thanks to recursive planning
SET client_min_messages TO DEBUG1;
INSERT INTO agg_results_third (user_id, value_1_agg, value_2_agg)
SELECT
@ -295,14 +303,22 @@ WHERE
GROUP BY
count_pay, user_id
ORDER BY
count_pay;
count_pay
RETURNING user_id, value_1_agg, value_2_agg;
DEBUG: Set operations are not allowed in distributed INSERT ... SELECT queries
DEBUG: generating subplan XXX_1 for subquery SELECT users_table.user_id, 'action=>1'::text AS event, events_table."time" FROM public.users_table, public.events_table WHERE ((users_table.user_id OPERATOR(pg_catalog.=) events_table.user_id) AND (users_table.user_id OPERATOR(pg_catalog.>=) 10) AND (users_table.user_id OPERATOR(pg_catalog.<=) 70) AND (events_table.event_type OPERATOR(pg_catalog.>) 10) AND (events_table.event_type OPERATOR(pg_catalog.<) 12))
DEBUG: generating subplan XXX_2 for subquery SELECT users_table.user_id, 'action=>2'::text AS event, events_table."time" FROM public.users_table, public.events_table WHERE ((users_table.user_id OPERATOR(pg_catalog.=) events_table.user_id) AND (users_table.user_id OPERATOR(pg_catalog.>=) 10) AND (users_table.user_id OPERATOR(pg_catalog.<=) 70) AND (events_table.event_type OPERATOR(pg_catalog.>) 12) AND (events_table.event_type OPERATOR(pg_catalog.<) 14))
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT intermediate_result.user_id, intermediate_result.event, intermediate_result."time" FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, event text, "time" timestamp without time zone) UNION SELECT intermediate_result.user_id, intermediate_result.event, intermediate_result."time" FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, event text, "time" timestamp without time zone)
DEBUG: generating subplan XXX_1 for subquery SELECT users_table.user_id, 'action=>1'::text AS event, events_table."time" FROM public.users_table, public.events_table WHERE ((users_table.user_id OPERATOR(pg_catalog.=) events_table.user_id) AND (users_table.user_id OPERATOR(pg_catalog.>=) 10) AND (users_table.user_id OPERATOR(pg_catalog.<=) 70) AND (events_table.event_type OPERATOR(pg_catalog.>) 10) AND (events_table.event_type OPERATOR(pg_catalog.<) 12)) UNION SELECT users_table.user_id, 'action=>2'::text AS event, events_table."time" FROM public.users_table, public.events_table WHERE ((users_table.user_id OPERATOR(pg_catalog.=) events_table.user_id) AND (users_table.user_id OPERATOR(pg_catalog.>=) 10) AND (users_table.user_id OPERATOR(pg_catalog.<=) 70) AND (events_table.event_type OPERATOR(pg_catalog.>) 12) AND (events_table.event_type OPERATOR(pg_catalog.<) 14))
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning the distributed subquery since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: generating subplan XXX_2 for subquery SELECT user_id, count(*) AS count_pay FROM public.users_table WHERE ((user_id OPERATOR(pg_catalog.>=) 10) AND (user_id OPERATOR(pg_catalog.<=) 70) AND (value_1 OPERATOR(pg_catalog.>) 15) AND (value_1 OPERATOR(pg_catalog.<) 17)) GROUP BY user_id HAVING (count(*) OPERATOR(pg_catalog.>) 1)
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT user_id, int4(avg(array_length(events_table, 1))) AS value_1_agg, int4(count_pay) AS value_2_agg FROM (SELECT subquery_1.user_id, array_agg(subquery_1.event ORDER BY subquery_1."time") AS events_table, COALESCE(subquery_2.count_pay, (0)::bigint) AS count_pay FROM ((SELECT intermediate_result.user_id, intermediate_result.event, intermediate_result."time" FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, event text, "time" timestamp without time zone)) subquery_1 LEFT JOIN (SELECT intermediate_result.user_id, intermediate_result.count_pay FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, count_pay bigint)) subquery_2 ON ((subquery_1.user_id OPERATOR(pg_catalog.>) subquery_2.user_id))) GROUP BY subquery_1.user_id, subquery_2.count_pay) subquery_top WHERE (array_ndims(events_table) OPERATOR(pg_catalog.>) 0) GROUP BY count_pay, user_id ORDER BY count_pay
DEBUG: Collecting INSERT ... SELECT results on coordinator
user_id | value_1_agg | value_2_agg
---------------------------------------------------------------------
(0 rows)
RESET client_min_messages;
---------------------------------------------------------------------
---------------------------------------------------------------------

View File

@ -406,7 +406,7 @@ ORDER BY l_custkey, r_custkey, t_custkey;
30 | 30 | 30
(17 rows)
-- Right join with single shard right most table should error out
-- Right join with single shard right most table should work
SELECT
l_custkey, r_custkey, t_custkey
FROM
@ -414,8 +414,42 @@ FROM
LEFT JOIN multi_outer_join_right r1 ON (l1.l_custkey = r1.r_custkey)
RIGHT JOIN multi_outer_join_third_reference t1 ON (r1.r_custkey = t1.t_custkey)
ORDER BY l_custkey, r_custkey, t_custkey;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
LOG: join order: [ "multi_outer_join_left" ]
LOG: join order: [ "multi_outer_join_right" ]
l_custkey | r_custkey | t_custkey
---------------------------------------------------------------------
11 | 11 | 11
12 | 12 | 12
14 | 14 | 14
16 | 16 | 16
17 | 17 | 17
18 | 18 | 18
20 | 20 | 20
21 | 21 | 21
22 | 22 | 22
24 | 24 | 24
26 | 26 | 26
27 | 27 | 27
28 | 28 | 28
30 | 30 | 30
| | 1
| | 2
| | 3
| | 4
| | 5
| | 6
| | 7
| | 8
| | 9
| | 10
| | 13
| | 15
| | 19
| | 23
| | 25
| | 29
(30 rows)
-- Right join with single shard left most table should work
SELECT
t_custkey, r_custkey, l_custkey

View File

@ -208,13 +208,17 @@ FROM
20 | 0
(1 row)
-- Right join should be disallowed in this case
-- Right join is allowed as we recursively plan the distributed table (multi_outer_join_left_hash)
SELECT
min(r_custkey), max(r_custkey)
FROM
multi_outer_join_left_hash a RIGHT JOIN multi_outer_join_right_reference b ON (l_custkey = r_custkey);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
LOG: join order: [ "multi_outer_join_left_hash" ]
min | max
---------------------------------------------------------------------
1 | 15
(1 row)
-- Reverse right join should be same as left join
SELECT
min(l_custkey), max(l_custkey)
@ -319,13 +323,17 @@ FROM
25 | 1
(1 row)
-- Right join should not be allowed in this case
-- Right join should be allowed in this case as we recursively plan the distributed table (multi_outer_join_left_hash
SELECT
min(r_custkey), max(r_custkey)
FROM
multi_outer_join_left_hash a RIGHT JOIN multi_outer_join_right_reference b ON (l_custkey = r_custkey);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
LOG: join order: [ "multi_outer_join_left_hash" ]
min | max
---------------------------------------------------------------------
11 | 30
(1 row)
-- Reverse right join should be same as left join
SELECT
min(l_custkey), max(l_custkey)
@ -393,15 +401,50 @@ ORDER BY 1;
30 | 30 | 30
(25 rows)
-- Right join with single shard right most table should error out
-- Right join with single shard right most table should work
SELECT
l_custkey, r_custkey, t_custkey
FROM
multi_outer_join_left_hash l1
LEFT JOIN multi_outer_join_right_hash r1 ON (l1.l_custkey = r1.r_custkey)
RIGHT JOIN multi_outer_join_third_reference t1 ON (r1.r_custkey = t1.t_custkey);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
RIGHT JOIN multi_outer_join_third_reference t1 ON (r1.r_custkey = t1.t_custkey)
ORDER BY 1,2,3;
LOG: join order: [ "multi_outer_join_left_hash" ]
LOG: join order: [ "multi_outer_join_right_hash" ]
l_custkey | r_custkey | t_custkey
---------------------------------------------------------------------
11 | 11 | 11
12 | 12 | 12
13 | 13 | 13
14 | 14 | 14
15 | 15 | 15
21 | 21 | 21
22 | 22 | 22
23 | 23 | 23
24 | 24 | 24
25 | 25 | 25
26 | 26 | 26
27 | 27 | 27
28 | 28 | 28
29 | 29 | 29
30 | 30 | 30
| | 1
| | 2
| | 3
| | 4
| | 5
| | 6
| | 7
| | 8
| | 9
| | 10
| | 16
| | 17
| | 18
| | 19
| | 20
(30 rows)
-- Right join with single shard left most table should work
SELECT
t_custkey, r_custkey, l_custkey
@ -600,14 +643,48 @@ ORDER BY 1,2 DESC;
| 16
(15 rows)
-- full outer join should error out for mismatched shards
-- full outer join should work as we recursively plan the distributed table (multi_outer_join_left_hash
SELECT
l_custkey, t_custkey
FROM
multi_outer_join_left_hash l1
FULL JOIN multi_outer_join_third_reference t1 ON (l1.l_custkey = t1.t_custkey);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
FULL JOIN multi_outer_join_third_reference t1 ON (l1.l_custkey = t1.t_custkey)
ORDER BY 1,2;
LOG: join order: [ "multi_outer_join_left_hash" ]
l_custkey | t_custkey
---------------------------------------------------------------------
1 | 1
2 | 2
3 | 3
4 | 4
5 | 5
6 | 6
7 | 7
8 | 8
9 | 9
10 | 10
11 | 11
12 | 12
13 | 13
14 | 14
15 | 15
21 | 21
22 | 22
23 | 23
24 | 24
25 | 25
26 | 26
27 | 27
28 | 28
29 | 29
30 | 30
| 16
| 17
| 18
| 19
| 20
(30 rows)
-- inner join + single shard left join should work
SELECT
l_custkey, r_custkey, t_custkey

View File

@ -664,15 +664,28 @@ WHERE user_id IN
INTERSECT
SELECT user_id
FROM events_test_table);
-- Reference tables can not locate on the outer part of the outer join
-- Reference tables can locate on the outer part of the outer join
-- Note that we don't need to sort the output because
-- citus.sort_returning is enabled by default during
-- regression tests.
UPDATE users_test_table
SET value_1 = 4
WHERE user_id IN
(SELECT DISTINCT e2.user_id
FROM users_reference_copy_table
LEFT JOIN users_test_table e2 ON (e2.user_id = users_reference_copy_table.value_1)) RETURNING *;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
WHERE user_id IN (
SELECT DISTINCT e2.user_id
FROM users_reference_copy_table
LEFT JOIN users_test_table e2 ON (e2.user_id = users_reference_copy_table.value_1)
)
RETURNING *;
user_id | value_1 | value_2 | value_3
---------------------------------------------------------------------
5 | 4 | 1 | 0
5 | 4 | 1 | 0
5 | 4 | 1 | 0
6 | 4 | 11 | 0
6 | 4 | 15 | 0
16 | 4 | | 0
(6 rows)
-- Volatile functions are also not supported
UPDATE users_test_table
SET value_2 = 5

View File

@ -76,12 +76,20 @@ ORDER BY 1;
7
(10 rows)
-- Shouldn't work, reference table at the outer side is not allowed
-- Should work, reference table at the outer side is allowed
SELECT * FROM
(SELECT random() FROM users_ref_test_table LEFT JOIN user_buy_test_table
(SELECT random() > 2 FROM users_ref_test_table LEFT JOIN user_buy_test_table
ON users_ref_test_table.id = user_buy_test_table.user_id) subquery_1;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
?column?
---------------------------------------------------------------------
f
f
f
f
f
f
(6 rows)
-- Should work, reference table at the inner side is allowed
SELECT count(*) FROM
(SELECT random() FROM users_ref_test_table RIGHT JOIN user_buy_test_table
@ -91,12 +99,20 @@ SELECT count(*) FROM
4
(1 row)
-- Shouldn't work, reference table at the outer side is not allowed
-- Should work, reference table at the outer side is allowed
SELECT * FROM
(SELECT random() FROM user_buy_test_table RIGHT JOIN users_ref_test_table
(SELECT random() > 2 FROM user_buy_test_table RIGHT JOIN users_ref_test_table
ON user_buy_test_table.user_id = users_ref_test_table.id) subquery_1;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
?column?
---------------------------------------------------------------------
f
f
f
f
f
f
(6 rows)
-- Equi join test with reference table on non-partition keys
SELECT count(*) FROM
(SELECT random() FROM user_buy_test_table JOIN users_ref_test_table
@ -270,16 +286,22 @@ ON user_buy_test_table.item_id = users_ref_test_table.id;
4
(1 row)
-- table function cannot be the outer relationship in an outer join
-- table function can be the outer relationship in an outer join
SELECT count(*) FROM
(SELECT random() FROM user_buy_test_table RIGHT JOIN generate_series(1,10) AS users_ref_test_table(id)
ON user_buy_test_table.item_id > users_ref_test_table.id) subquery_1;
ERROR: cannot pushdown the subquery
DETAIL: There exist a table function in the outer part of the outer join
count
---------------------------------------------------------------------
16
(1 row)
SELECT count(*) FROM user_buy_test_table RIGHT JOIN (SELECT * FROM generate_series(1,10) id) users_ref_test_table
ON user_buy_test_table.item_id = users_ref_test_table.id;
ERROR: cannot pushdown the subquery
DETAIL: There exist a table function in the outer part of the outer join
count
---------------------------------------------------------------------
10
(1 row)
-- volatile functions can be used as table expressions through recursive planning
SET client_min_messages TO DEBUG;
SELECT count(*) FROM
@ -351,11 +373,14 @@ ON user_buy_test_table.item_id = users_ref_test_table.id;
4
(1 row)
-- subquery without FROM cannot be the outer relationship in an outer join
-- subquery without FROM can be the outer relationship in an outer join
SELECT count(*) FROM user_buy_test_table RIGHT JOIN (SELECT 5 AS id) users_ref_test_table
ON user_buy_test_table.item_id = users_ref_test_table.id;
ERROR: cannot pushdown the subquery
DETAIL: There exist a subquery without FROM in the outer part of the outer join
count
---------------------------------------------------------------------
1
(1 row)
-- can perform a union with subquery without FROM
-- with pulling data to coordinator
SET client_min_messages TO DEBUG;
@ -464,8 +489,14 @@ SELECT * FROM
SELECT user_id FROM user_buy_test_table) sub
ORDER BY 1 DESC;
DEBUG: Router planner cannot handle multi-shard select queries
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
user_id
---------------------------------------------------------------------
7
3
2
1
(4 rows)
SELECT * FROM
(SELECT user_id FROM users_ref_test_table ref JOIN user_buy_test_table dis
on (ref.id = dis.user_id)
@ -498,8 +529,8 @@ DEBUG: Router planner cannot handle multi-shard select queries
1
(4 rows)
-- query can be pushed down when a reference table inside union query is
-- joined with a distributed table. reference table cannot be at
-- query is supported when a reference table inside union query is
-- joined with a distributed table. reference table can be at
-- the outer part.
SELECT * FROM
(SELECT user_id FROM users_ref_test_table ref LEFT JOIN user_buy_test_table dis
@ -508,8 +539,26 @@ SELECT * FROM
SELECT user_id FROM user_buy_test_table) sub
ORDER BY 1 DESC;
DEBUG: Router planner cannot handle multi-shard select queries
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning distributed relation "user_buy_test_table" "dis" since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Wrapping relation "user_buy_test_table" "dis" to a subquery
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_1 for subquery SELECT user_id FROM public.user_buy_test_table dis WHERE true
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_2 for subquery SELECT user_id FROM public.user_buy_test_table
DEBUG: Creating router plan
DEBUG: generating subplan XXX_3 for subquery SELECT dis.user_id FROM (public.users_ref_test_table ref LEFT JOIN (SELECT dis_1.user_id, NULL::integer AS item_id, NULL::integer AS buy_count FROM (SELECT intermediate_result.user_id FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer)) dis_1) dis ON ((ref.id OPERATOR(pg_catalog.=) dis.user_id))) UNION SELECT intermediate_result.user_id FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer)
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT user_id FROM (SELECT intermediate_result.user_id FROM read_intermediate_result('XXX_3'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer)) sub ORDER BY user_id DESC
DEBUG: Creating router plan
user_id
---------------------------------------------------------------------
7
3
2
1
(5 rows)
RESET client_min_messages;
-- should be able to pushdown since reference table is in the
-- inner part of the left join
@ -533,7 +582,7 @@ FROM
6 | 210
(6 rows)
-- should not be able to pushdown since reference table is in the
-- supported even if the reference table is in the
-- direct outer part of the left join
SELECT
user_id, sum(value_1)
@ -545,18 +594,29 @@ FROM
LEFT JOIN events_table ON (events_table.user_id = users_table.user_id)
) as foo
GROUP BY user_id ORDER BY 2 DESC LIMIT 10;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
-- should not be able to pushdown since reference table is in the
user_id | sum
---------------------------------------------------------------------
|
2 | 31248
3 | 15120
4 | 14994
5 | 8694
1 | 7590
(6 rows)
-- supported even if the reference table is in the
-- direct outer part of the left join wrapped into a subquery
SELECT
*
COUNT(*) = 1581
FROM
(SELECT *, random() FROM events_reference_table) as ref_all LEFT JOIN users_table
ON (users_table.user_id = ref_all.value_2);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
-- should not be able to pushdown since reference table is in the
?column?
---------------------------------------------------------------------
t
(1 row)
-- supported even if the reference table is in the
-- outer part of the left join
SELECT
user_id, sum(value_1)
@ -568,8 +628,16 @@ FROM
LEFT JOIN events_table ON (events_table.user_id = users_table.user_id)
) as foo
GROUP BY user_id ORDER BY 2 DESC LIMIT 10;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
user_id | sum
---------------------------------------------------------------------
|
2 | 31248
3 | 15120
4 | 14994
5 | 8694
1 | 7590
(6 rows)
-- should be able to pushdown since reference table is in the
-- inner part of the left join
SELECT * FROM
@ -617,8 +685,14 @@ SELECT * FROM
WHERE
user_id > 2 and value_2 = 1) as foo_in ON (event_val_2 = user_id)) as foo LEFT JOIN
(SELECT user_id as user_user_id FROM users_table) as fooo ON (user_id = user_user_id)) as bar;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
user_id
---------------------------------------------------------------------
3
5
4
(4 rows)
-- we could even support the following where the subquery
-- on the outer part of the left join contains a reference table
SELECT max(events_all.cnt), events_all.usr_id
@ -1064,7 +1138,7 @@ count(*) AS cnt, "generated_group_field"
84 | 0
(6 rows)
-- RIGHT JOINs used with INNER JOINs should error out since reference table exist in the
-- RIGHT JOINs used with INNER JOINs should work even if the reference table exist in the
-- right side of the RIGHT JOIN.
SELECT
count(*) AS cnt, "generated_group_field"
@ -1102,8 +1176,16 @@ count(*) AS cnt, "generated_group_field"
ORDER BY
cnt DESC, generated_group_field ASC
LIMIT 10;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
cnt | generated_group_field
---------------------------------------------------------------------
1007 | 2
952 | 5
773 | 1
696 | 3
433 | 4
190 | 0
(6 rows)
-- right join where the inner part of the join includes a reference table
-- joined with hash partitioned table using non-equi join
SELECT user_id, sum(array_length(events_table, 1)), length(hasdone_event), hasdone_event
@ -1254,8 +1336,27 @@ FROM
ORDER BY
user_id
limit 50;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
user_id | lastseen
---------------------------------------------------------------------
1 |
1 |
2 | Thu Nov 23 17:26:14.563216 2017
2 | Thu Nov 23 17:26:14.563216 2017
2 | Thu Nov 23 17:26:14.563216 2017
3 | Thu Nov 23 18:08:26.550729 2017
3 | Thu Nov 23 18:08:26.550729 2017
3 | Thu Nov 23 18:08:26.550729 2017
4 |
4 |
4 |
5 |
5 |
5 |
5 |
5 |
6 |
(17 rows)
--
-- UNIONs and JOINs with reference tables, should error out
--
@ -1466,13 +1567,16 @@ ORDER BY types;
3 | 120
(4 rows)
-- just a sanity check that we don't allow this if the reference table is on the
-- just a sanity check that we allow this even if the reference table is on the
-- left part of the left join
SELECT count(*) FROM
(SELECT random() FROM users_ref_test_table LEFT JOIN user_buy_test_table
ON user_buy_test_table.item_id > users_ref_test_table.id) subquery_1;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
12
(1 row)
-- we do allow non equi join among subqueries via recursive planning
SET client_min_messages TO DEBUG1;
SELECT count(*) FROM
@ -1488,9 +1592,6 @@ DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS c
67
(1 row)
-- we could not push this query not due to non colocated
-- subqueries (i.e., they are recursively planned)
-- but due to outer join restrictions
SELECT
count(*) AS cnt, "generated_group_field"
FROM
@ -1528,9 +1629,20 @@ count(*) AS cnt, "generated_group_field"
cnt DESC, generated_group_field ASC
LIMIT 10;
DEBUG: generating subplan XXX_1 for subquery SELECT user_id, value_2 AS generated_group_field FROM public.users_table users
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS cnt, generated_group_field FROM (SELECT "eventQuery".user_id, random() AS random, "eventQuery".generated_group_field FROM (SELECT multi_group_wrapper_1."time", multi_group_wrapper_1.event_user_id, multi_group_wrapper_1.user_id, left_group_by_1.generated_group_field, random() AS random FROM ((SELECT temp_data_queries."time", temp_data_queries.event_user_id, user_filters_1.user_id FROM ((SELECT events."time", events.user_id AS event_user_id FROM public.events_table events WHERE (events.user_id OPERATOR(pg_catalog.>) 2)) temp_data_queries JOIN (SELECT users.user_id FROM public.users_reference_table users WHERE ((users.user_id OPERATOR(pg_catalog.>) 2) AND (users.value_2 OPERATOR(pg_catalog.=) 5))) user_filters_1 ON ((temp_data_queries.event_user_id OPERATOR(pg_catalog.<) user_filters_1.user_id)))) multi_group_wrapper_1 RIGHT JOIN (SELECT intermediate_result.user_id, intermediate_result.generated_group_field FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, generated_group_field integer)) left_group_by_1 ON ((left_group_by_1.user_id OPERATOR(pg_catalog.>) multi_group_wrapper_1.event_user_id)))) "eventQuery") "pushedDownQuery" GROUP BY generated_group_field ORDER BY (count(*)) DESC, generated_group_field LIMIT 10
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
DEBUG: recursively planning left side of the right join since the outer side is a recurring rel
DEBUG: recursively planning the distributed subquery since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: generating subplan XXX_2 for subquery SELECT temp_data_queries."time", temp_data_queries.event_user_id, user_filters_1.user_id FROM ((SELECT events."time", events.user_id AS event_user_id FROM public.events_table events WHERE (events.user_id OPERATOR(pg_catalog.>) 2)) temp_data_queries JOIN (SELECT users.user_id FROM public.users_reference_table users WHERE ((users.user_id OPERATOR(pg_catalog.>) 2) AND (users.value_2 OPERATOR(pg_catalog.=) 5))) user_filters_1 ON ((temp_data_queries.event_user_id OPERATOR(pg_catalog.<) user_filters_1.user_id)))
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS cnt, generated_group_field FROM (SELECT "eventQuery".user_id, random() AS random, "eventQuery".generated_group_field FROM (SELECT multi_group_wrapper_1."time", multi_group_wrapper_1.event_user_id, multi_group_wrapper_1.user_id, left_group_by_1.generated_group_field, random() AS random FROM ((SELECT intermediate_result."time", intermediate_result.event_user_id, intermediate_result.user_id FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result("time" timestamp without time zone, event_user_id integer, user_id integer)) multi_group_wrapper_1 RIGHT JOIN (SELECT intermediate_result.user_id, intermediate_result.generated_group_field FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, generated_group_field integer)) left_group_by_1 ON ((left_group_by_1.user_id OPERATOR(pg_catalog.>) multi_group_wrapper_1.event_user_id)))) "eventQuery") "pushedDownQuery" GROUP BY generated_group_field ORDER BY (count(*)) DESC, generated_group_field LIMIT 10
cnt | generated_group_field
---------------------------------------------------------------------
2042 | 1
1675 | 2
1470 | 4
1259 | 3
941 | 0
686 | 5
(6 rows)
RESET client_min_messages;
-- two hash partitioned relations are not joined
-- on partiton keys although reference table is fine
@ -1961,73 +2073,126 @@ LIMIT 5;
6
(1 row)
-- outer part of the LEFT JOIN consists only reference tables, so we cannot push down
-- supported even if the outer part of the LEFT JOIN consists only reference tables
-- we have different combinations for ON condition, true/false/two column join/single column filter
SELECT count(*) FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id LEFT JOIN user_buy_test_table ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
24
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id LEFT JOIN user_buy_test_table ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
24
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id LEFT JOIN user_buy_test_table ON false;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
6
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id LEFT JOIN user_buy_test_table ON false;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
6
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id LEFT JOIN user_buy_test_table ON (ref1.id > 5);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
9
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id LEFT JOIN user_buy_test_table ON (user_buy_test_table.user_id > 5);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
6
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id LEFT JOIN user_buy_test_table ON (ref1.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
6
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id LEFT JOIN user_buy_test_table ON (ref2.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
6
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id LEFT JOIN user_buy_test_table ON (ref1.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
6
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id LEFT JOIN user_buy_test_table ON (ref2.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
-- outer part of the LEFT JOIN consists only reference tables within a subquery, so we cannot push down
count
---------------------------------------------------------------------
6
(1 row)
-- outer part of the LEFT JOIN consists only reference tables within a subquery
-- we have different combinations for ON condition, true/false/two column join/single column filter
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo LEFT JOIN user_buy_test_table ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
24
(1 row)
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo LEFT JOIN user_buy_test_table ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
24
(1 row)
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo LEFT JOIN user_buy_test_table ON false;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
6
(1 row)
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo LEFT JOIN user_buy_test_table ON false;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
6
(1 row)
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo LEFT JOIN user_buy_test_table ON (foo.id > 5);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
9
(1 row)
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo LEFT JOIN user_buy_test_table ON (user_buy_test_table.user_id > 19);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
6
(1 row)
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo LEFT JOIN user_buy_test_table ON (foo.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
-- one example where unsupported outer join is deep inside a subquery
count
---------------------------------------------------------------------
6
(1 row)
-- one example where supported outer join is deep inside a subquery
SELECT *, random() FROM (
SELECT *,random() FROM user_buy_test_table WHERE user_id > (
SELECT count(*) FROM (SELECT *,random() FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as bar) as foo LEFT JOIN (SELECT *, random() FROM (SELECT *,random() FROM user_buy_test_table d1 JOIN user_buy_test_table d2 USING (user_id)) as bar_inner ) as bar ON true)) as boo;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
-- In theory, we should be able to pushdown this query
-- however, as the LEFT JOIN condition is between a reference table and the distributed table
-- Postgres generates a LEFT JOIN alternative among those tables
user_id | item_id | buy_count | random | random
---------------------------------------------------------------------
(0 rows)
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN user_buy_test_table u1 on ref1.id = u1.user_id) as foo LEFT JOIN user_buy_test_table ON (foo.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
count
---------------------------------------------------------------------
3
(1 row)
-- same as the above query, but this time LEFT JOIN condition is between distributed tables
-- so Postgres doesn't generate join restriction between reference and distributed tables
SELECT count(*) FROM (SELECT u1.*, random() FROM users_ref_test_table ref1 INNER JOIN user_buy_test_table u1 on ref1.id = u1.user_id) as foo LEFT JOIN user_buy_test_table ON (foo.user_id = user_buy_test_table.user_id);
@ -2036,10 +2201,13 @@ SELECT count(*) FROM (SELECT u1.*, random() FROM users_ref_test_table ref1 INNER
3
(1 row)
-- outer part of the LEFT JOIN consists only intermediate result due to LIMIT, so we cannot push down
-- outer part of the LEFT JOIN consists only intermediate result due to LIMIT but this is still supported
SELECT count(*) FROM (SELECT ref1.* FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id LIMIT 5) as foo LEFT JOIN user_buy_test_table ON true;
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
count
---------------------------------------------------------------------
20
(1 row)
-- should be fine as OUTER part is the distributed table
SELECT count(*) FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id RIGHT JOIN user_buy_test_table ON true;
count
@ -2062,123 +2230,212 @@ SELECT count(*) FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table r
-- left outer part of the FULL JOIN consists only reference tables, so we cannot push down
-- we have different combinations for ON condition, true/false/two column join/single column filter
SELECT count(*) FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id FULL JOIN user_buy_test_table ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
24
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id FULL JOIN user_buy_test_table ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
24
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id FULL JOIN user_buy_test_table ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
24
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id FULL JOIN user_buy_test_table ON false;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
10
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id FULL JOIN user_buy_test_table ON false;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
10
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id FULL JOIN user_buy_test_table ON (ref1.id > 5);
ERROR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions
SELECT count(*) FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id FULL JOIN user_buy_test_table ON (user_buy_test_table.user_id > 5);
ERROR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions
SELECT count(*) FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id FULL JOIN user_buy_test_table ON (ref1.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
7
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id FULL JOIN user_buy_test_table ON (ref2.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
7
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id FULL JOIN user_buy_test_table ON (ref1.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
7
(1 row)
SELECT count(*) FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id FULL JOIN user_buy_test_table ON (ref2.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
7
(1 row)
-- right outer part of the FULL JOIN consists only reference tables, so we cannot push down
-- we have different combinations for ON condition, true/false/two column join/single column filter
SELECT count(*) FROM user_buy_test_table FULL JOIN users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
24
(1 row)
SELECT count(*) FROM user_buy_test_table FULL JOIN users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
24
(1 row)
SELECT count(*) FROM user_buy_test_table FULL JOIN users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
24
(1 row)
SELECT count(*) FROM user_buy_test_table FULL JOIN users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id ON false;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
10
(1 row)
SELECT count(*) FROM user_buy_test_table FULL JOIN users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id ON false;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
10
(1 row)
SELECT count(*) FROM user_buy_test_table FULL JOIN users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id ON (ref1.id > 5);
ERROR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions
SELECT count(*) FROM user_buy_test_table FULL JOIN users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id ON (user_buy_test_table.user_id > 5);
ERROR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions
SELECT count(*) FROM user_buy_test_table FULL JOIN users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id ON (ref1.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
7
(1 row)
SELECT count(*) FROM user_buy_test_table FULL JOIN users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id ON (ref2.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
7
(1 row)
SELECT count(*) FROM user_buy_test_table FULL JOIN users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id ON (ref1.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
7
(1 row)
SELECT count(*) FROM user_buy_test_table FULL JOIN users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id ON (ref2.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
7
(1 row)
-- left outer part of the FULL JOIN consists only reference tables within a subquery, so we cannot push down
-- we have different combinations for ON condition, true/false/two column join/single column filter
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo FULL JOIN user_buy_test_table ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
24
(1 row)
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo FULL JOIN user_buy_test_table ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
24
(1 row)
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo FULL JOIN user_buy_test_table ON false;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
10
(1 row)
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo FULL JOIN user_buy_test_table ON false;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
10
(1 row)
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo FULL JOIN user_buy_test_table ON (foo.id > 5);
ERROR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo FULL JOIN user_buy_test_table ON (user_buy_test_table.user_id > 19);
ERROR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo FULL JOIN user_buy_test_table ON (foo.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
7
(1 row)
-- right outer part of the FULL JOIN consists only reference tables within a subquery, so we cannot push down
-- we have different combinations for ON condition, true/false/two column join/single column filter
SELECT count(*) FROM user_buy_test_table FULL JOIN (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
24
(1 row)
SELECT count(*) FROM user_buy_test_table FULL JOIN (SELECT ref1.*, random() FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo ON true;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
24
(1 row)
SELECT count(*) FROM user_buy_test_table FULL JOIN (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo ON false;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
10
(1 row)
SELECT count(*) FROM user_buy_test_table FULL JOIN (SELECT ref1.*, random() FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo ON false;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
10
(1 row)
SELECT count(*) FROM user_buy_test_table FULL JOIN (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo ON (foo.id > 5);
ERROR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions
SELECT count(*) FROM user_buy_test_table FULL JOIN (SELECT ref1.*, random() FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo ON (user_buy_test_table.user_id > 19);
ERROR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions
SELECT count(*) FROM user_buy_test_table FULL JOIN (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo ON (foo.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
7
(1 row)
-- one example where unsupported outer join is deep inside a subquery
SELECT *, random() FROM (
SELECT *,random() FROM user_buy_test_table WHERE user_id > (
SELECT count(*) FROM (SELECT *,random() FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as bar) as foo FULL JOIN (SELECT *, random() FROM (SELECT *,random() FROM user_buy_test_table d1 JOIN user_buy_test_table d2 USING (user_id)) as bar_inner ) as bar ON true)) as boo;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
user_id | item_id | buy_count | random | random
---------------------------------------------------------------------
(0 rows)
-- In theory, we should be able to pushdown this query
-- however, as the FULL JOIN condition is between a reference table and the distributed table
-- Postgres generates a FULL JOIN alternative among those tables
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN user_buy_test_table u1 on ref1.id = u1.user_id) as foo FULL JOIN user_buy_test_table ON (foo.id = user_buy_test_table.user_id);
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
count
---------------------------------------------------------------------
4
(1 row)
-- same as the above query, but this time FULL JOIN condition is between distributed tables
-- so Postgres doesn't generate join restriction between reference and distributed tables
SELECT count(*) FROM (SELECT u1.*, random() FROM users_ref_test_table ref1 INNER JOIN user_buy_test_table u1 on ref1.id = u1.user_id) as foo FULL JOIN user_buy_test_table ON (foo.user_id = user_buy_test_table.user_id);
@ -2189,8 +2446,11 @@ SELECT count(*) FROM (SELECT u1.*, random() FROM users_ref_test_table ref1 INNER
-- left outer part of the FULL JOIN consists only intermediate result due to LIMIT, so we cannot push down
SELECT count(*) FROM (SELECT ref1.* FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id LIMIT 5) as foo FULL JOIN user_buy_test_table ON true;
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
count
---------------------------------------------------------------------
20
(1 row)
DROP TABLE user_buy_test_table;
DROP TABLE users_ref_test_table;
DROP TABLE users_return_test_table;

View File

@ -534,8 +534,13 @@ WHERE
GROUP BY user_id
HAVING count(*) > 1 AND sum(value_2) > 29
ORDER BY 1;
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
user_id
---------------------------------------------------------------------
2
3
4
(3 rows)
-- NOT EXISTS query has non-equi join
SELECT user_id, array_length(events_table, 1)
FROM (

View File

@ -162,8 +162,7 @@ WHERE
)
ORDER BY user_id
LIMIT 3;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
ERROR: correlated subqueries are not supported when the FROM clause contains a CTE or subquery
-- subqueries in WHERE with IN operator without equality
SELECT
users_table.user_id, count(*)
@ -362,8 +361,7 @@ SELECT user_id, value_2 FROM users_table WHERE
HAVING sum(submit_card_info) > 0
)
ORDER BY 1, 2;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
ERROR: cannot perform a lateral outer join when a distributed subquery references a reference table
-- non-partition key equality with reference table
SELECT
user_id, count(*)

View File

@ -486,17 +486,23 @@ ORDER BY 2 DESC, 1;
5 | NO
(3 rows)
-- event vs table non-partition-key join is not supported
-- given that we cannot recursively plan tables yet
SELECT * FROM
-- event vs table non-partition-key join is supported
-- given that we can recursively plan events_table
SELECT count(*), user_id, done_event FROM
(SELECT ru.user_id, CASE WHEN et.user_id IS NULL THEN 'NO' ELSE 'YES' END as done_event
FROM recent_users ru
LEFT JOIN events_table et
ON(ru.user_id = et.event_type)
) s1
ORDER BY 2 DESC, 1;
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
GROUP BY user_id, done_event
ORDER BY 1,2,3;
count | user_id | done_event
---------------------------------------------------------------------
7 | 5 | YES
23 | 3 | YES
24 | 1 | YES
(3 rows)
-- create a select only view
CREATE VIEW selected_users AS SELECT * FROM users_table WHERE value_1 >= 1 and value_1 <3;
CREATE VIEW recent_selected_users AS SELECT su.* FROM selected_users su JOIN recent_users ru USING(user_id);

View File

@ -715,8 +715,6 @@ DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS c
(1 row)
-- recursive planning should kick in for outer joins as well
-- but this time recursive planning might convert the query
-- into a not supported join
SELECT true AS valid FROM explain_json_2($$
SELECT
@ -729,8 +727,15 @@ SELECT true AS valid FROM explain_json_2($$
$$);
DEBUG: function does not have co-located tables
DEBUG: generating subplan XXX_1 for subquery SELECT value_2, random() AS random FROM public.users_table
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM ((SELECT users_table.value_2, random() AS random FROM public.users_table) u1 RIGHT JOIN (SELECT intermediate_result.value_2, intermediate_result.random FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(value_2 integer, random double precision)) u2 USING (value_2))
ERROR: cannot pushdown the subquery
DEBUG: recursively planning left side of the right join since the outer side is a recurring rel
DEBUG: recursively planning the distributed subquery since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: generating subplan XXX_2 for subquery SELECT value_2, random() AS random FROM public.users_table
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM ((SELECT intermediate_result.value_2, intermediate_result.random FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(value_2 integer, random double precision)) u1 RIGHT JOIN (SELECT intermediate_result.value_2, intermediate_result.random FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(value_2 integer, random double precision)) u2 USING (value_2))
valid
---------------------------------------------------------------------
t
(1 row)
-- set operations may produce not very efficient plans
-- although we could have picked a as our anchor subquery,
-- we pick foo in this case and recursively plan a
@ -987,7 +992,7 @@ DEBUG: skipping recursive planning for the subquery since it contains reference
ERROR: complex joins are only supported when all distributed tables are co-located and joined on their distribution columns
-- similar to the above, make sure that we skip recursive planning when
-- the subquery contains only intermediate results
SELECT *
SELECT COUNT(*) = 176
FROM
(
SELECT * FROM(
@ -1026,9 +1031,18 @@ DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_5 for subquery SELECT user_id, "time", event_type, value_2, value_3, value_4 FROM public.events_table WHERE (value_2 OPERATOR(pg_catalog.>) 2)
DEBUG: Creating router plan
DEBUG: generating subplan XXX_6 for subquery SELECT intermediate_result.user_id, intermediate_result."time", intermediate_result.event_type, intermediate_result.value_2, intermediate_result.value_3, intermediate_result.value_4 FROM read_intermediate_result('XXX_4'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, "time" timestamp without time zone, event_type integer, value_2 integer, value_3 double precision, value_4 bigint) INTERSECT SELECT intermediate_result.user_id, intermediate_result."time", intermediate_result.event_type, intermediate_result.value_2, intermediate_result.value_3, intermediate_result.value_4 FROM read_intermediate_result('XXX_5'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, "time" timestamp without time zone, event_type integer, value_2 integer, value_3 double precision, value_4 bigint)
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT users_table_limited.user_id, users_table_limited."time", users_table_limited.value_1, users_table_limited.value_2, users_table_limited.value_3, users_table_limited.value_4, foo.user_id, foo."time", foo.event_type, foo.value_2, foo.value_3, foo.value_4, foo.user_id_1 AS user_id, foo.time_1 AS "time", foo.value_1, foo.value_2_1 AS value_2, foo.value_3_1 AS value_3, foo.value_4_1 AS value_4 FROM ((SELECT users_table_union.user_id, users_table_union."time", users_table_union.value_1, users_table_union.value_2, users_table_union.value_3, users_table_union.value_4 FROM (SELECT intermediate_result.user_id, intermediate_result."time", intermediate_result.value_1, intermediate_result.value_2, intermediate_result.value_3, intermediate_result.value_4 FROM read_intermediate_result('XXX_3'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, "time" timestamp without time zone, value_1 integer, value_2 integer, value_3 double precision, value_4 bigint)) users_table_union) users_table_limited JOIN LATERAL (SELECT bar.user_id, bar."time", bar.event_type, bar.value_2, bar.value_3, bar.value_4, u2.user_id, u2."time", u2.value_1, u2.value_2, u2.value_3, u2.value_4 FROM ((SELECT events_table.user_id, events_table."time", events_table.event_type, events_table.value_2, events_table.value_3, events_table.value_4 FROM (SELECT intermediate_result.user_id, intermediate_result."time", intermediate_result.event_type, intermediate_result.value_2, intermediate_result.value_3, intermediate_result.value_4 FROM read_intermediate_result('XXX_6'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, "time" timestamp without time zone, event_type integer, value_2 integer, value_3 double precision, value_4 bigint)) events_table WHERE (events_table.user_id OPERATOR(pg_catalog.=) users_table_limited.user_id)) bar LEFT JOIN public.users_table u2 ON ((u2.user_id OPERATOR(pg_catalog.=) bar.value_2)))) foo(user_id, "time", event_type, value_2, value_3, value_4, user_id_1, time_1, value_1, value_2_1, value_3_1, value_4_1) ON (true))
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning distributed relation "users_table" "u2" since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Wrapping relation "users_table" "u2" to a subquery
DEBUG: Router planner cannot handle multi-shard select queries
ERROR: cannot pushdown the subquery
DEBUG: generating subplan XXX_7 for subquery SELECT user_id FROM public.users_table u2 WHERE true
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT (count(*) OPERATOR(pg_catalog.=) 176) FROM ((SELECT users_table_union.user_id, users_table_union."time", users_table_union.value_1, users_table_union.value_2, users_table_union.value_3, users_table_union.value_4 FROM (SELECT intermediate_result.user_id, intermediate_result."time", intermediate_result.value_1, intermediate_result.value_2, intermediate_result.value_3, intermediate_result.value_4 FROM read_intermediate_result('XXX_3'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, "time" timestamp without time zone, value_1 integer, value_2 integer, value_3 double precision, value_4 bigint)) users_table_union) users_table_limited JOIN LATERAL (SELECT bar.user_id, bar."time", bar.event_type, bar.value_2, bar.value_3, bar.value_4, u2.user_id, u2."time", u2.value_1, u2.value_2, u2.value_3, u2.value_4 FROM ((SELECT events_table.user_id, events_table."time", events_table.event_type, events_table.value_2, events_table.value_3, events_table.value_4 FROM (SELECT intermediate_result.user_id, intermediate_result."time", intermediate_result.event_type, intermediate_result.value_2, intermediate_result.value_3, intermediate_result.value_4 FROM read_intermediate_result('XXX_6'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, "time" timestamp without time zone, event_type integer, value_2 integer, value_3 double precision, value_4 bigint)) events_table WHERE (events_table.user_id OPERATOR(pg_catalog.=) users_table_limited.user_id)) bar LEFT JOIN (SELECT u2_1.user_id, NULL::timestamp without time zone AS "time", NULL::integer AS value_1, NULL::integer AS value_2, NULL::double precision AS value_3, NULL::bigint AS value_4 FROM (SELECT intermediate_result.user_id FROM read_intermediate_result('XXX_7'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer)) u2_1) u2 ON ((u2.user_id OPERATOR(pg_catalog.=) bar.value_2)))) foo(user_id, "time", event_type, value_2, value_3, value_4, user_id_1, time_1, value_1, value_2_1, value_3_1, value_4_1) ON (true))
DEBUG: Creating router plan
?column?
---------------------------------------------------------------------
t
(1 row)
-- similar to the above, but this time there are multiple
-- non-colocated subquery joins one of them contains lateral
-- join
@ -1071,8 +1085,18 @@ DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_5 for subquery SELECT user_id, "time", event_type, value_2, value_3, value_4 FROM public.events_table WHERE (value_2 OPERATOR(pg_catalog.>) 2)
DEBUG: Creating router plan
DEBUG: generating subplan XXX_6 for subquery SELECT intermediate_result.user_id, intermediate_result."time", intermediate_result.event_type, intermediate_result.value_2, intermediate_result.value_3, intermediate_result.value_4 FROM read_intermediate_result('XXX_4'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, "time" timestamp without time zone, event_type integer, value_2 integer, value_3 double precision, value_4 bigint) INTERSECT SELECT intermediate_result.user_id, intermediate_result."time", intermediate_result.event_type, intermediate_result.value_2, intermediate_result.value_3, intermediate_result.value_4 FROM read_intermediate_result('XXX_5'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, "time" timestamp without time zone, event_type integer, value_2 integer, value_3 double precision, value_4 bigint)
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning distributed relation "users_table" "u2" since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Wrapping relation "users_table" "u2" to a subquery
DEBUG: Router planner cannot handle multi-shard select queries
ERROR: cannot pushdown the subquery
DEBUG: generating subplan XXX_7 for subquery SELECT user_id FROM public.users_table u2 WHERE true
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM public.events_table WHERE (NOT (user_id OPERATOR(pg_catalog.=) ANY (SELECT users_table_limited.user_id FROM ((SELECT intermediate_result.user_id, intermediate_result."time", intermediate_result.value_1, intermediate_result.value_2, intermediate_result.value_3, intermediate_result.value_4 FROM read_intermediate_result('XXX_3'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, "time" timestamp without time zone, value_1 integer, value_2 integer, value_3 double precision, value_4 bigint)) users_table_limited JOIN LATERAL (SELECT bar.user_id, bar."time", bar.event_type, bar.value_2, bar.value_3, bar.value_4, u2.user_id, u2."time", u2.value_1, u2.value_2, u2.value_3, u2.value_4 FROM ((SELECT events_table_1.user_id, events_table_1."time", events_table_1.event_type, events_table_1.value_2, events_table_1.value_3, events_table_1.value_4 FROM (SELECT intermediate_result.user_id, intermediate_result."time", intermediate_result.event_type, intermediate_result.value_2, intermediate_result.value_3, intermediate_result.value_4 FROM read_intermediate_result('XXX_6'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer, "time" timestamp without time zone, event_type integer, value_2 integer, value_3 double precision, value_4 bigint)) events_table_1 WHERE (events_table_1.user_id OPERATOR(pg_catalog.=) users_table_limited.user_id)) bar LEFT JOIN (SELECT u2_1.user_id, NULL::timestamp without time zone AS "time", NULL::integer AS value_1, NULL::integer AS value_2, NULL::double precision AS value_3, NULL::bigint AS value_4 FROM (SELECT intermediate_result.user_id FROM read_intermediate_result('XXX_7'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer)) u2_1) u2 ON ((u2.user_id OPERATOR(pg_catalog.=) bar.value_2)))) foo(user_id, "time", event_type, value_2, value_3, value_4, user_id_1, time_1, value_1, value_2_1, value_3_1, value_4_1) ON (true)))))
DEBUG: Router planner cannot handle multi-shard select queries
count
---------------------------------------------------------------------
62
(1 row)
-- make sure that non-colocated subquery joins work fine in
-- modifications
CREATE TABLE table1 (id int, tenant_id int);

File diff suppressed because it is too large Load Diff

View File

@ -250,10 +250,17 @@ DEBUG: push down of limit count: 1
DEBUG: generating subplan XXX_2 for subquery SELECT x, y FROM recursive_set_local.test ORDER BY x LIMIT 1
DEBUG: Creating router plan
DEBUG: generating subplan XXX_3 for subquery SELECT intermediate_result.x, intermediate_result.y FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(x integer, y integer) INTERSECT SELECT intermediate_result.x, intermediate_result.y FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(x integer, y integer)
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT u.x, u.y, test.y FROM ((SELECT intermediate_result.x, intermediate_result.y FROM read_intermediate_result('XXX_3'::text, 'binary'::citus_copy_format) intermediate_result(x integer, y integer)) u LEFT JOIN recursive_set_local.test USING (x)) ORDER BY u.x, u.y
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning distributed relation "test" since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Wrapping relation "test" to a subquery
DEBUG: Router planner cannot handle multi-shard select queries
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
DEBUG: generating subplan XXX_4 for subquery SELECT x, y FROM recursive_set_local.test WHERE true
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT u.x, u.y, test.y FROM ((SELECT intermediate_result.x, intermediate_result.y FROM read_intermediate_result('XXX_3'::text, 'binary'::citus_copy_format) intermediate_result(x integer, y integer)) u LEFT JOIN (SELECT test_1.x, test_1.y FROM (SELECT intermediate_result.x, intermediate_result.y FROM read_intermediate_result('XXX_4'::text, 'binary'::citus_copy_format) intermediate_result(x integer, y integer)) test_1) test USING (x)) ORDER BY u.x, u.y
DEBUG: Creating router plan
x | y | y
---------------------------------------------------------------------
(0 rows)
-- we replace some queries including the local query, the intermediate result is on the inner part of an outer join
SELECT * FROM ((SELECT * FROM local_test) INTERSECT (SELECT * FROM test ORDER BY x LIMIT 1)) u RIGHT JOIN test USING (x) ORDER BY 1,2;
DEBUG: Local tables cannot be used in distributed queries.

View File

@ -505,10 +505,20 @@ DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_2 for subquery SELECT x, y FROM recursive_union.test
DEBUG: Creating router plan
DEBUG: generating subplan XXX_3 for subquery SELECT intermediate_result.x, intermediate_result.y FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(x integer, y integer) UNION ALL SELECT intermediate_result.x, intermediate_result.y FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(x integer, y integer)
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT u.x, u.y, test.y FROM ((SELECT intermediate_result.x, intermediate_result.y FROM read_intermediate_result('XXX_3'::text, 'binary'::citus_copy_format) intermediate_result(x integer, y integer)) u LEFT JOIN recursive_union.test USING (x)) ORDER BY u.x, u.y
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning distributed relation "test" since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Wrapping relation "test" to a subquery
DEBUG: Router planner cannot handle multi-shard select queries
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
DEBUG: generating subplan XXX_4 for subquery SELECT x, y FROM recursive_union.test WHERE true
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT u.x, u.y, test.y FROM ((SELECT intermediate_result.x, intermediate_result.y FROM read_intermediate_result('XXX_3'::text, 'binary'::citus_copy_format) intermediate_result(x integer, y integer)) u LEFT JOIN (SELECT test_1.x, test_1.y FROM (SELECT intermediate_result.x, intermediate_result.y FROM read_intermediate_result('XXX_4'::text, 'binary'::citus_copy_format) intermediate_result(x integer, y integer)) test_1) test USING (x)) ORDER BY u.x, u.y
DEBUG: Creating router plan
x | y | y
---------------------------------------------------------------------
1 | 1 | 1
1 | 1 | 1
2 | 2 | 2
(3 rows)
-- unions in a join without partition column equality (column names from first query are used for join)
SELECT * FROM ((SELECT x, y FROM test) UNION (SELECT y, x FROM test)) u JOIN test USING (x) ORDER BY 1,2;
DEBUG: Router planner cannot handle multi-shard select queries
@ -686,10 +696,18 @@ DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: generating subplan XXX_2 for subquery SELECT x, y FROM recursive_union.test
DEBUG: Creating router plan
DEBUG: generating subplan XXX_3 for subquery SELECT intermediate_result.x, intermediate_result.y FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(x integer, y integer) INTERSECT SELECT intermediate_result.x, intermediate_result.y FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(x integer, y integer)
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT u.x, u.y, test.y FROM ((SELECT intermediate_result.x, intermediate_result.y FROM read_intermediate_result('XXX_3'::text, 'binary'::citus_copy_format) intermediate_result(x integer, y integer)) u LEFT JOIN recursive_union.test USING (x)) ORDER BY u.x, u.y
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning distributed relation "test" since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Wrapping relation "test" to a subquery
DEBUG: Router planner cannot handle multi-shard select queries
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
DEBUG: generating subplan XXX_4 for subquery SELECT x, y FROM recursive_union.test WHERE true
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT u.x, u.y, test.y FROM ((SELECT intermediate_result.x, intermediate_result.y FROM read_intermediate_result('XXX_3'::text, 'binary'::citus_copy_format) intermediate_result(x integer, y integer)) u LEFT JOIN (SELECT test_1.x, test_1.y FROM (SELECT intermediate_result.x, intermediate_result.y FROM read_intermediate_result('XXX_4'::text, 'binary'::citus_copy_format) intermediate_result(x integer, y integer)) test_1) test USING (x)) ORDER BY u.x, u.y
DEBUG: Creating router plan
x | y | y
---------------------------------------------------------------------
1 | 1 | 1
(1 row)
-- distributed table in WHERE clause is recursively planned
SELECT * FROM ((SELECT * FROM test) UNION (SELECT * FROM ref WHERE a IN (SELECT x FROM test))) u ORDER BY 1,2;
DEBUG: Router planner cannot handle multi-shard select queries

View File

@ -149,8 +149,11 @@ SELECT create_reference_table('t10');
SELECT count(*) FROM (
SELECT ALL t7.c1, t7.c0, t8.c1, t10.c1, t8.c0 FROM t7 CROSS JOIN t10 FULL OUTER JOIN t8 ON (((((((('[832125354,1134163512)'::int4range)*('(0,2106623281)'::int4range)))-('(-600267905,509840582]'::int4range)))*('(-365203965,1662828182)'::int4range)))&<((((((('(-1286467417,697584012]'::int4range)*('[-1691485781,1341103963)'::int4range)))*((('(-1768368435,1719707648)'::int4range)*('(139536997,1275813540]'::int4range)))))*((((('[-2103910157,-1961746758)'::int4range)*('[-834534078,533073939)'::int4range)))*((('[-1030552151,552856781]'::int4range)*('[-1109419376,1205173697]'::int4range))))))))
) AS foo;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
0
(1 row)
CREATE TABLE reference_table(id int, it_name varchar(25), k_no int);
SELECT create_reference_table('reference_table');
create_reference_table
@ -165,33 +168,55 @@ SELECT create_distributed_table('distributed_table', 'user_id');
(1 row)
INSERT INTO distributed_table VALUES
(1, 10),
(2, 22),
(3, 34),
(7, 40);
INSERT INTO reference_table VALUES
(1, '100'),
(null, '202'),
(4, '300'),
(null, '401'),
(null, '402');
-- postgres plans below queries by evaluating joins as below:
-- L
-- / \
-- ref L
-- / \
-- dist ref
-- so we should error out as reference table is in the outer part of the top level (left) outer join
SELECT count(*) FROM distributed_table a
LEFT JOIN reference_table b ON (true)
RIGHT JOIN reference_table c ON (true);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
100
(1 row)
SELECT count(*) FROM distributed_table a
LEFT JOIN (SELECT * FROM reference_table OFFSET 0) b ON (true)
RIGHT JOIN (SELECT * FROM reference_table OFFSET 0) c ON (true);
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
count
---------------------------------------------------------------------
100
(1 row)
SELECT count(*) FROM distributed_table a
LEFT JOIN reference_table b ON (true)
RIGHT JOIN reference_table c ON (c.id > 0);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
43
(1 row)
SELECT count(*) FROM distributed_table a
LEFT JOIN (SELECT * FROM reference_table OFFSET 0) b ON (true)
RIGHT JOIN (SELECT * FROM reference_table OFFSET 0) c ON (c.id > 0);
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
count
---------------------------------------------------------------------
43
(1 row)
-- drop existing sqlancer tables before next tests
DROP TABLE t0, t1, t2, t3, t4 CASCADE;
CREATE TABLE tbl1(a REAL, b FLOAT, c money);
@ -255,14 +280,16 @@ SELECT create_reference_table('t4');
-- t1(ref) L
-- / \
-- t0(dist) t4(ref)
-- -- so we should error out
SELECT count(*) FROM (
SELECT ALL t4.c1, t0.c0, t0.c1 FROM ONLY t0
LEFT OUTER JOIN t4 ON CAST(masklen('142.158.96.44') AS BOOLEAN)
RIGHT OUTER JOIN t1 ON ((0.024767844)::MONEY) BETWEEN (t1.c1) AND (CAST(0.0602135 AS MONEY))
) AS foo;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
0
(1 row)
-- first subquery has the same join tree as above, so we should error out
SELECT count(*) FROM (
SELECT ALL t4.c1, t0.c0, t0.c1 FROM ONLY t0
@ -278,21 +305,29 @@ UNION ALL SELECT ALL t4.c1, t0.c0, t0.c1 FROM ONLY t0
RIGHT OUTER JOIN t1 ON ((0.024767844)::MONEY) BETWEEN (t1.c1) AND ((0.0602135)::MONEY)
WHERE (NOT (((t0.c0)LIKE((t4.c0))))) ISNULL
) AS foo;
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
0
(1 row)
-- unsupported outer JOIN inside a subquery in WHERE clause
SELECT * FROM distributed_table WHERE buy_count > (
SELECT count(*) FROM distributed_table a
LEFT JOIN reference_table b ON (true)
RIGHT JOIN reference_table c ON (false));
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
user_id | item_id | buy_count
---------------------------------------------------------------------
(0 rows)
-- unsupported outer JOIN via subqueries
SELECT count(*) FROM (SELECT *, random() FROM distributed_table) AS a
LEFT JOIN (SELECT *, random() FROM reference_table) AS b ON (true)
RIGHT JOIN (SELECT *, random() FROM reference_table) AS c ON (false);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
5
(1 row)
-- unsupported outer JOIN in a sublevel subquery
SELECT
count(*)
@ -307,8 +342,11 @@ JOIN
RIGHT JOIN reference_table c ON (true)
) AS unsupported_join
ON (true);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
400
(1 row)
SELECT
count(*)
FROM
@ -322,32 +360,41 @@ JOIN
RIGHT JOIN (SELECT * FROM reference_table OFFSET 0) c ON (true)
) AS unsupported_join
ON (true);
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
count
---------------------------------------------------------------------
400
(1 row)
-- unsupported outer JOIN in a sublevel INNER JOIN
SELECT
unsupported_join.*
COUNT(unsupported_join.*)
FROM
(distributed_table a
LEFT JOIN reference_table b ON (true)
RIGHT JOIN reference_table c ON (true)) as unsupported_join (x,y,z,t,e,f,q)
JOIN
(reference_table d JOIN reference_table e ON(true)) ON (true);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
2500
(1 row)
-- unsupported outer JOIN in a sublevel LEFT JOIN
SELECT
unsupported_join.*
COUNT(unsupported_join.*)
FROM
(distributed_table a
LEFT JOIN reference_table b ON (true)
RIGHT JOIN reference_table c ON (true)) as unsupported_join
LEFT JOIN
(reference_table d JOIN reference_table e ON(true)) ON (true);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
2500
(1 row)
SELECT
unsupported_join.*
COUNT(unsupported_join.*)
FROM
(distributed_table a
LEFT JOIN (SELECT * FROM reference_table OFFSET 0) b ON (true)
@ -360,21 +407,27 @@ LEFT JOIN
ON(true)
)
ON (true);
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
count
---------------------------------------------------------------------
2500
(1 row)
-- unsupported outer JOIN in a sublevel RIGHT JOIN
SELECT
unsupported_join.*
COUNT(unsupported_join.*)
FROM
(distributed_table a
LEFT JOIN reference_table b ON (true)
RIGHT JOIN reference_table c ON (false)) as unsupported_join
RIGHT JOIN
(reference_table d JOIN reference_table e ON(true)) ON (true);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
count
---------------------------------------------------------------------
125
(1 row)
SELECT
unsupported_join.*
COUNT(unsupported_join.*)
FROM
(distributed_table a
LEFT JOIN (SELECT * FROM reference_table OFFSET 0) b ON (true)
@ -387,9 +440,12 @@ RIGHT JOIN
ON(true)
)
ON (true);
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
EXPLAIN SELECT
count
---------------------------------------------------------------------
125
(1 row)
EXPLAIN (COSTS OFF) SELECT
unsupported_join.*
FROM
(distributed_table a
@ -397,7 +453,34 @@ FROM
RIGHT JOIN reference_table c ON (true)) as unsupported_join (x,y,z,t,e,f,q)
JOIN
(reference_table d JOIN reference_table e ON(true)) ON (d.id > 0);
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
-> Distributed Subplan XXX_1
-> Custom Scan (Citus Adaptive)
Task Count: 4
Tasks Shown: One of 4
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Seq Scan on distributed_table_92862439 a
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Nested Loop
-> Nested Loop
-> Nested Loop Left Join
-> Seq Scan on reference_table_92862438 c
-> Nested Loop Left Join
-> Function Scan on read_intermediate_result intermediate_result
-> Materialize
-> Seq Scan on reference_table_92862438 b
-> Materialize
-> Seq Scan on reference_table_92862438 d
Filter: (id > 0)
-> Materialize
-> Seq Scan on reference_table_92862438 e
(25 rows)
SET client_min_messages TO WARNING;
DROP SCHEMA sqlancer_failures CASCADE;

View File

@ -70,19 +70,34 @@ DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT user_id FROM
ERROR: cannot handle complex subqueries when the router executor is disabled
SET citus.enable_router_execution TO true;
-- OUTER JOINs where the outer part is recursively planned and not the other way
-- around is not supported
-- around are supported
SELECT
foo.value_2
INTO result_table
FROM
(SELECT users_table.value_2 FROM users_table, events_table WHERE users_table.user_id = events_table.user_id AND event_type IN (1,2,3,4) LIMIT 5) as foo
(SELECT users_table.value_2 FROM users_table, events_table WHERE users_table.user_id = events_table.user_id AND event_type IN (1,2,3,4) ORDER BY users_table.value_2 LIMIT 5) as foo
LEFT JOIN
(SELECT users_table.value_2 FROM users_table, events_table WHERE users_table.user_id = events_table.user_id AND event_type IN (5,6,7,8)) as bar
ON(foo.value_2 = bar.value_2);
DEBUG: push down of limit count: 5
DEBUG: generating subplan XXX_1 for subquery SELECT users_table.value_2 FROM public.users_table, public.events_table WHERE ((users_table.user_id OPERATOR(pg_catalog.=) events_table.user_id) AND (events_table.event_type OPERATOR(pg_catalog.=) ANY (ARRAY[1, 2, 3, 4]))) LIMIT 5
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT foo.value_2 FROM ((SELECT intermediate_result.value_2 FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(value_2 integer)) foo LEFT JOIN (SELECT users_table.value_2 FROM public.users_table, public.events_table WHERE ((users_table.user_id OPERATOR(pg_catalog.=) events_table.user_id) AND (events_table.event_type OPERATOR(pg_catalog.=) ANY (ARRAY[5, 6, 7, 8])))) bar ON ((foo.value_2 OPERATOR(pg_catalog.=) bar.value_2)))
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
DEBUG: generating subplan XXX_1 for subquery SELECT users_table.value_2 FROM public.users_table, public.events_table WHERE ((users_table.user_id OPERATOR(pg_catalog.=) events_table.user_id) AND (events_table.event_type OPERATOR(pg_catalog.=) ANY (ARRAY[1, 2, 3, 4]))) ORDER BY users_table.value_2 LIMIT 5
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning the distributed subquery since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: generating subplan XXX_2 for subquery SELECT users_table.value_2 FROM public.users_table, public.events_table WHERE ((users_table.user_id OPERATOR(pg_catalog.=) events_table.user_id) AND (events_table.event_type OPERATOR(pg_catalog.=) ANY (ARRAY[5, 6, 7, 8])))
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT foo.value_2 FROM ((SELECT intermediate_result.value_2 FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(value_2 integer)) foo LEFT JOIN (SELECT intermediate_result.value_2 FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(value_2 integer)) bar ON ((foo.value_2 OPERATOR(pg_catalog.=) bar.value_2)))
SELECT COUNT(*) = 60 FROM result_table WHERE value_2 = 0;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT COUNT(*) = 0 FROM result_table WHERE value_2 != 0;
?column?
---------------------------------------------------------------------
t
(1 row)
DROP TABLE result_table;
-- We do not support GROUPING SETS in subqueries
-- This also includes ROLLUP or CUBE clauses
SELECT * FROM (SELECT user_id, value_1 FROM users_table GROUP BY GROUPING SETS ((user_id), (value_1))) s;

View File

@ -248,7 +248,7 @@ ON (key = num);
101
(1 row)
-- VALUES with unsupported OUTER join
-- VALUES with supported OUTER join (since test_values is recursively planned)
SELECT
count(*)
FROM
@ -256,8 +256,16 @@ FROM
RIGHT JOIN
(SELECT a,b FROM (VALUES (1, 'one'), (2, 'two'), (3, 'three')) as t(a,b) ) as foo (num,letter)
ON (key = num);
ERROR: cannot pushdown the subquery
DETAIL: There exist a VALUES clause in the outer part of the outer join
DEBUG: recursively planning left side of the right join since the outer side is a recurring rel
DEBUG: recursively planning distributed relation "test_values" since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Wrapping relation "test_values" to a subquery
DEBUG: generating subplan XXX_1 for subquery SELECT key FROM values_subquery.test_values WHERE true
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM ((SELECT test_values_1.key, NULL::text AS value, NULL::jsonb AS data FROM (SELECT intermediate_result.key FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(key integer)) test_values_1) test_values RIGHT JOIN (SELECT t.a, t.b FROM (VALUES (1,'one'::text), (2,'two'::text), (3,'three'::text)) t(a, b)) foo(num, letter) ON ((test_values.key OPERATOR(pg_catalog.=) foo.num)))
count
---------------------------------------------------------------------
3
(1 row)
-- values with router queries
SELECT
count(*)
@ -456,12 +464,20 @@ SELECT count(*) FROM
296
(1 row)
-- VALUES cannot be the right relationship in a join
-- VALUES can be the right relationship in a join
SELECT count(*) FROM
(SELECT random() FROM test_values RIGHT JOIN (SELECT a, b FROM (VALUES (1, 'one'), (2, 'two'), (3, 'three')) as t(a,b)) as values_data(a,b)
ON test_values.key > values_data.a) subquery_1;
ERROR: cannot pushdown the subquery
DETAIL: There exist a VALUES clause in the outer part of the outer join
DEBUG: recursively planning left side of the right join since the outer side is a recurring rel
DEBUG: recursively planning distributed relation "test_values" since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: Wrapping relation "test_values" to a subquery
DEBUG: generating subplan XXX_1 for subquery SELECT key FROM values_subquery.test_values WHERE true
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM (SELECT random() AS random FROM ((SELECT test_values_1.key, NULL::text AS value, NULL::jsonb AS data FROM (SELECT intermediate_result.key FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(key integer)) test_values_1) test_values RIGHT JOIN (SELECT t.a, t.b FROM (VALUES (1,'one'::text), (2,'two'::text), (3,'three'::text)) t(a, b)) values_data(a, b) ON ((test_values.key OPERATOR(pg_catalog.>) values_data.a)))) subquery_1
count
---------------------------------------------------------------------
294
(1 row)
-- subquery IN WHERE clause need to be recursively planned
-- but it is correlated so cannot be pushed down
SELECT

View File

@ -840,8 +840,18 @@ LEFT JOIN
WHERE distinct_undistribured.user_id = test_cte_distributed.user_id)) exsist_in_distributed
ON distinct_undistribured.user_id = exsist_in_distributed.user_id
ORDER BY 2 DESC, 1 DESC;
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
user_id | user_id
---------------------------------------------------------------------
7 |
6 |
4 |
2 |
8 | 8
5 | 5
3 | 3
1 | 1
(8 rows)
-- similar query as the above, but this time
-- use NOT EXITS, which is pretty common struct
WITH distinct_undistribured AS
@ -855,9 +865,20 @@ LEFT JOIN
WHERE NOT EXISTS
(SELECT NULL
FROM distinct_undistribured
WHERE distinct_undistribured.user_id = test_cte_distributed.user_id)) exsist_in_distributed ON distinct_undistribured.user_id = exsist_in_distributed.user_id;
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
WHERE distinct_undistribured.user_id = test_cte_distributed.user_id)) exsist_in_distributed ON distinct_undistribured.user_id = exsist_in_distributed.user_id
ORDER BY 1,2;
user_id | user_id
---------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
(8 rows)
-- same NOT EXISTS struct, but with CTE
-- so should work
WITH distinct_undistribured AS (
@ -905,8 +926,11 @@ LEFT JOIN
(SELECT NULL
FROM distinct_undistribured d2
WHERE d1.user_id = d2.user_id )) AS bar USING (user_id);
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
count
---------------------------------------------------------------------
8
(1 row)
-- should work fine with materialized ctes
WITH distinct_undistribured AS MATERIALIZED (
SELECT DISTINCT user_id
@ -951,8 +975,18 @@ LEFT JOIN
WHERE distinct_undistribured.user_id = test_cte_distributed.user_id)) exsist_in_distributed
ON distinct_undistribured.user_id = exsist_in_distributed.user_id
ORDER BY 2 DESC, 1 DESC;
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
user_id | user_id
---------------------------------------------------------------------
7 |
6 |
4 |
2 |
8 | 8
5 | 5
3 | 3
1 | 1
(8 rows)
WITH distinct_undistribured AS MATERIALIZED
(SELECT DISTINCT user_id
FROM test_cte)
@ -964,9 +998,20 @@ LEFT JOIN
WHERE NOT EXISTS
(SELECT NULL
FROM distinct_undistribured
WHERE distinct_undistribured.user_id = test_cte_distributed.user_id)) exsist_in_distributed ON distinct_undistribured.user_id = exsist_in_distributed.user_id;
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
WHERE distinct_undistribured.user_id = test_cte_distributed.user_id)) exsist_in_distributed ON distinct_undistribured.user_id = exsist_in_distributed.user_id
ORDER BY 1,2;
user_id | user_id
---------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
(8 rows)
-- NOT EXISTS struct, with cte inlining disabled
WITH distinct_undistribured AS MATERIALIZED(
SELECT DISTINCT user_id
@ -1013,8 +1058,11 @@ LEFT JOIN
(SELECT NULL
FROM distinct_undistribured d2
WHERE d1.user_id = d2.user_id )) AS bar USING (user_id);
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
count
---------------------------------------------------------------------
8
(1 row)
-- some test with failures
WITH a AS MATERIALIZED (SELECT * FROM users_table LIMIT 10)
SELECT user_id/0 FROM users_table JOIN a USING (user_id);

View File

@ -151,7 +151,7 @@ LIMIT
2
(5 rows)
-- cte LEFT JOIN distributed_table should error out
-- cte LEFT JOIN distributed_table should work
-- as long as the CTE is recursively planned
WITH cte AS MATERIALIZED (
SELECT * FROM users_table WHERE user_id = 1 ORDER BY value_1
@ -166,8 +166,15 @@ ORDER BY
1,2,3
LIMIT
5;
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
user_id | time | event_type
---------------------------------------------------------------------
1 | Wed Nov 22 22:51:43.132261 2017 | 0
1 | Wed Nov 22 22:51:43.132261 2017 | 0
1 | Wed Nov 22 22:51:43.132261 2017 | 1
1 | Wed Nov 22 22:51:43.132261 2017 | 1
1 | Wed Nov 22 22:51:43.132261 2017 | 2
(5 rows)
-- cte RIGHT JOIN distributed_table should work
WITH cte AS MATERIALIZED (
SELECT * FROM users_table WHERE user_id = 1 ORDER BY value_1
@ -214,7 +221,7 @@ LIMIT
1 | Thu Nov 23 09:26:42.145043 2017 | 2
(5 rows)
-- distributed_table RIGHT JOIN cte should error out
-- distributed_table RIGHT JOIN cte should work
WITH cte AS MATERIALIZED (
SELECT * FROM users_table WHERE value_1 = 1 ORDER BY value_1
)
@ -228,9 +235,16 @@ ORDER BY
1,2,3
LIMIT
5;
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
-- cte FULL JOIN distributed_table should error out
user_id | time | event_type
---------------------------------------------------------------------
1 | Thu Nov 23 09:26:42.145043 2017 | 0
1 | Thu Nov 23 09:26:42.145043 2017 | 0
1 | Thu Nov 23 09:26:42.145043 2017 | 1
1 | Thu Nov 23 09:26:42.145043 2017 | 1
1 | Thu Nov 23 09:26:42.145043 2017 | 2
(5 rows)
-- cte FULL JOIN distributed_table should work
WITH cte AS MATERIALIZED (
SELECT * FROM users_table WHERE user_id = 1 ORDER BY value_1
)
@ -244,8 +258,15 @@ ORDER BY
1,2,3
LIMIT
5;
ERROR: cannot pushdown the subquery
DETAIL: Complex subqueries, CTEs and local tables cannot be in the outer part of an outer join with a distributed table
user_id | time | event_type
---------------------------------------------------------------------
1 | Wed Nov 22 22:51:43.132261 2017 | 0
1 | Wed Nov 22 22:51:43.132261 2017 | 0
1 | Wed Nov 22 22:51:43.132261 2017 | 1
1 | Wed Nov 22 22:51:43.132261 2017 | 1
1 | Wed Nov 22 22:51:43.132261 2017 | 2
(5 rows)
-- Joins with reference tables are planned as router queries
WITH cte AS MATERIALIZED (
SELECT value_2, max(user_id) AS user_id FROM users_table WHERE value_2 = 1 GROUP BY value_2 HAVING count(*) > 1

View File

@ -198,6 +198,7 @@ test: local_dist_join_modifications
test: local_table_join
test: local_dist_join_mixed
test: citus_local_dist_joins
test: recurring_outer_join
test: pg_dump
# ---------

View File

@ -0,0 +1,74 @@
SET search_path TO local_dist_join_mixed;
SELECT COUNT(*) FROM reference LEFT JOIN distributed USING (id);
SELECT COUNT(*) FROM distributed RIGHT JOIN reference USING (id);
SELECT COUNT(*) FROM reference FULL JOIN distributed USING (id);
SELECT COUNT(*) FROM distributed FULL JOIN reference USING (id);
SELECT COUNT(*) FROM distributed FULL JOIN reference USING (id);
-- distributed side is a subquery
SELECT COUNT(*) FROM reference LEFT JOIN (SELECT * FROM distributed) q USING (id);
-- distributed side is a join tree
SELECT COUNT(*) FROM reference LEFT JOIN (distributed t1 JOIN distributed t2 USING (id)) q USING (id);
SELECT COUNT(*) FROM reference LEFT JOIN (distributed t1 LEFT JOIN distributed t2 USING (id)) q USING (id);
SELECT COUNT(*) FROM
-- 2) right side is distributed but t1 is recurring, hence what
-- makes the right side distributed (t4) is recursively planned
reference t1
LEFT JOIN
(
distributed t4
JOIN
-- 1) t6 is recursively planned since the outer side is recurring
(SELECT t6.id FROM distributed t6 RIGHT JOIN reference t7 USING(id)) t5
USING(id)
) q
USING(id)
-- 3) outer side of the join tree became recurring, hence t8 is
-- recursively planned too
LEFT JOIN
distributed t8
USING (id)
WHERE t8.id IS NULL;
SELECT COUNT(*) FROM
local t1
LEFT JOIN
-- 2) t6 subquery is distributed so needs to be recursively planned
-- because t1 is first recursively planned
(
SELECT * FROM
(SELECT * FROM reference t2 JOIN distributed t3 USING (id)) p
JOIN
-- 1) t5 is recursively planned since the outer side is recurring
(SELECT * FROM reference t4 LEFT JOIN distributed t5 USING (id)) q
USING(id)
) t6
USING (id);
BEGIN;
WITH cte AS (
DELETE FROM distributed
USING (
SELECT t1.id, t1.id*3 FROM reference t1
LEFT JOIN
(
SELECT * FROM distributed t2 WHERE EXISTS (
SELECT * FROM distributed t4
WHERE t4.id = t2.id
)
) t3
USING (id)
) q
WHERE distributed.id = q.id AND
distributed.id > 65
RETURNING *
)
SELECT COUNT(*) FROM cte;
ROLLBACK;

View File

@ -210,7 +210,6 @@ SELECT count(*) FROM reference_table
LEFT JOIN postgres_local_table ON (true)
LEFT JOIN reference_table r2 ON (true);
-- not supported direct outer join
SELECT count(*) FROM citus_local_table LEFT JOIN distributed_table ON (true);
-- distinct in subquery on CTE

View File

@ -203,7 +203,7 @@ SELECT count(*) FROM reference_table
LEFT JOIN postgres_local_table ON (true)
LEFT JOIN reference_table r2 ON (true);
-- not supported direct outer join
-- supported outer join
SELECT count(*) FROM citus_local_table LEFT JOIN distributed_table ON (true);
-- distinct in subquery on CTE

View File

@ -33,7 +33,7 @@ SELECT count(*) FROM users_ref_test_table ref1 CROSS JOIN users_ref_test_table r
-- two reference tables CROSS JOINNed, and later JOINED with distributed tables
-- but the reference table CROSS JOIN is in the outer side of the JOIN with the distributed table
-- so we cannot pushdown
-- so this is supported by recursively planning the distributed table (users_table)
SELECT count(*) FROM users_ref_test_table ref1 CROSS JOIN users_ref_test_table ref2 LEFT JOIN users_table ON (ref1.id = users_table.user_id);
SELECT count(*) FROM users_ref_test_table ref1 CROSS JOIN users_ref_test_table ref2 FULL JOIN users_table ON (ref1.id = users_table.user_id);
SELECT count(*) FROM users_ref_test_table ref1 CROSS JOIN users_ref_test_table ref2 LEFT JOIN users_table ON (ref1.id != users_table.user_id);

View File

@ -566,8 +566,6 @@ EXECUTE retry_planning(4);
EXECUTE retry_planning(5);
EXECUTE retry_planning(6);
-- this test can only work if the CTE is recursively
-- planned
WITH b AS (SELECT * FROM test_table)
SELECT count(*) FROM (SELECT key as x FROM test_table OFFSET 0) as ref LEFT JOIN b ON (ref.x = b.key);
@ -578,8 +576,6 @@ WITH a AS (SELECT * FROM test_table),
b AS (SELECT * FROM test_table)
SELECT count(*) FROM a LEFT JOIN b ON (a.value = b.value);
-- cte a has to be recursively planned because of OFFSET 0
-- after that, cte b also requires recursive planning
WITH a AS (SELECT * FROM test_table OFFSET 0),
b AS (SELECT * FROM test_table)
SELECT min(a.key) FROM a LEFT JOIN b ON (a.value = b.value);

View File

@ -103,10 +103,8 @@ INSERT INTO partitioned_distributed_table SELECT foo.* FROM partitioned_distribu
INSERT INTO partitioned_distributed_table SELECT foo.* FROM distributed_table AS foo JOIN citus_local_table ON (true);
INSERT INTO distributed_table SELECT foo.a FROM partitioned_distributed_table AS foo JOIN citus_local_table ON (true);
-- should fail
SELECT COUNT(*) FROM reference_table LEFT JOIN partitioned_distributed_table ON true;
-- non-colocated subquery should work
SELECT COUNT(*) FROM
(SELECT *, random() FROM partitioned_distributed_table) AS foo,

View File

@ -22,12 +22,11 @@ SELECT * FROM customer LIMIT 2;
-- Verify joins work with dropped columns.
SELECT count(*) FROM customer, orders WHERE c_custkey = o_custkey;
-- Test joinExpr aliases by performing an outer-join. This code path is
-- currently not exercised, but we are adding this test to catch this bug when
-- we start supporting outer joins.
-- Test joinExpr aliases by performing an outer-join.
SELECT c_custkey
FROM (customer LEFT OUTER JOIN orders ON (c_custkey = o_custkey)) AS
test(c_custkey, c_nationkey)
INNER JOIN lineitem ON (test.c_custkey = l_orderkey)
ORDER BY 1
LIMIT 10;

View File

@ -112,7 +112,7 @@ FROM (
) t GROUP BY user_id, hasdone_event;
-- the LEFT JOIN conditon is not on the partition column (i.e., is it part_key divided by 2)
-- still, recursive planning will kick in to plan some part of the query
-- but, we can plan the query thanks to recursive planning
SET client_min_messages TO DEBUG1;
INSERT INTO agg_results_third (user_id, value_1_agg, value_2_agg )
@ -153,7 +153,8 @@ FROM (
) t2 ON (t1.user_id = (t2.user_id)/2)
GROUP BY t1.user_id, hasdone_event
) t GROUP BY user_id, hasdone_event;
) t GROUP BY user_id, hasdone_event
RETURNING user_id, value_1_agg, value_2_agg;
RESET client_min_messages;
------------------------------------
@ -232,7 +233,7 @@ ORDER BY
-- not pushable since the JOIN condition is not equi JOIN
-- (subquery_1 JOIN subquery_2)
-- still, recursive planning will kick in
-- but, we can plan the query thanks to recursive planning
SET client_min_messages TO DEBUG1;
INSERT INTO agg_results_third (user_id, value_1_agg, value_2_agg)
SELECT
@ -298,7 +299,8 @@ WHERE
GROUP BY
count_pay, user_id
ORDER BY
count_pay;
count_pay
RETURNING user_id, value_1_agg, value_2_agg;
RESET client_min_messages;
------------------------------------

View File

@ -302,7 +302,7 @@ FROM
LEFT JOIN multi_outer_join_third_reference t1 ON (r1.r_custkey = t1.t_custkey)
ORDER BY l_custkey, r_custkey, t_custkey;
-- Right join with single shard right most table should error out
-- Right join with single shard right most table should work
SELECT
l_custkey, r_custkey, t_custkey
FROM

View File

@ -159,7 +159,7 @@ FROM
multi_outer_join_left_hash a LEFT JOIN multi_outer_join_right_reference b
ON (l_custkey = r_custkey AND l_custkey = -1 /* nonexistant */);
-- Right join should be disallowed in this case
-- Right join is allowed as we recursively plan the distributed table (multi_outer_join_left_hash)
SELECT
min(r_custkey), max(r_custkey)
FROM
@ -259,7 +259,7 @@ FROM
ON (l_custkey = r_custkey AND r_custkey = 21);
-- Right join should not be allowed in this case
-- Right join should be allowed in this case as we recursively plan the distributed table (multi_outer_join_left_hash
SELECT
min(r_custkey), max(r_custkey)
FROM
@ -302,13 +302,14 @@ FROM
LEFT JOIN multi_outer_join_third_reference t1 ON (r1.r_custkey = t1.t_custkey)
ORDER BY 1;
-- Right join with single shard right most table should error out
-- Right join with single shard right most table should work
SELECT
l_custkey, r_custkey, t_custkey
FROM
multi_outer_join_left_hash l1
LEFT JOIN multi_outer_join_right_hash r1 ON (l1.l_custkey = r1.r_custkey)
RIGHT JOIN multi_outer_join_third_reference t1 ON (r1.r_custkey = t1.t_custkey);
RIGHT JOIN multi_outer_join_third_reference t1 ON (r1.r_custkey = t1.t_custkey)
ORDER BY 1,2,3;
-- Right join with single shard left most table should work
SELECT
@ -377,12 +378,13 @@ WHERE
l_custkey is NULL or r_custkey is NULL
ORDER BY 1,2 DESC;
-- full outer join should error out for mismatched shards
-- full outer join should work as we recursively plan the distributed table (multi_outer_join_left_hash
SELECT
l_custkey, t_custkey
FROM
multi_outer_join_left_hash l1
FULL JOIN multi_outer_join_third_reference t1 ON (l1.l_custkey = t1.t_custkey);
FULL JOIN multi_outer_join_third_reference t1 ON (l1.l_custkey = t1.t_custkey)
ORDER BY 1,2;
-- inner join + single shard left join should work
SELECT

View File

@ -568,13 +568,18 @@ WHERE user_id IN
SELECT user_id
FROM events_test_table);
-- Reference tables can not locate on the outer part of the outer join
-- Reference tables can locate on the outer part of the outer join
-- Note that we don't need to sort the output because
-- citus.sort_returning is enabled by default during
-- regression tests.
UPDATE users_test_table
SET value_1 = 4
WHERE user_id IN
(SELECT DISTINCT e2.user_id
FROM users_reference_copy_table
LEFT JOIN users_test_table e2 ON (e2.user_id = users_reference_copy_table.value_1)) RETURNING *;
WHERE user_id IN (
SELECT DISTINCT e2.user_id
FROM users_reference_copy_table
LEFT JOIN users_test_table e2 ON (e2.user_id = users_reference_copy_table.value_1)
)
RETURNING *;
-- Volatile functions are also not supported
UPDATE users_test_table

View File

@ -42,9 +42,9 @@ SELECT subquery_1.user_id FROM
ON user_buy_test_table.user_id > users_ref_test_table.id) subquery_1
ORDER BY 1;
-- Shouldn't work, reference table at the outer side is not allowed
-- Should work, reference table at the outer side is allowed
SELECT * FROM
(SELECT random() FROM users_ref_test_table LEFT JOIN user_buy_test_table
(SELECT random() > 2 FROM users_ref_test_table LEFT JOIN user_buy_test_table
ON users_ref_test_table.id = user_buy_test_table.user_id) subquery_1;
-- Should work, reference table at the inner side is allowed
@ -52,9 +52,9 @@ SELECT count(*) FROM
(SELECT random() FROM users_ref_test_table RIGHT JOIN user_buy_test_table
ON user_buy_test_table.user_id = users_ref_test_table.id) subquery_1;
-- Shouldn't work, reference table at the outer side is not allowed
-- Should work, reference table at the outer side is allowed
SELECT * FROM
(SELECT random() FROM user_buy_test_table RIGHT JOIN users_ref_test_table
(SELECT random() > 2 FROM user_buy_test_table RIGHT JOIN users_ref_test_table
ON user_buy_test_table.user_id = users_ref_test_table.id) subquery_1;
-- Equi join test with reference table on non-partition keys
@ -166,7 +166,7 @@ SELECT count(*) FROM
SELECT count(*) FROM user_buy_test_table LEFT JOIN (SELECT * FROM generate_series(1,10) id) users_ref_test_table
ON user_buy_test_table.item_id = users_ref_test_table.id;
-- table function cannot be the outer relationship in an outer join
-- table function can be the outer relationship in an outer join
SELECT count(*) FROM
(SELECT random() FROM user_buy_test_table RIGHT JOIN generate_series(1,10) AS users_ref_test_table(id)
ON user_buy_test_table.item_id > users_ref_test_table.id) subquery_1;
@ -208,7 +208,7 @@ ON user_buy_test_table.item_id = users_ref_test_table.id;
SELECT count(*) FROM user_buy_test_table LEFT JOIN (SELECT 5 AS id) users_ref_test_table
ON user_buy_test_table.item_id = users_ref_test_table.id;
-- subquery without FROM cannot be the outer relationship in an outer join
-- subquery without FROM can be the outer relationship in an outer join
SELECT count(*) FROM user_buy_test_table RIGHT JOIN (SELECT 5 AS id) users_ref_test_table
ON user_buy_test_table.item_id = users_ref_test_table.id;
@ -276,8 +276,8 @@ SELECT * FROM
SELECT user_id FROM user_buy_test_table) sub
ORDER BY 1 DESC;
-- query can be pushed down when a reference table inside union query is
-- joined with a distributed table. reference table cannot be at
-- query is supported when a reference table inside union query is
-- joined with a distributed table. reference table can be at
-- the outer part.
SELECT * FROM
(SELECT user_id FROM users_ref_test_table ref LEFT JOIN user_buy_test_table dis
@ -300,7 +300,7 @@ FROM
) as foo
GROUP BY user_id ORDER BY 2 DESC LIMIT 10;
-- should not be able to pushdown since reference table is in the
-- supported even if the reference table is in the
-- direct outer part of the left join
SELECT
user_id, sum(value_1)
@ -313,15 +313,15 @@ FROM
) as foo
GROUP BY user_id ORDER BY 2 DESC LIMIT 10;
-- should not be able to pushdown since reference table is in the
-- supported even if the reference table is in the
-- direct outer part of the left join wrapped into a subquery
SELECT
*
COUNT(*) = 1581
FROM
(SELECT *, random() FROM events_reference_table) as ref_all LEFT JOIN users_table
ON (users_table.user_id = ref_all.value_2);
-- should not be able to pushdown since reference table is in the
-- supported even if the reference table is in the
-- outer part of the left join
SELECT
user_id, sum(value_1)
@ -723,7 +723,7 @@ count(*) AS cnt, "generated_group_field"
cnt DESC, generated_group_field ASC
LIMIT 10;
-- RIGHT JOINs used with INNER JOINs should error out since reference table exist in the
-- RIGHT JOINs used with INNER JOINs should work even if the reference table exist in the
-- right side of the RIGHT JOIN.
SELECT
count(*) AS cnt, "generated_group_field"
@ -1082,7 +1082,7 @@ INNER JOIN
GROUP BY types
ORDER BY types;
-- just a sanity check that we don't allow this if the reference table is on the
-- just a sanity check that we allow this even if the reference table is on the
-- left part of the left join
SELECT count(*) FROM
(SELECT random() FROM users_ref_test_table LEFT JOIN user_buy_test_table
@ -1097,9 +1097,6 @@ SELECT count(*) FROM
ON user_buy_test_table.user_id > users_ref_test_table.id) subquery_2
WHERE subquery_1.user_id != subquery_2.user_id ;
-- we could not push this query not due to non colocated
-- subqueries (i.e., they are recursively planned)
-- but due to outer join restrictions
SELECT
count(*) AS cnt, "generated_group_field"
FROM
@ -1425,7 +1422,7 @@ JOIN
ORDER BY 1
LIMIT 5;
-- outer part of the LEFT JOIN consists only reference tables, so we cannot push down
-- supported even if the outer part of the LEFT JOIN consists only reference tables
-- we have different combinations for ON condition, true/false/two column join/single column filter
SELECT count(*) FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id LEFT JOIN user_buy_test_table ON true;
SELECT count(*) FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id LEFT JOIN user_buy_test_table ON true;
@ -1439,7 +1436,7 @@ SELECT count(*) FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table re
SELECT count(*) FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id LEFT JOIN user_buy_test_table ON (ref2.id = user_buy_test_table.user_id);
-- outer part of the LEFT JOIN consists only reference tables within a subquery, so we cannot push down
-- outer part of the LEFT JOIN consists only reference tables within a subquery
-- we have different combinations for ON condition, true/false/two column join/single column filter
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo LEFT JOIN user_buy_test_table ON true;
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo LEFT JOIN user_buy_test_table ON true;
@ -1449,21 +1446,18 @@ SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INN
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 LEFT JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo LEFT JOIN user_buy_test_table ON (user_buy_test_table.user_id > 19);
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as foo LEFT JOIN user_buy_test_table ON (foo.id = user_buy_test_table.user_id);
-- one example where unsupported outer join is deep inside a subquery
-- one example where supported outer join is deep inside a subquery
SELECT *, random() FROM (
SELECT *,random() FROM user_buy_test_table WHERE user_id > (
SELECT count(*) FROM (SELECT *,random() FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id) as bar) as foo LEFT JOIN (SELECT *, random() FROM (SELECT *,random() FROM user_buy_test_table d1 JOIN user_buy_test_table d2 USING (user_id)) as bar_inner ) as bar ON true)) as boo;
-- In theory, we should be able to pushdown this query
-- however, as the LEFT JOIN condition is between a reference table and the distributed table
-- Postgres generates a LEFT JOIN alternative among those tables
SELECT count(*) FROM (SELECT ref1.*, random() FROM users_ref_test_table ref1 INNER JOIN user_buy_test_table u1 on ref1.id = u1.user_id) as foo LEFT JOIN user_buy_test_table ON (foo.id = user_buy_test_table.user_id);
-- same as the above query, but this time LEFT JOIN condition is between distributed tables
-- so Postgres doesn't generate join restriction between reference and distributed tables
SELECT count(*) FROM (SELECT u1.*, random() FROM users_ref_test_table ref1 INNER JOIN user_buy_test_table u1 on ref1.id = u1.user_id) as foo LEFT JOIN user_buy_test_table ON (foo.user_id = user_buy_test_table.user_id);
-- outer part of the LEFT JOIN consists only intermediate result due to LIMIT, so we cannot push down
-- outer part of the LEFT JOIN consists only intermediate result due to LIMIT but this is still supported
SELECT count(*) FROM (SELECT ref1.* FROM users_ref_test_table ref1 INNER JOIN users_ref_test_table ref2 on ref1.id = ref2.id LIMIT 5) as foo LEFT JOIN user_buy_test_table ON true;
-- should be fine as OUTER part is the distributed table

View File

@ -231,15 +231,16 @@ SELECT * FROM
) s1
ORDER BY 2 DESC, 1;
-- event vs table non-partition-key join is not supported
-- given that we cannot recursively plan tables yet
SELECT * FROM
-- event vs table non-partition-key join is supported
-- given that we can recursively plan events_table
SELECT count(*), user_id, done_event FROM
(SELECT ru.user_id, CASE WHEN et.user_id IS NULL THEN 'NO' ELSE 'YES' END as done_event
FROM recent_users ru
LEFT JOIN events_table et
ON(ru.user_id = et.event_type)
) s1
ORDER BY 2 DESC, 1;
GROUP BY user_id, done_event
ORDER BY 1,2,3;
-- create a select only view
CREATE VIEW selected_users AS SELECT * FROM users_table WHERE value_1 >= 1 and value_1 <3;

View File

@ -532,8 +532,6 @@ $$);
-- recursive planning should kick in for outer joins as well
-- but this time recursive planning might convert the query
-- into a not supported join
SELECT true AS valid FROM explain_json_2($$
SELECT
@ -724,7 +722,7 @@ $$);
-- similar to the above, make sure that we skip recursive planning when
-- the subquery contains only intermediate results
SELECT *
SELECT COUNT(*) = 176
FROM
(
SELECT * FROM(

File diff suppressed because it is too large Load Diff

View File

@ -64,13 +64,25 @@ SELECT create_reference_table('reference_table');
CREATE TABLE distributed_table(user_id int, item_id int, buy_count int);
SELECT create_distributed_table('distributed_table', 'user_id');
INSERT INTO distributed_table VALUES
(1, 10),
(2, 22),
(3, 34),
(7, 40);
INSERT INTO reference_table VALUES
(1, '100'),
(null, '202'),
(4, '300'),
(null, '401'),
(null, '402');
-- postgres plans below queries by evaluating joins as below:
-- L
-- / \
-- ref L
-- / \
-- dist ref
-- so we should error out as reference table is in the outer part of the top level (left) outer join
SELECT count(*) FROM distributed_table a
LEFT JOIN reference_table b ON (true)
@ -123,7 +135,6 @@ SELECT create_reference_table('t4');
-- t1(ref) L
-- / \
-- t0(dist) t4(ref)
-- -- so we should error out
SELECT count(*) FROM (
SELECT ALL t4.c1, t0.c0, t0.c1 FROM ONLY t0
LEFT OUTER JOIN t4 ON CAST(masklen('142.158.96.44') AS BOOLEAN)
@ -189,7 +200,7 @@ ON (true);
-- unsupported outer JOIN in a sublevel INNER JOIN
SELECT
unsupported_join.*
COUNT(unsupported_join.*)
FROM
(distributed_table a
LEFT JOIN reference_table b ON (true)
@ -199,7 +210,7 @@ JOIN
-- unsupported outer JOIN in a sublevel LEFT JOIN
SELECT
unsupported_join.*
COUNT(unsupported_join.*)
FROM
(distributed_table a
LEFT JOIN reference_table b ON (true)
@ -208,7 +219,7 @@ LEFT JOIN
(reference_table d JOIN reference_table e ON(true)) ON (true);
SELECT
unsupported_join.*
COUNT(unsupported_join.*)
FROM
(distributed_table a
LEFT JOIN (SELECT * FROM reference_table OFFSET 0) b ON (true)
@ -224,7 +235,7 @@ ON (true);
-- unsupported outer JOIN in a sublevel RIGHT JOIN
SELECT
unsupported_join.*
COUNT(unsupported_join.*)
FROM
(distributed_table a
LEFT JOIN reference_table b ON (true)
@ -233,7 +244,7 @@ RIGHT JOIN
(reference_table d JOIN reference_table e ON(true)) ON (true);
SELECT
unsupported_join.*
COUNT(unsupported_join.*)
FROM
(distributed_table a
LEFT JOIN (SELECT * FROM reference_table OFFSET 0) b ON (true)
@ -247,7 +258,7 @@ RIGHT JOIN
)
ON (true);
EXPLAIN SELECT
EXPLAIN (COSTS OFF) SELECT
unsupported_join.*
FROM
(distributed_table a

View File

@ -68,15 +68,19 @@ FROM
SET citus.enable_router_execution TO true;
-- OUTER JOINs where the outer part is recursively planned and not the other way
-- around is not supported
-- around are supported
SELECT
foo.value_2
INTO result_table
FROM
(SELECT users_table.value_2 FROM users_table, events_table WHERE users_table.user_id = events_table.user_id AND event_type IN (1,2,3,4) LIMIT 5) as foo
(SELECT users_table.value_2 FROM users_table, events_table WHERE users_table.user_id = events_table.user_id AND event_type IN (1,2,3,4) ORDER BY users_table.value_2 LIMIT 5) as foo
LEFT JOIN
(SELECT users_table.value_2 FROM users_table, events_table WHERE users_table.user_id = events_table.user_id AND event_type IN (5,6,7,8)) as bar
ON(foo.value_2 = bar.value_2);
SELECT COUNT(*) = 60 FROM result_table WHERE value_2 = 0;
SELECT COUNT(*) = 0 FROM result_table WHERE value_2 != 0;
DROP TABLE result_table;
-- We do not support GROUPING SETS in subqueries
-- This also includes ROLLUP or CUBE clauses

View File

@ -159,7 +159,7 @@ FROM
(SELECT a,b FROM (VALUES (1, 'one'), (2, 'two'), (3, 'three')) as t(a,b) ) as foo (num,letter)
ON (key = num);
-- VALUES with unsupported OUTER join
-- VALUES with supported OUTER join (since test_values is recursively planned)
SELECT
count(*)
FROM
@ -299,7 +299,7 @@ SELECT count(*) FROM
(SELECT random() FROM test_values LEFT JOIN (SELECT a, b FROM (VALUES (1, 'one'), (2, 'two'), (3, 'three')) as t(a,b)) as values_data(a,b)
ON test_values.key > values_data.a) subquery_1;
-- VALUES cannot be the right relationship in a join
-- VALUES can be the right relationship in a join
SELECT count(*) FROM
(SELECT random() FROM test_values RIGHT JOIN (SELECT a, b FROM (VALUES (1, 'one'), (2, 'two'), (3, 'three')) as t(a,b)) as values_data(a,b)
ON test_values.key > values_data.a) subquery_1;

View File

@ -573,7 +573,8 @@ LEFT JOIN
WHERE NOT EXISTS
(SELECT NULL
FROM distinct_undistribured
WHERE distinct_undistribured.user_id = test_cte_distributed.user_id)) exsist_in_distributed ON distinct_undistribured.user_id = exsist_in_distributed.user_id;
WHERE distinct_undistribured.user_id = test_cte_distributed.user_id)) exsist_in_distributed ON distinct_undistribured.user_id = exsist_in_distributed.user_id
ORDER BY 1,2;
-- same NOT EXISTS struct, but with CTE
-- so should work
@ -658,7 +659,8 @@ LEFT JOIN
WHERE NOT EXISTS
(SELECT NULL
FROM distinct_undistribured
WHERE distinct_undistribured.user_id = test_cte_distributed.user_id)) exsist_in_distributed ON distinct_undistribured.user_id = exsist_in_distributed.user_id;
WHERE distinct_undistribured.user_id = test_cte_distributed.user_id)) exsist_in_distributed ON distinct_undistribured.user_id = exsist_in_distributed.user_id
ORDER BY 1,2;
-- NOT EXISTS struct, with cte inlining disabled
WITH distinct_undistribured AS MATERIALIZED(

View File

@ -126,7 +126,7 @@ ORDER BY
LIMIT
5;
-- cte LEFT JOIN distributed_table should error out
-- cte LEFT JOIN distributed_table should work
-- as long as the CTE is recursively planned
WITH cte AS MATERIALIZED (
SELECT * FROM users_table WHERE user_id = 1 ORDER BY value_1
@ -173,7 +173,7 @@ ORDER BY
LIMIT
5;
-- distributed_table RIGHT JOIN cte should error out
-- distributed_table RIGHT JOIN cte should work
WITH cte AS MATERIALIZED (
SELECT * FROM users_table WHERE value_1 = 1 ORDER BY value_1
)
@ -188,7 +188,7 @@ ORDER BY
LIMIT
5;
-- cte FULL JOIN distributed_table should error out
-- cte FULL JOIN distributed_table should work
WITH cte AS MATERIALIZED (
SELECT * FROM users_table WHERE user_id = 1 ORDER BY value_1
)

View File

@ -4,7 +4,7 @@ test: ch_benchmarks_1 ch_benchmarks_2 ch_benchmarks_3
test: ch_benchmarks_4 ch_benchmarks_5 ch_benchmarks_6
test: intermediate_result_pruning_queries_1 intermediate_result_pruning_queries_2
test: dropped_columns_1 distributed_planning
test: local_dist_join nested_execution
test: local_dist_join nested_execution arbitrary_configs_recurring_outer_join
test: connectivity_checks citus_run_command
test: schemas
test: views