Introduce fast path router planning

In this context, we define "Fast Path Planning for SELECT" as trivial
queries where Citus can skip relying on the standard_planner() and
handle all the planning.

For the router planner, standard_planner() is mostly important for generating
the necessary restriction information. Later, that restriction information
is used to decide whether all the shards that a distributed query touches
reside on a single worker node. However, standard_planner() does a lot of
extra work, such as cost estimation and execution path generation, which is
completely unnecessary in the context of distributed planning.

There are certain types of queries where Citus could skip relying on
standard_planner() to generate the restriction information. For queries
in the following format, Citus does not need any information that the
standard_planner() generates:

  SELECT ... FROM single_table WHERE distribution_key = X;  or
  DELETE FROM single_table WHERE distribution_key = X; or
  UPDATE single_table SET value_1 = value_2 + 1 WHERE distribution_key = X;

Note that the queries need not be as simple as the above: GROUP BY,
window functions, ORDER BY, HAVING etc. are all acceptable. The only rule
is that the query is on a single distributed (or reference) table and
there is a "distribution_key = X" filter in the WHERE clause. With that
filter, we can decide which shard the query touches, and that shard
resides on a single worker node.
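
The rule above can be illustrated with a small standalone sketch. It is not
the Citus implementation: the Expr type and the function names below are
hypothetical stand-ins for PostgreSQL's node tree, and a leaf only records
which column it references. The sketch accepts a WHERE clause only when a
"column = constant" filter is reachable through AND nodes and the column is
referenced exactly once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified expression tree; not PostgreSQL's Node types. */
typedef enum { EXPR_AND, EXPR_OR, EXPR_EQ_CONST, EXPR_OTHER } ExprKind;

typedef struct Expr
{
	ExprKind kind;
	int column;                /* column referenced by a leaf, -1 if none */
	const struct Expr *left;   /* children, only used by EXPR_AND / EXPR_OR */
	const struct Expr *right;
} Expr;

/* True if "column = const" is reachable by descending through AND nodes only. */
bool
conjunction_contains_filter(const Expr *expr, int column)
{
	if (expr == NULL)
	{
		return false;
	}
	if (expr->kind == EXPR_EQ_CONST)
	{
		return expr->column == column;
	}
	if (expr->kind != EXPR_AND)
	{
		return false; /* do not descend into OR or other expressions */
	}
	return conjunction_contains_filter(expr->left, column) ||
		   conjunction_contains_filter(expr->right, column);
}

/* Counts every reference to the column anywhere in the tree. */
int
count_column_refs(const Expr *expr, int column)
{
	if (expr == NULL)
	{
		return 0;
	}
	if (expr->kind == EXPR_AND || expr->kind == EXPR_OR)
	{
		return count_column_refs(expr->left, column) +
			   count_column_refs(expr->right, column);
	}
	return expr->column == column ? 1 : 0;
}

/* Both checks must hold: a top-level equality filter and a single reference. */
bool
fast_path_eligible(const Expr *quals, int dist_column)
{
	assert(dist_column >= 0);
	return conjunction_contains_filter(quals, dist_column) &&
		   count_column_refs(quals, dist_column) == 1;
}
```

Under these assumptions, "key = 1 AND value = 2" is accepted, while
"key = 1 AND key = 2" and "key = 1 OR value = 2" are rejected.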
pull/2606/head
Onder Kalaci 2019-02-10 20:05:59 +03:00
parent fbc22aa6d3
commit f144bb4911
31 changed files with 4272 additions and 40 deletions

@ -64,7 +64,6 @@ static DistributedPlan * CreateDistributedPlan(uint64 planId, Query *originalQue
plannerRestrictionContext);
static DeferredErrorMessage * DeferErrorIfPartitionTableNotSingleReplicated(Oid
relationId);
static Node * ResolveExternalParams(Node *inputNode, ParamListInfo boundParams);
static void AssignRTEIdentities(Query *queryTree);
static void AssignRTEIdentity(RangeTblEntry *rangeTableEntry, int rteIdentifier);
@ -147,11 +146,22 @@ distributed_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
PG_TRY();
{
/*
* First call into standard planner. This is required because the Citus
* planner relies on parse tree transformations made by postgres' planner.
* For trivial queries, we're skipping the standard_planner() in
* order to eliminate its overhead.
*
* Otherwise, call into standard planner. This is required because the Citus
* planner relies on both the restriction information per table and parse tree
* transformations made by postgres' planner.
*/
result = standard_planner(parse, cursorOptions, boundParams);
if (needsDistributedPlanning && FastPathRouterQuery(originalQuery))
{
result = FastPathPlanner(originalQuery, parse, boundParams);
}
else
{
result = standard_planner(parse, cursorOptions, boundParams);
}
if (needsDistributedPlanning)
{
@ -831,7 +841,7 @@ DeferErrorIfPartitionTableNotSingleReplicated(Oid relationId)
* Note that this function is inspired by eval_const_expr() on Postgres.
* We cannot use that function because it requires access to PlannerInfo.
*/
static Node *
Node *
ResolveExternalParams(Node *inputNode, ParamListInfo boundParams)
{
/* consider resolving external parameters only when boundParams exists */

@ -0,0 +1,431 @@
/*-------------------------------------------------------------------------
*
* fast_path_router_planner.c
*
* Planning logic for fast path router planner queries. In this context,
* we define "Fast Path Planning" as trivial queries where Citus
* can skip relying on the standard_planner() and handle all the planning.
*
* For router planner, standard_planner() is mostly important to generate
* the necessary restriction information. Later, the restriction information
* generated by the standard_planner is used to decide whether all the shards
* that a distributed query touches reside on a single worker node. However,
* standard_planner() does a lot of extra things such as cost estimation and
* execution path generations which are completely unnecessary in the context
* of distributed planning.
*
* There are certain types of queries where Citus could skip relying on
* standard_planner() to generate the restriction information. For queries
* in the following format, Citus does not need any information that the
* standard_planner() generates:
* SELECT ... FROM single_table WHERE distribution_key = X; or
* DELETE FROM single_table WHERE distribution_key = X; or
* UPDATE single_table SET value_1 = value_2 + 1 WHERE distribution_key = X;
*
* Note that the queries need not be as simple as the above: GROUP BY,
* window functions, ORDER BY, HAVING etc. are all acceptable. The only rule
* is that the query is on a single distributed (or reference) table and
* there is a "distribution_key = X" filter in the WHERE clause. With that
* filter, we can decide which shard the query touches, and that shard
* resides on a single worker node.
*
* Copyright (c) 2019, Citus Data, Inc.
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "distributed/distributed_planner.h"
#include "distributed/multi_physical_planner.h" /* only to use some utility functions */
#include "distributed/metadata_cache.h"
#include "distributed/multi_router_planner.h"
#include "distributed/pg_dist_partition.h"
#include "distributed/shardinterval_utils.h"
#include "distributed/shard_pruning.h"
#include "nodes/nodeFuncs.h"
#include "nodes/parsenodes.h"
#include "nodes/pg_list.h"
#include "optimizer/clauses.h"
bool EnableFastPathRouterPlanner = true;
static bool ColumnAppearsMultipleTimes(Node *quals, Var *distributionKey);
static bool ConjunctionContainsColumnFilter(Node *node, Var *column);
static bool DistKeyInSimpleOpExpression(Expr *clause, Var *distColumn);
/*
* FastPathPlanner is intended to be used instead of standard_planner() for trivial
* queries defined by FastPathRouterQuery().
*
* The basic idea is that we need very little of what standard_planner() does for
* the trivial queries, so we skip calling standard_planner() to save CPU cycles.
*
*/
PlannedStmt *
FastPathPlanner(Query *originalQuery, Query *parse, ParamListInfo boundParams)
{
PlannedStmt *result = NULL;
/*
* To support prepared statements for fast-path queries, we resolve the
* external parameters at this point. Note that this is normally done by
* eval_const_expr() in standard planner when the boundParams are available.
* If they are not available, Citus does what it does for all other types of
* queries: it increases the cost of the plan and forces PostgreSQL to pick
* custom plans.
*
* We're also only interested in resolving the quals since we'd want to
* do shard pruning based on the filter on the distribution column.
*/
originalQuery->jointree->quals =
ResolveExternalParams((Node *) originalQuery->jointree->quals,
copyParamList(boundParams));
/*
* Citus planner relies on some of the transformations on constant
* evaluation on the parse tree.
*/
parse->targetList =
(List *) eval_const_expressions(NULL, (Node *) parse->targetList);
parse->jointree->quals =
(Node *) eval_const_expressions(NULL, (Node *) parse->jointree->quals);
result = GeneratePlaceHolderPlannedStmt(originalQuery);
return result;
}
/*
* GeneratePlaceHolderPlannedStmt creates a planned statement which contains
* a sequential scan on the relation that is accessed by the input query.
* The returned PlannedStmt is not proper (e.g., set_plan_references() is
* not called on the plan or the quals are not set), so should not be
* passed to the executor directly. This is only useful to have a
* placeholder PlannedStmt where target list is properly set. Note that
* this is what router executor relies on.
*
* This function makes the assumption (and the assertion) that
* the input query is in the form defined by FastPathRouterQuery().
*/
PlannedStmt *
GeneratePlaceHolderPlannedStmt(Query *parse)
{
PlannedStmt *result = makeNode(PlannedStmt);
SeqScan *seqScanNode = makeNode(SeqScan);
Plan *plan = &seqScanNode->plan;
Oid relationId = InvalidOid;
AssertArg(FastPathRouterQuery(parse));
/* there is only a single relation rte */
seqScanNode->scanrelid = 1;
plan->targetlist = copyObject(parse->targetList);
plan->qual = NULL;
plan->lefttree = NULL;
plan->righttree = NULL;
plan->plan_node_id = 1;
/* rtable is used for access permission checks */
result->commandType = parse->commandType;
result->queryId = parse->queryId;
result->stmt_len = parse->stmt_len;
result->rtable = copyObject(parse->rtable);
result->planTree = (Plan *) plan;
relationId = ExtractFirstDistributedTableId(parse);
result->relationOids = list_make1_oid(relationId);
return result;
}
/*
* FastPathRouterQuery gets a query and returns true if the query is eligible for
* being a fast path router query.
* The requirements for the fast path query can be listed below:
*
* - SELECT query without CTEs, sublinks-subqueries, set operations
* - The query should touch only a single hash distributed or reference table
* - The distribution key with an equality operator should be in the WHERE clause
* and it should be ANDed with any other filters. Also, the distribution
* key should only appear once in the WHERE clause. So basically,
* SELECT ... FROM dist_table WHERE dist_key = X
* - No returning for UPDATE/DELETE queries
bool
FastPathRouterQuery(Query *query)
{
RangeTblEntry *rangeTableEntry = NULL;
FromExpr *joinTree = query->jointree;
Node *quals = NULL;
Oid distributedTableId = InvalidOid;
Var *distributionKey = NULL;
DistTableCacheEntry *cacheEntry = NULL;
if (!EnableFastPathRouterPlanner)
{
return false;
}
if (!(query->commandType == CMD_SELECT || query->commandType == CMD_UPDATE ||
query->commandType == CMD_DELETE))
{
return false;
}
/*
* We want to deal with only very simple select queries. Some of the
* checks might be too restrictive, still we prefer this way.
*/
if (query->cteList != NIL || query->returningList != NIL ||
query->hasSubLinks || query->setOperations != NULL ||
query->hasTargetSRFs || query->hasModifyingCTE)
{
return false;
}
/* make sure that there is only a single range table entry in the FROM clause */
if (list_length(query->rtable) != 1)
{
return false;
}
rangeTableEntry = (RangeTblEntry *) linitial(query->rtable);
if (rangeTableEntry->rtekind != RTE_RELATION)
{
return false;
}
/* we don't want to deal with append/range distributed tables */
distributedTableId = rangeTableEntry->relid;
cacheEntry = DistributedTableCacheEntry(distributedTableId);
if (!(cacheEntry->partitionMethod == DISTRIBUTE_BY_HASH ||
cacheEntry->partitionMethod == DISTRIBUTE_BY_NONE))
{
return false;
}
/*
* hasForUpdate is tricky because Citus supports it only when the
* replication factor is 1 or for reference tables.
*/
if (query->hasForUpdate)
{
if (cacheEntry->partitionMethod == DISTRIBUTE_BY_NONE ||
SingleReplicatedTable(distributedTableId))
{
return true;
}
return false;
}
/* WHERE clause should not be empty for distributed tables */
if (joinTree == NULL ||
(cacheEntry->partitionMethod != DISTRIBUTE_BY_NONE && joinTree->quals == NULL))
{
return false;
}
/* if that's a reference table, we don't need to check anything further */
distributionKey = PartitionColumn(distributedTableId, 1);
if (!distributionKey)
{
return true;
}
/* convert list of expressions into expression tree for further processing */
quals = joinTree->quals;
if (quals != NULL && IsA(quals, List))
{
quals = (Node *) make_ands_explicit((List *) quals);
}
/*
* Distribution column must be used in a simple equality match check and it must be
* placed in the top-level conjunction. In simple words, we should have
* WHERE dist_key = VALUE [AND ....];
*
* We're also not allowing any other appearances of the distribution key in the quals.
*
* Overall the logic might sound fuzzy since it involves two individual checks:
* (a) Check for top level AND operator with one side being "dist_key = const"
* (b) Only allow single appearance of "dist_key" in the quals
*
* This is to simplify both of the individual checks and omit various edge cases
* that might arise with multiple distribution keys in the quals.
*/
if (ConjunctionContainsColumnFilter(quals, distributionKey) &&
!ColumnAppearsMultipleTimes(quals, distributionKey))
{
return true;
}
return false;
}
/*
* ColumnAppearsMultipleTimes returns true if the given distribution key
* appears more than once in the quals.
*/
static bool
ColumnAppearsMultipleTimes(Node *quals, Var *distributionKey)
{
ListCell *varClauseCell = NULL;
List *varClauseList = NIL;
int partitionColumnReferenceCount = 0;
/* make sure partition column is used only once in the quals */
varClauseList = pull_var_clause_default(quals);
foreach(varClauseCell, varClauseList)
{
Var *column = (Var *) lfirst(varClauseCell);
if (equal(column, distributionKey))
{
partitionColumnReferenceCount++;
if (partitionColumnReferenceCount > 1)
{
return true;
}
}
}
return false;
}
/*
* ConjunctionContainsColumnFilter returns true if the query contains an exact
* match (equal) expression on the provided column. The function returns true only
* if the match expression has an AND relation with the rest of the expression tree.
*/
static bool
ConjunctionContainsColumnFilter(Node *node, Var *column)
{
if (node == NULL)
{
return false;
}
if (IsA(node, OpExpr))
{
OpExpr *opExpr = (OpExpr *) node;
bool distKeyInSimpleOpExpression =
DistKeyInSimpleOpExpression((Expr *) opExpr, column);
if (!distKeyInSimpleOpExpression)
{
return false;
}
return OperatorImplementsEquality(opExpr->opno);
}
else if (IsA(node, BoolExpr))
{
BoolExpr *boolExpr = (BoolExpr *) node;
List *argumentList = boolExpr->args;
ListCell *argumentCell = NULL;
/*
* We do not descend into boolean expressions other than AND.
* If the column filter appears in an OR clause, we do not
* consider it even if it is logically the same as a single value
* comparison (e.g. `<column> = <Const> OR false`)
*/
if (boolExpr->boolop != AND_EXPR)
{
return false;
}
foreach(argumentCell, argumentList)
{
Node *argumentNode = (Node *) lfirst(argumentCell);
if (ConjunctionContainsColumnFilter(argumentNode, column))
{
return true;
}
}
}
return false;
}
/*
* DistKeyInSimpleOpExpression checks whether given expression is a simple operator
* expression with either (dist_key = param) or (dist_key = const). Note that the
* operands could be in the reverse order as well.
*/
static bool
DistKeyInSimpleOpExpression(Expr *clause, Var *distColumn)
{
Node *leftOperand = NULL;
Node *rightOperand = NULL;
Param *paramClause = NULL;
Const *constantClause = NULL;
Var *columnInExpr = NULL;
if (is_opclause(clause) && list_length(((OpExpr *) clause)->args) == 2)
{
leftOperand = get_leftop(clause);
rightOperand = get_rightop(clause);
}
else
{
return false; /* not a binary opclause */
}
/* strip coercions before doing check */
leftOperand = strip_implicit_coercions(leftOperand);
rightOperand = strip_implicit_coercions(rightOperand);
if (IsA(rightOperand, Param) && IsA(leftOperand, Var))
{
paramClause = (Param *) rightOperand;
columnInExpr = (Var *) leftOperand;
}
else if (IsA(leftOperand, Param) && IsA(rightOperand, Var))
{
paramClause = (Param *) leftOperand;
columnInExpr = (Var *) rightOperand;
}
else if (IsA(rightOperand, Const) && IsA(leftOperand, Var))
{
constantClause = (Const *) rightOperand;
columnInExpr = (Var *) leftOperand;
}
else if (IsA(leftOperand, Const) && IsA(rightOperand, Var))
{
constantClause = (Const *) leftOperand;
columnInExpr = (Var *) rightOperand;
}
else
{
return false;
}
if (paramClause && paramClause->paramkind != PARAM_EXTERN)
{
/* we can only handle param_externs */
return false;
}
else if (constantClause && constantClause->constisnull)
{
/* we can only handle non-null constants */
return false;
}
/* at this point we should have the columnInExpr */
Assert(columnInExpr);
return equal(distColumn, columnInExpr);
}

@ -2005,13 +2005,11 @@ BuildJobTreeTaskList(Job *jobTree, PlannerRestrictionContext *plannerRestriction
if (job->subqueryPushdown)
{
bool isMultiShardQuery = false;
List *prunedRelationShardList = TargetShardIntervalsForQuery(job->jobQuery,
plannerRestrictionContext
->
relationRestrictionContext,
&
isMultiShardQuery,
NULL);
List *prunedRelationShardList =
TargetShardIntervalsForRestrictInfo(plannerRestrictionContext->
relationRestrictionContext,
&isMultiShardQuery, NULL);
sqlTaskList = QueryPushdownSqlTaskList(job->jobQuery, job->jobId,
plannerRestrictionContext->
relationRestrictionContext,

@ -146,6 +146,9 @@ static List * get_all_actual_clauses(List *restrictinfo_list);
static int CompareInsertValuesByShardId(const void *leftElement,
const void *rightElement);
static uint64 GetInitialShardId(List *relationShardList);
static List * TargetShardIntervalForFastPathQuery(Query *query,
Const **partitionValueConst,
bool *isMultiShardQuery);
static List * SingleShardSelectTaskList(Query *query, uint64 jobId,
List *relationShardList, List *placementList,
uint64 shardId);
@ -1886,11 +1889,46 @@ PlanRouterQuery(Query *originalQuery,
*placementList = NIL;
prunedRelationShardList = TargetShardIntervalsForQuery(originalQuery,
plannerRestrictionContext->
relationRestrictionContext,
&isMultiShardQuery,
partitionValueConst);
/*
* When FastPathRouterQuery() returns true, we know that standard_planner() has
* not been called. Thus, restriction information is not available and we do the
* shard pruning based on the distribution column in the quals of the query.
*/
if (FastPathRouterQuery(originalQuery))
{
List *shardIntervalList =
TargetShardIntervalForFastPathQuery(originalQuery, partitionValueConst,
&isMultiShardQuery);
/*
* This could only happen when there is a parameter on the distribution key.
* We defer error here, later the planner is forced to use a generic plan
* by assigning arbitrarily high cost to the plan.
*/
if (UpdateOrDeleteQuery(originalQuery) && isMultiShardQuery)
{
planningError = DeferredError(ERRCODE_FEATURE_NOT_SUPPORTED,
"Router planner cannot handle multi-shard "
"modify queries", NULL, NULL);
return planningError;
}
prunedRelationShardList = list_make1(shardIntervalList);
if (!isMultiShardQuery)
{
ereport(DEBUG2, (errmsg("Distributed planning for a fast-path router "
"query")));
}
}
else
{
prunedRelationShardList =
TargetShardIntervalsForRestrictInfo(plannerRestrictionContext->
relationRestrictionContext,
&isMultiShardQuery,
partitionValueConst);
}
if (isMultiShardQuery)
{
@ -2065,19 +2103,59 @@ GetInitialShardId(List *relationShardList)
/*
* TargetShardIntervalsForQuery performs shard pruning for all referenced relations
* in the query and returns list of shards per relation. Shard pruning is done based
* on provided restriction context per relation. The function sets multiShardQuery
* to true if any of the relations pruned down to more than one active shard. It
* also records pruned shard intervals in relation restriction context to be used
* later on. Some queries may have contradiction clauses like 'and false' or
* 'and 1=0', such queries are treated as if all of the shards of joining
* relations are pruned out.
* TargetShardIntervalForFastPathQuery gets a query which is in
* the form defined by FastPathRouterQuery() and returns exactly
* one shard interval (see FastPathRouterQuery() for the detail).
*
* Also set the outgoing partition column value if requested via
* partitionValueConst
*/
static List *
TargetShardIntervalForFastPathQuery(Query *query, Const **partitionValueConst,
bool *isMultiShardQuery)
{
Const *queryPartitionValueConst = NULL;
Oid relationId = ExtractFirstDistributedTableId(query);
Node *quals = query->jointree->quals;
int relationIndex = 1;
List *prunedShardIntervalList =
PruneShards(relationId, relationIndex, make_ands_implicit((Expr *) quals),
&queryPartitionValueConst);
/* we're only expecting single shard from a single table */
Assert(FastPathRouterQuery(query));
if (list_length(prunedShardIntervalList) > 1)
{
*isMultiShardQuery = true;
}
else if (list_length(prunedShardIntervalList) == 1 &&
queryPartitionValueConst != NULL)
{
/* set the outgoing partition column value if requested */
*partitionValueConst = queryPartitionValueConst;
}
return prunedShardIntervalList;
}
/*
* TargetShardIntervalsForRestrictInfo performs shard pruning for all referenced
* relations in the relation restriction context and returns list of shards per
* relation. Shard pruning is done based on provided restriction context per relation.
* The function sets multiShardQuery to true if any of the relations pruned down to
* more than one active shard. It also records pruned shard intervals in relation
* restriction context to be used later on. Some queries may have contradiction
* clauses like 'and false' or 'and 1=0', such queries are treated as if all of
* the shards of joining relations are pruned out.
*/
List *
TargetShardIntervalsForQuery(Query *query,
RelationRestrictionContext *restrictionContext,
bool *multiShardQuery, Const **partitionValueConst)
TargetShardIntervalsForRestrictInfo(RelationRestrictionContext *restrictionContext,
bool *multiShardQuery, Const **partitionValueConst)
{
List *prunedRelationShardList = NIL;
ListCell *restrictionCell = NULL;

@ -2,14 +2,23 @@
The distributed query planner is entered through the `distributed_planner` function in `distributed_planner.c`. This is the hook that Postgres calls instead of `standard_planner`.
We always first call `standard_planner` to build a `PlannedStmt`. For queries containing a distributed table or reference table, we then proceed with distributed planning, which overwrites the `planTree` in the `PlannedStmt`.
If the input query is trivial (e.g., no joins, no subqueries/CTEs, single table and single shard), we create a very simple `PlannedStmt`. If the query is not trivial, we call `standard_planner` to build a `PlannedStmt`. For queries containing a distributed table or reference table, we then proceed with distributed planning, which overwrites the `planTree` in the `PlannedStmt`.
Distributed planning (`CreateDistributedPlan`) tries several different methods to plan the query:
1. Router planner, proceed if the query prunes down to a single set of co-located shards
2. Modification planning, proceed if the query is a DML command and all joins are co-located
3. Recursive planning, find CTEs and subqueries that cannot be pushed down and go back to 1
4. Logical planner, constructs a multi-relational algebra tree to find a distributed execution plan
1. Fast-path router planner, proceed if the query prunes down to a single shard of a single table
2. Router planner, proceed if the query prunes down to a single set of co-located shards
3. Modification planning, proceed if the query is a DML command and all joins are co-located
4. Recursive planning, find CTEs and subqueries that cannot be pushed down and go back to 1
5. Logical planner, constructs a multi-relational algebra tree to find a distributed execution plan
## Fast-path router planner
If we can decide, by examining the query tree, that the query hits only a single shard of a single table, we can skip calling `standard_planner()`. Later on, we simply fetch the filter on the distribution key and do the pruning.
As the name suggests, this can be considered a special case of the router planner described below. The only difference is that the fast-path planner doesn't rely on `standard_planner()` for collecting restriction information.
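
To make the pruning step concrete, here is a minimal standalone sketch, not the actual Citus code. It assumes that a hash-distributed table splits the signed 32-bit hash space into contiguous, sorted ranges, one per shard, and that the value of the distribution key has already been hashed. The `ShardRange` type and `find_shard_index` function are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shard descriptor: an inclusive range of hash values. */
typedef struct ShardRange
{
	int32_t min_hash;
	int32_t max_hash;
	uint64_t shard_id;
} ShardRange;

/*
 * Binary search over shard ranges sorted by min_hash. Returns the index of
 * the shard whose range contains hash_value, or -1 if no range contains it.
 */
int
find_shard_index(const ShardRange *shards, int shard_count, int32_t hash_value)
{
	int low = 0;
	int high = shard_count - 1;

	assert(shards != NULL || shard_count == 0);

	while (low <= high)
	{
		int mid = low + (high - low) / 2;

		if (hash_value < shards[mid].min_hash)
		{
			high = mid - 1;
		}
		else if (hash_value > shards[mid].max_hash)
		{
			low = mid + 1;
		}
		else
		{
			return mid;
		}
	}

	return -1;
}
```

Because a fast-path query carries exactly one `dist_key = X` filter, a lookup of this kind yields at most one shard, which is why no restriction information from `standard_planner()` is needed.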
## Router planner

@ -411,6 +411,16 @@ RegisterCitusConfigVariables(void)
GUC_NO_SHOW_ALL,
NULL, NULL, NULL);
DefineCustomBoolVariable(
"citus.enable_fast_path_router_planner",
gettext_noop("Enables fast path router planner"),
NULL,
&EnableFastPathRouterPlanner,
true,
PGC_USERSET,
GUC_NO_SHOW_ALL,
NULL, NULL, NULL);
DefineCustomBoolVariable(
"citus.override_table_visibility",
gettext_noop("Enables replacing occurencens of pg_catalog.pg_table_visible() "

@ -97,6 +97,7 @@ extern bool IsModifyCommand(Query *query);
extern bool IsUpdateOrDelete(struct DistributedPlan *distributedPlan);
extern bool IsModifyDistributedPlan(struct DistributedPlan *distributedPlan);
extern void EnsurePartitionTableNotReplicated(Oid relationId);
extern Node * ResolveExternalParams(Node *inputNode, ParamListInfo boundParams);
extern bool IsMultiTaskPlan(struct DistributedPlan *distributedPlan);
extern bool IsMultiShardModifyPlan(struct DistributedPlan *distributedPlan);
extern RangeTblEntry * RemoteScanRangeTableEntry(List *columnNameList);

@ -25,6 +25,7 @@
#define CITUS_TABLE_ALIAS "citus_table_alias"
extern bool EnableRouterExecution;
extern bool EnableFastPathRouterPlanner;
extern DistributedPlan * CreateRouterPlan(Query *originalQuery, Query *query,
PlannerRestrictionContext *
@ -42,10 +43,10 @@ extern DeferredErrorMessage * PlanRouterQuery(Query *originalQuery,
Const **partitionValueConst);
extern List * RouterInsertTaskList(Query *query, DeferredErrorMessage **planningError);
extern Const * ExtractInsertPartitionKeyValue(Query *query);
extern List * TargetShardIntervalsForQuery(Query *query,
RelationRestrictionContext *restrictionContext,
bool *multiShardQuery,
Const **partitionValueConst);
extern List * TargetShardIntervalsForRestrictInfo(RelationRestrictionContext *
restrictionContext,
bool *multiShardQuery,
Const **partitionValueConst);
extern List * WorkersContainingAllShards(List *prunedShardIntervalsList);
extern List * IntersectPlacementList(List *lhsPlacementList, List *rhsPlacementList);
extern DeferredErrorMessage * ModifyQuerySupported(Query *queryTree, Query *originalQuery,
@ -68,5 +69,13 @@ extern void AddShardIntervalRestrictionToSelect(Query *subqery,
extern bool UpdateOrDeleteQuery(Query *query);
extern List * WorkersContainingAllShards(List *prunedShardIntervalsList);
/*
* FastPathPlanner is a subset of router planner, that's why we prefer to
* keep the external function here.
*/
extern PlannedStmt * GeneratePlaceHolderPlannedStmt(Query *parse);
extern PlannedStmt * FastPathPlanner(Query *originalQuery, Query *parse, ParamListInfo
boundParams);
extern bool FastPathRouterQuery(Query *query);
#endif /* MULTI_ROUTER_PLANNER_H */

@ -0,0 +1,362 @@
CREATE SCHEMA fast_path_router_modify;
SET search_path TO fast_path_router_modify;
SET citus.next_shard_id TO 1840000;
-- all the tests in this file are intended for testing the fast-path
-- router planner, so we're explicitly enabling it in this file.
-- We have a bunch of other tests that trigger the non-fast-path router
-- planner (note this is already true by default)
SET citus.enable_fast_path_router_planner TO true;
SET citus.shard_replication_factor TO 1;
CREATE TABLE modify_fast_path(key int, value_1 int, value_2 text);
SELECT create_distributed_table('modify_fast_path', 'key');
create_distributed_table
--------------------------
(1 row)
SET citus.shard_replication_factor TO 2;
CREATE TABLE modify_fast_path_replication_2(key int, value_1 int, value_2 text);
SELECT create_distributed_table('modify_fast_path_replication_2', 'key');
create_distributed_table
--------------------------
(1 row)
CREATE TABLE modify_fast_path_reference(key int, value_1 int, value_2 text);
SELECT create_reference_table('modify_fast_path_reference');
create_reference_table
------------------------
(1 row)
-- show the output
SET client_min_messages TO DEBUG;
-- very simple queries go through fast-path planning
DELETE FROM modify_fast_path WHERE key = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
UPDATE modify_fast_path SET value_1 = 1 WHERE key = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
UPDATE modify_fast_path SET value_1 = value_1 + 1 WHERE key = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
UPDATE modify_fast_path SET value_1 = value_1 + value_2::int WHERE key = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
DELETE FROM modify_fast_path WHERE value_1 = 15 AND (key = 1 AND value_2 = 'citus');
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
DELETE FROM modify_fast_path WHERE key = 1 and FALSE;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
-- UPDATE may include complex target entries
UPDATE modify_fast_path SET value_1 = value_1 + 12 * value_1 WHERE key = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
UPDATE modify_fast_path SET value_1 = abs(-19) WHERE key = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
-- cannot go through fast-path because there are multiple keys
DELETE FROM modify_fast_path WHERE key = 1 AND key = 2;
DEBUG: Creating router plan
DEBUG: Plan is router executable
DELETE FROM modify_fast_path WHERE key = 1 AND (key = 2 AND value_1 = 15);
DEBUG: Creating router plan
DEBUG: Plan is router executable
-- cannot go through fast-path because key is not on the top level
DELETE FROM modify_fast_path WHERE value_1 = 15 OR (key = 1 AND value_2 = 'citus');
DEBUG: Creating router plan
DEBUG: Plan is router executable
DELETE FROM modify_fast_path WHERE value_1 = 15 AND (key = 1 OR value_2 = 'citus');
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
-- goes through fast-path planning even if the key is updated to the same value
UPDATE modify_fast_path SET key = 1 WHERE key = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
UPDATE modify_fast_path SET key = 1::float WHERE key = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
-- cannot support if key changes
UPDATE modify_fast_path SET key = 2 WHERE key = 1;
DEBUG: modifying the partition value of rows is not allowed
ERROR: modifying the partition value of rows is not allowed
UPDATE modify_fast_path SET key = 2::numeric WHERE key = 1;
DEBUG: modifying the partition value of rows is not allowed
ERROR: modifying the partition value of rows is not allowed
-- returning is not supported via fast-path
DELETE FROM modify_fast_path WHERE key = 1 RETURNING *;
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
key | value_1 | value_2
-----+---------+---------
(0 rows)
-- modifying CTEs are not supported via fast-path
WITH t1 AS (DELETE FROM modify_fast_path WHERE key = 1), t2 AS (SELECT * FROM modify_fast_path) SELECT * FROM t2;
DEBUG: data-modifying statements are not supported in the WITH clauses of distributed queries
DEBUG: generating subplan 18_1 for CTE t1: DELETE FROM fast_path_router_modify.modify_fast_path WHERE (key OPERATOR(pg_catalog.=) 1)
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
DEBUG: generating subplan 18_2 for CTE t2: SELECT key, value_1, value_2 FROM fast_path_router_modify.modify_fast_path
DEBUG: Plan 18 query after replacing subqueries and CTEs: SELECT key, value_1, value_2 FROM (SELECT intermediate_result.key, intermediate_result.value_1, intermediate_result.value_2 FROM read_intermediate_result('18_2'::text, 'binary'::citus_copy_format) intermediate_result(key integer, value_1 integer, value_2 text)) t2
DEBUG: Creating router plan
DEBUG: Plan is router executable
key | value_1 | value_2
-----+---------+---------
(0 rows)
-- for update/share is supported via fast-path when replication factor = 1 or reference table
SELECT * FROM modify_fast_path WHERE key = 1 FOR UPDATE;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
key | value_1 | value_2
-----+---------+---------
(0 rows)
SELECT * FROM modify_fast_path WHERE key = 1 FOR SHARE;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
key | value_1 | value_2
-----+---------+---------
(0 rows)
SELECT * FROM modify_fast_path_reference WHERE key = 1 FOR UPDATE;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
key | value_1 | value_2
-----+---------+---------
(0 rows)
SELECT * FROM modify_fast_path_reference WHERE key = 1 FOR SHARE;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
key | value_1 | value_2
-----+---------+---------
(0 rows)
-- for update/share is not supported via fast-path when replication factor > 1
SELECT * FROM modify_fast_path_replication_2 WHERE key = 1 FOR UPDATE;
ERROR: could not run distributed query with FOR UPDATE/SHARE commands
HINT: Consider using an equality filter on the distributed table's partition column.
SELECT * FROM modify_fast_path_replication_2 WHERE key = 1 FOR SHARE;
ERROR: could not run distributed query with FOR UPDATE/SHARE commands
HINT: Consider using an equality filter on the distributed table's partition column.
-- very simple queries on reference tables go through fast-path planning
DELETE FROM modify_fast_path_reference WHERE key = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
UPDATE modify_fast_path_reference SET value_1 = 1 WHERE key = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
UPDATE modify_fast_path_reference SET value_1 = value_1 + 1 WHERE key = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
UPDATE modify_fast_path_reference SET value_1 = value_1 + value_2::int WHERE key = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
-- joins are not supported via fast-path
UPDATE modify_fast_path
SET value_1 = 1
FROM modify_fast_path_reference
WHERE
modify_fast_path.key = modify_fast_path_reference.key AND
modify_fast_path.key = 1 AND
modify_fast_path_reference.key = 1;
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
PREPARE p1 (int, int, int) AS
UPDATE modify_fast_path SET value_1 = value_1 + $1 WHERE key = $2 AND value_1 = $3;
EXECUTE p1(1,1,1);
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
EXECUTE p1(2,2,2);
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 2
EXECUTE p1(3,3,3);
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 3
EXECUTE p1(4,4,4);
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 4
EXECUTE p1(5,5,5);
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 5
EXECUTE p1(6,6,6);
DEBUG: Router planner cannot handle multi-shard modify queries
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 6
CREATE FUNCTION modify_fast_path_plpsql(int, int) RETURNS void as $$
BEGIN
DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2;
END;
$$ LANGUAGE plpgsql;
SELECT modify_fast_path_plpsql(1,1);
DEBUG: Distributed planning for a fast-path router query
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
DEBUG: Creating router plan
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
modify_fast_path_plpsql
-------------------------
(1 row)
SELECT modify_fast_path_plpsql(2,2);
DEBUG: Distributed planning for a fast-path router query
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
DEBUG: Creating router plan
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
DEBUG: Plan is router executable
DETAIL: distribution column value: 2
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
modify_fast_path_plpsql
-------------------------
(1 row)
SELECT modify_fast_path_plpsql(3,3);
DEBUG: Distributed planning for a fast-path router query
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
DEBUG: Creating router plan
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
DEBUG: Plan is router executable
DETAIL: distribution column value: 3
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
modify_fast_path_plpsql
-------------------------
(1 row)
SELECT modify_fast_path_plpsql(4,4);
DEBUG: Distributed planning for a fast-path router query
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
DEBUG: Creating router plan
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
DEBUG: Plan is router executable
DETAIL: distribution column value: 4
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
modify_fast_path_plpsql
-------------------------
(1 row)
SELECT modify_fast_path_plpsql(5,5);
DEBUG: Distributed planning for a fast-path router query
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
DEBUG: Creating router plan
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
DEBUG: Plan is router executable
DETAIL: distribution column value: 5
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
modify_fast_path_plpsql
-------------------------
(1 row)
SELECT modify_fast_path_plpsql(6,6);
DEBUG: Router planner cannot handle multi-shard modify queries
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
DEBUG: Distributed planning for a fast-path router query
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
DEBUG: Creating router plan
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
DEBUG: Plan is router executable
DETAIL: distribution column value: 6
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
modify_fast_path_plpsql
-------------------------
(1 row)
SELECT modify_fast_path_plpsql(6,6);
DEBUG: Distributed planning for a fast-path router query
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
DEBUG: Creating router plan
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
DEBUG: Plan is router executable
DETAIL: distribution column value: 6
CONTEXT: SQL statement "DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2"
PL/pgSQL function modify_fast_path_plpsql(integer,integer) line 3 at SQL statement
modify_fast_path_plpsql
-------------------------
(1 row)
RESET client_min_messages;
DROP SCHEMA fast_path_router_modify CASCADE;
NOTICE: drop cascades to 4 other objects
DETAIL: drop cascades to table modify_fast_path
drop cascades to table modify_fast_path_replication_2
drop cascades to table modify_fast_path_reference
drop cascades to function modify_fast_path_plpsql(integer,integer)

@ -2,6 +2,10 @@
-- MULTI_FUNCTION_EVALUATION
--
SET citus.next_shard_id TO 1200000;
-- many of the tests in this file are intended for testing the non-fast-path
-- router planner, so we're explicitly disabling the fast-path planner here.
-- We have a bunch of other tests that trigger the fast-path router planner
SET citus.enable_fast_path_router_planner TO false;
-- nextval() works (no good way to test DEFAULT, or, by extension, SERIAL)
CREATE TABLE example (key INT, value INT);
SELECT master_create_distributed_table('example', 'key', 'hash');

@ -5,6 +5,10 @@
SET citus.next_shard_id TO 630000;
SET citus.shard_count to 4;
SET citus.shard_replication_factor to 1;
-- many of the tests in this file are intended for testing the non-fast-path
-- router planner, so we're explicitly disabling the fast-path planner here.
-- We have a bunch of other tests that trigger the fast-path router planner
SET citus.enable_fast_path_router_planner TO false;
-- Create a table partitioned on integer column and update partition type to
-- hash. Then load data into this table and update shard min max values with
-- hashed ones. Hash value of 1, 2, 3 and 4 are consecutively -1905060026,

@ -1179,6 +1179,7 @@ FROM
DEBUG: cannot perform distributed INSERT INTO ... SELECT because the partition columns in the source table and subquery do not match
DETAIL: The target table's partition column should correspond to a partition column in the subquery.
DEBUG: Collecting INSERT ... SELECT results on coordinator
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
-- foo2 is recursively planned and INSERT...SELECT is done via coordinator

@ -64,6 +64,10 @@ DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 10
-- single-shard tests
-- many of the tests in this file are intended for testing the non-fast-path
-- router planner, so we're explicitly disabling the fast-path planner here.
-- We have a bunch of other tests that trigger the fast-path router planner
SET citus.enable_fast_path_router_planner TO false;
-- test simple select for a single row
SELECT * FROM articles_hash_mx WHERE author_id = 10 AND id = 50;
DEBUG: Creating router plan

@ -4,6 +4,10 @@
-- Many of the queries are taken from other regression test files
-- and converted into both plain SQL and PL/pgsql functions, which
-- use prepared statements internally.
-- many of the tests in this file are intended for testing the non-fast-path
-- router planner, so we're explicitly disabling the fast-path planner here.
-- We have a bunch of other tests that trigger the fast-path router planner
SET citus.enable_fast_path_router_planner TO false;
CREATE FUNCTION plpgsql_test_1() RETURNS TABLE(count bigint) AS $$
DECLARE
BEGIN

@ -1,6 +1,10 @@
--
-- MULTI_PREPARE_SQL
--
-- many of the tests in this file are intended for testing the non-fast-path
-- router planner, so we're explicitly disabling the fast-path planner here.
-- We have a bunch of other tests that trigger the fast-path router planner
SET citus.enable_fast_path_router_planner TO false;
-- Tests covering PREPARE statements. Many of the queries are
-- taken from other regression test files and converted into
-- prepared statements.

@ -2,6 +2,10 @@ SET citus.next_shard_id TO 840000;
-- ===================================================================
-- test router planner functionality for single shard select queries
-- ===================================================================
-- all the tests in this file are intended for testing the non-fast-path
-- router planner, so we're disabling the fast-path planner in this file.
-- We have a bunch of other tests that trigger the fast-path router planner
SET citus.enable_fast_path_router_planner TO false;
CREATE TABLE articles_hash (
id bigint NOT NULL,
author_id bigint NOT NULL,

File diff suppressed because it is too large

@ -1,4 +1,8 @@
SET citus.next_shard_id TO 850000;
-- many of the tests in this file are intended for testing the non-fast-path
-- router planner, so we're explicitly disabling the fast-path planner here.
-- We have a bunch of other tests that trigger the fast-path router planner
SET citus.enable_fast_path_router_planner TO false;
-- ===================================================================
-- test end-to-end query functionality
-- ===================================================================
@ -658,4 +662,52 @@ DETAIL: distribution column value: 1
41 | 1 | aznavour | 11814
(5 rows)
-- test tablesample with fast path as well
SET citus.enable_fast_path_router_planner TO true;
SELECT * FROM articles TABLESAMPLE SYSTEM (0) WHERE author_id = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
id | author_id | title | word_count
----+-----------+-------+------------
(0 rows)
SELECT * FROM articles TABLESAMPLE BERNOULLI (0) WHERE author_id = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
id | author_id | title | word_count
----+-----------+-------+------------
(0 rows)
SELECT * FROM articles TABLESAMPLE SYSTEM (100) WHERE author_id = 1 ORDER BY id;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
id | author_id | title | word_count
----+-----------+--------------+------------
1 | 1 | arsenous | 9572
11 | 1 | alamo | 1347
21 | 1 | arcading | 5890
31 | 1 | athwartships | 7271
41 | 1 | aznavour | 11814
(5 rows)
SELECT * FROM articles TABLESAMPLE BERNOULLI (100) WHERE author_id = 1 ORDER BY id;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
id | author_id | title | word_count
----+-----------+--------------+------------
1 | 1 | arsenous | 9572
11 | 1 | alamo | 1347
21 | 1 | arcading | 5890
31 | 1 | athwartships | 7271
41 | 1 | aznavour | 11814
(5 rows)
SET client_min_messages to 'NOTICE';

@ -1,4 +1,8 @@
SET citus.next_shard_id TO 850000;
-- many of the tests in this file are intended for testing the non-fast-path
-- router planner, so we're explicitly disabling the fast-path planner here.
-- We have a bunch of other tests that trigger the fast-path router planner
SET citus.enable_fast_path_router_planner TO false;
-- ===================================================================
-- test end-to-end query functionality
-- ===================================================================
@ -602,4 +606,52 @@ DETAIL: distribution column value: 1
41 | 1 | aznavour | 11814
(5 rows)
-- test tablesample with fast path as well
SET citus.enable_fast_path_router_planner TO true;
SELECT * FROM articles TABLESAMPLE SYSTEM (0) WHERE author_id = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
id | author_id | title | word_count
----+-----------+-------+------------
(0 rows)
SELECT * FROM articles TABLESAMPLE BERNOULLI (0) WHERE author_id = 1;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
id | author_id | title | word_count
----+-----------+-------+------------
(0 rows)
SELECT * FROM articles TABLESAMPLE SYSTEM (100) WHERE author_id = 1 ORDER BY id;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
id | author_id | title | word_count
----+-----------+--------------+------------
1 | 1 | arsenous | 9572
11 | 1 | alamo | 1347
21 | 1 | arcading | 5890
31 | 1 | athwartships | 7271
41 | 1 | aznavour | 11814
(5 rows)
SELECT * FROM articles TABLESAMPLE BERNOULLI (100) WHERE author_id = 1 ORDER BY id;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DETAIL: distribution column value: 1
id | author_id | title | word_count
----+-----------+--------------+------------
1 | 1 | arsenous | 9572
11 | 1 | alamo | 1347
21 | 1 | arcading | 5890
31 | 1 | athwartships | 7271
41 | 1 | aznavour | 11814
(5 rows)
SET client_min_messages to 'NOTICE';

@ -153,6 +153,7 @@ SELECT create_reference_table('task_assignment_reference_table');
SET LOCAL citus.task_assignment_policy TO 'greedy';
EXPLAIN (COSTS FALSE) SELECT * FROM task_assignment_reference_table;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
QUERY PLAN
@ -162,6 +163,7 @@ DEBUG: Plan is router executable
(2 rows)
EXPLAIN (COSTS FALSE) SELECT * FROM task_assignment_reference_table;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
QUERY PLAN
@ -172,6 +174,7 @@ DEBUG: Plan is router executable
SET LOCAL citus.task_assignment_policy TO 'first-replica';
EXPLAIN (COSTS FALSE) SELECT * FROM task_assignment_reference_table;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
QUERY PLAN
@ -181,6 +184,7 @@ DEBUG: Plan is router executable
(2 rows)
EXPLAIN (COSTS FALSE) SELECT * FROM task_assignment_reference_table;
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
QUERY PLAN
@ -192,6 +196,7 @@ DEBUG: Plan is router executable
-- here we expect debug output showing two different hosts for subsequent queries
SET LOCAL citus.task_assignment_policy TO 'round-robin';
EXPLAIN (COSTS FALSE) SELECT * FROM task_assignment_reference_table;
DEBUG: Distributed planning for a fast-path router query
DEBUG: assigned task 0 to node localhost:57637
DEBUG: Creating router plan
DEBUG: Plan is router executable
@ -202,6 +207,7 @@ DEBUG: Plan is router executable
(2 rows)
EXPLAIN (COSTS FALSE) SELECT * FROM task_assignment_reference_table;
DEBUG: Distributed planning for a fast-path router query
DEBUG: assigned task 0 to node localhost:57638
DEBUG: Creating router plan
DEBUG: Plan is router executable

@ -139,6 +139,7 @@ FROM
WHERE test.y = foo.x;
DEBUG: generating subplan 19_1 for CTE cte_1: SELECT x FROM recursive_set_local.test
DEBUG: generating subplan 19_2 for CTE cte_1: SELECT a FROM recursive_set_local.ref
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DEBUG: generating subplan 19_3 for subquery SELECT x FROM recursive_set_local.local_test
@ -165,6 +166,7 @@ FROM
WHERE ref.a = foo.x;
DEBUG: generating subplan 23_1 for CTE cte_1: SELECT x FROM recursive_set_local.test
DEBUG: generating subplan 23_2 for CTE cte_1: SELECT a FROM recursive_set_local.ref
DEBUG: Distributed planning for a fast-path router query
DEBUG: Creating router plan
DEBUG: Plan is router executable
DEBUG: generating subplan 23_3 for subquery SELECT x FROM recursive_set_local.local_test

@ -184,8 +184,8 @@ test: multi_transaction_recovery
# multi_copy creates hash and range-partitioned tables and performs COPY
# multi_router_planner creates hash partitioned tables.
# ---------
test: multi_copy
test: multi_router_planner
test: multi_copy fast_path_router_modify
test: multi_router_planner multi_router_planner_fast_path
# ----------
# multi_large_shardid loads more lineitem data using high shard identifiers

@ -0,0 +1,113 @@
CREATE SCHEMA fast_path_router_modify;
SET search_path TO fast_path_router_modify;
SET citus.next_shard_id TO 1840000;
-- all the tests in this file are intended for testing the fast-path
-- router planner, so we're explicitly enabling it in this file.
-- We have a bunch of other tests that trigger the non-fast-path router
-- planner (note that the setting is already true by default)
SET citus.enable_fast_path_router_planner TO true;
SET citus.shard_replication_factor TO 1;
CREATE TABLE modify_fast_path(key int, value_1 int, value_2 text);
SELECT create_distributed_table('modify_fast_path', 'key');
SET citus.shard_replication_factor TO 2;
CREATE TABLE modify_fast_path_replication_2(key int, value_1 int, value_2 text);
SELECT create_distributed_table('modify_fast_path_replication_2', 'key');
CREATE TABLE modify_fast_path_reference(key int, value_1 int, value_2 text);
SELECT create_reference_table('modify_fast_path_reference');
-- show the output
SET client_min_messages TO DEBUG;
-- very simple queries go through fast-path planning
DELETE FROM modify_fast_path WHERE key = 1;
UPDATE modify_fast_path SET value_1 = 1 WHERE key = 1;
UPDATE modify_fast_path SET value_1 = value_1 + 1 WHERE key = 1;
UPDATE modify_fast_path SET value_1 = value_1 + value_2::int WHERE key = 1;
DELETE FROM modify_fast_path WHERE value_1 = 15 AND (key = 1 AND value_2 = 'citus');
DELETE FROM modify_fast_path WHERE key = 1 and FALSE;
-- UPDATE may include complex target entries
UPDATE modify_fast_path SET value_1 = value_1 + 12 * value_1 WHERE key = 1;
UPDATE modify_fast_path SET value_1 = abs(-19) WHERE key = 1;
-- cannot go through fast-path because there are multiple keys
DELETE FROM modify_fast_path WHERE key = 1 AND key = 2;
DELETE FROM modify_fast_path WHERE key = 1 AND (key = 2 AND value_1 = 15);
-- cannot go through fast-path because key is not on the top level
DELETE FROM modify_fast_path WHERE value_1 = 15 OR (key = 1 AND value_2 = 'citus');
DELETE FROM modify_fast_path WHERE value_1 = 15 AND (key = 1 OR value_2 = 'citus');
-- goes through fast-path planning even if the key is updated to the same value
UPDATE modify_fast_path SET key = 1 WHERE key = 1;
UPDATE modify_fast_path SET key = 1::float WHERE key = 1;
-- cannot support if key changes
UPDATE modify_fast_path SET key = 2 WHERE key = 1;
UPDATE modify_fast_path SET key = 2::numeric WHERE key = 1;
-- returning is not supported via fast-path
DELETE FROM modify_fast_path WHERE key = 1 RETURNING *;
-- modifying ctes are not supported via fast-path
WITH t1 AS (DELETE FROM modify_fast_path WHERE key = 1), t2 AS (SELECT * FROM modify_fast_path) SELECT * FROM t2;
-- for update/share is supported via fast-path when replication factor = 1 or reference table
SELECT * FROM modify_fast_path WHERE key = 1 FOR UPDATE;
SELECT * FROM modify_fast_path WHERE key = 1 FOR SHARE;
SELECT * FROM modify_fast_path_reference WHERE key = 1 FOR UPDATE;
SELECT * FROM modify_fast_path_reference WHERE key = 1 FOR SHARE;
-- for update/share is not supported via fast-path when replication factor > 1
SELECT * FROM modify_fast_path_replication_2 WHERE key = 1 FOR UPDATE;
SELECT * FROM modify_fast_path_replication_2 WHERE key = 1 FOR SHARE;
-- very simple queries on reference tables go through fast-path planning
DELETE FROM modify_fast_path_reference WHERE key = 1;
UPDATE modify_fast_path_reference SET value_1 = 1 WHERE key = 1;
UPDATE modify_fast_path_reference SET value_1 = value_1 + 1 WHERE key = 1;
UPDATE modify_fast_path_reference SET value_1 = value_1 + value_2::int WHERE key = 1;
-- joins are not supported via fast-path
UPDATE modify_fast_path
SET value_1 = 1
FROM modify_fast_path_reference
WHERE
modify_fast_path.key = modify_fast_path_reference.key AND
modify_fast_path.key = 1 AND
modify_fast_path_reference.key = 1;
PREPARE p1 (int, int, int) AS
UPDATE modify_fast_path SET value_1 = value_1 + $1 WHERE key = $2 AND value_1 = $3;
EXECUTE p1(1,1,1);
EXECUTE p1(2,2,2);
EXECUTE p1(3,3,3);
EXECUTE p1(4,4,4);
EXECUTE p1(5,5,5);
EXECUTE p1(6,6,6);
CREATE FUNCTION modify_fast_path_plpsql(int, int) RETURNS void as $$
BEGIN
DELETE FROM modify_fast_path WHERE key = $1 AND value_1 = $2;
END;
$$ LANGUAGE plpgsql;
SELECT modify_fast_path_plpsql(1,1);
SELECT modify_fast_path_plpsql(2,2);
SELECT modify_fast_path_plpsql(3,3);
SELECT modify_fast_path_plpsql(4,4);
SELECT modify_fast_path_plpsql(5,5);
SELECT modify_fast_path_plpsql(6,6);
SELECT modify_fast_path_plpsql(6,6);
RESET client_min_messages;
DROP SCHEMA fast_path_router_modify CASCADE;

@ -4,6 +4,11 @@
SET citus.next_shard_id TO 1200000;
-- many of the tests in this file are intended for testing the non-fast-path
-- router planner, so we're explicitly disabling the fast-path planner here.
-- We have a bunch of other tests that trigger the fast-path router planner
SET citus.enable_fast_path_router_planner TO false;
-- nextval() works (no good way to test DEFAULT, or, by extension, SERIAL)
CREATE TABLE example (key INT, value INT);

@ -9,6 +9,12 @@ SET citus.next_shard_id TO 630000;
SET citus.shard_count to 4;
SET citus.shard_replication_factor to 1;
-- many of the tests in this file are intended for testing the non-fast-path
-- router planner, so we're explicitly disabling the fast-path planner here.
-- We have a bunch of other tests that trigger the fast-path router planner
SET citus.enable_fast_path_router_planner TO false;
-- Create a table partitioned on integer column and update partition type to
-- hash. Then load data into this table and update shard min max values with
-- hashed ones. Hash value of 1, 2, 3 and 4 are consecutively -1905060026,

@ -70,6 +70,11 @@ INSERT INTO articles_single_shard_hash_mx VALUES (50, 10, 'anjanette', 19519);
-- single-shard tests
-- many of the tests in this file are intended for testing the non-fast-path
-- router planner, so we're explicitly disabling the fast-path planner here.
-- We have a bunch of other tests that trigger the fast-path router planner
SET citus.enable_fast_path_router_planner TO false;
-- test simple select for a single row
SELECT * FROM articles_hash_mx WHERE author_id = 10 AND id = 50;

@ -6,6 +6,10 @@
-- and converted into both plain SQL and PL/pgsql functions, which
-- use prepared statements internally.
-- many of the tests in this file are intended for testing the non-fast-path
-- router planner, so we're explicitly disabling the fast-path planner here.
-- We have a bunch of other tests that trigger the fast-path router planner
SET citus.enable_fast_path_router_planner TO false;
CREATE FUNCTION plpgsql_test_1() RETURNS TABLE(count bigint) AS $$
DECLARE

@ -2,6 +2,11 @@
-- MULTI_PREPARE_SQL
--
-- many of the tests in this file are intended for testing the non-fast-path
-- router planner, so we're explicitly disabling the fast-path planner here.
-- We have a bunch of other tests that trigger the fast-path router planner
SET citus.enable_fast_path_router_planner TO false;
-- Tests covering PREPARE statements. Many of the queries are
-- taken from other regression test files and converted into
-- prepared statements.

@ -1,11 +1,15 @@
SET citus.next_shard_id TO 840000;
-- ===================================================================
-- test router planner functionality for single shard select queries
-- ===================================================================
-- all the tests in this file are intended for testing the non-fast-path
-- router planner, so we're disabling the fast-path planner in this file.
-- We have a bunch of other tests that trigger the fast-path router planner
SET citus.enable_fast_path_router_planner TO false;
CREATE TABLE articles_hash (
id bigint NOT NULL,
author_id bigint NOT NULL,

@ -0,0 +1,829 @@
CREATE SCHEMA fast_path_router_select;
SET search_path TO fast_path_router_select;
SET citus.next_shard_id TO 1840000;
-- all the tests in this file are intended for testing the fast-path
-- router planner, so we're explicitly enabling it in this file.
-- We have a bunch of other tests that trigger the non-fast-path router
-- planner (note that the setting is already true by default)
SET citus.enable_fast_path_router_planner TO true;
-- ===================================================================
-- test router planner functionality via fast path on
-- single shard select queries
-- ===================================================================
CREATE TABLE articles_hash (
id bigint NOT NULL,
author_id bigint NOT NULL,
title varchar(20) NOT NULL,
word_count integer
);
CREATE TABLE articles_range (
id bigint NOT NULL,
author_id bigint NOT NULL,
title varchar(20) NOT NULL,
word_count integer
);
CREATE TABLE articles_append (
id bigint NOT NULL,
author_id bigint NOT NULL,
title varchar(20) NOT NULL,
word_count integer
);
-- Check for the existence of line 'DEBUG: Creating router plan'
-- to determine if router planner is used.
-- this table is used in a CTE test
CREATE TABLE authors_hash ( name varchar(20), id bigint );
CREATE TABLE authors_range ( name varchar(20), id bigint );
SET citus.shard_replication_factor TO 1;
SET citus.shard_count TO 2;
SELECT create_distributed_table('articles_hash', 'author_id');
CREATE TABLE authors_reference ( name varchar(20), id bigint );
SELECT create_reference_table('authors_reference');
-- create a bunch of test data
INSERT INTO articles_hash VALUES (1, 1, 'arsenous', 9572), (2, 2, 'abducing', 13642),( 3, 3, 'asternal', 10480),( 4, 4, 'altdorfer', 14551),( 5, 5, 'aruru', 11389),
(6, 6, 'atlases', 15459),(7, 7, 'aseptic', 12298),( 8, 8, 'agatized', 16368),(9, 9, 'alligate', 438),
(10, 10, 'aggrandize', 17277),(11, 1, 'alamo', 1347),(12, 2, 'archiblast', 18185),
(13, 3, 'aseyev', 2255),(14, 4, 'andesite', 19094),(15, 5, 'adversa', 3164),
(16, 6, 'allonym', 2),(17, 7, 'auriga', 4073),(18, 8, 'assembly', 911),(19, 9, 'aubergiste', 4981),
(20, 10, 'absentness', 1820),(21, 1, 'arcading', 5890),(22, 2, 'antipope', 2728),(23, 3, 'abhorring', 6799),
(24, 4, 'audacious', 3637),(25, 5, 'antehall', 7707),(26, 6, 'abington', 4545),(27, 7, 'arsenous', 8616),
(28, 8, 'aerophyte', 5454),(29, 9, 'amateur', 9524),(30, 10, 'andelee', 6363),(31, 1, 'athwartships', 7271),
(32, 2, 'amazon', 11342),(33, 3, 'autochrome', 8180),(34, 4, 'amnestied', 12250),(35, 5, 'aminate', 9089),
(36, 6, 'ablation', 13159),(37, 7, 'archduchies', 9997),(38, 8, 'anatine', 14067),(39, 9, 'anchises', 10906),
(40, 10, 'attemper', 14976),(41, 1, 'aznavour', 11814),(42, 2, 'ausable', 15885),(43, 3, 'affixal', 12723),
(44, 4, 'anteport', 16793),(45, 5, 'afrasia', 864),(46, 6, 'atlanta', 17702),(47, 7, 'abeyance', 1772),
(48, 8, 'alkylic', 18610),(49, 9, 'anyone', 2681),(50, 10, 'anjanette', 19519);
SET citus.task_executor_type TO 'real-time';
SET client_min_messages TO 'DEBUG2';
-- test simple select for a single row
SELECT * FROM articles_hash WHERE author_id = 10 AND id = 50;
-- get all titles by a single author
SELECT title FROM articles_hash WHERE author_id = 10;
-- try ordering them by word count
SELECT title, word_count FROM articles_hash
WHERE author_id = 10
ORDER BY word_count DESC NULLS LAST;
-- look at last two articles by an author
SELECT title, id FROM articles_hash
WHERE author_id = 5
ORDER BY id
LIMIT 2;
-- find all articles by two authors in same shard
-- but the plan is not fast-path router plannable due to
-- two filters on the distribution column in the query
SELECT title, author_id FROM articles_hash
WHERE author_id = 7 OR author_id = 8
ORDER BY author_id ASC, id;
-- having clause is supported if it goes to a single shard
-- and single dist. key on the query
SELECT author_id, sum(word_count) AS corpus_size FROM articles_hash
WHERE author_id = 1
GROUP BY author_id
HAVING sum(word_count) > 1000
ORDER BY sum(word_count) DESC;
-- fast-path planner only supports the = operator
SELECT * FROM articles_hash WHERE author_id <= 1;
SELECT * FROM articles_hash WHERE author_id IN (1, 3);
-- queries with CTEs cannot go through fast-path planning
WITH first_author AS ( SELECT id FROM articles_hash WHERE author_id = 1)
SELECT * FROM first_author;
-- two CTE joins also cannot go through fast-path planning
WITH id_author AS ( SELECT id, author_id FROM articles_hash WHERE author_id = 1),
id_title AS (SELECT id, title from articles_hash WHERE author_id = 1)
SELECT * FROM id_author, id_title WHERE id_author.id = id_title.id;
-- this is a different case where each CTE is recursively planned and those go
-- through the fast-path router planner, but the top level join does not
WITH id_author AS ( SELECT id, author_id FROM articles_hash WHERE author_id = 1),
id_title AS (SELECT id, title from articles_hash WHERE author_id = 2)
SELECT * FROM id_author, id_title WHERE id_author.id = id_title.id;
CREATE TABLE company_employees (company_id int, employee_id int, manager_id int);
SELECT master_create_distributed_table('company_employees', 'company_id', 'hash');
SELECT master_create_worker_shards('company_employees', 4, 1);
INSERT INTO company_employees values(1, 1, 0);
INSERT INTO company_employees values(1, 2, 1);
INSERT INTO company_employees values(1, 3, 1);
INSERT INTO company_employees values(1, 4, 2);
INSERT INTO company_employees values(1, 5, 4);
INSERT INTO company_employees values(3, 1, 0);
INSERT INTO company_employees values(3, 15, 1);
INSERT INTO company_employees values(3, 3, 1);
-- recursive CTEs also cannot go through fast-path
-- planning
WITH RECURSIVE hierarchy as (
SELECT *, 1 AS level
FROM company_employees
WHERE company_id = 1 and manager_id = 0
UNION
SELECT ce.*, (h.level+1)
FROM hierarchy h JOIN company_employees ce
ON (h.employee_id = ce.manager_id AND
h.company_id = ce.company_id AND
ce.company_id = 1))
SELECT * FROM hierarchy WHERE LEVEL <= 2;
WITH update_article AS (
UPDATE articles_hash SET word_count = 10 WHERE id = 1 AND word_count = 9 RETURNING *
)
SELECT * FROM update_article;
WITH delete_article AS (
DELETE FROM articles_hash WHERE id = 1 AND word_count = 10 RETURNING *
)
SELECT * FROM delete_article;
-- grouping sets are supported via fast-path
SELECT
id, substring(title, 2, 1) AS subtitle, count(*)
FROM articles_hash
WHERE author_id = 1
GROUP BY GROUPING SETS ((id),(subtitle))
ORDER BY id, subtitle;
-- grouping sets are not supported with multiple quals
SELECT
id, substring(title, 2, 1) AS subtitle, count(*)
FROM articles_hash
WHERE author_id = 1 or author_id = 2
GROUP BY GROUPING SETS ((id),(subtitle))
ORDER BY id, subtitle;
-- queries which involve functions in FROM clause are not supported via fast path planning
SELECT * FROM articles_hash, position('om' in 'Thomas') WHERE author_id = 1;
-- sublinks are not supported via fast path planning
SELECT * FROM articles_hash
WHERE author_id IN (SELECT author_id FROM articles_hash WHERE author_id = 2)
ORDER BY articles_hash.id;
-- subqueries are not supported via fast path planning
SELECT articles_hash.id,test.word_count
FROM articles_hash, (SELECT id, word_count FROM articles_hash) AS test WHERE test.id = articles_hash.id
ORDER BY test.word_count DESC, articles_hash.id LIMIT 5;
SELECT articles_hash.id,test.word_count
FROM articles_hash, (SELECT id, word_count FROM articles_hash) AS test
WHERE test.id = articles_hash.id and articles_hash.author_id = 1
ORDER BY articles_hash.id;
-- subqueries are not supported in SELECT clause
SELECT a.title AS name, (SELECT a2.id FROM articles_hash a2 WHERE a.id = a2.id LIMIT 1)
AS special_price FROM articles_hash a;
-- simple lookup query just works
SELECT *
FROM articles_hash
WHERE author_id = 1;
-- below query hits a single shard but with multiple filters
-- so cannot go via fast-path
SELECT *
FROM articles_hash
WHERE author_id = 1 OR author_id = 17;
-- rename the output columns
SELECT id as article_id, word_count * id as random_value
FROM articles_hash
WHERE author_id = 1;
-- joins do not go through fast-path planning
SELECT a.author_id as first_author, b.word_count as second_word_count
FROM articles_hash a, articles_hash b
WHERE a.author_id = 10 and a.author_id = b.author_id
LIMIT 3;
-- single shard select with limit goes through fast-path planning
SELECT *
FROM articles_hash
WHERE author_id = 1
LIMIT 3;
-- single shard select with limit + offset goes through fast-path planning
SELECT *
FROM articles_hash
WHERE author_id = 1
LIMIT 2
OFFSET 1;
-- single shard select with limit + offset + order by goes through fast-path planning
SELECT *
FROM articles_hash
WHERE author_id = 1
ORDER BY id desc
LIMIT 2
OFFSET 1;
-- single shard select with group by on non-partition column goes through fast-path planning
SELECT id
FROM articles_hash
WHERE author_id = 1
GROUP BY id
ORDER BY id;
-- single shard select with distinct goes through fast-path planning
SELECT DISTINCT id
FROM articles_hash
WHERE author_id = 1
ORDER BY id;
-- single shard aggregate goes through fast-path planning
SELECT avg(word_count)
FROM articles_hash
WHERE author_id = 2;
-- max, min, sum, count go through fast-path planning
SELECT max(word_count) as max, min(word_count) as min,
sum(word_count) as sum, count(word_count) as cnt
FROM articles_hash
WHERE author_id = 2;
-- queries with aggregates and group by go through fast-path planning
SELECT max(word_count)
FROM articles_hash
WHERE author_id = 1
GROUP BY author_id;
-- set operations are not supported via fast-path planning
SELECT * FROM (
SELECT * FROM articles_hash WHERE author_id = 1
UNION
SELECT * FROM articles_hash WHERE author_id = 3
) AS combination
ORDER BY id;
-- function calls in the target list are supported via fast path
SELECT LEFT(title, 1) FROM articles_hash WHERE author_id = 1;
-- top-level union queries are supported through recursive planning
SET client_min_messages to 'NOTICE';
-- unions in subqueries are not supported via fast-path planning
SELECT * FROM (
(SELECT * FROM articles_hash WHERE author_id = 1)
UNION
(SELECT * FROM articles_hash WHERE author_id = 1)) uu
ORDER BY 1, 2
LIMIT 5;
-- Test various filtering options for router plannable check
SET client_min_messages to 'DEBUG2';
-- cannot go through fast-path if there is
-- explicit coercion
SELECT *
FROM articles_hash
WHERE author_id = 1::bigint;
-- can go through fast-path if there is
-- implicit coercion
-- This doesn't work see the related issue
-- reported https://github.com/citusdata/citus/issues/2605
-- SELECT *
-- FROM articles_hash
-- WHERE author_id = 1.0;
SELECT *
FROM articles_hash
WHERE author_id = 68719476736; -- this is bigint
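-- illustrative check (an addition, not part of the original tests): the
-- parser types an integer literal as integer when it fits into int4 and as
-- bigint when it does not, which is what the "this is bigint" note above
-- relies on; the column aliases are made up for this sketch
SELECT pg_typeof(1) AS small_literal, pg_typeof(68719476736) AS large_literal;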
-- cannot go through fast-path due to
-- multiple filters on the dist. key
SELECT *
FROM articles_hash
WHERE author_id = 1 and author_id >= 1;
-- cannot go through fast-path due to
-- multiple filters on the dist. key
SELECT *
FROM articles_hash
WHERE author_id = 1 or id = 1;
-- goes through fast-path planning because
-- the dist. key is ANDed with the rest of the
-- filters
SELECT *
FROM articles_hash
WHERE author_id = 1 and (id = 1 or id = 41);
-- this time there is an OR clause which prevents
-- router planning at all
SELECT *
FROM articles_hash
WHERE author_id = 1 and id = 1 or id = 41;
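-- illustrative sketch (an addition, not part of the original tests): AND
-- binds tighter than OR, so the query above is parsed like the explicitly
-- parenthesized query below; the dist. key filter is therefore no longer
-- ANDed with the whole WHERE clause
SELECT *
FROM articles_hash
WHERE (author_id = 1 and id = 1) or id = 41;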
-- goes through fast-path planning because
-- the dist. key is ANDed with the rest of the
-- filters
SELECT *
FROM articles_hash
WHERE author_id = 1 and (id = random()::int * 0);
-- not router plannable due to function call on the right side
SELECT *
FROM articles_hash
WHERE author_id = (random()::int * 0 + 1);
-- Citus does not qualify this as a fast-path because
-- dist_key = func()
SELECT *
FROM articles_hash
WHERE author_id = abs(-1);
-- Citus does not qualify this as a fast-path because
-- the filter is X = func(dist_key)
SELECT *
FROM articles_hash
WHERE 1 = abs(author_id);
-- Citus does not qualify this as a fast-path because
-- dist_key = func()
SELECT *
FROM articles_hash
WHERE author_id = abs(author_id - 2);
-- the function is not on the dist. key, so it qualifies as
-- fast-path
SELECT *
FROM articles_hash
WHERE author_id = 1 and (id = abs(id - 2));
-- not router plannable due to is true
SELECT *
FROM articles_hash
WHERE (author_id = 1) is true;
-- router plannable, (boolean expression) = true is collapsed to (boolean expression)
SELECT *
FROM articles_hash
WHERE (author_id = 1) = true;
-- some more complex quals
SELECT count(*) FROM articles_hash WHERE (author_id = 15) AND (id = 1 OR word_count > 5);
SELECT count(*) FROM articles_hash WHERE (author_id = 15) OR (id = 1 AND word_count > 5);
SELECT count(*) FROM articles_hash WHERE (id = 15) OR (author_id = 1 AND word_count > 5);
SELECT count(*) FROM articles_hash WHERE (id = 15) AND (author_id = 1 OR word_count > 5);
SELECT count(*) FROM articles_hash WHERE (id = 15) AND (author_id = 1 AND (word_count > 5 OR id = 2));
SELECT count(*) FROM articles_hash WHERE (id = 15) AND (title ilike 'a%' AND (word_count > 5 OR author_id = 2));
SELECT count(*) FROM articles_hash WHERE (id = 15) AND (title ilike 'a%' AND (word_count > 5 AND author_id = 2));
SELECT count(*) FROM articles_hash WHERE (id = 15) AND (title ilike 'a%' AND ((word_count > 5 OR title ilike 'b%' ) AND (author_id = 2 AND word_count > 50)));
-- fast-path router plannable, between operator is on another column
SELECT *
FROM articles_hash
WHERE (author_id = 1) and id between 0 and 20;
-- fast-path router plannable, partition column expression is and'ed to rest
SELECT *
FROM articles_hash
WHERE (author_id = 1) and (id = 1 or id = 31) and title like '%s';
-- fast-path router plannable, order is changed
SELECT *
FROM articles_hash
WHERE (id = 1 or id = 31) and title like '%s' and (author_id = 1);
-- fast-path router plannable
SELECT *
FROM articles_hash
WHERE (title like '%s' or title like 'a%') and (author_id = 1);
-- fast-path router plannable
SELECT *
FROM articles_hash
WHERE (title like '%s' or title like 'a%') and (author_id = 1) and (word_count < 3000 or word_count > 8000);
-- window functions are supported with fast-path router plannable
SELECT LAG(title, 1) over (ORDER BY word_count) prev, title, word_count
FROM articles_hash
WHERE author_id = 5;
SELECT LAG(title, 1) over (ORDER BY word_count) prev, title, word_count
FROM articles_hash
WHERE author_id = 5
ORDER BY word_count DESC;
SELECT id, MIN(id) over (order by word_count)
FROM articles_hash
WHERE author_id = 1;
SELECT id, word_count, AVG(word_count) over (order by word_count)
FROM articles_hash
WHERE author_id = 1;
SELECT word_count, rank() OVER (PARTITION BY author_id ORDER BY word_count)
FROM articles_hash
WHERE author_id = 1;
-- some more tests on complex target lists
SELECT DISTINCT ON (author_id, id) author_id, id,
MIN(id) over (order by avg(word_count)) * AVG(id * 5.2 + (1.0/max(word_count))) over (order by max(word_count)) as t1,
count(*) FILTER (WHERE title LIKE 'al%') as cnt_with_filter,
count(*) FILTER (WHERE '0300030' LIKE '%3%') as cnt_with_filter_2,
avg(case when id > 2 then char_length(word_count::text) * (id * strpos(word_count::text, '1')) end) as case_cnt,
COALESCE(strpos(avg(word_count)::text, '1'), 20)
FROM articles_hash as aliased_table
WHERE author_id = 1
GROUP BY author_id, id
HAVING count(DISTINCT title) > 0
ORDER BY author_id, id, sum(word_count) - avg(char_length(title)) DESC, COALESCE(array_upper(ARRAY[max(id)],1) * 5,0) DESC;
-- where false queries are router plannable but not fast-path
SELECT *
FROM articles_hash
WHERE false;
-- fast-path with false
SELECT *
FROM articles_hash
WHERE author_id = 1 and false;
-- fast-path with false
SELECT *
FROM articles_hash
WHERE author_id = 1 and 1=0;
SELECT *
FROM articles_hash
WHERE null and author_id = 1;
-- we cannot qualify dist_key = X operator Y via
-- fast-path planning
SELECT *
FROM articles_hash
WHERE author_id = 1 + 1;
-- where false with immutable function returning false
-- goes through fast-path
SELECT *
FROM articles_hash a
WHERE a.author_id = 10 and int4eq(1, 2);
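-- illustrative check (an addition, not part of the original tests):
-- evaluated on its own, the function call used in the filter above
-- returns false
SELECT int4eq(1, 2);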
-- partition_column is null clause does not prune out any shards,
-- all shards remain after shard pruning, not router plannable
-- not fast-path router either
SELECT *
FROM articles_hash a
WHERE a.author_id is null;
-- partition_column equals to null clause prunes out all shards
-- no shards after shard pruning, router plannable
-- not fast-path router either
SELECT *
FROM articles_hash a
WHERE a.author_id = null;
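-- illustrative check (an addition, not part of the original tests): with
-- the default transform_null_equals setting, comparing a value to NULL
-- with = yields NULL rather than true, which is why the filter above
-- cannot match any row; the column alias is made up for this sketch
SELECT (1 = NULL) IS NULL AS equals_null_is_null;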
-- union/difference/intersection with where false
-- this query was not originally router plannable, addition of 1=0
-- makes it router plannable but not fast-path
SELECT * FROM (
SELECT * FROM articles_hash WHERE author_id = 1
UNION
SELECT * FROM articles_hash WHERE author_id = 2 and 1=0
) AS combination
ORDER BY id;
-- same as the above, but with WHERE false
SELECT * FROM (
SELECT * FROM articles_hash WHERE author_id = 1
UNION
SELECT * FROM articles_hash WHERE author_id = 2 and 1=0
) AS combination WHERE false
ORDER BY id;
-- window functions with where false
SELECT word_count, rank() OVER (PARTITION BY author_id ORDER BY word_count)
FROM articles_hash
WHERE author_id = 1 and 1=0;
-- create a dummy function to be used in filtering
CREATE OR REPLACE FUNCTION someDummyFunction(regclass)
RETURNS text AS
$$
BEGIN
RETURN md5($1::text);
END;
$$ LANGUAGE 'plpgsql' IMMUTABLE;
-- fast path router plannable, but errors
SELECT * FROM articles_hash
WHERE
someDummyFunction('articles_hash') = md5('articles_hash') AND author_id = 1
ORDER BY
author_id, id
LIMIT 5;
-- temporarily turn off debug messages before dropping the function
SET client_min_messages TO 'NOTICE';
DROP FUNCTION someDummyFunction(regclass);
SET client_min_messages TO 'DEBUG2';
-- complex query hitting a single shard and a fast-path
SELECT
count(DISTINCT CASE
WHEN
word_count > 100
THEN
id
ELSE
NULL
END) as c
FROM
articles_hash
WHERE
author_id = 5;
-- queries inside transactions can be fast-path router plannable
BEGIN;
SELECT *
FROM articles_hash
WHERE author_id = 1
ORDER BY id;
END;
-- queries inside read-only transactions can be fast-path router plannable
BEGIN;
SET TRANSACTION READ ONLY;
SELECT *
FROM articles_hash
WHERE author_id = 1
ORDER BY id;
END;
-- cursor queries are fast-path router plannable
BEGIN;
DECLARE test_cursor CURSOR FOR
SELECT *
FROM articles_hash
WHERE author_id = 1
ORDER BY id;
FETCH test_cursor;
FETCH ALL test_cursor;
FETCH test_cursor; -- fetch one row after the last
FETCH BACKWARD test_cursor;
END;
-- queries inside copy can be router plannable
COPY (
SELECT *
FROM articles_hash
WHERE author_id = 1
ORDER BY id) TO STDOUT;
-- queries inside CREATE TABLE AS can be fast-path router plannable
CREATE TEMP TABLE temp_articles_hash as
SELECT *
FROM articles_hash
WHERE author_id = 1
ORDER BY id;
-- fast-path router plannable queries may include filters for aggregates
SELECT count(*), count(*) FILTER (WHERE id < 3)
FROM articles_hash
WHERE author_id = 1;
-- prepare queries can be router plannable
PREPARE author_1_articles as
SELECT *
FROM articles_hash
WHERE author_id = 1;
EXECUTE author_1_articles;
EXECUTE author_1_articles;
EXECUTE author_1_articles;
EXECUTE author_1_articles;
EXECUTE author_1_articles;
EXECUTE author_1_articles;
-- parametric prepare queries can be router plannable
PREPARE author_articles(int) as
SELECT *
FROM articles_hash
WHERE author_id = $1;
EXECUTE author_articles(1);
EXECUTE author_articles(1);
EXECUTE author_articles(1);
EXECUTE author_articles(1);
EXECUTE author_articles(1);
EXECUTE author_articles(1);
-- queries inside plpgsql functions could be router plannable
CREATE OR REPLACE FUNCTION author_articles_max_id() RETURNS int AS $$
DECLARE
max_id integer;
BEGIN
SELECT MAX(id) FROM articles_hash ah
WHERE author_id = 1
into max_id;
return max_id;
END;
$$ LANGUAGE plpgsql;
-- we don't want too many details. Though we're omitting
-- "DETAIL: distribution column value:", we find it acceptable
-- since the query results verify the correctness
\set VERBOSITY terse
SELECT author_articles_max_id();
SELECT author_articles_max_id();
SELECT author_articles_max_id();
SELECT author_articles_max_id();
SELECT author_articles_max_id();
SELECT author_articles_max_id();
-- queries inside plpgsql functions could be router plannable
CREATE OR REPLACE FUNCTION author_articles_max_id(int) RETURNS int AS $$
DECLARE
max_id integer;
BEGIN
SELECT MAX(id) FROM articles_hash ah
WHERE author_id = $1
into max_id;
return max_id;
END;
$$ LANGUAGE plpgsql;
SELECT author_articles_max_id(1);
SELECT author_articles_max_id(1);
SELECT author_articles_max_id(1);
SELECT author_articles_max_id(1);
SELECT author_articles_max_id(1);
SELECT author_articles_max_id(1);
-- check that queries in functions returning setof are router plannable
CREATE OR REPLACE FUNCTION author_articles_id_word_count() RETURNS TABLE(id bigint, word_count int) AS $$
DECLARE
BEGIN
RETURN QUERY
SELECT ah.id, ah.word_count
FROM articles_hash ah
WHERE author_id = 1;
END;
$$ LANGUAGE plpgsql;
SELECT * FROM author_articles_id_word_count();
SELECT * FROM author_articles_id_word_count();
SELECT * FROM author_articles_id_word_count();
SELECT * FROM author_articles_id_word_count();
SELECT * FROM author_articles_id_word_count();
SELECT * FROM author_articles_id_word_count();
-- check that queries in functions returning setof are router plannable
CREATE OR REPLACE FUNCTION author_articles_id_word_count(int) RETURNS TABLE(id bigint, word_count int) AS $$
DECLARE
BEGIN
RETURN QUERY
SELECT ah.id, ah.word_count
FROM articles_hash ah
WHERE author_id = $1;
END;
$$ LANGUAGE plpgsql;
SELECT * FROM author_articles_id_word_count(1);
SELECT * FROM author_articles_id_word_count(1);
SELECT * FROM author_articles_id_word_count(1);
SELECT * FROM author_articles_id_word_count(1);
SELECT * FROM author_articles_id_word_count(1);
SELECT * FROM author_articles_id_word_count(1);
\set VERBOSITY default
-- insert .. select via coordinator could also
-- use fast-path queries
PREPARE insert_sel(int, int) AS
INSERT INTO articles_hash
SELECT * FROM articles_hash WHERE author_id = $2 AND word_count = $1 OFFSET 0;
EXECUTE insert_sel(1,1);
EXECUTE insert_sel(1,1);
EXECUTE insert_sel(1,1);
EXECUTE insert_sel(1,1);
EXECUTE insert_sel(1,1);
EXECUTE insert_sel(1,1);
-- one final interesting prepared statement
-- where one of the filters is on the target list
PREPARE fast_path_agg_filter(int, int) AS
SELECT
count(*) FILTER (WHERE word_count=$1)
FROM
articles_hash
WHERE author_id = $2;
EXECUTE fast_path_agg_filter(1,1);
EXECUTE fast_path_agg_filter(2,2);
EXECUTE fast_path_agg_filter(3,3);
EXECUTE fast_path_agg_filter(4,4);
EXECUTE fast_path_agg_filter(5,5);
EXECUTE fast_path_agg_filter(6,6);
-- views internally become subqueries, so not fast-path router query
CREATE VIEW test_view AS
SELECT * FROM articles_hash WHERE author_id = 1;
SELECT * FROM test_view;
-- materialized views can be created for fast-path router plannable queries
CREATE MATERIALIZED VIEW mv_articles_hash_empty AS
SELECT * FROM articles_hash WHERE author_id = 1;
SELECT * FROM mv_articles_hash_empty;
-- fast-path router planner/executor is enabled for task-tracker executor
SET citus.task_executor_type to 'task-tracker';
SELECT id
FROM articles_hash
WHERE author_id = 1;
-- insert query is router plannable even under task-tracker
INSERT INTO articles_hash VALUES (51, 1, 'amateus', 1814), (52, 1, 'second amateus', 2824);
-- verify insert is successful (not router plannable and executable)
SELECT id
FROM articles_hash
WHERE author_id = 1;
SET client_min_messages to 'NOTICE';
-- finally, some tests with partitioned tables
CREATE TABLE collections_list (
key bigint,
ts timestamptz,
collection_id integer,
value numeric
) PARTITION BY LIST (collection_id );
CREATE TABLE collections_list_1
PARTITION OF collections_list (key, ts, collection_id, value)
FOR VALUES IN ( 1 );
CREATE TABLE collections_list_2
PARTITION OF collections_list (key, ts, collection_id, value)
FOR VALUES IN ( 2 );
-- we don't need many shards
SET citus.shard_count TO 2;
SELECT create_distributed_table('collections_list', 'key');
INSERT INTO collections_list SELECT i % 10, now(), (i % 2) + 1, i*i FROM generate_series(0, 50)i;
SET client_min_messages to 'DEBUG2';
SELECT count(*) FROM collections_list WHERE key = 4;
SELECT count(*) FROM collections_list_1 WHERE key = 4;
SELECT count(*) FROM collections_list_2 WHERE key = 4;
UPDATE collections_list SET value = 15 WHERE key = 4;
SELECT count(*) FILTER (where value = 15) FROM collections_list WHERE key = 4;
SELECT count(*) FILTER (where value = 15) FROM collections_list_1 WHERE key = 4;
SELECT count(*) FILTER (where value = 15) FROM collections_list_2 WHERE key = 4;
SET client_min_messages to 'NOTICE';
DROP FUNCTION author_articles_max_id();
DROP FUNCTION author_articles_id_word_count();
DROP MATERIALIZED VIEW mv_articles_hash_empty;
DROP MATERIALIZED VIEW mv_articles_hash_data;
DROP TABLE articles_hash;
DROP TABLE authors_hash;
DROP TABLE authors_range;
DROP TABLE authors_reference;
DROP TABLE company_employees;
DROP TABLE articles_range;
DROP TABLE articles_append;
DROP TABLE collections_list;
RESET search_path;
DROP SCHEMA fast_path_router_select CASCADE;


@ -1,6 +1,10 @@
SET citus.next_shard_id TO 850000;
-- many of the tests in this file are intended for testing the non-fast-path
-- router planner, so we explicitly disable fast-path planning in this file.
-- We have a bunch of other tests that trigger the fast-path router planner
SET citus.enable_fast_path_router_planner TO false;
-- ===================================================================
-- test end-to-end query functionality
@ -302,4 +306,12 @@ SELECT * FROM articles TABLESAMPLE BERNOULLI (0) WHERE author_id = 1;
SELECT * FROM articles TABLESAMPLE SYSTEM (100) WHERE author_id = 1 ORDER BY id;
SELECT * FROM articles TABLESAMPLE BERNOULLI (100) WHERE author_id = 1 ORDER BY id;
-- test tablesample with fast path as well
SET citus.enable_fast_path_router_planner TO true;
SELECT * FROM articles TABLESAMPLE SYSTEM (0) WHERE author_id = 1;
SELECT * FROM articles TABLESAMPLE BERNOULLI (0) WHERE author_id = 1;
SELECT * FROM articles TABLESAMPLE SYSTEM (100) WHERE author_id = 1 ORDER BY id;
SELECT * FROM articles TABLESAMPLE BERNOULLI (100) WHERE author_id = 1 ORDER BY id;
SET client_min_messages to 'NOTICE';