citus

History

Onder Kalaci f144bb4911 Introduce fast path router planning	2019-02-21 13:27:01 +03:00
regress	Introduce fast path router planning	2019-02-21 13:27:01 +03:00

Onder Kalaci f144bb4911 Introduce fast path router planning

In this context, we define "Fast Path Planning for SELECT" as trivial
queries where Citus can skip relying on standard_planner() and
handle all the planning itself.

For the router planner, standard_planner() is mostly important for
generating the necessary restriction information. Later, that restriction
information is used to decide whether all the shards that a distributed
query touches reside on a single worker node. However, standard_planner()
also does a lot of extra work, such as cost estimation and execution path
generation, which is completely unnecessary in the context of distributed
planning.

There are certain types of queries where Citus could skip relying on
standard_planner() to generate the restriction information. For queries
in the following format, Citus does not need any information that the
standard_planner() generates:

  SELECT ... FROM single_table WHERE distribution_key = X;  or
  DELETE FROM single_table WHERE distribution_key = X; or
  UPDATE single_table SET value_1 = value_2 + 1 WHERE distribution_key = X;

Note that the queries need not be as simple as the ones above: GROUP BY,
WINDOW FUNCTIONS, ORDER BY, HAVING etc. are all acceptable. The only rule
is that the query is on a single distributed (or reference) table and
there is a "distribution_key = X" filter in the WHERE clause. With that
filter, we can decide which shard the distributed query touches and on
which worker node that shard resides.
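
The rule above can be illustrated with a small sketch. This is not the
Citus implementation (which is written in C and works on PostgreSQL
parse trees); the Query and Equality classes, the fast_path_filter()
function and the distribution_columns mapping are all hypothetical names
invented for this example. It also assumes that the WHERE clause has
already been reduced to a list of top-level, AND-ed "column = constant"
conditions, and it leaves out reference tables.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Equality:
    """Hypothetical WHERE-clause conjunct of the form `column = constant`."""
    column: str
    constant: object


@dataclass
class Query:
    """Toy stand-in for a parsed query (not a PostgreSQL Query node)."""
    command: str            # "SELECT", "UPDATE" or "DELETE"
    tables: List[str]       # relations the query reads or modifies
    where: List[Equality]   # top-level AND-ed equality filters


def fast_path_filter(query: Query,
                     distribution_columns: Dict[str, str]) -> Optional[Equality]:
    """Return the `distribution_key = X` filter if the query qualifies
    for the fast path in this simplified model, otherwise None."""
    if query.command not in ("SELECT", "UPDATE", "DELETE"):
        return None
    # The query must be on exactly one table.
    if len(query.tables) != 1:
        return None
    # Assumption: the mapping lists only distributed tables.
    dist_col = distribution_columns.get(query.tables[0])
    if dist_col is None:
        return None
    # Look for an equality filter on the distribution key.
    for cond in query.where:
        if cond.column == dist_col:
            return cond
    return None


dist = {"single_table": "distribution_key"}
simple = Query("SELECT", ["single_table"], [Equality("distribution_key", 5)])
join = Query("SELECT", ["single_table", "other_table"],
             [Equality("distribution_key", 5)])
print(fast_path_filter(simple, dist) is not None)
print(fast_path_filter(join, dist) is not None)
```

In this model the returned filter is the only piece of information needed
to pick the shard, which is why nothing from standard_planner() is
required for such queries.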
