citus/src/test/regress/sql
Andres Freund bb456d4002
Faster shard pruning.
So far citus used postgres' predicate proofing logic for shard
pruning, except for INSERT and COPY which were already optimized for
speed.  That turns out to be too slow:
* Shard pruning for SELECTs is currently O(#shards), because
  PruneShardList calls predicate_refuted_by() for every
  shard. Obviously using an O(N) type algorithm for general pruning
  isn't good.
* predicate_refuted_by() is quite expensive in its own right. That's
  primarily because it's optimized for doing a single refutation
  proof, rather than performing the same proof over and over.
* predicate_refuted_by() does not keep persistent state (see the previous point) for
  function calls, which means that a lot of syscache lookups will be
  performed. That's particularly bad if the partitioning key is a
  composite key, because without a persistent FunctionCallInfo
  record_cmp() has to repeatedly look up the type definition of the
  composite key. That's quite expensive.

Thus replace this with custom code that works in two phases:
1) Search restrictions for constraints that can be pruned upon
2) Use those restrictions to search for matching shards in the most
   efficient manner available:
   a) Binary search / Hash Lookup in case of hash partitioned tables
   b) Binary search for equal clauses in case of range or append
      tables without overlapping shards.
   c) Binary search for inequality clauses, searching for both lower
      and upper boundaries, again in case of range or append
      tables without overlapping shards.
   d) Exhaustive search testing each ShardInterval
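
The second phase can be illustrated with a minimal sketch. This is not the Citus implementation (which is written in C); it is a Python model under stated assumptions: phase 1 has already reduced the restrictions to an optional inclusive lower and upper bound on the partition key, shard boundaries are plain integers, and the names ShardInterval, ShardIndex and prune are hypothetical. It covers cases b) to d); the hash lookup of case a) is left out.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ShardInterval:
    """Hypothetical stand-in for a shard: inclusive [min_value, max_value]."""
    shard_id: int
    min_value: int
    max_value: int


class ShardIndex:
    """Sorted shard boundaries, built once and reused for every pruning call."""

    def __init__(self, shards: List[ShardInterval]):
        self.shards = sorted(shards, key=lambda s: s.min_value)
        self.mins = [s.min_value for s in self.shards]
        self.maxes = [s.max_value for s in self.shards]
        # Binary search is only valid when no two neighbours overlap.
        self.overlapping = any(
            a.max_value >= b.min_value
            for a, b in zip(self.shards, self.shards[1:])
        )

    def prune(self, lower: Optional[int] = None,
              upper: Optional[int] = None) -> List[ShardInterval]:
        """Return shards that may hold keys in [lower, upper]; None = unbounded."""
        if self.overlapping:
            # Case d): exhaustive search testing each interval.
            return [
                s for s in self.shards
                if (lower is None or s.max_value >= lower)
                and (upper is None or s.min_value <= upper)
            ]
        if lower is not None and lower == upper:
            # Case b): equality clause, one binary search.
            i = bisect_left(self.maxes, lower)  # first shard with max >= value
            if i < len(self.shards) and self.shards[i].min_value <= lower:
                return [self.shards[i]]
            return []
        # Case c): inequality clauses, search both boundaries.
        lo = 0 if lower is None else bisect_left(self.maxes, lower)
        hi = len(self.shards) if upper is None else bisect_right(self.mins, upper)
        return self.shards[lo:hi]


index = ShardIndex([
    ShardInterval(1, 1, 10),
    ShardInterval(2, 11, 20),
    ShardInterval(3, 21, 30),
])
print([s.shard_id for s in index.prune(lower=15, upper=15)])
print([s.shard_id for s in index.prune(lower=12)])
```

Building the sorted boundary lists once and reusing them is what makes each lookup logarithmic in the number of shards; rebuilding them per query would bring back the O(#shards) cost described above.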

My measurements suggest that we are considerably, often orders of
magnitude, faster than the previous solution, even if we have to fall
back to exhaustive pruning.
2017-06-06 15:58:05 -06:00
.gitignore Add Regression Tests For Querying MX Tables from Workers 2017-01-24 10:36:59 +03:00
multi_agg_approximate_distinct.sql Add support for filters 2016-12-01 08:53:46 +03:00
multi_array_agg.sql Support PostgreSQL 9.6 2016-10-18 16:23:55 -06:00
multi_average_expression.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_basic_queries.sql Add LIMIT/OFFSET Support 2016-07-18 12:00:24 +03:00
multi_binary_master_copy_format.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_citus_tools.sql Add citus tools to default configuration 2017-01-10 17:53:27 +03:00
multi_cluster_management.sql Fix CloseNodeConnections to actually close connections 2017-01-11 01:13:58 +02:00
multi_colocated_shard_transfer.sql Improve regression tests for multi_colocated_shard_transfer 2016-12-20 14:09:35 +02:00
multi_colocation_utils.sql Modify tests to create clean workspace 2017-01-05 12:22:44 +03:00
multi_complex_expressions.sql Add support for filters 2016-12-01 08:53:46 +03:00
multi_count_type_conversion.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_create_fdw.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_create_insert_proxy.sql Add support for prepared statements with parameterized non-partition columns in router executor 2016-07-21 11:09:28 +03:00
multi_create_shards.sql Add syscols in queries; extend relnames in indexes 2016-09-07 11:54:55 -05:00
multi_create_table.sql Add replication_model GUC 2017-01-23 09:05:14 -07:00
multi_create_table_constraints.sql Always CASCADE while dropping a shard 2016-11-01 10:16:34 +01:00
multi_data_types.sql Add DistTableCacheEntry->shardValueCompareFunction. 2017-06-06 15:58:05 -06:00
multi_deparse_shard_query.sql Add ability to reorder target list for INSERT/SELECT queries 2016-10-26 10:00:03 +03:00
multi_distribution_metadata.sql Refactor get_shard_id_for_distribution_column() and other minor changes 2017-01-20 14:38:01 +02:00
multi_drop_extension.sql Switch from pg_worker_list.conf file to pg_dist_node metadata table. 2016-10-05 13:01:35 +03:00
multi_dropped_column_aliases.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_expire_table_cache.sql Add UDF master_expire_table_cache 2016-09-28 12:08:37 +03:00
multi_explain.sql Hack up PREPARE/EXECUTE for nearly all distributed queries. 2017-01-23 09:23:50 -08:00
multi_extension.sql Add worker_hash() and a stub for isolate_tenant_to_new_shard() 2017-01-20 14:38:01 +02:00
multi_foreign_key.sql Bugfix for creating foreign key 2017-02-07 09:34:24 +02:00
multi_function_evaluation.sql Evaluate functions on the master 2016-07-13 11:45:51 -07:00
multi_generate_ddl_commands.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_hash_pruning.sql Make router planner active at all times 2016-12-20 11:24:01 +03:00
multi_index_statements.sql Avoid error during CREATE INDEX IF NOT EXISTS 2016-11-01 14:51:19 -07:00
multi_insert_select.sql Fix pushing down wrong INSERT ... SELECT queries 2017-05-04 11:17:22 -07:00
multi_join_order_additional.sql Expand router planner coverage 2016-07-27 23:35:38 +03:00
multi_join_order_tpch_large.sql Replace verb 'stage' with 'load' in test comments 2016-08-22 13:24:18 -06:00
multi_join_order_tpch_small.sql Add EXPLAIN for simple distributed queries 2016-04-30 00:11:02 +02:00
multi_join_pruning.sql Fix segmentation fault in case of joins with WHERE 1=0 2016-09-26 15:12:29 +02:00
multi_large_table_join_planning.sql Remove variant files 2016-06-13 12:12:06 +03:00
multi_large_table_pruning.sql Fix segmentation fault in case of joins with WHERE 1=0 2016-09-26 15:12:29 +02:00
multi_large_table_task_assignment.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_limit_clause.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_limit_clause_approximate.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_master_protocol.sql Fix Travis local_first_candidate_nodes failures 2016-08-14 23:12:10 -06:00
multi_metadata_access.sql GRANT SELECT access for metadata tables to public 2016-12-23 16:32:47 +03:00
multi_metadata_sync.sql Fix dependent tests 2017-01-25 19:19:39 +03:00
multi_modifications.sql Feature: INSERT INTO ... SELECT 2016-10-26 10:01:00 +03:00
multi_modifying_xacts.sql Add copy failure tests inside transactions 2017-01-26 11:54:40 +03:00
multi_mx_create_table.sql Add Regression Tests For Querying MX Tables from Workers 2017-01-24 10:36:59 +03:00
multi_mx_ddl.sql Use coordinator instead of schema node in terminology 2017-01-25 11:07:23 +01:00
multi_mx_explain.sql Add Regression Tests For Querying MX Tables from Workers 2017-01-24 10:36:59 +03:00
multi_mx_metadata.sql Use coordinator instead of schema node in terminology 2017-01-25 11:07:23 +01:00
multi_mx_modifications.sql Use coordinator instead of schema node in terminology 2017-01-25 11:07:23 +01:00
multi_mx_modifying_xacts.sql Use coordinator instead of schema node in terminology 2017-01-25 11:07:23 +01:00
multi_mx_reference_table.sql Add Regression Tests For Querying MX Tables from Workers 2017-01-24 10:36:59 +03:00
multi_mx_repartition_join_w1.sql Add Regression Tests For Querying MX Tables from Workers 2017-01-24 10:36:59 +03:00
multi_mx_repartition_join_w2.sql Add Regression Tests For Querying MX Tables from Workers 2017-01-24 10:36:59 +03:00
multi_mx_repartition_udt_prepare.sql Add Regression Tests For Querying MX Tables from Workers 2017-01-24 10:36:59 +03:00
multi_mx_repartition_udt_w1.sql Add Regression Tests For Querying MX Tables from Workers 2017-01-24 10:36:59 +03:00
multi_mx_repartition_udt_w2.sql Add Regression Tests For Querying MX Tables from Workers 2017-01-24 10:36:59 +03:00
multi_mx_router_planner.sql Add Regression Tests For Querying MX Tables from Workers 2017-01-24 10:36:59 +03:00
multi_mx_schema_support.sql Add Regression Tests For Querying MX Tables from Workers 2017-01-24 10:36:59 +03:00
multi_mx_tpch_query1.sql Use coordinator instead of schema node in terminology 2017-01-25 11:07:23 +01:00
multi_mx_tpch_query3.sql Use coordinator instead of schema node in terminology 2017-01-25 11:07:23 +01:00
multi_mx_tpch_query6.sql Use coordinator instead of schema node in terminology 2017-01-25 11:07:23 +01:00
multi_mx_tpch_query7.sql Use coordinator instead of schema node in terminology 2017-01-25 11:07:23 +01:00
multi_mx_tpch_query7_nested.sql Use coordinator instead of schema node in terminology 2017-01-25 11:07:23 +01:00
multi_mx_tpch_query10.sql Add Regression Tests For Querying MX Tables from Workers 2017-01-24 10:36:59 +03:00
multi_mx_tpch_query12.sql Use coordinator instead of schema node in terminology 2017-01-25 11:07:23 +01:00
multi_mx_tpch_query14.sql Use coordinator instead of schema node in terminology 2017-01-25 11:07:23 +01:00
multi_mx_tpch_query19.sql Use coordinator instead of schema node in terminology 2017-01-25 11:07:23 +01:00
multi_name_lengths.sql Provides safe, idempotent shard-extended names to any object name 2016-10-03 17:02:34 -04:00
multi_null_minmax_value_pruning.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_partition_pruning.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_prepare_plsql.sql Don't change query tree of DDL commands 2017-05-04 15:05:47 -07:00
multi_prepare_sql.sql Hack up PREPARE/EXECUTE for nearly all distributed queries. 2017-01-23 09:23:50 -08:00
multi_prune_shard_list.sql Faster shard pruning. 2017-06-06 15:58:05 -06:00
multi_query_directory_cleanup.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_reference_table.sql Improve error messages for INSERT INTO .. SELECT 2017-01-16 12:16:14 -07:00
multi_remove_node_reference_table.sql Fix dependent tests 2017-01-25 19:19:39 +03:00
multi_repair_shards.sql Disallow SendCommandListToWorkerInSingleTransaction when modifications have occurred 2016-11-02 12:26:56 +01:00
multi_repartition_udt.sql During repartitions, the partitionColumnType argument sent to workers 2016-10-03 13:41:20 -04:00
multi_repartitioned_subquery_udf.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_replicate_reference_table.sql Convert DropShards to use new connection API 2017-01-23 21:08:41 +03:00
multi_router_planner.sql Refactor CheckShardPlacements 2017-01-26 13:20:45 +02:00
multi_schema_support.sql Add ORDER BY clause to shard state tests to have consistent output 2017-01-13 02:42:28 +03:00
multi_shard_modify.sql Error on Unsupported Features on Workers 2017-01-02 16:03:45 +03:00
multi_simple_queries.sql Support PostgreSQL 9.6 2016-10-18 16:23:55 -06:00
multi_single_relation_subquery.sql Add HAVING support 2016-10-13 15:47:53 +03:00
multi_sql_function.sql Add regression tests for parameterized queries 2016-10-18 14:02:50 +03:00
multi_table_ddl.sql Convert DropShards to use new connection API 2017-01-23 21:08:41 +03:00
multi_task_assignment_policy.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_tpch_query1.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_tpch_query3.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_tpch_query6.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_tpch_query7.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_tpch_query7_nested.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_tpch_query10.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_tpch_query12.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_tpch_query14.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_tpch_query19.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_transaction_recovery.sql Enable transaction recovery in connection API 2016-12-23 16:14:29 +01:00
multi_transactional_drop_shards.sql Add ORDER BY to some tests to have consistent output 2017-01-25 11:43:25 +02:00
multi_truncate.sql Convert DropShards to use new connection API 2017-01-23 21:08:41 +03:00
multi_unsupported_worker_operations.sql Allow dropping sequences on mx workers 2017-01-31 14:51:44 -08:00
multi_upgrade_reference_table.sql Add replication_model GUC 2017-01-23 09:05:14 -07:00
multi_upsert.sql Remove references to 9.4 2016-09-29 17:35:19 -06:00
multi_utilities.sql Add worker_hash() and a stub for isolate_tenant_to_new_shard() 2017-01-20 14:38:01 +02:00
multi_utility_statements.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
multi_utility_warnings.sql Remove warnings on schema creation 2016-07-22 18:24:23 +03:00
multi_verify_no_subquery.sql Add LIMIT/OFFSET Support 2016-07-18 12:00:24 +03:00
multi_view.sql Add view support 2017-01-13 09:39:42 +03:00
multi_working_columns.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
task_tracker_assign_task.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
task_tracker_cleanup_job.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
task_tracker_create_table.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
task_tracker_partition_task.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
worker_binary_data_partition.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
worker_check_invalid_arguments.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
worker_create_table.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
worker_hash_partition.sql During repartitions, the partitionColumnType argument sent to workers 2016-10-03 13:41:20 -04:00
worker_hash_partition_complex.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
worker_merge_hash_files.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
worker_merge_range_files.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
worker_null_data_partition.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00
worker_range_partition.sql Use single-quote interpolation in partition test 2016-10-10 13:03:43 -06:00
worker_range_partition_complex.sql Set Explicit ShardId/JobId In Regression Tests 2016-06-07 14:32:44 +03:00