Fixes issue #258
Prior to this change, Citus gives a deceptive NOTICE message when a query
including ANY or ALL on a non-partition column is issued on a hash
partitioned table.
Let the github_events table be hash-distributed on repo_id column. Then,
issuing this query:
SELECT count(*) FROM github_events WHERE event_id = ANY ('{1,2,3}')
Gives this message:
NOTICE: cannot use shard pruning with ANY (array expression)
HINT: Consider rewriting the expression with OR clauses.
Note that since event_id is not the partition column, shard pruning would
not be applied in any case. However, the NOTICE message would be valid
and be given if the ANY clause would have been applied on repo_id column.
Reviewer: Murat Tuncer
The previous form of the test, utilizing DEBUG2, included too much
output dependent on the specifc system and version. Reformulate it to
explicitly connect to workers and show the schema there, when necessary.
The only remaining difference in some of the remaining alternate
regression test files was due to an older minor version release
change. Remove those as well.
There already exist tests that locally embed knowledge about port
numbers, and there's more tests requiring that. Instead of copying
\set's to several tests, make these port number variables available to
all tests.
The default staging policy is now round-robin, though tests were still
configured to use local-first. Testing with the shipping default seems
like the best option, correctness-wise, and since local-first has some
issues with OSes where connecting from localhost doesn't always resolve
to 'localhost', just going with the default is a win-win.
After this change, shards and associated metadata are automatically
dropped when running DROP TABLE on a distributed table, which fixes#230.
It also adds schema support for master_apply_delete_command, which
fixes#73.
Dropping the shards happens in the master_drop_all_shards UDF, which is
called from the SQL_DROP trigger. Inside the trigger, the table is no
longer visible and calling master_apply_delete_command directly wouldn't
work and oid <-> name mappings are not available. The
master_drop_all_shards function therefore takes the relation id, schema
name, and table name as parameters, which can be obtained from
pg_event_trigger_dropped_objects() in the SQL_DROP trigger. If the user
calls master_drop_all_shards while the table still exists, the schema
name and table name are ignored.
Author: Marco Slot
Reviewed-By: Andres Freund
All citusdb references in
- extension, binary names
- file headers
- all configuration name prefixes
- error/warning messages
- some functions names
- regression tests
are changed to be citus.
This entirely removes any restriction on the type of partitioning
during DML planning and execution. Though there aren't actually any
technical limitations preventing DML commands against append- (or even
range-) partitioned tables, we had initially forbidden this, as any
future stage operation could cause shards to overlap, banning all
subsequent DML operations to partition values contained within more
than one shards. This ended up mostly restricting us, so we're now
removing that restriction.
When two data types have the same binary representation, PostgreSQL may
add an implicit coercion between them by wrapping a node in a relabel
type. This wrapper signals that the wrapped value is completely binary
compatible with the designated "final type" of the relabel node. As an
example, the varchar type is often relabeled to text, since functions
provided for use with text (comparisons, hashes, etc.) are completely
compatible with varchar as well.
The hash-partitioned codepath contains functions that verify queries
actually contain an equality constraint on the partition column, but
those functions expect such constraints to be comparison operations
between a Var and Const. The RelabelType wrapper node causes these
functions to always return false, which bypasses shard pruning.