These functions are holdovers from pg_shard and were created for unit
testing c-level functions (like InsertShardPlacementRow) which our
regression tests already test quite effectively. Removing because it
makes refactoring the signatures of those c-level functions
unnecessarily difficult.
- create_healthy_local_shard_placement_row
- update_shard_placement_row_state
- delete_shard_placement_row
Uncrustify 0.65 appears to have changed some defaults, resulting in
breakages for those of us who have already upgraded; Travis still uses
Uncrustify 0.64, but these changes work with both versions (assuming
appropriately updated config), so this should permit use of either
version for the time being.
Before this change, we used ShareLock to acquire lock on distributed tables while
running VACUUM. This makes VACUUM and INSERT block each other. With this change we
changed lock mode from ShareLock to ShareUpdateExclusiveLock, which does not conflict
with the locks INSERT acquire.
MasterIrreducibleExpressionWalker has a copied code from
function check_functions_in_node() which was available with
PG 9.6+. Now PG 9.5 support is dropped we can remove
duplicate code and directly call check_functions_in_node().
Previously we used ForgetResults() in StartRemoteTransactionAbort() -
that's problematic because there might still be an ongoing statement,
and this causes us to wait for its completion. That e.g. happens when
a statement running on the coordinator is cancelled.
That's important because the currently running statement on a worker
might continue to hold locks and consume resources, even after the
connection is closed. Unfortunately postgres will only notice closed
connections when reading from / writing to the network. That might
only happen much later.
Now that there's no blocking libpq callers left, default to using
non-blocking mode in connection_management.c. This has two
advantages:
1) Blockiness doesn't have to frequently be reset, simplifying code
2) Prevents accidental use of blocking libpq functions, since they'll
frequently return 'need IO'
This commit is intended to be a base for supporting declarative partitioning
on distributed tables. Here we add the following utility functions and their
unit tests:
* Very basic functions including differnentiating partitioned tables and
partitions, listing the partitions
* Generating the PARTITION BY (expr) and adding this to the DDL events
of partitioned tables
* Ability to generate text representations of the ranges for partitions
* Ability to generate the `ALTER TABLE parent_table ATTACH PARTITION
partition_table FOR VALUES value_range`
* Ability to apply add shard ids to the above command using
`worker_apply_inter_shard_ddl_command()`
* Ability to generate `ALTER TABLE parent_table DETACH PARTITION`
Adds support for PostgreSQL 10 by copying in the requisite ruleutils
and updating all API usages to conform with changes in PostgreSQL 10.
Most changes are fairly minor but they are numerous. One particular
obstacle was the change in \d behavior in PostgreSQL 10's psql; I had
to add SQL implementations (views, mostly) to mimic the pre-10 output.
Add a second implementation of INSERT INTO distributed_table SELECT ... that is used if
the query cannot be pushed down. The basic idea is to execute the SELECT query separately
and pass the results into the distributed table using a CopyDestReceiver, which is also
used for COPY and create_distributed_table. When planning the SELECT, we go through
planner hooks again, which means the SELECT can also be a distributed query.
EXPLAIN is supported, but EXPLAIN ANALYZE is not because preventing double execution was
a lot more complicated in this case.
- Use native postgres function for composite key btree functions
- Move explain tests to multi_explain.sql (get rid of .out _0.out files)
- Get rid of input/output files for multi_subquery.sql by moving table creations
- Update some comments
With this commit we start to register InvalidateDistRelationCacheCallback
function as cache invalidation callback function before version checks
because during version checks we use cache to look up relation ids of some
relations like pg_dist_relation or pg_dist_partition_logical_relid_index
and we want to know about cache invalidation before accessing them.
During version update, we indirectly calld CheckInstalledVersion via
ChackCitusVersions. This obviously fails because during version update it is
expected to have version mismatch between installed version and binary version.
Thus, we remove that ChackCitusVersions. We now only call ChackAvailableVersion.
Before this commit, we were erroring out at almost all queries if there is a
version mismatch. With this commit, we started to error out only requested
operation touches distributed tables.
Normally we would need to use distributed cache to understand whether a table
is distributed or not. However, it is not safe to read our metadata tables when
there is a version mismatch, thus it is not safe to create distributed cache.
Therefore for this specific occasion, we directly read from pg_dist_partition
table. However; reading from catalog is costly and we should not use this
method in other places as much as possible.
This commit fixes the problem where we incorrectly try to reach distributed table
cache when the extension is not loaded completely. We tried to reach the cache
because we wanted to get reference table information to activate the node. However
it is actually not necessary to explicitly activate the nodes which come from
master_initialize_node_metadata. Because it only runs during extension creation and
at that time there are no reference tables and all nodes are considered as active.
When we propogate the schema creation command to data nodes we add schema's
owner name too. Before this patch, we did not quote the owner's name which
causes problems with the names containing characters like '-'.
We incorrectly try to use relation cache to find particular schema's owner and
when we cannot find the schema in the relation cache(i.e always), we automatically
used current user as the schema's owner. This means we always created schemas in
the data nodes with current user. With this patch we started to use namespace
cache to find schemas.
With this commit, we start to use custom compiled PostgreSQL builds in
Travis for merge commits. This allows us to run isolation tests and
PostgreSQL's own regression tests along with our regression tests in
Travis.
Since manually compiling PostgreSQL takes more time and we also add new
tests, we only enable running these tests on merge commits.
* Accept invalidation messages before accessing the metadata cache
This commit is crucial to prevent stale metadata reads from the
cache. Without this commit, some of the operations may use stale
metadata which could end up with various bugs such as crashes,
inconsistent/lost data etc.
As an example, consider that a COPY operation is blocked on shard
metadata lock. Another concurrent session updates the metadata and
invalidates the cache. However, since Citus doesn't accept invalidations,
COPY continues with the stale metadata once it acquires the lock.
With this commit, we make sure that invalidation messages are accepted
just before accessing the metadata cache and preventing any operation to
use stale metadata.
* Add isolation tests for placement changes and conccurrent operations
- add node with reference table vs COPY/insert/update/DDL
- repair shard vs COPY/insert/update/DDL
- repair shard vs repair shard
Distributed query planning for subquery pushdown is done on the original
query. This prevents the usage of external parameters on the execution.
To overcome this, we manually replace the parameters on the original
query.