As the previous versions of Citus don't know how to handle citus local
tables, we should prevent downgrading from 9.5 to older versions if any
citus local tables exist.
* Merge enterprise branch if it exists
We should merge the enterprise branch if it exists in the check
enterprise merge job, otherwise the following can happen:
- there is some change on community that breaks the compilation on
enterprise without creating any conflicts
- we fix the compilation issue by opening a branch on enterprise
- the job doesn't see the enterprise specific fix because it doesn't try
to merge enterprise branch if there are no conflicts
* Update ci/check_enterprise_merge.sh
Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
* Simplify the steps
Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
Pushing down a CALL to the same node on which the CALL is being executed
is dangerous and could lead to infinite recursion.
When the coordinator was added as a worker, Citus was preventing this
only by chance: the coordinator was marked as a "not metadatasynced" node
in pg_dist_node, which prevented CALL/function delegation from happening.
With this commit, we do the following:
- Fix metadatasynced column for the coordinator on pg_dist_node
- Prevent pushdown of a function/procedure to the same node on which
the function/procedure is being executed. Today, we do not sync
pg_dist_object (e.g., distributed functions metadata) to the
worker nodes. But even if we start syncing it, this check in function
call delegation would prevent the infinite recursion.
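The two conditions above can be sketched as a small guard. This is only an illustration, not the actual Citus code: `Node`, `group_id` and `can_delegate_call` are hypothetical names, and only `metadata_synced` mirrors something in the message (the metadatasynced column of pg_dist_node).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    group_id: int          # hypothetical identifier of the node
    metadata_synced: bool  # mirrors the metadatasynced column in pg_dist_node


def can_delegate_call(target: Node, local_group_id: int) -> bool:
    """Return True only if delegating the call cannot loop back to this node."""
    if not target.metadata_synced:
        # Delegation only targets nodes that have synced metadata.
        return False
    if target.group_id == local_group_id:
        # Delegating to ourselves would re-enter delegation again and again.
        return False
    return True
```

With the coordinator now correctly marked as metadatasynced, the second check is the one that stops the recursion.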
* Do not take ShareUpdateExclusiveLock on pg_dist_transaction
We were taking ShareUpdateExclusiveLock on pg_dist_transaction during
recovery to prevent multiple recoveries from happening concurrently.
VACUUM (not FULL) also takes ShareUpdateExclusiveLock, so the two can
conflict. It seems that VACUUM skips the table if a conflicting lock is
already held, unless it is vacuuming to prevent id wraparound, in which
case there can be a deadlock. I guess the deadlock happens if:
- VACUUM takes a lock on pg_dist_transaction and is running because of
the id wraparound problem
- The transaction in the maintenance daemon tries to take a lock but
cannot, as that conflicts with the lock acquired by VACUUM
- The transaction in the maintenance daemon has a very old xid, hence
VACUUM cannot proceed.
If we take a row exclusive lock in transaction recovery, it does not
conflict with VACUUM, so VACUUM can proceed and the deadlock is
resolved. To still prevent concurrent transaction recoveries, an
advisory lock is taken with ShareUpdateExclusiveLock as before.
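The reasoning relies on PostgreSQL's table-level lock conflict rules. A minimal sketch follows, covering only the two lock modes discussed here. The `CONFLICTS` table and `locks_conflict` helper are illustrative names, not a PostgreSQL or Citus API.

```python
# Subset of PostgreSQL's table-level lock conflict matrix, restricted to
# the two modes that matter for this change.
CONFLICTS = {
    "RowExclusiveLock": {
        "ShareLock",
        "ShareRowExclusiveLock",
        "ExclusiveLock",
        "AccessExclusiveLock",
    },
    "ShareUpdateExclusiveLock": {
        "ShareUpdateExclusiveLock",
        "ShareLock",
        "ShareRowExclusiveLock",
        "ExclusiveLock",
        "AccessExclusiveLock",
    },
}


def locks_conflict(requested: str, held: str) -> bool:
    """Return True if `requested` must wait for a session holding `held`."""
    return held in CONFLICTS[requested]


# VACUUM (not FULL) holds ShareUpdateExclusiveLock on the table.
old_behavior_blocks = locks_conflict("ShareUpdateExclusiveLock", "ShareUpdateExclusiveLock")
new_behavior_blocks = locks_conflict("RowExclusiveLock", "ShareUpdateExclusiveLock")
```

Here `old_behavior_blocks` is true and `new_behavior_blocks` is false, which is why recovery no longer waits behind VACUUM.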
* Use CITUS_OPERATIONS tag
* Do not allow removing a single node with ref tables
We should not allow removing a node if it is the only node in the
cluster and there is data on it. We have this check for distributed
tables but we didn't have it for reference tables.
* Update src/test/regress/expected/single_node.out
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
* Update src/test/regress/sql/single_node.sql
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
DESCRIPTION: Introduce citus local tables
The commits in this PR are merged from other sub-PRs:
* community/#3852: Brings lazy&fast table creation logic for create_citus_local_table udf
* community/#3995: Brings extended utility command support for citus local tables
* community/#4133: Brings changes in planner and in several places to integrate citus local tables into our distributed execution logic
We are introducing citus local tables, which are a new table type in Citus.
To be able to create a citus local table, we first need to add the coordinator as a worker
node.
Then, we can create a citus local table via SELECT create_citus_local_table(<tableName>).
Calling this udf from the coordinator creates a single-shard table whose shard
is on the coordinator.
Also, from the citus metadata perspective, for citus local tables:
* partitionMethod is set to DISTRIBUTE_BY_NONE (like reference tables) and
* replicationModel is set to the current value of citus.replication_model, which
already can't be equal to REPLICATION_MODEL_2PC, which is only used for reference
tables internally.
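The metadata rule above can be sketched as a classifier. The string values are placeholders for the DISTRIBUTE_BY_NONE and REPLICATION_MODEL_2PC constants, not their real catalog encodings, and `classify_table` is a hypothetical helper.

```python
# Placeholder values standing in for the real Citus constants.
DISTRIBUTE_BY_NONE = "none"
REPLICATION_MODEL_2PC = "2pc"


def classify_table(partition_method: str, replication_model: str) -> str:
    """Classify a Citus table from the two metadata fields discussed above."""
    if partition_method != DISTRIBUTE_BY_NONE:
        return "distributed table"
    if replication_model == REPLICATION_MODEL_2PC:
        # 2PC replication is only used for reference tables internally.
        return "reference table"
    return "citus local table"
```

Because citus.replication_model can never be the 2PC model, a table with no distribution column and a non-2PC model is unambiguously a citus local table.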
Note that currently we support creating citus local tables only from postgres tables
living in the coordinator.
That means it is not allowed to execute this udf from worker nodes, nor is it allowed
to move the shard of a citus local table to any other node.
Also, the run-time complexity of calling the create_citus_local_table udf does not depend
on the size of the relation; that means creating citus local tables is actually a
non-blocking operation.
This is because, instead of copying the data to a new shard, this udf just does the
following:
* convert the input postgres table to the single shard of the citus local table by suffixing
the shardId to its name, constraints, indexes, triggers, etc.,
* create a shell table for the citus local table in the coordinator and in mx-worker nodes when
metadata sync is enabled,
* create the necessary objects on the shell table.
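The renaming in the first step can be illustrated as follows. This is a simplified sketch: the underscore separator is an assumption, handling of names that would exceed the identifier length limit is ignored, and `append_shard_id`, `rename_for_shard` and the shard id 102008 are made up for the example.

```python
from typing import Dict, List


def append_shard_id(name: str, shard_id: int) -> str:
    """Suffix an object name with the shard id (separator is an assumption)."""
    return f"{name}_{shard_id}"


def rename_for_shard(table: str, dependents: List[str], shard_id: int) -> Dict[str, str]:
    """Map the table and its dependent objects to their shard-suffixed names."""
    names = [table] + dependents
    return {name: append_shard_id(name, shard_id) for name in names}


renamed = rename_for_shard("orders", ["orders_pkey", "orders_audit_trigger"], 102008)
```

Only names change, and no rows are copied, which is why the cost does not depend on the size of the relation.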
Here, we should also note that we can execute queries/DMLs from mx worker nodes,
as citus local tables are already first class citus tables.
Furthermore, we brought trigger support for citus local tables.
That means users can define triggers on citus local tables to execute custom functions
that might even modify other citus tables
and other postgres tables.
Other than trigger support, citus local tables can also be involved in foreign key relationships
with reference tables.
Here the only restriction is that foreign keys from reference tables to citus local tables cannot
have behaviors other than RESTRICT & NO ACTION.
Other than that, foreign keys between citus local tables and reference tables just work fine.
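The foreign key rule between the two table types can be sketched as a validator. The function name and table-type labels are hypothetical, and applying the restriction to both ON DELETE and ON UPDATE is an assumption of this sketch.

```python
ALLOWED_ACTIONS = {"RESTRICT", "NO ACTION"}


def fk_allowed(referencing: str, referenced: str,
               on_delete: str = "NO ACTION",
               on_update: str = "NO ACTION") -> bool:
    """Check a foreign key between a citus local table and a reference table."""
    if (referencing, referenced) == ("citus_local", "reference"):
        # No restriction on the action in this direction.
        return True
    if (referencing, referenced) == ("reference", "citus_local"):
        # Only RESTRICT and NO ACTION behaviors are supported.
        return on_delete in ALLOWED_ACTIONS and on_update in ALLOWED_ACTIONS
    raise ValueError("table type combination not covered by this sketch")
```

For example, `fk_allowed("reference", "citus_local", on_delete="CASCADE")` is rejected, while the same key in the opposite direction is accepted.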
All in all, citus local tables are actually just local tables living in the coordinator, but natively
accessible from other nodes like other first class citus tables. This enables us to set foreign
key constraints between very big coordinator tables and reference tables without having to
do any data replication to worker nodes for local tables.
This commit brings the following features:
* Foreign key support from citus local tables to reference tables
* Foreign key support from reference tables to citus local tables
(only with RESTRICT & NO ACTION behavior)
* ALTER TABLE ENABLE/DISABLE trigger command support
* CREATE/DROP/ALTER trigger command support
and disallows:
* ALTER TABLE ATTACH/DETACH PARTITION commands
* CREATE TABLE <postgres table> ATTACH PARTITION <citus local table>
commands
* Foreign keys from postgres tables to citus local tables
(the other way was already disallowed)
for citus local tables.
Introduce table entry utility functions
Citus table cache entry utilities are introduced so that we can easily
extend existing functionality with minimum changes, specifically changes
to these functions. For example IsNonDistributedTableCacheEntry can be
extended for citus local tables without the need to scan the whole
codebase and update each relevant part.
* Introduce utility functions to find the type of tables
A table type can be a reference table or a hash/range/append distributed
table. Utility methods are created so that we don't have to worry about
how a table is determined to be a reference table etc. This also makes it
easy to extend the table types.
* Add IsCitusTableType utilities
* Rename IsCacheEntryCitusTableType -> IsCitusTableTypeCacheEntry
* Change citus table types in some checks
create_distributed_function(function_name,
distribution_arg_name,
colocate_with text)
This UDF did not allow a colocate_with parameter when no
distribution_arg_name was supplied. This commit changes the behaviour to
allow a missing distribution_arg_name parameter when the function should
be colocated with a reference table.
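The relaxed argument check can be sketched like this. It is not the actual implementation: `colocate_with_allowed` is a hypothetical helper, and None is used here to mean "not supplied".

```python
from typing import Optional


def colocate_with_allowed(distribution_arg_name: Optional[str],
                          colocate_with: Optional[str],
                          target_is_reference_table: bool) -> bool:
    """Decide whether the given argument combination is accepted."""
    if colocate_with is None:
        # Nothing to validate when no colocation target is given.
        return True
    if distribution_arg_name is not None:
        # A distribution argument was supplied, as was already required.
        return True
    # New behaviour: a missing distribution argument is fine only when
    # the function is colocated with a reference table.
    return target_is_reference_table
```

Before this commit, the last case was rejected regardless of the colocation target.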
* check compilation of enterprise job
* test that enterprise merge job fails with compilation error
* Revert "test that enterprise merge job fails with compilation error"
This reverts commit 0eaccd58c207a4c15365186017bf47601cc95552.
* Update readme and use citus extbuilder:13beta3
* Hide citus.subquery_pushdown flag
This flag is dangerous and could easily let queries
return wrong results.
The flag has a very specific purpose for a very specific
data distribution and query structure. In those cases, when
the flag is set, the user can skip recursive planning altogether
*at their own risk*.
The meaning of the flag is "I know what I'm doing, and the query
structure/data distribution is under my control, so Citus
can skip many correctness checks".
For regular users, enabling this flag is discouraged. We have to
keep the support only for backward compatibility for some users.
In addition to that, give a NOTICE to discourage new users from
using it.
* Update and separate test images
The build image was a single one and it would contain pg11, pg12 and
pg13. Now it is separated so that we can build each pg major
independently.
Tags are used as full postgres versions so that we can know which
version we use by looking at the tag. For example exttester:11.9 would
mean we are using pg11.9.
pg11 is updated from 11.5 to 11.9.
pg12 is updated from 12rc to 12.4.
* Ignore memory usage in pg13 explain
* Use citus instead of personal repo
RemoveCoordinatorPlacement does not do what it says. It removes the
coordinator placement only if there are other placements (that is, the
cluster is not a single node) and only if the coordinator has a placement.
AllTargetExpressionsAreColumnReferences would return false if a query
had an entry that references the outer query. It seems safe to not
have this check for non-distributed tables, such as reference tables. We
already have separate checks for other cases such as having limits.