Commit Graph

268 Commits (fa6f2ed38204477dd4b90791b30a6b5d86381c70)

Author SHA1 Message Date
Jason Petersen 339e6e661e
Remove 9.6 (#2554)
Removes support and code for PostgreSQL 9.6

cr: @velioglu
2019-01-16 13:11:24 -07:00
Hadi Moshayedi 38579d52d0
Speed-up run_command_on_shards(). (#2564)
We were establishing connections synchronously. Establishing
connections asynchronously results in some parallelization, saving
hundreds of milliseconds.

In a test I did, this decreased the query time from 150ms to 40ms.
2018-12-24 08:47:01 -05:00
Nils Dijk 9309e63156
create_distributed_table as user, change table ownership during create 2018-11-29 14:20:42 +01:00
Marco Slot e9a7295ead Add multi-user tests for task-tracker protocol functions 2018-11-23 11:05:09 +01:00
Marco Slot 8e93fe5870 Check schema owner in task_tracker_assign_task 2018-11-23 11:05:09 +01:00
Onder Kalaci 052ba21b19 Make sure to prevent unauthorized users from dropping sequences in Citus MX 2018-11-15 18:08:04 +03:00
Marco Slot f383e4f307
Description: Refactor code that handles DDL commands from one file into a module
The file handling the utility functions (DDL) for citus grew organically over time and became unreasonably large. This refactor splits the functionality in that file into separate files per command, initially modeled after the directory and file layout found in postgres.

Although the change is large, there are barely any code changes. Only two functions have been added for readability purposes:

- PostProcessIndexStmt which is extracted from PostProcessUtility
- PostProcessAlterTableStmt which is extracted from multi_ProcessUtility

A README.md has been added to `src/backend/distributed/commands` describing the contents of the module and every file in the module.
We need more documentation around the overloading of the COPY command; for now, the boilerplate has been added for people with better knowledge to fill out.
2018-11-14 13:36:27 +01:00
Hadi Moshayedi d3e284dcd6
Use heap_deform_tuple() instead of calling heap_getattr(). (#2464)
After Fast ALTER TABLE ADD COLUMN with a non-NULL default in PG11, physical heaps might not contain all attributes after an ALTER TABLE ADD COLUMN happens. heap_getattr() returns NULL when the physical tuple doesn't contain an attribute. So we should use heap_deform_tuple() in these cases, which fills in the missing attributes.

Our catalog tables evolve over time, and an upgrade might involve some ALTER TABLE ADD COLUMN commands.

Note that we don't need to worry about postgres catalog tables and we can use heap_getattr() for them, because they only change between major versions.

This also fixes #2453.
2018-11-05 15:11:01 -05:00
Onder Kalaci cdc0d1491c Make sure to use correct execution mode for TRUNCATE
We used to set the execution mode in the truncate trigger. However,
when multiple tables are truncated with a single command, we could
set the execution mode very late. Instead, now set the execution mode
on the utility hook.
2018-09-25 15:35:27 +03:00
Onder Kalaci abc443d7fa Make sure that shard repair considers replication factor 2018-09-21 15:24:49 +03:00
Onder Kalaci c1b5a04f6e Allow partitioned tables with replication factor > 1
With this commit, we allow partitioned distributed tables with
replication factor > 1. However, we also have many restrictions.

In summary, we disallow all kinds of modifications (including DDLs)
on the partition tables. Instead, the user is allowed to run the
modifications over the parent table.

The necessity for such a restriction has two aspects:
   - We need to acquire shard resource locks appropriately
   - We need to handle marking partitions INVALID in case
     of any failures. Note that, in theory, the parent table
     should also become INVALID, which is too aggressive.
2018-09-21 14:40:41 +03:00
Murat Tuncer b6930e3db9 Add distributed locking to truncated mx tables
We acquire a distributed lock on all mx nodes for truncated
tables before actually doing the truncate operation.

This is needed for distributed serialization of the truncate
command without causing a deadlock.
2018-09-21 14:23:19 +03:00
Onder Kalaci 76aa6951c2 Properly send commands to other nodes
We previously implemented the OTHER_WORKERS_WITH_METADATA tag. However,
that was wrong. See the related discussion:
     https://github.com/citusdata/citus/issues/2320

Instead, we switched to using OTHER_WORKER_NODES and made the command
that we're running optional such that even if the node is not a
metadata node, we won't be in trouble.
2018-09-10 16:01:30 +03:00
Onder Kalaci 5cf8fbe7b6 Add infrastructure to relation if exists 2018-09-07 14:49:36 +03:00
Onder Kalaci 26e308bf2a Support TRUNCATE from the MX worker nodes
This commit enables support for TRUNCATE on both
distributed tables and reference tables.

The basic idea is to acquire a lock on the relation by sending
the TRUNCATE command to all metadata worker nodes. We only
skip sending the TRUNCATE command to the node that actually
executes the command to prevent a self-distributed-deadlock.
2018-09-03 14:06:31 +03:00
velioglu 2639149bd8 Enterprise functions about metadata/resource locks 2018-08-27 16:32:20 +03:00
mehmet furkan şahin ef9f38b68d ApplyLogRedaction noop func is added 2018-08-17 14:48:54 -07:00
Onder Kalaci 7fb529aab9 Some stylistic improvements in the foreign keys to reference table
changes.
2018-07-05 23:23:34 +03:00
Nils Dijk c1c8c38dc9 create placeholder for policy ddl 2018-07-05 11:07:01 +02:00
mehmet furkan şahin f7b901e3fd CopyShardForeignConstraintCommandList API change for grouped constraints 2018-07-03 17:05:55 +03:00
mehmet furkan şahin 35eac2318d lock referenced reference table metadata is added
For certain operations in enterprise, we need to lock the
referenced reference table shard distribution metadata
2018-07-03 17:05:55 +03:00
mehmet furkan şahin 4db72c99f6 Specific DDLs are sequentialized when there is FK
-[x] drop constraint
-[x] drop column
-[x] alter column type
-[x] truncate

are sequentialized if the relations affected by the above commands
have a foreign key constraint from a distributed table to a
reference table.
2018-07-03 17:05:55 +03:00
mehmet furkan şahin 2c5d59f3a8 create_distributed_table in transaction is fixed 2018-07-03 17:05:01 +03:00
mehmet furkan şahin 45f8017f42 create_distributed_table with fk to ref table is implemented 2018-07-03 17:05:01 +03:00
Murat Tuncer 3fc7cdfe6d Apply master_stage_protocol refactoring changes 2018-06-28 11:24:57 +03:00
Onder Kalaci 8ccb8b679e Real-time executor marks multi shard relation accesses before opening connections 2018-06-25 18:40:31 +03:00
Onder Kalaci 2890154420 Make sure that TRUNCATE always opens a DDL access 2018-06-25 18:40:31 +03:00
Onder Kalaci 21038f0d0e Make sure that inter-shard DDL commands always cover both tables 2018-06-25 18:40:30 +03:00
Onder Kalaci 2f01894589 Track relation accesses using the connection management infrastructure 2018-06-25 18:40:30 +03:00
mehmet furkan şahin d1a3b20115 foreign_constraint_utils is created 2018-06-07 18:19:24 +03:00
Onder Kalaci 336044f2a8 master_modify_multiple_shards() and TRUNCATE honor multi_shard_modification_mode 2018-06-06 12:29:05 +03:00
Marco Slot fd4ff29f2f Add a debug message with distribution column value 2018-06-05 15:09:17 +03:00
Dimitri Fontaine 8b258cbdb0 Lock reads and writes only to the node being updated in master_update_node
Rather than locking out all the writes in the cluster, the function now only
locks out writes that target shards hosted by the node we're updating.
2018-05-09 15:14:20 +02:00
velioglu 32bcd610c1 Support modify queries with multiple tables
With this commit we begin to support modify queries with multiple
tables if these queries are pushdownable.
2018-05-02 16:22:26 +03:00
Marco Slot 2559b84049 Drop shards as current user instead of super user 2018-05-01 09:57:20 +02:00
Murat Tuncer a6fe5ca183 PG11 compatibility update
- changes in ruleutils_11.c are reflected
- vacuum statement api change is handled. We now allow
  multi-table vacuum commands.
- some other function header changes are reflected
- api conflicts between PG11 and earlier versions
  are handled by adding shims in version_compat.h
- various regression tests are fixed due to output and
  functionality changes in PG11
- no change is made to support new features in PG11;
  they need to be handled by new commits
2018-04-26 11:29:43 +03:00
Marco Slot 9318aeee6b Allow multiple size function calls per query 2018-04-12 14:16:17 +02:00
Marco Slot ee132c5ead Prune shards once per relation in subquery pushdown 2018-04-10 20:33:07 +02:00
velioglu 698d585fb5 Remove broadcast join logic
After this change, all the logic related to shard data fetching
is removed. The planner won't plan any ShardFetchTask anymore.
Shard-fetch-related steps in the real-time executor and the
task-tracker executor have been removed.
2018-03-30 11:45:19 +03:00
Metin Doslu bcf660475a Add support for modifying CTEs 2018-02-27 15:08:32 +02:00
Marco Slot 1e9186a3b5 Do not use new connection in table size functions 2018-02-23 07:07:55 +01:00
Brian Cloutier a2ed45e206 Remove variable length arrays
VLAs aren't supported by Visual Studio.

- Remove all existing instances of VLAs.
- Add a flag, -Werror=vla, which makes gcc refuse to compile if we add
  VLAs in the future.
2018-02-01 10:30:41 -08:00
Marco Slot fa7fa2734b Log remote commands sent via MultiClientSendQuery 2017-12-22 16:18:40 +01:00
Brian Cloutier fb7b86fa14 Replace strtoull with pg_strtouint64
The macro we were using to detect strtoull isn't set on Windows, and
just in case there are differences use a portable function from PG
instead of calling strtoull directly.
2017-12-21 14:28:51 +01:00
Murat Tuncer 2d66bf5f16
Fix hard coded formatting strings for 64 bit numbers (#1831)
Postgres provides OS-agnostic formatting macros for
formatting 64 bit numbers. Replaced %ld %lu with
INT64_FORMAT and UINT64_FORMAT respectively.

Also found some incorrect usages of formatting
flags and fixed them.
2017-12-04 14:11:06 +03:00
Marco Slot d3b634b301 Allow generating placement IDs without using the sequence 2017-11-15 10:12:06 +01:00
Marco Slot c24a0875a5 Allow generating shard IDs without using the sequence 2017-11-15 10:12:05 +01:00
Brian Cloutier 0db8277266 remove unused errno import 2017-11-14 13:09:34 -08:00
Marco Slot 533a533565 Only drop sequences on workers with metadata 2017-11-14 16:01:56 +01:00
velioglu be28ba8e70 Add stub UDF to run pg_upgrade flawlessly 2017-11-13 16:14:45 +02:00
Marco Slot 7e34348334 Add shard transfer mode parameter to shard copy functions 2017-10-31 13:30:48 +01:00
Furkan Sahin 2b39c52f0b Replica identity on create_distributed_table
With this commit, citus respects the replica identity of the table when
we distribute the table. So the shards of the distributed table
have the same replica identity as the local table.
2017-10-31 13:08:36 +03:00
Marco Slot be46661bf7 Block only 2PCs instead of all writes in citus_create_restore_point 2017-10-27 00:07:32 +02:00
velioglu 0b5db5d826 Support multi shard update/delete queries 2017-10-25 15:52:38 +03:00
Brian Cloutier 58cf15ceca DistributedTableSize doesn't emit oid when erring out 2017-10-14 02:42:57 +03:00
Jason Petersen 6c9b19a954
Add version-compat header
For polyfill macros, etc.
2017-09-25 17:20:23 -07:00
Jason Petersen fbeaa2f9d0
Remove direct access to tupleDesc->attrs
A level of indirection was removed from this field for PostgreSQL 11.
By using the handy provided macro, we can be version agnostic.
2017-09-25 17:20:23 -07:00
Onder Kalaci 4782f9f98a Properly copy and trim the error messages that come from pg_conn
When a NULL connection is provided to PQerrorMessage(), the
returned error message is a static text. Modifying that static
text, which is not necessarily in writable memory, is
dangerous and might cause a segfault.
2017-09-22 19:43:09 +03:00
Marco Slot 9e7b1fb858 Return readable nodes in master_get_active_worker_nodes 2017-08-16 11:28:47 +02:00
Marco Slot fa70089766 Enable 2PC during distributed table creation 2017-08-15 13:44:20 +02:00
Marco Slot 0ae265c436 Add citus_create_restore_point for distributed snapshots 2017-08-11 07:36:20 +02:00
Brian Cloutier 9d93fb5551 Create citus.use_secondary_nodes GUC
This GUC has two settings, 'always' and 'never'. When it's set to
'never' all behavior stays exactly as it was prior to this commit. When
it's set to 'always' only SELECT queries are allowed to run, and only
secondary nodes are used when processing those queries.

Add some helper functions:
- WorkerNodeIsSecondary(), checks the noderole of the worker node
- WorkerNodeIsReadable(), returns whether we're currently allowed to
  read from this node
- ActiveReadableNodeList(), some functions (namely, the ones on the
  SELECT path) don't require working with Primary Nodes. They should call
  this function instead of ActivePrimaryNodeList(), because the latter
  will error out in contexts where we're not allowed to write to nodes.
- ActiveReadableNodeCount(), like the above, replaces
  ActivePrimaryNodeCount().
- EnsureModificationsCanRun(), error out if we're not currently allowed
  to run queries which modify data. (Either we're in read-only mode or
  use_secondary_nodes is set)

Some parts of the code were switched over to use readable nodes instead
of primary nodes:
- Deadlock detection
- DistributedTableSize
- the router, real-time, and task tracker executors
- ShardPlacement resolution
2017-08-10 17:37:17 +03:00
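The node-selection rules described in the citus.use_secondary_nodes commit above can be sketched as follows. This is a minimal Python model, not the actual C implementation: the `WorkerNode` class, the snake_case function names, and the `read_only` parameter are illustrative stand-ins for the helpers named in the commit message, and the exact semantics of `WorkerNodeIsReadable()` are assumed from its one-line description.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WorkerNode:
    """Hypothetical stand-in for a pg_dist_node row."""
    name: str
    role: str  # "primary" or "secondary"


def worker_node_is_readable(node: WorkerNode, use_secondary_nodes: str) -> bool:
    # Assumption: with 'always' only secondaries are read from;
    # with 'never' behavior is unchanged, so only primaries are used.
    if use_secondary_nodes == "always":
        return node.role == "secondary"
    return node.role == "primary"


def active_readable_node_list(nodes: List[WorkerNode],
                              use_secondary_nodes: str) -> List[WorkerNode]:
    # SELECT-path callers would use this instead of a primary-only list.
    return [n for n in nodes if worker_node_is_readable(n, use_secondary_nodes)]


def ensure_modifications_can_run(use_secondary_nodes: str,
                                 read_only: bool = False) -> None:
    # Error out when writes are not allowed: either the session is
    # read-only or use_secondary_nodes is set to 'always'.
    if read_only or use_secondary_nodes == "always":
        raise RuntimeError("modifications are not allowed in this mode")
```

With one primary and one secondary registered, `active_readable_node_list(nodes, "never")` yields only the primary, while `"always"` yields only the secondary and makes `ensure_modifications_can_run` raise.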
Marco Slot 08ed6d8269 Prevent pg_dist_node changes during master_create_empty_shard 2017-08-09 14:22:09 +02:00
Marco Slot c2f8bafa05 Fix shard creation vs. pg_dist_node change locking 2017-08-09 14:09:54 +02:00
Burak Yucesoy 8455d1a4ef Ensure we are allowing partitioned tables at all appropriate places 2017-08-09 10:01:35 +03:00
Burak Yucesoy fddf9b3fcc Add distributed partitioned table support distributed table creation
With this PR, Citus starts to support all possible ways to create
distributed partitioned tables. These are:

- Distributing already created partitioning hierarchy
- CREATE TABLE ... PARTITION OF a distributed_table
- ALTER TABLE distributed_table ATTACH PARTITION non_distributed_table
- ALTER TABLE distributed_table ATTACH PARTITION distributed_table

We also support DETACHing partitions from partitioned tables and propagating
TRUNCATE and DDL commands to distributed partitioned tables.

This PR also refactors some parts of distributed table creation logic.
2017-08-09 10:01:35 +03:00
Metin Doslu b8a9e7c1bf Add support for UPDATE/DELETE with subqueries 2017-08-08 21:35:08 +03:00
Marco Slot d3e9746236 Avoid connections that accessed non-colocated placements in multi-shard commands 2017-08-08 18:32:34 +02:00
Brian Cloutier a3e9bef685 All users of WorkerNodeHash take an AccessShareLock
The metadata cache simulates a SELECT on pg_dist_node. Now the locks it
takes also simulate that SELECT.
2017-08-08 13:12:06 +03:00
Brian Cloutier f87fefa323 Refactor: DistributedTableSize more explicitly only locks pg_dist_node 2017-08-08 13:12:06 +03:00
Hadi Moshayedi 8229a64fe8 Remove distributed tables' dependency on distribution key columns. (#1527)
This change removes distributed tables' dependency on distribution key columns. We already check that we cannot drop distribution key columns in ErrorIfUnsupportedAlterTableStmt() at multi_utility.c, so we don't need a dependency from the distributed table to its distribution key column to prevent the column from being dropped.

Furthermore, having this dependency causes some warnings in pg_dump --schema-only (See #866), which are not desirable.

This change also adds a check to disallow dropping distribution keys when citus.enable_ddl_propagation is set to false. Regression tests are updated accordingly.
2017-08-03 10:07:04 -04:00
Burak Yucesoy 7769f1d012 Refactor distributed table creation logic
This commit is preparation for introducing distributed partitioned
table support. We want to clean and refactor some code in distributed
table creation logic so that we can handle partitioned tables in a more
robust way.
2017-07-31 11:11:23 +03:00
Murat Tuncer 8729b7d55a Use cstore_table_size function to determine cstore table size (#1521)
pg_table_size/pg_relation_size variants always return 0 for
cstore tables. We should be using cstore_table_size function
for cstore_tables.
2017-07-27 09:02:07 -07:00
Brian Cloutier ec99f8f983 Add nodeRole column
- master_add_node enforces that there is only one primary per group
- there's also a trigger on pg_dist_node to prevent multiple primaries
  per group
- functions in metadata cache only return primary nodes
- Rename ActiveWorkerNodeList -> ActivePrimaryNodeList
- Rename WorkerGetLive{Node->Group}Count()
- Refactor WorkerGetRandomCandidateNode
- master_remove_node only complains about active shard placements if the
  node being removed is a primary.
- master_remove_node only deletes all reference table placements in the
  group if the node being removed is the primary.
- Rename {Node->NodeGroup}HasShardPlacements, this reflects the behavior it
  already had.
- Rename DeleteAllReferenceTablePlacementsFrom{Node->NodeGroup}. This also
  reflects the behavior it already had, but the new signature forces the
  caller to pass in a groupId
- Rename {WorkerGetLiveGroup->ActivePrimaryNode}Count
2017-07-24 11:57:46 +03:00
Brian Cloutier e6c375eb81 Tiny refactor to master_create_empty_shard 2017-07-24 11:57:46 +03:00
Brian Cloutier ee270b65d7 make WorkerGetNodeWithName a static function 2017-07-24 11:57:46 +03:00
velioglu 6ea15fbb25 Make create_distributed_table transactional 2017-07-18 12:35:40 +03:00
Brian Cloutier ee4edc498f Don't release locks early in metadata functions 2017-07-12 14:18:27 +02:00
Brian Cloutier 7ad95b53d2 Rename pg_dist_shard_placement -> pg_dist_placement
Comes with a few changes:

- Change the signature of some functions to accept groupid
  - InsertShardPlacementRow
  - DeleteShardPlacementRow
  - UpdateShardPlacementState

- NodeHasActiveShardPlacements returns true if the group the node is a
  part of has any active shard placements

- TupleToShardPlacement now returns ShardPlacements which have NULL
  nodeName and nodePort.

- Populate (nodeName, nodePort) when creating ShardPlacements
- Disallow removing a node if it contains any shard placements

- DeleteAllReferenceTablePlacementsFromNode matches based on group. This
  doesn't change behavior for now (while there is only one node per
  group), but means in the future callers should be careful about
  calling it on a secondary node, since it'll delete placements on the
  primary.

- Create concept of a GroupShardPlacement, which represents an actual
  tuple in pg_dist_placement and is distinct from a ShardPlacement,
  which has been resolved to a specific node. In the future
  ShardPlacement should be renamed to NodeShardPlacement.

- Create some triggers which allow existing code to continue to insert
  into and update pg_dist_shard_placement as if it still existed.
2017-07-12 14:17:31 +02:00
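The distinction drawn in the pg_dist_placement commit above, between a GroupShardPlacement (a catalog tuple keyed by group) and a ShardPlacement (resolved to a specific node), can be illustrated with a small Python model. The field names, the `resolve_placement` helper, and the idea of resolving through a group-to-node map are assumptions made for illustration; the commit message only states that the two concepts exist and that (nodeName, nodePort) is populated when a ShardPlacement is created.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class GroupShardPlacement:
    """Models a row of pg_dist_placement: it knows a group, not a node."""
    placement_id: int
    shard_id: int
    group_id: int


@dataclass(frozen=True)
class ShardPlacement:
    """A placement that has been resolved to a concrete node."""
    placement_id: int
    shard_id: int
    group_id: int
    node_name: str
    node_port: int


def resolve_placement(group_placement: GroupShardPlacement,
                      node_by_group: Dict[int, Tuple[str, int]]) -> ShardPlacement:
    # Hypothetical lookup: map the group to one (node_name, node_port)
    # pair and copy it into the resolved placement.
    node_name, node_port = node_by_group[group_placement.group_id]
    return ShardPlacement(
        placement_id=group_placement.placement_id,
        shard_id=group_placement.shard_id,
        group_id=group_placement.group_id,
        node_name=node_name,
        node_port=node_port,
    )
```

Keeping the two types separate means catalog code never has to carry node details it does not own, while execution code always receives a placement it can connect to.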
Marco Slot d3785b97c0 Remove XactModificationLevel distinction between DML and multi-shard 2017-07-12 11:59:19 +02:00
Marco Slot 29f21fea59 Use GetPlacementListConnection for multi-shard commands 2017-07-12 11:26:22 +02:00
Burak Yucesoy c8b9e4011b Remove LockRelationDistributionMetadata function 2017-07-10 15:46:37 +03:00
Andres Freund ddb0651967 Move citus tools to interrupt aware libpq wrappers. 2017-07-04 12:38:52 -07:00
Andres Freund b96ba9b490 Fix code only enabled for 9.5.
There's still supporting wrappers used, a subsequent commit will
remove those.

This also removes the already unused tuplecount_t define.
2017-06-26 08:46:32 -07:00
Jason Petersen 2204da19f0 Support PostgreSQL 10 (#1379)
Adds support for PostgreSQL 10 by copying in the requisite ruleutils
and updating all API usages to conform with changes in PostgreSQL 10.
Most changes are fairly minor but they are numerous. One particular
obstacle was the change in \d behavior in PostgreSQL 10's psql; I had
to add SQL implementations (views, mostly) to mimic the pre-10 output.
2017-06-26 02:35:46 -06:00
velioglu 173fe137af Convert DropShardsFromWorker to the new connection API 2017-06-15 15:24:06 +03:00
velioglu 43d2cdbd35 Convert DistributedTableSizeOnWorker function to new connection API 2017-06-14 17:29:58 +03:00
velioglu a1ea29ec2b Use placement connection to drop shards instead of node connection 2017-06-14 14:14:59 +03:00
Marco Slot f1d804180b Don't take a table lock in ForeignConstraintGetReferencedTableId 2017-05-31 11:15:21 +02:00
Burak Yucesoy 9fb15c439c Add version checks to necessary UDFs 2017-05-22 09:53:29 +03:00
Marco Slot a8f368fced Fix locking in master_drop_all_shards / master_apply_delete_command 2017-05-08 17:26:55 +02:00
Marco Slot 8edba5f309 Honour enable_ddl_propagation in truncate trigger 2017-04-29 03:32:52 +02:00
Marco Slot 6e58067962 Fix list length lookup in WorkerGetLiveNodeCount 2017-04-29 02:13:20 +02:00
Marco Slot 0b579d027a Check whether relation ID exists in citus_relation_size 2017-04-29 01:39:39 +02:00
Andres Freund d399f395f7 Faster shard pruning.
So far citus used postgres' predicate proofing logic for shard
pruning, except for INSERT and COPY which were already optimized for
speed.  That turns out to be too slow:
* Shard pruning for SELECTs is currently O(#shards), because
  PruneShardList calls predicate_refuted_by() for every
  shard. Obviously using an O(N) type algorithm for general pruning
  isn't good.
* predicate_refuted_by() is quite expensive in its own right. That's
  primarily because it's optimized for doing a single refutation
  proof, rather than performing the same proof over and over.
* predicate_refuted_by() does not keep persistent state (see 2.) for
  function calls, which means that a lot of syscache lookups will be
  performed. That's particularly bad if the partitioning key is a
  composite key, because without a persistent FunctionCallInfo
  record_cmp() has to repeatedly look-up the type definition of the
  composite key. That's quite expensive.

Thus replace this with custom-code that works in two phases:
1) Search restrictions for constraints that can be pruned upon
2) Use those restrictions to search for matching shards in the most
   efficient manner available:
   a) Binary search / Hash Lookup in case of hash partitioned tables
   b) Binary search for equal clauses in case of range or append
      tables without overlapping shards.
   c) Binary search for inequality clauses, searching for both lower
      and upper boundaries, again in case of range or append
      tables without overlapping shards.
   d) exhaustive search testing each ShardInterval

My measurements suggest that we are considerably, often orders of
magnitude, faster than the previous solution, even if we have to fall
back to exhaustive pruning.
2017-04-28 14:40:41 -07:00
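The second phase described in the "Faster shard pruning" commit above (binary search when shard ranges do not overlap, exhaustive search otherwise) can be sketched in Python. This is an illustrative model only: the `ShardIndex` class, integer-valued inclusive ranges, and the method names are assumptions, and the hash-lookup case and the extraction of constraints from query restrictions are not covered.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple

# Hypothetical model: a shard covers the inclusive range (min_value, max_value).
ShardInterval = Tuple[int, int]


class ShardIndex:
    """Sorted shard intervals with a precomputed overlap flag."""

    def __init__(self, shards: List[ShardInterval]) -> None:
        self.shards = sorted(shards)
        self.mins = [lo for lo, _ in self.shards]
        self.maxs = [hi for _, hi in self.shards]
        # Binary search is only valid when no two shards overlap; in a
        # list sorted by minimum, checking adjacent pairs is sufficient.
        self.overlapping = any(
            prev_hi >= cur_lo
            for prev_hi, cur_lo in zip(self.maxs, self.mins[1:])
        )

    def _exhaustive(self, lower: int, upper: int) -> List[int]:
        # Fallback: test every shard interval against the bounds.
        return [i for i, (lo, hi) in enumerate(self.shards)
                if hi >= lower and lo <= upper]

    def prune_range(self, lower: int, upper: int) -> List[int]:
        """Indexes (in sorted order) of shards intersecting [lower, upper]."""
        if self.overlapping:
            return self._exhaustive(lower, upper)
        first = bisect_left(self.maxs, lower)   # first shard with max >= lower
        last = bisect_right(self.mins, upper)   # one past last shard with min <= upper
        return list(range(first, last))

    def prune_equal(self, value: int) -> List[int]:
        """Indexes of shards that may contain `value`."""
        return self.prune_range(value, value)
```

For non-overlapping shards each lookup costs two binary searches instead of one proof per shard, which is the O(#shards) cost the commit message sets out to avoid.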
Marco Slot 7f9e80db10 Only process error if not NULL in StoreErrorMessage 2017-04-21 17:01:01 +02:00
Marco Slot 5e58804d44 Support query parameters in combination with function evaluation 2017-04-17 15:40:55 +02:00
Marco Slot 0bcc227a62 Create indexes after worker_append_table_to_shard during shard repair 2017-04-17 15:17:21 +02:00
Burak Yucesoy e9095e62ec Decouple reference table replication
With this change we add an option to add a node without replicating all reference
tables to that node. If a node is added with this option, we mark the node as
inactive and no queries will be sent to that node.

We also added two new UDFs:
 - master_activate_node(host, port):
    - marks node as active and replicates all reference tables to that node
 - master_add_inactive_node(host, port):
    - only adds node to pg_dist_node
2017-04-17 13:33:31 +03:00
velioglu e32aff1a26 Size UDFs implemented
citus_table_size, citus_relation_size and citus_total_relation_size UDFs are implemented.
2017-03-16 13:50:30 +03:00