Commit Graph

39 Commits (685b54b3de1eb3d8f6b24e3e8505a73f06c6a98a)

Author SHA1 Message Date
Jelte Fennema 685b54b3de
Semmle: Check for NULL in some places where it might occur (#3509)
Semmle reported quite a few places where we use a value that could be NULL. Most of these are not actually a real issue, but it is better to be on the safe side with these things and make the static analysis happy.
2020-02-27 10:45:29 +01:00
Philip Dubé 4b5d6c3ebe Rename RelayFileState to ShardState
Replace FILE_ prefix with SHARD_STATE_
2020-01-12 05:57:53 +00:00
SaitTalhaNisanci 13204487e9
remove copyright years (#3286) 2019-12-11 21:14:08 +03:00
Philip Dubé fcf2fd819b Add distributioncolumncollation to pg_dist_colocation
Use partition column's collation for range distributed tables
Don't allow non-deterministic collations for hash distributed tables
CoPartitionedTables: don't compare unequal types
2019-12-09 19:51:40 +00:00
Hadi Moshayedi 2268a9cae6 Error for metadata commands if any metadata node is out-of-sync (#3226)
* Error for metadata commands if any metadata node is out-of-sync

* Make the functions have separate APIs for all workers/metadata workers
2019-11-27 09:52:57 +01:00
Jelte Fennema 1d8dde232f
Automatically convert useless declarations using regex replace (#3181)
* Add declaration removal to CI

* Convert declarations
2019-11-21 13:47:29 +01:00
Hadi Moshayedi 15af1637aa Replicate reference tables to coordinator. 2019-11-15 05:50:19 -08:00
Hadi Moshayedi cb011bb30f Propagate isactive to metadata nodes. 2019-11-15 05:48:42 -08:00
Hadi Moshayedi e00d1546f3 Don't maintain replicationfactor of reference tables 2019-11-05 07:23:14 -08:00
Hadi Moshayedi 76f3933b05 Add metadatasynced, and sync on master_update_node()
Co-authored-by: pykello <hadi.moshayedi@microsoft.com>
Co-authored-by: serprex <serprex@users.noreply.github.com>
2019-09-18 09:32:54 -07:00
Philip Dubé 492d1b2cba ActivePrimaryNodeList: add lockMode parameter 2019-09-13 17:44:56 +00:00
Nils Dijk 2879689441
Distribute Types to worker nodes (#2893)
DESCRIPTION: Distribute Types to worker nodes

When to propagate
==============

There are two logical moments at which types could be distributed to the worker nodes:
 - When they get used (just in time distribution)
 - When they get created (proactive distribution)

The just in time distribution follows the model of how schemas get created right before we create a table in that schema; for types this would be when a table uses the type as one of its columns.

The proactive distribution is suitable for situations where it is beneficial to have the type on the worker nodes right away. The type can later be used in queries where an intermediate result gets created with a cast to this type.

Just in time creation is always the last resort; you cannot create a distributed table before the type it uses exists. A good example use case: you have an existing postgres server that needs to scale out. You add the citus extension, add some nodes to the cluster, and distribute the table. The type was created before citus existed, so there was no moment at which citus could have propagated the creation of the type.
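A minimal sketch of that scenario (type, table, and node names are hypothetical):

```sql
-- objects that existed before the citus extension was installed
CREATE TYPE order_status AS ENUM ('new', 'paid', 'shipped');
CREATE TABLE orders (id bigint PRIMARY KEY, status order_status);

-- scale out: add the citus extension, add nodes, distribute the table
CREATE EXTENSION citus;
SELECT master_add_node('worker-1', 5432);
SELECT create_distributed_table('orders', 'id');
-- the pre-existing order_status type is propagated just in time,
-- right before the shards of orders are created
```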

Proactive distribution is almost always a good option. Types are not resource intensive objects; there is no performance overhead in having hundreds of types. If you want to use them in a query to represent an intermediate result (which happens in our test suite), they just work.

There is, however, a moment when proactive type distribution is not beneficial: in transactions where the type is used in a distributed table.

Let's assume the following transaction:

```sql
BEGIN;
CREATE TYPE tt1 AS (a int, b int);
CREATE TABLE t1 (a int PRIMARY KEY, b tt1);
SELECT create_distributed_table('t1', 'a');
\copy t1 FROM bigdata.csv
```

Types are node scoped objects, meaning the type exists once per worker. Shards, however, perform best when they are created over their own connection. For the type to be visible on all connections it needs to be created and committed before we try to create the shards. Here the just in time approach is most beneficial and follows how we create schemas on the workers. Outside of a transaction block we will just use one connection to propagate the creation.

How propagation works
=================

Just in time
-----------

Just in time propagation hooks into the infrastructure introduced in #2882. It adds types as a supported object in `SupportedDependencyByCitus`. This makes sure that any object being distributed by citus that depends on types will now cascade into types. When types themselves depend on other objects, those objects will get created first.

The actual creation works by getting the DDL commands for the object by its `ObjectAddress` in `GetDependencyCreateDDLCommands`, which dispatches types to `CreateTypeDDLCommandsIdempotent`.

To walk the graph correctly we follow array types; when later asked for the DDL commands for an array type we return `NIL` (an empty list), which means the object will not be recorded as distributed (it is an internal type, dependent on the user type).
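A minimal sketch of the cascading behaviour (schema, type, and table names are hypothetical):

```sql
CREATE SCHEMA app;
CREATE TYPE app.money_amount AS (amount numeric, currency text);
CREATE TABLE app.payments (id bigint PRIMARY KEY, total app.money_amount);

-- distributing the table cascades into its dependencies: the app schema is
-- created on the workers first, then app.money_amount, then the shards
SELECT create_distributed_table('app.payments', 'id');
```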

Proactive distribution
---------------------

When the user creates a type (composite or enum), a hook runs in `multi_ProcessUtility` after the command has been applied locally. Running after the local application means we already have an `ObjectAddress` for the type, which is required to mark the type as being distributed.
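A small sketch of proactive distribution, checked via `run_command_on_workers` (the type name is hypothetical):

```sql
-- created outside a transaction block, so the type is propagated right away
CREATE TYPE shipment_status AS ENUM ('pending', 'shipped');

-- verify that the type now exists on every worker
SELECT * FROM run_command_on_workers($$SELECT 'shipment_status'::regtype$$);
```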

Keeping the type up to date
====================

For types that are recorded in `pg_dist_object` (i.e. `IsObjectDistributed` returns true for the `ObjectAddress`) we will intercept the utility commands that alter the type; each is illustrated below.
 - `AlterTableStmt` with `relkind` set to `OBJECT_TYPE` encapsulates changes to the fields of a composite type.
 - `DropStmt` with `removeType` set to `OBJECT_TYPE` encapsulates `DROP TYPE`.
 - `AlterEnumStmt` encapsulates changes to enum values.
    Enum types cannot be changed transactionally. When the execution on a worker fails, a warning is shown to the user that the propagation was incomplete due to a worker communication failure. An idempotent command is shown for the user to re-execute once the worker communication is fixed.
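Concrete statements that map to these parse nodes (reusing the hypothetical types from the sketches above):

```sql
-- AlterTableStmt with relkind OBJECT_TYPE: change the fields of a composite type
ALTER TYPE tt1 ADD ATTRIBUTE c int;

-- AlterEnumStmt: change the values of an enum type
ALTER TYPE shipment_status ADD VALUE 'delivered';

-- DropStmt with removeType OBJECT_TYPE
DROP TYPE tt1;
```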

Keeping types up to date is done via the executor. Before the statement is executed locally we create a plan on how to apply it on the workers. This plan is executed after we have applied the statement locally.

For types that have already been distributed, all changes need to be done in the same transaction and will fail with an error if parallel queries have already been executed in that transaction, much like foreign keys to reference tables.
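A sketch of that failure mode, assuming `t1` and `tt1` from the earlier example are already distributed:

```sql
BEGIN;
-- a query over the distributed table, executed across parallel worker connections
SELECT count(*) FROM t1;
-- altering the already-distributed type in the same transaction is then expected
-- to fail with an error, as described above
ALTER TYPE tt1 ADD ATTRIBUTE c int;
ROLLBACK;
```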
2019-09-13 17:46:07 +02:00
Nils Dijk 936d546a3c
Refactor Ensure Schema Exists to Ensure Dependencies Exist (#2882)
DESCRIPTION: Refactor ensure schema exists to ensure dependencies exist

Historically we only supported schemas as table dependencies to be created on the workers before a table gets distributed. This PR puts infrastructure in place to walk pg_depend to figure out which dependencies to create on the workers. Currently only schemas are supported as objects to create before creating a table.

We also keep track of dependencies that have been created in the cluster. When we add a new node to the cluster we use this catalog to know which objects need to be created on the worker.

A side effect of knowing which objects are already distributed is that we no longer emit debug messages when creating schemas that already exist on the workers.
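A minimal sketch of the tracked-dependency behaviour (schema, table, and node names are hypothetical):

```sql
CREATE SCHEMA sales;
CREATE TABLE sales.orders (order_id bigint PRIMARY KEY);
-- the sales schema is created on the existing workers before the table is distributed
SELECT create_distributed_table('sales.orders', 'order_id');

-- when a new worker is added, the tracked dependencies (here the sales schema)
-- are created on it as well
SELECT master_add_node('worker-3', 5432);
```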
2019-09-04 14:10:20 +02:00
Hadi Moshayedi a5b087c89b Support FKs between reference tables 2019-08-21 16:11:27 -07:00
Philip Dubé cd951fa9ca Avoid multiple pg_dist_colocation records being created for reference tables
master_deactivate_node is updated to decrement the replication factor
Otherwise deactivation could have create_reference_table produce a second record

UpdateColocationGroupReplicationFactor is renamed UpdateColocationGroupReplicationFactorForReferenceTables
& the implementation looks up the record based on distributioncolumntype == InvalidOid, rather than by id
Otherwise the record's replication factor fails to be maintained when there are no reference tables
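The lookup the renamed function performs can be expressed as the equivalent catalog query (a sketch; the actual implementation is in C and assumes the standard pg_dist_colocation columns):

```sql
-- reference tables share a single colocation record; it is identified by
-- distributioncolumntype = 0 (InvalidOid) rather than by a hard-coded id
SELECT colocationid, shardcount, replicationfactor
FROM pg_dist_colocation
WHERE distributioncolumntype = 0;
```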
2019-08-13 17:21:02 +00:00
Hanefi Onaldi 7e8fd49b94 Create Schemas as superuser on all shard/table creation UDFs
- All schema creations on the workers will now be made via superuser connections
- If a shard is being repaired or a shard is replicated, we will create the
  schema only in the relevant worker; and in all the other cases where a schema
  creation is needed, we will block operations until we ensure the schema exists
  in all the workers
2019-06-26 17:12:28 +02:00
Hadi Moshayedi f4d3b94e22
Fix some of the casts for groupId (#2609)
A small change which partially addresses #2608.
2019-03-05 12:06:44 -08:00
Marco Slot aab9f623eb Check table ownership in upgrade_to_reference_table 2018-11-16 23:27:34 +01:00
Marco Slot f383e4f307
Description: Refactor code that handles DDL commands from one file into a module
The file handling the utility functions (DDL) for citus organically grew over time and became unreasonably large. This refactor splits that file's functionality into separate files per command, initially modeled after the directory and file layout found in postgres.

Although the size of the change is quite big, there are barely any code changes. Only two functions have been added, for readability purposes:

- PostProcessIndexStmt which is extracted from PostProcessUtility
- PostProcessAlterTableStmt which is extracted from multi_ProcessUtility

A README.md has been added to `src/backend/distributed/commands` describing the contents of the module and every file in the module.
We need more documentation around the overloading of the COPY command; for now the boilerplate has been added for people with better knowledge to fill out.
2018-11-14 13:36:27 +01:00
Onder Kalaci abc443d7fa Make sure that shard repair considers replication factor 2018-09-21 15:24:49 +03:00
velioglu bd30e3e908 Add support for writing to reference tables from MX nodes 2018-08-27 18:15:04 +03:00
mehmet furkan şahin d1a3b20115 foreign_constraint_utils is created 2018-06-07 18:19:24 +03:00
Murat Tuncer 2d66bf5f16
Fix hard coded formatting strings for 64 bit numbers (#1831)
Postgres provides OS agnostic formatting macros for
formatting 64 bit numbers. Replaced %ld and %lu with
INT64_FORMAT and UINT64_FORMAT respectively.

Also found some incorrect usages of formatting
flags and fixed them.
2017-12-04 14:11:06 +03:00
Marco Slot 868ee6be83 Fix and simplify pg_dist_node locking 2017-08-09 14:09:54 +02:00
Brian Cloutier a3e9bef685 All users of WorkerNodeHash take an AccessShareLock
The metadata cache simulates a SELECT on pg_dist_node. Now the locks it
takes also simulate that SELECT.
2017-08-08 13:12:06 +03:00
Brian Cloutier 94947c0d54 Refactor: ReplicateShardToAllWorkers more explicitly locks pg_dist_node 2017-08-08 13:12:06 +03:00
Brian Cloutier ec99f8f983 Add nodeRole column
- master_add_node enforces that there is only one primary per group (see the sketch after this list)
- there's also a trigger on pg_dist_node to prevent multiple primaries
  per group
- functions in metadata cache only return primary nodes
- Rename ActiveWorkerNodeList -> ActivePrimaryNodeList
- Rename WorkerGetLive{Node->Group}Count()
- Refactor WorkerGetRandomCandidateNode
- master_remove_node only complains about active shard placements if the
  node being removed is a primary.
- master_remove_node only deletes all reference table placements in the
  group if the node being removed is the primary.
- Rename {Node->NodeGroup}HasShardPlacements, this reflects the behavior it
  already had.
- Rename DeleteAllReferenceTablePlacementsFrom{Node->NodeGroup}. This also
  reflects the behavior it already had, but the new signature forces the
  caller to pass in a groupId
- Rename {WorkerGetLiveGroup->ActivePrimaryNode}Count
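A small sketch of the catalog shape after this change (the node name is hypothetical; the column list assumes the standard pg_dist_node definition):

```sql
SELECT master_add_node('worker-1', 5432);

-- each group has exactly one primary; metadata cache functions such as
-- ActivePrimaryNodeList only return rows with noderole = 'primary'
SELECT nodename, nodeport, groupid, noderole FROM pg_dist_node;
```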
2017-07-24 11:57:46 +03:00
Brian Cloutier 74dd5bb281 Fix crash when removing an inactive node 2017-07-20 18:55:40 +03:00
Brian Cloutier 7ad95b53d2 Rename pg_dist_shard_placement -> pg_dist_placement
Comes with a few changes:

- Change the signature of some functions to accept groupid
  - InsertShardPlacementRow
  - DeleteShardPlacementRow
  - UpdateShardPlacementState

- NodeHasActiveShardPlacements returns true if the group the node is a
  part of has any active shard placements

- TupleToShardPlacement now returns ShardPlacements which have NULL
  nodeName and nodePort.

- Populate (nodeName, nodePort) when creating ShardPlacements
- Disallow removing a node if it contains any shard placements

- DeleteAllReferenceTablePlacementsFromNode matches based on group. This
  doesn't change behavior for now (while there is only one node per
  group), but means in the future callers should be careful about
  calling it on a secondary node; it'll delete placements on the primary.

- Create concept of a GroupShardPlacement, which represents an actual
  tuple in pg_dist_placement and is distinct from a ShardPlacement,
  which has been resolved to a specific node. In the future
  ShardPlacement should be renamed to NodeShardPlacement.

- Create some triggers which allow existing code to continue to insert
  into and update pg_dist_shard_placement as if it still existed (sketched below).
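A sketch of the two catalogs after this change (the column lists assume the standard definitions):

```sql
-- placements are now stored per group in pg_dist_placement
SELECT placementid, shardid, shardstate, groupid FROM pg_dist_placement;

-- pg_dist_shard_placement stays usable for existing code (backed by triggers)
-- and resolves each group to a concrete nodename/nodeport
SELECT shardid, shardstate, nodename, nodeport FROM pg_dist_shard_placement;
```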
2017-07-12 14:17:31 +02:00
Onder Kalaci 5f3f1d75a3 Add some utility functions for partitioned tables
This commit is intended to be a base for supporting declarative partitioning
on distributed tables. Here we add the following utility functions and their
unit tests:

  * Very basic functions including differentiating partitioned tables and
    partitions, listing the partitions
  * Generating the PARTITION BY (expr) and adding this to the DDL events
    of partitioned tables
  * Ability to generate text representations of the ranges for partitions
  * Ability to generate the `ALTER TABLE parent_table ATTACH PARTITION
    partition_table FOR VALUES value_range`
  * Ability to add shard ids to the above command using
    `worker_apply_inter_shard_ddl_command()` (see the sketch after this list)
  * Ability to generate `ALTER TABLE parent_table DETACH PARTITION`
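A minimal sketch of the generated command (table names and the value range are hypothetical):

```sql
CREATE TABLE events (event_id bigint, created_at date) PARTITION BY RANGE (created_at);
CREATE TABLE events_2017 (event_id bigint, created_at date);

-- the generated attach command
ALTER TABLE events ATTACH PARTITION events_2017
    FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');
-- worker_apply_inter_shard_ddl_command() can then replay the same command with
-- the parent and partition table names replaced by the corresponding shard names
```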
2017-06-28 09:39:55 +03:00
Burak Yucesoy 9fb15c439c Add version checks to necessary UDFs 2017-05-22 09:53:29 +03:00
Burak Yucesoy e9095e62ec Decouple reference table replication
With this change we add an option to add a node without replicating all reference
tables to that node. If a node is added with this option, we mark the node as
inactive and no queries will be sent to that node.

We also added two new UDFs (usage sketched below):
 - master_activate_node(host, port):
    - marks node as active and replicates all reference tables to that node
 - master_add_inactive_node(host, port):
    - only adds node to pg_dist_node
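A short usage sketch of the two new UDFs (the node name is hypothetical):

```sql
-- add a node to pg_dist_node without replicating reference tables to it
SELECT master_add_inactive_node('worker-2', 5432);

-- later, mark it active and replicate all reference tables to it
SELECT master_activate_node('worker-2', 5432);
```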
2017-04-17 13:33:31 +03:00
Marco Slot ba940a1de9 Use coordinator instead of schema node in terminology 2017-01-25 11:07:23 +01:00
Eren Basak e7c15ecc1f Make `upgrade_to_reference_table` function MX-compatible 2017-01-18 16:49:50 +03:00
Eren Basak 56ca590daa Propagate metadata changes for deleted reference table placements on master_remove_node call 2017-01-18 16:00:07 +03:00
Eren Basak be78769ae4 Propagate new reference table placement metadata on `master_add_node` 2017-01-18 15:59:06 +03:00
Burak Yucesoy 3315ae6142 Remove placement metadata of reference tables after master_remove_node
With this change, we start to delete the placements of reference tables at the given worker node
after a master_remove_node UDF call. We remove the placement metadata at the master node but we do
not drop the actual shards from the worker node. There are two reasons for that decision:
first, it is not critical to DROP the shards on the workers because Citus will ignore them
as long as the node is removed from the cluster, and if we add that node back to the cluster we will
DROP and recreate all reference tables. Second, if the node is unreachable, it becomes
complicated to cover failure cases and have transaction support.
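A sketch of the behaviour described above (the node name is hypothetical):

```sql
SELECT master_remove_node('worker-2', 5432);
-- placement metadata for reference tables on worker-2 is deleted on the master,
-- while the shard relations themselves are left in place on the removed node
```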
2017-01-16 11:24:56 +03:00
Burak Yucesoy 9c9f479e4b Replicate reference tables when new node is added
With this change, we start to replicate all reference tables to the new node when a new node
is added to the cluster with the master_add_node command. We also update the replication factor
of the reference tables' colocation group.
2017-01-05 14:30:41 +03:00
Burak Yucesoy 31cd2357fe Add upgrade_to_reference_table
With this change we introduce a new UDF, upgrade_to_reference_table, which can be used to
upgrade existing broadcast tables to reference tables. For upgrading, we require that the given
table contains only one shard.
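A short usage sketch (the table name is hypothetical; the table must consist of exactly one shard):

```sql
SELECT upgrade_to_reference_table('account_settings');
```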
2017-01-02 17:54:42 +02:00