citus

Commit Graph

Author	SHA1	Message	Date
Jelte Fennema	9fb897a074	Fix queries with repartition joins and group by unique column (#3157 ) Postgres doesn't require you to add all columns that are in the target list to the GROUP BY when you group by a unique column (or columns). It even actively removes these group by clauses when you do. This is normally fine, but for repartition joins it is not. The reason for this is that the temporary tables don't have these primary key columns. So when the worker executes the query it will complain that it is missing columns in the group by. This PR fixes that by adding an ANY_VALUE aggregate around each variable in the target list that does is not contained in the group by or in an aggregate. This is done only for repartition joins. The ANY_VALUE aggregate chooses the value from an undefined row in the group.	2019-11-08 15:36:18 +01:00
SaitTalhaNisanci	94a7e6475c	Remove copyright years (#2918 ) * Update year as 2012-2019 * Remove copyright years	2019-10-15 17:44:30 +03:00
Philip Dubé	dd490b6376	Cache whether an object is in pg_dist_object. Avoids redundant lookups for non-distributed objects	2019-10-10 14:50:38 +00:00
Marco Slot	2868e02a3d	Implement SELECT function call delegation. When a function is marked as colocated with a distributed table, we try delegating queries of kind "SELECT func(...)" to workers. We currently only support this simple form, and don't delegate forms like "SELECT f1(...), f2(...)", "SELECT f1(...) FROM ...", or function calls inside transactions. As a side effect, we also fix the transactional semantics of DO blocks. Previously we didn't consider a DO block a multi-statement transaction. Now we do. Co-authored-by: Marco Slot <marco@citusdata.com> Co-authored-by: serprex <serprex@users.noreply.github.com> Co-authored-by: pykello <hadi.moshayedi@microsoft.com>	2019-09-27 09:13:25 -07:00
Philip Dubé	432a8ef85b	Hadi's feedback Co-authored-by: pykello <hadi.moshayedi@microsoft.com> Co-authored-by: serprex <serprex@users.noreply.github.com>	2019-09-24 17:31:09 +00:00
Philip Dubé	bc1ad67eb5	Distribute CALL on distributed procedures to metadata workers Lots taken from https://github.com/citusdata/citus/pull/2829	2019-09-24 17:31:09 +00:00
Jelte Fennema	eb7e45d556	Make LookupNodeForGroup extern	2019-09-12 16:40:25 +02:00
Jelte Fennema	de5174f763	include postgres.h into some of our .h files to silence warnings	2019-09-12 16:40:25 +02:00
Nils Dijk	936d546a3c	Refactor Ensure Schema Exists to Ensure Dependecies Exists (#2882 ) DESCRIPTION: Refactor ensure schema exists to dependency exists Historically we only supported schema's as table dependencies to be created on the workers before a table gets distributed. This PR puts infrastructure in place to walk pg_depend to figure out which dependencies to create on the workers. Currently only schema's are supported as objects to create before creating a table. We also keep track of dependencies that have been created in the cluster. When we add a new node to the cluster we use this catalog to know which objects need to be created on the worker. Side effect of knowing which objects are already distributed is that we don't have debug messages anymore when creating schema's that are already created on the workers.	2019-09-04 14:10:20 +02:00
exialin	59e54de54d	Minor code clean-up	2019-05-24 14:26:26 +02:00
Marco Slot	5ff1821411	Cache the current database name Purely for performance reasons.	2019-03-20 12:14:46 +03:00
Hadi Moshayedi	f4d3b94e22	Fix some of the casts for groupId (#2609 ) A small change which partially addresses #2608.	2019-03-05 12:06:44 -08:00
Onder Kalaci	974cbf11a5	Hide shard names on MX worker nodes This commit by default enables hiding shard names on MX workers by simple replacing `pg_table_is_visible()` calls with `citus_table_is_visible()` calls on the MX worker nodes. The latter function filters out tables that are known to be shards. The main motivation of this change is a better UX. The functionality can be opted out via a GUC. We also added two views, namely citus_shards_on_worker and citus_shard_indexes_on_worker such that users can query them to see the shards and their corresponding indexes. We also added debug messages such that the filtered tables can be interactively seen by setting the level to DEBUG1.	2018-08-07 14:21:45 +03:00
velioglu	6be6911ed9	Create foreign key relation graph and functions to query on it	2018-07-03 17:05:55 +03:00
Onder Kalaci	2f01894589	Track relation accesses using the connection management infrastructure	2018-06-25 18:40:30 +03:00
Onder Kalaci	df44956dc3	Make sure that sequential DDL opens a single connection to each node After this commit DDL commands honour `citus.multi_shard_modify_mode`. We preferred using the code-path that executes single task router queries (e.g., ExecuteSingleModifyTask()) in order not to invent a new executor that is only applicable for DDL commands that require sequential execution.	2018-06-05 17:52:17 +03:00
Dimitri Fontaine	8b258cbdb0	Lock reads and writes only to the node being updated in master_update_node Rather than locking out all the writes in the cluster, the function now only locks out writes that target shards hosted by the node we're updating.	2018-05-09 15:14:20 +02:00
Onder Kalaci	317dd02a2f	Implement single repartitioning on hash distributed tables * Change worker_hash_partition_table() such that the divergence between Citus planner's hashing and worker_hash_partition_table() becomes the same. * Rename single partitioning to single range partitioning. * Add single hash repartitioning. Basically, logical planner treats single hash and range partitioning almost equally. Physical planner, on the other hand, treats single hash and dual hash repartitioning almost equally (except for JoinPruning). * Add a new GUC to enable this feature	2018-05-02 18:50:55 +03:00
Marco Slot	304b3a41ba	Cache the partition column Var	2018-04-26 14:58:16 -06:00
Marco Slot	6f7c3bd73b	Skip JSON validation on coordinator during COPY	2018-02-02 15:33:27 +01:00
Marco Slot	cbbd418af2	Add citus.copy_format OIDs to metadata cache	2017-12-14 09:32:55 +01:00
Marco Slot	f8550b8c85	Fix issues with read_intermediate_result signature	2017-12-07 13:47:56 +01:00
Marco Slot	7279d42849	Treat read_intermediate_result as recurring tuples	2017-12-04 14:50:11 +01:00
Hadi Moshayedi	34f3ec0961	Call FlushDistTableCache() before stats collection.	2017-10-31 21:51:43 -04:00
Hadi Moshayedi	9a04b78980	Send server_id for statistics reports. (#1698 ) This change introduces the `pg_dist_node_metadata` which has a single jsonb value. When creating the extension, a random server id is generated and stored in there. Everything in the metadata table is added as a nested objected to the json payload that is sent to the reports server.	2017-10-18 21:20:32 -04:00
Brian Cloutier	ebcb2b65e9	Add master_move_node function	2017-10-16 10:51:28 -07:00
Marco Slot	394918f9d0	Invalidate worker and group ID cache in maintenance daemon	2017-10-02 18:14:29 +02:00
Marco Slot	da6b42a3e2	Use unique constraint index for transaction record deletion	2017-09-28 12:04:56 +02:00
Marco Slot	7523753a73	Clear metadata OID cache prior to deadlock detection	2017-08-18 11:20:24 +02:00
Brian Cloutier	9d93fb5551	Create citus.use_secondary_nodes GUC This GUC has two settings, 'always' and 'never'. When it's set to 'never' all behavior stays exactly as it was prior to this commit. When it's set to 'always' only SELECT queries are allowed to run, and only secondary nodes are used when processing those queries. Add some helper functions: - WorkerNodeIsSecondary(), checks the noderole of the worker node - WorkerNodeIsReadable(), returns whether we're currently allowed to read from this node - ActiveReadableNodeList(), some functions (namely, the ones on the SELECT path) don't require working with Primary Nodes. They should call this function instead of ActivePrimaryNodeList(), because the latter will error out in contexts where we're not allowed to write to nodes. - ActiveReadableNodeCount(), like the above, replaces ActivePrimaryNodeCount(). - EnsureModificationsCanRun(), error out if we're not currently allowed to run queries which modify data. (Either we're in read-only mode or use_secondary_nodes is set) Some parts of the code were switched over to use readable nodes instead of primary nodes: - Deadlock detection - DistributedTableSize, - the router, real-time, and task tracker executors - ShardPlacement resolution	2017-08-10 17:37:17 +03:00
Brian Cloutier	ec99f8f983	Add nodeRole column - master_add_node enforces that there is only one primary per group - there's also a trigger on pg_dist_node to prevent multiple primaries per group - functions in metadata cache only return primary nodes - Rename ActiveWorkerNodeList -> ActivePrimaryNodeList - Rename WorkerGetLive{Node->Group}Count() - Refactor WorkerGetRandomCandidateNode - master_remove_node only complains about active shard placements if the node being removed is a primary. - master_remove_node only deletes all reference table placements in the group if the node being removed is the primary. - Rename {Node->NodeGroup}HasShardPlacements, this reflects the behavior it already had. - Rename DeleteAllReferenceTablePlacementsFrom{Node->NodeGroup}. This also reflects the behavior it already had, but the new signature forces the caller to pass in a groupId - Rename {WorkerGetLiveGroup->ActivePrimaryNode}Count	2017-07-24 11:57:46 +03:00
velioglu	6ea15fbb25	Make create_distributed_table transactional	2017-07-18 12:35:40 +03:00
Brian Cloutier	7ad95b53d2	Rename pg_dist_shard_placement -> pg_dist_placement Comes with a few changes: - Change the signature of some functions to accept groupid - InsertShardPlacementRow - DeleteShardPlacementRow - UpdateShardPlacementState - NodeHasActiveShardPlacements returns true if the group the node is a part of has any active shard placements - TupleToShardPlacement now returns ShardPlacements which have NULL nodeName and nodePort. - Populate (nodeName, nodePort) when creating ShardPlacements - Disallow removing a node if it contains any shard placements - DeleteAllReferenceTablePlacementsFromNode matches based on group. This doesn't change behavior for now (while there is only one node per group), but means in the future callers should be careful about calling it on a secondary node, it'll delete placements on the primary. - Create concept of a GroupShardPlacement, which represents an actual tuple in pg_dist_placement and is distinct from a ShardPlacement, which has been resolved to a specific node. In the future ShardPlacement should be renamed to NodeShardPlacement. - Create some triggers which allow existing code to continue to insert into and update pg_dist_shard_placement as if it still existed.	2017-07-12 14:17:31 +02:00
Marco Slot	01c9b1f921	Use GetPlacementListConnection for router SELECTs	2017-07-12 11:26:22 +02:00
Burak Yucesoy	c7bfa06cb9	Fix incorrect call to CheckInstalledVersion During version update, we indirectly calld CheckInstalledVersion via ChackCitusVersions. This obviously fails because during version update it is expected to have version mismatch between installed version and binary version. Thus, we remove that ChackCitusVersions. We now only call ChackAvailableVersion.	2017-05-24 17:39:25 +03:00
Burak Yucesoy	eea8c51e1f	Only error out on distributed queries when there is version mismatch Before this commit, we were erroring out at almost all queries if there is a version mismatch. With this commit, we started to error out only requested operation touches distributed tables. Normally we would need to use distributed cache to understand whether a table is distributed or not. However, it is not safe to read our metadata tables when there is a version mismatch, thus it is not safe to create distributed cache. Therefore for this specific occasion, we directly read from pg_dist_partition table. However; reading from catalog is costly and we should not use this method in other places as much as possible.	2017-05-22 09:53:29 +03:00
Andres Freund	6bd2e3ed30	Add DistTableCacheEntry->hasOverlappingShardInterval. This determines whether it's possible to perform binary search on sortedShardIntervalArray or not. If e.g. two shards have overlapping ranges, that'd be prohibitive. That'll be useful in later commit introducing faster shard pruning.	2017-04-28 14:40:38 -07:00
Andres Freund	105483ec56	Add DistTableCacheEntry->shardValueCompareFunction. That's useful when comparing values a hash-partitioned table is filtered by. The existing shardIntervalCompareFunction is about comparing hashed values, not unhashed ones. The added btree opclass function is so we can get a comparator back. This should be changed much more widely, but is not necessary so far.	2017-04-28 14:40:38 -07:00
Andres Freund	52571c00ad	Build DistTableCacheEntry->shardIntervalCompareFunction even for 0 shards. Previously we, unnecessarily, used a the first shard's type information to to look up the comparison function. But that information is already available, so use it. That's helpful because we sometimes want to access the comparator function even if there's no shards.	2017-04-28 14:40:38 -07:00
velioglu	2327b63291	Change native hash function with worker_hash	2017-04-19 22:16:55 +03:00
Burak Yucesoy	a09614553f	Add enable_version_checks GUC and address feedback	2017-04-04 19:11:13 +03:00
Burak Yucesoy	087d8427e3	Error out if binary citus version does not match installed extension With this change, we start to error out if loaded citus binaries does not match the available major version or installed citus extension version. In this case we force user to restart the server or run ALTER EXTENSION depending on the situation	2017-04-03 17:36:13 -06:00
Burak Yucesoy	484cb12cd0	Add LoadShardPlacement UDF This UDF returns a shard placement from cache given shard id and placement id. At the moment it iterates over all shard placements of given shard by ShardPlacementList and searches given placement id in that list, which is not a good solution performance-wise. However, currently, this function will be used only when there is a failed transaction. If a need arises we can optimize this function in the future.	2017-01-23 21:04:57 +03:00
Andres Freund	b813b39241	Cache ShardPlacements in metadata cache. So far we've reloaded them frequently. Besides avoiding that cost - noticeable for some workloads with large shard counts - it makes it easier to add information to ShardPlacements that help us make placement_connection.c colocation aware.	2017-01-10 18:14:18 -08:00
Burak Yucesoy	9c9f479e4b	Replicate reference tables when new node is added With this change, we start to replicate all reference tables to the new node when new node is added to the cluster with master_add_node command. We also update replication factor of reference table's colocation group.	2017-01-05 14:30:41 +03:00
Eren Basak	9eff968d1f	Add start_metadata_sync_to_node UDF This change adds `start_metadata_sync_to_node` UDF which copies the metadata about nodes and MX tables from master to the specified worker, sets its local group ID and marks its hasmetadata to true to allow it receive future DDL changes.	2016-12-13 10:48:03 +03:00
Andres Freund	fcd150c7c8	Invalidate relcache after pg_dist_shard_placement changes. This forces prepared statements to be re-planned after changes of the placement metadata. There's some locking issues remaining, but that's a a separate task. Also add regression tests verifying that invalidations take effect on prepared statements.	2016-10-26 03:36:35 -07:00
Metin Doslu	161093908e	Convert colocationid to uint32	2016-10-20 10:59:31 +03:00
Metin Doslu	40bdafa8d1	Add create_distributed_table() create_distributed_table() creates a hash distributed table with default values of shard count and shard replication factor.	2016-10-20 10:58:25 +03:00
Eren Basak	cee7b54e7c	Add worker transaction and transaction recovery infrastructure	2016-10-18 14:18:14 +03:00
Eren Basak	c7bf2021fa	Add metadata infrastructure for pg_dist_local_group table	2016-10-17 11:52:18 +03:00
Eren Basak	ed3af403fd	Add Metadata Snapshot Infrastructure This change adds the required infrastructure about metadata snapshot from MX codebase into Citus, mainly metadata_sync.c file and master_metadata_snapshot UDF.	2016-10-13 10:40:14 +03:00
Andres Freund	982ad66753	Introduce placement IDs. So far placements were assigned an Oid, but that was just used to track insertion order. It also did so incompletely, as it was not preserved across changes of the shard state. The behaviour around oid wraparound was also not entirely as intended. The newly introduced, explicitly assigned, IDs are preserved across shard-state changes. The prime goal of this change is not to improve ordering of task assignment policies, but to make it easier to reference shards. The newly introduced UpdateShardPlacementState() makes use of that, and so will the in-progress connection and transaction management changes.	2016-10-07 11:59:20 -07:00
Brian Cloutier	9d6699b07c	Switch from pg_worker_list.conf file to pg_dist_node metadata table. Related to #786 This change adds the `pg_dist_node` table that contains the information about the workers in the cluster, replacing the previously used `pg_worker_list.conf` file (or the one specified with `citus.worker_list_file`). Upon update, `pg_worker_list.conf` file is read and `pg_dist_node` table is populated with the file's content. After that, `pg_worker_list.conf` file is renamed to `pg_worker_list.conf.obsolete` For adding and removing nodes, the change also includes two new UDFs: `master_add_node` and `master_remove_node`, which require superuser permissions. 'citus.worker_list_file' guc is kept for update purposes but not used after the update is finished.	2016-10-05 13:01:35 +03:00
Marco Slot	32b2bd4ed8	Add replication model column to pg_dist_partition	2016-10-05 01:14:28 +02:00
Burak Yucesoy	1ee39eb098	Internal co-location API With this commit we introduce internal API for co-location related operations.	2016-09-29 11:56:53 +03:00
Andres Freund	25615ee9d7	Add CitusExtensionOwner(), to execute some priviledged operations under. There exist some operations we have to execute with elevated privileges. The most expedient user for that is the user owning the citusdb extension.	2016-04-27 10:26:08 -07:00
Andres Freund	42d232c0e8	Use the current session's username when connecting to worker nodes. So far we've always used libpq defaults when connecting to workers; bar special environment variables being set that'll always be the user that started the server. That's not desirable because it prevents using users with fewer privileges. Thus change the various APIs creating connections to workers to always use usernames. That means: 1) MultiClientConnect() needs to, optionally, accept a username 2) GetOrEstablishConnection(), including the underlying cache, need to use the current user as part of the connection cache key. That way connections for separate users are distinct, and we always use one with the correct authorization. 3) The task tracker needs to keep track of the username associated with a task, so it can use it when establishing connections outside the originating session.	2016-04-27 10:00:08 -07:00
Onder Kalaci	6c7abc2ba5	Add fast shard pruning path for INSERTs on hash partitioned tables This commit adds a fast shard pruning path for INSERTs on hash-partitioned tables. The rationale behind this change is that if there exists a sorted shard interval array, a single index lookup on the array allows us to find the corresponding shard interval. As mentioned above, we need a sorted (wrt shardminvalue) shard interval array. Thus, this commit updates shardIntervalArray to sortedShardIntervalArray in the metadata cache. Then uses the low-level API that is defined in multi_copy to handle the fast shard pruning. The performance impact of this change is more apparent as more shards exist for a distributed table. Previous implementation was relying on linear search through the shard intervals. However, this commit relies on constant lookup time on shard interval array. Thus, the shard pruning becomes less dependent on the shard count.	2016-04-26 11:16:00 +03:00
Jason Petersen	423e6c8ea0	Update copyright dates Fixed configure variable and updated all end dates to 2016.	2016-03-23 17:14:37 -06:00
Marco Slot	75a141a7c6	Merge remote-tracking branch 'origin/master' into feature/drop_shards_on_drop_table	2016-02-17 22:52:58 +01:00
Marco Slot	2af6797c04	Perform relcache invalidation in CitusInvalidateRelcacheByRelid	2016-02-16 12:59:38 +01:00
Murat Tuncer	55c44b48dd	Changed product name to citus All citusdb references in - extension, binary names - file headers - all configuration name prefixes - error/warning messages - some functions names - regression tests are changed to be citus.	2016-02-15 16:04:31 +02:00
Onder Kalaci	136306a1fe	Initial commit of Citus 5.0	2016-02-11 04:05:32 +02:00

1 2 3

114 Commits (184c7c0bce6b7bca61d25b828855fac5fba64816)