citus

Commit Graph

Author	SHA1	Message	Date
Naisila Puka	7d6410c838	Drop postgres 12 support (#6040 ) * Remove if conditions with PG_VERSION_NUM < 13 * Remove server_above_twelve(&eleven) checks from tests * Fix tests * Remove pg12 and pg11 alternative test output files * Remove pg12 specific normalization rules * Some more if conditions in the code * Change RemoteCollationIdExpression and some pg12/pg13 comments * Remove some more normalization rules	2022-07-20 17:49:36 +03:00
Onur Tirtir	dc31102630	Locally create objects having a dependency that we cannot distribute We were already doing so for functions & types believing that this cannot be the case for other object types. However, as in #5830, we cannot distribute an object that user attempts creating in temp schema. Even more, this doesn't only apply to functions and types but also to many other object types. So with this commit, we teach preprocess/postprocess functions (that need to create dependencies on worker nodes) how to skip trying to distribute such objects. We also start identifying temp schemas as the objects that we don't know how to propagate to worker nodes so that we can simply create objects locally if user attempts creating them in a temp schema. There are 36 callers of `EnsureDependenciesExistOnAllNodes` in the codebase atm and for the most we still need to throw a hard error (i.e.: not use `DeferErrorIfHasUnsupportedDependency` beforehand), such as: i) user explicitly wants to create a distributed object * CreateCitusLocalTable * CreateDistributedTable * master_create_worker_shards * master_create_empty_shard * create_distributed_function * EnsureExtensionFunctionCanBeDistributed ii) we don't want to skip altering distributed table on worker nodes * PostprocessIndexStmt * PostprocessCreateTriggerStmt * PostprocessCreateStatisticsStmt iii) object is already distributed / handled by Citus before, so we aren't okay with not propagating the ALTER command * PostprocessAlterTableSchemaStmt * PostprocessAlterCollationOwnerStmt * PostprocessAlterCollationSchemaStmt * PostprocessAlterDatabaseOwnerStmt * PostprocessAlterExtensionSchemaStmt * PostprocessAlterFunctionOwnerStmt * PostprocessAlterFunctionSchemaStmt * PostprocessAlterSequenceOwnerStmt * PostprocessAlterSequenceSchemaStmt * PostprocessAlterStatisticsSchemaStmt * PostprocessAlterStatisticsOwnerStmt * PostprocessAlterTextSearchConfigurationSchemaStmt * PostprocessAlterTextSearchDictionarySchemaStmt * PostprocessAlterTextSearchConfigurationOwnerStmt * PostprocessAlterTextSearchDictionaryOwnerStmt * PostprocessAlterTypeSchemaStmt * PostprocessAlterForeignServerOwnerStmt iv) we already cannot create those objects in temp schemas, so skipping for now * PostprocessCreateExtensionStmt * PostprocessCreateForeignServerStmt Also note that there are 3 more callers of `EnsureDependenciesExistOnAllNodes` in enterprise in addition to those 36 but we don't need to do anything specific about them due to the same reasoning given in iii).	2022-03-22 15:09:23 +03:00
Nils Dijk	3801576dfb	Move pg_dist_object to pg_catalog (#5765 ) DESCRIPTION: Move pg_dist_object to pg_catalog Historically `pg_dist_object` had been created in the `citus` schema as an experiment to understand if we could move our catalog tables to a branded schema. We quickly realised that this interfered with the UX on our managed services and other environments, where users connected via a user with the name of `citus`. By default postgres put the username on the search_path. To be able to read the catalog in the `citus` schema we would need to grant access permissions to the schema. This caused newly created objects like tables etc, to default to this schema for creation. This failed due to the write permissions to that schema. With this change we move the `pg_dist_object` catalog table to the `pg_catalog` schema, where our other schema's are also located. This makes the catalog table visible and readable by any user, like our other catalog tables, for debugging purposes. Note: due to the change of schema, we had to disable 1 test that was running into a discrepancy between the schema and binary. Secondly, we needed to make the lookup functions for the `pg_dist_object` relation and their indexes less strict on the fallback of the naming due to an other test that, due to an unfortunate cache invalidation, needed to lookup the relation again. This makes that we won't default to _only_ resolving from `pg_catalog` outside of upgrades.	2022-03-04 17:40:38 +00:00
Ahmet Gedemenli	b8eedcd261	Notice when create_distributed_function called without params (#5752 ) * Notice when create_distributed_function called without params * Move variable comments to top * Add valid check for cache entry * add objtype to notice msg * update test outputs * Add more tests * Address feedback	2022-03-04 17:26:39 +03:00
Ahmet Gedemenli	76b63a307b	Propagate create/drop schema commands	2022-02-10 14:58:09 +03:00
Teja Mupparti	1e3c8e34c0	Allow create_distributed_function() on a function owned by an extension Implement #5649 Allow create_distributed_function() on functions owned by extensions 1) Only update pg_dist_object, and do not propagate CREATE FUNCTION. 2) Ensure corresponding extension is in pg_dist_object. 3) Verify if dependencies exist on the function they should resolve to the extension. 4) Impact on node-scaling: We build a list of ddl commands based on all objects in pg_dist_object. We need to omit the ddl's for the extension-function, as it will get propagated by the virtue of the extension creation. 5) Extra checks for functions coming from extensions, to not propagate changes via ddl commands, even though the function is marked as distributed in pg_dist_object	2022-02-08 11:52:56 -08:00
Burak Velioglu	f88cc230bf	Handle tables and objects as metadata. Update UDFs accordingly With this commit we've started to propagate sequences and shell tables within the object dependency resolution. So, ensuring any dependencies for any object will consider shell tables and sequences as well. Separate logics for both shell tables and sequences have been removed. Since both shell tables and sequences logic were implemented as a part of the metadata handling before that logic, we were propagating them while syncing table metadata. With this commit we've divided metadata (which means anything except shards thereafter) syncing logic into multiple parts and implemented it either as a part of ActivateNode. You can check the functions called in ActivateNode to check definition of different metadata. Definitions of start_metadata_sync_to_node and citus_activate_node have also been updated. citus_activate_node will basically create an active node with all metadata and reference table shards. start_metadata_sync_to_node will be same with citus_activate_node except replicating reference tables. stop_metadata_sync_to_node will remove all the metadata. All of those UDFs need to be called by superuser.	2022-01-31 16:20:15 +03:00
Teja Mupparti	54862f8c22	(1) Functions will be delegated even when present in the scope of an explicit BEGIN/COMMIT transaction block or in a UDF calling another UDF. (2) Prohibit/Limit the delegated function not to do a 2PC (or any work on a remote connection). (3) Have a safety net to ensure the (2) i.e. we should block the connections from the delegated procedure or make sure that no 2PC happens on the node. (4) Such delegated functions are restricted to use only the distributed argument value. Note: To limit the scope of the project we are considering only Functions(not procedures) for the initial work. DESCRIPTION: Introduce a new flag "force_delegation" in create_distributed_function(), which will allow a function to be delegated in an explicit transaction block. Fixes #3265 Once the function is delegated to the worker, on that node during the planning distributed_planner() TryToDelegateFunctionCall() CheckDelegatedFunctionExecution() EnableInForceDelegatedFuncExecution() Save the distribution argument (Constant) ExecutorStart() CitusBeginScan() IsShardKeyValueAllowed() Ensure to not use non-distribution argument. ExecutorRun() AdaptiveExecutor() StartDistributedExecution() EnsureNoRemoteExecutionFromWorkers() Ensure all the shards are local to the node in the remoteTaskList. NonPushableInsertSelectExecScan() InitializeCopyShardState() EnsureNoRemoteExecutionFromWorkers() Ensure all the shards are local to the node in the placementList. This also fixes a minor issue: Properly handle expressions+parameters in distribution arguments	2022-01-19 16:43:33 -08:00
Onder Kalaci	22b5175fd1	Make sure that the community and enterprise tests produce the same output	2022-01-04 13:30:31 +01:00
Önder Kalacı	0a8b0b06c6	Do not allow distributed functions on non-metadata synced nodes (#5586 ) Before this commit, Citus was triggering metadata syncing in the background when a function is distributed. However, with Citus 11, we expect all clusters to have metadata synced enabled. So, we do not expect any nodes not to have the metadata. This change: (a) pro: simplifies the code and opens up possibilities to simplify futher by reducing the scope of bg worker to only sync node metadata (b) pro: explicitly asks users to sync the metadata such that any unforseen impact can be easily detected (c) con: For distributed functions without distribution argument, we do not necessarily require the metadata sycned. However, for completeness and simplicity, we do so.	2022-01-04 13:12:57 +01:00
Ahmet Gedemenli	042d45b263	Propagate foreign server ops	2021-12-23 17:54:04 +03:00
Ahmet Gedemenli	8e4ff34a2e	Do not include return table params in the function arg list (cherry picked from commit `90928cfd74`) Fix function signature generation Fix comment typo Add test for worker_create_or_replace_object Add test for recreating distributed functions with OUT/TABLE params Add test for recreating distributed function that returns setof int Fix test output Fix comment	2021-12-21 19:01:42 +03:00
Burak Velioglu	ed8e32de5e	Sync pg_dist_object on an update and propagate while syncing to a new node Before that PR we were updating citus.pg_dist_object metadata, which keeps the metadata related to objects on Citus, only on the coordinator node. In order to allow using those object from worker nodes (or erroring out with proper error message) we've started to propagate that metedata to worker nodes as well.	2021-12-06 19:25:50 +03:00
Sait Talha Nisanci	20c32a7a1d	Add alternative output for multi_deparse_function Postgres tightened up its checks for invalid GUC names hence we started to get an alternative output for one of our tests. We add an alternative output since the file is relatively small. Commit on PG: 3db826bd55cd1df0dd8c3d811f8e5b936d7ba1e4	2021-09-03 15:41:28 +03:00
Hanefi Onaldi	878513f325	Remove all occurences of replication_model GUC	2021-05-21 16:14:59 +03:00
Hanefi Önaldı	024d398cd7	Allow distribution of functions that read from reference tables create_distributed_function(function_name, distribution_arg_name, colocate_with text) This UDF did not allow colocate_with parameters when there were no disttribution_arg_name supplied. This commit changes the behaviour to allow missing distribution_arg_name parameters when the function should be colocated with a reference table.	2020-09-01 07:28:34 +03:00
SaitTalhaNisanci	cf98b9d6d5	not wait forever for metadata sync in tests (#3760 ) We shouldn't wait forever for metada sync in tests, otherwise when a test gets stuck, we don't know which line causes the problem.	2020-05-14 10:51:24 +03:00
Marco Slot	8b83306a27	Issue worker messages with the same log level	2020-04-14 21:08:25 +02:00
Philip Dubé	ccabf19090	Propagate DROP ROUTINE, ALTER ROUTINE In two places I've made code more straight forward by using ROUTINE in our own codegen Two changes which may seem extraneous: AppendFunctionName was updated to not use pg_get_function_identity_arguments. This is because that function includes ORDER BY when printing an aggregate like my_rank. While ALTER AGGREGATE my_rank(x "any" ORDER BY y "any") is accepted by postgres, ALTER ROUTINE my_rank(x "any" ORDER BY y "any") is not. Tests were updated to use macaddr over integer. Using integer is flaky, our logic could sometimes end up on tables like users_table. I originally wanted to use money, but money isn't hashable.	2020-01-13 15:37:46 +00:00
Jelte Fennema	7730bd449c	Normalize tests: Remove trailing whitespace	2020-01-06 09:32:03 +01:00
Jelte Fennema	6353c9907f	Normalize tests: Line info varies between versions	2020-01-06 09:32:03 +01:00
Jelte Fennema	7f3de68b0d	Normalize tests: header separator length	2020-01-06 09:32:03 +01:00
Marco Slot	5f656e22db	Fix issue in IsMultiStatementTransaction detection	2019-12-16 17:01:43 +01:00
Philip Dubé	5a17fd6d9d	Test more reference/local cases, also ALTER ROLE Test ALTER ROLE doesn't deadlock when coordinator added, or propagate from mx workers Consolidate wait_until_metadata_sync & verify_metadata to multi_test_helpers	2019-12-03 22:23:14 +00:00
Hadi Moshayedi	2268a9cae6	Error for metadata commands if any metadata node is out-of-sync (#3226 ) * Error for metadata commands if any metadata node is out-of-sync * Make the functions have separate APIs for all workers/metadata workers	2019-11-27 09:52:57 +01:00
Marco Slot	4b0ac4b0dd	Properly escape ALTER FUNCTION .. SET deparsing. Also test	2019-11-25 23:01:30 +00:00
Philip Dubé	3c10c27b13	GetFunctionAlterOwnerCommand: use format_procedure_qualified distributed_functions: test a function with a quote in name AppendDefElemSet: quote variable names	2019-11-25 23:01:30 +00:00
Philip Dubé	495c0f5117	Phase 1 implementation of custom aggregates Phase 1 seeks to implement minimal infrastructure, so does not include: - dynamic generation of support aggregates to handle multiple arguments - configuration methods to direct aggregation strategy, or mark an aggregate's serialize/deserialize as safe to operate across nodes Aggregates can be distributed when: - they have a single argument - they have a combinefunc - their transition type is not a pseudotype	2019-11-14 19:01:24 +00:00
Philip Dubé	2fc45e5897	create_distributed_function: accept aggregates Adds support for OCLASS_PROC to worker_create_or_replace_object	2019-11-06 18:23:37 +00:00
Marco Slot	03cae27782	Add tests for distributing functions with replication_model statement	2019-10-26 23:57:59 +02:00
Hadi Moshayedi	217db2a03e	Don't block for locks in SyncMetadataToNodes()	2019-10-03 16:53:36 -07:00
Nils Dijk	01b26cf91a	Disallow distributed functions for functions depending on an extension (#3049 ) DESCRIPTION: Disallow distributed functions for functions depending on an extension Functions depending on an extension cannot (yet) be distributed by citus. If we would allow this it would cause issues with our dependency following mechanism as we stop following objects depending on an extension. By not allowing functions to be distributed when they depend on an extension as well as not allowing to make distributed functions depend on an extension we won't break the ability to add new nodes. Allowing functions depending on extensions to be distributed at the moment could cause problems in that area.	2019-09-30 15:19:47 +02:00
Nils Dijk	473cbc0115	Propagate CREATE OR REPLACE FUNCTION to workers for distributed functions (#3043 ) DESCRIPTION: Propagate CREATE OR REPLACE FUNCTION Distributed functions could be replaced, which should be propagated to the workers to keep the function in sync between all nodes. Due to the complexity of deparsing the `CreateFunctionStmt` we actually produce the plan during the processing phase of our utilityhook. Since the changes have already been made in the catalog tables we can reuse `pg_get_functiondef` to get us the generated `CREATE OR REPLACE` sql.	2019-09-30 12:41:17 +02:00
Nils Dijk	9c2c50d875	Hookup function/procedure deparsing to our utility hook (#3041 ) DESCRIPTION: Propagate ALTER FUNCTION statements for distributed functions Using the implemented deparser for function statements to propagate changes to both functions and procedures that are previously distributed.	2019-09-27 22:06:49 +02:00
Onder Kalaci	18de78f386	Relax the colocation checks for distributed functions As long as the types can be coerced, it is safe to pushdown functions.	2019-09-24 16:31:08 +02:00
Onder Kalaci	d37745bfc7	Sync metadata to worker nodes after create_distributed_function Since the distributed functions are useful when the workers have metadata, we automatically sync it. Also, after master_add_node(). We do it lazily and let the deamon sync it. That's mainly because the metadata syncing cannot be done in transaction blocks, and we don't want to add lots of transactional limitations to master_add_node() and create_distributed_function().	2019-09-23 18:30:53 +02:00
Onder Kalaci	d7e2968120	Add parameters to create_distributed_function() With this commit, we're changing the API for create_distributed_function() such that users can provide the distribution argument and the colocation information.	2019-09-22 21:53:33 +02:00
Philip Dubé	ac14f1dd49	pg12 doesn't support client_min_messages as 'fatal'	2019-09-17 20:37:06 +00:00
Hanefi Onaldi	8f2a3a0604	Introduce create_distributed_function(regproc) UDF (#2961 ) This PR aims to add the minimal set of changes required to start distributing functions. You can use create_distributed_function(regproc) UDF to distribute a function. SELECT create_distributed_function('add(int,int)'); The function definition should include the param types to properly identify the correct function that we wish to distribute	2019-09-13 23:27:46 +03:00

39 Commits (7d6410c838ff3ebb04ca24d0557efe4913be2f85)