citus

Commit Graph

Author	SHA1	Message	Date
naisila	3c9eea031e	Normalizes Memory Usage, Buckets, Batches for PG15 explain diffs We create a new function in multi_test_helpers, which is similar to explain_merge function in PG15. This explain helper function normalies Memory Usage, Buckets and Batches, and we use it in the tests which give a different output for PG15.	2022-08-23 15:55:49 +03:00
naisila	2aa07e37a4	Handles EXPLAIN output diffs in PG15, Hash Agg/Join leverage To handle differences in usage of GroupAggregate vs HashAggregate or Merge Join vs Hash join in cases where this detail doesn't seem to matter, we use coordinator_plan(). - coordinator_plan() is updated to remove "Result" lines There are some cases where we have subplans so we add a new function that prints all Task Count lines as well - coordinator_plan_with_subplans() Still not sure of the relevant PG commit Could be db0d67db2401eb6238ccc04c6407a4fd4f985832 but disabling enable_group_by_reordering didn't help.	2022-08-23 15:55:49 +03:00
naisila	df5f628175	Handles EXPLAIN output diffs in PG15 - Extra result lines To handle extra "Result" lines in explain outputs, we add explain method to multi_test_helpers.sql file - plan_without_result_lines() is added for cases where we want the whole explain output with only "Result" lines removed	2022-08-23 15:55:49 +03:00
Nils Dijk	3801576dfb	Move pg_dist_object to pg_catalog (#5765 ) DESCRIPTION: Move pg_dist_object to pg_catalog Historically `pg_dist_object` had been created in the `citus` schema as an experiment to understand if we could move our catalog tables to a branded schema. We quickly realised that this interfered with the UX on our managed services and other environments, where users connected via a user with the name of `citus`. By default postgres put the username on the search_path. To be able to read the catalog in the `citus` schema we would need to grant access permissions to the schema. This caused newly created objects like tables etc, to default to this schema for creation. This failed due to the write permissions to that schema. With this change we move the `pg_dist_object` catalog table to the `pg_catalog` schema, where our other schema's are also located. This makes the catalog table visible and readable by any user, like our other catalog tables, for debugging purposes. Note: due to the change of schema, we had to disable 1 test that was running into a discrepancy between the schema and binary. Secondly, we needed to make the lookup functions for the `pg_dist_object` relation and their indexes less strict on the fallback of the naming due to an other test that, due to an unfortunate cache invalidation, needed to lookup the relation again. This makes that we won't default to _only_ resolving from `pg_catalog` outside of upgrades.	2022-03-04 17:40:38 +00:00
Sait Talha Nisanci	e7607b6bed	Add a helper function to check explain has a single task In order to avoid adding an alternative output, a function to check if a given explan plan has a single task added. This doesn't change what the changed tests intend to do.	2021-09-03 15:41:28 +03:00
Philip Dubé	4e22f02997	Fix various typos due to zealous repetition	2021-03-04 19:28:15 +00:00
Onder Kalaci	d1cd198655	Prevent infinite recursion for queries that involve UNION ALL and JOIN With this commit, we make sure to prevent infinite recursion for queries in the format: [subquery with a UNION ALL] JOIN [table or subquery] Also, fixes a bug where we pushdown UNION ALL below a JOIN even if the UNION ALL is not safe to pushdown.	2021-03-03 12:27:26 +01:00
Onur Tirtir	cc8be422ce	Fix relkind checks in planner for relkinds other than RELKIND_RELATION (#4294 ) We were qualifying relations with relkind != RELKIND_RELATION as non-relations due to the strict checks around RangeTblEntry->relkind in planner.	2020-11-05 14:21:02 +03:00
Hadi Moshayedi	ef778c1cd7	address feedback from Sait Talha & Hadi	2020-06-12 18:36:02 -07:00
Halil Ozan Akgul	a701fc774a	Adds multi_schedule_hyperscale schedule	2020-04-10 15:54:47 +03:00
Hadi Moshayedi	bb65669186	Failure tests for PartitionTasklistResults	2020-01-09 10:55:58 -08:00
Hadi Moshayedi	f38d0e5b3f	Partitioned task list results.	2020-01-09 10:32:58 -08:00
Jelte Fennema	7730bd449c	Normalize tests: Remove trailing whitespace	2020-01-06 09:32:03 +01:00
Jelte Fennema	7f3de68b0d	Normalize tests: header separator length	2020-01-06 09:32:03 +01:00
Hadi Moshayedi	e7a6cc0801	Fix some typos from #3280	2019-12-12 13:29:26 -08:00
Hadi Moshayedi	067d92a7f6	Don't plan joins between ref tables and views locally	2019-12-11 14:31:34 -08:00
Philip Dubé	5a17fd6d9d	Test more reference/local cases, also ALTER ROLE Test ALTER ROLE doesn't deadlock when coordinator added, or propagate from mx workers Consolidate wait_until_metadata_sync & verify_metadata to multi_test_helpers	2019-12-03 22:23:14 +00:00
Nils Dijk	9c2c50d875	Hookup function/procedure deparsing to our utility hook (#3041 ) DESCRIPTION: Propagate ALTER FUNCTION statements for distributed functions Using the implemented deparser for function statements to propagate changes to both functions and procedures that are previously distributed.	2019-09-27 22:06:49 +02:00
Marco Slot	2868e02a3d	Implement SELECT function call delegation. When a function is marked as colocated with a distributed table, we try delegating queries of kind "SELECT func(...)" to workers. We currently only support this simple form, and don't delegate forms like "SELECT f1(...), f2(...)", "SELECT f1(...) FROM ...", or function calls inside transactions. As a side effect, we also fix the transactional semantics of DO blocks. Previously we didn't consider a DO block a multi-statement transaction. Now we do. Co-authored-by: Marco Slot <marco@citusdata.com> Co-authored-by: serprex <serprex@users.noreply.github.com> Co-authored-by: pykello <hadi.moshayedi@microsoft.com>	2019-09-27 09:13:25 -07:00
Nils Dijk	2879689441	Distribute Types to worker nodes (#2893 ) DESCRIPTION: Distribute Types to worker nodes When to propagate ============== There are two logical moments that types could be distributed to the worker nodes - When they get used ( just in time distribution ) - When they get created ( proactive distribution ) The just in time distribution follows the model used by how schema's get created right before we are going to create a table in that schema, for types this would be when the table uses a type as its column. The proactive distribution is suitable for situations where it is benificial to have the type on the worker nodes directly. They can later on be used in queries where an intermediate result gets created with a cast to this type. Just in time creation is always the last resort, you cannot create a distributed table before the type gets created. A good example use case is; you have an existing postgres server that needs to scale out. By adding the citus extension, add some nodes to the cluster, and distribute the table. The type got created before citus existed. There was no moment where citus could have propagated the creation of a type. Proactive is almost always a good option. Types are not resource intensive objects, there is no performance overhead of having 100's of types. If you want to use them in a query to represent an intermediate result (which happens in our test suite) they just work. There is however a moment when proactive type distribution is not beneficial; in transactions where the type is used in a distributed table. Lets assume the following transaction: ```sql BEGIN; CREATE TYPE tt1 AS (a int, b int); CREATE TABLE t1 AS (a int PRIMARY KEY, b tt1); SELECT create_distributed_table('t1', 'a'); \copy t1 FROM bigdata.csv ``` Types are node scoped objects; meaning the type exists once per worker. Shards however have best performance when they are created over their own connection. For the type to be visible on all connections it needs to be created and committed before we try to create the shards. Here the just in time situation is most beneficial and follows how we create schema's on the workers. Outside of a transaction block we will just use 1 connection to propagate the creation. How propagation works ================= Just in time ----------- Just in time propagation hooks into the infrastructure introduced in #2882. It adds types as a supported object in `SupportedDependencyByCitus`. This will make sure that any object being distributed by citus that depends on types will now cascade into types. When types are depending them self on other objects they will get created first. Creation later works by getting the ddl commands to create the object by its `ObjectAddress` in `GetDependencyCreateDDLCommands` which will dispatch types to `CreateTypeDDLCommandsIdempotent`. For the correct walking of the graph we follow array types, when later asked for the ddl commands for array types we return `NIL` (empty list) which makes that the object will not be recorded as distributed, (its an internal type, dependant on the user type). Proactive distribution --------------------- When the user creates a type (composite or enum) we will have a hook running in `multi_ProcessUtility` after the command has been applied locally. Running after running locally makes that we already have an `ObjectAddress` for the type. This is required to mark the type as being distributed. Keeping the type up to date ==================== For types that are recorded in `pg_dist_object` (eg. `IsObjectDistributed` returns true for the `ObjectAddress`) we will intercept the utility commands that alter the type. - `AlterTableStmt` with `relkind` set to `OBJECT_TYPE` encapsulate changes to the fields of a composite type. - `DropStmt` with removeType set to `OBJECT_TYPE` encapsulate `DROP TYPE`. - `AlterEnumStmt` encapsulates changes to enum values. Enum types can not be changed transactionally. When the execution on a worker fails a warning will be shown to the user the propagation was incomplete due to worker communication failure. An idempotent command is shown for the user to re-execute when the worker communication is fixed. Keeping types up to date is done via the executor. Before the statement is executed locally we create a plan on how to apply it on the workers. This plan is executed after we have applied the statement locally. All changes to types need to be done in the same transaction for types that have already been distributed and will fail with an error if parallel queries have already been executed in the same transaction. Much like foreign keys to reference tables.	2019-09-13 17:46:07 +02:00
Hadi Moshayedi	3d0a521295	Show just coordinator plan in some test outputs.	2019-06-24 12:24:30 +02:00
Hanefi Onaldi	4edb193f25	make the tests parallelizeable helper view table_fkeys_in_workers now allows filtering by schema so that a test case can print out foreign keys in its schema only	2018-11-26 14:04:51 +03:00
Nils Dijk	97da44558b	Description: Fix failures of tests on recent postgres builds In recent postgres builds you cannot set client_min_messages to values higher then ERROR, if will silently set it to ERROR if so. During some tests we would set it to fatal to hide random values (eg. pid's of processes) from the test output. This patch will use different tactics for hiding these values.	2018-11-13 16:53:05 +01:00
Nils Dijk	6cf4516fdb	fix \d change for indexes in pg11	2018-08-15 23:27:31 -06:00
Jason Petersen	2204da19f0	Support PostgreSQL 10 (#1379 ) Adds support for PostgreSQL 10 by copying in the requisite ruleutils and updating all API usages to conform with changes in PostgreSQL 10. Most changes are fairly minor but they are numerous. One particular obstacle was the change in \d behavior in PostgreSQL 10's psql; I had to add SQL implementations (views, mostly) to mimic the pre-10 output.	2017-06-26 02:35:46 -06:00

25 Commits (3c9eea031ebf8dee0b3cf098f60c9a07eefa4c54)