citus

Commit Graph

Author	SHA1	Message	Date
Naisila Puka	a18040869a	Error out for queries with outer joins and pseudoconstant quals in PG<17 (#7937 ) PG15 commit d1ef5631e620f9a5b6480a32bb70124c857af4f1 and PG16 commit 695f5deb7902865901eb2d50a70523af655c3a00 disallow replacing joins with scans in queries with pseudoconstant quals. This commit prevents the set_join_pathlist_hook from being called if any of the join restrictions is a pseudo-constant. So in these cases, citus has no info on the join, never sees that the query has an outer join, and ends up producing an incorrect plan. PG17 fixes this by commit 9e9931d2bf40e2fea447d779c2e133c2c1256ef3 Therefore, we take this extra measure here for PG versions less than 17. hasOuterJoin can never be true when set_join_pathlist_hook is absent.	2025-05-11 21:47:28 +00:00
Naisila Puka	3e96a19606	Adds JSON_TABLE() support, and SQL/JSON constructor/query functions tests (#7816 ) DESCRIPTION: Adds JSON_TABLE() support PG17 has added basic `JSON_TABLE()` functionality `JSON_TABLE()` allows `JSON` data to be converted into a relational view and thus used, for example, in a `FROM` clause, like other tabular data. We treat `JSON_TABLE` the same as correlated functions (e.g., recurring tuples). In the end, for multi-shard `JSON_TABLE` commands, we apply the same restrictions as reference tables (e.g., cannot perform a lateral outer join when a distributed subquery references a (reference table)/(json table) etc.) Relevant PG17 commits: [basic JSON table](https://github.com/postgres/postgres/commit/de3600452), [nested paths in json table](https://github.com/postgres/postgres/commit/bb766cde6) Onder had previously added json table support for PG15BETA1, but we reverted that commit because json table was reverted in PG15. `ce7f1a530f` Previous relevant PG15Beta1 commit: https://github.com/postgres/postgres/commit/4e34747c8 Therefore, I referred to Onder's commit for this commit as well, with a few changes due to some differences between PG15/PG17: 1) In PG15Beta1, we had also `PLAN` clauses for `JSON_TABLE` https://github.com/postgres/postgres/commit/fadb48b00, and Onder's commit includes tests for those as well. However, `PLAN` nodes are _not_ added in PG17. Therefore, I didn't include the `json_table_select_only` test, which had mostly queries involving `PLAN`. I only included the last query from that test. 2) In PG15 timeline (Citus 11.1), we didn't support outer joins where the outer rel is a recurring one and the inner one is a non-recurring one. However, [Onur added support for that one in Citus 11.2](https://github.com/citusdata/citus/pull/6512), therefore I updated the tests from Onder's commit accordingly. 3) PG17 json table has nested paths and columns, therefore I added a test with a distributed table, which is exactly the same as the one in sqljson_jsontable in PG17. https://github.com/postgres/postgres/commit/bb766cde6 This pull request also adds some basic tests on validation of SQL/JSON constructor functions JSON(), JSON_SCALAR(), and JSON_SERIALIZE(), and also SQL/JSON query functions JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE(). The relevant PG commits are the following: [JSON(), JSON_SCALAR(), JSON_SERIALIZE()](https://github.com/postgres/postgres/commit/03734a7fe) [JSON_EXISTS(), JSON_VALUE(), JSON_QUERY()](https://github.com/postgres/postgres/commit/6185c9737)	2025-03-12 12:26:05 +03:00
Nils Dijk	0620c8f9a6	Sort includes (#7326 ) This change adds a script to programatically group all includes in a specific order. The script was used as a one time invocation to group and sort all includes throught our formatted code. The grouping is as follows: - System includes (eg. `#include<...>`) - Postgres.h (eg. `#include "postgres.h"`) - Toplevel imports from postgres, not contained in a directory (eg. `#include "miscadmin.h"`) - General postgres includes (eg . `#include "nodes/..."`) - Toplevel citus includes, not contained in a directory (eg. `#include "citus_verion.h"`) - Columnar includes (eg. `#include "columnar/..."`) - Distributed includes (eg. `#include "distributed/..."`) Because it is quite hard to understand the difference between toplevel citus includes and toplevel postgres includes it hardcodes the list of toplevel citus includes. In the same manner it assumes anything not prefixed with `columnar/` or `distributed/` as a postgres include. The sorting/grouping is enforced by CI. Since we do so with our own script there are not changes required in our uncrustify configuration.	2023-11-23 18:19:54 +01:00
naisila	47bea76c6c	Revert "Support JSON_TABLE on PG 15 (#6241 )" This reverts commit `1f4fe35512`.	2022-09-12 15:20:17 +03:00
Naisila Puka	1f4fe35512	Support JSON_TABLE on PG 15 (#6241 ) Postgres supports JSON_TABLE feature on PG 15. We treat JSON_TABLE the same as correlated functions (e.g., recurring tuples). In the end, for multi-shard JSON_TABLE commands, we apply the same restrictions as reference tables (e.g., cannot be in the outer part of an outer join etc.) Co-authored-by: Onder Kalaci <onderkalaci@gmail.com>	2022-08-24 19:11:18 +03:00
Onder Kalaci	918838e488	Allow constant VALUES clauses in pushdown queries As long as the VALUES clause contains constant values, we should not recursively plan the queries/CTEs. This is a follow-up work of #1805. So, we can easily apply OUTER join checks as if VALUES clause is a reference table/immutable function.	2021-04-21 14:28:08 +02:00
Marco Slot	f2538a456f	Support co-located/recurring sublinks in the target list	2020-12-13 15:45:24 +01:00
SaitTalhaNisanci	679bf0d2b2	Create CanPushdownSubqery wrapper for better readability (#4108 )	2020-08-12 17:28:20 +03:00
Hanefi Onaldi	d82f3e9406	Introduce intermediate result broadcasting In plain words, each distributed plan pulls the necessary intermediate results to the worker nodes that the plan hits. This is primarily useful in three ways. (i) If the distributed plan that uses intermediate result(s) is a router query, then the intermediate results are only broadcasted to a single node. (ii) If a distributed plan consists of only intermediate results, which is not uncommon, the intermediate results are broadcasted to a single node only. (iii) If a distributed query hits a sub-set of the shards in multiple workers, the intermediate results will be broadcasted to the relevant node(s). The final item (iii) becomes crucial for append/range distributed tables where typically the distributed queries hit a small subset of shards/workers. To do this, for each query that Citus creates a distributed plan, we keep track of the subPlans used in the queryTree, and save it in the distributed plan. Just before Citus executes each subPlan, Citus first keeps track of every worker node that the distributed plan hits, and marks every subPlan should be broadcasted to these nodes. Later, for each subPlan which is a distributed plan, Citus does this operation recursively since these distributed plans may access to different subPlans, and those have to be recorded as well.	2019-11-20 15:26:36 +03:00
SaitTalhaNisanci	b9b7fd7660	add IsLoggableLevel utility function (#3149 ) * add IsLoggableLevel utility function * add function comment for IsLoggableLevel * put ApplyLogRedaction to logutils	2019-11-15 14:59:13 +03:00
Jelte Fennema	adc6ca6100	Make simple in queries on unique columns work with repartion join (#3171 ) This is necassery to support Q20 of the CHbenCHmark: #2582. To summarize the fix: The subquery is converted into an INNER JOIN on a table. This fixes the issue, since an INNER JOIN on a table is already supported by the repartion planner. The way this replacement is happening.: 1. Postgres replaces `col in (subquery)` with a SEMI JOIN (subquery) on col = subquery_result 2. If this subquery is simple enough Postgres will replace it with a regular read from a table 3. If the subquery returns unique results (e.g. a primary key) Postgres will convert the SEMI JOIN into an INNER JOIN during the planning. It will not change this in the rewritten query though. 4. We check if Postgres sends us any SEMI JOINs during its join order planning, if it doesn't we replace all SEMI JOINs in the rewritten query with INNER JOIN (which we already support).	2019-11-11 13:44:28 +01:00
Jelte Fennema	7abedc38b0	Support subqueries in HAVING (#3098 ) Areas for further optimization: - Don't save subquery results to a local file on the coordinator when the subquery is not in the having clause - Push the the HAVING with subquery to the workers if there's a group by on the distribution column - Don't push down the results to the workers when we don't push down the HAVING clause, only the coordinator needs it Fixes #520 Fixes #756 Closes #2047	2019-10-16 16:40:14 +02:00
SaitTalhaNisanci	94a7e6475c	Remove copyright years (#2918 ) * Update year as 2012-2019 * Remove copyright years	2019-10-15 17:44:30 +03:00
Philip Dubé	018ad1c58e	pg12: version_compat.h, tuples, oids, misc	2019-08-22 18:57:23 +00:00
velioglu	d9fa69c031	Refactor query pushdown related logic	2018-05-02 15:03:09 +03:00

15 Commits (06224c36547e3b8cae2d084ef9cf1ffa4f822bf2)