Commit Graph

856 Commits (c8adde14941529bf94a34b3aa04690f641682c7d)

Author SHA1 Message Date
Teja Mupparti 01ea5f58a9 Fix the incorrect value passed to pointer-to-bool parameter, pass a NULL as the value is not used for this invocation. 2023-03-30 10:45:32 -07:00
Teja Mupparti 37500806d6 Add appropriate locks for MERGE to run in parallel 2023-03-28 09:45:40 -07:00
Onur Tirtir 616b5018a0
Add a GUC to disallow planning the queries that reference non-colocated tables via router planner (#6793)
Today we allow planning the queries that reference non-colocated tables
if the shards that query targets are placed on the same node. However,
this may not be the case, e.g., after rebalancing shards because it's
not guaranteed to have those shards on the same node anymore.
This commit adds citus.enable_non_colocated_router_query_pushdown GUC
that can be used to disallow  planning such queries via router planner,
when it's set to false. Note that the default value for this GUC will be
"true" for 11.3, but we will alter it to "false" on 12.0 to not
introduce
a breaking change in a minor release.

Closes #692.

Even more, allowing such queries to go through router planner also
causes
generating an incorrect plan for the DML queries that reference
distributed
tables that are sharded based on different replication factor settings.
For
this reason, #6779 can be closed after altering the default value for
this
GUC to "false", hence not now.

DESCRIPTION: Adds `citus.enable_non_colocated_router_query_pushdown` GUC
to ensure generating a consistent distributed plan for the queries that
reference non-colocated distributed tables (when set to "false", the
default is "true").
2023-03-28 13:10:29 +03:00
Teja Mupparti 9bab819f26 Disentangle MERGE planning code from the modify-planning code path 2023-03-27 10:41:46 -07:00
Teja Mupparti da7db53c87 Refactor some of the planning code to accomodate a new planning path for MERGE SQL 2023-03-22 11:29:24 -07:00
Onur Tirtir e1f1d63050
Rename AllRelations.. functions to AllDistributedRelations.. (#6789)
Because they're only interested in distributed tables. Even more,
this replaces HasDistributionKey() check with
IsCitusTableType(DISTRIBUTED_TABLE) because this doesn't make a
difference on main and sounds slightly more intuitive. Plus, this
would also allow safely using this function in
https://github.com/citusdata/citus/pull/6773.
2023-03-22 15:15:23 +03:00
Onur Tirtir aa465b6de1
Decide what to do with router planner error at one place (#6781) 2023-03-21 14:04:07 +03:00
Teja Mupparti cf55136281 1) Restrict MERGE command INSERT to the source's distribution column
Fixes #6672

2) Move all MERGE related routines to a new file merge_planner.c

3) Make ConjunctionContainsColumnFilter() static again, and rearrange the code in MergeQuerySupported()
4) Restore the original format in the comments section.
5) Add big serial test. Implement latest set of comments
2023-03-16 13:43:08 -07:00
Teja Mupparti 1e42cd3da0 Support MERGE on distributed tables with restrictions
This implements the phase - II of MERGE sql support

Support routable query where all the tables in the merge-sql are distributed, co-located, and both the source and
target relations are joined on the distribution column with a constant qual. This should be a Citus single-task
query. Below is an example.

SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => ‘t1’);

MERGE INTO t1
USING s1 ON t1.id = s1.id AND t1.id = 100
WHEN MATCHED THEN
UPDATE SET val = s1.val + 10
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED THEN
INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src)

Basically, MERGE checks to see if

There are a minimum of two distributed tables (source and a target).
All the distributed tables are indeed colocated.
MERGE relations are joined on the distribution column
MERGE .. USING .. ON target.dist_key = source.dist_key
The query should touch only a single shard i.e. JOIN AND with a constant qual
MERGE .. USING .. ON target.dist_key = source.dist_key AND target.dist_key = <>
If any of the conditions are not met, it raises an exception.

(cherry picked from commit 44c387b978)

This implements MERGE phase3

Support pushdown query where all the tables in the merge-sql are Citus-distributed, co-located, and both
the source and target relations are joined on the distribution column. This will generate multiple tasks
which execute independently after pushdown.

SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => ‘t1’);

MERGE INTO t1
USING s1
ON t1.id = s1.id
        WHEN MATCHED THEN
                UPDATE SET val = s1.val + 10
        WHEN MATCHED THEN
                DELETE
        WHEN NOT MATCHED THEN
                INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src)

*The only exception for both the phases II and III is, UPDATEs and INSERTs must be done on the same shard-group
as the joined key; for example, below scenarios are NOT supported as the key-value to be inserted/updated is not
guaranteed to be on the same node as the id distribution-column.

MERGE INTO target t
USING source s ON (t.customer_id = s.customer_id)
WHEN NOT MATCHED THEN - -
     INSERT(customer_id, …) VALUES (<non-local-constant-key-value>, ……);

OR this scenario where we update the distribution column itself

MERGE INTO target t
USING source s On (t.customer_id = s.customer_id)
WHEN MATCHED THEN
     UPDATE SET customer_id = 100;

(cherry picked from commit fa7b8949a8)
2023-03-16 13:43:08 -07:00
Onur Tirtir 20a5f3af2b
Replace CITUS_TABLE_WITH_NO_DIST_KEY checks with HasDistributionKey() (#6743)
Now that we will soon add another table type having DISTRIBUTE_BY_NONE
as distribution method and that we want the code to interpret such
tables mostly as distributed tables, let's make the definition of those
other two table types more strict by removing
CITUS_TABLE_WITH_NO_DIST_KEY
macro.

And instead, use HasDistributionKey() check in the places where the
logic applies to all table types that have / don't have a distribution
key. In future PRs, we might want to convert some of those
HasDistributionKey() checks if logic only applies to Citus local /
reference tables, not the others.

And adding HasDistributionKey() also allows us to consider having
DISTRIBUTE_BY_NONE as the distribution method as a "table attribute"
that can apply to distributed tables too, rather something that
determines the table type.
2023-03-10 13:55:52 +03:00
Onur Tirtir e3cf7ace7c
Stabilize single_node.sql and others that report illegal node removal (#6751)
See
https://app.circleci.com/pipelines/github/citusdata/citus/30859/workflows/223d61db-8c1d-4909-9aea-d8e470f0368b/jobs/1009243.
2023-03-08 15:25:36 +03:00
Teja Mupparti d7b499929c Rearrange the common code into a newfunction to facilitate the multiple checks of the same conditions in a multi-modify MERGE statement 2023-02-24 12:55:11 -08:00
Emel Şimşek 756c1d3f5d
Remove auto_explain workaround in citus explain hook for ALTER TABLE (#6714)
When auto_explain module is loaded and configured, EXPLAIN will be
implicitly run for all the supported commands. Postgres does not support
`EXPLAIN` for `ALTER` command. However, auto_explain will try to
`EXPLAIN` other supported commands internally triggered by `ALTER`.

 For instance, 
`ALTER TABLE target_table ADD CONSTRAINT fkey_167 FOREIGN KEY (col_1)
REFERENCES ref_table(key) ... `

command may trigger a SELECT command in the following form for foreign
key validation purpose:
`SELECT fk.col_1 FROM ONLY target_table fk LEFT OUTER JOIN ONLY
ref_table pk ON ( pk.key OPERATOR(pg_catalog.=) fk.col_1) WHERE pk.key
IS NULL AND (fk.col_1 IS NOT NULL) `

For Citus tables, the Citus utility hook should ensure that constraint
validation is skipped for shell tables but they are done for shard
tables. The reason behind this design choice can be summed up as:

- An ALTER TABLE command via coordinator node is run in a distributed
transaction.
- Citus does not support nested distributed transactions. 
- A SELECT query on a distributed table (aka shell table) is also run in
a distributed transaction.
- Therefore, Citus does not support running a SELECT query on a shell
table while an ALTER TABLE command is running.

With
eadc88a800
a bug is introduced breaking the skip constraint validation behaviour of
Citus. With this change, we see that validation queries on distributed
tables are triggered within `ALTER` command adding constraints with
validation check. This regression did not cause an issue for regular use
cases since the citus executor hook blocks those queries heuristically
when there is an ALTER TABLE command in progress.

The issue is surfaced as a crash (#6424 Workers, when configured to use
auto_explain, crash during distributed transactions.) when auto_explain
is enabled. This is due to auto_explain trying to execute the SELECT
queries in a nested distributed transaction.

Now since the regression with constraint validation is fixed in
https://github.com/citusdata/citus/issues/6543, we should be able to
remove the workaround.
2023-02-17 17:47:03 +03:00
Teja Mupparti 0824d9c1fb Miscellaneous cleanup 2023-02-09 13:05:59 -08:00
Marco Slot a482b36760
Revert "Support MERGE on distributed tables with restrictions" (#6675)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2023-01-30 15:01:59 +01:00
aykut-bozkurt ab71cd01ee
fix multi level recursive plan (#6650)
Recursive planner should handle all the tree from bottom to top at
single pass. i.e. It should have already recursively planned all
required parts in its first pass. Otherwise, this means we have bug at
recursive planner, which needs to be handled. We add a check here and
return error.

DESCRIPTION: Fixes wrong results by throwing error in case recursive
planner multipass the query.

We found 3 different cases which causes recursive planner passes the
query multiple times.
1. Sublink in WHERE clause is planned at second pass after we
recursively planned a distributed table at the first pass. Fixed by PR
#6657.
2. Local-distributed joins are recursively planned at both the first and
the second pass. Issue #6659.
3. Some parts of the query is considered to be noncolocated at the
second pass as we do not generate attribute equivalances between
nondistributed and distributed tables. Issue #6653
2023-01-27 21:25:04 +03:00
aykut-bozkurt 8870f0f90b
fix order of recursive sublink planning (#6657)
We should do the sublink conversations at the end of the recursive
planning because earlier steps might have transformed the query into a
shape that needs recursively planning the sublinks.

DESCRIPTION: Fixes early sublink check at recursive planner.

Related to PR https://github.com/citusdata/citus/pull/6650
2023-01-27 14:35:16 +03:00
Hanefi Onaldi 94b63f35a5
Prevent crashes on update with returning clauses (#6643)
If an update query on a reference table has a returns clause with a
subquery that accesses some other local table, we end-up with an crash.

This commit prevents the crash, but does not prevent other error
messages from happening due to Citus not being able to pushdown the
results of that subquery in a valid SQL command.

Related: #6634
2023-01-24 20:07:43 +03:00
Teja Mupparti 44c387b978 Support MERGE on distributed tables with restrictions
This implements the phase - II of MERGE sql support

Support routable query where all the tables in the merge-sql are distributed, co-located, and both the source and
target relations are joined on the distribution column with a constant qual. This should be a Citus single-task
query. Below is an example.

SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => ‘t1’);

MERGE INTO t1
USING s1 ON t1.id = s1.id AND t1.id = 100
WHEN MATCHED THEN
UPDATE SET val = s1.val + 10
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED THEN
INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src)

Basically, MERGE checks to see if

There are a minimum of two distributed tables (source and a target).
All the distributed tables are indeed colocated.
MERGE relations are joined on the distribution column
MERGE .. USING .. ON target.dist_key = source.dist_key
The query should touch only a single shard i.e. JOIN AND with a constant qual
MERGE .. USING .. ON target.dist_key = source.dist_key AND target.dist_key = <>
If any of the conditions are not met, it raises an exception.
2023-01-18 11:05:27 -08:00
Önder Kalacı eb75decbeb
Undo planner extended statistics override (#6492) 2023-01-04 13:25:57 +01:00
Teja Mupparti 9a9989fc15 Support MERGE Phase – I
All the tables (target, source or any CTE present) in the SQL statement are local i.e. a merge-sql with a combination of Citus local and
Non-Citus tables (regular Postgres tables) should work and give the same result as Postgres MERGE on regular tables. Catch and throw an
exception (not-yet-supported) for all other scenarios during Citus-planning phase.
2022-12-18 20:32:15 -08:00
aykut-bozkurt 9c0073ba57
remove unused boundary type (#6563)
Removes unused job boundary tag `SUBQUERY_MAP_MERGE_JOB`.

Only usage is at `BuildMapMergeJob`, which is only called when the
boundary = `JOIN_MAP_MERGE_JOB`. Hence, it should be safe to remove.
2022-12-16 18:19:22 +03:00
Onur Tirtir 2803470b58 Add lateral join checks for outer joins and drop the useless ones for semi joins 2022-12-07 18:27:50 +03:00
Onur Tirtir e7e4881289 Phase - III: recursively plan non-recurring sub join trees too 2022-12-07 18:27:50 +03:00
Onur Tirtir f52381387e Phase - II: recursively plan non-recurring subqueries too 2022-12-07 18:27:50 +03:00
Onur Tirtir f339450a9d Phase - I: recursively plan non-recurring relations 2022-12-07 18:27:50 +03:00
aykut-bozkurt 83ef600f27
fix false full join pushdown error check (#6523)
**Problem**: Currently, we error out if we detect recurring tuples in
one side without checking the other side of the join.

**Solution**: When one side of the full join consists recurring tuples
and the other side consists nonrecurring tuples, we should not pushdown
to prevent duplicate results. Otherwise, safe to pushdown.
2022-11-30 14:17:56 +03:00
Philip Dubé cf69fc3652 Grammar: it's to its
Includes an error message

& one case of its to it's

Also fix "to the to" typos
2022-11-28 20:43:44 +00:00
Emel Şimşek 8e5ba45b74
Fixes a bug that causes crash when using auto_explain extension with ALTER TABLE...ADD FOREIGN KEY... queries. (#6470)
Fixes a bug that causes crash when using auto_explain extension with
ALTER TABLE...ADD FOREIGN KEY... queries.
Those queries trigger a SELECT query on the citus tables as part of the
foreign key constraint validation check. At the explain hook, workers
try to explain this SELECT query as a distributed query causing memory
corruption in the connection data structures. Hence, we will not explain
ALTER TABLE...ADD FOREIGN KEY... and the triggered queries on the
workers.

Fixes #6424.
2022-11-15 17:53:39 +03:00
Marco Slot 666696c01c
Deprecate citus.replicate_reference_tables_on_activate, make it always off (#6474)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-11-04 16:21:10 +01:00
Onur Tirtir dbe2749bbf
Drop unreachable code from query_pushdown_planning.c (#6451)
Given that we cannot continue after a `RaiseDeferredErrorInternal(..,
ERROR)` call.
2022-10-21 18:04:31 +03:00
Emel Şimşek 02fd1e6c03
Fix the crash that happens when using auto_explain extension with recursive queries (#6406)
This crash happens with recursively planned queries. For such queries,
subplans are explained via the ExplainOnePlan function of postgresql.
This function reconstructs the query description from the plan therefore
it expects the ActiveSnaphot for the query be available. This fix makes
sure that the snapshot is in the stack before calling ExplainOnePlan.

Fixes #2920.
2022-10-19 18:04:45 +03:00
Onder Kalaci 766f340ce0 Prevent failures on partitioned distributed tables with statistics objects on PG 15
Comment from the code is clear on this:
/*
 * The statistics objects of the distributed table are not relevant
 * for the distributed planning, so we can override it.
 *
 * Normally, we should not need this. However, the combination of
 * Postgres commit 269b532aef55a579ae02a3e8e8df14101570dfd9 and
 * Citus function AdjustPartitioningForDistributedPlanning()
 * forces us to do this. The commit expects statistics objects
 * of partitions to have "inh" flag set properly. Whereas, the
 * function overrides "inh" flag. To avoid Postgres to throw error,
 * we override statlist such that Postgres does not try to process
 * any statistics objects during the standard_planner() on the
 * coordinator. In the end, we do not need the standard_planner()
 * on the coordinator to generate an optimized plan. We call
 * into standard_planner() for other purposes, such as generating the
 * relationRestrictionContext here.
 *
 * AdjustPartitioningForDistributedPlanning() is a hack that we use
 * to prevent Postgres' standard_planner() to expand all the partitions
 * for the distributed planning when a distributed partitioned table
 * is queried. It is required for both correctness and performance
 * reasons. Although we can eliminate the use of the function for
 * the correctness (e.g., make sure that rest of the planner can handle
 * partitions), it's performance implication is hard to avoid. Certain
 * planning logic of Citus (such as router or query pushdown) relies
 * heavily on the relationRestrictionList. If
 * AdjustPartitioningForDistributedPlanning() is removed, all the
 * partitions show up in the, causing high planning times for
 * such queries.
 */
2022-09-15 14:36:05 +03:00
naisila 47bea76c6c Revert "Support JSON_TABLE on PG 15 (#6241)"
This reverts commit 1f4fe35512.
2022-09-12 15:20:17 +03:00
Emel Şimşek 6f06ff78cc
Throw an error if there is a RangeTblEntry that is not assigned an RTE identity. (#6295)
* Fix issue : 6109 Segfault or (assertion failure) is possible when using a SQL function

* DESCRIPTION: Ensures disallowing the usage of SQL functions referencing to a distributed table and prevents a segfault.
Using a SQL function may result in segmentation fault in some cases.
This change fixes the issue by throwing an error message when a SQL function cannot be handled.

Fixes #6109.

* DESCRIPTION: Ensures disallowing the usage of SQL functions referencing to a distributed table and prevents a segfault.
Using a SQL function may result in segmentation fault in some cases. This change fixes the issue by throwing an error message when a SQL function cannot be handled.

Fixes #6109.

Co-authored-by: Emel Simsek <emel.simsek@microsoft.com>
2022-09-06 15:46:41 +02:00
aykut-bozkurt 69726648ab
verify shards if exists for insert, delete, update (#6280)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-06 15:29:14 +02:00
Naisila Puka 98dcbeb304
Specifies that our CustomScan providers support projections (#6244)
Before, this was the default mode for CustomScan providers.
Now, the default is to assume that they can't project.
This causes performance penalties due to adding unnecessary
Result nodes.

Hence we use the newly added flag, CUSTOMPATH_SUPPORT_PROJECTION
to get it back to how it was.

In PG15 support branch we created explain functions to ignore
the new Result nodes, so we undo that in this commit.

Relevant PG commit:
955b3e0f9269639fb916cee3dea37aee50b82df0
2022-08-31 10:48:01 +03:00
Önder Kalacı 3ed6fea1cf
Prevent Merge command on distributed tables [PG 15] (#6238) 2022-08-25 13:27:08 +03:00
Marco Slot ac07d33a29
Remove unused reduceQuery from physical planning (#6221)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-24 17:24:27 +00:00
Naisila Puka 1f4fe35512
Support JSON_TABLE on PG 15 (#6241)
Postgres supports JSON_TABLE feature on PG 15.

We treat JSON_TABLE the same as correlated functions (e.g., recurring tuples).
In the end, for multi-shard JSON_TABLE commands, we apply the same
restrictions as reference tables (e.g., cannot be in the outer part of
an outer join etc.)

Co-authored-by: Onder Kalaci <onderkalaci@gmail.com>
2022-08-24 19:11:18 +03:00
Marco Slot 639588bee0
Remove unused functions (#6220)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-22 11:53:25 +03:00
Onder Kalaci 149771792b Remove useless version compats
most likely leftover from earlier versions
2022-07-29 10:31:55 +02:00
Marco Slot cff013a057 Fix issues with insert..select casts and column ordering 2022-07-28 13:23:57 +02:00
aykut-bozkurt 67ac3da2b0
added citus_depended_objects udf and HideCitusDependentObjects GUC to hide citus depended objects from pg meta queries (#6055)
use RecurseObjectDependencies api to find if an object is citus depended

make vanilla tests runnable to see if citus_depended function is working correctly
2022-07-25 16:43:34 +03:00
Naisila Puka 7d6410c838
Drop postgres 12 support (#6040)
* Remove if conditions with PG_VERSION_NUM < 13

* Remove server_above_twelve(&eleven) checks from tests

* Fix tests

* Remove pg12 and pg11 alternative test output files

* Remove pg12 specific normalization rules

* Some more if conditions in the code

* Change RemoteCollationIdExpression and some pg12/pg13 comments

* Remove some more normalization rules
2022-07-20 17:49:36 +03:00
Jelte Fennema 184c7c0bce
Make enterprise features open source (#6008)
This PR makes all of the features open source that were previously only
available in Citus Enterprise.

Features that this adds:
1. Non blocking shard moves/shard rebalancer
   (`citus.logical_replication_timeout`)
2. Propagation of CREATE/DROP/ALTER ROLE statements
3. Propagation of GRANT statements
4. Propagation of CLUSTER statements
5. Propagation of ALTER DATABASE ... OWNER TO ...
6. Optimization for COPY when loading JSON to avoid double parsing of
   the JSON object (`citus.skip_jsonb_validation_in_copy`)
7. Support for row level security
8. Support for `pg_dist_authinfo`, which allows storing different
   authentication options for different users, e.g. you can store
   passwords or certificates here.
9. Support for `pg_dist_poolinfo`, which allows using connection poolers
   in between coordinator and workers
10. Tracking distributed query execution times using
   citus_stat_statements (`citus.stat_statements_max`,
   `citus.stat_statements_purge_interval`,
   `citus.stat_statements_track`). This is disabled by default.
11. Blocking tenant_isolation
12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`
2022-06-16 00:23:46 -07:00
Gledis Zeneli beef392f5a
Fix memory error with citus_add_node reported by valgrind test (#5967)
The error comes due to the datum jsonb in pg_dist_metadata_node.metadata being 0 in some scenarios. This is likely due to not copying the data when receiving a datum from a tuple and pg deciding to deallocate that memory when the table that the tuple was from is closed.
Also fix another place in the code that might have been susceptible to this issue.
I tested on both multi-vg and multi-1-vg and the test were successful.
2022-05-28 00:22:00 +03:00
Marco Slot 7abcfac61f Add caching for functions that check the backend type 2022-05-20 19:02:37 +02:00
Burak Velioglu 06a94d167e Use object address instead of relation id on DDLJob to decide on syncing metadata 2022-05-05 17:59:44 +03:00
Jeff Davis 3e1180de78 PG15: handle extra argument to parse_analyze_varparams().
From PG commit 25751f54b8.
2022-05-02 10:12:03 -07:00
Jeff Davis bd455f42e3 PG15: handle change to SeqScan structure.
Account for PG commit 2226b4189b. The one site dependent on it can do
just as well with a Scan instead of a SeqScan.
2022-05-02 10:12:03 -07:00
Jeff Davis 3799f95742 PG15: Value -> String, Integer, Float.
Handle PG commit 639a86e36a.
2022-05-02 10:12:03 -07:00
Marco Slot c0827703ec Fix EXPLAIN ANALYZE JSON format for subplans 2022-04-07 11:38:20 +02:00
Marco Slot 544dce919a Handle user-defined type parameters in EXPLAIN ANALYZE 2022-04-07 11:14:32 +02:00
Marco Slot 8c8c3b665d Add TABLESAMPLE support 2022-04-01 15:51:40 +02:00
Jelte Fennema 3a44fa827a
Add versions of forboth that don't need ListCell (#5856)
We've had custom versions of Postgres its `foreach` macro which with a
hidden ListCell for quite some time now. People like these custom
macros, because they are easier to use and require less boilerplate.
This adds similar custom versions of Postgres its `forboth` macro. Now
you don't need ListCells anymore when looping over two lists at the same
time.
2022-03-23 14:50:36 +03:00
Onder Kalaci af4ba3eb1f Remove citus.enable_cte_inlining GUC
In Postgres 12+, users can adjust whether to inline/not inline CTEs
by [NOT] MATERIALIZED keywords. So, this GUC is already useless.
2022-03-22 17:14:44 +01:00
Jelte Fennema e5d5c7be93
Start erroring out for unsupported lateral subqueries (#5753)
With the introduction of #4385 we inadvertently started allowing and
pushing down certain lateral subqueries that were unsafe to push down.
To be precise the type of LATERAL subqueries that is unsafe to push down
has all of the following properties:
1. The lateral subquery contains some non recurring tuples
2. The lateral subquery references a recurring tuple from
   outside of the subquery (recurringRelids)
3. The lateral subquery requires a merge step (e.g. a LIMIT)
4. The reference to the recurring tuple should be something else than an
   equality check on the distribution column, e.g. equality on a non
   distribution column.


Property number four is considered both hard to detect and probably not
used very often. Thus this PR ignores property number four and causes
query planning to error out if the first three properties hold.

Fixes #5327
2022-03-11 11:59:18 +01:00
Marco Slot 49467e27e6
Ensure worker_save_query_explain_analyze always fully qualifies types (#5776)
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-03-10 07:30:11 -08:00
Gledis Zeneli 2cb02bfb56
Fix node adding itself with citus_add_node leading to deadlock (Fix #5720) (#5758)
If a worker node is being added, a command is sent to get the server_id of the worker from the pg_dist_node_metadata table. If the worker's id is the same as the node executing the code, we will know the node is trying to add itself. If the node tries to add itself without specifying `groupid:=0` the operation will result in an error.
2022-03-10 17:46:33 +03:00
Marco Slot ef1ceb3953 Only use a single placement for map tasks 2022-02-23 19:40:21 +01:00
Marco Slot 3cd9aa655a Stop using citus.binary_worker_copy_format 2022-02-23 19:40:21 +01:00
Marco Slot 5ac0d31e8b Fix re-partition hash range generation 2022-02-23 19:40:21 +01:00
Marco Slot 72d8fde28b Use intermediate results for re-partition joins 2022-02-23 19:40:21 +01:00
Ahmet Gedemenli 2bc6a00408 Refactor CreateDistributedTable to take column name 2022-02-21 12:07:17 +03:00
Teja Mupparti 46fa47beea Force-delegated functions' distribution argument must be reset as soon as the routine completes execution,
and not wait until the top level Executor ends. This fixes issue #5687
2022-02-17 10:48:30 -08:00
Marco Slot d0711ea9b4 Delegate function calls in FROM outside of transaction block 2022-02-09 20:56:25 +01:00
Teja Mupparti c8e504dd69 Fix the issue #5673
If the expression is simple, such as, SELECT function() or PEFORM function()
in PL/PgSQL code, PL engine does a simple expression evaluation which can't
interpret the Citus CustomScan Node. Code checks for simple expressions when
executing an UDF but missed the DO-Block scenario, this commit fixes it.
2022-02-04 15:44:53 -08:00
Onur Tirtir 79442df1b7
Fix coordinator/worker query targetlists for agg. that we cannot push-down (#5679)
Previously, we were wrapping targetlist nodes with Vars that reference
to the result of the worker query, if the node itself is not `Const` or
not a `Param`. Indeed, we should not do that unless the node itself is
a `Var` node or contains a `Var` within it (e.g.: `OpExpr(Var(column_a) > 2)`).
Otherwise, when worker query returns empty result set, then combine
query exec would crash since the `Var` would be pointing to an empty
tuple slot, which is not desirable for the node-executor methods.
2022-02-04 05:37:25 -08:00
Teja Mupparti f31bce5b48 Fixes the issue seen in https://github.com/citusdata/citus-enterprise/issues/745
With this commit, rebalancer backends are identified by application_name = citus_rebalancer
and the regular internal backends are identified by application_name = citus_internal
2022-02-03 09:40:46 -08:00
Marco Slot 63c6896716 Enable function call pushdown from workers 2022-02-01 14:13:25 +01:00
Teja Mupparti 54862f8c22 (1) Functions will be delegated even when present in the scope of an explicit
BEGIN/COMMIT transaction block or in a UDF calling another UDF.
(2) Prohibit/Limit the delegated function not to do a 2PC (or any work on a
remote connection).
(3) Have a safety net to ensure the (2) i.e. we should block the connections
from the delegated procedure or make sure that no 2PC happens on the node.
(4) Such delegated functions are restricted to use only the distributed argument
value.

Note: To limit the scope of the project we are considering only Functions(not
procedures) for the initial work.

DESCRIPTION: Introduce a new flag "force_delegation" in create_distributed_function(),
which will allow a function to be delegated in an explicit transaction block.

Fixes #3265

Once the function is delegated to the worker, on that node during the planning

distributed_planner()
TryToDelegateFunctionCall()
CheckDelegatedFunctionExecution()
EnableInForceDelegatedFuncExecution()
Save the distribution argument (Constant)
ExecutorStart()
CitusBeginScan()
IsShardKeyValueAllowed()
Ensure to not use non-distribution argument.

ExecutorRun()
AdaptiveExecutor()
StartDistributedExecution()
EnsureNoRemoteExecutionFromWorkers()
Ensure all the shards are local to the node in the remoteTaskList.
NonPushableInsertSelectExecScan()
InitializeCopyShardState()
EnsureNoRemoteExecutionFromWorkers()
Ensure all the shards are local to the node in the placementList.

This also fixes a minor issue: Properly handle expressions+parameters in distribution arguments
2022-01-19 16:43:33 -08:00
Marco Slot 33bfa0b191 Hide shards from application_name's with a specific prefix 2022-01-18 15:20:55 +04:00
Marco Slot ee3b50b026 Disallow remote execution from queries on shards 2022-01-07 17:46:21 +01:00
Ahmet Gedemenli 3c834e6693
Disable foreign distributed tables (#5605)
* Disable foreign distributed tables
* Add warning for existing distributed foreign tables
2022-01-07 18:12:23 +03:00
Talha Nisanci e196d23854
Refactor AttributeEquivalenceId (#5006) 2021-12-23 13:19:02 +03:00
Hanefi Onaldi 13fff9c37a Remove NOOP tuplestore_donestoring calls
PostgreSQL does not need calling this function since 7.4 release, and it
is a NOOP.

For more details, check PostgreSQL commit below :

commit dd04e958c8b03c0f0512497651678c7816af3198
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Sun Mar 9 03:34:10 2003 +0000

    tuplestore_donestoring() isn't needed anymore, but provide a no-op
    macro definition so as not to create compatibility problems.

diff --git a/src/include/utils/tuplestore.h b/src/include/utils/tuplestore.h
index b46babacd1..76fe9fb428 100644
--- a/src/include/utils/tuplestore.h
+++ b/src/include/utils/tuplestore.h
@@ -17,7 +17,7 @@
  * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $Id: tuplestore.h,v 1.8 2003/03/09 02:19:13 tgl Exp $
+ * $Id: tuplestore.h,v 1.9 2003/03/09 03:34:10 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -41,6 +41,9 @@ extern Tuplestorestate *tuplestore_begin_heap(bool randomAccess,

 extern void tuplestore_puttuple(Tuplestorestate *state, void *tuple);

+/* tuplestore_donestoring() used to be required, but is no longer used */
+#define tuplestore_donestoring(state)  ((void) 0)
+
 /* backwards scan is only allowed if randomAccess was specified 'true' */
 extern void *tuplestore_gettuple(Tuplestorestate *state, bool forward,
                                        bool *should_free);
2021-12-14 18:55:02 +03:00
Burak Velioglu 6d849cf394
Allow delegating function from worker nodes
We've both allowed delegating functions and procedures from worker nodes
and also prevented delegation if a function/procedure has already been
propagated from another node.
2021-12-06 19:25:51 +03:00
Onder Kalaci 121f5c4271 Active placements can only be on active nodes
We re-define the meaning of active shard placement. It used
to only be defined via shardstate == SHARD_STATE_ACTIVE.

Now, we also add one more check. The worker node that the
placement is on should be active as well.

This is a preparation for supporting citus_disable_node()
for MX with multiple failures at the same time.

With this change, the maintanince daemon only needs to
sync the "node metadata" (e.g., pg_dist_node), not the
shard metadata.
2021-11-26 09:14:33 +01:00
Onur Tirtir 81af605e07
Fix typo: "no sharding pruning constraints" -> "no shard pruning constraints" (#5490) 2021-11-25 21:00:44 +01:00
Sait Talha Nisanci ab29c25658 Fix missing from entry 2021-11-04 18:54:52 +03:00
Philip Dubé cc50682158 Fix typos. Spurred spotting "connectios" in logs 2021-10-25 13:54:09 +00:00
Önder Kalacı 3f726c72e0
When replication factor > 1, all modifications are done via 2PC (#5379)
With Citus 9.0, we introduced `citus.single_shard_commit_protocol` which
defaults to 2PC.

With this commit, we prevent any user to set it to 1PC and drop support
for `citus.single_shard_commit_protocol`.

Although this might add some overhead for users, it is already the default
behaviour (so less likely) and marking placements as INVALID is much
worse.
2021-10-20 01:39:03 -07:00
Marco Slot 93e79b9262 Never allow co-located joins of append-distributed tables 2021-10-18 21:11:16 +02:00
Marco Slot b97e5081c7 Disable co-located joins for append-distributed tables 2021-10-18 21:11:16 +02:00
Marco Slot dfad73d918 Disable implicit single re-partition joins for append tables 2021-10-18 21:11:16 +02:00
Marco Slot 2206e64e42 Disable single-repartition joins for append tables 2021-10-18 21:11:16 +02:00
Teja Mupparti a8348047c5
Pushdown procedures with OUT parameters (#5348) 2021-10-11 23:14:36 -07:00
Halil Ozan Akgul 43d5853b6d Fixes function names in comments 2021-10-06 09:24:43 +03:00
Halil Ozan Akgul 19af1cef2f Errors for CTEs with search clause
Relevant PG commit:
3696a600e2292d43c00949ddf0352e4ebb487e5b
2021-09-09 13:48:24 +03:00
SaitTalhaNisanci e3e0a028c7
return early in case we want to skip outer vars (#5259) 2021-09-09 10:53:36 +03:00
Sait Talha Nisanci 0b67fcf81d Fix style 2021-09-03 16:09:59 +03:00
Sait Talha Nisanci d1c0403055 Disable Query Idenfifier calculation in tests
When queryId is not 0 and verbose is true, the query identifier is
emitted to the explain output. This is breaking Postgres outputs.
We disable de query identifier calculation in the tests.
Commit on PG that introduced the query identifier in the explain output:
4f0b0966c866ae9f0e15d7cc73ccf7ce4e1af84b
2021-09-03 15:41:28 +03:00
Halil Ozan Akgul 8ef94dc1f5 Changes array_cat argument type from anyarray to anycompatiblearray
Relevant PG commit:
9e38c2bb5093ceb0c04d6315ccd8975bd17add66

fix array_cat_agg for pg upgrades

array_cat_agg now needs to take anycompatiblearray instead of anyarray
because array_cat changed its type from anyarray to anycompatiblearray
with pg14.

To handle upgrades correctly, we drop the aggregate in
citus_pg_prepare_upgrade. To be able to drop it, we first remove the
dependency from pg_depend.

Then we create the right aggregate in citus_finish_pg_upgrade and we
also add the dependency back to pg_depend.
2021-09-03 15:41:28 +03:00
Sait Talha Nisanci 29f5b99951 Use empty string instead of NULL for queryString
Postgres doesn't accept NULL for queryStrings in explain plans anymore.
Internally, there are some places in Postgres where they modified the
NULLS to ""(the empty string). So we do the same on citus side.

Commit on Postgres:
1111b2668d89bfcb6f502789158b1233ab4217a6
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 287706b717 Introduces SetTuplestoreDestReceiverParams_compat macro
SetTuplestoreDestReceiverParams function now has two new parameters
This new macro give us the ability to use this new parameter for PG14 and it doesn't give the parameter for previous versions
Existing parameters are set to NULL to keep previous behavior

Relevant PG commit:
2f48ede080f42b97b594fb14102c82ca1001b80c
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul b01e7e884c Pass NULL for plannerInfo as we don't generate PlaceHolderVars 2021-09-03 15:27:25 +03:00
Halil Ozan Akgul ebf1b7e23f Introduces macros for functions that now have include_out_arguments argument
New macros: FuncnameGetCandidates_compat and expand_function_arguments_compat

The functions (the ones without _compat) now have a new bool include_out_arguments parameter
These new macros give us the ability to use this new parameter for PG14 and it doesn't give the parameter for previous versions
Existing include_out_arguments parameters are set to 'false' to keep current behavior

Relevant PG commit:
e56bce5d43789cce95d099554ae9593ada92b3b7
2021-09-03 15:27:24 +03:00
SaitTalhaNisanci 5ae01303d4
Use get_attnum to find the attribute number of target entry (#5220)
* Use get_attnum to find the attribute number of target entry
2021-08-31 16:47:19 +03:00
SaitTalhaNisanci 2ec4e37e45
Fix assert failure in FindReferencedTableColumn (#5175) 2021-08-12 18:21:45 +03:00