Commit Graph

754 Commits (7b90b52308c30e02142352aeb4be31fc52669d74)

Author SHA1 Message Date
Marco Slot c0827703ec Fix EXPLAIN ANALYZE JSON format for subplans 2022-04-07 11:38:20 +02:00
Marco Slot 544dce919a Handle user-defined type parameters in EXPLAIN ANALYZE 2022-04-07 11:14:32 +02:00
Marco Slot 8c8c3b665d Add TABLESAMPLE support 2022-04-01 15:51:40 +02:00
Jelte Fennema 3a44fa827a
Add versions of forboth that don't need ListCell (#5856)
We've had custom versions of Postgres its `foreach` macro which with a
hidden ListCell for quite some time now. People like these custom
macros, because they are easier to use and require less boilerplate.
This adds similar custom versions of Postgres its `forboth` macro. Now
you don't need ListCells anymore when looping over two lists at the same
time.
2022-03-23 14:50:36 +03:00
Onder Kalaci af4ba3eb1f Remove citus.enable_cte_inlining GUC
In Postgres 12+, users can adjust whether to inline/not inline CTEs
by [NOT] MATERIALIZED keywords. So, this GUC is already useless.
2022-03-22 17:14:44 +01:00
Jelte Fennema e5d5c7be93
Start erroring out for unsupported lateral subqueries (#5753)
With the introduction of #4385 we inadvertently started allowing and
pushing down certain lateral subqueries that were unsafe to push down.
To be precise the type of LATERAL subqueries that is unsafe to push down
has all of the following properties:
1. The lateral subquery contains some non recurring tuples
2. The lateral subquery references a recurring tuple from
   outside of the subquery (recurringRelids)
3. The lateral subquery requires a merge step (e.g. a LIMIT)
4. The reference to the recurring tuple should be something else than an
   equality check on the distribution column, e.g. equality on a non
   distribution column.


Property number four is considered both hard to detect and probably not
used very often. Thus this PR ignores property number four and causes
query planning to error out if the first three properties hold.

Fixes #5327
2022-03-11 11:59:18 +01:00
Marco Slot 49467e27e6
Ensure worker_save_query_explain_analyze always fully qualifies types (#5776)
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-03-10 07:30:11 -08:00
Gledis Zeneli 2cb02bfb56
Fix node adding itself with citus_add_node leading to deadlock (Fix #5720) (#5758)
If a worker node is being added, a command is sent to get the server_id of the worker from the pg_dist_node_metadata table. If the worker's id is the same as the node executing the code, we will know the node is trying to add itself. If the node tries to add itself without specifying `groupid:=0` the operation will result in an error.
2022-03-10 17:46:33 +03:00
Marco Slot ef1ceb3953 Only use a single placement for map tasks 2022-02-23 19:40:21 +01:00
Marco Slot 3cd9aa655a Stop using citus.binary_worker_copy_format 2022-02-23 19:40:21 +01:00
Marco Slot 5ac0d31e8b Fix re-partition hash range generation 2022-02-23 19:40:21 +01:00
Marco Slot 72d8fde28b Use intermediate results for re-partition joins 2022-02-23 19:40:21 +01:00
Ahmet Gedemenli 2bc6a00408 Refactor CreateDistributedTable to take column name 2022-02-21 12:07:17 +03:00
Teja Mupparti 46fa47beea Force-delegated functions' distribution argument must be reset as soon as the routine completes execution,
and not wait until the top level Executor ends. This fixes issue #5687
2022-02-17 10:48:30 -08:00
Marco Slot d0711ea9b4 Delegate function calls in FROM outside of transaction block 2022-02-09 20:56:25 +01:00
Teja Mupparti c8e504dd69 Fix the issue #5673
If the expression is simple, such as, SELECT function() or PEFORM function()
in PL/PgSQL code, PL engine does a simple expression evaluation which can't
interpret the Citus CustomScan Node. Code checks for simple expressions when
executing an UDF but missed the DO-Block scenario, this commit fixes it.
2022-02-04 15:44:53 -08:00
Onur Tirtir 79442df1b7
Fix coordinator/worker query targetlists for agg. that we cannot push-down (#5679)
Previously, we were wrapping targetlist nodes with Vars that reference
to the result of the worker query, if the node itself is not `Const` or
not a `Param`. Indeed, we should not do that unless the node itself is
a `Var` node or contains a `Var` within it (e.g.: `OpExpr(Var(column_a) > 2)`).
Otherwise, when worker query returns empty result set, then combine
query exec would crash since the `Var` would be pointing to an empty
tuple slot, which is not desirable for the node-executor methods.
2022-02-04 05:37:25 -08:00
Teja Mupparti f31bce5b48 Fixes the issue seen in https://github.com/citusdata/citus-enterprise/issues/745
With this commit, rebalancer backends are identified by application_name = citus_rebalancer
and the regular internal backends are identified by application_name = citus_internal
2022-02-03 09:40:46 -08:00
Marco Slot 63c6896716 Enable function call pushdown from workers 2022-02-01 14:13:25 +01:00
Teja Mupparti 54862f8c22 (1) Functions will be delegated even when present in the scope of an explicit
BEGIN/COMMIT transaction block or in a UDF calling another UDF.
(2) Prohibit/Limit the delegated function not to do a 2PC (or any work on a
remote connection).
(3) Have a safety net to ensure the (2) i.e. we should block the connections
from the delegated procedure or make sure that no 2PC happens on the node.
(4) Such delegated functions are restricted to use only the distributed argument
value.

Note: To limit the scope of the project we are considering only Functions(not
procedures) for the initial work.

DESCRIPTION: Introduce a new flag "force_delegation" in create_distributed_function(),
which will allow a function to be delegated in an explicit transaction block.

Fixes #3265

Once the function is delegated to the worker, on that node during the planning

distributed_planner()
TryToDelegateFunctionCall()
CheckDelegatedFunctionExecution()
EnableInForceDelegatedFuncExecution()
Save the distribution argument (Constant)
ExecutorStart()
CitusBeginScan()
IsShardKeyValueAllowed()
Ensure to not use non-distribution argument.

ExecutorRun()
AdaptiveExecutor()
StartDistributedExecution()
EnsureNoRemoteExecutionFromWorkers()
Ensure all the shards are local to the node in the remoteTaskList.
NonPushableInsertSelectExecScan()
InitializeCopyShardState()
EnsureNoRemoteExecutionFromWorkers()
Ensure all the shards are local to the node in the placementList.

This also fixes a minor issue: Properly handle expressions+parameters in distribution arguments
2022-01-19 16:43:33 -08:00
Marco Slot 33bfa0b191 Hide shards from application_name's with a specific prefix 2022-01-18 15:20:55 +04:00
Marco Slot ee3b50b026 Disallow remote execution from queries on shards 2022-01-07 17:46:21 +01:00
Ahmet Gedemenli 3c834e6693
Disable foreign distributed tables (#5605)
* Disable foreign distributed tables
* Add warning for existing distributed foreign tables
2022-01-07 18:12:23 +03:00
Talha Nisanci e196d23854
Refactor AttributeEquivalenceId (#5006) 2021-12-23 13:19:02 +03:00
Hanefi Onaldi 13fff9c37a Remove NOOP tuplestore_donestoring calls
PostgreSQL does not need calling this function since 7.4 release, and it
is a NOOP.

For more details, check PostgreSQL commit below :

commit dd04e958c8b03c0f0512497651678c7816af3198
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Sun Mar 9 03:34:10 2003 +0000

    tuplestore_donestoring() isn't needed anymore, but provide a no-op
    macro definition so as not to create compatibility problems.

diff --git a/src/include/utils/tuplestore.h b/src/include/utils/tuplestore.h
index b46babacd1..76fe9fb428 100644
--- a/src/include/utils/tuplestore.h
+++ b/src/include/utils/tuplestore.h
@@ -17,7 +17,7 @@
  * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $Id: tuplestore.h,v 1.8 2003/03/09 02:19:13 tgl Exp $
+ * $Id: tuplestore.h,v 1.9 2003/03/09 03:34:10 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -41,6 +41,9 @@ extern Tuplestorestate *tuplestore_begin_heap(bool randomAccess,

 extern void tuplestore_puttuple(Tuplestorestate *state, void *tuple);

+/* tuplestore_donestoring() used to be required, but is no longer used */
+#define tuplestore_donestoring(state)  ((void) 0)
+
 /* backwards scan is only allowed if randomAccess was specified 'true' */
 extern void *tuplestore_gettuple(Tuplestorestate *state, bool forward,
                                        bool *should_free);
2021-12-14 18:55:02 +03:00
Burak Velioglu 6d849cf394
Allow delegating function from worker nodes
We've both allowed delegating functions and procedures from worker nodes
and also prevented delegation if a function/procedure has already been
propagated from another node.
2021-12-06 19:25:51 +03:00
Onder Kalaci 121f5c4271 Active placements can only be on active nodes
We re-define the meaning of active shard placement. It used
to only be defined via shardstate == SHARD_STATE_ACTIVE.

Now, we also add one more check. The worker node that the
placement is on should be active as well.

This is a preparation for supporting citus_disable_node()
for MX with multiple failures at the same time.

With this change, the maintanince daemon only needs to
sync the "node metadata" (e.g., pg_dist_node), not the
shard metadata.
2021-11-26 09:14:33 +01:00
Onur Tirtir 81af605e07
Fix typo: "no sharding pruning constraints" -> "no shard pruning constraints" (#5490) 2021-11-25 21:00:44 +01:00
Sait Talha Nisanci ab29c25658 Fix missing from entry 2021-11-04 18:54:52 +03:00
Philip Dubé cc50682158 Fix typos. Spurred spotting "connectios" in logs 2021-10-25 13:54:09 +00:00
Önder Kalacı 3f726c72e0
When replication factor > 1, all modifications are done via 2PC (#5379)
With Citus 9.0, we introduced `citus.single_shard_commit_protocol` which
defaults to 2PC.

With this commit, we prevent any user to set it to 1PC and drop support
for `citus.single_shard_commit_protocol`.

Although this might add some overhead for users, it is already the default
behaviour (so less likely) and marking placements as INVALID is much
worse.
2021-10-20 01:39:03 -07:00
Marco Slot 93e79b9262 Never allow co-located joins of append-distributed tables 2021-10-18 21:11:16 +02:00
Marco Slot b97e5081c7 Disable co-located joins for append-distributed tables 2021-10-18 21:11:16 +02:00
Marco Slot dfad73d918 Disable implicit single re-partition joins for append tables 2021-10-18 21:11:16 +02:00
Marco Slot 2206e64e42 Disable single-repartition joins for append tables 2021-10-18 21:11:16 +02:00
Teja Mupparti a8348047c5
Pushdown procedures with OUT parameters (#5348) 2021-10-11 23:14:36 -07:00
Halil Ozan Akgul 43d5853b6d Fixes function names in comments 2021-10-06 09:24:43 +03:00
Halil Ozan Akgul 19af1cef2f Errors for CTEs with search clause
Relevant PG commit:
3696a600e2292d43c00949ddf0352e4ebb487e5b
2021-09-09 13:48:24 +03:00
SaitTalhaNisanci e3e0a028c7
return early in case we want to skip outer vars (#5259) 2021-09-09 10:53:36 +03:00
Sait Talha Nisanci 0b67fcf81d Fix style 2021-09-03 16:09:59 +03:00
Sait Talha Nisanci d1c0403055 Disable Query Idenfifier calculation in tests
When queryId is not 0 and verbose is true, the query identifier is
emitted to the explain output. This is breaking Postgres outputs.
We disable de query identifier calculation in the tests.
Commit on PG that introduced the query identifier in the explain output:
4f0b0966c866ae9f0e15d7cc73ccf7ce4e1af84b
2021-09-03 15:41:28 +03:00
Halil Ozan Akgul 8ef94dc1f5 Changes array_cat argument type from anyarray to anycompatiblearray
Relevant PG commit:
9e38c2bb5093ceb0c04d6315ccd8975bd17add66

fix array_cat_agg for pg upgrades

array_cat_agg now needs to take anycompatiblearray instead of anyarray
because array_cat changed its type from anyarray to anycompatiblearray
with pg14.

To handle upgrades correctly, we drop the aggregate in
citus_pg_prepare_upgrade. To be able to drop it, we first remove the
dependency from pg_depend.

Then we create the right aggregate in citus_finish_pg_upgrade and we
also add the dependency back to pg_depend.
2021-09-03 15:41:28 +03:00
Sait Talha Nisanci 29f5b99951 Use empty string instead of NULL for queryString
Postgres doesn't accept NULL for queryStrings in explain plans anymore.
Internally, there are some places in Postgres where they modified the
NULLS to ""(the empty string). So we do the same on citus side.

Commit on Postgres:
1111b2668d89bfcb6f502789158b1233ab4217a6
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 287706b717 Introduces SetTuplestoreDestReceiverParams_compat macro
SetTuplestoreDestReceiverParams function now has two new parameters
This new macro give us the ability to use this new parameter for PG14 and it doesn't give the parameter for previous versions
Existing parameters are set to NULL to keep previous behavior

Relevant PG commit:
2f48ede080f42b97b594fb14102c82ca1001b80c
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul b01e7e884c Pass NULL for plannerInfo as we don't generate PlaceHolderVars 2021-09-03 15:27:25 +03:00
Halil Ozan Akgul ebf1b7e23f Introduces macros for functions that now have include_out_arguments argument
New macros: FuncnameGetCandidates_compat and expand_function_arguments_compat

The functions (the ones without _compat) now have a new bool include_out_arguments parameter
These new macros give us the ability to use this new parameter for PG14 and it doesn't give the parameter for previous versions
Existing include_out_arguments parameters are set to 'false' to keep current behavior

Relevant PG commit:
e56bce5d43789cce95d099554ae9593ada92b3b7
2021-09-03 15:27:24 +03:00
SaitTalhaNisanci 5ae01303d4
Use get_attnum to find the attribute number of target entry (#5220)
* Use get_attnum to find the attribute number of target entry
2021-08-31 16:47:19 +03:00
SaitTalhaNisanci 2ec4e37e45
Fix assert failure in FindReferencedTableColumn (#5175) 2021-08-12 18:21:45 +03:00
SaitTalhaNisanci 4559d02c41
Fix union pushdown issue (#5079)
* Fix UNION not being pushdown

Postgres optimizes column fields that are not needed in the output. We
were relying on these fields to understand if it is safe to push down a
union query.

This fix looks at the parse query, which has the original column fields
to detect if it is safe to push down a union query.

* Add more tests

* Simplify code and make it more robust

* Process varlevelsup > 0 in FindReferencedTableColumn

* Only look for outers vars in union path

* Add more comments

* Remove UNION ALL specific logic for pulling up childvars
2021-07-29 13:52:55 +03:00
Onder Kalaci 2c349e6dfd Use current user to sync metadata
Before this commit, we always synced the metadata with superuser.
However, that creates various edge cases such as visibility errors
or self distributed deadlocks or complicates user access checks.

Instead, with this commit, we use the current user to sync the metadata.
Note that, `start_metadata_sync_to_node` still requires super user
because accessing certain metadata (like pg_dist_node) always require
superuser (e.g., the current user should be a superuser).

However, metadata syncing operations regarding the distributed
tables can now be done with regular users, as long as the user
is the owner of the table. A table owner can still insert non-sense
metadata, however it'd only affect its own table. So, we cannot do
anything about that.
2021-07-16 13:25:27 +02:00