This code is not needed anymore since #3668 was merged.
It's actually causing some issues when using the binary Postgres
protocol, because postgres thinks it gets a `bigint` from
the worker, but actually gets an normal `int`.
The query in question that fails is this:
```sql
CREATE TABLE test_table_1(id int, val1 int);
CREATE TABLE test_table_2(id int, val1 bigint);
SELECT create_distributed_table('test_table_1', 'id');
SELECT create_distributed_table('test_table_2', 'id');
INSERT INTO test_table_1 VALUES(1,1),(2,2),(3,3);
INSERT INTO test_table_2 VALUES(1,1),(3,3),(4,5);
SELECT val1
FROM test_table_1 LEFT JOIN test_table_2 USING(id, val1)
ORDER BY 1;
```
The difference in queries that is sent to the workers after this change is this, for this query:
```diff
--- query_old.sql 2020-06-09 09:51:21.460000000 +0200
+++ query_new.sql 2020-06-09 09:51:39.500000000 +0200
@@ -1 +1 @@
-SELECT worker_column_1 AS val1 FROM (SELECT test_table_1.val1 AS worker_column_1 FROM (public.test_table_1_102015 test_table_1(id, val1) LEFT JOIN public.test_table_2_102019 test_table_2(id, val1) USING (id, val1))) worker_subquery
+SELECT worker_column_1 AS val1 FROM (SELECT val1 AS worker_column_1 FROM (public.test_table_1_102015 test_table_1(id, val1) LEFT JOIN public.test_table_2_102019 test_table_2(id, val1) USING (id, val1))) worker_subquery
```
This is a different version of #3634. It also removes SwallowErrors, but
instead of modifying our own functions to not throw errors, it uses the
postgres built in `PathNameDeleteTemporaryDir` function. This function
does not throw errors.
Since this change is for a bugfix, I tried to minimize the changes.
PRs with the following changes would be good to do separately from this
PR:
1. Use PathName(Create|Open|Delete)Temporary(File|Dir) to open and
remove all files/dirs instead of our own custom file functions.
2. Prefix our outmost files/directories with `PG_TEMP_FILE_PREFIX` so
that they are identified by Postgres as temporary files, which will be
removed at postmaster start. This way we do not have to do this cleanup
ourselves.
3. Store the files in the temporary table space if it exists.
Fixes#3634Fixes#3618
Implements worker_save_query_explain_analyze and worker_last_saved_explain_analyze.
worker_save_query_explain_analyze executes and returns results of query while
saving its EXPLAIN ANALYZE to be fetched later.
worker_last_saved_explain_analyze returns the saved EXPLAIN ANALYZE result.
If we want to get necessary lockmode for a relation RangeVar within
a query, we can get the lockmode easily from the RangeVar itself (if
pg version >= 12).
However, if we want to decide the lockmode appropriate for the
"query", we can derive this information by using GetQueryLockMode
according to the code comment from RangeTblEntry->rellockmode.
Implements a new `TupleDestination` interface to allow custom tuple processing per task.
This can be specially useful if a task contains multiple queries. An example of this EXPLAIN
ANALYZE, where it needs to add some UDF calls to the query to fetch the explain output
from worker after fetching the actual query results.
We should check the remove type in IsDropCitusStmt because if the remove
type is not OBJECT_EXTENSION then the stored objects in
dropStmt->objects may not be of type Value. This was crashing PG-13.
Also rename the method as IsDropCitusExtensionStmt.
To reduce code duplication, implement function that pushes search_path
to be NIL and sets addCatalog to true so that all objects outside of
pg_catalog will be schema-prefixed.
Append IF NOT EXISTS to CREATE SERVER commands generated by
pg_get_serverdef_string function when deparsing an existing server
object that a foreign table depends.
SELECT_TASK is renamed to READ_TASK as a SELECT with modifying CTEs will be a MODIFYING_TASK
RouterInsertJob: Assert originalQuery->commandType == CMD_INSERT
CreateModifyPlan: Assert originalQuery->commandType != CMD_SELECT
Remove unused function IsModifyDistributedPlan
DistributedExecution, ExecutionParams, DistributedPlan: Rename hasReturning to expectResults
SELECTs set expectResults to true
Rename CreateSingleTaskRouterPlan to CreateSingleTaskRouterSelectPlan
Do not release AccessShareLock when closing pg_constraint to prevent
modifications to be done on pg_constraint to make sure that caller
will process valid foreign key constraints through the transaction.
With this commit:
You can trigger two types of hammerdb benchmark jobs:
-ch_benchmark (analytical and transactional queries)
-tpcc_benchmark (only transactional queries)
Your branch will be run against `master` branch.
In order to trigger the jobs prepend `ch_benchmark/` or `tpcc_benchmark/` to your branch and push it.
For example if you were running on a feature/improvement branch with name `improve/adaptive_executor`. In order to trigger a tpcc benchmark, you can do the following:
```bash
git checkout improve/adaptive_executor
git checkout -b tpcc_benchmark/improve/adaptive_executor
git push origin tpcc_benchmark/improve/adaptive_executor # the tpcc benchmark job will be triggered.
```
You will see the results in a branch in [https://github.com/citusdata/release-test-results](https://github.com/citusdata/release-test-results).
The branch name will be something like: `delete_me/citusbot_tpcc_benchmark_rg/<date>/<date>`.
The resource groups will be deleted automatically but if the benchmark fails, they won't be deleted(If you don't see the results after a reasonable time, it might mean it failed, you can check the resource usage from portal, if it is almost 0 and you didn't see the results, it means it probably failed). In that case, you will need to delete the resource groups manually from portal, the resource groups are `citusbot_ch_benchmark_rg` and `citusbot_tpcc_benchmark_rg`.
This PR removes ExecuteUtilityTaskListWithoutResults and uses the same
path for local execution via ExecuteTaskListExtended.
ExecuteUtilityTaskList is added. ExecuteLocalTaskListExtended now has a
parameter for utility commands so that it can call the right method. In
order not to change the existing calls,
ExecuteTaskListExtendedInternal is added, which is the main method that
runs the execution, via local and remote execution.
DESCRIPTION: Ignore pruned target list entries in coordinator plan
The postgres planner has the ability to prune target list entries that are proven not used in the output relation. When this happens at the `CitusCustomScan` boundary we need to _not_ return these pruned columns to not upset the rest of the planner.
By using the target list the planner asks us to return we fix issues that lead to Assertion failures, and potentially could be runtime errors when they hit in a production build.
Fixes#3809
* add a job to check if merge to enterprise master would fail
Add a job to check if merge to enterprise master would fail.
The job does the following:
- It checks if there is already a branch with the same name on
enterprise, if so it tries to merge it to enterprise master, if the
merge fails the job fails.
- If the branch doesn't exist on the enterprise, it tries to merge the
current branch to enterprise master, it fails if there is any conflict
while merging.
The motivation is that if a branch on community would create a conflict
on enterprise-master, until we create a PR on enterprise that would
solve this conflict, we won't be able to merge the PR on community. This
way we won't have many conflicts when merging to enterprise master and
the author, who has the most context will be responsible for resolving
the conflict when he has the most context, not after 1 month.
* Improve test suite to be able to easily run locally
* Add documentation on how to resolve conflicts to enterprise master
* Improve enterprise merge script
* Improve merge conflict job README
* Improve merge conflict job README
* Improve merge conflict job README
* Improve merge conflict job README
Co-authored-by: Nils Dijk <nils@citusdata.com>
In the code, we had the assumption that if restriction information
is NULL, it means that we cannot have any disributetd tables in
the subquery.
However, for subqueries in WHERE clause, that is not the case when
the subquery is ANDed with FALSE. In that case, Citus operates
on the originalQuery (which doesn't go through the standard_planner()),
and rely on the restriction information generated by standard_plannner().
As Postgres is smart enough to no generate restriction information for
subqueries ANDed with FALSE, we hit the assertion.
The reason is that PQconnectPoll() may change the underlying
socket. If we don't rebuild the wait event set, the low level
APIs (such as epoll_ctl()) may fail due to invalid sockets.
Instead, rebuilding ensures that we'll use accurate/active sockets.
* Not append empty task in ExtractLocalAndRemoteTasks
ExtractLocalAndRemoteTasks extracts the local and remote tasks. If we do
not have a local task the localTaskPlacementList will be NIL, in this
case we should not append anything to local tasks. Previously we would
first check if a task contains a single placement or not, now we first
check if there is any local task before doing anything.
* fix copy of node task
Task node has task query, which might contain a list of strings in its
fields. We were using postgres copyObject for these lists. Postgres
assumes that each element of list will be a node type. If it is not a
node type it will error.
As a solution to that, a new macro is introduced to copy a list of
strings.
The previous default was 5 seconds, and we change it to 30 seconds.
The main motivation for this is that for busy clusters, 5 seconds
can be too aggressive. Especially with connection throttling, the servers
might be kept busy for a really long time, and users may see the
connection errors more frequently.
We've done some sanity checks, for really quick queries (like
`SELECT count(*) from table`), 30 seconds is a decent value even
if users execute 300 distributed queries on the coordinator. We've
verified this on Hyperscale(Citus).
Physical planner doesn't support parameters. If the parameters have already
been resolved when the physical planner handling the queries, mark it.
The reason is that the executor is unaware of this, and sends the parameters
along with the worker queries, which fails for composite types.
(See `DissuadePlannerFromUsingPlan()` for the details of paramater resolving)
As reported in #3787, we were having issues while building citus with "GCC Red Hat 10" (maybe in some other versions of gcc as well).
Fixes "multiple definition of 'CitusNodeTagNames'" error by explicitly specifying storage of CitusNodeTagNames to be extern.