Commit Graph

3588 Commits (95c4aed73bb00d7d54ab9fc262a085a8cfbe6b62)

Author SHA1 Message Date
Onur Tirtir 95c4aed73b Add changelog entry for 9.3.0 (#3823)
(cherry picked from commit ac1ec40bfb)
2020-05-07 16:11:53 +03:00
Nils Dijk 7973d43ba4 Fix for pruned target list entries (#3818)
DESCRIPTION: Ignore pruned target list entries in coordinator plan

The postgres planner has the ability to prune target list entries that are proven not used in the output relation. When this happens at the `CitusCustomScan` boundary we need to _not_ return these pruned columns to not upset the rest of the planner.

By using the target list the planner asks us to return, we fix issues that lead to assertion failures and could potentially become runtime errors in a production build.

Fixes #3809
(cherry picked from commit 105de7beb8)
2020-05-06 15:18:41 +03:00
Hadi Moshayedi e7e36dddca Don't error out when maintenanced cannot be created
(cherry picked from commit dbf509bbdd)
2020-05-06 15:16:52 +03:00
Marco Slot 2b8db771ef Make sure we don't wrap GROUP BY expressions in any_value 2020-05-06 06:26:28 +02:00
Onder Kalaci 5474508c01 Rebuild wait event sets after PQconnectPoll() if socket changes
The reason is that PQconnectPoll() may change the underlying
socket. If we don't rebuild the wait event set, the low level
APIs (such as epoll_ctl()) may fail due to invalid sockets.
Instead, rebuilding ensures that we'll use accurate/active sockets.

(cherry picked from commit 77c397e9ae)
2020-05-04 09:02:48 +02:00
Jelte Fennema 8f9b1a839f Add some asserts to pass static analysis (#3805)
(cherry picked from commit c6f5d5fe88)
2020-04-29 11:19:47 +02:00
SaitTalhaNisanci 15fc7821a8 Fix task copy and appending empty task in ExtractLocalAndRemoteTasks (#3802)
* Do not append empty task in ExtractLocalAndRemoteTasks

ExtractLocalAndRemoteTasks extracts the local and remote tasks. If we do
not have a local task the localTaskPlacementList will be NIL, in this
case we should not append anything to local tasks. Previously we would
first check if a task contains a single placement or not, now we first
check if there is any local task before doing anything.

* fix copy of node task

Task node has task query, which might contain a list of strings in its
fields. We were using postgres copyObject for these lists. Postgres
assumes that each element of list will be a node type. If it is not a
node type it will error.

As a solution to that, a new macro is introduced to copy a list of
strings.
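
The difference between a node-aware copy and a string copy can be sketched in plain C. This is only an illustration: `ExampleStringList`, `example_copy_string_list` and `example_free_string_list` are hypothetical names, not the macro this commit introduces and not PostgreSQL's `List` API. The point is that each element is duplicated as a string, so nothing ever tries to read a node tag from it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list whose elements are plain C strings, not tagged nodes. */
typedef struct ExampleStringList
{
	size_t length;
	char **items;
} ExampleStringList;

/* Free a list produced by example_copy_string_list; NULL is accepted. */
void
example_free_string_list(ExampleStringList *list)
{
	if (list == NULL)
	{
		return;
	}
	for (size_t i = 0; i < list->length; i++)
	{
		free(list->items[i]);
	}
	free(list->items);
	free(list);
}

/*
 * Deep-copy a list of strings. Every element is duplicated as a string;
 * no node tag is inspected. Returns NULL if an allocation fails.
 */
ExampleStringList *
example_copy_string_list(const ExampleStringList *source)
{
	assert(source != NULL);

	ExampleStringList *copy = malloc(sizeof(ExampleStringList));
	if (copy == NULL)
	{
		return NULL;
	}

	copy->length = source->length;

	/* calloc zeroes the slots, so a partially filled copy can be freed */
	copy->items = calloc(source->length + 1, sizeof(char *));
	if (copy->items == NULL)
	{
		free(copy);
		return NULL;
	}

	for (size_t i = 0; i < source->length; i++)
	{
		size_t size = strlen(source->items[i]) + 1;

		copy->items[i] = malloc(size);
		if (copy->items[i] == NULL)
		{
			example_free_string_list(copy);
			return NULL;
		}
		memcpy(copy->items[i], source->items[i], size);
	}

	return copy;
}
```

The copy owns its own buffers, so changing or freeing the source list afterwards does not affect it.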

(cherry picked from commit cbda951395)
2020-04-29 11:41:15 +03:00
Philip Dubé 962bdc67af Fix COPY TO's COPY (SELECT) with distributed table having generated columns
It's necessary to omit generated columns from output

(cherry picked from commit b6b3c1bc17)
2020-04-29 10:50:01 +03:00
SaitTalhaNisanci f7f1b6cc5e Fix typo: longer visible -> no longer visible (#3803) 2020-04-27 16:40:33 +03:00
Onder Kalaci 128273393f Increase the default value of citus.node_connection_timeout
The previous default was 5 seconds, and we change it to 30 seconds.
The main motivation for this is that for busy clusters, 5 seconds
can be too aggressive. Especially with connection throttling, the servers
might be kept busy for a really long time, and users may see the
connection errors more frequently.

We've done some sanity checks, for really quick queries (like
`SELECT count(*) from table`), 30 seconds is a decent value even
if users execute 300 distributed queries on the coordinator. We've
verified this on Hyperscale(Citus).

(cherry picked from commit bc54c5125f)
2020-04-24 16:35:29 +02:00
Onder Kalaci 95ba5dd39c Explicitly mark queries in physical planner for [not] having parameters
Physical planner doesn't support parameters. If the parameters have already
been resolved when the physical planner handles the queries, mark it.
The reason is that the executor is unaware of this, and sends the parameters
along with the worker queries, which fails for composite types.

(See `DissuadePlannerFromUsingPlan()` for the details of parameter resolving)

(cherry picked from commit 0cb7ab2d05)
2020-04-24 13:23:41 +02:00
Onur Tirtir 80da0ed69a Fix build issue in GCC 10 (#3790)
As reported in #3787, we were having issues while building citus with "GCC Red Hat 10" (maybe in some other versions of gcc as well).
Fixes "multiple definition of 'CitusNodeTagNames'" error by explicitly specifying storage of CitusNodeTagNames to be extern.
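
For context, GCC 10 defaults to `-fno-common`, so a global that is defined (rather than only declared) in a header ends up defined once per file that includes it, and the link fails. A minimal single-file sketch of the declaration/definition split follows. `ExampleNodeTagNames` and `example_node_tag_name` are hypothetical names chosen for illustration; the actual change to `CitusNodeTagNames` may look different.

```c
#include <assert.h>
#include <stddef.h>

/*
 * What would live in the shared header: a declaration only. `extern`
 * says the storage is defined elsewhere, so every file that includes
 * the header refers to the same array instead of creating its own.
 */
extern const char *ExampleNodeTagNames[];

/* What would live in exactly one .c file: the single definition. */
const char *ExampleNodeTagNames[] = {
	"ExampleJob",
	"ExampleTask"
};

#define EXAMPLE_NODE_TAG_COUNT \
	(sizeof(ExampleNodeTagNames) / sizeof(ExampleNodeTagNames[0]))

/* Look up a tag name by index; the index must be in range. */
const char *
example_node_tag_name(size_t tag)
{
	assert(tag < EXAMPLE_NODE_TAG_COUNT);
	return ExampleNodeTagNames[tag];
}
```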
(cherry picked from commit b8dd8f50d1)
2020-04-22 17:45:08 +03:00
Onur Tirtir 64f9ba2746 Bump Citus version to 9.3.0 2020-04-22 10:22:42 +03:00
Hanefi Onaldi 2e0cb6160c
Merge pull request #3786 from citusdata/coord-skip-dep-setup
Skip dependency setup on coordinator node
2020-04-21 15:29:26 +03:00
Hanefi Önaldı e85b835065
Skip dependency setup on coordinator node 2020-04-21 12:06:31 +03:00
Philip Dubé 2e5b1bfa41
Merge pull request #3756 from citusdata/fix-maintenanced-error-restart
maintenanced: use before_shmem_exit to clear workerPid
2020-04-20 14:57:30 +00:00
Philip Dubé 9093d51a22 maintenanced: handle before_shmem_exit, assert workerPid == 0 on start 2020-04-20 14:41:40 +00:00
Jelte Fennema 1423433531
Fix running check-isolation-base (#3782) 2020-04-20 15:36:09 +02:00
Önder Kalacı 793c65b539
Merge pull request #3606 from citusdata/improve_error_messages
Improve connection error message from the worker nodes
2020-04-20 13:47:43 +02:00
Onder Kalaci e182215d96 Improve connection error message from the worker nodes
We currently put the actual error message to the detail part. However,
many drivers don't show detail part.

As connection errors are somewhat common and hard to trace back,
we added the detail to the message itself.

In addition to that, we changed "connection error" message, as it
was confusing to the users who think that the error was happening
while connecting to the coordinator. In fact, this error is showing
up when the coordinator fails to connect remote nodes.
2020-04-20 13:32:55 +02:00
Hadi Moshayedi 797180e0e3
Merge pull request #3778 from citusdata/more_replicate
Replicate reference tables before master_create_empty_shard
2020-04-17 16:54:59 -07:00
Hadi Moshayedi 1250d691d3 Replicate reference tables before master_create_empty_shard 2020-04-17 16:47:03 -07:00
Philip Dubé c03d3714b3
Merge pull request #3779 from citusdata/insert-select-copy-cache-entry
Try copying shard intervals out of cache for long lived borrow
2020-04-17 22:49:58 +00:00
Philip Dubé 8e79672839 Try copying shard intervals out of cache for long lived borrow 2020-04-17 22:00:41 +00:00
Philip Dubé a461ef20d9
Merge pull request #3769 from citusdata/avoid-invalidating-live-cache-entries
Avoid invalidating live cache entries
2020-04-17 15:22:10 +00:00
Philip Dubé c00d57a955 CreateDistributedInsertSelectPlan: avoid calling GetCitusTableCacheEntry in a way that would invalidate live ShardInterval pointers 2020-04-17 14:44:23 +00:00
SaitTalhaNisanci 1d0f4bdcd2
invalidate plan cache in master_update_node (#3758)
* invalidate plan cache in master_update_node

If a plan is cached by postgres but a user uses master_update_node, then
when the plan cache is used for the updated node, they will get the old
nodename/nodeport in the plan. This is because the plan cache doesn't
know about the master_update_node. This could be a problem in prepared
statements or anything that goes into plancache. As a solution the plan
cache is invalidated inside master_update_node.

* add invalidate_inactive_shared_connections test function

We introduce invalidate_inactive_shared_connections udf to be used in
testing. It is possible that a connection count for an inactive node
will be greater than 0 and in that case it will not be removed at the
time of invalidation. However, later we don't have a mechanism to remove
it, which means that it will stay in the hash. For this not to cause a
problem, we use this udf in testing.

* move invalidate_inactive_shared_connections to udfs from test as it will be used in mx

* remove the test udf

* remove the IsInactive check
2020-04-17 17:43:48 +03:00
Philip Dubé ae391c4f4b
Merge pull request #3768 from citusdata/avoid-long-lived-metadata
Copy data from CitusTableCacheEntry more often
2020-04-17 14:25:51 +00:00
Philip Dubé c0a95a3adb Copy data from CitusTableCacheEntry more often
This copies over fixes from the reference counting branch.
All CitusTableCacheEntry data may be freed when a GetCitusTableCacheEntry call occurs for its relationId

This fix is not complete, but reference counting is being deferred until 9.4

CopyShardInterval: remove dest parameter, always return newly allocated object
2020-04-17 14:17:18 +00:00
Önder Kalacı a919f09c96
Remove the entries from the shared connection counter hash when no connections remain (#3775)
We initially considered removing entries just before any change to
pg_dist_node. However, that ended up being very complex and making
MX even more complex.

Instead, we're switching to a simpler solution, where we remove entries
when the counter gets to 0.

With certain workloads, this may have some performance penalty. But, two
notes on that:
 - When counter == 0, it implies that the cluster is not busy
 - With cached connections, that's not possible
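
The "remove when the counter reaches 0" rule can be sketched with a small fixed-size table. This is a simplified illustration under stated assumptions: the real counter lives in a shared-memory hash protected by a lock, while `ExampleCounterTable` and the `example_*` functions below are hypothetical, single-process stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define EXAMPLE_MAX_NODES 8
#define EXAMPLE_NAME_LENGTH 64

/* One slot per worker node; hypothetical stand-in for a hash entry. */
typedef struct ExampleCounterEntry
{
	bool inUse;
	char nodeName[EXAMPLE_NAME_LENGTH];
	int connectionCount;
} ExampleCounterEntry;

/* The caller must zero-initialize the table before first use. */
typedef struct ExampleCounterTable
{
	ExampleCounterEntry entries[EXAMPLE_MAX_NODES];
} ExampleCounterTable;

/* Return the live entry for nodeName, or NULL if there is none. */
ExampleCounterEntry *
example_find_entry(ExampleCounterTable *table, const char *nodeName)
{
	for (size_t i = 0; i < EXAMPLE_MAX_NODES; i++)
	{
		ExampleCounterEntry *entry = &table->entries[i];

		if (entry->inUse && strcmp(entry->nodeName, nodeName) == 0)
		{
			return entry;
		}
	}
	return NULL;
}

/* Count one more connection, creating the entry on demand. */
bool
example_increment(ExampleCounterTable *table, const char *nodeName)
{
	ExampleCounterEntry *entry = example_find_entry(table, nodeName);

	if (entry == NULL)
	{
		for (size_t i = 0; i < EXAMPLE_MAX_NODES && entry == NULL; i++)
		{
			if (!table->entries[i].inUse)
			{
				entry = &table->entries[i];
			}
		}
		if (entry == NULL)
		{
			return false; /* table is full */
		}
		entry->inUse = true;
		entry->connectionCount = 0;
		/* names longer than the buffer are truncated in this sketch */
		snprintf(entry->nodeName, sizeof(entry->nodeName), "%s", nodeName);
	}

	entry->connectionCount++;
	return true;
}

/* Count one connection less; drop the entry once nothing remains. */
void
example_decrement(ExampleCounterTable *table, const char *nodeName)
{
	ExampleCounterEntry *entry = example_find_entry(table, nodeName);

	assert(entry != NULL && entry->connectionCount > 0);
	entry->connectionCount--;
	if (entry->connectionCount == 0)
	{
		entry->inUse = false; /* slot becomes reusable */
	}
}
```

Because removal is tied to the counter itself, no separate cleanup pass is needed when node metadata changes.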
2020-04-17 17:14:58 +03:00
Philip Dubé 79f6f3c02c
Merge pull request #3757 from citusdata/fix-window-function-assertion-failure
Avoid setting hasWindowFuncs true after window functions have been optimized out of query
2020-04-17 12:39:32 +00:00
Philip Dubé e4a4707f4a Avoid setting hasWindowFuncs true after window functions have been optimized out of query 2020-04-17 12:22:48 +00:00
SaitTalhaNisanci a9a3be15cc
introduce TASK_QUERY_NULL task type (#3774)
When we call SetTaskQueryString we would set the task type to
TASK_QUERY_TEXT, and some parts of the codebase rely on the fact that if
TASK_QUERY_TEXT is set, the data can be read safely. However if
SetTaskQueryString is called with a NULL taskQueryString this can cause
crashes. In that case taskQueryType will simply be set to
TASK_QUERY_NULL.
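
A minimal sketch of the tagging idea, assuming a much simpler task struct than the real one. `ExampleTask`, `ExampleTaskQueryType` and the `example_*` functions are hypothetical; only the rule "a NULL string gets the NULL tag" is taken from the message above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags mirroring the idea of TASK_QUERY_NULL/TASK_QUERY_TEXT. */
typedef enum ExampleTaskQueryType
{
	EXAMPLE_TASK_QUERY_NULL,
	EXAMPLE_TASK_QUERY_TEXT
} ExampleTaskQueryType;

typedef struct ExampleTask
{
	ExampleTaskQueryType queryType;
	const char *queryString;
} ExampleTask;

/* Store the query string and keep the tag consistent with it. */
void
example_set_task_query_string(ExampleTask *task, const char *queryString)
{
	assert(task != NULL);

	if (queryString == NULL)
	{
		/* never claim TEXT when there is nothing to read */
		task->queryType = EXAMPLE_TASK_QUERY_NULL;
		task->queryString = NULL;
		return;
	}

	task->queryType = EXAMPLE_TASK_QUERY_TEXT;
	task->queryString = queryString;
}

/* Readers check the tag first, so a TEXT tag implies a readable string. */
const char *
example_task_query_string(const ExampleTask *task)
{
	assert(task != NULL);

	if (task->queryType != EXAMPLE_TASK_QUERY_TEXT)
	{
		return NULL;
	}
	return task->queryString;
}
```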
2020-04-17 14:59:22 +03:00
Hanefi Onaldi 2d50f63841
Merge pull request #3752 from citusdata/local-truncate
UDF to truncate local data after distributing table
2020-04-17 13:50:58 +03:00
Hanefi Önaldı 0c5d0cfee9
Notice message to help truncate local data after distribution 2020-04-17 13:21:34 +03:00
Hanefi Önaldı d535121f8d
Introduce truncate_local_data_after_distributing_table() 2020-04-17 13:21:34 +03:00
Marco Slot c3324f8962
Merge pull request #3772 from citusdata/fixes_from_enterprise
Use block_writes for replicate_reference_tables
2020-04-17 12:07:57 +02:00
Hadi Moshayedi 61198251fd Use block_writes for replicate_reference_tables 2020-04-16 19:25:41 -07:00
Nils Dijk 1d6ba1d09e
Refactor alter role to work on distributed roles (#3739)
DESCRIPTION: Alter role only works for citus managed roles

Alter role was implemented before we implemented good role management that hooks into the object propagation framework. This is a refactor of all alter role commands that have been implemented to
 - be on by default
 - only work for supported roles
 - make the citus extension owner a supported role

Instead of distributing the alter role commands for roles at the beginning of node activation, it now _only_ executes the alter role commands for all users in all databases and in the current database.

In preparation of full role support small refactors have been done in the deparser.

Earlier tests targeting other roles than the citus extension owner have been either slightly changed or removed to be put back where we have full role support.

Fixes #2549
2020-04-16 12:23:27 +02:00
Hadi Moshayedi e0eba87b6c
Merge pull request #3764 from citusdata/fix_stuck
Detect deadlocks in replicate_reference_tables()
2020-04-15 13:18:38 -07:00
Hadi Moshayedi 59b9a4e5a1 Detect deadlocks in replicate_reference_tables() 2020-04-15 11:06:18 -07:00
SaitTalhaNisanci df9048ebaa
update outdated comments related to local_execution (#3759) 2020-04-15 16:15:43 +03:00
Marco Slot 5bd4970fac
Merge pull request #3017 from citusdata/fix/notices
Propagate notices from queries as notices
2020-04-15 11:50:56 +02:00
Marco Slot 8b83306a27 Issue worker messages with the same log level 2020-04-14 21:08:25 +02:00
SaitTalhaNisanci 132efdbc56
add execution params struct (#3747)
We had 9+ parameters in some of the functions related to execution.
Execution params is created to simplify this a bit so that we can set
only the fields that we are interested in and it is easier to read.
2020-04-14 14:32:40 +03:00
SaitTalhaNisanci d58b5e67c1
not run multi_router_planner_fast_path in parallel (#3744) 2020-04-14 13:14:23 +03:00
Önder Kalacı 9229db2081
Merge pull request #3692 from citusdata/shared_connection_counter
Throttle connections to the worker nodes
2020-04-14 10:37:57 +02:00
Onder Kalaci aa6b641828 Throttle connections to the worker nodes
With this commit, we're introducing a new infrastructure to throttle
connections to the worker nodes. This infrastructure is useful for
multi-shard queries; router queries are not affected by this.

The goal is to prevent establishing more than citus.max_shared_pool_size
number of connections per worker node in total, across sessions.

To do that, we've introduced a new connection flag OPTIONAL_CONNECTION.
The idea is that some connections are optional such as the second
(and further connections) for the adaptive executor. A single connection
is enough to finish the distributed execution, the others are useful to
execute the query faster. Thus, they can be considered optional connections.
When an optional connection is not allowed to the adaptive executor, it
simply skips it and continues the execution with the already established
connections. However, it'll keep retrying to establish optional
connections, in case some slots are open again.
2020-04-14 10:27:48 +02:00
Onder Kalaci 38b8a9ad62 Add citus_remote_connection_stats() function
This function is intended to be used for monitoring
the remote connections.
2020-04-14 10:03:27 +02:00
Onder Kalaci 0dbfbe0c37 Add the necessary shared memory infrastructure
- The hashmap in the shared memory
- The lock to access the hashmap
- The GUC to control the size
2020-04-14 10:03:26 +02:00