mirror of https://github.com/citusdata/citus.git
Apply suggestions from code review

Co-authored-by: Steven Sheehy <17552371+steven-sheehy@users.noreply.github.com>
Co-authored-by: Marco Slot <marco.slot@gmail.com>

Branch: pull/7638/head
Parent: d59466fce3
Commit: c3d1ad3f9d
@@ -2368,12 +2368,12 @@ necessary to change one of them to add a feature/fix a bug.
 The rebalancing algorithm tries to find an optimal placement of shard groups
 across nodes. This is not an easy job, because this is a [co-NP-complete
-problem](https://en.wikipedia.org/wiki/Knapsack_problem). So instead going for
+problem](https://en.wikipedia.org/wiki/Knapsack_problem). So instead of going for
 the fully optimal solution it uses a greedy approach to reach a local
-optimimum, which so far has proved effective in getting to a pretty optimal
+optimum, which so far has proved effective in getting to a pretty optimal
 solution.

-Eventhough it won't result in the perfect balance, the greedy aproach has two
+Even though it won't result in the perfect balance, the greedy approach has two
 important practical benefits over a perfect solution:
 1. It's relatively easy to understand why the algorithm decided on a certain move.
 2. Every move makes the balance better. So if the rebalance is cancelled midway
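The greedy approach the hunk above describes can be illustrated with a small sketch: repeatedly pick the single move that most improves balance and stop when no move helps, i.e. at a local optimum. This is a hypothetical Python illustration of the general technique, not the actual C implementation in Citus; all names (`best_move`, `rebalance`, etc.) are made up.

```python
# Illustrative greedy rebalancer: apply the single move that most reduces
# the utilization spread between nodes, stopping at a local optimum.
# Hypothetical sketch only; the real algorithm lives in Citus's C code.

def utilization(node, placements, costs):
    # Capacity is assumed to be 1 for every node, as in the text.
    return sum(costs[s] for s in placements[node])

def best_move(placements, costs):
    """Return the most improving (shard, source, target) move,
    or None when the placement is at a local optimum."""
    def spread(p):
        utils = [utilization(n, p, costs) for n in p]
        return max(utils) - min(utils)

    current = spread(placements)
    best = None
    for src, shards in placements.items():
        for shard in shards:
            for dst in placements:
                if dst == src:
                    continue
                trial = {n: set(s) for n, s in placements.items()}
                trial[src].remove(shard)
                trial[dst].add(shard)
                improvement = current - spread(trial)
                if improvement > 0 and (best is None or improvement > best[0]):
                    best = (improvement, shard, src, dst)
    return best

def rebalance(placements, costs):
    moves = []
    while (move := best_move(placements, costs)) is not None:
        _, shard, src, dst = move
        placements[src].remove(shard)
        placements[dst].add(shard)
        moves.append((shard, src, dst))  # every recorded move improved balance
    return moves
```

Because every step strictly improves the spread, cancelling the loop midway still leaves the cluster better balanced than before, which is exactly benefit 2 above.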
@@ -2395,7 +2395,7 @@ main usage for "Is a shard group allowed on a certain node?" is to be able to pin a
 specific shard group to a specific node.

 There is one last definition that you should know to understand the algorithm
-and that is "utilization". Utilization is total cost of all shard groups
+and that is "utilization". Utilization is the total cost of all shard groups
 divided by capacity. In practice this means that utilization is almost always
 the same as cost because as explained above capacity is almost always 1. So if
 you see "utilization" in the algorithm, for all intents and purposes you can
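The utilization definition in this hunk is a one-liner; a minimal sketch (function name is illustrative, not from Citus) makes the "capacity is almost always 1" remark concrete:

```python
# Utilization as defined in the text: total cost of a node's shard groups
# divided by the node's capacity. With the default capacity of 1,
# utilization is simply the total cost. Illustrative name, not Citus code.

def node_utilization(shard_costs, capacity=1.0):
    return sum(shard_costs) / capacity
```

With the default cost of 1 per shard group and capacity 1, a node holding three shard groups has utilization 3.0, i.e. utilization and cost coincide; only a non-default capacity makes them differ.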
@@ -2457,7 +2457,7 @@ moving shards around isn't free.
 most 10% above or 10% below the average utilization then no moves are
 necessary anymore (i.e. the nodes are balanced enough). The main reason for
 this threshold is that these small differences in utilization are not
-necessarily problematic and might very well resolve automatically over time
+necessarily problematic and might very well resolve automatically over time. For example, consider a scenario in which
 one shard gets mostly written in during the weekend, while another one during
 the week. Moving shards on monday and that you then have to move back on
 friday is not very helpful given the overhead of moving data around.
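The 10% threshold check described in this hunk amounts to asking whether every node's utilization sits inside a band around the average. A hedged Python sketch of that rule (the function name and structure are illustrative, not Citus's implementation):

```python
# Sketch of the threshold check: nodes whose utilization is within 10%
# above or below the average are considered balanced enough, so no moves
# are planned. Illustrative only; not the actual Citus rebalancer code.

def is_balanced(utilizations, threshold=0.1):
    avg = sum(utilizations) / len(utilizations)
    lower, upper = avg * (1 - threshold), avg * (1 + threshold)
    return all(lower <= u <= upper for u in utilizations)
```

For example, utilizations of 9.5, 10, and 10.5 (average 10, band 9 to 11) count as balanced, while a node at 15 against an average of 10 would trigger moves.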
@@ -2577,7 +2577,7 @@ and jobs in parallel. This can make rebalancing go much faster especially in
 clusters with many nodes. To ensure that we're not doing too many tasks in
 parallel though we have a few ways to limit concurrency:

-1. Tasks can depend on eachother. This makes sure that one task doesn't start
+1. Tasks can depend on each other. This makes sure that one task doesn't start
    before all the ones that it depends on have finished.
 2. The maximum number of parallel tasks being executed at the same time can be
    limited using `citus.max_background_task_executors`. The default for
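The two concurrency limits in the list above (dependency ordering, plus a cap like `citus.max_background_task_executors`) can be sketched as a tiny batch scheduler. This is a hypothetical Python illustration of the scheduling rules only; `schedule`, `max_executors`, and the task names are made up and it is not Citus's background task executor:

```python
# Sketch of the concurrency rules: a task becomes runnable only once all
# its dependencies have finished, and at most `max_executors` tasks run
# "in parallel" per round (simulated here as successive batches).
# Hypothetical illustration; not the Citus background task executor.

def schedule(tasks, deps, max_executors):
    """tasks: list of task ids; deps: {task: set of prerequisite tasks}.
    Returns the tasks grouped into successive parallel batches."""
    done, batches = set(), []
    while len(done) < len(tasks):
        runnable = [t for t in tasks
                    if t not in done and deps.get(t, set()) <= done]
        if not runnable:
            raise ValueError("dependency cycle detected")
        batch = runnable[:max_executors]  # cap parallelism per round
        batches.append(batch)
        done.update(batch)
    return batches
```

With four tasks where `t3` depends on `t1` and `t4` depends on both `t2` and `t3`, and two executors, the first round runs `t1` and `t2` together, then `t3`, then `t4`: no task starts before its prerequisites finish.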