Commit Graph

14 Commits (ace800851a88d691f694c86244dcccd72ea90d1d)

Author SHA1 Message Date
jeff-davis a2f5b068e6
Columnar: tighten security and improve visibility. (#5922)
Move internal storage details to a separate schema with no public
access to limit the possibility for information leakage.

Create views with public access that show storage details for those
columnar tables where the user has ownership privileges. Include
mapping between relation ID and storage ID for easier interpretation.
2022-05-20 15:30:31 -07:00
Onur Tirtir fe72e8bb48
Discard index deletion requests made to columnarAM (#5331)
A write operation might trigger index deletion if index already had
dead entries for the key we are about to insert.
There are two ways of index deletion:
  a) simple deletion
  b) bottom-up deletion (>= pg14)

Since columnar_index_fetch_tuple never sets all_dead to true,
columnarAM doesn't ever expect to receive simple deletion requests
(columnar_index_delete_tuples) as we don't mark any index entries
as dead.

However, since columnarAM doesn't delete any dead entries via simple
deletion, postgres might ask for a more comprehensive deletion
(i.e.: bottom-up) at some point when pg >= 14.

So with this commit, we start gracefully ignoring bottom-up deletion
requests made to columnar_index_delete_tuples.

Given that users can anyway "VACUUM FULL" their columnar tables,
we don't see any problem in ignoring deletion requests.
2021-10-01 14:32:47 +03:00
Onur Tirtir ea61efb63a
Not flush writes until need to read them when doing index-scan on columnar (#5247)
Not flush pending writes if given tid belongs to a "flushed" or
"aborted" stripe write, or to an "in-progress" stripe write of
another backend.

That way, we would reduce the cases where we flush single-tuple
stripes during index scan.

To do that, we follow below steps for index look-up's:

- Do not flush any pending writes and do stripe metadata look-up for
  given tid.
  If tuple with tid is found, then no need to do another look-up
  since we already found the tuple without needing to flush pending
  writes.

- If tuple is not found without flushing pending writes, then we have two
  scenarios:

  -  If given tid belongs to a pending write of my backend, then do stripe
     metadata look-up for given tid. But this time first **flush any pending
     writes**.
     
  -  Otherwise, just return false from `index_fetch_tuple` since flushing
      pending writes wouldn't help.
2021-09-13 18:41:20 +02:00
Onur Tirtir 4ee0fb2758
Make sure to skip aborted writes when reading the first tuple (#5274)
With 5825c44d5f, we made the changes to
skip aborted writes when scanning a columnar table.

However, looks like we forgot to handle such cases for the very first
call made to columnar_getnextslot. That means, that commit only
considered the intermediate stripe read operations.

However, functions called by columnar_getnextslot to find first stripe
to read (ColumnarBeginRead & ColumnarRescan) were not caring about
those aborted writes.

To fix that, we teach AdvanceStripeRead to find the very first stripe
to read, and then start using it where were blindly calling
FindNextStripeByRowNumber.
2021-09-13 11:50:53 +03:00
Onur Tirtir 3340f17c4e
Prevent planner from choosing parallel scan for columnar tables (#5245)
Previously, for regular table scans, we were setting `RelOptInfo->partial_pathlist`
to `NIL` via `set_rel_pathlist_hook` to discard scan `Path`s that need to use any
parallel workers, this was working nicely.

However, when building indexes, this hook doesn't get called so we were not
able to prevent spawning parallel workers when building an index. For this
reason, 9b4dc2f804 added basic
implementation for `columnar_parallelscan_*` callbacks but also made some
changes to skip using those workers when building the index.

However, now that we are doing stripe reservation in two stages, we call 
`heap_inplace_update` at some point to complete stripe reservation.
However, postgres throws an error if we call `heap_inplace_update` during
a parallel operation, even if we don't actually make use of those workers.

For this reason, with this pr, we make sure to not generate scan `Path`s that
need to use any parallel workers by using `get_relation_info_hook`.

This is indeed useful to prevent spawning parallel workers during index builds.
2021-09-08 13:53:43 +03:00
Onur Tirtir 5825c44d5f
Handle aborted writes properly when scanning a columnar table (#5244)
If it is certain that we will not use any `parallel_worker`s for a columnar table,
then stripe entries inserted by aborted transactions become visible to
`SnapshotAny` and that causes `REINDEX` to fail by throwing a duplicate key
error.

To fix that:
* consider three states for a stripe write operation:
   "flushed", "aborted", or "in-progress",
* make sure to have a clear separation between them, and
* act according to those three states when reading from a columnar table
2021-09-08 13:26:11 +03:00
Sait Talha Nisanci e0faf34417 turn off costs in columnar_indexes explain query 2021-09-03 15:41:28 +03:00
Onur Tirtir 73058d35cc Not free (stripe) chunk buffers after de-serializing
Previously, we were only using chunk group reader for sequential scan.
However, to support index scans on columnar tables, now we use very
same low level functions for index scan too.

Since those low-level functions were only used for sequential scan, it
was guaranteed that we would never read the same chunk group more than
once, so we were freeing chunk buffers after deserializing them into a
separate buffer.

Now that we use those low level functions for index scan, we cannot
free chunk buffers since it's possible to read the same chunk group
again, such that:

- read chunk group 1 of stripe 5
- read chunk group 2 of stripe 5
- read chunk group 1 of stripe 5 again

Here, when we decide to read chunk group 1 for a second time,
chunk group 1 is not cached. Plus, before this commit, we were
freeing the chunk buffers for chunk group 1 after the first
read and then we were getting segfault or errors from low-level
de-compression APIs.
2021-08-02 11:00:12 +03:00
Onur Tirtir 83f5d42365 Use long-lasting mem cxt & optimize correlated index scan 2021-08-02 11:00:12 +03:00
Onur Tirtir f00c63c33d
Support columnar table index builds with CONCURRENTLY option (#5032)
With this commit, we add (`CREATE INDEX` / `REINDEX`) `CONCURRENTLY` support for columnar tables.

For that, we implement `columnar_index_validate_scan` callback.
The reasoning behind the implementation is as follows:

* Postgres function `validate_index` provides all the TIDs that are currently in the
  index to `columnar_index_validate_scan` callback via a `tupleSort` object..

* We start scanning the table by using `columnar_getnextslot` as usual.
  Before moving forward, note that `columnar_getnextslot` guarantees
  to return tuples in the order of their TIDs.

* For us to use during table scan, postgres provides a snapshot guaranteeing
  that any tuples that are valid according to that snapshot but are not in the
  index must be added to the index.

* Then for each tuple that we read from our table, we continue iterating
  given `tupleSort` to find the first TID that is greater than or equal to our
  tuple's TID.

  If both TID's are equal to each other, then we skip the tuple since it's already
  indexed.

  If the TID that we read from tupleSort is greater then our tuple's TID, then
  we decide to insert this tuple into index.
2021-07-09 13:44:58 +03:00
Onur Tirtir b6b969971a Error out for CLUSTER commands on columnar tables 2021-06-16 20:06:33 +03:00
Onur Tirtir 9b4dc2f804 Prevent using parallel scan for columnar index builds 2021-06-16 19:59:32 +03:00
Onur Tirtir 10a762aa88 Implement columnar index support functions 2021-06-16 19:59:32 +03:00
jeff-davis 9da9bd3dfd
Columnar: rename files and tests. (#4751)
* Columnar: rename files and tests.

* Columnar: Rename Table*State to Columnar*State.
2021-03-01 08:34:24 -08:00