Commit Graph

33 Commits (37ae22ce3e0fd3eedf33a0019a0aecba1bbcfa91)

Author SHA1 Message Date
Halil Ozan Akgul 37ae22ce3e Introduces macros for vacuum options
VacOptTernaryValue enum is renamed to VacOptValue.
In the enum there were three values, VACOPT_TERNARY_DEFAULT, VACOPT_TERNARY_DISABLED, and VACOPT_TERNARY_ENABLED
Now there are four values VACOPTVALUE_UNSPECIFIED, VACOPTVALUE_AUTO, VACOPTVALUE_DISABLED, and VACOPTVALUE_ENABLED

New macros are VacOptValue_compat, VACOPTVALUE_UNSPECIFIED_COMPAT, VACOPTVALUE_DISABLED_COMPAT, and VACOPTVALUE_ENABLED_COMPAT
The VACOPTVALUE_UNSPECIFIED_COMPAT matches VACOPT_TERNARY_DEFAULT and VACOPTVALUE_UNSPECIFIED. And there are no macro for VACOPTVALUE_AUTO.

Relevant PG commit:
3499df0dee8c4ea51d264a674df5b5e31991319a
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 4bc0c80bba Adds index_delete_tuples instead of compute_xid_horizon_for_tuples
Relevant PG commit:
d168b666823b6e0bcf60ed19ce24fb5fb91b8ccf
2021-09-03 15:27:24 +03:00
jeff-davis 4718b6bcdf
Generate parameterized paths for columnar scans. (#5172)
Allow ColumnarScans to push down join quals by generating
parameterized paths. This significantly expands the utility of chunk
group filtering, making a ColumnarScan behave similar to an index when
on the inner of a nested loop join.

Also, evaluate all parameters on beginscan/rescan, which also works
for external parameters.

Fixes #4488.
2021-09-02 22:22:48 -07:00
Onur Tirtir 9cb5ef5007 Pass ColumnarScanDesc to ColumnarScanChunkGroupsFiltered 2021-09-02 13:20:11 +03:00
Onur Tirtir 889a2731cb
Split columnar stripe reservation into two phases (#5188)
Previously, we were doing `first_row_number` reservation for the first
row written to current `WriteState` but were doing `stripe_id`
reservation when flushing the `WriteState` and were inserting the
related record to `columnar.stripe` at that time as well.

However, inserting `columnar.stripe` record at flush-time is
problematic. This is because, as told in #5160, if relation has
any index-based constraints and if there are two concurrent
writes that are inserting conflicting key values for that constraint,
then postgres relies on `tableAM->fetch_index_tuple`
(=`columnar_fetch_index_tuple`) callback to return `true` when
indexAM is checking against possible constraint violations.

However, pending writes of other backends are not visible to concurrent
sessions in columnar since we were not inserting the stripe metadata
record until flushing the stripe.

With this commit, we split stripe reservation into two phases:
i) Reserve `stripe_id` and insert a "dummy" record to `columnar.stripe`
at the very same time we reserve `first_row_number`, i.e. when writing
the first row to the current `WriteState`.
ii) At flush time, do the storage level allocation and complete the
missing fields of the dummy record inserted into `columnar.stripe`
during i).

That way, any concurrent writes would be able to check against possible
constraint violations by using `SnapshotDirty` when scanning
`columnar.stripe`.

Note that `columnar_fetch_index_tuple` still wouldn't be able to fill
the output tupleslot for the requested tid but it would at least return
`true` for such index look-up's and we believe this should be sufficient
for the caller indexAM callback to make the concurrent writer block on
prior one.

That is how we fix #5160.

Only downside of reserving `stripe_id` at the same time we reserve
`first_row_number` is that now any aborted writes would also waste
some amount of `stripe_id` as in the case of `first_row_number` but
we are just wasting them one-by-one.

Considering the fact that we waste `first_row_number` by the amount
stripe row limit (=150k by default) in such cases, this shouldn't be
important at all.
2021-09-02 11:49:14 +03:00
Onur Tirtir bf4dfad6f7 Update curcid of given snapshot if it is MVCC
Before starting to scan a columnar table, we always flush the pending
writes to disk.

However, we increment command counter after modifying metadata tables.

On the other hand, now that we _don't always use_ xact snapshot to scan
a columnar table, writes that we just flushed might not be visible to
the query that just flushed pending writes to disk since curcid of
provided snapshot would become smaller than the command id being used
when modifying metadata tables.

To give an example, before this change, below was a possible scenario
due to the changes that we made to use the correct snapshot.

```sql
CREATE TABLE t(a int, b int) USING columnar;
BEGIN;
  INSERT INTO t VALUES (5, 10);

  SELECT * FROM t;
  ┌───┬───┐
  │ a │ b │
  ├───┼───┤
  └───┴───┘
  (0 rows)

  SELECT * FROM t;
  ┌───┬────┐
  │ a │ b  │
  ├───┼────┤
  │ 5 │ 10 │
  └───┴────┘
  (1 row)
```
2021-09-02 11:11:59 +03:00
Onur Tirtir 6c26c67ea0 Flush write state when initializing read state
In next commit, we will adjust curcid of the snapshot being used when
scanning the columnar table.

However, for index scan, snapshot is provided not when beginning scan
but within fetch-tuple call.

For this reason, start flushing pending writes in init_columnar_read_state
since this seem to be a prerequisite step that needs to be done before
scanning a columnar table regardless of the scan method being used.
2021-09-02 11:10:11 +03:00
Onur Tirtir 0b4ed075b5 Use correct snapshot when reading a columnar table
Instead of using xact snapshot, use the snapshot provided
to columnarAM when scanning table.
2021-09-02 11:10:11 +03:00
Onur Tirtir 68f46c5dc9 Use scan context for intermediate mem allocs too 2021-08-16 11:06:03 +03:00
Onur Tirtir b3d9fc91f8 Always use right mem cxt when creating ColumnarReadState
All the callers except columnar_relation_copy_for_cluster were already
switching to right memory context when creating ColumnarReadState.

With this commit, we embed that logic into init_columnar_read_state
to avoid further such bugs.

That way, we start using the right memory context for
columnar_relation_copy_for_cluster too.
2021-08-16 11:06:03 +03:00
Onur Tirtir 7fcecde203 Use init_columnar_read_state instead of lower level func
Funtionally, this doesn't change anything. This is just a preparation
before next commit.
2021-08-16 11:06:03 +03:00
Onur Tirtir 83f5d42365 Use long-lasting mem cxt & optimize correlated index scan 2021-08-02 11:00:12 +03:00
Onur Tirtir c021b82a43 Introduce CreateColumnarScanMemoryContext 2021-08-02 11:00:12 +03:00
Onur Tirtir 84a49cc221 Improve error message for indexAMs not supported by columnar 2021-07-30 16:41:53 +03:00
Onur Tirtir eeecbd2324 Introduce ColumnarSupportsIndexAM 2021-07-30 16:40:27 +03:00
Onur Tirtir f00c63c33d
Support columnar table index builds with CONCURRENTLY option (#5032)
With this commit, we add (`CREATE INDEX` / `REINDEX`) `CONCURRENTLY` support for columnar tables.

For that, we implement `columnar_index_validate_scan` callback.
The reasoning behind the implementation is as follows:

* Postgres function `validate_index` provides all the TIDs that are currently in the
  index to `columnar_index_validate_scan` callback via a `tupleSort` object..

* We start scanning the table by using `columnar_getnextslot` as usual.
  Before moving forward, note that `columnar_getnextslot` guarantees
  to return tuples in the order of their TIDs.

* For us to use during table scan, postgres provides a snapshot guaranteeing
  that any tuples that are valid according to that snapshot but are not in the
  index must be added to the index.

* Then for each tuple that we read from our table, we continue iterating
  given `tupleSort` to find the first TID that is greater than or equal to our
  tuple's TID.

  If both TID's are equal to each other, then we skip the tuple since it's already
  indexed.

  If the TID that we read from tupleSort is greater then our tuple's TID, then
  we decide to insert this tuple into index.
2021-07-09 13:44:58 +03:00
Onur Tirtir 7bfd84bc70 Introduce StripeGetHighestRowNumber 2021-07-07 11:01:39 +03:00
Onur Tirtir 3d11c0f9ef Merge remote-tracking branch 'origin/master' into columnar-index
Conflicts:
	src/test/regress/expected/columnar_empty.out
	src/test/regress/expected/multi_extension.out
2021-06-16 20:23:50 +03:00
Onur Tirtir b6b969971a Error out for CLUSTER commands on columnar tables 2021-06-16 20:06:33 +03:00
Onur Tirtir 5adab2a3ac Report progress when building index on columnar tables 2021-06-16 20:06:33 +03:00
Onur Tirtir 9b4dc2f804 Prevent using parallel scan for columnar index builds 2021-06-16 19:59:32 +03:00
Onur Tirtir 10a762aa88 Implement columnar index support functions 2021-06-16 19:59:32 +03:00
Onur Tirtir a209999618
Enforce table opt constraints when using alter_columnar_table_set (#5029) 2021-06-08 17:39:16 +03:00
Onur Tirtir 94f30a0428 Refactor index check in ColumnarProcessUtility 2021-06-01 11:12:28 +03:00
Onur Tirtir 181848cc80 Implement ErrorIfInvalidRowNumber
To use the same logic when mapping tid's to row number's
2021-05-10 20:16:50 +03:00
Onur Tirtir 2552aee404 Handle old versioned columnar metapage after binary upgrade (#4956)
* Make VACUUM hint for upgrade scenario actually work

* Suggest using VACUUM if metapage doesn't exist

Plus, suggest upgrading sql version as another option.

* Always force read metapage block

* Fix two typos
2021-05-10 20:16:50 +03:00
Onur Tirtir 2e419ea177 Add first_row_number column to columnar.stripe for tid mapping 2021-05-10 20:16:50 +03:00
jeff-davis 7b9aecff21 Columnnar: metapage changes. (#4907)
* Columnar: introduce columnar storage API.

This new API is responsible for the low-level storage details of
columnar; translating large reads and writes into individual block
reads and writes that respect the page headers and emit WAL. It's also
responsible for the columnar metapage, resource reservations (stripe
IDs, row numbers, and data), and truncation.

This new API is not used yet, but will be used in subsequent
forthcoming commits.

* Columnar: add columnar_storage_info() for debugging purposes.

* Columnar: expose ColumnarMetadataNewStorageId().

* Columnar: always initialize metapage at creation time.

This avoids the complexity of dealing with tables where the metapage
has not yet been initialized.

* Columnar: columnar storage upgrade/downgrade UDFs.

Necessary upgrade/downgrade step so that new code doesn't see an old
metapage.

* Columnar: improve metadata.c comment.

* Columnar: make ColumnarMetapage internal to the storage API.

Callers should not have or need direct access to the metapage.

* Columnar: perform resource reservation using storage API.

* Columnar: implement truncate using storage API.

* Columnar: implement read/write paths with storage API.

* Columnar: add storage tests.

* Revert "Columnar: don't include stripe reservation locks in lock graph."

This reverts commit c3dcd6b9f8.

No longer needed because the columnar storage API takes care of
concurrency for resource reservation.

* Columnar: remove unnecessary lock when reserving.

No longer necessary because the columnar storage API takes care of
concurrent resource reservation.

* Add simple upgrade tests for storage/ branch

* fix multi_extension.out

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2021-05-10 20:16:46 +03:00
Onur Tirtir 7def297a3b
Move the logic that builds relation col list into a function (#4964) 2021-05-10 20:01:28 +03:00
Onur Tirtir fe5c985e1d
Remove HAS_TABLEAM config since we dropped pg11 support (#4862)
* Remove HAS_TABLEAM config

* Drop columnar_ensure_objects_exist

* Not call columnar_ensure_objects_exist in citus_finish_pg_upgrade
2021-04-13 10:51:26 +03:00
jeff-davis 3efdfdd791
Columnar: make projectedColumnList an integer list. (#4869)
Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-04-12 19:07:21 -07:00
Onur Tirtir 1d3e075e62
Support temporary columnar tables (#4766) 2021-03-12 12:01:36 +03:00
jeff-davis 9da9bd3dfd
Columnar: rename files and tests. (#4751)
* Columnar: rename files and tests.

* Columnar: Rename Table*State to Columnar*State.
2021-03-01 08:34:24 -08:00