5221 Commits (3c8482893151d5df07a97f046ffd1846b268427c)

Author SHA1 Message Date
Onur Tirtir 3c84828931 Fix coordinator/worker query targetlists for agg. that we cannot push-down (#5679)
Previously, we wrapped targetlist nodes with Vars that reference
the result of the worker query whenever the node itself was not a `Const` or
a `Param`. However, we should not do that unless the node itself is
a `Var` node or contains a `Var` within it (e.g.: `OpExpr(Var(column_a) > 2)`).
Otherwise, when the worker query returns an empty result set, the combine
query execution would crash since the `Var` would point to an empty
tuple slot, which the node-executor methods cannot handle.
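
The rule above can be sketched as a small tree walk. This is a minimal illustration in Python, not the C code used by Citus; the `Node` class and both helper functions are hypothetical stand-ins for PostgreSQL expression nodes and the planner logic.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a PostgreSQL expression node."""
    kind: str                                  # e.g. "Var", "Const", "Param", "OpExpr"
    args: List["Node"] = field(default_factory=list)


def contains_var(node: Node) -> bool:
    """Return True if the node is a Var or has a Var anywhere beneath it."""
    if node.kind == "Var":
        return True
    return any(contains_var(child) for child in node.args)


def should_wrap_with_worker_var(node: Node) -> bool:
    """Only nodes that depend on a column need to reference the worker result."""
    return contains_var(node)


# OpExpr(Var(column_a) > 2): depends on a column, so it is wrapped.
op_expr = Node("OpExpr", [Node("Var"), Node("Const")])

# A function call over a constant: no column reference, so it is left alone.
func_of_const = Node("FuncExpr", [Node("Const")])
```

Under this sketch, `should_wrap_with_worker_var(op_expr)` is true while `should_wrap_with_worker_var(func_of_const)` is false, so the second expression never points at a possibly empty tuple slot.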

(cherry picked from commit 79442df1b7)
2022-02-04 16:38:41 +03:00
Hanefi Onaldi cb21065749
Bump Citus version to 10.2.4 2022-02-01 14:15:18 +03:00
Hanefi Onaldi cca5a76090
Add changelog entries for 10.2.4
(cherry picked from commit beafde5ff5)
2022-02-01 14:14:20 +03:00
Onder Kalaci ed3f298eb3 Use a fixed application_name while connecting to remote nodes
Citus heavily relies on application_name, see
`IsCitusInitiatedRemoteBackend()`.

But if the user sets the application name, for example via `export PGAPPNAME=test_name`,
Citus uses that name while connecting to the remote node.

With this commit, we ensure that Citus always connects to the remote nodes
with the "citus" application name.

(cherry picked from commit b26eeaecd3)
2022-01-27 12:58:04 +01:00
Onur Tirtir 5a1a1334d3 Tell other backends it's safe to ignore the backend that concurrently built the shell table index (#5520)
In addition to starting a new transaction, we also need to tell other
backends --including the ones spawned for connections opened to
localhost to build indexes on shards of this relation-- that concurrent
index builds can safely ignore us.

Normally, DefineIndex() only does that if index doesn't have any
predicates (i.e.: where clause) and no index expressions at all.
However, now that we already called standard process utility, index
build on the shell table is finished anyway.

The reason behind doing so is that we cannot guarantee not grabbing any
snapshots via adaptive executor, and the backends creating indexes on
local shards (if any) might block on waiting for current xact of the
current backend to finish, which would cause self deadlocks that are not
detectable.

(cherry picked from commit 3cc44ed8b3)

  Conflicts:
	src/backend/distributed/commands/utility_hook.c
2022-01-13 14:36:22 +03:00
Ahmet Gedemenli 5f04346408
Do not include return table params in the function arg list (#5568)
(cherry picked from commit 90928cfd74)

Fix function signature generation

Fix comment typo

Add test for worker_create_or_replace_object

Add test for recreating distributed functions with OUT/TABLE params

Add test for recreating distributed function that returns setof int

Fix test output

Fix comment

(cherry picked from commit 8e4ff34a2e)
2021-12-23 18:07:46 +03:00
Onur Tirtir 39348e2c3b HAVE_LZ4 -> HAVE_CITUS_LZ4 (#5541)
(cherry picked from commit cc4c83b1e5)
2021-12-16 16:38:45 +03:00
Marco Slot d695deae16 Support operator class parameters in indexes
(cherry picked from commit defb97b7f5)
2021-12-15 20:10:56 +03:00
Onur Tirtir e19089503e Bump citus version to 10.2.3 2021-11-29 11:50:44 +03:00
Onur Tirtir 81702af8d7 Add changelog entries for 10.2.3 (#5498)
(cherry picked from commit 1836361a51)

Conflicts:
	CHANGELOG.md
2021-11-29 11:50:35 +03:00
Onur Tirtir 6899e66f80 Drop upgrade_list_citus_objects_0.out 2021-11-26 10:09:30 +03:00
Onur Tirtir f5b297e149 Allow overwriting columnar storage pages written by aborted xacts (#5484)
When refactoring storage layer in #4907, we deleted the code that allows
overwriting a disk page previously written but not known by metadata.

The change that originally introduced the code allowing this can be seen in
commit a8da9acc63.

The reasoning was that, as of 10.2, we started aligning page
reservations (`AlignReservation`) for subsequent writes right after
allocating pages from disk. That means, even if writer transaction
fails, subsequent writes are guaranteed to allocate a new page and write
to there. For this reason, attempting to write to a page allocated
before is not possible for a columnar table that user created when using
v10.2.x.

However, since older versions of columnar don't do that, the following
example scenario can still result in writing to such a disk page, even if
the user has now upgraded to v10.2.x. This is because, when upgrading storage to
2.0 (`ColumnarStorageUpdateIfNeeded`), we calculate `reservedOffset` of
the metapage based on the highest used address known by stripe
metadata (`GetHighestUsedAddressAndId`). However, stripe metadata
doesn't have entries for aborted writes. As a result, highest used
address would be computed by ignoring pages that are allocated but not
used.

- User attempts writing to a columnar table on Citus v10.0.x/v10.1.x.
- Write operation fails for some reason.
- User upgrades Citus to v10.2.x.
- When attempting to write to the same columnar table, they hit the "attempt
  to write columnar data .." error since the write operation done in the
  older version of columnar already allocated that page, and now we are
  overwriting it.

For this reason, with this commit, we re-do the change done in
a8da9acc63.
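
The scenario can be illustrated with a deliberately simplified model. The sketch below is in Python and uses plain byte offsets; it ignores page alignment and the metapage layout, and the names `highest_used_address`, `stripe_metadata` and `allocated_bytes_on_disk` are hypothetical, not the actual Citus functions or structures.

```python
def highest_used_address(stripes):
    """Highest byte address referenced by stripe metadata, or -1 if empty."""
    return max((s["offset"] + s["length"] - 1 for s in stripes), default=-1)


# Stripe metadata only records writes that completed.
stripe_metadata = [{"offset": 0, "length": 100}]

# An aborted write on the old version still allocated bytes 100..149 on disk,
# but left no entry in stripe metadata.
allocated_bytes_on_disk = 150

# Upgrade step: derive the reserved offset from metadata alone.
reserved_offset = highest_used_address(stripe_metadata) + 1

# The next write starts inside the region the aborted write already allocated.
overwrites_allocated_page = reserved_offset < allocated_bytes_on_disk
```

Here `reserved_offset` ends up below the amount of storage already allocated, which is the situation in which the writer has to be allowed to overwrite the existing page.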

And for the reasons given above, it wasn't possible to add a test for
this commit via usual code-paths. For this reason, added a UDF only for
testing purposes so that we can reproduce the exact scenario in our
regression test suite.

(cherry picked from commit 76b8006a9e)
2021-11-26 10:07:00 +03:00
Onur Tirtir b66abbcba8 Introduce dependencies from columnarAM to columnar metadata objects
During pg upgrades, we have seen that it is not guaranteed that a
columnar table will be created after metadata objects got created.
Prior to changes done in this commit, we had such a dependency
relationship in `pg_depend`:

```
columnar_table ----> columnarAM ----> citus extension
                                           ^  ^
                                           |  |
columnar.storage_id_seq --------------------  |
                                              |
columnar.stripe -------------------------------
```

Since `pg_upgrade` only follows the topological sort of the objects
when creating the database dump, the above dependency graph doesn't imply that
`columnar_table` should be created before metadata objects such as
`columnar.storage_id_seq` and `columnar.stripe` are created.

For this reason, with this commit we add new records to `pg_depend` to
make columnarAM depend on all rel objects living in the `columnar`
schema. That way, `pg_upgrade` will know it needs to create those before
creating `columnarAM`, and similarly, before creating any tables using
`columnarAM`.

Note that in addition to inserting those records via installation script,
we also do the same in `citus_finish_pg_upgrade()`. This is because,
`pg_upgrade` rebuilds catalog tables in the new cluster and that means,
we must insert them in the new cluster too.
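
The effect of the extra `pg_depend` records can be sketched with a generic topological sort. This Python example (3.9+, for `graphlib`) only models the edges named in this message; it is an illustration of the ordering argument, not how `pg_upgrade` or `pg_dump` is implemented, and `is_valid_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter

# Each key depends on the objects in its value set (edges as in pg_depend).
before = {
    "columnar_table": {"columnarAM"},
    "columnarAM": {"citus extension"},
    "columnar.storage_id_seq": {"citus extension"},
    "columnar.stripe": {"citus extension"},
}

# After the commit: columnarAM also depends on the columnar metadata objects.
after = dict(before)
after["columnarAM"] = {"citus extension", "columnar.storage_id_seq", "columnar.stripe"}


def is_valid_order(graph, order):
    """Check that every object appears after everything it depends on."""
    position = {name: i for i, name in enumerate(order)}
    return all(position[dep] < position[obj]
               for obj, deps in graph.items() for dep in deps)


# An order that creates the table before its metadata objects.
bad_order = ["citus extension", "columnarAM", "columnar_table",
             "columnar.storage_id_seq", "columnar.stripe"]

# Any order produced from the new graph creates the metadata objects first.
order_after = list(TopologicalSorter(after).static_order())
```

With the old edges, `bad_order` is a perfectly valid topological order, which is why the table could be restored before `columnar.stripe`. With the new edges it is rejected.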

(cherry picked from commit 73f06323d8)
2021-11-26 10:02:11 +03:00
Onur Tirtir 88f2b8a60d Reproduce bug via test suite
(cherry picked from commit ef2ca03f24)
2021-11-26 09:55:50 +03:00
Onur Tirtir de39835da2 Store tmp_upgrade/newData/*.log as an artifact
(cherry picked from commit 4a97664fd7)
2021-11-26 09:55:10 +03:00
Onur Tirtir 3f2ac78cf6 Skip deleting options if columnar.options is already dropped (#5458)
Drop extension might cascade to columnar.options before dropping a
columnar table. In that case, we were getting below error when opening
columnar.options to delete records for the columnar table that we are
about to drop.: "ERROR:  could not open relation with OID 0".

I somehow reproduced this bug easily when upgrading pg, which is why
I added the test to after_pg_upgrade_schedule.

(cherry picked from commit 25024b776e)

 Conflicts:
	src/test/regress/after_pg_upgrade_schedule
2021-11-12 12:36:00 +03:00
naisila 757446bc61 Fix index name udf scripts for 10.2 and bump use of new sql functions 2021-11-09 08:56:57 +01:00
Naisila Puka 0a48c0aec7 Add fix_partition_shard_index_names udf to fix currently broken names (#5291)
* Add udf to include shardId in broken partition shard index names

* Address reviews: rename index such that operations can be done on it

* More comprehensive index tests

* Final touches and formatting
2021-11-09 08:56:57 +01:00
Önder Kalacı d18757b0cd Allow lock_shard_resources to be called by the users with privileges (#5441)
Before this commit, we required the user to be owner of the shard/table
in order to call lock_shard_resources.

However, that is too restrictive. We can have users with GRANTS
to the table who are not owners of the tables/shards.

With this commit, we allow such patterns.

(cherry picked from commit 98ca6ba6ca)
2021-11-08 15:45:02 +01:00
Sait Talha Nisanci a71e0b5c84 Fix missing from entry
(cherry picked from commit a0e0759f73)
2021-11-08 12:44:18 +03:00
Nils Dijk 143a3f2b28
reinstate optimization that got unintentionally broken in 366461ccdb (#5418)
DESCRIPTION: Reinstate optimisation for uniform shard interval ranges

During a refactor introduced in #4132 the following change was made, which made the optimisation in `CalculateUniformHashRangeIndex` unreachable: 
366461ccdb (diff-565a339ed3c78bc5a0d4ffeb4e91032150b1dffbeeff59cd3e65981d20b998c7L319-R319)

This PR reinstates the path to the optimisation!
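
To illustrate why uniform ranges allow a shortcut, here is a minimal Python sketch. It assumes the int32 hash space is split into equally sized ranges with any remainder going to the last shard; the function names are hypothetical and this is not the body of `CalculateUniformHashRangeIndex`.

```python
import bisect

INT32_MIN = -2**31
HASH_TOKEN_COUNT = 2**32


def uniform_shard_min_values(shard_count):
    """Lower bound of each shard's hash range under the uniform-split assumption."""
    increment = HASH_TOKEN_COUNT // shard_count
    return [INT32_MIN + i * increment for i in range(shard_count)]


def index_by_binary_search(hash_value, shard_min_values):
    """General path: O(log n) search over the sorted shard lower bounds."""
    return bisect.bisect_right(shard_min_values, hash_value) - 1


def index_by_arithmetic(hash_value, shard_count):
    """Uniform path: O(1) division, clamped so the remainder maps to the last shard."""
    increment = HASH_TOKEN_COUNT // shard_count
    index = (hash_value - INT32_MIN) // increment
    return min(index, shard_count - 1)
```

Both functions return the same shard index for uniformly distributed ranges, so the arithmetic path can replace the search whenever the ranges are known to be uniform.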
2021-11-05 13:10:46 +01:00
Jelte Fennema 425ca713ff Fix duplicate typedef which can cause compile failures (#5406)
ColumnarScanDesc is already defined in columnar_tableam.h. Redefining it
again causes a compiler error on some C compilers.

Useful reference: https://bugzilla.redhat.com/show_bug.cgi?id=767538

Fixes #5404

(cherry picked from commit 3bdbfc3edf)
2021-10-25 16:54:17 +02:00
Onur Tirtir ef2c6d51b2 Add changelog for 10.2.2
(cherry picked from commit c2ea886085)
2021-10-14 14:31:46 +03:00
Onur Tirtir bf3c0d7efd Bump citus version to 10.2.2 2021-10-14 12:37:27 +03:00
Marco Slot dee3c95992 Fixes CREATE INDEX deparsing issue 2021-10-14 10:03:50 +02:00
Onur Tirtir 6fe7c32d9f (Share) Lock buffer page when reading from columnar storage (#5338)
Under high write concurrency, we were sometimes reading columnar
metapage as all zeros.

In `WriteToBlock()`, if `clear == true`, then it will clear the page before
writing the new one, rather than just adding data to the page. That
means any concurrent connection that is holding only a pin will be
able to see the all-zero state between the `InitPage()` and the
`memcpy_s()`.

Moreover, postgres/storage/buffer/README states that:

> Buffer access rules:
>
> 1. To scan a page for tuples, one must hold a pin and either shared or
> exclusive content lock.  To examine the commit status (XIDs and status bits)
> of a tuple in a shared buffer, one must likewise hold a pin and either shared
> or exclusive lock.

For those reasons, we have to make sure to never keep a pin on the
page without (at least) the shared lock, to avoid having such problems.

(cherry picked from commit 5d8f74bd0b)
2021-10-06 12:01:31 +03:00
SaitTalhaNisanci 819ac372e0
Update images to use pg 14.0 (#5339) 2021-10-04 16:20:12 +03:00
Onur Tirtir 877369fd36 Discard index deletion requests made to columnarAM (#5331)
A write operation might trigger index deletion if index already had
dead entries for the key we are about to insert.
There are two ways of index deletion:
  a) simple deletion
  b) bottom-up deletion (>= pg14)

Since columnar_index_fetch_tuple never sets all_dead to true,
columnarAM doesn't ever expect to receive simple deletion requests
(columnar_index_delete_tuples) as we don't mark any index entries
as dead.

However, since columnarAM doesn't delete any dead entries via simple
deletion, postgres might ask for a more comprehensive deletion
(i.e.: bottom-up) at some point when pg >= 14.

So with this commit, we start gracefully ignoring bottom-up deletion
requests made to columnar_index_delete_tuples.

Given that users can anyway "VACUUM FULL" their columnar tables,
we don't see any problem in ignoring deletion requests.
(cherry picked from commit fe72e8bb48)
2021-10-01 14:33:27 +03:00
Önder Kalacı 2813063059 Make (columnar.stripe) first_row_number index a unique constraint (#5324)
* Make (columnar.stripe) first_row_number index a unique constraint

Since stripe_first_row_number_idx is required to scan a columnar
table, we need to make sure that it is created before doing anything
with columnar tables during pg upgrades.

However, a plain btree index is not a dependency of a table, so
pg_upgrade cannot guarantee that stripe_first_row_number_idx gets
created when creating columnar.stripe, unless we make it a unique
"constraint".

To do that, drop stripe_first_row_number_idx and create a unique
constraint with the same name to keep the code change at minimum.

* Add more pg upgrade tests for columnar

* Fix a logic error in upgrade_columnar_after test

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
(cherry picked from commit c2311b4c0c)
2021-09-30 10:52:46 +03:00
Onur Tirtir 8e5b9a06ad Add changelog for 10.2.1
(cherry picked from commit 2f1bf9499f)

 Conflicts:
	CHANGELOG.md
2021-09-24 13:11:30 +03:00
Onur Tirtir aed6776d1f Bump citus version to 10.2.1 2021-09-24 12:53:51 +03:00
SaitTalhaNisanci a9dced4291 Update images to use rc (#5320)
(cherry picked from commit 800ad5eca6)
2021-09-24 11:42:53 +03:00
Jeff Davis 34744501ce Columnar: only call BuildStripeMetadata() with heap tuple.
BuildStripeMetadata() calls HeapTupleHeaderGetXmin(), which must only
be called on a proper heap tuple with MVCC information. Make sure the
caller passes the heap tuple, and not a datum tuple.

Fixes #5318.

(cherry picked from commit d49d321eac)
2021-09-24 10:57:22 +03:00
tejeswarm 86b57f426b Partition shards to be colocated with the parent shards 2021-09-23 11:51:03 -07:00
Onur Tirtir 9801f743ce Revoke read access to columnar.chunk from unprivileged user (#5313)
Since this could expose chunk min/max values to unprivileged users.
(cherry picked from commit 77a2dd68da)
2021-09-22 16:24:08 +03:00
Onur Tirtir 8e3246a2f3 Columnar CustomScan: Pushdown BoolExpr's as we did before
(cherry picked from commit 68335285b4)
2021-09-22 11:44:12 +03:00
Onur Tirtir 1feb7102b8 Check if xact id is in progress before checking if aborted (#5312)
(cherry picked from commit e6ed764f63)
2021-09-22 11:44:12 +03:00
Onur Tirtir 9cde3d4122 Add CheckCitusVersion() calls to columnarAM (#5308)
Considering all code-paths that we might interact with a columnar table,
add `CheckCitusVersion` calls to tableAM callbacks:
- initializing table scan (`columnar_beginscan` & `columnar_index_fetch_begin`)
- setting a new filenode for a relation (storage initialization or a table rewrite)
- truncating the storage
- inserting tuple (single and multi)

Also add `CheckCitusVersion` call to:
- drop hook (`ColumnarTableDropHook`)
- `alter_columnar_table_set` & `alter_columnar_table_reset` UDFs
(cherry picked from commit f8b1ff7214)
2021-09-20 17:33:51 +03:00
Onder Kalaci 7da6d68675 Add missing version checks for citus_internal_XXX functions
(cherry picked from commit cea937f52f)
2021-09-20 13:50:47 +03:00
Hanefi Onaldi a913b90ff3
Bump Citus version to 10.2.0 2021-09-15 05:55:36 +03:00
Gurkan Indibay 082667a985
Add changelog entries for 10.2.0 2021-09-15 05:20:13 +03:00
jeff-davis 6e8b19984e
Columnar: separate plan and runtime quals. (#5261)
* Columnar: separate plain and exec quals.

Make a clear separation between plain quals, which contain constants
or extern params; and exec quals, which contain exec params and can't
be evaluated until a rescan.

Fixes #5258.

* more vanilla tests

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2021-09-13 10:54:53 -07:00
jeff-davis d48ceee238
Columnar: add method ReparameterizeCustomPathByChild. (#5275)
When performing a partition-wise join, the planner will adjust paths
parameterized by the parent rel to instead parameterize by the child
rel directly. When this reparameterization happens, we also need to
adjust the join quals to reference the child rather than the parent.

Fixes #5257.
2021-09-13 10:33:48 -07:00
Onur Tirtir ea61efb63a
Do not flush writes until we need to read them when doing index-scan on columnar (#5247)
Do not flush pending writes if the given tid belongs to a "flushed" or
"aborted" stripe write, or to an "in-progress" stripe write of
another backend.

That way, we would reduce the cases where we flush single-tuple
stripes during index scan.

To do that, we follow below steps for index look-up's:

- Do not flush any pending writes and do stripe metadata look-up for
  given tid.
  If tuple with tid is found, then no need to do another look-up
  since we already found the tuple without needing to flush pending
  writes.

- If tuple is not found without flushing pending writes, then we have two
  scenarios:

  - If the given tid belongs to a pending write of my backend, then do stripe
    metadata look-up for the given tid. But this time first **flush any pending
    writes**.

  - Otherwise, just return false from `index_fetch_tuple` since flushing
    pending writes wouldn't help.
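
The look-up steps above can be sketched as follows. This is a simplified Python model, not the columnar C code; the `Backend` class, its sets of tids, and `flush_count` are hypothetical and only exist to show when a flush is triggered.

```python
class Backend:
    """Hypothetical model of one backend's view of a columnar table."""

    def __init__(self, flushed_tids, pending_tids):
        self.flushed_tids = set(flushed_tids)   # visible via stripe metadata
        self.pending_tids = set(pending_tids)   # buffered writes of this backend
        self.flush_count = 0

    def flush_pending_writes(self):
        """Move this backend's buffered writes into the flushed set."""
        self.flushed_tids |= self.pending_tids
        self.pending_tids.clear()
        self.flush_count += 1

    def index_fetch_tuple(self, tid):
        # Step 1: look up stripe metadata without flushing anything.
        if tid in self.flushed_tids:
            return True
        # Step 2a: the tid is one of our own pending writes, so flush and retry.
        if tid in self.pending_tids:
            self.flush_pending_writes()
            return tid in self.flushed_tids
        # Step 2b: flushing would not help (aborted or another backend's write).
        return False
```

In this model, only a look-up for one of the backend's own pending tids increments `flush_count`; look-ups for already flushed tids or unknown tids never flush.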
2021-09-13 18:41:20 +02:00
Onur Tirtir 4ee0fb2758
Make sure to skip aborted writes when reading the first tuple (#5274)
With 5825c44d5f, we made the changes to
skip aborted writes when scanning a columnar table.

However, looks like we forgot to handle such cases for the very first
call made to columnar_getnextslot. That means, that commit only
considered the intermediate stripe read operations.

However, functions called by columnar_getnextslot to find first stripe
to read (ColumnarBeginRead & ColumnarRescan) were not caring about
those aborted writes.

To fix that, we teach AdvanceStripeRead to find the very first stripe
to read, and then start using it where we were blindly calling
FindNextStripeByRowNumber.
2021-09-13 11:50:53 +03:00
Burak Velioglu 531ad83b8c
Merge pull request #5263 from citusdata/velioglu/handle_errors_on_abort
Swallow errors while aborting remote transactions
2021-09-10 11:18:47 +03:00
Burak Velioglu ceec5d72e3
Swallow errors while aborting remote transactions 2021-09-10 11:06:16 +03:00
Naisila Puka a69abe3be0
Fixes bug about int and smallint sequences on MX (#5254)
* Introduce worker_nextval udf for int&smallint column defaults

* Fix current tests and add new ones for worker_nextval
2021-09-09 23:41:07 +03:00
Nils Dijk 80a44a7b93
prevent double inclusion of columnar_tableam.h (#5266)
Recently, some warnings have appeared during the compilation of Citus.
Some of these warnings are caused by the `columnar_tableam.h` header not being properly guarded with `#ifndef`/`#define`.

This PR fixes these warnings.
2021-09-09 17:37:58 +02:00
Onur Tirtir be74518965
Improve memset calls made to reset bool arrays (#5262) 2021-09-09 17:56:03 +03:00