citus

Commit Graph

Author	SHA1	Message	Date
Hanefi Onaldi	998b9ffcaa	Merge branch 'master' into changelog-updates	2021-08-05 19:27:13 +03:00
jeff-davis	deb7ec605b	Columnar: fix misleading comments and useless types. (#5162) CustomScan and CustomPath structures cannot be extended with additional fields. Fix comments and type structure that implied that they can.	2021-08-05 09:22:21 -07:00
Hanefi Onaldi	bc5553b5d1	Add changelog entries for 10.1.1	2021-08-05 17:32:31 +03:00
Ahmet Gedemenli	07ca8784cd	Merge pull request #5161 from citusdata/add-check-for-gucs-order Add check for alphabetically sorted gucs	2021-08-05 17:09:38 +03:00
Ahmet Gedemenli	51d410bb7b	Add check for alphabetically sorted gucs Move to a separate script Add the new script to readme	2021-08-05 16:37:49 +03:00
naisila	798a7902bf	Fix master_update_table_statistics scripts for 9.5	2021-08-03 18:15:56 +03:00
naisila	f9fa5a3d69	Fix master_update_table_statistics scripts for 9.4	2021-08-03 18:15:56 +03:00
Önder Kalacı	0d2f49fbce	Merge pull request #5130 from citusdata/get_ready_update_dist_table_colocation Introduce citus_internal_update_relation_colocation	2021-08-03 11:53:30 +02:00
Onder Kalaci	482b8096e9	Introduce citus_internal_update_relation_colocation update_distributed_table_colocation can be called by the relation owner, and internally it updates pg_dist_partition. With this commit, update_distributed_table_colocation uses an internal UDF to access pg_dist_partition. As a result, this operation can now be done by regular users on MX.	2021-08-03 11:44:58 +02:00
Onur Tirtir	ef6a8604ba	Merge pull request #5140 from citusdata/col/seq-path-costing Re-cost columnar table sequential scan paths With the changes in this pr, we adjust the cost estimates done by postgres for sequential scan paths for columnar tables. We want to make better decisions when columnar custom scan is disabled too. That means, there are cases where index scan is preferable to sequential scan for heapAM but not for columnarAM. For this reason, we want to make better decisions regarding whether to choose index scan or sequential scan when columnar custom scan is disabled. So with this pr, we re-estimate costs for sequential scan paths in a way that is quite similar to what we do for columnar custom scan. The idea is that columnar custom scan uses projection pushdown so the cost is directly proportional to column selectivity. However, for sequential scan, we re-estimate the cost considering all the columns since projection pushdown is not supported for plain sequential scan. One thing to note here is that we still don't consider chunk group filtering when estimating the cost for columnar custom scan. For this reason, we calculate the same costs for sequential scan & columnar custom scan if query reads all columns, regardless of the filters in the `where` clause. To avoid mistakenly choosing sequential scan in such cases, we still remove non `IndexPath`s if columnar custom scan is enabled. That way, even when we calculate the same cost for sequential scan and columnar scan, we will anyway remove sequential one and guarantee that we would choose either columnar custom scan or index scan.	2021-08-02 11:38:11 +03:00
Onur Tirtir	93ebbb0607	Re-cost SeqPath's as well for columnar tables	2021-08-02 11:32:25 +03:00
Onur Tirtir	453ac40725	Comment why we still remove non IndexPath's when custom scan is off	2021-08-02 11:25:18 +03:00
Onur Tirtir	a87405b6ba	Not adjust IndexPath cost if indexscan is off	2021-08-02 11:25:18 +03:00
Onur Tirtir	51691a8994	Rename RecostColumnarIndexPaths to RecostColumnarPaths	2021-08-02 11:25:18 +03:00
Onur Tirtir	734fa22272	Merge pull request #5090 from citusdata/col/path-costing Re-cost columnar table index scan paths With the changes in this pr, we adjust the cost estimate done by indexAM for `IndexPath` according to columnar tables when the index is on a columnar table. This is because, the way indexAM estimates the cost is not appropriate for indexes on columnar tables. The most basic reason is that indexAM assumes we will only need to read single page to access a single tuple of the table. On the other hand for columnar tables, we read the whole stripe from disk for a single tuple too, regardless of the optimization done in #5058. Note that we don't simply assign startup / total costs but we add the cost estimated by us to the cost estimated by indexAM. This is because we need to take "the cost due to index data-structure traversal" into account too. Before explaining the logic that we follow for `IndexPath`, let's first summarize what we were / are doing for `ColumnarCustomScan`: ```math X <- cost for reading single column of single stripe // 1 cost = X * (number of columns after projection pushdown) // 2 cost = cost * (number of stripes that relation has) // 3 ``` The logic that we follow to calculate the additional cost for index scan is as follows: ```math X <- cost for reading single column of single stripe // same as 1 above cost = X * (number of columns that relation has) // index scan cannot do projection pushdown, so different than 2 above cost = cost * (estimated number of stripes that we need to read) ``` where, we calculate `estimated number of stripes that we need to read` as follows: ```math indexCorrelation, indexSelectivity <- calculate by using amcostestimate_function estimatedReadRows = (relation row count) * indexSelectivity minEstimateStripeReads = estimatedReadRows / (average stripe row count) // full correlation, we will not do any redundant stripe reads maxEstimateStripeReads = estimatedReadRows // no correlation, we will read a different stripe for each tuple complementCorrelation = 1 - abs(indexCorrelation) estimatedStripeCount = minEstimateStripeReads + complementCorrelation * (maxEstimateStripeReads - minEstimateStripeReads) ```	2021-08-02 11:23:20 +03:00
Onur Tirtir	297f59a70e	Re-cost columnar table index paths	2021-08-02 11:16:37 +03:00
Onur Tirtir	8adcf2096b	Multiply ColumnarCustomScan cost by tblspace.seqpage cost	2021-08-02 11:16:37 +03:00
Onur Tirtir	dba8421453	Refactor ColumnarScanCost into ColumnarPerChunkGroupScanCost	2021-08-02 11:16:37 +03:00
Onur Tirtir	d8f92697f2	Free memory used for last stripe read when re-scanning a columnar table (#5143) Instead of setting stripeReadState to NULL, call ColumnarResetRead before re-scanning a columnar table since this function is already designed for doing the necessary clean up when finishing a stripe read. Note that this change shouldn't have a great effect on memory usage since AdvanceStripe was already doing the clean-up for all the stripes except the last one.	2021-08-02 11:16:01 +03:00
Onur Tirtir	38940ed2a6	Merge pull request #5058 from citusdata/col/optimize-index-read Use long-lasting mem cxt during columnar index scan & optimize correlated ones	2021-08-02 11:06:57 +03:00
Onur Tirtir	73058d35cc	Not free (stripe) chunk buffers after de-serializing Previously, we were only using chunk group reader for sequential scan. However, to support index scans on columnar tables, now we use very same low level functions for index scan too. Since those low-level functions were only used for sequential scan, it was guaranteed that we would never read the same chunk group more than once, so we were freeing chunk buffers after deserializing them into a separate buffer. Now that we use those low level functions for index scan, we cannot free chunk buffers since it's possible to read the same chunk group again, such that: - read chunk group 1 of stripe 5 - read chunk group 2 of stripe 5 - read chunk group 1 of stripe 5 again Here, when we decide to read chunk group 1 for a second time, chunk group 1 is not cached. Plus, before this commit, we were freeing the chunk buffers for chunk group 1 after the first read and then we were getting segfault or errors from low-level de-compression APIs.	2021-08-02 11:00:12 +03:00
Onur Tirtir	327ae43b83	Get rid of EndStripeRead, since we anyway reset mem cxt	2021-08-02 11:00:12 +03:00
Onur Tirtir	83f5d42365	Use long-lasting mem cxt & optimize correlated index scan	2021-08-02 11:00:12 +03:00
Onur Tirtir	c021b82a43	Introduce CreateColumnarScanMemoryContext	2021-08-02 11:00:12 +03:00
Onur Tirtir	a25d89e4cb	Merge pull request #5103 from citusdata/at-set-columnar-index Keep supported indexes when converting table to columnar. Previously, as indexes were not supported by columnar tables, we were ignoring all the indexes & index-based constraints of table when converting it to a columnar table. However, now that we support `btree` & `hash` indexAM's for columnar tables, we only ignore the indexAM's other than those two. However, the way we ignore the unsupported indexes is now a bit different than before. Previously we were just _not creating_ any index types after converting table to columnar as we didn't support any of the index types. Now that we support `btree` & `hash` indexAMs for columnar tables, now we really drop the unsupported index types since re-creating the remaining ones is easier than adding some code that creates only the supported indexes.	2021-07-30 17:01:30 +03:00
Onur Tirtir	84a49cc221	Improve error message for indexAMs not supported by columnar	2021-07-30 16:41:53 +03:00
Onur Tirtir	90e856d6bc	Keep supported indexes when converting table to columnar	2021-07-30 16:41:01 +03:00
Onur Tirtir	eeecbd2324	Introduce ColumnarSupportsIndexAM	2021-07-30 16:40:27 +03:00
Halil Ozan Akgül	d140ca1b0e	Merge pull request #5146 from citusdata/fix_ruleutils_13_endif_comment Corrects the ruleutils_13.c endif comment	2021-07-29 17:27:01 +03:00
Halil Ozan Akgul	286b0fe0e8	Corrects the endif comment	2021-07-29 17:22:31 +03:00
SaitTalhaNisanci	4559d02c41	Fix union pushdown issue (#5079) * Fix UNION not being pushed down Postgres optimizes column fields that are not needed in the output. We were relying on these fields to understand if it is safe to push down a union query. This fix looks at the parse query, which has the original column fields to detect if it is safe to push down a union query. * Add more tests * Simplify code and make it more robust * Process varlevelsup > 0 in FindReferencedTableColumn * Only look for outer vars in union path * Add more comments * Remove UNION ALL specific logic for pulling up childvars	2021-07-29 13:52:55 +03:00
Jelte Fennema	2aa67421a7	Fix showing target shard size in the rebalance progress monitor (#5136) The progress monitor wouldn't actually update the size of the shard on the target node when using "block_writes" as the `shard_transfer_mode`. The reason for this is that the CREATE TABLE part of the shard creation would only be committed once all data was moved as well. This caused our size calculation to always return 0, since the table did not exist yet in the session that the progress monitor used. This is fixed by first committing creation of the table, and only then starting the actual data copy. The test output changes slightly. Apparently splitting this up in two transactions instead of one increases the table size after the copy by about 40kB. The additional size used doesn't increase when the amount of data in the table is larger (it stays ~40kB per shard). So this small change in test output is not considered an actual problem.	2021-07-23 16:37:00 +02:00
Jelte Fennema	4c1066e463	Merge pull request #5133 from citusdata/add-cache-to-sequence-def-mx Include data_type and cache in sequence definition on workers	2021-07-22 11:57:03 +02:00
Jelte Fennema	7d0b6dc9be	Include data_type and cache in sequence definition on workers These two options were not included when creating the sequences on the workers as part of metadata syncing. The missing `data_type` part of the definition made finding the cause of #5126 harder than necessary, because of confusing errors.	2021-07-22 11:49:06 +02:00
Önder Kalacı	f52db0abab	Merge pull request #5127 from citusdata/get_ready_tenant_isolation Introduce citus_internal_delete_shard_metadata	2021-07-19 14:43:47 +02:00
Onder Kalaci	903489c763	Improve wording of an error message	2021-07-19 14:38:52 +02:00
Onder Kalaci	c8368e7929	Introduce citus_internal_delete_shard_metadata With this function, the owner of the table is allowed to remove shard metadata. This is going to be useful for tenant-isolation.	2021-07-19 13:25:05 +02:00
Önder Kalacı	87a51ae552	CLUSTER ON deparser should consider schemas (#5122)	2021-07-16 19:13:18 +03:00
Hanefi Onaldi	38c139ba59	Merge pull request #5114 from citusdata/changelog-updates	2021-07-16 17:53:36 +03:00
Hanefi Onaldi	6b4996f47e	Add changelog entries for 10.1.0 This patch also moves the section to the top of the changelog	2021-07-16 16:51:12 +03:00
Jelte Fennema	adf17a8cf1	Add upgrade and downgrade tests for Citus 10.2 (#5120) It seems we forgot to add this when starting 10.2 development.	2021-07-16 14:39:04 +02:00
Önder Kalacı	644052ea58	Merge pull request #5105 from citusdata/regular_user_metadata_sync Use current user while syncing metadata	2021-07-16 14:00:32 +02:00
Onder Kalaci	2c349e6dfd	Use current user to sync metadata Before this commit, we always synced the metadata with superuser. However, that creates various edge cases such as visibility errors or self distributed deadlocks or complicates user access checks. Instead, with this commit, we use the current user to sync the metadata. Note that, `start_metadata_sync_to_node` still requires superuser because accessing certain metadata (like pg_dist_node) always requires superuser (e.g., the current user should be a superuser). However, metadata syncing operations regarding the distributed tables can now be done with regular users, as long as the user is the owner of the table. A table owner can still insert nonsense metadata, however it'd only affect its own table. So, we cannot do anything about that.	2021-07-16 13:25:27 +02:00
Hanefi Onaldi	b3cc9d63cb	Merge pull request #5111 from citusdata/changelog-updates	2021-07-14 15:42:43 +03:00
Hanefi Onaldi	45b72c204d	Add changelog entry for 10.0.4	2021-07-14 15:04:45 +03:00
Onur Tirtir	f00c63c33d	Support columnar table index builds with CONCURRENTLY option (#5032) With this commit, we add (`CREATE INDEX` / `REINDEX`) `CONCURRENTLY` support for columnar tables. For that, we implement `columnar_index_validate_scan` callback. The reasoning behind the implementation is as follows: * Postgres function `validate_index` provides all the TIDs that are currently in the index to `columnar_index_validate_scan` callback via a `tupleSort` object. * We start scanning the table by using `columnar_getnextslot` as usual. Before moving forward, note that `columnar_getnextslot` guarantees to return tuples in the order of their TIDs. * For us to use during table scan, postgres provides a snapshot guaranteeing that any tuples that are valid according to that snapshot but are not in the index must be added to the index. * Then for each tuple that we read from our table, we continue iterating given `tupleSort` to find the first TID that is greater than or equal to our tuple's TID. If both TIDs are equal to each other, then we skip the tuple since it's already indexed. If the TID that we read from tupleSort is greater than our tuple's TID, then we decide to insert this tuple into index.	2021-07-09 13:44:58 +03:00
Onur Tirtir	ea5fe022a4	Be more explicit when doing ordered scan on columnar cat. tables (#5026) systable_getnext already uses ForwardScanDirection if relation has any open indexes, but let's be more explicit doing ordered scan on columnar catalog tables.	2021-07-09 13:24:27 +03:00
Hanefi Onaldi	ab873c6b58	Merge pull request #5030 from citusdata/do-not-use-public-schema	2021-07-09 02:15:42 +03:00
Hanefi Onaldi	efc5776451	Remove public schema dependency for 10.1 upgrades This commit contains a subset of the changes that should be cherry picked to 10.1 releases.	2021-07-09 02:08:22 +03:00
Hanefi Onaldi	8e9cc229ff	Remove public schema dependency for 10.0 upgrades This commit contains a subset of the changes that should be cherry picked to 10.0 releases.	2021-07-09 02:08:22 +03:00
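
The cost formulas quoted in the message for pull request #5090 above can be traced with a small sketch. This is not the Citus implementation (which is written in C); the function and parameter names below are hypothetical and only mirror the pseudocode in that commit message. The per-column stripe cost `X` is taken as an input because the message does not say how it is derived.

```python
def estimated_stripe_reads(relation_rows, index_selectivity,
                           index_correlation, avg_stripe_rows):
    """Estimate how many stripes an index scan reads (hypothetical helper)."""
    estimated_read_rows = relation_rows * index_selectivity
    # Full correlation: matching rows are adjacent, so no redundant stripe reads.
    min_reads = estimated_read_rows / avg_stripe_rows
    # No correlation: assume a different stripe is read for every tuple.
    max_reads = estimated_read_rows
    complement_correlation = 1 - abs(index_correlation)
    return min_reads + complement_correlation * (max_reads - min_reads)


def custom_scan_cost(per_column_stripe_cost, projected_columns, stripe_count):
    """Columnar custom scan: only projected columns, but every stripe."""
    return per_column_stripe_cost * projected_columns * stripe_count


def index_scan_additional_cost(per_column_stripe_cost, relation_columns,
                               stripe_reads):
    """Index scan: no projection pushdown, so all columns of each stripe read.

    Per the commit message, this amount is added to the indexAM estimate
    rather than replacing it.
    """
    return per_column_stripe_cost * relation_columns * stripe_reads


if __name__ == "__main__":
    # Made-up numbers: 1000 rows, 50% selectivity, 100 rows per stripe.
    for correlation in (1.0, 0.0, -0.5):
        reads = estimated_stripe_reads(1000, 0.5, correlation, 100)
        print(correlation, reads, index_scan_additional_cost(2.0, 4, reads))
```

With these made-up inputs the estimate moves from 5 stripe reads at full correlation to 500 at zero correlation, which shows why a poorly correlated index becomes expensive on a columnar table.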

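The validation scan described in the message for commit f00c63c33d (#5032) is essentially a merge of two TID-ordered streams. The sketch below illustrates that idea only; it is not the `columnar_index_validate_scan` code. The function name is hypothetical, TIDs are modeled as `(block, offset)` tuples, snapshot visibility checks are left out, and inserting a tuple once the index stream is exhausted is an assumption the commit message does not spell out.

```python
def find_tuples_missing_from_index(table_tids, indexed_tids):
    """Return table TIDs that are absent from the sorted index TID stream.

    Both inputs must be in ascending TID order, matching the ordering
    guarantees described in the commit message.
    """
    missing = []
    index_iter = iter(indexed_tids)
    current = next(index_iter, None)
    for tid in table_tids:
        # Advance to the first indexed TID that is >= the table tuple's TID.
        while current is not None and current < tid:
            current = next(index_iter, None)
        if current is not None and current == tid:
            continue  # already indexed, skip it
        # The indexed TID is greater, or the stream ran out (assumption):
        # the tuple needs to be inserted into the index.
        missing.append(tid)
    return missing


if __name__ == "__main__":
    table = [(0, 1), (0, 2), (0, 3), (1, 1)]
    index = [(0, 1), (0, 3)]
    print(find_tuples_missing_from_index(table, index))
```

Because both streams are sorted, each is traversed once, so the check is linear in the number of tuples plus the number of indexed TIDs.
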

5031 Commits (6c26c67ea09db4e95a93f3eaefea72d20e54b20c)
