Commit Graph

2735 Commits (d3fb9288ab290fa3ed1a1a9417e8d91fab9724f5)

Author SHA1 Message Date
Naisila Puka acb5ae6ab6
Skip dropping shards when we know it's a partition (#5176) 2021-08-31 17:41:37 +03:00
SaitTalhaNisanci 5ae01303d4
Use get_attnum to find the attribute number of target entry (#5220)
* Use get_attnum to find the attribute number of target entry
2021-08-31 16:47:19 +03:00
Jelte Fennema 481f8be084
Fix crash in shard rebalancer when no distributed tables exist (#5205)
The logging of the amount of ignored moves crashed when no distributed
tables existed in a cluster. In passing, this also makes the logging of
ignored moves report the correct number when there are multiple
colocation groups and all are rebalanced at the same time.
2021-08-31 14:15:24 +02:00
SaitTalhaNisanci d50830d4cc
Update failure tests README (#5197)
* Update failure tests README

I keep finding this page when trying to run failure tests, so I am updating the README accordingly:
https://github.com/pypa/pipenv/issues/3363#issuecomment-452171564

Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
2021-08-26 12:35:06 +03:00
Hanefi Onaldi 7e39c7ea83
Replace master with citus in logs and comments (#5210)
I replaced 

- master_add_node,
- master_add_inactive_node
- master_activate_node

with

- citus_add_node,
- citus_add_inactive_node
- citus_activate_node

respectively.
2021-08-26 11:31:17 +03:00
SaitTalhaNisanci b923d51fc6
Bump pg12 and pg13 images to pg12.8 and pg13.8 (#5208)
In our testing infrastructure, even though we use pinned versions of Postgres, the auxiliary libraries might pull in newer versions. This is for example the case for libpq, which will now use the libpq libraries from 14beta3.

Many of the changes in this PR are due to the libpq changes.

We also changed the Citus version that is used as a base for the Citus upgrades, from 10.0 to 10.1. This caused columnar to enforce some extra limits on the settings, which conflicted with our upgrade tests.

The changes in failure tests are due to the libpq changes.

There are also a lot of changes on isolation tests outputs, hence we
updated all of them.

Co-authored-by: Nils Dijk <nils@citusdata.com>
2021-08-25 16:04:57 +03:00
Onur Tirtir 5af839ada0
Not print metapage.reserved_offset in regression tests (#5168)
* We were not testing reserved_offset in any of those tests anyway,
   only other fields.

* This only happens with compressed columnar tables, because the
   libzstd/liblz4 versions that we have on the exttester CI image might be
   different from what we have in our local environments.
2021-08-23 11:07:10 +03:00
jeff-davis 4f213f293e
Columnar: use generate_series for test rather than load. (#5181) 2021-08-16 16:12:06 -07:00
Onur Tirtir 68f46c5dc9 Use scan context for intermediate mem allocs too 2021-08-16 11:06:03 +03:00
Burak Velioglu 4355ba0a38
Add CREATE INDEX ... ON ONLY and ALTER INDEX ... ATTACH PARTITION (#4938 #4980)
- Add support for CREATE INDEX ... ON ONLY: Before this commit we were not sending the "ONLY" option to the worker nodes at all. With this commit, the "ONLY" parameter will be sent to the worker nodes when necessary. (#4938)

- Add support for ALTER INDEX ... ATTACH PARTITION: Attach child_index to parent_index by creating the same inheritance on the shard level in addition to the table level. (#4980)
2021-08-13 13:12:45 +03:00
Ahmet Gedemenli 9e90894f21
Synchronize hasmetadata flag on mx workers (#5086)
* Synchronize hasmetadata flag on mx workers

* Switch to sequential execution

* Add test

* Use SetWorkerColumn

* Add test for stop_sync

* Remove usage of UpdateHasmetadataOnWorkersWithMetadata

* Remove MarkNodeMetadataSynced

* Fix test for metadatasynced

* Remove MarkNodeMetadataSynced

* Style

* Remove MarkNodeHasMetadata

* Remove UpdateDistNodeBoolAttr

* Refactor SetWorkerColumn

* Use SetWorkerColumnLocalOnly when setting up dependencies

* Use SetWorkerColumnLocalOnly in TriggerSyncMetadataToPrimaryNodes

* Style

* Make update command generator functions static

* Set metadatasynced before syncing

* Call SetWorkerColumn only if the sync is successful

* Try to sync all nodes

* Fix indexno

* Update metadatasynced locally first

* Break if a node fails to sync metadata

* Send worker commands optional

* Style & Rebase

* Add raiseOnError param to SetWorkerColumn

* Style

* Set metadatasynced for all metadata nodes

* Style

* Introduce SetWorkerColumnOptional

* Polish

* Style

* Don't send the set command to not-yet-synced metadata nodes

* Style

* Polish

* Add test for stop_sync

* Add test for shouldhaveshards

* Add test for isactive flag

* Sort by placementid in the function verify_metadata

* Cover edge cases for failing nodes

* Add comments

* Add nodeport to isactive test

* Add warning if metadata out of sync

* Update warning message
2021-08-12 14:16:18 +03:00
Naisila Puka e5b32b2c3c
Acquire AccessShareLock before updating table statistics (#5155) 2021-08-12 13:58:15 +03:00
Onder Kalaci d4368ff2b3 Make sure that shouldhaveshards is synced to workers 2021-08-11 15:53:31 +02:00
Onder Kalaci 5f02d18ef8 Transactional metadata sync for maintenance daemon
As we use the current user to sync the metadata to the nodes
with #5105 (and many other PRs), there is no reason that
prevents us from using the coordinated transaction for metadata syncing.

This commit also renames a few functions to reflect their actual
implementation.
2021-08-09 10:34:55 +02:00
Onder Kalaci 35964c6366 Dropped columns do not diverge distribution column for partitioned tables
Before this commit, creating a partition after a DROP COLUMN
on the parent (at a position before the distribution key) led the
partition to have the wrong distribution column.
2021-08-06 13:36:12 +02:00
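A minimal sketch of the scenario this commit fixes; the table and column names below are made up for illustration:

```sql
-- The dropped column sits before the distribution column, as described above.
CREATE TABLE events (dropped_col int, dist_col int, payload text)
    PARTITION BY RANGE (dist_col);
SELECT create_distributed_table('events', 'dist_col');
ALTER TABLE events DROP COLUMN dropped_col;
-- Before the fix, this new partition could end up with the wrong distribution column:
CREATE TABLE events_p1 PARTITION OF events FOR VALUES FROM (0) TO (1000);
```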
naisila 798a7902bf Fix master_update_table_statistics scripts for 9.5 2021-08-03 18:15:56 +03:00
naisila f9fa5a3d69 Fix master_update_table_statistics scripts for 9.4 2021-08-03 18:15:56 +03:00
Onder Kalaci 482b8096e9 Introduce citus_internal_update_relation_colocation
update_distributed_table_colocation can be called by the relation
owner, and internally it updates pg_dist_partition. With this
commit, update_distributed_table_colocation uses an internal
UDF to access pg_dist_partition.

As a result, this operation can now be done by regular users
on MX.
2021-08-03 11:44:58 +02:00
Onur Tirtir 93ebbb0607 Re-cost SeqPath's as well for columnar tables 2021-08-02 11:32:25 +03:00
Onur Tirtir 297f59a70e Re-cost columnar table index paths 2021-08-02 11:16:37 +03:00
Onur Tirtir 73058d35cc Not free (stripe) chunk buffers after de-serializing
Previously, we were only using the chunk group reader for sequential scans.
However, to support index scans on columnar tables, we now use the very
same low-level functions for index scans too.

Since those low-level functions were only used for sequential scan, it
was guaranteed that we would never read the same chunk group more than
once, so we were freeing chunk buffers after deserializing them into a
separate buffer.

Now that we use those low level functions for index scan, we cannot
free chunk buffers since it's possible to read the same chunk group
again, such that:

- read chunk group 1 of stripe 5
- read chunk group 2 of stripe 5
- read chunk group 1 of stripe 5 again

Here, when we decide to read chunk group 1 for a second time,
chunk group 1 is not cached. Moreover, before this commit, we were
freeing the chunk buffers for chunk group 1 after the first
read and then getting segfaults or errors from the low-level
decompression APIs.
2021-08-02 11:00:12 +03:00
Onur Tirtir 83f5d42365 Use long-lasting mem cxt & optimize correlated index scan 2021-08-02 11:00:12 +03:00
Onur Tirtir 84a49cc221 Improve error message for indexAMs not supported by columnar 2021-07-30 16:41:53 +03:00
Onur Tirtir 90e856d6bc Keep supported indexes when converting table to columnar 2021-07-30 16:41:01 +03:00
SaitTalhaNisanci 4559d02c41
Fix union pushdown issue (#5079)
* Fix UNION not being pushed down

Postgres optimizes away column fields that are not needed in the output. We
were relying on these fields to understand if it is safe to push down a
union query.

This fix looks at the parse query, which has the original column fields,
to detect if it is safe to push down a union query (see the sketch after
this entry).

* Add more tests

* Simplify code and make it more robust

* Process varlevelsup > 0 in FindReferencedTableColumn

* Only look for outers vars in union path

* Add more comments

* Remove UNION ALL specific logic for pulling up childvars
2021-07-29 13:52:55 +03:00
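A sketch of the query shape this fix targets (table names are illustrative); both sides of the UNION read the distribution column, so the query can be pushed down even when Postgres prunes that column from the output:

```sql
SELECT count(*) FROM (
    SELECT user_id FROM events_2020
    UNION
    SELECT user_id FROM events_2021
) AS u;
```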
Jelte Fennema 2aa67421a7
Fix showing target shard size in the rebalance progress monitor (#5136)
The progress monitor wouldn't actually update the size of the shard on
the target node when using "block_writes" as the `shard_transfer_mode`.
The reason for this is that the CREATE TABLE part of the shard creation
would only be committed once all data was moved as well. This caused
our size calculation to always return 0, since the table did not exist
yet in the session that the progress monitor used.

This is fixed by first committing creation of the table, and only then
starting the actual data copy.

The test output changes slightly. Apparently splitting this up into two
transactions instead of one increases the table size after the copy by
about 40kB. The additional size doesn't increase when the amount of data
in the table is larger (it stays ~40kB per shard). So this small change
in test output is not considered an actual problem.
2021-07-23 16:37:00 +02:00
Jelte Fennema 7d0b6dc9be Include data_type and cache in sequence definition on workers
These two options were not included when creating the sequences on the
workers as part of metadata syncing.

The missing `data_type` part of the definition made finding the cause
of #5126 harder than necessary, because of confusing errors.
2021-07-22 11:49:06 +02:00
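A sketch of a sequence definition whose options are now carried over to the workers; the sequence name is illustrative:

```sql
-- Both the AS data type and the CACHE option are now part of the
-- definition created on the workers during metadata syncing.
CREATE SEQUENCE user_id_seq AS bigint CACHE 100;
```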
Onder Kalaci 903489c763 Improve wording of an error message 2021-07-19 14:38:52 +02:00
Onder Kalaci c8368e7929 Introduce citus_internal_delete_shard_metadata
With this function, the owner of the table is allowed to remove
shard metadata. This is going to be useful for tenant-isolation.
2021-07-19 13:25:05 +02:00
Jelte Fennema adf17a8cf1
Add upgrade and downgrade tests for Citus 10.2 (#5120)
It seems we forgot to add this when starting 10.2 development.
2021-07-16 14:39:04 +02:00
Onder Kalaci 2c349e6dfd Use current user to sync metadata
Before this commit, we always synced the metadata with superuser.
However, that creates various edge cases such as visibility errors
or self distributed deadlocks or complicates user access checks.

Instead, with this commit, we use the current user to sync the metadata.
Note that `start_metadata_sync_to_node` still requires superuser
because accessing certain metadata (like pg_dist_node) always requires
superuser.

However, metadata syncing operations regarding the distributed
tables can now be done by regular users, as long as the user
is the owner of the table. A table owner can still insert nonsense
metadata; however, it'd only affect their own table, so we cannot do
anything about that.
2021-07-16 13:25:27 +02:00
Onur Tirtir f00c63c33d
Support columnar table index builds with CONCURRENTLY option (#5032)
With this commit, we add (`CREATE INDEX` / `REINDEX`) `CONCURRENTLY` support for columnar tables.

For that, we implement `columnar_index_validate_scan` callback.
The reasoning behind the implementation is as follows:

* Postgres function `validate_index` provides all the TIDs that are currently in the
  index to the `columnar_index_validate_scan` callback via a `tupleSort` object.

* We start scanning the table by using `columnar_getnextslot` as usual.
  Before moving forward, note that `columnar_getnextslot` guarantees
  to return tuples in the order of their TIDs.

* For us to use during the table scan, Postgres provides a snapshot guaranteeing
  that any tuples that are valid according to that snapshot but are not in the
  index must be added to the index.

* Then, for each tuple that we read from our table, we continue iterating the
  given `tupleSort` to find the first TID that is greater than or equal to our
  tuple's TID.

  If both TIDs are equal, then we skip the tuple since it's already
  indexed.

  If the TID that we read from the tupleSort is greater than our tuple's TID,
  then we decide to insert this tuple into the index.
2021-07-09 13:44:58 +03:00
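A usage sketch of the feature described above, under assumed table and index names:

```sql
CREATE TABLE measurements (ts timestamptz, value double precision) USING columnar;
-- Both of these are supported on columnar tables with this commit:
CREATE INDEX CONCURRENTLY measurements_ts_idx ON measurements (ts);
REINDEX INDEX CONCURRENTLY measurements_ts_idx;
```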
Hanefi Onaldi 8e9cc229ff
Remove public schema dependency for 10.0 upgrades
This commit contains a subset of the changes that should be cherry-picked
to the 10.0 releases.
2021-07-09 02:08:22 +03:00
Ahmet Gedemenli ed3b98a80b
Add failure test for stop_metadata_sync_to_node (#5102) 2021-07-08 18:23:19 +03:00
Nils Dijk 366796a72e
Add test for idempotency of citus_prepare_pg_upgrade 2021-07-08 12:24:51 +02:00
Marco Slot b14955c2bd
Fix PG upgrade scripts for 10.0 2021-07-05 14:38:20 +02:00
Marco Slot 3c0dfc12c0
Fix PG upgrade scripts for 9.5 2021-07-05 13:39:35 +02:00
Marco Slot bee202aa39
Fix PG upgrade scripts for 9.4 2021-07-05 13:39:28 +02:00
Onur Tirtir b118d4188e
Fix lower boundary calculation when pruning range dist table shards (#5082)
This happens only when we have a "<" or "<=" filter on the distribution
column of a range-distributed table and that filter falls in between
two shards.

When the filter falls in between two shards:

  If the filter is ">" or ">=", then UpperShardBoundary was
  returning "upperBoundIndex - 1", where upperBoundIndex is
  exclusive shard index used during binary seach.
  This is expected since upperBoundIndex is an exclusive
  index.
 
  If the filter is "<" or "<=", then LowerShardBoundary was
  returning "lowerBoundIndex + 1", where lowerBoundIndex is
  inclusive shard index used during binary seach.
  On the other hand, since lowerBoundIndex is an inclusive
  index, we should just return lowerBoundIndex instead of
  doing "+ 1". Before this commit, we were missing leftmost
  shard in such queries.

* Remove useless conditional branches

The branch that we deleted from UpperShardBoundary was obviously useless.

The other one in LowerShardBoundary became useless after we removed "+ 1"
from there.

This is indeed further proof of what this PR fixes, and how.

* Improve comments and add more

* Add some tests for upper bound calculation too
2021-07-02 14:48:21 +03:00
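A sketch of the affected query shape; the table name and shard ranges below are illustrative:

```sql
-- Suppose a range-distributed table has shards covering [1,10] and [21,30],
-- and the filter value 15 falls in between the two shards:
SELECT * FROM range_dist_table WHERE dist_col <= 15;
-- Before the fix, pruning could miss the leftmost matching shard ([1,10]).
```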
Ahmet Gedemenli 8bae58fdb7
Add parameter to cleanup metadata (#5055)
* Add parameter to cleanup metadata

* Set clear metadata default to true (see the usage sketch after this entry)

* Add test for clearing metadata

* Separate test file for start/stop metadata syncing

* Fix stop_sync bug for secondary nodes

* Use PreventInTransactionBlock

* Remove debugging logs

* Remove relation not found logs from mx test

* Revert localGroupId when doing stop_sync

* Move metadata sync test to mx schedule

* Add test with name that needs to be quoted

* Add test for views and matviews

* Add test for distributed table with custom type

* Add comments to test

* Add test with stats, indexes and constraints

* Fix matview test

* Add test for dropped column

* Add notice messages to stop_metadata_sync

* Add coordinator check to stop metadata sync

* Revert local_group_id only if clearMetadata is true

* Add a final check to see the metadata is sane

* Remove the drop verbosity in test

* Remove table description tests from sync test

* Add stop sync to coordinator test

* Change the order in stop_sync

* Add test for hybrid (columnar+heap) partitioned table

* Change error to notice for stop sync to coordinator

* Sync at the end of the test to prevent any failures

* Add test case in a transaction block

* Remove relation not found tests
2021-07-01 16:23:53 +03:00
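A usage sketch, assuming the new parameter is named clear_metadata and defaults to true as the bullets above suggest:

```sql
SELECT stop_metadata_sync_to_node('worker-1', 5432);  -- also clears metadata (default)
SELECT stop_metadata_sync_to_node('worker-1', 5432, clear_metadata => false);
```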
Sait Talha Nisanci e7ed16c296 Not include to-be-deleted shards while finding shard placements
Ignore orphaned shards in more places

Only use active shard placements in RouterInsertTaskList

Use IncludingOrphanedPlacements in some more places

Fix comment

Add tests
2021-06-28 13:05:31 +03:00
Naisila Puka fe5907ad2d
Adds propagation of ALTER SEQUENCE and other improvements (#5061)
* Alter seq type when we first use the seq in a dist table

* Don't allow type changes when seq is used in dist table

* ALTER SEQUENCE propagation (see the sketch after this entry)

* Tests for ALTER SEQUENCE propagation

* Relocate AlterSequenceType and ensure dependencies for sequence

* Support for citus local tables, and other fixes

* Final formatting
2021-06-24 21:23:25 +03:00
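A sketch of an ALTER SEQUENCE statement that is now propagated; the sequence name is illustrative:

```sql
-- With propagation in place, this is applied on the workers as well:
ALTER SEQUENCE user_id_seq INCREMENT BY 10;
```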
Jelte Fennema e9bfb8eddd
Fix check to always allow foreign keys to reference tables (#5073)
With the previous version of this check we would disallow distributed
tables that did not have a colocationid, to have a foreign key to a
reference table. This fixes that, since there's no reason to disallow
that.
2021-06-24 12:15:52 +02:00
Jelte Fennema d1d386a904
Only allow moves of shards of distributed tables (#5072)
Moving shards of reference tables was possible in at least one case:
```sql
select citus_disable_node('localhost', 9702);
create table r(x int);
select create_reference_table('r');
set citus.replicate_reference_tables_on_activate = off;
select citus_activate_node('localhost', 9702);
select citus_move_shard_placement(102008, 'localhost', 9701, 'localhost', 9702);
```

This would then remove the reference table shard on the source, causing
all kinds of issues. This fixes that by disallowing all shard moves
except for shards of distributed tables.

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2021-06-23 16:25:46 +02:00
Onder Kalaci 75847d10b5 Add regression tests for changing column type with fkey
closes https://github.com/citusdata/citus/issues/2337 as it doesn't
apply anymore.
2021-06-23 09:03:55 +03:00
Onder Kalaci 55ed93bf0d fix regression tests to avoid any conflicts in enterprise 2021-06-22 08:45:17 +03:00
Onder Kalaci 76ae5dd0db Improve regression tests for prepared statements
With a recent commit (644b266dee), the behaviour of prepared statements
for local cached plans has slightly changed.

Now, Citus caches the plans when they are re-used. This makes local
cached plans trigger on the 7th execution, and the 8th execution is the
first time the plan is used from the cache.

So, the tests are improved to cover the 8th execution.
2021-06-21 13:34:44 +03:00
Onder Kalaci 69ca943e58 Deparse/parse the local cached queries
With local query caching, we try to avoid the deparse/parse stages as the
operation is too costly.

However, we can do the deparse/parse operations once per cached query, right
before we put the plan into the cache. With that, we avoid edge
cases like (4239) or (5038).

In a sense, we make local plan caching behave similarly to non-cached
local/remote queries, by forcing the query to be deparsed once.
2021-06-21 12:24:29 +03:00
Onur Tirtir 82e58c91f3
Use correct test schedule name in columnar vg test target (#5027) 2021-06-18 11:31:16 +03:00
Onur Tirtir 6215a3aa93 Merge remote-tracking branch 'origin/master' into columnar-index 2021-06-17 14:31:12 +03:00
Hanefi Onaldi c4f50185e0
Ignore pl/pgsql line numbers in regression outputs (#4411) 2021-06-17 14:11:17 +03:00
SaitTalhaNisanci 3edef11a9f
Fix a test in hyperscale schedule (#5042) 2021-06-17 13:40:05 +03:00
Onur Tirtir 681f700321 Fix first_row_number test for stripe_row_limit enforcement 2021-06-17 10:51:43 +03:00
Onur Tirtir 18fe0311c0 Move rest of the schema changes to 10.2-1 2021-06-16 20:43:41 +03:00
Onur Tirtir 3d11c0f9ef Merge remote-tracking branch 'origin/master' into columnar-index
Conflicts:
	src/test/regress/expected/columnar_empty.out
	src/test/regress/expected/multi_extension.out
2021-06-16 20:23:50 +03:00
Onur Tirtir b6b969971a Error out for CLUSTER commands on columnar tables 2021-06-16 20:06:33 +03:00
Onur Tirtir 9b4dc2f804 Prevent using parallel scan for columnar index builds 2021-06-16 19:59:32 +03:00
Onur Tirtir 10a762aa88 Implement columnar index support functions 2021-06-16 19:59:32 +03:00
Halil Ozan Akgul db03afe91e Bump citus version to 10.2devel 2021-06-16 17:44:05 +03:00
SaitTalhaNisanci 1784c7ef85
Merge branch 'master' into split_multi 2021-06-16 15:26:09 +03:00
Sait Talha Nisanci c7d04e7f40 swap multi_schedule and multi_schedule_1 2021-06-16 14:40:14 +03:00
Sait Talha Nisanci c55e44a4af Drop table if exists 2021-06-16 14:19:59 +03:00
Sait Talha Nisanci fc89487e93 Split check multi 2021-06-16 14:19:59 +03:00
Naisila Puka e26b29d3bb
Fix nextval('seq_name'::text) bug, and schema for seq tests (#5046) 2021-06-16 13:58:49 +03:00
Marco Slot a7e4d6c94a Fix a bug that causes worker_create_or_alter_role to crash with NULL input 2021-06-15 20:07:08 +02:00
Onur Tirtir a209999618
Enforce table opt constraints when using alter_columnar_table_set (#5029) 2021-06-08 17:39:16 +03:00
Ahmet Gedemenli 089ef35940 Disable dropping and truncating known shards
Add test for disabling dropping and truncating known shards
2021-06-02 14:30:27 +02:00
Jelte Fennema 1a83628195 Use "orphaned shards" naming in more places
We were not very consistent in how we named these shards.
2021-06-04 11:39:19 +02:00
Jelte Fennema 503c70b619 Cleanup orphaned shards before moving when necessary
A shard move would fail if there was an orphaned version of the shard on
the target node. With this change, before actually failing, we try to clean
up orphaned shards to see if that fixes the issue.
2021-06-04 11:23:07 +02:00
Jelte Fennema 280b9ae018 Cleanup orphaned shards at the start of a rebalance
In case the background daemon hasn't cleaned up shards yet, we do this
manually at the start of a rebalance.
2021-06-04 11:23:07 +02:00
Jelte Fennema 7015049ea5 Add citus_cleanup_orphaned_shards UDF
Sometimes the background daemon doesn't clean up orphaned shards quickly
enough. It's useful to have a UDF to trigger this removal when needed.
We already had a UDF like this but it was only used during testing. This
exposes that UDF to users. As a safety measure it cannot be run in a
transaction, because that would cause the background daemon to stop
cleaning up shards while this transaction is running.
2021-06-04 11:23:07 +02:00
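A usage sketch, assuming the exposed cleanup is invoked as a procedure (consistent with the no-transaction requirement above):

```sql
-- Trigger orphaned shard removal on demand, outside a transaction block:
CALL citus_cleanup_orphaned_shards();
```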
Naisila Puka 0f37ab5f85
Fixes column default coming from a sequence (#4914)
* Add user-defined sequence support for MX (see the sketch after this entry)

* Remove default part when propagating to workers

* Fix ALTER TABLE with sequences for mx tables

* Clean up and add tests

* Propagate DROP SEQUENCE

* Removing function parts

* Propagate ALTER SEQUENCE

* Change sequence type before propagation & cleanup

* Revert "Propagate ALTER SEQUENCE"

This reverts commit 2bef64c5a29f4e7224a7f43b43b88e0133c65159.

* Ensure sequence is not used in a different column with different type

* Insert select tests

* Propagate rename sequence stmt

* Fix issue with group ID cache invalidation

* Add ALTER TABLE ALTER COLUMN TYPE .. precaution

* Fix attnum inconsistency and add various tests

* Add ALTER SEQUENCE precaution

* Remove Citus hook

* More tests

Co-authored-by: Marco Slot <marco.slot@gmail.com>
2021-06-03 23:02:09 +03:00
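A minimal sketch of the case being fixed; the table and column names are illustrative:

```sql
-- A column default coming from a sequence, on a distributed table:
CREATE TABLE items (id bigserial, tenant_id int, data text);
SELECT create_distributed_table('items', 'tenant_id');
-- With this commit, the id default draws from the sequence correctly,
-- including on MX worker nodes:
INSERT INTO items (tenant_id, data) VALUES (42, 'hello');
```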
Marco Slot c03729ad03 Only warn about reference tables when removing last node 2021-06-01 10:53:12 +02:00
Hanefi Onaldi 056005db4d
Improve tests for truncating local data (#5012)
We have slightly different behavior when using the truncate_local_data_after_distributing_table UDF on metadata-synced clusters. This PR aims to add tests to cover such cases.

We allow distributing tables with data that have foreign keys to reference tables only on metadata-synced clusters. This is the reason why some of my earlier tests failed when run on a single-node Citus cluster.
2021-06-03 08:51:32 +03:00
Ahmet Gedemenli 0fbddc740d Fix shard id difference for enterprise 2021-06-01 17:17:46 +03:00
Jelte Fennema 4c20bf7a36
Remove pg_dist_rebalence_strategy_enterprise_check (#5014)
This is not necessary anymore now that the rebalancer is open source.
2021-06-01 06:16:46 -07:00
Ahmet Gedemenli 69d39c0e8b Fix relname null bug during parallel execution 2021-06-01 14:14:35 +03:00
Ahmet Gedemenli 9638933d9d Remove function GenerateNewTargetEntriesForSortClauses 2021-06-01 12:35:36 +03:00
Jelte Fennema d3feee37ea
Add a simple python script to generate a new test (#3972)
The current default citus settings for tests are not really best
practice anymore. However, we keep them because lots of tests depend on
them.

I noticed that I kept creating the same test harness for every new test I
added. This is a simple script that generates that harness, given a
name for the test.

To run:

src/test/regress/bin/create_test.py my_awesome_test
2021-06-01 11:22:11 +02:00
SaitTalhaNisanci c72d2b479b
Add tests for union pushdown workaround (#5005) 2021-05-31 20:02:20 +02:00
SaitTalhaNisanci 8c3f85692d
Not consider old placements when disabling or removing a node (#4960)
* Not consider old placements when disabling or removing a node

* update cluster test
2021-05-28 22:38:20 +02:00
SaitTalhaNisanci 40a229976f
Fix flaky test because of parallel metadata syncing (#5004) 2021-05-28 13:19:15 +03:00
Hanefi Onaldi 4941f00a95
Do not run ref2ref tests in parallel 2021-05-21 16:14:59 +03:00
Hanefi Onaldi 878513f325
Remove all occurrences of replication_model GUC 2021-05-21 16:14:59 +03:00
SaitTalhaNisanci 87e3a5e24a
Use 2PC when using a node connection (#4997) 2021-05-21 14:58:53 +03:00
SaitTalhaNisanci 82f34a8d88
Enable citus.defer_drop_after_shard_move by default (#4961)
Enable citus.defer_drop_after_shard_move by default
2021-05-21 10:48:32 +03:00
Jelte Fennema 10f06ad753 Fetch shard size on the fly for the rebalance monitor
Without this change the rebalancer progress monitor gets the shard sizes
from the `shardlength` column in `pg_dist_placement`. This column needs to
be updated manually by calling `citus_update_table_statistics`.
However, `citus_update_table_statistics` could lead to distributed
deadlocks while database traffic is ongoing (see #4752).

To work around this we don't use the `shardlength` column anymore. Instead,
for every rebalance we now fetch all shard sizes on the fly.

Three additional things this does are:
1. It adds tests for the rebalance progress function.
2. If a shard move cannot be done because a source or target node is
   unreachable, then we error and stop the rebalance, instead of showing
   a warning and continuing. When using the by_disk_size rebalance
   strategy it's not safe to continue with other moves if a specific
   move failed. It's possible that the failed move made space for the
   next move, and because the failed move never happened this space now
   does not exist.
3. Adds two new columns to the result of `get_rebalance_progress` which
   show the size of the shard on the source and target node.

Fixes #4930
2021-05-20 16:38:17 +02:00
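A usage sketch for watching a rebalance with the on-the-fly sizes:

```sql
-- Run from another session while rebalance_table_shards() is in progress;
-- the source/target size columns are the two new ones mentioned above.
SELECT * FROM get_rebalance_progress();
```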
Nils Dijk a6c2d2a4c4
Feature: alter database owner (#4986)
DESCRIPTION: Add support for ALTER DATABASE OWNER

This adds support for changing the database owner. It achieves this by marking the database as a distributed object. By marking the database as a distributed object, it will look for its dependencies and order the user creation commands (enterprise only) before the alter of the database owner. This is mostly important when adding new nodes.

By having the database marked as a distributed object, Citus can easily understand which `ALTER DATABASE ... OWNER TO ...` commands to propagate, by resolving the object address of the database and verifying it is a distributed object, and hence that it should propagate changes of ownership to all workers.

Given that the ownership of the database might have implications on subsequent commands in transactions, we force sequential mode for transactions that have an `ALTER DATABASE ... OWNER TO ...` command in them. This will fail the transaction with meaningful help when the transaction has already executed parallel statements.

By default the feature is turned off since roles are not automatically propagated; having it turned on would cause hard-to-understand errors for the user. It can be turned on by the user via the `citus.enable_alter_database_owner` setting.
2021-05-20 13:27:44 +02:00
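A usage sketch, following the GUC name given above (database and role names are illustrative):

```sql
SET citus.enable_alter_database_owner TO on;    -- feature is off by default
ALTER DATABASE my_database OWNER TO new_owner;  -- now propagated to workers
```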
Onder Kalaci d07db99ea4 Make sure that target node in shard moves is eligible for shard move 2021-05-20 10:51:01 +02:00
Jelte Fennema 924959fdb1
Include result type in upgrade diff test (#4987)
We often change result types of functions slightly. Our downgrade tests
wouldn't notice these changes. This change adds them to the description
of these items.

An example of an SQL change that isn't caught without this change and is
caught with the get_rebalance_progress change in this PR:
https://github.com/citusdata/citus/pull/4963
2021-05-18 16:25:39 +02:00
SaitTalhaNisanci ff2a125a5b
Lookup hostname before execution (#4976)
We look up the hostname just before the execution so that, even if there are cached entries in the prepared statement cache, we use the updated entries.
2021-05-18 16:46:31 +03:00
SaitTalhaNisanci eaa7d2bada
Not block maintenance daemon (#4972)
It was possible to block the maintenance daemon by taking a SHARE ROW
EXCLUSIVE lock on pg_dist_placement. Until the lock was released, the
maintenance daemon would be blocked.

We should not block the maintenance daemon in any case, hence we now
try to get the pg_dist_placement lock without waiting; if we cannot get
it, then we don't try to drop the old placements.
2021-05-17 03:22:35 -07:00
Nils Dijk c91f8d8a15
Feature: localhost guc (#4836)
DESCRIPTION: introduce `citus.local_hostname` GUC for connections to the current node

Citus once in a while needs to connect to itself for some system operations. This used to be hardcoded to `localhost`. The hardcoded hostname causes some issues, for example in environments where `sslmode=verify-full` is required. It is not always desirable or even feasible to get `localhost` as an alt name on the certificate.

By introducing a GUC to use when connecting to the current instance, the user has more control over what network path is used and what hostname is required to be present in the server certificate.
2021-05-12 16:59:44 +02:00
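A usage sketch (the hostname is illustrative): point self-connections at a name that is present in the server certificate instead of the hardcoded `localhost`:

```sql
ALTER SYSTEM SET citus.local_hostname TO 'coordinator.internal.example.com';
SELECT pg_reload_conf();
```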
Hanefi Onaldi 13808b60cf
Update gitignore files 2021-05-12 09:49:07 +03:00
Jelte Fennema cbbd10b974
Implement an improvement threshold in the rebalancer (#4927)
Every move in the rebalancer algorithm results in an improvement in the
balance. However, even if the improvement in the balance was very small
the move was still chosen. This is especially problematic if the shard
itself is very big and the move will take a long time.

This changes the rebalancer algorithm to take the relative size of the
balance improvement into account when choosing moves. By default a move
will not be chosen if it improves the balance by less than half of the
size of the shard. An extra argument is added to the rebalancer
functions so that the user can decide to lower the default threshold if
the ignored move is wanted anyway.
2021-05-11 14:24:59 +02:00
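A usage sketch, assuming the extra argument is named improvement_threshold; per the description above, the default rejects moves that improve the balance by less than half of the shard size:

```sql
-- Accept even tiny balance improvements by lowering the threshold to zero:
SELECT rebalance_table_shards(improvement_threshold => 0);
```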
Onur Tirtir 2e419ea177 Add first_row_number column to columnar.stripe for tid mapping 2021-05-10 20:16:50 +03:00
jeff-davis 7b9aecff21 Columnar: metapage changes. (#4907)
* Columnar: introduce columnar storage API.

This new API is responsible for the low-level storage details of
columnar; translating large reads and writes into individual block
reads and writes that respect the page headers and emit WAL. It's also
responsible for the columnar metapage, resource reservations (stripe
IDs, row numbers, and data), and truncation.

This new API is not used yet, but will be used in subsequent
forthcoming commits.

* Columnar: add columnar_storage_info() for debugging purposes.

* Columnar: expose ColumnarMetadataNewStorageId().

* Columnar: always initialize metapage at creation time.

This avoids the complexity of dealing with tables where the metapage
has not yet been initialized.

* Columnar: columnar storage upgrade/downgrade UDFs.

Necessary upgrade/downgrade step so that new code doesn't see an old
metapage.

* Columnar: improve metadata.c comment.

* Columnar: make ColumnarMetapage internal to the storage API.

Callers should not have or need direct access to the metapage.

* Columnar: perform resource reservation using storage API.

* Columnar: implement truncate using storage API.

* Columnar: implement read/write paths with storage API.

* Columnar: add storage tests.

* Revert "Columnar: don't include stripe reservation locks in lock graph."

This reverts commit c3dcd6b9f8.

No longer needed because the columnar storage API takes care of
concurrency for resource reservation.

* Columnar: remove unnecessary lock when reserving.

No longer necessary because the columnar storage API takes care of
concurrent resource reservation.

* Add simple upgrade tests for storage/ branch

* fix multi_extension.out

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2021-05-10 20:16:46 +03:00
Ahmet Gedemenli 8cb505d6e1
Fix matview access method change issue (#4959)
* Fix matview access method change issue

* Use pg function get_am_name

* Split view generation command into pieces
2021-05-07 15:47:24 +03:00
SaitTalhaNisanci 6b1904d37a
When moving a shard to a new node ensure there is enough space (#4929)
* When moving a shard to a new node ensure there is enough space

* Add WaitForMiliseconds time utility

* Add more tests and increase readability

* Remove the retry loop and use a single udf for disk stats

* Address review

* address review

Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2021-05-06 17:28:02 +03:00
Ahmet Gedemenli bc818e76e2 Add notice log message for skipping child tables for optimization 2021-05-06 16:49:37 +03:00
Ahmet Gedemenli 2e0bb5c0c8 Fix nested select query with union bug 2021-05-05 20:35:00 +03:00
Jelte Fennema 0e6c080e81
Run copy_modified in upgrade tests (#4952)
This allows running the following command to update the expected files
with normalized output files for upgrade tests too:

```bash
cp src/test/regress/{results,expected}/upgrade_rebalance_strategy_before.out
```
2021-05-05 12:28:05 +02:00
Jelte Fennema 50357db957
Simplify code that tests the shard rebalancer algorithm (#4925)
This modifies the test code to use sane defaults instead of requiring
all values to be specified in the test.
2021-05-03 15:47:19 +02:00
Hanefi Onaldi 23a505d41f
Bump PG versions in CI (#4941)
Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
Co-authored-by: Sait Talha Nisanci <s.talhanisanci@gmail.com>
2021-05-03 13:51:20 +03:00
Jelte Fennema 2f29d4e53e Continue to remove shards after first failure in DropMarkedShards
The comment of DropMarkedShards described the behaviour that after a
failure we would continue trying to drop other shards. However the code
did not do this and would stop after the first failure. Instead of
simply fixing the comment I fixed the code, because the described
behaviour is more useful. Now a single shard that cannot be removed yet
does not block others from being removed.
2021-04-30 15:42:09 +03:00
Marco Slot 4b49cb112f Fix FROM ONLY queries on partitioned tables 2021-04-27 16:10:07 +02:00
Onur Tirtir 889ad6fa8c Run some upgrade tests only when old version=9.0 2021-04-26 14:53:53 +03:00
Ahmet Gedemenli 332c5ce4ad
Fix worker partitioned size functions (#4922) 2021-04-26 10:29:46 +03:00
Jelte Fennema 763fa1cf41 Fix diff-filter to search the whole line for matches
Recently two new normalization line deletion rules have been added that
don't match the start of a line:
```
/local tables that are added to metadata but not chained with reference tables via foreign keys might be automatically converted back to postgres tables$/d
/Consider setting citus.enable_local_reference_table_foreign_keys to 'off' to disable this behavior$/d
```

Because `diff-filter` used `regex.match` these lines were not removed
when creating a new diff. This could cause some confusing diffs, where
the wrong lines were shown as changed. This fixes that by using
`regex.search` instead of `regex.match`.
2021-04-23 12:43:49 +02:00
Onder Kalaci 918838e488 Allow constant VALUES clauses in pushdown queries
As long as the VALUES clause contains constant values, we should not
recursively plan the queries/CTEs.

This is follow-up work of #1805. So, we can easily apply OUTER join
checks as if the VALUES clause were a reference table/immutable function.
2021-04-21 14:28:08 +02:00
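A sketch of a query shape that no longer needs recursive planning, since the VALUES clause contains only constants (names illustrative):

```sql
SELECT d.user_id, v.label
FROM dist_table d
JOIN (VALUES (1, 'a'), (2, 'b')) AS v (user_id, label)
    ON d.user_id = v.user_id;
```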
SaitTalhaNisanci 93c2dcf3d2
Fix data-race with concurrent calls of DropMarkedShards (#4909)
* Fix problems with concurrent calls of DropMarkedShards

When trying to enable `citus.defer_drop_after_shard_move` by default it
turned out that DropMarkedShards was not safe to call concurrently.
This could especially cause big problems when also moving shards at the
same time. During tests it was possible to trigger a state where a shard
that was moved would not be available on any of the nodes anymore after
the move.

Currently DropMarkedShards is only called in production by the
maintenance daemon. Since this is only a single process, triggering such
a race is currently impossible in production settings. In future changes
we will want to call DropMarkedShards from other places too, though.

* Add some isolation tests

Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2021-04-21 10:59:48 +03:00
Ahmet Gedemenli 33c620f232
Optimize partitioned disk size calculation (#4905)
* Optimize partitioned disk size calculation

* Polish

* Fix test for citus_shard_cost_by_disk_size

Try optimizing if not CSTORE
2021-04-19 13:30:56 +03:00
Onur Tirtir 96278822d9
Move columnar test helpers to a separate file (#4908)
* Move columnar test helpers to another file

* Rename column_store_memory_stats to columnar_store_memory_stats
2021-04-16 18:56:21 +03:00
Onder Kalaci 5b78f6cd63 Keep more execution statistics
When DEBUG4 is enabled, Citus now prints per-task execution times.
2021-04-16 14:45:00 +02:00
Hanefi Onaldi 9919fbe3f8 Switch to sequential mode on long partition names
This commit adds support for long partition names for distributed tables:
- ALTER TABLE dist_table ATTACH PARTITION ..
- CREATE TABLE .. PARTITION OF dist_table ..

Note: create_distributed_table UDF does not support long table and
partition names, and is not covered in this commit
2021-04-14 15:27:50 +03:00
Ahmet Gedemenli e445e3d39c
Introduce 3 partitioned size udfs (#4899)
* Introduce 3 partitioned size udfs

* Add tests for new partition size udfs

* Fix type incompatibilities

* Convert UDFs into pure sql functions

* Fix function comment
2021-04-13 17:36:27 +03:00
Onur Tirtir fe5c985e1d
Remove HAS_TABLEAM config since we dropped pg11 support (#4862)
* Remove HAS_TABLEAM config

* Drop columnar_ensure_objects_exist

* Not call columnar_ensure_objects_exist in citus_finish_pg_upgrade
2021-04-13 10:51:26 +03:00
Ahmet Gedemenli 52e467a9a0
Error out if inheriting a distributed table (#4871)
* Error out if inheriting a distributed table

* Add test inheriting a distributed table
2021-04-07 11:21:06 +03:00
Ahmet Gedemenli e4c4a9b683
Fix error message for local table joins (#4870)
* Fix error message for local table joins

* Fix error messages for regression tests expected outputs
2021-04-06 16:18:28 +03:00
Ahmet Gedemenli 48a6a5b128 Add test for public shard not found issue 2021-04-06 10:29:17 +03:00
Ahmet Gedemenli d530d79d73 Fix tests for public schema 2021-04-06 10:29:17 +03:00
jeff-davis 063e673038
Columnar: use clause Vars for chunk group filtering. (#4856)
* Columnar: use clause Vars for chunk group filtering.

This solves #4780 and also provides a cleaner separation between chunk
group filtering and projection pushdown.

* Columnar: sort and deduplicate Vars pulled from clauses.

* Columnar: cleanup variable names.

* Columnar: remove alternate test output.

* Columnar: do not recurse when looking for whereClauseVars.

Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-04-01 12:27:28 -07:00
Halil Ozan Akgul a5038046f9 Adds shard_count parameter to create_distributed_table 2021-03-29 16:22:49 +03:00
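A usage sketch of the new parameter; table and column names are illustrative:

```sql
-- Overrides citus.shard_count for this table only:
SELECT create_distributed_table('my_table', 'tenant_id', shard_count => 64);
```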
Hanefi Önaldı 797538750f Delete all upgrade test artifacts before citus-upgrade-local 2021-03-27 00:46:06 +03:00
SaitTalhaNisanci 03832f353c Drop postgres 11 support 2021-03-25 09:20:28 +03:00
Onur Tirtir 7081690480
Add check-columnar-vg regression test target (#4737) 2021-03-25 11:55:58 +03:00
SaitTalhaNisanci 3a3171cd04 Ignore temporary output files 2021-03-25 09:59:21 +03:00
Nils Dijk 1c1999ed7b
incorporate the fixopen fix for osx users on bigsur (#4837)
comparable to https://github.com/citusdata/tools/pull/88

This patch adds checks to the Perl script running the testing harness of Citus to start the Postgres instances via the fixopen binary when present, to work around `Interrupted System call` errors on OSX Big Sur.
2021-03-22 16:22:08 +01:00
Nils Dijk 787ee97867
Tests: foreign key non colocated tests (#4841)
Earlier versions of Citus (pre 9.0) had a bug where a user was able to get into a situation where a foreign key between two non-colocated tables was allowed. This was caused by wrongful scoping together with only setting a boolean variable to on in a loop, causing the `true` from an earlier iteration to leak into a new iteration.

This was 'by accident' solved in a refactor that was executed in the preparation of the 9.0 release. Only recently a user ran into this, and it was tracked down to this behaviour.

Given the dire situation a user could get themselves into when running into this bug, we have backported a fix to the latest 8.3 release branch.

To make sure this regression does not happen again in the future, I propose we add the tests from the backport to our mainline.

For reference: https://github.com/citusdata/citus/pull/4840
2021-03-22 15:33:56 +01:00
dependabot[bot] a1aedc41f1
Bump jinja2 from 2.11.2 to 2.11.3 in /src/test/regress
Bumps [jinja2](https://github.com/pallets/jinja) from 2.11.2 to 2.11.3.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/master/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/2.11.2...2.11.3)

Signed-off-by: dependabot[bot] <support@github.com>
2021-03-20 05:51:26 +00:00
Önder Kalacı b5f4320164
Make sure that single task local executions start coordinated transaction (#4831)
With https://github.com/citusdata/citus/pull/4806 we enabled
2PC for any non-read-only local task. However, if the execution
is a single task, enabling 2PC (CoordinatedTransactionShouldUse2PC)
hits an assertion as we are not in a coordinated transaction.

There is no downside to using a coordinated transaction for single-task
local queries.
2021-03-17 12:20:57 +01:00
Ahmet Gedemenli 5e5db9eefa Add udf citus_get_active_worker_nodes 2021-03-17 13:15:59 +03:00
Marco Slot fbc2147e11 Replace MAX_PUT_COPY_DATA_BUFFER_SIZE by citus.remote_copy_flush_threshold GUC 2021-03-16 06:00:38 +01:00
Marco Slot 1646fca445 Add GUC to set maximum connection lifetime 2021-03-16 01:57:57 +01:00
Onur Tirtir b2a7bafcc4
Fix flaky test in multi_foreign_key_relation_graph (#4819) 2021-03-15 17:55:04 +03:00
Onur Tirtir 1d3e075e62
Support temporary columnar tables (#4766) 2021-03-12 12:01:36 +03:00
Onder Kalaci e65e72130d Rename use -> shouldUse
Because setting the flag doesn't necessarily mean that we'll
use 2PC. If connections are read-only, we will not use 2PC.
In other words, we'll use 2PC only for connections that modified
any placements.
2021-03-12 08:29:43 +00:00
Onder Kalaci 6a7ed7b309 Do not trigger 2PC for reads on local execution
Before this commit, Citus used 2PC no matter what kind of
local query execution happened.

For example, if the coordinator has shards (and the workers as well),
even a simple SELECT query could start 2PC:
```SQL

WITH cte_1 AS (SELECT * FROM test LIMIT 10) SELECT count(*) FROM cte_1;
```

In this query, the local execution of the shards (and also intermediate
result reads) triggers the 2PC.

To prevent that, Citus now distinguishes local reads and local writes.
And, Citus switches to 2PC only if a modification happens. This may
still lead to unnecessary 2PCs when there is a local modification
and remote SELECTs only. Though, we handle that separately
via #4587.
2021-03-12 08:29:43 +00:00
Onur Tirtir 874d5fd962
Remove foreign keys between columnar metadata tables (#4791)
Postgres keeps AFTER trigger state for each transaction, because we can have deferred AFTER triggers which will be fired at the end of a transaction. Postgres cleans up this state at the end of the transaction.

Postgres processes ON COMMIT triggers after cleaning up the AFTER trigger states. So if we fire any triggers in ON COMMIT, the AFTER trigger state won't be cleaned up properly and the transaction state will be left in an inconsistent state, which might result in an assertion failure.

So with this commit, we remove foreign keys between columnar metadata tables and enforce constraints between them manually when dropping columnar tables.
2021-03-12 11:28:17 +03:00
Naisila Puka 196064836c
Skip 2PC for readonly connections in a transaction (#4587)
* Skip 2PC for readonly connections in a transaction

* Use ConnectionModifiedPlacement() function

* Remove the second check of ConnectionModifiedPlacement()

* Add order by to prevent flaky output

* Test using pg_dist_transaction
2021-03-10 20:01:37 +03:00
Marco Slot 9c0d7f5c26 Add tests for modifying CTE and SELECT without FROM 2021-03-09 10:39:33 +01:00
SaitTalhaNisanci aef7fc3a51
Ignore columnar generated test files (#4796) 2021-03-09 10:52:08 +03:00
Philip Dubé 4e22f02997 Fix various typos due to zealous repetition 2021-03-04 19:28:15 +00:00
Onur Tirtir 1bb7a0a268
Fix chunk_group_consistency regression test view (#4765) 2021-03-04 12:20:25 +03:00
Onur Tirtir 9728ce1167
Add tests for concurrent index deadlock issue (#4775) 2021-03-04 11:56:54 +03:00
Onder Kalaci d1cd198655 Prevent infinite recursion for queries that involve UNION ALL and JOIN
With this commit, we make sure to prevent infinite recursion for queries
in the format: [subquery with a UNION ALL] JOIN [table or subquery]

Also, this fixes a bug where we pushed down a UNION ALL below a JOIN even if
the UNION ALL was not safe to push down.
2021-03-03 12:27:26 +01:00
Naisila Puka 2f30614fe3
Reimplement citus_update_table_statistics to detect dist. deadlocks (#4752)
* Reimplement citus_update_table_statistics

* Update stats for the given table not colocation group

* Add tests for reimplemented citus_update_table_statistics

* Use coordinated transaction, merge with citus_shard_sizes functions

* Update the old master_update_table_statistics as well
2021-03-03 04:12:30 +03:00
Marco Slot dca615c5aa Normalize the ConvertTable notices 2021-03-01 10:36:12 +01:00
jeff-davis 9da9bd3dfd
Columnar: rename files and tests. (#4751)
* Columnar: rename files and tests.

* Columnar: Rename Table*State to Columnar*State.
2021-03-01 08:34:24 -08:00
SaitTalhaNisanci feee25dfbd
Use translated vars in postgres 13 as well (#4746)
* Use translated vars in postgres 13 as well

Postgres 13 removed translated vars, so we had special logic
for pg 13. However, it had a bug, so now we copy the translated vars
before Postgres deletes them. This also simplifies the logic.

* fix rtoffset with pg >= 13
2021-02-26 19:41:29 +03:00
Halil Ozan Akgul 5c5cb200f7 Adds GRANT for public to citus_tables 2021-02-26 16:24:33 +03:00
Önder Kalacı 0fe26a216c
Prevent cross join without any target list entries (#4750)
/*
 * The physical planner assumes that all worker queries would have
 * target list entries based on the fact that at least the column
 * on the JOINs have to be on the target list. However, there is
 * an exception to that if there is a cartesian product join and
 * there are no additional target list entries belonging to one side
 * of the JOIN. Once we support cartesian product join, we should
 * remove this error.
 */
2021-02-26 11:04:21 +01:00
Onur Tirtir 54ac924bef Grant read access for columnar metadata tables to unprivileged user 2021-02-26 12:31:09 +03:00
Onur Tirtir dcc0207605 Add 10.0-2 schema version 2021-02-26 12:31:09 +03:00
Onur Tirtir 5ed954844c
Ensure table owner when using alter_columnar_table_set/alter_columnar_table_reset (#4748) 2021-02-26 12:27:51 +03:00
Naisila Puka 5ebd4eac7f
Preserve colocation with procedures in alter_distributed_table (#4743) 2021-02-25 19:52:47 +03:00
Hanefi Onaldi 5aff18b573 Fix flaky test 2021-02-24 17:09:08 +03:00
Hanefi Onaldi 9a792ef841 Remove length limitations for table renames 2021-02-24 03:35:27 +03:00
Hanefi Onaldi 7bebeb872d Failing long table name tests 2021-02-24 03:35:27 +03:00
Naisila Puka dbb88f6f8b
Fix insert query with CTEs/sublinks/subqueries etc (#4700)
* Fix insert query with CTE

* Add more cases with deferred pruning but false fast path

* Add more tests

* Better readability with if statements
2021-02-23 18:00:47 +03:00
Naisila Puka 105bb580e1
Add columnar regression tests (#4727)
* Add cursor tests for columnar tables

* Add columnar tests for data types w/out comp. operators

* Add more prepared statements with columnar tables

* Add constraint tests for columnar tables

* Add row level security, detach partition and rename columnar tests

* Add some ORDER BYs
2021-02-23 14:16:38 +03:00
Ahmet Gedemenli 1f345f65b4 Support dropping local table indexes along with a distributed index 2021-02-18 13:30:12 +03:00
Onur Tirtir 676d9a9726 Bump Citus to 10.1devel 2021-02-17 11:54:33 +03:00
Onur Tirtir d61fd6e478
Decide changing sequence dependencies on MX nodes according to resulting relation (#4713)
When executing alter_table / undistribute_table udf's, we should not try
to change sequence dependencies on MX workers if new table wouldn't
require syncing metadata.

Previously, we were checking that for input table. But in some cases, the
fact that input table requires syncing metadata doesn't imply the same
for resulting table (e.g when undistributing a Citus table).

Even more, doing that was giving an unexpected error when undistributing
a Citus table so this commit actually fixes that.
2021-02-15 19:20:26 +03:00
SaitTalhaNisanci bcbd24f8de
Only consider pseudo constants for shortcuts (#4712)
It seems that we need to consider only pseudo constants while taking some
shortcuts in planning. For example, there could be a false clause, but it
can still contribute to the result, in which case it will not be a pseudo
constant.
2021-02-15 18:39:37 +03:00
SaitTalhaNisanci 0f1ce7a913
Not skip relation in conversion if it doesn't have RelationRestriction (#4685)
We would exclude tables without relationRestriction from conversion
candidates in local-distributed table joins. This could leave a leftover
local table which should have been converted to a subquery.

Ideally I would expect that in each call to CreateDistributedPlan we
would pass a new plan id, but that seems like a bigger change.
2021-02-12 12:33:55 +03:00
Hadi Moshayedi e690d8b79b Move stripe.chunk_count to last position 2021-02-11 17:00:44 -08:00
Jeff Davis 1f1c3c362b Columnar: rename chunk_num -> chunk_group_num. 2021-02-11 09:27:00 -08:00
Onder Kalaci f297c96ec5 Add regression tests for COPY into colocated intermediate results
To add the tests without too much data, make the copy switchover
configurable.
2021-02-11 15:41:06 +01:00
Onder Kalaci 5d5a357487 Do not re-use connections for intermediate results
/*
 * Colocated intermediate results are just files and not required to use
 * the same connections with their co-located shards. So, we are free to
 * use any connection we can get.
 *
 * Also, the current connection re-use logic does not know how to handle
 * intermediate results, as the intermediate results always truncate the
 * existing files. That's why we use one connection per intermediate
 * result.
 */
2021-02-11 15:41:06 +01:00
Ahmet Gedemenli c8e83d1f26 Fix dropping fkey when distributing table 2021-02-11 15:48:35 +03:00
SaitTalhaNisanci 847b79078f
Not consider subplans in restriction list (#4679)
* Not consider subplans in restriction list

* Not consider sublink, alternative subplan in restrictions
2021-02-11 15:04:07 +03:00
Onur Tirtir ec7ab68f3b Test adding local table with long name to metadata 2021-02-10 18:05:04 +03:00
Onur Tirtir 9f619a85d6
Fix EXPLAIN ANALYZE exec when query returns no cols (#4672)
We do not include a dummy column if the original task didn't return any
columns.
Otherwise, the number of columns that the original task returned wouldn't
match the number of columns returned by worker_save_query_explain_analyze.
2021-02-10 17:59:47 +03:00
Hadi Moshayedi 52297804ae Fix zero column tables 2021-02-09 23:05:11 -08:00
Hadi Moshayedi 2d09c76b76 Rename storageid to storage_id 2021-02-09 19:57:04 -08:00
Hadi Moshayedi 8270b598b6 Rename stripeid, chunkid, and attnum 2021-02-09 19:50:50 -08:00
Hadi Moshayedi be90c20457 Fix write path for zero column tables 2021-02-09 14:14:06 -08:00
Hadi Moshayedi c8d61a31e2 Columnar: chunk_group metadata table 2021-02-09 14:11:58 -08:00
Onder Kalaci c804c9aa21 Allow local execution for intermediate results in COPY
When COPY is used for copying into co-located files, it was
not allowed to use local execution. The primary reason was
Citus treating co-located intermediate results as co-located
shards, and COPY into the distributed table was done via
"format result". And, local execution of such COPY commands
was not implemented.

With this change, we implement support for local execution with
"format result". To do that, we use the buffer for every file
on shardState->copyOutState, similar to how local copy on
shards are implemented. In fact, the logic is similar to
local copy on shards, but instead of writing to the shards,
Citus writes the results to a file.

The logic relies on LOCAL_COPY_FLUSH_THRESHOLD, and flushes
only when the size exceeds the threshold. But, unlike local
copy on shards, in this case we write the headers and footers
just once.
2021-02-09 15:00:06 +01:00
SaitTalhaNisanci e96da4886f
Sort results in citus_shards and give raw size (#4649)
* Sort results in citus_shards and give raw size

Sort results so that it is consistent and also similar to citus_tables.

Use raw size in the output so that doing operations on the size is
easier.

* Change column ordering
2021-02-08 15:29:42 +03:00
Hadi Moshayedi 3e6b54b964 Normalize isolation_metadata_sync_deadlock 2021-02-06 15:59:28 -08:00
Hadi Moshayedi eff8cffaf3
Columnar: improve naming of limit config variables. (#4653)
* Rename chunk_row_count to chunk_group_row_limit (see the sketch after this entry)

* Rename stripe_row_count to stripe_row_limit

* Undo couple of renames
2021-02-06 09:04:04 -08:00
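A sketch of how the renamed limits appear as table options, assuming the alter_columnar_table_set parameters follow the new naming above (table name illustrative):

```sql
SELECT alter_columnar_table_set('my_columnar_table',
                                chunk_group_row_limit => 10000,
                                stripe_row_limit => 150000);
```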
Jeff Davis b1882d4400 Columnar: Call nextval_internal instead of DirectFunctionCall. 2021-02-06 01:45:30 -08:00
Hadi Moshayedi 4e53314e3f Make isolation_metadata_sync_deadlock more resilient 2021-02-06 01:05:24 -08:00
Hadi Moshayedi 0a9fd91d8f Use 'Chunk Groups' in EXPLAIN ANALYZE of columnar scan 2021-02-05 10:58:01 -08:00
Hadi Moshayedi 1d311b0709 Columnar: don't double count chunks filtered 2021-02-05 10:58:01 -08:00
Ahmet Gedemenli 5dd2a3da03 Convert RelabelTypes into CollateExprs in get_rule_expr function 2021-02-05 12:06:46 +03:00
Ahmet Gedemenli 503171d2f2
Merge branch 'master' into rename-master-parameter-for-dist-stat-activity 2021-02-04 15:37:13 +03:00
Ahmet Gedemenli 2443b20b2c Rename master to distributed for worker stat activity 2021-02-04 12:20:06 +03:00
Onder Kalaci fc9a23792c COPY uses adaptive connection management on local node
With #4338, the executor is smart enough to failover to
local node if there is not enough space in max_connections
for remote connections.

For COPY, the logic is different. With #4034, we made COPY
work with the adaptive connection management slightly
differently. The cause of the difference is that COPY doesn't
know which placements are going to be accessed, hence it requires
getting connections up-front.

Similarly, COPY decides to use local execution up-front.

With this commit, we change the logic for COPY on local nodes:

Try to reserve a connection to the local host. This follows
the same logic (e.g., citus.local_shared_pool_size) as the
executor, because COPY also relies on TryToIncrementSharedConnectionCounter().
If reservation to the local node fails, switch to local execution.
Apart from this, if local execution is disabled, we follow the
exact same logic as multi-node Citus. It means that if we run
out of connections, we'd give an error.
2021-02-04 09:45:07 +01:00
Ahmet Gedemenli 34840ddc5c Rename master to citus for dist stat activity cols 2021-02-04 11:12:23 +03:00
Hadi Moshayedi 5fde617229 Columnar: disallow CREATE INDEX CONCURRENTLY 2021-02-03 12:10:00 -08:00
Jeff Davis 4043731c41 Columnar: fix inheritance planning. 2021-02-03 10:41:21 -08:00
Sait Talha Nisanci 24e60b44a1 Consider coordinator in intermediate result optimization
It seems that we were not considering the case where the coordinator was
added to the cluster as a worker in the optimization of intermediate
results.

This could lead to errors when the coordinator was added as a worker.
2021-02-03 20:02:03 +03:00
Onur Tirtir c0f2817b70
Disallow using alter_table udfs with tables having any identity cols (#4635)
pg_get_tableschemadef_string doesn't know how to deparse identity
columns, so we cannot reflect those columns when creating the table
from scratch. For this reason, we don't allow using alter_table udfs
with tables having any identity cols.
2021-02-03 19:33:54 +03:00
Onur Tirtir 3a403090fd
Disallow adding local table with identity column to metadata (#4633)
pg_get_tableschemadef_string doesn't know how to deparse identity
columns, so we cannot reflect those columns when creating the shell
relation.
For this reason, we don't allow adding local tables having identity
cols to metadata.
2021-02-03 19:05:17 +03:00
Onur Tirtir 5efb742f8a
Skip copying GENERATED ALWAYS AS STORED cols in ReplaceTable (#4616)
Postgres doesn't allow inserting into columns having GENERATED ALWAYS
AS (...) STORED expressions.
For this reason, when executing undistribute_table or an alter_* udf,
we should skip copying such columns.
This is harmless since Postgres generates those values itself anyway.
2021-02-03 17:55:16 +03:00
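A minimal sketch of the case being handled (table and column names hypothetical):

    CREATE TABLE events (a int, a_doubled int GENERATED ALWAYS AS (a * 2) STORED);
    SELECT create_distributed_table('events', 'a');
    -- the copy behind undistribute_table skips a_doubled; Postgres
    -- recomputes the generated values in the new table
    SELECT undistribute_table('events');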
jeff-davis e03246dd45
Columnar: mark custom scan path parallel_safe. (#4619)
Enables an overall plan to be parallel (e.g. over a partition
hierarchy), even though an individual ColumnarScan is not
parallel-aware.

Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-02-02 11:56:00 -08:00
jeff-davis e195af7e72
Columnar: always disable parallel paths. (#4617)
Previously, if columnar.enable_custom_scan was false, parallel paths
could remain, leading to an unexpected error.

Also, ensure that cheapest_parameterized_paths is cleared if a custom
scan is used.

Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-02-02 11:37:42 -08:00
Onur Tirtir 912d829757 Skip GENERATED ALWAYS AS (...) STORED cols when processing cols owning sequences
When finding columns owning sequences, we shouldn't rely on atthasdef
since it might be true when the column has a GENERATED ALWAYS AS (...)
STORED expression.
2021-02-02 18:17:42 +03:00
Onur Tirtir c8a48c6eee
Not try to sync metadata for local tables (#4625) 2021-02-02 15:12:12 +03:00
Hadi Moshayedi bcb162976f Fix #4608 2021-02-01 16:23:16 -08:00
Hadi Moshayedi f5b1e49b79 Columnar: Fix lateral joins 2021-02-01 11:59:36 -08:00
Hadi Moshayedi ef927688fa Columnar: Fix ALTER TABLE ... ADD COLUMN. 2021-02-01 11:40:17 -08:00
Brian Bergeron 1253eeb9ff
Don't propagate ALTER ROLE SET when scoped to a different database (#4471)
Co-authored-by: brberger <brberger@microsoft.com>
2021-02-01 15:49:26 +03:00
Hanefi Önaldı cab17afce9 Introduce UDFs for fixing partitioned table constraint names 2021-01-29 17:32:20 +03:00
Hanefi Önaldı 92cf49b7e9 Limit shardId in partitioned table constraint names to only CHECK 2021-01-29 17:29:53 +03:00
SaitTalhaNisanci 738825cc38
Fix partition column index issue (#4591)
* Fix partition column index issue

We send column names to the worker_hash/range_partition_table methods,
and in these methods we look up the column name's index in the tuple
descriptor. This index is then used to decide the bucket to which the
current row will be sent for repartitioning.

This becomes a problem when there are duplicate column names in the
tupleDescriptor: we can choose the wrong index, the partitioned data
will be sent to the wrong workers, and the result could miss some data
because workers might contain different ranges of data.

An example:
TupleDescriptor contains "trip_id", "car_id", "car_id" for one table.
It contains only "car_id" for the other table. And assuming that the
tables will be partitioned by car_id, it is not certain what should be
used for deciding the bucket number for the first table. Assuming value
2 goes to bucket 2 and value 3 goes to bucket 3, it is not certain which
bucket the row "1 2 3" (trip_id, car_id, car_id) will go to.

As a solution we send the index of partition column in targetList
instead of the column name.

The old API is kept so that clusters with not-yet-upgraded workers
keep working (though the old path still has the same bug)

* Use the same method so that backporting is easier
2021-01-29 14:40:40 +03:00
SaitTalhaNisanci 1ba399f5ca
Fix a flaky behaviour in shared_connection_stats (#4596)
With the previous query, we were not pushing down the pg_sleep, hence
the number of connections to a worker could differ from run to run.
2021-01-28 18:42:49 +03:00
Onder Kalaci c7ea46067f Add regression tests 2021-01-28 12:45:57 +01:00
Onur Tirtir bb5962ee79
Early error out when creating citus local from a temp table (#4592) 2021-01-28 14:18:06 +03:00
Halil Ozan Akgul 913aa91449 Adds error message to AlterTableSetAccessMethod for below PG12 2021-01-28 11:32:02 +03:00
jeff-davis 15297cab49
Columnar: add GUC to control qual pushdown. (#4586) 2021-01-27 09:57:40 -08:00
Nils Dijk 07d3b4fd04
fix NaN cost estimate on empty columnar tables (#4593)
Fixing a division by zero in the cost calculations for scanning a columnar table.

Due to how the columns in a columnar table are counted, an empty table would result in a division by zero. Instead, this patch keeps the column selection ratio at zero when this happens, resulting in an accurate cost of zero pages to scan a columnar table.

fixes #4589
2021-01-27 17:32:17 +01:00
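A sketch of the trigger condition (table name hypothetical; assumes the columnar access method of this series):

    -- an empty columnar table: before this fix, the cost estimate could
    -- divide by zero and yield NaN; now the scan costs zero pages
    CREATE TABLE empty_columnar (a int, b text) USING columnar;
    EXPLAIN SELECT a FROM empty_columnar;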
Onur Tirtir b20615cbbe
Advise dropping foreign key in addition to create_reference_table hint (#4590) 2021-01-27 17:59:06 +03:00
Onur Tirtir 8151c4b443 Merge remote-tracking branch 'origin/master' into rename-create_citus_local_table 2021-01-27 17:08:58 +03:00
Ahmet Gedemenli b2c1bbddd4
Merge branch 'master' into fix-dropping-mat-views-when-alter-table 2021-01-27 16:33:10 +03:00
Ahmet Gedemenli 35043c56f1 Fix dropping materialized views while doing alter table 2021-01-27 16:32:09 +03:00
Onur Tirtir dfcdccd0e7 Rename udf in regression tests (as per prev commit) 2021-01-27 15:52:37 +03:00
Onur Tirtir 1a4482a37c Get rid of the sql dir for new udf 2021-01-27 15:52:37 +03:00
Onur Tirtir 2f30be823e Rename create_citus_local_table to citus_add_local_table_to_metadata
For simplicity in the downgrade test in multi_extension, we didn't
actually remove the create_citus_local_table udf.
2021-01-27 15:52:36 +03:00
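Usage after the rename (table name hypothetical):

    CREATE TABLE local_events (a int);
    -- new name; create_citus_local_table is kept around only for the
    -- downgrade test, per the note above
    SELECT citus_add_local_table_to_metadata('local_events');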
Onur Tirtir c06fcc26e5 Hide notice messages when implicitly undistributing citus local tables 2021-01-27 13:42:06 +03:00
Onur Tirtir cacb76d2c6
Not mention citus local tables in error messages (#4579) 2021-01-27 12:36:53 +03:00
Naisila Puka 94bc2703bc
Make undistribute_table() and citus_create_local_table() work with columnar (#4563)
* Make undistribute_table() and citus_create_local_table() work with columnar

* Rename and use LocallyExecuteUtilityTask for UDF check

* Remove 'local' references in ExecuteUtilityCommand
2021-01-27 01:17:20 +03:00
Halil Ozan Akgul bafa692fc1 Adds error messages with names of indexes that will be dropped 2021-01-26 18:18:26 +03:00
Ahmet Gedemenli e99f052904 Fix index renaming when creating citus local tables 2021-01-26 15:52:48 +03:00
Jeff Davis d62e54dc09 Columnar: optimize write path. 2021-01-25 11:47:21 -08:00
Hadi Moshayedi 639952ffa8 Read chunk row count from catalog tables 2021-01-25 08:53:52 -08:00
Onur Tirtir 215d6630c3 Update foreign_key_to_reference_table so that test output doesn't change 2021-01-25 11:03:39 +03:00
Onur Tirtir b5ea033a0b Convert postgres tables to citus local when creating reference table having fkeys 2021-01-25 11:02:50 +03:00
Jeff Davis 53f7b019d5 Columnar: clean up old references to cstore. 2021-01-22 11:08:36 -08:00
Onur Tirtir 941c8fbf32
Automatically undistribute citus local tables when no more fkeys with reference tables (#4538) 2021-01-22 18:15:41 +03:00
Marco Slot 03328e9679 Rename citus_tables column names to be query-friendly 2021-01-21 18:58:30 +01:00
Ahmet Gedemenli 3ac30ef9d8
Merge branch 'master' into remove-deprecated-gucs-udfs 2021-01-22 13:06:13 +03:00
Ahmet Gedemenli 76354ff563
Merge branch 'master' into remove-deprecated-gucs-udfs 2021-01-22 12:47:06 +03:00
Ahmet Gedemenli 887b67953b
Merge branch 'master' into fix-bug-create-citus-local-table-with-stats 2021-01-22 12:46:47 +03:00
Hadi Moshayedi ff38996645 More meaningful columnar metadata table names 2021-01-21 21:29:07 -08:00
jeff-davis 0b5551faaf
Columnar: add explain info for chunk filtering (#4554)
Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-01-21 15:04:42 -08:00
jeff-davis 0581df23f4
Add columnar test for json (#4553)
Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-01-21 14:36:38 -08:00
Önder Kalacı 9b39b25390
Prevent citus local table creation via remote execution (#4540)
/*
 * Creating Citus local tables relies on functions that access
 * shards locally (e.g., ExecuteAndLogDDLCommand()). As long as
 * we don't teach those functions to access shards remotely, we
 * cannot relax this check.
*/
2021-01-21 11:26:45 +03:00
Onur Tirtir 433062e5d2
Add fkeys between citus local and reference tables in some tests (#4546) 2021-01-20 19:30:20 +03:00
Ahmet Gedemenli 89a6fe83f7 Replace to update_distributed_table_colocation for tests 2021-01-20 17:30:06 +03:00
Ahmet Gedemenli ceb6b503c0 Remove unused UDF mark_tables_colocated 2021-01-20 17:29:23 +03:00
Ahmet Gedemenli 2fa060a32d Fix bug creating citus local table with stats 2021-01-20 17:17:13 +03:00
Halil Ozan Akgul 434f5af030 Adds same access method check 2021-01-20 15:18:03 +03:00
Hadi Moshayedi 8a5b6a43fc Normalize citus_local_tables 2021-01-19 15:56:42 -08:00
Hadi Moshayedi 0e0fd6599a Faster logical replication tests.
Logical replication status can take wal_receiver_status_interval
seconds to get updated. Default is 10s, which means tests in
which logical replication is used can take a long time to finish.
We reduce it to 1 second to speed these tests up.

Logical replication apply launcher launches workers every
wal_retrieve_retry_interval, so if we have many shard moves with
logical replication consecutively, they will be throttled by this
parameter. The default is 5s; we reduce it to 1s so the tests finish
faster.
2021-01-19 07:48:47 -08:00
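A sketch of the two settings in question (both are stock Postgres GUCs; the 1s values are the ones described above):

    ALTER SYSTEM SET wal_receiver_status_interval TO '1s';
    ALTER SYSTEM SET wal_retrieve_retry_interval TO '1s';
    SELECT pg_reload_conf();  -- apply without a restart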
Hadi Moshayedi bc01c795a2 Reland #4419 2021-01-19 07:48:47 -08:00
SaitTalhaNisanci 745ffbc691
Separate schedules for mixed mode and normal mode in upgrade (#4420) 2021-01-19 14:08:11 +03:00
Halil Ozan Akgul 27c2bd1599 Moves creation of ALTER INDEX STATISTICS commands next to index commands 2021-01-18 16:55:53 +03:00
Naisila Puka 7124a7715d
Skip 'already exists' in CREATE TABLE IF NOT EXISTS PARTITION OF (#4507)
* Just skip 'already exists' in CT IF NOT EXISTS PARTITION OF

* Generalize to tables that are not already distributed partitions
2021-01-18 15:56:02 +03:00
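A minimal sketch of the now-tolerated statement (names hypothetical):

    CREATE TABLE metrics (t timestamptz, v int) PARTITION BY RANGE (t);
    SELECT create_distributed_table('metrics', 'v');
    CREATE TABLE metrics_2021 PARTITION OF metrics
        FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');
    -- repeating the command with IF NOT EXISTS is now skipped with a
    -- notice instead of erroring out
    CREATE TABLE IF NOT EXISTS metrics_2021 PARTITION OF metrics
        FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');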
Onur Tirtir f1ecbc3a53
Fix segfault when adding/dropping fkey from ref to citus local via remote exec (#4528) 2021-01-17 20:43:33 +03:00
Onur Tirtir 5a3e8a6e24
Skip postgres tables for UndistributeTable(cascadeViaFKeys) (#4530)
The reason behind skipping postgres tables is that we support
foreign keys between postgres tables and reference tables
(without converting postgres tables to citus local tables)
when enable_local_reference_table_foreign_keys is false or
when coordinator is not added to metadata.
2021-01-17 20:32:30 +03:00
Ahmet Gedemenli 107097ee28 Fix assert failure when creating statistics 2021-01-15 19:36:58 +03:00
Onur Tirtir 7dddfa2d0b
Not invalidate fkey cache if citus not installed (#4521) 2021-01-15 18:31:43 +03:00
Onder Kalaci 30d0a65f40 Adds citus.enable_local_reference_table_foreign_keys
When enabled, foreign keys between local tables and reference
tables are supported by converting the local table to a citus local
table.

When the coordinator is not in the metadata, the logic is disabled
as foreign keys are not allowed in this configuration.
2021-01-15 18:04:52 +03:00
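A minimal sketch of the behavior (table names hypothetical; the GUC is the one this commit adds):

    SET citus.enable_local_reference_table_foreign_keys TO on;
    CREATE TABLE countries (code text PRIMARY KEY);
    SELECT create_reference_table('countries');
    -- defining the foreign key converts users into a citus local table
    CREATE TABLE users (id bigint, country text REFERENCES countries (code));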
Onur Tirtir e718d24868 Add support for CREATE TABLE commands defining foreign keys 2021-01-15 17:46:06 +03:00
Ahmet Gedemenli 9a100bcdb9 Remove unused GUCs
Remove deprecated variables

Remove GUC citus.sslmode

Remove GUC citus.expire_cached_shards

Remove GUC citus.task_tracker_delay

Remove GUC citus.max_assign_task_batch_size

Remove GUC citus.max_tracked_tasks_per_node

Remove GUC citus.max_running_tasks_per_node

Remove GUC citus.large_table_shard_count

Remove GUC citus.max_task_string_size

Remove GUC citus.binary_master_copy_format
2021-01-15 13:30:45 +03:00
Onur Tirtir 787ed643dd
Undistribute table when cascade_via_foreign_keys=true even if rel has no fkeys (#4516)
If the relation is not involved in any foreign key relationships,
the foreign key graph would not return any relations for the given
relationId, as expected.

But even in that case, we should still undistribute the table
itself.
2021-01-15 12:45:44 +03:00
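A sketch of the call (table name hypothetical; the option name is the one in the title):

    -- succeeds even when the table is not involved in any foreign keys
    SELECT undistribute_table('orders', cascade_via_foreign_keys => true);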
Onur Tirtir 36b418982f Add support for ALTER TABLE commands defining foreign keys 2021-01-14 17:12:00 +03:00
Nils Dijk a655ef27bc
Test columnar recovery (#4485)
DESCRIPTION: Add tests to verify crash recovery for columnar tables

Based on the Postgres TAP tooling we add a new test suite to the array of test suites for citus. It is modelled after `src/test/recovery` in the postgres project and takes the same place in our repository. It uses the perl modules defined in the postgres project to control the postgres nodes.

The test we add here focus on crash recovery. Our follower tests should cover the streaming replication behaviour.

It is hooked to our CI for both postgres 12 and postgres 13. We omit the recovery tests for postgres 11 as we do not have support for the columnar table access method.
2021-01-14 14:58:29 +01:00
Marco Slot b840e97cd6 Add a alter_old_partitions_set_access_method UDF 2021-01-14 10:44:14 +01:00
Ahmet Gedemenli 9b56ad48cb Recreate invalidation functions for Citus10
Fix multi_create_table

Add schema name to altered functions

Recreate invalidation functions when downgrading
2021-01-13 23:18:07 +03:00
jeff-davis ec319faa43
Only allow columnar tables with permanent storage (#4492). (#4495)
Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-01-13 10:37:34 -08:00
jeff-davis b49beda4c3
Stronger check for triggers on columnar tables (#4493). (#4494)
* Stronger check for triggers on columnar tables (#4493).

Previously, we used a simple ProcessUtility_hook. Change to use an
object_access_hook instead.

* Replace alter_table_set_access_method test on partition with foreign key

Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2021-01-13 10:30:53 -08:00
Marco Slot de6aaaa648 Expand support for subqueries in target list through recursive planning 2021-01-13 17:26:09 +01:00
Onur Tirtir ccbc3de535 Enable reference/distributed table creation from citus local tables 2021-01-13 17:14:26 +03:00
Halil Ozan Akgul 2be14cce2e Adds alter_distributed_table and alter_table_set_access_method UDFs 2021-01-13 16:02:39 +03:00
SaitTalhaNisanci 724d56f949
Add citus shard helper view (#4361)
With citus shard helper view, we can easily see:
- where each shard is, which node, which port
- what kind of table it belongs to
- its size

With such a view, we can see shards that have a size bigger than some
value, which could be useful. Debugging can also be easier in
production with this view.

Fetch shards in one go per node

The previous implementation was slow because it would do a lot of round
trips, one per shard to be exact. It is improved so that we fetch all
the (shard_name, shard_size) pairs per node in one go.

Construct the shard names and sizes query on the coordinator
2021-01-13 13:58:47 +03:00
Önder Kalacı 7e0826a06b
Make sure that materialized views that contain only (#4499)
Make sure that materialized views that contain only intermediate results work fine. 2021-01-13 13:17:43 +03:00
2021-01-13 13:17:43 +03:00
Ahmet Gedemenli 436c9d9d79
Remove the word 'master' from Citus UDFs (#4472)
* Replace master_add_node with citus_add_node

* Replace master_activate_node with citus_activate_node

* Replace master_add_inactive_node with citus_add_inactive_node

* Use master udfs in old scripts

* Replace master_add_secondary_node with citus_add_secondary_node

* Replace master_disable_node with citus_disable_node

* Replace master_drain_node with citus_drain_node

* Replace master_remove_node with citus_remove_node

* Replace master_set_node_property with citus_set_node_property

* Replace master_unmark_object_distributed with citus_unmark_object_distributed

* Replace master_update_node with citus_update_node

* Replace master_update_shard_statistics with citus_update_shard_statistics

* Replace master_update_table_statistics with citus_update_table_statistics

* Rename master_conninfo_cache_invalidate to citus_conninfo_cache_invalidate

Rename master_dist_local_group_cache_invalidate to citus_dist_local_group_cache_invalidate

* Replace master_copy_shard_placement with citus_copy_shard_placement

* Replace master_move_shard_placement with citus_move_shard_placement

* Rename master_dist_node_cache_invalidate to citus_dist_node_cache_invalidate

* Rename master_dist_object_cache_invalidate to citus_dist_object_cache_invalidate

* Rename master_dist_partition_cache_invalidate to citus_dist_partition_cache_invalidate

* Rename master_dist_placement_cache_invalidate to citus_dist_placement_cache_invalidate

* Rename master_dist_shard_cache_invalidate to citus_dist_shard_cache_invalidate

* Drop master_modify_multiple_shards

* Rename master_drop_all_shards to citus_drop_all_shards

* Drop master_create_distributed_table

* Drop master_create_worker_shards

* Revert old function definitions

* Add missing revoke statement for citus_disable_node
2021-01-13 12:10:43 +03:00
Onur Tirtir 2ef5879bcc
Fix error thrown for foreign keys from citus local to dist tables (#4490) 2021-01-13 10:15:12 +03:00
Onur Tirtir dd55ab394e
Disallow cascade_via_foreign_keys if any partition rel has non-inherited fkeys (#4487) 2021-01-11 21:50:09 +03:00
Naisila Puka 7b05777682
Add ALTER TABLE .. SET LOGGED/UNLOGGED support (#4486) 2021-01-11 20:39:06 +03:00
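A minimal sketch (table name hypothetical; the command is now propagated to the shard placements):

    CREATE TABLE staging (a int);
    SELECT create_distributed_table('staging', 'a');
    ALTER TABLE staging SET UNLOGGED;  -- applied to every shard placement
    ALTER TABLE staging SET LOGGED;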
Marco Slot d900a7336e Automatically add placeholder record for coordinator 2021-01-08 15:09:53 +01:00
Marco Slot 597533b1ff Add citus_set_coordinator_host 2021-01-08 13:36:26 +01:00
Marco Slot e7f13978b5 Add a view for simple (time) partitions and their access methods 2021-01-08 11:28:15 +01:00
Onur Tirtir 5289785da4
Add cascade_via_foreign_keys option to create_citus_local_table (#4462) 2021-01-08 15:13:26 +03:00
Marco Slot 011283122b Add the shard rebalancer implementation 2021-01-07 16:51:55 +01:00
Onur Tirtir d9a3e26f20
Fix flaky test in multi_foreign_key_relation_graph (#4476)
CREATE TABLE does not invalidate the foreign key graph, but some other
DDL commands do.

Previously, as we ran multi_foreign_key & multi_foreign_key_relation_graph
in parallel, it was possible that multi_foreign_key invalidated the
foreign key graph via some DDL commands, making the CREATE TABLE test in
multi_foreign_key_relation_graph flaky.

So we un-parallelize those two tests.
2021-01-07 16:19:11 +03:00
Onur Tirtir f3801143fb Add cascade option to undistribute_table 2021-01-07 15:41:49 +03:00
Marco Slot 47c1b19174 Revert "Do metadata sync in a separate background worker."
This reverts commit 4df723cf9b.
2021-01-07 10:30:04 +01:00
Marco Slot d9f175532b Revert "Trigger metadata sync at transaction commit"
This reverts commit a2c73bef27.
2021-01-07 10:30:00 +01:00
Marco Slot 5de3337b2f Support local execution for INSERT..SELECT with re-partitioning 2021-01-06 16:15:53 +01:00
Naisila Puka bcfc0aa4e9
Rethrow original concurrent index creation failure message (#4469)
* Rethrow original concurrent index creation failure message

* Alter test outputs for concurrent index creation

* Detect duplicate table failure in concurrent index creation

* Add test for conc. index creation w/out duplicates
2021-01-06 15:27:13 +03:00
Ahmet Gedemenli 1f36ff7c17
Prevent deadlock for long named partitioned index creation on single node (#4461)
* Prevent deadlock for long named partitioned index creation on single node

* Create IsSingleNodeCluster function

* Use both local and sequential execution
2021-01-05 13:39:13 +03:00
Ahmet Gedemenli f27649754b
Add alter index set statistics support (#4455)
* Add alter index set statistics support

* Use attNum instead of attName
2021-01-05 13:23:11 +03:00
Onur Tirtir 87e5276bdd
Fix fkey graph test for self reference (#4450) 2020-12-28 12:47:39 +03:00
Halil Ozan Akgül a8626d1944
Fixes the table used in the error message (#4449) 2020-12-25 16:48:50 +03:00
Naisila Puka 04aeb6938b
Merge branch 'master' into issue4237 2020-12-25 12:36:40 +03:00
Hadi Moshayedi a2c73bef27 Trigger metadata sync at transaction commit 2020-12-24 08:28:38 -08:00
Hadi Moshayedi 4df723cf9b Do metadata sync in a separate background worker. 2020-12-24 08:25:55 -08:00
Naisila Puka 0bb2c991f9
Merge branch 'master' into issue4237 2020-12-24 18:05:27 +03:00
Ahmet Gedemenli 5af585269a Add separate pg13 test for stats targets 2020-12-24 18:01:25 +03:00
naisila 59a81491e8 Add test for master_create_empty_shard on coordinator 2020-12-24 17:59:40 +03:00
Ahmet Gedemenli d4bc17f6f0 Propagate statistics with altered targets 2020-12-24 17:10:12 +03:00
Ahmet Gedemenli f7c70f9a63 Propagate alter stats target 2020-12-24 17:10:12 +03:00
Ahmet Gedemenli 5a1607b6c0 Propagate alter stats schema 2020-12-24 17:10:12 +03:00
Ahmet Gedemenli bdce4a7e67 Propagate rename statistics 2020-12-24 17:10:12 +03:00
Onur Tirtir 5ed9197041
Implement infra to get foreign key connected relations (#4439)
On top of our foreign key graph, implement the infrastructure to get the
list of relations that are connected to the input relation via the
foreign key graph.
We need this to support cascading create_citus_local_table &
undistribute_table operations.

Also add regression tests to see what our foreign key graph is able to
capture currently.
2020-12-24 16:42:40 +03:00
Onur Tirtir 57e7defa3c
Support CREATE INDEX commands without index name on citus tables (#4273) 2020-12-23 23:15:39 +03:00
Marco Slot e3dcc278e0 Remove upgrade_to_reference_table UDF 2020-12-23 00:40:14 +01:00
jeff-davis 90d63cb792
Add columnar pg_dump test. (#4433) 2020-12-22 15:57:35 -08:00
Ahmet Gedemenli 874fa1fc09 Propagate Drop Statistics 2020-12-22 18:34:46 +03:00
Marco Slot 321cc784c7 Collapse Citus 7.* scripts into Citus 8.0-1 2020-12-21 22:55:51 +01:00
Hadi Moshayedi dde0323b57
Columnar: enable zstd & lz4 compilation by default (#4402)
* Columnar: enable zstd & lz4 compilation by default

* Make zstd & lz4 tests more consistent

* Don't require lz4 & zstd for postgres 11

Co-authored-by: Nils Dijk <nils@citusdata.com>
2020-12-21 12:11:58 -08:00
Onur Tirtir cceaf31e4c
Add some more tests with views to test recursive planning on views (#4427)
(cherry picked from commit 51f422f3c6)
2020-12-21 11:53:37 +03:00
jeff-davis 49281202af
Add simple follower test for columnar. (#4432) 2020-12-18 13:59:20 -08:00
jeff-davis 3e0f1aaaab
Prevent inserting into logically-replicated columnar table. (#4429) 2020-12-18 12:29:30 -08:00
Marco Slot f2056e553f
Expose partition column of subqueries in optimizer (#4355)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2020-12-18 20:32:52 +01:00
SaitTalhaNisanci 145112f3a0
Fix attribute numbers in subquery conversions (#4426)
The attribute number in a subquery RTE and in a relation RTE means
different things. In a relation RTE, the attribute number points to the
column number in the table definition, including dropped columns; in a
subquery RTE, it means the index in the target list. When we convert a
relation RTE to a subquery RTE, we should either correct all the
relevant attribute numbers or just add a dummy column for the dropped
columns. We choose the latter in this commit because updating all the
Vars in a query is practically too error-prone.

Another thing this commit fixes: in case a join restriction clause list
contains a false clause, we should just return a false clause instead of
the whole list, because the whole list would contain restrictions from
other RTEs as well and break the query. This can be seen from the output
changes; they are now much simpler.

Also, instead of adding individual tests for dropped columns, we chose
to run the whole set of mixed queries with tables with dropped columns;
this already revealed some bugs, which are fixed in this commit.
2020-12-18 20:25:41 +03:00
Nils Dijk a748729998
rework ci 2020-12-18 18:04:45 +01:00
Ahmet Gedemenli 770d3da1ca Add dependencies for stat schemas 2020-12-18 17:04:13 +03:00
Ahmet Gedemenli 6c0465566a Propagate create statistics 2020-12-17 20:38:36 +03:00
Marco Slot 1e2518f83c
Add tests for router queries with catalog tables (#4422)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2020-12-17 15:07:50 +01:00
Marco Slot 100e5d3196 Address review feedback 2020-12-15 15:23:38 +01:00
Marco Slot 23dccd8941 Add some new tests for complex correlated subqueries in WHERE 2020-12-15 14:17:16 +01:00
Marco Slot 707a6554b1 Support co-located/recurring correlated subqueries 2020-12-15 14:17:16 +01:00
Sait Talha Nisanci 181a7e1d36 Skip dropped columns 2020-12-15 18:18:36 +03:00
Sait Talha Nisanci 7951273f74 Refactor WrapRteRelationIntoSubquery 2020-12-15 18:18:36 +03:00
Sait Talha Nisanci 0e53aa5d3b Add more tests 2020-12-15 18:18:36 +03:00
Sait Talha Nisanci c4d3927956 Not allow local table updates with remote citus local tables 2020-12-15 18:18:36 +03:00
Sait Talha Nisanci f5dd5379b2 Add more tests 2020-12-15 18:18:36 +03:00
Sait Talha Nisanci f7c1509fed Not check if the query is routable for converting
It seems that there are only very few cases where that is useful, and
for now we prefer not having that check. This means that we might
perform some unnecessary checks, but that should be rare and not
performance critical.
2020-12-15 18:18:36 +03:00
Sait Talha Nisanci 1d82972ff4 Increase the performance with a trick
Instead of sending NULLs over the network, we now convert the subqueries
into the form:

SELECT t.a, NULL, NULL FROM (SELECT a FROM table) t;

And we recursively plan the inner part so that we don't send the NULLs
over the network. We still need the NULLs in the outer subquery because
we currently don't have an easy way of updating all the necessary places
in the query.

Add some documentation for how the conversion is done
2020-12-15 18:18:36 +03:00
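To make the trick concrete, a sketch under hypothetical names (dist has columns a and b, but only a is needed for the join):

    -- instead of shipping every column of dist as NULLs, only the
    -- needed column is recursively planned:
    SELECT *
    FROM local_t
    JOIN (SELECT t.a, NULL::text AS b FROM (SELECT a FROM dist) t) d
        USING (a);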
Sait Talha Nisanci 3aed6c3ad0 Rename containsOnlyLocalTable as isLocalTableModification
Update error message in Modify View
2020-12-15 18:18:36 +03:00
Sait Talha Nisanci 13c43d5744 Improve table conversion logic in dist-local joins 2020-12-15 18:18:36 +03:00
Sait Talha Nisanci 5618f3a3fc Use BaseRestrictInfo for finding equality columns
BaseRestrictInfo also has pushed-down filters etc., so it makes more
sense to use BaseRestrictInfo to determine which columns have constant
equality filters.

Also RteIdentity is used for removing conversion candidates instead of
rteIndex.
2020-12-15 18:18:36 +03:00
Sait Talha Nisanci 69992d58f9 Add broken local-dist table modifications tests
It seems that most of the updates were broken; we weren't aware of it
because there wasn't any data in the tables. They are broken mostly
because local tables do not have a shard id and some code paths need to
be updated with that information: currently, when there is an invalid
shard id, it is assumed to be pruned.

Consider local tables in router planner

In case there is a local table, the shard id will not be valid, and
some checks rely on the shard id; we should skip these for local
tables, which is handled with a dummy placement.

Add citus local table dist table join tests

add local-dist table mixed joins tests
2020-12-15 18:18:36 +03:00
Sait Talha Nisanci 2a44029aaf Simplify ContainsTableToBeConvertedToSubquery
AllDataLocallyAccessible and ContainsLocalTableSubqueryJoin are removed.
We can possibly remove ModifiesLocalTableWithRemoteCitusLocalTable as
well. This removal has a side effect, though: now, even when all the
data is locally available, we could still wrap a relation into a
subquery; that should probably be resolved in the router planner itself.

Add more tests
2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 26d9f0b457 Use auto mode in tests and fix debug message 2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 3bd53a24a3 Support update on postgres table from citus local table 2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 4b6611460a Support foreign table joins as well 2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 7e9204eba9 Update vars in quals while wrapping RTE to subquery
When we wrap an RTE in a subquery, we update the variables' varnos to 1;
however, we should also update the varnos of the Vars in quals.

Also some other small code quality improvements are done.
2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 0689f2ac1a Recursively plan distributed tables only if all have unique filters
The previous algorithm was not consistent: it could convert different
RTEs depending on the table order in the query. Now we convert local
tables if there is a distributed table which doesn't have a unique
index. So if there are 4 tables, local1, local2, dist1, and
dist2_with_pkey, then we will convert local1 and local2 in `auto` mode.
Converting a distributed table is not logical here, because as long as
there is a distributed table without a unique index, we will need to
convert the local tables anyway; converting the distributed table with
a pkey is then redundant.
2020-12-15 18:17:10 +03:00
Sait Talha Nisanci a008fc611c Support materialized view joins as well 2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 5f46abffd9 Update check multi tests 2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 3fe3c55023 Use ShouldConvertLocalTableJoinsToSubqueries
Remove FillLocalAndDistributedRTECandidates and use
ShouldConvertLocalTableJoinsToSubqueries, which simplifies things as we
rely on a single function to decide whether we should continue
converting RTE to subquery.
2020-12-15 18:17:10 +03:00
Sait Talha Nisanci eebcd995b3 Add some more tests 2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 5693cabc41 Not convert an already routable plannable query
We should not recursively plan an already router-plannable query. An
example of this is (SELECT * FROM local JOIN (SELECT * FROM dist) d1
USING(a));

So we let the recursive planner do all of its work, and at the end we
convert the final query to handle unsupported joins. While doing each
conversion, we check if the query is router plannable; if so, we stop.

Only consider range table entries that are in jointree

If a range table is not in the jointree, then there is no point in
considering it, because we are trying to convert range table entries to
subqueries for the join use case. 2020-12-15 18:17:10 +03:00
2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 2ff65f3630 Enable partitioned distributed tables in local-dist table joins 2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 44953579cf Enable citus-local distributed table joins
Check equality in quals

We want to recursively plan distributed tables only if they have an
equality filter on a unique column. So '>' and '<' operators will not
trigger recursive planning of distributed tables in local-distributed
table joins.

Recursively plan distributed table only if the filter is constant

If the filter is not a constant, then the join might return multiple
rows and there is a chance that the distributed table will return a
huge amount of data. Hence, if the filter is not constant, we choose to
recursively plan the local table.
2020-12-15 18:17:10 +03:00
Sait Talha Nisanci f3d55448b3 Choose distributed table if it has a unique index in filter
When doing local-distributed table joins, we convert one of them to a
subquery. The current policy is that we convert the distributed table
to a subquery if it has a unique index on a column that has an equality
filter (a primary key also provides a unique index).
2020-12-15 18:17:10 +03:00
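A sketch of the policy (names hypothetical): with an equality filter on dist_t's primary key, dist_t can return at most one row, so it is the side wrapped into a subquery:

    CREATE TABLE local_t (a int, payload text);
    CREATE TABLE dist_t (a int PRIMARY KEY, payload text);
    SELECT create_distributed_table('dist_t', 'a');
    -- dist_t is recursively planned; local_t stays in place
    SELECT * FROM local_t JOIN dist_t USING (a) WHERE dist_t.a = 42;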
Onder Kalaci f0aef67ed2 Update existing regression tests 2020-12-15 18:17:10 +03:00
Onder Kalaci 3f4952cc2b Pushdown projections when relations are recursively planned
This is important to limit the data transfer size.
2020-12-15 18:17:10 +03:00
Onder Kalaci 945193555b add basic regression tests 2020-12-15 18:17:10 +03:00
Onder Kalaci 594e001f3b Add filter pushdown regression tests
Also handle WHERE false
2020-12-15 18:17:10 +03:00
Onder Kalaci 82a4830c7d Adjust the existing regression tests 2020-12-15 18:17:10 +03:00
Marco Slot f2538a456f Support co-located/recurring sublinks in the target list 2020-12-13 15:45:24 +01:00
Marco Slot 8e8adcd92a Harden citus_tables against node failure 2020-12-13 15:10:40 +01:00
Hadi Moshayedi 4dd22cc4e4 Columnar: Fix ANALYZE for large number of rows. 2020-12-10 09:52:33 -08:00
Hadi Moshayedi b3dac5e9d1 Columnar: set default compression as zstd if available 2020-12-09 14:32:08 -08:00
Hadi Moshayedi 4668fe51a6 Columnar: Make compression level configurable 2020-12-09 08:48:50 -08:00
Hadi Moshayedi f5a4a4bc74 Columnar: Support zstd compression 2020-12-09 08:30:55 -08:00
Hadi Moshayedi 3f81ee26fd Columnar: Support LZ4 compression 2020-12-09 08:29:07 -08:00
jeff-davis 260a02180b
Add tests for unsupported columnar storage features (#4397)
Add negative tests:
 * Deletes
 * Sample scan
 * Special columns
 * Tuple locks
 * Indexes
2020-12-09 00:08:45 -08:00
Jeff Davis c91e5b052b more test fixups 2020-12-07 13:43:27 -08:00
Jeff Davis 7169ba21c4 more test fixes 2020-12-07 13:36:46 -08:00
Jeff Davis e26fdeb706 fixup tests some more 2020-12-07 13:22:16 -08:00
Jeff Davis 5b3c32eb38 fixup tests 2020-12-07 13:18:22 -08:00
Jeff Davis 068af7f38e fixup upgrade tests 2020-12-07 13:11:51 -08:00
Jeff Davis 3758e83850 Rename cstore->columnar in SQL objects and errors. 2020-12-07 13:01:53 -08:00
Jeff Davis ad919ff220 Tests for UPDATE and error message improvement.
UPDATEs on partitioned tables that affect only row partitions should
succeed; the rest should fail.

Also rename CStoreScan to ColumnarScan to make the error message more
relevant.
2020-12-07 11:25:30 -08:00
Ahmet Gedemenli 936775e8e3 Delete transactions when removing node
With this commit, we delete entries in pg_dist_transaction
for the primary nodes that are removed by `master_remove_node`.
2020-12-07 11:35:20 +03:00
Hadi Moshayedi 01da2a1c73 Columnar: track decompressed length in metadata 2020-12-04 09:09:39 -08:00
Onder Kalaci bd9827aed9 Add regression tests with different data types
We typically do not test Citus with these uncommon
data types. Now, we already have the tests for ADF
integration, add it to regression tests as well.
2020-12-04 10:25:00 +03:00
Hadi Moshayedi 4a9aebaa7b Columnar: rename block to chunk 2020-12-03 08:50:19 -08:00
Hadi Moshayedi 24bfd368a9 Columnar: Fix VACUUM for empty tables 2020-12-03 08:46:09 -08:00
Marco Slot c9b658daea Add a public.citus_tables view 2020-12-03 17:31:40 +01:00
Marco Slot 4098d33acb Allow citus size functions on replicated tables 2020-12-03 16:33:24 +01:00
Marco Slot c69ea2512a Fix flappy failure test 2020-12-03 13:54:02 +01:00
Onder Kalaci c546ec5e78 Local node connection management
When Citus needs to parallelize queries on the local node (e.g., the node
executing the distributed query also holds the shards), we need to be
mindful about connection management. The reason is that the client
backends running distributed queries compete for slots in
max_connections with the client backends that Citus initiates to
parallelize the queries.

In that regard, we implemented a "failover" mechanism: if the distributed
queries cannot get a connection, the execution fails the tasks over to
local execution.

The failover logic is as follows:

- Ask the connection manager if it is OK to get a connection
	- If yes, we are good.
	- If no, we fail the workerPool, and the failure triggers
	  the failover of the tasks to the local execution queue

The decision on getting a connection is as follows:

/*
 * For local nodes, solely relying on citus.max_shared_pool_size or
 * max_connections might not be sufficient. The former gives us
 * a preview of the future (e.g., we let the new connections to establish,
 * but they are not established yet). The latter gives us the close to
 * precise view of the past (e.g., the active number of client backends).
 *
 * Overall, we want to limit both of the metrics. The former limit typically
 * kicks in under regular loads, where the load of the database increases at
 * a reasonable pace. The latter limit typically kicks in when the database
 * is issued lots of concurrent sessions at the same time, such as benchmarks.
 */
2020-12-03 14:16:13 +03:00
Hadi Moshayedi c2f60b6422
Columnar: pg_upgrade support (#4354) 2020-12-02 08:46:59 -08:00
Ahmet Gedemenli 5242dcfe99 Add tests for propagating alter schema rename 2020-12-02 15:18:26 +03:00
Nils Dijk 6f9c040f76
DESCRIPTION: Propagate columnar table settings for distributed tables
When distributing a columnar table, as well as changing options on a distributed columnar table, this patch will forward the settings from the coordinator to the workers.

For propagating options changes on an already distributed table this change is pretty straight forward. Before applying the change in options locally we will create a `DDLJob` that contains a call to `alter_columnar_table_set(...)` for every shard placement with all settings of the current table. This goes both for setting an option as well as resetting. This will reset the values to the defaults configured on the coordinator. Having the effect that the coordinator is authoritative on the settings and makes sure the shards have the same settings set as the table on the coordinator.

When a columnar table is distributed, it uses the `TableDDLCommand` infrastructure to create a new kind of `TableDDLCommand`. This new type, called a `TableDDLCommandFunction`, contains a context and 2 function pointers to execute. One function returns the command as applied on the table; the second function returns the sql command to apply to a shard with a given shard id. The schema name is ignored, as it will use the fully qualified name of the shard in the same schema as the base table.
2020-12-02 13:02:42 +01:00
Halil Ozan Akgül ef0914a7f8
Adds ORDER BY to flaky test (#4305)
Co-authored-by: Önder Kalacı <onder@citusdata.com>
2020-12-02 14:24:05 +03:00
Onder Kalaci f7e1aa3f22 Multi-row INSERTs use local execution when placements are local
Multi-row execution already uses sequential execution. When shards
are local, using local execution is profitable as it avoids
an extra connection establishment to the local node.
2020-12-01 21:37:59 +03:00
Marco Slot 04cffdd925 Run master_copy_shard_placement separately 2020-11-30 20:34:03 +01:00
Marco Slot 48caca4084 Improve regression test settings 2020-11-30 20:34:03 +01:00
Ahmet Gedemenli 8e5f0487eb Add order by for flaky test 2020-12-01 10:54:52 +03:00
Ahmet Gedemenli 67761897ab Add test for citus table size func in transaction with modification
Add test for citus_relation_size
2020-12-01 10:38:15 +03:00
Hadi Moshayedi feecb7b423
Columnar: few fixes (#4371)
* Columnar: fix a memory issue

* Columnar: no need for deferred triggers

* Columnar: relax memory growth constraints
2020-11-30 18:09:43 -08:00
Hadi Moshayedi a94e8c9cda
Associate column store metadata with storage id (#4347) 2020-11-30 18:01:43 -08:00
Marco Slot ecbc1ab008 Run subquery_prepared_statements by itself 2020-11-30 08:53:06 +01:00
Sait Talha Nisanci 8b0aed521f Isolate join test
The join test gets "too many clients" errors too frequently, hence we
should not run anything concurrently with it. Hopefully this will fix
the flakiness of the test.
2020-12-01 00:00:17 +03:00
SaitTalhaNisanci c31a8df380
Call 6 times not 7 in subquery_prepared_statements (#4357) 2020-11-30 21:20:51 +03:00
SaitTalhaNisanci 8c3dd6338e
Run pg12 and pg13 separately (#4352)
It seems that sometimes we get `too many clients` errors with this set
of parallel tests, hence two of them are separated.
2020-11-30 19:32:49 +03:00
Hadi Moshayedi 7f43804dae
Normalize VACUUM VERBOSE output (#4353)
This is to avoid flaky changes like the following in test outputs:

-CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
+CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.02 s.
2020-11-27 12:07:25 -08:00
Nils Dijk 383e334023
refactor options to their own table linked to the regclass (#4346)
Columnar options were by accident linked to the relfilenode instead of the regclass/relation oid. This PR moves everything related to columnar options to their own catalog table.
2020-11-27 11:22:08 -08:00
Onder Kalaci 629ecc3dee Add the infrastructure to count the number of client backends
Considering the adaptive connection management
improvements that we plan to roll out soon, it is
very helpful to know the number of active client
backends.

We are doing this addition to simplify the adaptive connection
management for single node Citus. In single node Citus, both the
client backends and Citus parallel queries would compete to get
slots on Postgres' `max_connections` on the same Citus database.

With adaptive connection management, we have the counters for
Citus parallel queries. That helps us to adaptively decide
on the remote executions pool size (e.g., throttle connections
if necessary).

However, we do not have any counters for the total number of
client backends on the database. For single node Citus, we
should consider all the client backends, not only the remote
connections that Citus makes.

Of course Postgres internally knows how many client
backends are active. However, to get that number Postgres
iterates over all the backends. For example, see [pg_stat_get_db_numbackends](8e90ec5580/src/backend/utils/adt/pgstatfuncs.c (L1240))
where Postgres iterates over all the backends.

For our purposes, we need this information on every connection
establishment. That's why we cannot afford to do this kind of
iteration.
2020-11-25 19:19:24 +01:00
Ahmet Gedemenli a64dc8a72b Fixes a bug preventing INSERT SELECT .. ON CONFLICT with a constraint name on local shards
Separate search relation shard function

Add tests
2020-11-25 15:10:46 +03:00
Onder Kalaci 7accbff3f6 Do not cache all the distributed table metadata during CitusTableTypeIdList()
The CitusTableTypeIdList() function iterates over all the entries of
pg_dist_partition and loads all the metadata into the cache. This can be
quite memory intensive, especially when there are lots of distributed tables.

When partitioned tables are used, it is common to have many distributed tables
given that each partition also becomes a distributed table.

CitusTableTypeIdList() is used on every CREATE TABLE .. PARTITION OF.. command
as well. It means that, anytime a partition is created, Citus loads all the
metadata to the cache. Note that Citus typically only loads the accessed table's
metadata to the cache.
2020-11-24 17:44:06 +01:00
Önder Kalacı c760cd3470
Move local execution after remote execution (#4301)
* Move local execution after the remote execution

Before this commit, when both local and remote tasks
exist, the executor was starting the execution with
local execution. There are no strict requirements on
this.

Especially considering the adaptive connection management
improvements that we plan to roll out soon, moving local
execution after remote execution makes more sense.

The adaptive connection management for single node Citus
would look roughly as follows:

   - Try to connect back to the coordinator for running
     parallel queries.
        - If it succeeds, go on and execute tasks in parallel
        - If it fails, fall back to local execution

So, we'll use local execution as a fallback mechanism. And moving it
after remote execution allows us to implement such further scenarios.
2020-11-24 13:43:38 +01:00
Hadi Moshayedi 40b52ab757 Fix memory leaks in column store 2020-11-23 11:26:12 -08:00
Jeff Davis ba6ec610e2 address review comment 2020-11-20 10:03:12 -08:00
Jeff Davis 8cee2b092b remove columnar FDW code 2020-11-20 10:03:12 -08:00
Onder Kalaci c433c66f2b Do not execute subplans multiple times with cursors
Before this commit, we let AdaptiveExecutorPreExecutorRun()
take effect multiple times, on every FETCH on cursors.
That does not affect the correctness of the query results,
but adds significant overhead.
2020-11-20 10:43:56 +01:00
Hadi Moshayedi b182a95389 Fix ALTER COLUMN ... SET TYPE for columnar 2020-11-19 15:36:45 -08:00
Jeff Davis cef1d0e915 fixup test output 2020-11-19 12:45:52 -08:00
Jeff Davis 91015deb9d rename UDFs also 2020-11-19 12:27:40 -08:00
Jeff Davis a2b698a766 rename cstore_tableam -> columnar 2020-11-19 12:15:51 -08:00
SaitTalhaNisanci 9c44911226
Improve error messages in shard pruning (#4324) 2020-11-18 17:16:06 +03:00
Hadi Moshayedi 2747fd80ff Add prepared materialized view tests for columnar 2020-11-17 20:13:20 -08:00
Hadi Moshayedi 6711340ea6 Add prepared xact & stmt tests for columnar 2020-11-17 20:00:57 -08:00
Hadi Moshayedi 97cba2d5b6 Implements write state management for tuple inserts.
The TableAM API doesn't allow us to pass around a state variable along all of the tuple inserts belonging to the same command. We require this in the columnar store, since we batch them, and when we have enough rows we flush them as stripes.

To do that, we keep a (relfilenode) -> stack of (subxact id, TableWriteState) global mapping.

**Inserts**

Whenever we want to insert a tuple, we look up the relation's relfilenode in this mapping. If the top of the stack matches the current subtransaction, we use the existing TableWriteState. Otherwise, we allocate a new TableWriteState and push it on top of the stack.

**(Sub)Transaction Commit/Aborts**

When the subtransaction or transaction is committed, we flush and pop all entries matching current SubTransactionId.

When the subtransaction or transaction is aborted, we pop all entries matching the current SubTransactionId and discard them without flushing.

**Reads**

Since we might have unwritten rows which need to be read by a table scan, we flush write states on SELECTs. Since flushing the write state of upper transactions in a subtransaction would cause metadata to be written in the wrong subtransaction, we ERROR out if any of the upper subtransactions have unflushed rows.

**Table Drops**

We record in which subtransaction the table was dropped. When committing a subtransaction in which the table was dropped, we propagate the drop to the upper transaction. When aborting a subtransaction in which the table was dropped, we mark the table as not deleted.
2020-11-17 12:07:16 -08:00
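The user-visible semantics can be sketched with savepoints (table name hypothetical; assumes the columnar access method):

    CREATE TABLE c (a int) USING columnar;
    BEGIN;
    INSERT INTO c VALUES (1);    -- buffered in the top-level write state
    SAVEPOINT s1;
    INSERT INTO c VALUES (2);    -- buffered in s1's write state
    ROLLBACK TO SAVEPOINT s1;    -- s1's entry is popped and discarded
    RELEASE SAVEPOINT s1;        -- back in the top-level transaction
    SELECT count(*) FROM c;      -- the read flushes the write state; returns 1
    COMMIT;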
Nils Dijk 725f4a37d0
change configure to not have options 2020-11-17 19:01:54 +01:00
Nils Dijk 22df8027b0
add extra output for multi_extension targeting pg11 2020-11-17 19:01:54 +01:00
Nils Dijk 7c891a01a9 create missing objects during upgrade path 2020-11-17 19:01:51 +01:00
Nils Dijk 2987535172
add pg upgrade tests verifying table am is created 2020-11-17 18:55:36 +01:00
Nils Dijk d065bb495d
Prepare downgrade script and bump development version to 10.0-1 2020-11-17 18:55:35 +01:00
Nils Dijk b6d4a1bbe2
fix style 2020-11-17 18:55:35 +01:00
Nils Dijk 3bb6554976
make tests run 2020-11-17 18:55:35 +01:00
Nils Dijk f89bd3eeb5
move columnar test files 2020-11-17 18:55:34 +01:00
SaitTalhaNisanci 34de1f645c
Update failure test dependencies (#4284)
* Update failure test dependencies

There was a security alert for cryptography. The vulnerability was fixed
in 3.2.0. The vulnerability:

"RSA decryption was vulnerable to Bleichenbacher timing vulnerabilities,
which would impact people using RSA decryption in online scenarios."

The fix:
58494b41d6

It wasn't enough to only update cryptography because mitmproxy was
incompatible with the new version, so mitmproxy is also upgraded.

The steps to run locally:
python -m pip install -U cryptography
python -m pip install -U mitmproxy
2020-11-17 19:16:08 +03:00
Onur Tirtir 5e3dc9d707 Bump citus version to 10.0devel 2020-11-09 13:16:54 +03:00
Onur Tirtir 5d5966f700
Fix a flaky test in mixed_relkind_tests (#4300) 2020-11-06 14:53:30 +03:00
Onder Kalaci e0d2ac7620 Do not rely on set_rel_pathlist_hook for finding local relations
When a relation is used in an OUTER JOIN with FALSE filters,
set_rel_pathlist_hook may not be called for the table.

There might be other cases as well, so do not rely on the hook
for classification of the tables.
2020-11-06 11:14:30 +01:00
Onur Tirtir 0556952607
Normalize partitioned table aliases in explain output (#4295)
Aliases that postgres chooses for partitioned tables in explain output
might change across pg versions, so normalize them and remove
the alternative test output
2020-11-06 10:44:01 +03:00
Onur Tirtir d912d4bc38
Print full file path in valgrind testing (#4299) 2020-11-06 10:26:53 +03:00
Onur Tirtir cc8be422ce
Fix relkind checks in planner for relkinds other than RELKIND_RELATION (#4294)
We were qualifying relations with relkind != RELKIND_RELATION as
non-relations due to the strict checks around RangeTblEntry->relkind
in planner.
2020-11-05 14:21:02 +03:00
Hanefi Önaldı d6f19e2298
Honor error message conventions 2020-11-03 18:11:18 +03:00
Hanefi Önaldı 85a4b61a0e
Prevent undistribute_table calls for partitions 2020-11-03 18:10:20 +03:00
Hanefi Önaldı 5db380f33a
Prevent undistribute_table calls for foreign tables 2020-11-03 17:33:29 +03:00
Halil Ozan Akgul 77b3be8b6d Turn RelOptInfos into their only used field, relids, to be able to copy them 2020-10-22 13:42:28 +03:00
Onur Tirtir 790beea59f
Add intermediate result tests with unsupported outer joins (#4262) 2020-10-20 12:11:18 +03:00
SaitTalhaNisanci 0f209377c4
Fix incorrect join related fields (#4242)
* Fix incorrect join related fields

Ruleutils expects the original index of join columns, hence we
should consider the dropped columns while setting the fields in
SetJoinRelatedFieldsCompat.

* add some more tests for joins

* Move tests to join.sql and create a utility function
2020-10-19 18:28:39 +03:00
Onur Tirtir c49077d594
Disallow outer joins `ON TRUE` with ref & dist tables when ref table is outer relation (#4255)
Disallow `ON TRUE` outer joins with reference & distributed tables
when the reference table is the outer relation, by fixing a logic bug
in the call to the `LeftListIsSubset` function.

Also, be more defensive when removing duplicate join restrictions
when the join clause is empty for non-inner joins, as they might still
contain useful information.
2020-10-19 16:58:11 +03:00
Onder Kalaci bbedfca761 Improve the relation restriction counters
It seems like Postgres could call set_rel_pathlist() for
the same relation multiple times. This breaks the logic
where we assume relationCount equals the number of
entries in relationRestrictionList.

In summary, relationRestrictionList may contain duplicate
entries.
2020-10-19 08:51:16 +02:00
Hadi Moshayedi 663549db33 Set explicit transfer_mode in tableam tests 2020-10-16 12:40:37 -07:00
Nils Dijk caabbf4b84 Table access method support for distributed tables 2020-10-16 12:02:25 -07:00
Marco Slot 8976f245ab Support reference table view in reference table modification 2020-10-16 11:31:24 +02:00
Onder Kalaci 596f7bf4a9 Add more regression test for single node Citus
Tests on commands with SCHEMA.
2020-10-15 17:32:32 +02:00
Onder Kalaci fe3caf3bc8 Local execution considers intermediate result size limit
With this commit, we make sure that local execution accounts for the
intermediate result size just as distributed execution does. Plus,
it enforces the citus.max_intermediate_result_size value.
2020-10-15 17:18:55 +02:00
Marco Slot 31858c8a29 Check table existence in EnsureRelationKindSupported 2020-10-15 17:05:06 +02:00
Onder Kalaci 15e724c073 Add regression tests for outer/cross JOINs 2020-10-14 15:17:30 +02:00
Onder Kalaci de33079065 Improve outer join checks
Before this commit, the logic was:
    - As long as the outer side of the JOIN is not a JOIN (e.g., relation
      or subquery etc.), we check for the existence of any recurring
      tuples. There were two implications of this decision.

      First, even if a subquery on the outer side contains a
      distributed table JOINed with a reference table, Citus would
      unnecessarily throw an error. Note that the JOIN inside the
      subquery would already be tested recursively; as long as that
      check passes, there is no reason for the upper JOIN to fail. An
      example, which used to fail and now works:

	SELECT * FROM (SELECT * FROM dist JOIN ref) as foo LEFT JOIN dist;

      Second, certain JOINs, especially those with ON (true)
      conditions, were not represented in the format that
      DeferredErrorIfUnsupportedRecurringTuplesJoin() expects.
2020-10-14 15:17:30 +02:00
Onur Tirtir 1a28858c47
Disallow field indirection in INSERT/UPDATE queries (#4241) 2020-10-14 14:11:59 +03:00
Onur Tirtir 8efca3b60a
Fix a crash with inserting domain composite types in coord. evaluation (#4231)
Use a short-lived per-tuple context in citus_evaluate_expr, like
(pg) evaluate_expr does.

We should not use planState->ExprContext when evaluating expressions,
as it might lead to freeing the same executor twice (the first free
happens in citus_evaluate_expr itself and the other happens when
postgres does clean-up for the top-level executor state), which in turn
might cause segfaults.

However, as we then don't have the necessary planState info to evaluate
prepared statements, we also add planState->es_param_list_info to the
per-tuple ExprContext.
2020-10-13 14:19:59 +03:00
Halil Ozan Akgul e2736c25bd Adds support for WITH TIES option 2020-10-12 19:34:18 +03:00
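A minimal sketch of the option on a distributed table (names hypothetical; WITH TIES itself is PG13 syntax):

    SELECT player, score
    FROM leaderboard
    ORDER BY score DESC
    FETCH FIRST 3 ROWS WITH TIES;  -- also returns rows tied with the 3rd score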
Sait Talha Nisanci dc40758355 Return early if there is no citus table in VACUUM 2020-10-09 11:10:00 +03:00
Sait Talha Nisanci 99bb79745a Commit transaction for VACUUM on shell table
With postgres 13, there is a global lock that prevents multiple VACUUMs
happening in the current database. This global lock is taken for a short
time but this creates a problem because of the following:

- We execute the VACUUM for the shell table through the standard process
utility. In this step the global lock is taken for the current database.
- If the current node has shard placements then it tries to execute
VACUUM over a connection to localhost with ExecuteUtilityTaskList.
- The VACUUM on shard placements cannot proceed because it is waiting
for the global lock for the current database to be released.
- The acquired lock from the VACUUM for shell table will not be released
until the transaction is committed.
- So there is a deadlock.

As a solution, we commit the current transaction in case of VACUUM after
the VACUUM is executed for the shell table. Executing the VACUUM on a
shell table is not important because the data there will probably be
truncated. PostprocessVacuumStmt takes the necessary locks on the shell
table so we don't need to take any extra locks after we commit the
current transaction.
2020-10-09 10:57:44 +03:00
Marco Slot 881e5df780 Fix a bug that could lead to multiple maintenance daemons 2020-10-08 16:18:14 +02:00
Marco Slot 18219843d0 Add maintenance daemon error tests 2020-10-08 16:17:33 +02:00
Marco Slot dbc348b7e0 Create sequence dependency during metadata syncing 2020-10-06 10:57:39 +02:00
Marco Slot 9bba8bb4e8 Remove master_drop_sequences 2020-10-06 10:57:33 +02:00
Sait Talha Nisanci 078dcae18c Write settings to postgres configuration file directly
In our test setup, we have been passing postgres configurations from
the terminal, which causes problems once they hit a certain length:
the server cannot start, and understanding why it failed is not
easy because there isn't a nice error message.

This commit changes this to write the settings directly to the postgres
configuration file. This way we can add as many postgres settings as we
want to without needing to worry about the length problem.
2020-10-05 22:09:08 +03:00
Onur Tirtir 2cd0a69dfb
Fix multi-row & router INSERT crash with local exec. when def. cols not specified (#4197)
Multi-row & router INSERTs were crashing with local execution if at
least one of the DEFAULT columns was not specified in the VALUES list.

This was because the changes we make on query->values_lists and
query->targetList were sufficient for deparsing a given INSERT for
remote execution but not sufficient for local execution.

With this commit, DEFAULT value normalization for multi-row & router
INSERTs is fixed by adding dummy column references for unspecified
DEFAULT columns.
2020-10-05 10:45:17 +03:00
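A sketch of the previously crashing shape (names hypothetical):

    CREATE TABLE t (a int, b int DEFAULT 42);
    SELECT create_distributed_table('t', 'a');
    -- b is unspecified, so its DEFAULT must be normalized for local
    -- execution too; this used to crash when the placements were local
    INSERT INTO t (a) VALUES (1), (2);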
Hanefi Önaldı 6d8e83d24f
Replace worker_hash calls with partkey IS NOT NULL filters 2020-10-02 18:16:24 +03:00
Önder Kalacı df5aa0f0cc
Switch to sequential execution if the index name is long (#4209)
Citus has the logic to truncate the long shard names to prevent
various issues, including self-deadlocks. However, for partitioned
tables, when index is created on the parent table, the index names
on the partitions are auto-generated by Postgres. We use the same
Postgres function to generate the index names on the shards of the
partitions. If the length exceeds the limit, we switch to sequential
execution mode.
2020-10-02 13:39:34 +03:00
Ahmet Gedemenli 70e9edb4f2 Add subplan test with insert 2020-10-01 13:58:55 +03:00
Jelte Fennema 13ef8252e7 Add broken distributed subplan test 2020-10-01 13:52:42 +03:00
Ahmet Gedemenli 3357eea46b Add regression tests for PG13 WAL 2020-10-01 13:52:42 +03:00
Hanefi Önaldı 9ec85f1283
Remove some pgoptions to prevent hitting bash command character limits 2020-09-30 15:04:40 +03:00
Hanefi Önaldı b0a2c1ee5c
Disallow volatile functions on single shard update queries
We currently do not support volatile functions in update/delete statements
because the function evaluation logic does not know how to distinguish
volatile functions (that need to be evaluated per row) from stable functions
(that need to be evaluated per query), and it is also not safe to push the
volatile functions down on replicated tables.
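
For instance (sketch, table name assumed), a statement like the following is now rejected:

```sql
-- random() is volatile, so it would have to be evaluated per row, and
-- pushing it down to replicated placements could make replicas diverge.
UPDATE items SET score = random() WHERE id = 5;  -- now errors out
```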
2020-09-29 15:40:21 +03:00
Marco Slot b905c8043d Fix create index concurrently crash with local execution 2020-09-25 11:49:09 +02:00
Ahmet Gedemenli abfb79bda6 Sort explain analyze output by task time
Add sort method parameter for regression tests

Fix check-style

Change sorting method parameters to enum

Polish

Add task fields to OutTask

Add test into multi_explain

Fix isolation test
2020-09-24 11:38:40 +03:00
Onur Tirtir 64d5ac6a10
Do not downgrade if a citus local table exists (#4174)
As the previous versions of Citus don't know how to handle citus local
tables, we should prevent downgrading from 9.5 to older versions if any
citus local tables exist.
2020-09-22 14:19:50 +03:00
Onder Kalaci 5d017cd123 Improve node metadata when coordinator is added
The coordinator should always be active, hasmetadata and
metadatasynced. Prevent changing those fields.
2020-09-21 14:53:41 +02:00
Onder Kalaci 6fc1dea85c Improve the robustness of function call delegation
Pushing down a CALL to the very node on which the CALL is executing is
dangerous and could lead to infinite recursion.

When the coordinator is added as a worker, Citus was preventing this by
chance: the coordinator was marked as a "not metadatasynced" node
in pg_dist_node, which prevented CALL/function delegation from happening.

With this commit, we do the following:

  - Fix metadatasynced column for the coordinator on pg_dist_node
  - Prevent pushdown of function/procedure to the same node that
    the function/procedure is being executed. Today, we do not sync
    pg_dist_object (e.g., distributed function metadata) to the
    worker nodes. But even if we did sync it, the check on function call
    delegation would still prevent the infinite recursion.
2020-09-21 14:53:30 +02:00
Onur Tirtir 1b31b22635 Refactor the functions that return OID lists for citus tables 2020-09-18 16:42:46 +03:00
SaitTalhaNisanci dae2c69fd7
Not allow removing a single node with ref tables (#4127)
* Not allow removing a single node with ref tables

We should not allow removing a node if it is the only node in the
cluster and there is data on it. We have this check for distributed
tables but we didn't have it for reference tables.

* Update src/test/regress/expected/single_node.out

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>

* Update src/test/regress/sql/single_node.sql

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2020-09-18 15:35:59 +03:00
Ahmet Gedemenli 1cf11b4632 Shorten insert_select_connection_leak_test 2020-09-18 10:07:15 +03:00
Önder Kalacı 8d3f353746
Add more tests for single node citus - distributed tables (#4166) 2020-09-17 17:50:35 +02:00
Marco Slot c9d46c618b Fix EXPLAIN ANALYZE truncation 2020-09-17 14:42:21 +02:00
Onur Tirtir d81559b7f8
Use "table" instead of "reference table" in sequential truncate log (#4164)
We might get this debug message for citus local tables as well
2020-09-17 14:37:36 +03:00
Onur Tirtir 4118560b75
Prevent citus local table creation from a catalog table (#4158) 2020-09-15 14:30:48 +03:00
Önder Kalacı e7079d1384
Add orderbys to some tests (#4162) 2020-09-14 16:59:22 +02:00
Marco Slot b82f6ee163 Add tests for distributing catalog tables 2020-09-10 04:46:11 +02:00
Marco Slot bd12555b16 Fix distributing tables owned by extensions 2020-09-10 04:46:11 +02:00
Onur Tirtir 9a56c22917
Add udf tests with citus local tables (#4154) 2020-09-11 12:36:53 +03:00
Onur Tirtir 3a73fba810 Apply planner changes for citus local tables 2020-09-09 11:51:18 +03:00
Onur Tirtir 0b1cc118a9 Adapt other cache entry changes for citus local tables 2020-09-09 11:50:55 +03:00
Onur Tirtir a58a4395ab Extend citus local table utility command support
This commit brings following features:

* Foreign key support from citus local tables to reference tables
* Foreign key support from reference tables to citus local tables
  (only with RESTRICT & NO ACTION behavior)
* ALTER TABLE ENABLE/DISABLE trigger command support
* CREATE/DROP/ALTER trigger command support

and disallows:
* ALTER TABLE ATTACH/DETACH PARTITION commands
* CREATE TABLE <postgres table> ATTACH PARTITION <citus local table>
  commands
* Foreign keys from postgres tables to citus local tables
  (the other way was already disallowed)

for citus local tables.
2020-09-09 11:50:55 +03:00
Onur Tirtir 17cc810372 Implement "citus local table" creation logic 2020-09-09 11:50:48 +03:00
Onur Tirtir ba208eae4d
Record non-distributed table accesses in local executor (#4139) 2020-09-07 18:19:08 +03:00
Hanefi Önaldı 024d398cd7
Allow distribution of functions that read from reference tables
create_distributed_function(function_name,
                            distribution_arg_name,
                            colocate_with text)

This UDF did not allow a colocate_with parameter when no
distribution_arg_name was supplied. This commit changes the behaviour to
allow a missing distribution_arg_name parameter when the function should
be colocated with a reference table.
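
A sketch of the newly allowed call, with hypothetical function and table names:

```sql
-- No distribution_arg_name given; colocating with a reference table
-- alone is now accepted.
SELECT create_distributed_function(
    'lookup_price(int)',
    colocate_with := 'prices_reference_table'
);
```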
2020-09-01 07:28:34 +03:00
Önder Kalacı 983206c5e1
Hide `citus.subquery_pushdown` flag and NOTICE when enabled (#4124)
* Hide citus.subquery_pushdown flag

This flag is dangerous and can easily let queries
return wrong results.

The flag has a very specific purpose for a very specific
data distribution and query structure. In those cases, when
the flag is set, the user can skip recursive planning altogether
*at their own risk*.

The meaning of the flag is that "I know what I'm doing such that
the query structure/data distribution is under my control, so Citus
can skip many correctness checks".

For regular users, enabling this flag is discouraged. We have to
keep the support only for backward compatibility for some users.

In addition to that, give a NOTICE to discourage new users from
using it.
2020-08-28 14:53:09 +02:00
SaitTalhaNisanci 2459ba6eca
Update docker images (#4122)
* Update and separate test images

The build image was a single one and it would contain pg11, pg12 and
pg13. Now it is separated so that we can build each pg major
independently.

Tags are used as full postgres versions so that we can know which
version we use by looking at the tag. For example exttester:11.9 would
mean we are using pg11.9.

pg11 is updated from 11.5 to 11.9.
pg12 is updated from 12rc to 12.4.

* Ignore memory usage in pg13 explain

* Use citus instead of personal repo
2020-08-26 16:23:59 +03:00
SaitTalhaNisanci 20c39fae9a
Loosen the requirement to push down a subquery with ref tables (#4110)
AllTargetExpressionsAreColumnReferences would return false if a query
had an entry that references the outer query. It seems safe to skip
this check for non-distributed tables, such as reference tables. We
already have separate checks for other cases such as having limits.
2020-08-14 12:11:15 +03:00
Hadi Moshayedi 7b74eca22d Support EXPLAIN EXECUTE ANALYZE. 2020-08-10 13:44:30 -07:00
Philip Dubé 212ae7163f Fix non deterministic collation test to work with ancient libicu versions
CentOS 7's libicu is too old for und-u-ks-level2

@colStrength=secondary works with both older & newer versions of libicu
2020-08-07 12:34:32 +00:00
Halil Ozan Akgul 375310b7f1 Adds support for table undistribution 2020-08-05 14:36:03 +03:00
Sait Talha Nisanci fe4ac51d8c Normalize "Output: .." lines since they change with pg13
Fix indentation for better readability
2020-08-04 15:38:13 +03:00
Sait Talha Nisanci d68bfc5687 Improve error for index operator class parameters
The error message when an index has opclassopts is improved, and the commit
from postgres side is also included for future reference.

Also some minor style related changes are applied.
2020-08-04 15:38:13 +03:00
Sait Talha Nisanci 288aa58603 add alternative out for pg13 test 2020-08-04 15:38:13 +03:00
Sait Talha Nisanci d0b0c88920 Changelog: error out if index has opclassopts
Error out if index has opclassopts.

Changelog entry on PG13:
Allow CREATE INDEX to specify the GiST signature length and maximum number of integer ranges (Nikita Glukhov)
2020-08-04 15:38:13 +03:00
Sait Talha Nisanci f7a1971361 Changelog: Alter type options
It seems that we don't support propagating commands related to base
types. Therefore the ALTER TYPE options don't seem to apply to us. I have
added a test to verify that we don't propagate them.

Changelog entry on pg13:
Add ALTER TYPE options useful for extensions, like TOAST and I/O functions control (Tomas Vondra, Tom Lane)
2020-08-04 15:38:11 +03:00
Sait Talha Nisanci 00633165fc Changelog: Test unicode escapes
Unicode escapes work as expected; related tests are added.

Changelog entry on PG13:
Allow Unicode escapes, e.g., E'\u####', U&'\####', to specify any character available in the database encoding, even when the database encoding is not UTF-8 (Tom Lane)
2020-08-04 15:36:30 +03:00
Sait Talha Nisanci 79dcb80140 Changelog: Test IS NORMALIZED for pg13
Tests for is_normalized and normalize are added. One thing that seems
to be due to an existing bug is that when we don't give the optional
second argument to normalize or is_normalized, it crashes, because in
the executor part the expression doesn't carry the default argument.

Changelog entry in PG-13:
Add SQL functions NORMALIZE() to normalize Unicode strings, and IS NORMALIZED to check for normalization (Peter Eisentraut)

Commit on Postgres:
2991ac5fc9b3904ca4582be6d323497d7c3d17c9
2020-08-04 15:18:27 +03:00
Sait Talha Nisanci ebabca16b7 Changelog: Test row suffix notation
It seems that row suffix notation works fine with our code, so a test
is added.

Changelog entry in PG13:
Allow ROW values to have their members extracted with suffix notation (Tom Lane)
2020-08-04 15:18:27 +03:00
Sait Talha Nisanci 275ccd0400 Changelog: Test that alter view rename column works
Changelog entry in PG13:
Add ALTER VIEW syntax to rename view columns (Fujii Masao)
2020-08-04 15:18:27 +03:00
Sait Talha Nisanci 920d7211e4 Changelog: Test that we error out for DROP EXPRESSION
PG13 now supports dropping the expression from a column, such as generated
columns. We currently error out on this.

Changelog entry in postgres:
Add ALTER TABLE clause DROP EXPRESSION to remove generated properties from columns (Peter Eisentraut)
2020-08-04 15:18:27 +03:00
Sait Talha Nisanci 87088d92bc Changelog: handle VACUUM PARALLEL option
Postgres 13 added a new VACUUM option, PARALLEL. It is now supported
in our code as well.

Relevant changelog message on postgres:
Allow VACUUM to process indexes in parallel (Masahiko Sawada, Amit Kapila)
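
A sketch of the now-handled option (table name assumed):

```sql
-- The PARALLEL option is parsed and propagated to the shard placements.
VACUUM (PARALLEL 2) items;
```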
2020-08-04 15:18:27 +03:00
Sait Talha Nisanci 1070828465 update cte inline output for pg13
Make some macros in version_compat more robust
Remove commented code in ruleutils
Remove unnecessary variable assignments
2020-08-04 15:18:27 +03:00
Sait Talha Nisanci 157af140e4 ignore concurrent root page split debugs 2020-08-04 15:18:27 +03:00
Sait Talha Nisanci ff7a563c57 decrease log level to debug1 to prevent flaky debug 2020-08-04 15:18:27 +03:00
Sait Talha Nisanci 6ff4e42706 Add alternative output for multi_function_in_join
With pg13, constant functions in the "FROM" clause are replaced. This
means that on the citus side, we will see the constant in the restriction
info instead of the function call. For example:

SELECT * FROM table1 JOIN add(3,5) sum ON (id = sum) ORDER BY id ASC;

Assuming that the function `add` returns a constant, it will be evaluated
on the postgres side. This means that this query will be routable, because
there will be only one shard after pruning with the restrictions.
Before pg13, however, this would be a multi-shard query: it would go
into recursive planning, and the function would be evaluated on the
coordinator because it can be.

This means that with pg13, users will need to distribute the function,
because when the query is router executable, we currently also send the
function call to the worker as part of the query. So the function should
exist on the worker.

It could be better to replace the constant in the query tree as well so
that the query string sent to the worker has the constant value and
therefore it doesn't need the function. However, I feel like users would
already have the function on the workers if they have any multi-shard query.

Commit on Postgres side:
7266d0997dd2a0632da38a594c78e25ff21df67e
2020-08-04 15:18:27 +03:00
Sait Talha Nisanci a34a1126ec add alternative output for pg13 in some tests 2020-08-04 15:18:27 +03:00
Sait Talha Nisanci c5c9ec288f fix multi_mx_create_table test 2020-08-04 15:18:27 +03:00
Sait Talha Nisanci 76c7b3d1c6 Remove unused steps in isolation tests
PG13 gives a warning for unused steps, therefore we should remove the
unused steps in isolation tests.
2020-08-04 15:18:27 +03:00
Sait Talha Nisanci 17388e2e91 update some tests 2020-08-04 15:18:27 +03:00
Sait Talha Nisanci de82d0ff79 add output for pg13 for propagate extension commands
CREATE EXTENSION <name> FROM <old_version> is not supported anymore with
postgres 13. An alternative output is added for pg13 where we basically
error for that statement.
2020-08-04 15:18:27 +03:00
Sait Talha Nisanci 80d2bc2317 normalize some output and sort test result 2020-08-04 15:18:27 +03:00
Sait Talha Nisanci 0f6c21d418 sort result in ch_bench_having_mx test 2020-08-04 15:10:22 +03:00
Sait Talha Nisanci 70f27c10e5 Add some normalization rules for tests
The not-null constraint message changed slightly with pg13, hence a
normalization rule is added that converts it to the pg < 13
output.

Commit on postgres:
05f18c6b6b6e4b44302ee20a042cedc664532aa2

An extra debug message related to indexes was added on the postgres side;
these are safe to ignore, so we can delete them from tests.

Commit on Postgres side:
612a1ab76724aa1514b6509269342649f8cab375

varnoold is renamed to varnosyn and varoattno is renamed to varattnosyn,
so in the output we normalize the values to the old names simply to pass
the tests.
2020-08-04 15:10:22 +03:00
Onder Kalaci eeb8c81de2 Implement shared connection count reservation & enable `citus.max_shared_pool_size` for COPY
With this patch, we introduce `locally_reserved_shared_connections.c/h` files
which are responsible for reserving some space in shared memory counters
upfront.

We sometimes need to reserve connections, but not necessarily
establish them. For example:
-  The COPY command should reserve connections as it cannot know which
   connections it needs in which order. COPY establishes connections
   as any input data hits the workers. For example, for router COPY
   command, it only establishes 1 connection.

   As discussed here (https://github.com/citusdata/citus/pull/3849#pullrequestreview-431792473),
   COPY needs to reserve connections up-front, otherwise we can end
   up with resource starvation/undetected deadlocks.
2020-08-03 18:51:40 +02:00
nukoyluoglu 38987431e7
propagation of CHECK statements to workers with parentheses (#4039)
* ensure propagation of CHECK statements to workers with parentheses & adjust regression test outputs (see the sketch after this list)

* add tests for distributing tables with simple CHECK constraints

* added test for CHECK on bool variable
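
A minimal sketch with assumed names:

```sql
-- The parenthesized sub-expression is now kept intact when the CHECK
-- constraint is propagated to the worker shards.
CREATE TABLE accounts (
    id int,
    balance int,
    CONSTRAINT balance_ok CHECK (balance >= 0 AND (id <> 0))
);
SELECT create_distributed_table('accounts', 'id');
```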
2020-07-27 15:08:37 +03:00
Benjamin Satzger a35a15a513
Distribute custom aggregates with multiple arguments (#4047)
Enable custom aggregates with multiple parameters to be executed on workers.

#2921 introduces distributed execution of custom aggregates. One of the limitations of this feature is that only aggregate functions with a single aggregation parameter can be pushed to worker nodes. The aim of this change is to remove that limitation and support handling of multi-parameter aggregates (a sketch follows below).

Resolves: #3997
See also: #2921
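
A minimal sketch of a two-argument aggregate that can now run on the workers (all names hypothetical; distributing the aggregate and providing a combine function are assumed where needed):

```sql
CREATE FUNCTION weighted_sum_sfunc(state numeric, val numeric, weight numeric)
RETURNS numeric LANGUAGE sql IMMUTABLE
AS $$ SELECT coalesce(state, 0) + val * weight $$;

CREATE AGGREGATE weighted_sum(numeric, numeric) (
    sfunc = weighted_sum_sfunc,
    stype = numeric
);

-- With two aggregation parameters, this previously could not be pushed
-- down to the worker nodes.
SELECT weighted_sum(price, quantity) FROM order_items;
```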
2020-07-24 15:16:00 -07:00
Halil Ozan Akgul 38b72ddd66 Fixes create index concurrently bug 2020-07-24 12:14:14 +03:00
Halil Ozan Akgül e9f89ed651
Fixes the non existing table bug (#4058) 2020-07-23 18:01:21 +03:00
Sait Talha Nisanci 01c23b0df2 update test outputs with task-tracker removal 2020-07-21 16:25:08 +03:00
Sait Talha Nisanci 1dbd545cf4 replace task-tracker with adaptive in tests 2020-07-21 16:21:01 +03:00
Sait Talha Nisanci 4308d867d9 remove task-tracker in comments, documentation 2020-07-21 16:21:01 +03:00
Hanefi Önaldı e534dbae4a
Accept list of values in a supported ALTER ROLE .. SET statement
Some GUCs support a list of values which is indicated by GUC_LIST_INPUT flag.

When an ALTER ROLE .. SET statement is executed, the new configuration
default for the affected users and databases is stored in the
setconfig(text[]) column of a pg_db_role_setting record.

If a GUC that supports a list of values is used in an ALTER ROLE .. SET
statement, we need to split the text into items delimited by commas.
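
For example (role name assumed), `search_path` is such a GUC_LIST_INPUT setting:

```sql
-- Stored in pg_db_role_setting.setconfig as text[]; the value has to be
-- split on commas before it can be deparsed and propagated correctly.
ALTER ROLE app_user SET search_path = app, public;
```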
2020-07-21 03:49:57 +03:00
Nils Dijk 00a4a15d95
fix sorting on string literal (#4045)
As noted by Talha https://github.com/citusdata/citus/pull/4029#issuecomment-660466972 there was still some sort order flappiness in the test.

The root cause is that sorting on `1::text` sorts on the literal `'1'`, which causes sorting to be non-deterministic.

This behaviour is consistent with Postgres' behaviour, so no bug on Citus' side.
2020-07-20 17:39:27 +02:00
Onder Kalaci c25de2cf22 Remove flag from
As it doesn't make any sense anymore
2020-07-20 12:45:05 +02:00
SaitTalhaNisanci b3af63c8ce
Remove task tracker executor (#3850)
* use adaptive executor even if task-tracker is set

* Update check-multi-mx tests for adaptive executor

Basically repartition joins are enabled where necessary. For parallel
tests the max adaptive executor pool size is decreased to 2, otherwise we
would get a "too many clients" error.

* Update limit_intermediate_size test

It seems that when we use the adaptive executor instead of the task
tracker, we exceed the intermediate result size less often in the test.
Therefore the tests are updated accordingly.

* Update multi_router_planner

It seems that there is one problem with multi_router_planner when we use
the adaptive executor; we should fix the following error:
+ERROR:  relation "authors_range_840010" does not exist
+CONTEXT:  while executing command on localhost:57637

* update repartition join tests for check-multi

* update isolation tests for repartitioning

* Error out if shard_replication_factor > 1 with repartitioning

As we are removing the task tracker, we cannot switch to it if
shard_replication_factor > 1. In that case, we simply error out.

* Remove MULTI_EXECUTOR_TASK_TRACKER

* Remove multi_task_tracker_executor

Some utility methods are moved to task_execution_utils.c.

* Remove task tracker protocol methods

* Remove task_tracker.c methods

* remove unused methods from multi_server_executor

* fix style

* remove task tracker specific tests from worker_schedule

* comment out task tracker udf calls in tests

We were using task tracker udfs to test permissions in
multi_multiuser.sql. We should find some other way to test them; then we
should remove the commented-out task tracker calls.

* remove task tracker test from follower schedule

* remove task tracker tests from multi mx schedule

* Remove task-tracker specific functions from worker functions

* remove multi task tracker extra schedule

* Remove unused methods from multi physical planner

* remove task_executor_type related things in tests

* remove LoadTuplesIntoTupleStore

* Do initial cleanup for repartition leftovers

During startup, task tracker would call TrackerCleanupJobDirectories and
TrackerCleanupJobSchemas to clean up leftover directories and job
schemas. With adaptive executor, while doing repartitions it is possible
to leak these things as well. We don't retry cleanups, so it is possible
to have leftovers in case of errors.

TrackerCleanupJobDirectories is renamed as
RepartitionCleanupJobDirectories since it is repartition specific now,
however TrackerCleanupJobSchemas cannot be used currently because it is
task tracker specific. The thing is that this function is a no-op
currently.

We should add cleaning up intermediate schemas to DoInitialCleanup
method when that problem is solved (we might want to solve it in this PR
as well).

* Revert "remove task tracker tests from multi mx schedule"

This reverts commit 03ecc0a681.

* update multi mx repartition parallel tests

* not error with task_tracker_conninfo_cache_invalidate

* not run 4 repartition queries in parallel

It seems that when we run 4 repartition queries in parallel we get a "too
many clients" error on CI even though we don't get it locally. Our guess
is that this is because we open/close many connections without doing much
work and postgres has some delay in closing the connections. Hence even
though connections are removed from pg_stat_activity, they might
still not be closed. If the above assumption is correct, it is unlikely
for it to happen in practice because:
- There is some network latency in clusters, so this leaves some time
for connections to be able to close
- Repartition joins return some data and that also leaves some time for
connections to be fully closed.

As we don't get this error in our local, we currently assume that it is
not a bug. Ideally this wouldn't happen when we get rid of the
task-tracker repartition methods because they don't do any pruning and
might be opening more connections than necessary.

If this still gives us "too many clients" error, we can try to increase
the max_connections in our test suite (which is 100 by default).

Also there are different places where this error is given in postgres,
but adding some backtrace it seems that we get this from
ProcessStartupPacket. The backtraces can be found in this link:
https://circleci.com/gh/citusdata/citus/138702

* Set distributedPlan->relationIdList when it is needed

It seems that we were setting the distributedPlan->relationIdList after
JobExecutorType is called, which would choose task-tracker if
replication factor > 1 and there is a repartition query. However, it
uses relationIdList to decide if the query has a repartition query, and
since it was not set yet, it would always think it is not a repartition
query and would choose adaptive executor when it should choose
task-tracker.

* use adaptive executor even with shard_replication_factor > 1

It seems that we were already using adaptive executor when
replication_factor > 1. So this commit removes the check.

* remove multi_resowner.c and deprecate some settings

* remove TaskExecution related leftovers

* change deprecated API error message

* not recursively plan single relation repartition subquery

* recursively plan single relation repartition subquery

* test deprecated task tracker functions

* fix overlapping shard intervals in range-distributed test

* fix error message for citus_metadata_container

* drop task-tracker deprecated functions

* put the implementation back into worker_cleanup_job_schema_cache since citus cloud uses it

* drop some functions, add downgrade script

Some deprecated functions are dropped.
Downgrade script is added.
Some gucs are deprecated.
A new guc for repartition joins bucket size is added.

* order by a test to fix flappiness
2020-07-18 13:11:36 +03:00
Marco Slot 9cb8dc9d12 Improve error message when creating a foreign key to a local table 2020-07-13 13:57:22 +02:00
SaitTalhaNisanci bc011a6286
Add IsCitusTable check to citus table utilities (#4028) 2020-07-14 18:29:33 +03:00
Nils Dijk 23d44eba9f
fix flappy tests due to non-deterministic order of test output (#4029)
As reported on #4011 https://github.com/citusdata/citus/pull/4011/files#r453804702 some of the tests were flapping due to a non-deterministic order of test outputs.

This PR makes the test output ordered for all tests returning non-zero rows.

Needs to be backported to 9.2, 9.3, 9.4
2020-07-14 15:47:29 +02:00
SaitTalhaNisanci ab5be77709
test coordinator reference-distributed table join (#3698) 2020-07-14 11:43:03 +03:00
Sait Talha Nisanci 1b5ed45a58 add multi follower repartition tests 2020-07-13 19:50:50 +03:00
Sait Talha Nisanci 510535f558 address feedback 2020-07-13 19:45:02 +03:00
Sait Talha Nisanci 41ec76a6ad use ActiveReadableNodeList in JobExecutorType and task tracker
The reason we should use ActiveReadableNodeList instead of ActiveReadableNonCoordinatorNodeList is that if the coordinator is added to the cluster as a worker, it should be counted as well. Otherwise, if only the coordinator is in the cluster, the count will be 0, hence we get a warning.

In MultiTaskTrackerExecute, we should connect to the coordinator if it is
added to the cluster because it will also be assigned tasks.
2020-07-13 19:45:02 +03:00
Sait Talha Nisanci d97d03ec65 use ActivePrimaryNodeList to include coordinator
ActiveReadableWorkerNodeList doesn't include the coordinator; however, if
the coordinator is added as a worker, we should also include it while
planning. The current methods are very easily misusable and this
requires a refactoring to make the distinction between methods that
include coordinator and that don't very explicit as they can introduce
subtle/major bugs pretty easily.
2020-07-13 19:20:15 +03:00
Sait Talha Nisanci db1b78148c send schema creation/cleanup to coordinator in repartitions
We were using ALL_WORKERS TargetWorkerSet while sending temporary schema
creation and cleanup. We (well, mostly I) thought that ALL_WORKERS would also include the coordinator when it is added as a worker. It turns out that it was FILTERING OUT the coordinator even if it is added as a worker to the cluster.

So to have some context here, in repartitions, for each jobId we create
(at least we were supposed to) a schema in each worker node in the cluster. Then we partition each shard table into some intermediate files, which is called the PARTITION step. So after this partition step each node has some intermediate files having tuples in those nodes. Then we fetch the partition files to necessary worker nodes, which is called the FETCH step. Then from the files we create intermediate tables in the temporarily created schemas, which is called a MERGE step. Then after evaluating the result, we remove the temporary schemas(one for each job ID in each node) and files.

If node 1 has file1, and node 2 has file2 after PARTITION step, it is
enough to either move file1 from node1 to node2 or vice versa. So we
prune one of them.

In the MERGE step, if the schema for a given jobID doesn't exist, the
node tries to use the `public` schema if it is a superuser, which is
actually added for testing in the past.

So when we were not sending schema creation commands for each job ID to
the coordinator(because we were using ALL_WORKERS flag, and it doesn't
include the coordinator), we would basically not have any schemas for
repartitions in the coordinator. The PARTITION step would be executed on
the coordinator (because the tasks are generated in the planner part)
and it wouldn't give us any error because it doesn't have anything to do
with the temporary schemas(that we didn't create). But later two things
would happen:

- If by chance the fetch is pruned on the coordinator side, the other
nodes would fetch the partitioned files from the coordinator and execute
the query as expected, because it has all the information.
- If the fetch tasks are not pruned in the coordinator, in the MERGE
step, the coordinator would either error out saying that the necessary
schema doesn't exist, or it would try to create the temporary tables
under public schema ( if it is a superuser). But then if we had the same
task ID with different jobID it would fail saying that the table already
exists, which is an error we were getting.

In the first case, the query would work okay, but it would still not do
the cleanup, so we would leave the partitioned files from the
PARTITION step behind. Hence ensure_no_intermediate_data_leak would fail.

To make things more explicit and prevent such bugs in the future,
ALL_WORKERS is named as ALL_NON_COORD_WORKERS. And a new flag to return
all the active nodes is added as ALL_DATA_NODES. For the repartition case,
we don't strictly need the nodes that hold only reference tables, but this
version makes the code simpler and there shouldn't be any significant
performance issue with that.
2020-07-13 19:20:15 +03:00
SaitTalhaNisanci 76ddb85545
improve error message in secondaries (#4025) 2020-07-13 19:18:57 +03:00
Nils Dijk 449d1f0e91
force aliases in deparsing for queries with anonymous column references (#4011)
DESCRIPTION: Force aliases in deparsing for queries with anonymous column references

Fixes: #3985  

The root cause has to do with discrepancies in the query tree we create. I think in the future we should spend some time on categorising all changes we made to ruleutils and see if we can change the data structure `query` we pass to the deparser to have an actual valid postgres query for the deparser to render.

For now the fix is to keep track, besides changing the names of the entries in the target list, of whether we have a reference to an anonymous column. If there are anonymous columns we set the `printaliases` flag to true, which forces the deparser to add the aliases.
2020-07-13 16:29:24 +02:00
Hadi Moshayedi 3651fc64ee Fix Subtransaction memory leak 2020-07-09 12:33:39 -07:00
Jelte Fennema 16242d5264
Fix write queries with const expressions and COLLATE in various places (#3973) 2020-07-08 18:19:53 +02:00
Jelte Fennema ab01571c9e
Fix crash with single node dummy placement (#3993)
Static analysis found an issue where we could dereference `NULL`, because 
`CreateDummyPlacement` could return `NULL` when there were no workers. This
PR changes it so that it never returns `NULL`, which was intended by 
@marcocitus when doing this change: https://github.com/citusdata/citus/pull/3887/files#r438136433

While adding tests for citus on a single node I also added some more basic
tests and it turns out we error out on repartition joins. This has been
present since `shouldhaveshards` was introduced and is not trivial to fix.
So I created a separate issue for this: https://github.com/citusdata/citus/issues/3996
2020-07-08 17:11:25 +02:00
Philip Dubé 444472ffc6 ruleutils: use get_rtable_name for deparsing resultRelation 2020-07-07 12:20:41 +00:00
Marco Slot b4fec63bc0 Rename master evaluation to coordinator evaluation 2020-07-07 10:37:41 +02:00
Jelte Fennema 8ab47f4f37
Add a CI check to see if all tests are part of a schedule (#3959)
I recently forgot to add tests to a schedule in two of my PRs. One of
these was caught by review, but the other one was not. This adds a
script that causes CI to ensure that each test in the repo is included in
at least one schedule.

Three tests were found that were currently not part of a schedule. This PR
adds those three tests to a schedule as well and it also fixes some small
issues with these tests.
2020-07-03 11:34:55 +02:00
Jelte Fennema 9311978487 Add README for CI scripts
We keep accumulating more and more scripts to flag issues in CI. This is
good, but we are currently missing consistent documentation for them.
This commit moves all these scripts to the `ci` directory and adds some
documentation for all of them in the README. It also makes sure that the
last line of output of a failed script points to this documentation.
2020-07-03 10:22:48 +02:00
Onur Tirtir be17ebb334 Bump citus version to 9.5devel 2020-07-01 14:46:55 +03:00
Hanefi Önaldı ca2ececb3b
Downgrade path from 9.4 to 9.3 to 9.2 2020-07-01 10:38:11 +03:00
Sait Talha Nisanci e5a21f07cb test aggregates with expressions 2020-06-30 11:41:16 -07:00
Jelte Fennema 392c5e2c34
Fix wrong cancellation message about distributed deadlocks (#3956) 2020-06-30 14:57:46 +02:00
Jelte Fennema 02fa942be1
Fix assertion error when rolling back to savepoint (#3868)
It was possible to get an assertion error, if a DML command was
cancelled that opened a connection and then "ROLLBACK TO SAVEPOINT" was
used to continue the transaction. The reason for this was that canceling
the transaction might leave the `claimedExclusively` flag on for (some
of) its connections.

This caused an assertion failure because `CanUseExistingConnection`
would return false and a new connection would be opened, and then there
would be two connections doing DML for the same placement. Which is
disallowed. That this situation caused an assertion failure instead of
an error means that without asserts this could possibly result in some
visibility bugs, similar to the ones described in
https://github.com/citusdata/citus/issues/3867
2020-06-30 11:31:46 +02:00
Hadi Moshayedi 4ed59d2db3 Move more from insert_select_executor to insert_select_planner 2020-06-26 08:08:26 -07:00
Hadi Moshayedi cd25a27174 Fix crash caused by EXPLAIN EXECUTE INSERT ... SELECT 2020-06-25 08:55:48 -07:00
Hadi Moshayedi 4e8d79998e Save INSERT/SELECT method in DistributedPlan.
This is so we don't need to calculate it twice in
insert_select_executor.c and multi_explain.c, which can
cause a discrepancy if an update in one of them is not
reflected in the other.
2020-06-25 08:55:48 -07:00
Jelte Fennema 64506143e4
Replace flaky repartition analyze test with a non flaky one (#3950)
The flaky test was introduced in #3941. This removes that flaky test and
adds a new one that fails in the same manner when removing the fix in #3941.

An example of a random failure can be found here:
https://app.circleci.com/pipelines/github/citusdata/citus/9558/workflows/de76e7a5-6558-46c9-97e7-8b1dae1f173b/jobs/135876/steps
2020-06-25 15:19:15 +02:00
SaitTalhaNisanci 50e115fe3a
test task tracker repartition with replication >1 (#3944) 2020-06-24 14:54:20 +03:00
SaitTalhaNisanci f458d1fd1c
Fix/task execution (#3941)
* Not set TaskExecution with adaptive executor

The adaptive executor uses a utility method from the task tracker for
repartition joins; however, the adaptive executor doesn't need
taskExecution, which is only used by the task tracker. This causes a
problem when explain analyze is used, because what taskExecution points
to might be random.

We solve this by not setting taskExecution from adaptive executor. So it
will stay NULL as set by CreateTask.

* use same memory context as task for taskExecution

Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2020-06-24 12:10:00 +03:00
Philip Dubé cd0b2ad5b5 citus_evaluate_expression: call expand_function_arguments beforehand to avoid segfaulting on implicit parameters 2020-06-23 18:06:46 +00:00
Jelte Fennema b3ec6fbe7a
Make check_enterprise_merge script stricter (#3918)
We've had two issues in the last week where merge conflicts with
enterprise suddenly appeared. Because of this CI check, that actually
blocks all community PRs from being merged.

This PR tries to improve on the previous script we had, by putting tougher
constraints on when a merge is allowed.

Previously the check would pass in two cases:
1. This PR be merged without conflicts into `enterprise-master`
2. A branch exists with the same name as this PR on enterprise and that can be
   merged into `enterprise-master`.

The first case stays the same, but I've changed the second case to require the
following instead:
1. A branch exists on enterprise with the same name as this PR
2. **NEW: This branch contains the last commit of the community PR branch**
3. This branch can be merged into enterprise-master

This makes sure the enterprise branch is actually up to date and not forgotten about.

If we still get problems with this change, future improvements could be:
1. Check that the PR on enterprise passes CI
2. Check that the PR on enterprise has been approved
3. Require the enterprise PR branch to be merged before merging community.
2020-06-19 12:45:36 +02:00
SaitTalhaNisanci 3a789352b6
rename citus hammerdb branch prefix as citus_github_push (#3925)
When we are using hammerdb jobs, the job creates a branch on test
automation. Since that branch should be deleted, it would have a
`delete_me` prefix; however, since the result branch on
release-test-results will have the test automation branch as a prefix, it
will also have the `delete_me` prefix, which seems a bit confusing.

This PR updates it as citus_github_push
2020-06-18 21:11:58 +03:00
Marco Slot 2a3234ca26 Rename masterQuery to combineQuery 2020-06-17 14:14:37 +02:00
Jelte Fennema 0259815d3a
Fix EXPLAIN ANALYZE received data counter issues (#3917)
In #3901 the "Data received from worker(s)" sections were added to EXPLAIN
ANALYZE. After merging @pykello posted some review comments. This addresses
those comments as well as fixing a other issues that I found while addressing 
them. The things this does:

1. Fix `EXPLAIN ANALYZE EXECUTE p1` to not increase received data on every
   execution
2. Fix `EXPLAIN ANALYZE EXECUTE p1(1)` to not always return 0 bytes as
   received data.
3. Move `EXPLAIN ANALYZE` specific logic to `multi_explain.c` from
   `adaptive_executor.c`
4. Change naming of new explain sections to `Tuple data received from node(s)`.
   Firstly because a task can reference the coordinator too, so "worker(s)" was
   incorrect. Secondly to indicate that this is tuple data and not all network
   traffic that was performed.
5. Rename `totalReceivedData` in our codebase to `totalReceivedTupleData` to
   make it clearer that it's a tuple data counter, not all network traffic.
6. Actually add `binary_protocol` test to `multi_schedule` (woops)
7. Fix a randomly failing test in `local_shard_execution.sql`.
2020-06-17 11:33:38 +02:00
Jelte Fennema b71f82b31e
Use 5 second isolation test timeout (#3907)
Sometimes isolation tests get stuck in CI and we cannot see why, because
the job is killed by the CI runner. This will instead fail the stuck step,
let the testsuite continue, and mark it as a failure like this in the diff
output:
```diff
+isolationtester: canceling step s2-ddl-create-index-concurrently after 5 seconds
 step s2-ddl-create-index-concurrently: CREATE INDEX CONCURRENTLY select_append_index ON select_append(id);
+ERROR:  CONCURRENTLY-enabled index command failed
```

We should detect blockages very quickly and the queries we run are also
very fast, so 5 seconds should be more than enough to catch any random
slowness. The default from Postgres is 5 minutes, which is waaay too much
for us.
2020-06-16 14:57:49 +02:00
Jelte Fennema 799bfdab56
Temporarily disable connection leak tests that fail a lot (#3911)
MX connection leak failures:
1. https://app.circleci.com/pipelines/github/citusdata/citus/9296/workflows/e36d1088-662a-4f60-acec-293132632c2f/jobs/131908/steps
2. https://app.circleci.com/pipelines/github/citusdata/citus/9258/workflows/37659d82-2c5b-495e-b0e7-905811e30444/jobs/131299

Failure connection leak failures:
1. https://app.circleci.com/pipelines/github/citusdata/citus/9297/workflows/c0ebc326-8c93-468f-8b70-f470bd492fb9/jobs/131920
2. https://app.circleci.com/pipelines/github/citusdata/citus/9283/workflows/9af154d0-ff96-4c5d-ae19-81faae1e0c18/jobs/131668
2020-06-16 13:48:48 +02:00
Jelte Fennema 927de6d187
Show amount of data received in EXPLAIN ANALYZE (#3901)
Sadly this does not actually work yet for binary protocol data, because
when doing EXPLAIN ANALYZE we send two commands at the same time. This
means we cannot use `SendRemoteCommandParams`, and thus cannot use the
binary protocol. This can still be useful though when using the text
protocol, to find out that a lot of data is being sent.
2020-06-15 16:01:05 +02:00
Hadi Moshayedi ef778c1cd7 address feedback from Sait Talha & Hadi 2020-06-12 18:36:02 -07:00
Marco Slot 080f711e62 Remove useless debug message in router planner 2020-06-12 18:36:02 -07:00
Marco Slot 24feadc230 Handle joins between local/reference/cte via router planner 2020-06-12 18:36:01 -07:00
Nils Dijk f57711b3d2
fix test output for tdigest (#3909)
Due to the problem described in #3908 we don't cover the tdigest integration (and other extensions) on CI.

Because of this, a bug got into the patch when a change to `EXPLAIN VERBOSE` was merged concurrently with the tdigest integration. This PR fixes the test output that missed the newly added information.
2020-06-12 20:54:27 +02:00
Halil Ozan Akgül 8c5eb6b7ea
Insert Select Into Local Table (#3870)
* Insert select with master query

* Use relid to set custom_scan_tlist varno

* Reviews

* Fixes null check

Co-authored-by: Marco Slot <marco.slot@gmail.com>
2020-06-12 17:06:31 +03:00
Jelte Fennema 0e12d045b1
Support use of binary protocol in between nodes (#3877)
This can save a lot of data being sent over the wire in some cases, thus
improving performance for workloads where inter-node bandwidth is the
bottleneck. There are some issues with enabling this as the default, so
that's currently not done.
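
A sketch of opting in per session, assuming the GUC introduced here is `citus.enable_binary_protocol`:

```sql
-- Off by default because of the issues mentioned above.
SET citus.enable_binary_protocol = true;
SELECT count(*) FROM items;  -- tuples between nodes now travel in binary
```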
2020-06-12 15:02:51 +02:00
Nils Dijk da8f2b0134
Feature: tdigest aggregate (#3897)
DESCRIPTION: Adds support to partially push down tdigest aggregates

tdigest extensions: https://github.com/tvondra/tdigest

This PR implements the partial pushdown of tdigest calculations when possible. The extension adds a tdigest type whose values can be combined into the same structure. There are several aggregate functions that can be used to get:
 - a quantile
 - a list of quantiles
 - the quantile of a hypothetical value
 - a list of quantiles for a list of hypothetical values

These functions can work on both plain values and tdigest types.

Since we can create tdigest values either by combining them or based on a group of values, we can rewrite the aggregates in such a way that most of the computation gets delegated to the shards. This both speeds up the percentile calculations, because the values don't have to be sorted on the coordinator, and makes the transfer size from the shards to the coordinator significantly smaller.
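
A sketch of a partially pushed-down percentile, using the tdigest extension from the repo linked above (table and column names assumed):

```sql
CREATE EXTENSION tdigest;

-- Workers build per-shard tdigests with the given compression (100 here);
-- the coordinator only merges the digests and extracts the quantile.
SELECT tdigest_percentile(response_time, 100, 0.99) FROM requests;
```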
2020-06-12 13:50:28 +02:00
Philip Dubé 1722d8ac8b Allow routing modifying CTEs
We still recursively plan some cases, eg:
- INSERTs
- SELECT FOR UPDATE when reference tables in query
- Everything must be same single shard & replication model
2020-06-11 15:14:06 +00:00
Hadi Moshayedi 0e3140c14d Include execution duration in worker_last_saved_explain_analyze 2020-06-11 02:54:54 -07:00
Hadi Moshayedi 7c52c6edb0 CTE statistics in EXPLAIN ANALYZE 2020-06-11 02:39:59 -07:00
Hadi Moshayedi 1f6d6ee4a5 Show query text in EXPLAIN output 2020-06-11 02:19:55 -07:00
Hadi Moshayedi bb96ef5047 Do the EXPLAIN ANALYZE at the same time as execution, avoiding executing twice.
We wrap worker tasks in worker_save_query_explain_analyze() so we can fetch
their explain output later by a call to worker_last_saved_explain_analyze().

Fixes #3519
Fixes #2347
Fixes #2613
Fixes #621
2020-06-11 01:55:57 -07:00
Hadi Moshayedi 6ca621bd16 Test we don't support multi-shard EXPLAIN EXECUTE 2020-06-10 17:11:27 -07:00
Onder Kalaci 640717bea2
Copy doesn't use more than MaxAdaptiveExecutor
Co-authored-by: Hanefi Önaldı <Hanefi.Onaldi@Microsoft.com>
2020-06-10 16:46:21 +03:00
Jelte Fennema b87bae71bb
Error out when using different users in the same transaction (#3869)
Fixes #3867

As described in the issue above we return incorrect results when
changing user within a transaction. This causes us to error out instead.
2020-06-10 14:07:40 +02:00
Onder Kalaci 06461ca55f Coerce types properly for INSERT
Also, unify similar code-paths to rely on a more accurate function.
2020-06-10 10:40:28 +02:00
Hadi Moshayedi 5cdfa9f571 Implement EXPLAIN ANALYZE udfs.
Implements worker_save_query_explain_analyze and worker_last_saved_explain_analyze.

worker_save_query_explain_analyze executes a query and returns its results
while saving its EXPLAIN ANALYZE output to be fetched later.

worker_last_saved_explain_analyze returns the saved EXPLAIN ANALYZE result.
2020-06-09 10:02:05 -07:00
Hadi Moshayedi 45a41e249f Test EXPLAIN ANALYZE doesn't show repartition join tasks 2020-06-06 23:24:45 -07:00
Hadi Moshayedi 02cff1a7c6 Test that EXPLAIN ANALYZE is not supported for some forms of INSERT/SELECT 2020-06-06 23:24:45 -07:00
Onur Tirtir 8b39d12846
Append IF NOT EXISTS to deparsed CREATE SERVER commands (#3875)
Append IF NOT EXISTS to CREATE SERVER commands generated by the
pg_get_serverdef_string function when deparsing an existing server
object that a foreign table depends on.
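
A sketch of the generated form (server and option values assumed):

```sql
-- Replaying the deparsed command on a node that already has the server
-- is now a no-op instead of an error.
CREATE SERVER IF NOT EXISTS telemetry_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'localhost', dbname 'telemetry');
```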
2020-06-05 18:04:33 +03:00
Onur Tirtir dfcc18468c Error out for unsupported trigger objects
Error out if creating a citus table from a table having triggers.
Error out for CREATE TRIGGER commands that are run on citus tables.
2020-05-31 23:10:01 +03:00
MoYi 9e1f198155 Fix composite create type deparsing to preserve typmod 2020-05-15 13:12:54 +00:00
Sait Talha Nisanci 41fceb7849 Add optional ch_benchmark and tpcc_benchmark job
With this commit:
You can trigger two types of hammerdb benchmark jobs:
-ch_benchmark (analytical and transactional queries)
-tpcc_benchmark (only transactional queries)

Your branch will be run against `master` branch.

In order to trigger the jobs prepend `ch_benchmark/` or `tpcc_benchmark/` to your branch and push it.

For example, if you are running on a feature/improvement branch named `improve/adaptive_executor` and want to trigger a tpcc benchmark, you can do the following:

```bash
git checkout improve/adaptive_executor
git checkout -b tpcc_benchmark/improve/adaptive_executor
git push origin tpcc_benchmark/improve/adaptive_executor # the tpcc benchmark job will be triggered.
```

You will see the results in a branch in [https://github.com/citusdata/release-test-results](https://github.com/citusdata/release-test-results).

The branch name will be something like: `delete_me/citusbot_tpcc_benchmark_rg/<date>/<date>`.

The resource groups will be deleted automatically, but if the benchmark fails, they won't be deleted. (If you don't see the results after a reasonable time, the run might have failed; you can check the resource usage from the portal, and if it is almost 0 while you still haven't seen results, it probably failed.) In that case, you will need to delete the resource groups manually from the portal; the resource groups are `citusbot_ch_benchmark_rg` and `citusbot_tpcc_benchmark_rg`.
2020-05-14 16:01:48 +03:00
SaitTalhaNisanci cf98b9d6d5
not wait forever for metadata sync in tests (#3760)
We shouldn't wait forever for metadata sync in tests, otherwise when a
test gets stuck, we don't know which line causes the problem.
2020-05-14 10:51:24 +03:00
Nils Dijk 105de7beb8
Fix for pruned target list entries (#3818)
DESCRIPTION: Ignore pruned target list entries in coordinator plan

The postgres planner has the ability to prune target list entries that are proven not to be used in the output relation. When this happens at the `CitusCustomScan` boundary we need to _not_ return these pruned columns, so as not to upset the rest of the planner.

By returning exactly the target list the planner asks us to return, we fix issues that lead to assertion failures, and that could potentially be runtime errors when hit in a production build.

Fixes #3809
2020-05-06 13:56:02 +02:00
Marco Slot 6ce2803777 Make sure we don't wrap GROUP BY expressions in any_value 2020-05-05 05:12:45 +02:00
SaitTalhaNisanci 4a9d516f1b
Add a job to check if merge to enterprise master would fail (#3777)
* add a job to check if merge to enterprise master would fail

Add a job to check if merge to enterprise master would fail.
The job does the following:
- It checks if there is already a branch with the same name on
enterprise; if so, it tries to merge it to enterprise master, and if the
merge fails the job fails.
- If the branch doesn't exist on enterprise, it tries to merge the
current branch to enterprise master; it fails if there is any conflict
while merging.

The motivation is that if a branch on community would create a conflict
on enterprise-master, we won't be able to merge the PR on community until
we create a PR on enterprise that solves this conflict. This way we won't
have many conflicts when merging to enterprise master, and the author
will be responsible for resolving the conflict while they still have the
most context, not after a month.

* Improve test suite to be able to easily run locally

* Add documentation on how to resolve conflicts to enterprise master

* Improve enterprise merge script

* Improve merge conflict job README

* Improve merge conflict job README

* Improve merge conflict job README

* Improve merge conflict job README

Co-authored-by: Nils Dijk <nils@citusdata.com>
2020-05-04 17:08:17 +03:00
Onder Kalaci f9d4a9cf38 Remove assertion for subqueries in WHERE clause ANDed with FALSE
In the code, we had the assumption that if restriction information
is NULL, it means that we cannot have any distributed tables in
the subquery.

However, for subqueries in the WHERE clause, that is not the case when
the subquery is ANDed with FALSE. In that case, Citus operates
on the originalQuery (which doesn't go through standard_planner()),
and relies on the restriction information generated by standard_planner().
As Postgres is smart enough not to generate restriction information for
subqueries ANDed with FALSE, we hit the assertion.
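
A sketch of the query shape that hit the assertion (table names assumed):

```sql
-- The whole WHERE clause is provably false, so Postgres generates no
-- restriction info for the subquery even though it contains a
-- distributed table; restriction info being NULL is therefore legal here.
SELECT * FROM orders
WHERE false AND customer_id IN (SELECT id FROM customers);
```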
2020-05-04 10:52:15 +02:00
Onder Kalaci 891d99efaf add order by to some tests to make the output consistent 2020-05-01 12:41:51 +02:00
SaitTalhaNisanci cbda951395
Fix task copy and appending empty task in ExtractLocalAndRemoteTasks (#3802)
* Not append empty task in ExtractLocalAndRemoteTasks

ExtractLocalAndRemoteTasks extracts the local and remote tasks. If we do
not have a local task, the localTaskPlacementList will be NIL; in this
case we should not append anything to local tasks. Previously we would
first check whether a task contains a single placement or not; now we
first check if there is any local task before doing anything.

* fix copy of node task

The Task node has a task query, which might contain a list of strings in
its fields. We were using postgres' copyObject for these lists. Postgres
assumes that each element of a list will be a node type; if it is not a
node type, it will error.

As a solution to that, a new macro is introduced to copy a list of
strings.
2020-04-29 11:05:34 +03:00
Philip Dubé b6b3c1bc17 Fix COPY TO's COPY (SELECT) with a distributed table having generated columns
It's necessary to omit generated columns from the output
2020-04-28 14:40:47 +00:00
SaitTalhaNisanci 164c00cf08
Fix typo: longer visible -> no longer visible (#3803) 2020-04-27 16:32:46 +03:00
Onder Kalaci 0cb7ab2d05 Explicitly mark queries in physical planner for [not] having parameters
The physical planner doesn't support parameters. If the parameters have
already been resolved when the physical planner handles the query, mark it.
The reason is that the executor is unaware of this, and sends the
parameters along with the worker queries, which fails for composite types.

(See `DissuadePlannerFromUsingPlan()` for the details of parameter resolving)
2020-04-24 12:49:43 +02:00
Onder Kalaci f517fa2e2a Re-enable isolation test for reference tables + distributed deadlock detection 2020-04-24 11:53:03 +02:00
SaitTalhaNisanci 07cbd84631
Add base isolation schedule (#3784)
We should do some setup steps in check-isolation-base target.
This PR adds base_isolation_schedule which will set up the cluster.
2020-04-24 12:38:37 +03:00
Onur Tirtir 2e927bd6b7
Bump Citus to 9.4devel (#3788) 2020-04-22 12:50:00 +03:00
Hanefi Önaldı e85b835065
Skip dependency setup on coordinator node 2020-04-21 12:06:31 +03:00
Jelte Fennema 1423433531
Fix running check-isolation-base (#3782) 2020-04-20 15:36:09 +02:00
Onder Kalaci e182215d96 Improve connection error message from the worker nodes
We currently put the actual error message in the detail part. However,
many drivers don't show the detail part.

As connection errors are somewhat common, and hard to trace back, we
added the detail to the message itself.

In addition to that, we changed the "connection error" message, as it
was confusing to users who thought the error was happening
while connecting to the coordinator. In fact, this error shows
up when the coordinator fails to connect to remote nodes.
2020-04-20 13:32:55 +02:00
Hadi Moshayedi 1250d691d3 Replicate reference tables before master_create_empty_shard 2020-04-17 16:47:03 -07:00
SaitTalhaNisanci 1d0f4bdcd2
invalidate plan cache in master_update_node (#3758)
* invalidate plan cache in master_update_node

If a plan is cached by postgres but a user uses master_update_node, then
when the plan cache is used for the updated node, they will get the old
nodename/nodepost in the plan. This is because the plan cache doesn't
know about the master_update_node. This could be a problem in prepared
statements or anything that goes into plancache. As a solution the plan
cache is invalidated inside master_update_node.

* add invalidate_inactive_shared_connections test function

We introduce invalidate_inactive_shared_connections udf to be used in
testing. It is possible that a connection count for an inactive node
will be greater than 0 and in that case it will not be removed at the
time of invalidation. However, we don't have a mechanism to remove it
later, which means that it will stay in the hash. For this not to cause a
problem, we use this udf in testing.

* move invalidate_inactive_shared_connections to udfs from test as it will be used in mx

* remove the test udf

* remove the IsInactive check
2020-04-17 17:43:48 +03:00
Önder Kalacı a919f09c96
Remove the entries from the shared connection counter hash when no connections remain (#3775)
We initially considered removing entries just before any change to
pg_dist_node. However, that ended-up being very complex and making
MX even more complex.

Instead, we're switching to a simpler solution, where we remove entries
when the counter gets to 0.

With certain workloads, this may have some performance penalty. But, two
notes on that:
 - When counter == 0, it implies that the cluster is not busy
 - With cached connections, that's not possible
2020-04-17 17:14:58 +03:00
Philip Dubé e4a4707f4a Avoid setting hasWindowFuncs true after window functions have been optimized out of query 2020-04-17 12:22:48 +00:00
SaitTalhaNisanci a9a3be15cc
introduce TASK_QUERY_NULL task type (#3774)
When we call SetTaskQueryString we would set the task type to
TASK_QUERY_TEXT, and some parts of the codebase rely on the fact that if
TASK_QUERY_TEXT is set, the data can be read safely. However if
SetTaskQueryString is called with a NULL taskQueryString this can cause
crashes. In that case taskQueryType will simply be set to
TASK_QUERY_NULL.
2020-04-17 14:59:22 +03:00
Hanefi Önaldı 0c5d0cfee9
Notice message to help truncate local data after distribution 2020-04-17 13:21:34 +03:00
Hanefi Önaldı d535121f8d
Introduce truncate_local_data_after_distributing_table() 2020-04-17 13:21:34 +03:00
Hadi Moshayedi 61198251fd Use block_writes for replicate_reference_tables 2020-04-16 19:25:41 -07:00
Nils Dijk 1d6ba1d09e
Refactor alter role to work on distributed roles (#3739)
DESCRIPTION: Alter role only works for citus managed roles

Alter role was implemented before we implemented good role management that hooks into the object propagation framework. This is a refactor of all alter role commands that have been implemented to
 - be on by default
 - only work for supported roles
 - make the citus extension owner a supported role

Instead of distributing the alter role commands for roles at the beginning of node activation, it now _only_ executes the alter role commands for all users in all databases and in the current database.

In preparation of full role support small refactors have been done in the deparser.

Earlier tests targeting roles other than the citus extension owner have been either slightly changed or removed, to be put back once we have full role support.

Fixes #2549
2020-04-16 12:23:27 +02:00
Hadi Moshayedi 59b9a4e5a1 Detect deadlocks in replicate_reference_tables() 2020-04-15 11:06:18 -07:00
SaitTalhaNisanci df9048ebaa
update outdated comments related to local_execution (#3759) 2020-04-15 16:15:43 +03:00
Marco Slot 8b83306a27 Issue worker messages with the same log level 2020-04-14 21:08:25 +02:00
SaitTalhaNisanci d58b5e67c1
not run multi_router_planner_fast_path in parallel (#3744) 2020-04-14 13:14:23 +03:00
Onder Kalaci aa6b641828 Throttle connections to the worker nodes
With this commit, we're introducing a new infrastructure to throttle
connections to the worker nodes. This infrastructure is useful for
multi-shard queries; router queries have not been affected by this.

The goal is to prevent establishing more than citus.max_shared_pool_size
number of connections per worker node in total, across sessions.

To do that, we've introduced a new connection flag OPTIONAL_CONNECTION.
The idea is that some connections are optional, such as the second
(and further) connections for the adaptive executor. A single connection
is enough to finish the distributed execution; the others are useful to
execute the query faster. Thus, they can be considered optional connections.
When an optional connection is not allowed, the adaptive executor
simply skips it and continues the execution with the already established
connections. However, it'll keep retrying to establish optional
connections, in case some slots open up again.
2020-04-14 10:27:48 +02:00
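A hedged sketch of how the cap is meant to be used; `citus.max_shared_pool_size` is the GUC named above, and the value is arbitrary:

```sql
-- cap the total number of connections Citus may open per worker node,
-- across all sessions
ALTER SYSTEM SET citus.max_shared_pool_size = 50;
SELECT pg_reload_conf();

-- multi-shard queries that hit the cap keep running on their already
-- established (non-optional) connections instead of opening new ones
```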
Hadi Moshayedi 2639a9a19d Test master_copy_shard_placement errors on foreign constraints 2020-04-13 12:45:27 -07:00
Hadi Moshayedi f9de734329 Ensure metadata is synced on ReplicateColocatedShardPlacement 2020-04-13 11:45:21 -07:00
SaitTalhaNisanci 2b2a146af4
update gitignores with new files in test folder (#3749) 2020-04-13 17:09:18 +03:00
Philip Dubé 30f10984e1 Defer get_agg_clause_costs, it happens later & avoids errors 2020-04-10 13:26:05 +00:00
Halil Ozan Akgul 34c2b7e056 Fixes the psql connection bug 2020-04-10 15:54:47 +03:00
Halil Ozan Akgul 56e814a333 Adds public host to only hyperscale tests 2020-04-10 15:54:47 +03:00
Halil Ozan Akgul d574ac33a8 Adds next shard ids to multi_create_table tests 2020-04-10 15:54:47 +03:00
Halil Ozan Akgul a701fc774a Adds multi_schedule_hyperscale schedule 2020-04-10 15:54:47 +03:00
Halil Ozan Akgul 5bf350faf9 Removes failing tests
This task just removes the failing tests. It doesn't mean these tests cannot be saved; it's just a starting point.
2020-04-10 15:54:47 +03:00
Halil Ozan Akgul 1aa1f55d8e Adds check_multi_hyperscale_superuser schedule 2020-04-10 13:05:07 +03:00
Halil Ozan Akgul c2edf989cf Adds public host parameters 2020-04-10 13:04:24 +03:00
Halil Ozan Akgul 4b9705f714 Adds worker host parameters 2020-04-10 13:03:28 +03:00
Halil Ozan Akgul 119bf590c8 Creates normalize_modified.sed 2020-04-10 13:03:19 +03:00
Halil Ozan Akgul c8a81ef1ce Changes copy to \copy 2020-04-10 13:03:15 +03:00
Halil Ozan Akgul 93b97248b2 Adds a connection string to run tests on that connection 2020-04-10 13:03:03 +03:00
SaitTalhaNisanci 17373d51da
not wait forever in upgrade distributed function before (#3731) 2020-04-10 09:43:42 +03:00
Marco Slot a4b2197450 Correctly handle non-constant LIMIT/OFFSET clauses 2020-04-09 19:59:50 +00:00
SaitTalhaNisanci 24dcb02bca
enable local table join with reference table (#3697)
* enable local table join with reference table

* test different cases with local table and reference join
2020-04-09 15:25:54 +03:00
SaitTalhaNisanci ebda3eff61
read database name inside the function (#3730) 2020-04-09 13:11:13 +03:00
SaitTalhaNisanci 233e4a24d1
use local execution within transaction block (#3714)
* use local execution when in a transaction block

When we are inside a transaction block, there could be other methods
that need local execution, therefore we use local execution in a
transaction block.

* update test outputs with transaction block local execution

* add a test to verify we don't leak intermediate schemas
2020-04-09 12:41:58 +03:00
SaitTalhaNisanci fa88046ce1
test that we don't leak intermediate schemas (#3737)
* test that we don't leak intermediate schemas

We have tests to make sure that we don't leak any intermediate
files, tables, etc., but we don't test whether we are leaking schemas. It makes
sense to test this as well.

* remove all repartition schemas in case of error

This solution is not an ideal one but it seems to be doing the job.
We should have a more generic solution for the cleanup but it seems that
putting the cleanup in the abort handler is dangerous and it was
crashing.
2020-04-09 12:17:41 +03:00
Hadi Moshayedi 9b8802ba2d Remove todo from reference_table_utils 2020-04-08 12:46:55 -07:00
Hadi Moshayedi dda53a0bba GUC for replicate reference tables on activate. 2020-04-08 12:42:45 -07:00
Hadi Moshayedi c168a53ebc Tests for replicate_reference_tables 2020-04-08 12:41:36 -07:00
Hadi Moshayedi acfa850c38 Make multi_replicate_reference_table check-base friendly 2020-04-08 12:41:36 -07:00
Hadi Moshayedi 0758a81287 Prevent reference tables being dropped when replicating reference tables 2020-04-08 12:41:36 -07:00
Marco Slot 924cd7343a Defer reference table replication to shard creation time 2020-04-08 12:41:36 -07:00
Önder Kalacı 70012dfd33 Do not error when an intermediate file does not exist (#3707)
When the file does not exist, it could mean two different things.
The first -- and a lot more common -- case is that a failure happened
in a concurrent backend on the same distributed transaction, and
one of the backends in that transaction has already been rolled
back, which removed the file. If we throw an error
here, the user might see this error instead of the actual error
message. Instead, we prefer to WARN the user and pretend that the
file has no data in it. In the end, the user would see the actual
error message for the failure.

Second, in case of any bugs in intermediate result broadcasts,
we could try to read a non-existing file. That is most likely
to happen during development. Thus, when asserts are enabled, we throw
an error instead of a WARNING so that developers cannot miss it.
2020-04-07 17:06:55 +02:00
Onder Kalaci a695b44ce9 Add new regression tests 2020-04-07 17:06:55 +02:00
Onder Kalaci 4b3d17f466 Make sure that tests are not failing randomly 2020-04-07 17:06:55 +02:00
Marco Slot 2632343f64 Fix intermediate result pruning for INSERT..SELECT 2020-04-07 11:07:49 +02:00
Marco Slot 84672c3dbd Simplify intermediate result pruning logic 2020-04-07 10:53:29 +02:00
SaitTalhaNisanci a710b3cdc5
fix null tupleStoreState case in ExecuteLocalTaskListExtended (#3711)
In case we don't care about the tupleStoreState in
ExecuteLocalTaskListExtended, it could be passed as null. In that case
we would get a segfault. This changes it so that a dummy tuple store
is created when it is null.

Do not use local execution in ExecuteTaskListOutsideTransaction.
As we are going to run the tasks outside a transaction, we shouldn't use local execution.
However, there is some problem when using local execution related to
repartition joins; when we solve that problem, we can execute the tasks
coming to this path with local execution.

Also logging the local command is simplified.

normalize job id in worker_hash_partition_table in test outputs.
2020-04-07 11:47:09 +03:00
Philip Dubé 4860e11561 Duplicate grouping on worker whenever possible
This is possible whenever we aren't pulling up intermediate rows

We want to do this because it was done in 9.2, and
some queries rely on the performance benefit of grouping producing distinct values

This change was introduced when implementing window functions on coordinator
2020-04-06 18:51:30 +00:00
Philip Dubé b01bae5937 Check connections from connection_placement before polling 2020-04-06 17:45:44 +00:00
SaitTalhaNisanci cd3e499834
not log in debug level in null parameters (#3718)
The purpose of null_parameters is to make sure that citus doesn't crash
with null parameters. (The related issue is #3493.) The logs in this
file are not that important and they are flaky. The flakiness is related
to the postgres side as well, so it is hard to reproduce. Therefore it
makes sense to decrease the log level.
2020-04-06 17:59:46 +03:00
SaitTalhaNisanci 3d3605be80
simplify vacuum test and fix the flakiness (#3704)
look at sent commands to simplify complex logic in vacuum test

also normalize connection id as that can differ when we don't have to
choose a specific connection.
2020-04-03 21:39:54 +03:00
Jelte Fennema 459a4829ae
Fix isolation tests on OSX (#3706)
* Don't print out comments in make output

* Remove empty lines with sed
2020-04-03 16:28:06 +02:00
SaitTalhaNisanci 32156dbf5c
fix flaky log statement in null_parameters (#3705)
It seems that sometimes the pruning is deferred and sometimes not with
this statement. What we care about in this test is to see that it doesn't
crash. I think we don't care about the log statement for this line. So
it makes sense to not log this statement and only check the result.
2020-04-03 17:01:59 +03:00
Hanefi Önaldı d1223bd6cc
Remove migration paths to 9.3-1, introduce 9.3-2 2020-04-03 12:50:45 +03:00
SaitTalhaNisanci 710970407f
not wait forever in multi_extension test (#3702) 2020-04-03 12:21:02 +03:00
SaitTalhaNisanci 659283c9a7
fix multi utilities vacuum test (#3699) 2020-04-03 11:50:00 +03:00
Marco Slot fd8cdb92f4 Evaluate nextval in the target list on the coordinator 2020-04-02 02:53:19 +02:00
SaitTalhaNisanci df88ab71b6 normalize assign_distributed_transaction_id in tests 2020-04-01 18:23:16 +03:00
SaitTalhaNisanci 0aebd78ea7 use localExecution in ExecuteTaskListExtended
ExecuteTaskListExtended is the common method for different codepaths,
and instead of writing separate local execution logics in different
codepaths, it makes more sense to have the logic here. We still need to
do some refactoring, this is an initial step.

After this commit, we can run create shard commands locally. There is a
special case with shard creation commands: a create shard command might
have a concatenated query string, but local execution did not know
how to execute a task with multiple query strings. This is also
implemented in this commit. We go over each query in the concatenated
query string and plan/execute them one by one.

A more clean solution to this would be to make sure that each task has a
single query. We currently cannot do that because we need to ensure the
task dependencies. However, it would make sense to do that at some point
and it would simplify the code a lot.
2020-04-01 18:23:16 +03:00
Philip Dubé 3bb4f14efd upgrade_type_after: ORDER BY 2020-04-01 01:07:21 +00:00
Philip Dubé d155149c18 tests: remove stale comment, fix typo 2020-03-31 20:13:51 +00:00
Marco Slot 252abcce16 Allow table type to be used in target list 2020-03-31 11:11:01 -07:00
SaitTalhaNisanci 5bf9f32dd3
disable one of the deadlock detection tests (#3682)
It seems that one of the deadlock detection tests fails way too often in
our CI. The difference is only ordering. For now it seems like a good
idea to disable this test for the sake of development.
2020-03-31 19:47:58 +03:00
Marco Slot 331b45348c Fix error when using LEFT JOIN with GROUP BY on primary key 2020-03-30 16:42:22 +02:00
Philip Dubé 67d2ad4e37
Fixes flaky test in multi_reference_table: ORDER BY (#3676)
Fixes app.circleci.com/pipelines/github/citusdata/citus/7744/workflows/0848f36c-af9e-46b7-9dda-a421df54ba56/jobs/109503
2020-03-30 23:31:10 +02:00
Hanefi Onaldi 0e8103b101
Propagate ALTER ROLE .. SET statements
In PostgreSQL, user defaults for config parameters can be changed by
ALTER ROLE .. SET statements. We wish to propagate those defaults
across the Citus cluster so that the behaviour will be similar in
different workers.

The defaults can either be set in a specific database, or the whole
cluster, similarly they can be set for a single role or all roles.

We propagate the ALTER ROLE .. SET if all the conditions below are met:
- The query affects the current database, or all databases
- The user is already created in worker nodes
2020-03-27 13:02:48 +03:00
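For reference, the statement shapes covered by the propagation rules above (role, database, and setting names are illustrative):

```sql
ALTER ROLE app_user SET search_path = 'app';                       -- one role, all databases
ALTER ROLE ALL SET statement_timeout = '60s';                      -- all roles
ALTER ROLE app_user IN DATABASE regression SET work_mem = '64MB';  -- one role, one database
```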
Marco Slot a65ffee266 Fixes a bug that causes some DML queries containing aggregates to fail 2020-03-26 16:08:34 +00:00
Marco Slot b89e9dc158 Fix a bug which caused queries with SRFs and function evalution to fail 2020-03-25 06:55:53 +01:00
Philip Dubé 917cb6ae93 Don't segfault on queries using GROUPING
GROUPING will always return 0 outside of GROUPING SETS, CUBE, or ROLLUP
Since we don't support those, it makes sense to reject GROUPING in queries
2020-03-25 15:46:43 +00:00
Philip Dubé 720525cfda Add support for window functions on coordinator
Some refactoring:
Consolidate expression which decides whether GROUP BY/HAVING are pushed down
Rename early pullUpIntermediateRows to hasNonDistributableAggregates
Create WorkerColumnName to handle formatting WORKER_COLUMN_FORMAT
Ignore NULL StringInfo pointers to SafeToPushdownWindowFunction
Fix bug where SubqueryPushdownMultiNodeTree mutates supplied Query,
	SafeToPushdownWindowFunction requires the original query as it relies on rtable
2020-03-25 15:31:20 +00:00
Jelte Fennema 149f0b2122
Use Microsoft approved cipher string (#3639)
This cipher string is approved by the Microsoft security team and only enables
TLSv1.2 ciphers.
2020-03-24 15:51:44 +01:00
Jelte Fennema 2aabe3e2ef
Mark all connections for shutdown when citus.node_conninfo chan… (#3642)
We cache connections between nodes in our connection management code.
This is good for speed. For security this can be a problem though. If
the user changes settings related to TLS encryption they want those to
be applied to future queries. This is especially important when they did
not have TLS enabled before and now they want to enable it. This can
normally be achieved by changing citus.node_conninfo. However, because
connections are not reopened, there will still be old connections that
might not be encrypted at all.

This commit changes that by marking all connections to be shutdown at
the end of their current transaction. This way running transactions will
succeed, even if placement requires connections to be reused for this
transaction. But after this transaction completes any future statements
will use a connection created with the new connection options.

If a connection is requested and a connection is found that is marked
for shutdown, then we don't return this connection. Instead a new one is
created. This is needed to make sure that if there are no running
transactions, then the next statement will not use an old cached
connection, since connections are only actually shutdown at the end of a
transaction.
2020-03-24 15:31:41 +01:00
Hadi Moshayedi b46b9a68ae Tests for master_copy_shard_placement 2020-03-23 08:33:55 -07:00
Marco Slot ede176d849 Implement shard placement copying 2020-03-23 08:33:08 -07:00
Philip Dubé dd2bd53e5b PartiallyEvaluateExpression: Avoid unrecognized paramkind: 2 2020-03-23 14:14:01 +00:00
SaitTalhaNisanci 3b7959a763
not run local shard copy test in parallel (#3640)
It seems that when logging is enabled we should not run local shard copy
in parallel with other tests. The reason is that it adds the coordinator for
reference tables, and if a parallel test creates a schema before this
test runs, the schema will be logged, so the output is not deterministic.
2020-03-23 14:38:18 +03:00
SaitTalhaNisanci c5c446f84f
not run local_shard_copy in parallel (#3635) 2020-03-23 13:56:25 +03:00
SaitTalhaNisanci 3df578010e
add a UDF to update colocation (#3623)
If two tables have the same distribution column type, we implicitly
colocate them. This is useful since colocation has a big performance
impact in most applications.

When a table is rebalanced, all of the colocated tables are also
rebalanced. If table A and table B are colocated and we want to
rebalance table A, table B will also be rebalanced. We need replica
identity so that logical replication can replicate updates and deletes
during rebalancing. If table B does not have a replica identity we
error out.

A solution to this is to introduce a UDF so that colocation can be
updated. The remaining tables in the colocation group will stay
colocated. For example if table A, B and C are colocated and after
updating table B's colocations, table A and table C stay colocated.

The "updating colocation" step does not move any data around, it only
updated pg_dist_partition and pg_dist_colocation tables. Specifically it
creates a new colocation group for the table and updates the entry in
pg_dist_partition while invalidating any cache.
2020-03-23 13:22:24 +03:00
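A sketch of the intended workflow; the exact UDF name and named parameter below are assumptions based on this PR:

```sql
-- tables a, b and c are implicitly colocated (same distribution column type);
-- break b out before rebalancing it, leaving a and c colocated with each other
SELECT update_distributed_table_colocation('b', colocate_with => 'none');

-- only pg_dist_partition / pg_dist_colocation are updated; no data moves
```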
Jelte Fennema ed0376bb41
Unparallelize tests (#3629)
We're getting a lot of random failures on CI regarding connection errors. This
works around that by not running tests that create lots of connections in parallel.
2020-03-20 10:31:34 +01:00
SaitTalhaNisanci 9d2f3c392a enable local execution in INSERT..SELECT and add more tests
We can use local copy in INSERT..SELECT, so the check that disables
local execution is removed.

Also a test for local copy where the data size >
LOCAL_COPY_FLUSH_THRESHOLD is added.

use local execution with insert..select
2020-03-18 09:34:39 +03:00
SaitTalhaNisanci 42cfc4c0e9 apply review items
log shard id in local copy and add more comments
2020-03-18 09:33:55 +03:00
SaitTalhaNisanci c22068e75a use the right partition for partitioned tables 2020-03-18 09:28:59 +03:00
SaitTalhaNisanci 1df9601e13 not use local copy if current transaction is connected to local group
If the current transaction is connected to the local group, we should not use
local copy, because we might not see some of the changes that are made
over the connection to the local group.
2020-03-18 09:28:59 +03:00
SaitTalhaNisanci 39bbec0f30 add tests for local copy execution 2020-03-18 09:28:59 +03:00
Nils Dijk e5237b9e20
Fix left join shard pruning (#3569)
DESCRIPTION: Fix left join shard pruning in pushdown planner

Due to #2481, which moved outer join planning through the pushdown planner, we caused a regression in the shard pruning behaviour for outer joins.

In the pushdown planner we take a union of the placement groups for all shards accessed by a query, based on the filters we see during planning. Unfortunately, implicit filters for left joins are not available during this part. This causes the inner part of an outer join to not prune any shards away. When we take the union of the placement groups, this shows up as no shards being pruned at all.

Since the inner part of an outer join will not return any rows if the outer part does not contain any rows, we have observed that we do not have to add the shard intervals of the inner part to the list of shard intervals to query.

Fixes: #3512
2020-03-13 15:20:45 +01:00
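The shape of query affected by the fix (table and column names are illustrative):

```sql
-- before the fix, the inner side (b) pruned no shards, so the union of
-- placement groups touched every placement; now b's shard intervals are
-- simply not added, and the pruning on a takes effect
SELECT *
FROM dist_a a
LEFT JOIN dist_b b ON a.key = b.key
WHERE a.key = 42;
```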
Onur Tirtir a14739f808
Local execution of ddl/drop/truncate commands (#3514)
* reimplement ExecuteUtilityTaskListWithoutResults for local utility command execution

* introduce new functions for local execution of utility commands

* change ErrorIfTransactionAccessedPlacementsLocally logic for local utility command execution

* enable local execution for TRUNCATE command on distributed & reference tables

* update existing tests for local utility command execution

* enable local execution for DDL commands on distributed & reference tables

* enable local execution for DROP command on distributed & reference tables

* add normalization rules for cascaded commands

* add new tests for local utility command execution
2020-03-13 15:39:32 +03:00
SaitTalhaNisanci 77f96a1f87
retry vanilla tests if they fail once more (#3611) 2020-03-12 12:50:06 +03:00
Philip Dubé 11b968bc30 Add runtime type checking to AGGREGATE_CUSTOM_COMBINE helper functions 2020-03-11 17:20:30 +00:00
Philip Dubé 4b68ee12c6 Also check aggregates in havingQual when scanning for non pushdownable aggregates
Came across this while coming up with test cases,
'result "68_1" does not exist' I'll seek to address in a future PR,
for now avoid segfault
2020-03-11 15:47:04 +00:00
Önder Kalacı 63ced3d901
Improve master evaluation tests (#3609)
* Add third column to master_evaluation_modify table

It was already added in some tests, but now make it globally
applicable to the test file.

* Add third column to master_evaluation_select table

As we'll use the column in some tests

* Add modify regression tests

For the combinations of: local/remote, router/fast-path:
   - Distribution key is a const.
   - Contains a function
   - A column which is not dist. key is parametrized

* Add select regression tests

    For the combinations of: local/remote, router/fast-path:
       - Distribution key is a const.
       - Contains a function
       - A column which is not dist. key is parametrized

* Make some tests consistent with check-base
2020-03-11 15:38:08 +01:00
Önder Kalacı afc942c6af
Remove non-adaptive test schedules (#3605)
As we don't have any other executors to run them.

These schedules were added when we had both the adaptive executor and
the real-time/router executors in the code. Since we only have adaptive
executor anymore, we can remove these.
2020-03-11 09:58:49 +01:00
Onder Kalaci 7d787e3d5e Prevent create_distributed_function() from the workers
As this could cause weird edge cases.
2020-03-10 18:24:20 +01:00
Marco Slot cb3d90bdc8 Simplify INSERT logic in router planner 2020-03-10 15:54:40 +01:00
Philip Dubé 81cfa05d3d First phase of addressing HAVING subquery issues
Add failing tests, make changes to avoid crashes at least

Fix HAVING subquery pushdown ignoring reference table only subqueries,
also include HAVING in recursive planning

Given that we have a function IsDistributedTable which includes reference tables,
it seems best to have IsDistributedTableRTE & QueryContainsDistributedTableRTE
reflect that they do not include reference tables in their check

Similarly SublinkList's name should reflect that it only scans WHERE

contain_agg_clause asserts that we don't have SubLinks,
use contain_aggs_of_level as suggested by pg sourcecode
2020-03-09 17:58:30 +00:00
Onder Kalaci 2ed19181fe Improve definition of RelationInfoContainsOnlyRecurringTuples
Before this commit, we considered !ContainsRecurringRTE() enough
for NotContainsOnlyRecurringTuples. Instead, we can check
for the existence of any distributed table.

DESCRIPTION: Fixes a bug that causes wrong results with complex outer joins
2020-03-09 17:28:33 +01:00
Marco Slot 5b1d1dd413 Remove unnecessary use of max_parallel_workers_per_gather 2020-03-06 13:18:58 +01:00
Marco Slot d0fead6691 Disable Postgres parallelism by default in tests 2020-03-06 13:18:58 +01:00
Hanefi Onaldi c0ad44f975
Fix early exit bug on intermediate result pruning
There are 2 problems with our early exit strategy that this commit fixes:

1- When we decide that a subplan's results are sent to all worker nodes,
we used to skip traversing the whole distributed plan, instead of
skipping only the subplan.

2- We used to consider all available nodes in the cluster (secondaries
and inactive nodes as well as active primaries) when deciding on early
exit strategy. This resulted in failures to early exit when there are
secondaries or inactive nodes.
2020-03-05 16:41:44 +03:00
Onder Kalaci f72916875f Expand test coverage for combinations of master evaluation, deferred pruning, parameters, local execution
- Router           & Remote & Requires Master Evaluation & With Param & Without Param
- Fast Path Router & Remote & Requires Master Evaluation & With Param & Without Param
2020-03-05 12:37:22 +01:00
Nils Dijk 268ad741a9
Refactor the deparsing of a CREATE EXTENSION to prevent NULL POINTER dereferences (#3518)
DESCRIPTION: satisfy static analysis tool for a nullptr dereference

During the static analysis project on the codebase this code was flagged as having the potential for a null pointer dereference. Funnily enough, the author had already noted in a code comment that this was not possible, because we set the schema name before passing in the statement. If we want to reuse this code in a later setting, this comment might not always apply and we could actually run into a null pointer dereference.

This patch changes the code around to, first of all, make sure there is no NULL pointer dereference in this code anymore.
Secondly we allow for better deparsing by setting and adhering to the `if_not_exists` flag on the statement.
And finally add support for all syntax described in the documentation of postgres (FROM was missing).
2020-03-04 16:47:07 +01:00
Marco Slot 27f23d2c89 Add some distribution column = composite type prepared statement tests 2020-03-04 05:01:43 +01:00
Onder Kalaci 087f6eb4c0 For composite types, add cast to the parameter to ease remote node detect
the type.
2020-03-04 11:27:45 +01:00
Philip Dubé 34f241af16 Fix create_distributed_table on a table using GENERATED ALWAYS AS
If the generated column does not come at the end of the column list,
columnNameList doesn't line up with the column indexes. Seek past the generated columns instead.

CREATE TABLE test_table (
    test_id int PRIMARY KEY,
    gen_n int GENERATED ALWAYS AS (1) STORED,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
SELECT create_distributed_table('test_table', 'test_id');

Would raise ERROR: cannot cast 23 to 1184
2020-02-28 09:34:26 -08:00
Jelte Fennema 685b54b3de
Semmle: Check for NULL in some places where it might occur (#3509)
Semmle reported quite a few places where we use a value that could be NULL. Most of these are not actually a real issue, but it's better to be on the safe side with these things and make the static analysis happy.
2020-02-27 10:45:29 +01:00
Hadi Moshayedi 1b3e58f0c3 Merge branch 'improve-shard-pruning' of https://github.com/MarkusSintonen/citus into MarkusSintonen-improve-shard-pruning 2020-02-26 07:13:33 -08:00
Nils Dijk a77ed9cd23
Refactor master query to be planned by postgres' planner (#3326)
DESCRIPTION: Replace the query planner for the coordinator part with the postgres planner

Closes #2761 

Citus had a simple rule-based planner for the query executed on the query coordinator. This planner grew over time with the addition of SQL support until it was getting close to the functionality of the postgres planner. However, the code was brittle and its complexity rose, which made it hard to add new SQL support.

Given its resemblance to the postgres planner, it was a long outstanding wish to replace our hand-crafted planner with the well-supported postgres planner. This patch replaces our planner with a call to postgres' planner.

Due to the functionality of the postgres planner we needed to support both projections and filters/quals on the citus custom scan node. When a sort operation is planned above the custom scan it might require fields to be reordered in the custom scan before returning the tuple (projection). The postgres planner assumes every custom scan node implements projections. Because we controlled the plan that was created we prevented reordering in the custom scan and never had implemented it before.

A same optimisation applies to having clauses that could have been where clauses. Instead of applying the filter as a having on the aggregate it will push it down into the plan which could reach a custom scan node.

For both filters and projections we have implemented them when tuples are read from the tuple store. If no projections or filters are required it will directly return the tuple from the tuple store. Otherwise it will loop tuples from the tuple store through the filter and projection until a tuple is found and returned.

Besides filters being pushed down, a side effect of having quals that could have been a where clause is that a call to read an intermediate result could happen before the first tuple is fetched from the custom scan. This failed because the intermediate result would only be pulled to the coordinator on the first tuple fetch. To overcome this problem we now run the distributed subplans before we run the postgres executor. This ensures the intermediate result is present on the coordinator in time. We account for total time instrumentation by removing the instrumentation before handing control to the postgres executor and updating the timings ourselves.

For future SQL support it is enough to create a valid query structure for the part of the query to be executed on the query coordinating node. As a utility we serialise and print the query at debug level 4 for engineers to inspect what kind of query is being planned on the query coordinator.
2020-02-25 14:39:56 +01:00
Philip Dubé 025cb94159 Fix multi_task_string_size sometimes leaking intermediate files 2020-02-24 16:33:34 +00:00
Philip Dubé bcf54c5014 Address a couple of issues with maintenance daemon management:
- Stop the daemon when citus extension is dropped
- Bail on maintenance daemon startup if myDbData is started with a non-zero pid
- Stop maintenance daemon from spawning itself
- Don't use postgres die, just wrap proc_exit(0)
- Assert(myDbData->workerPid == MyProcPid)

The two issues were that multiple daemons could be running for a database,
or that a daemon would be leftover after DROP EXTENSION citus
2020-02-21 16:49:01 +00:00
Nils Dijk 6ee82c381e
Add missing pieces for version bump of #3482 (#3523) 2020-02-21 12:35:29 +01:00
Onur Tirtir 926a1a61b9 change "relation" with "table" in error messages related with foreign keys on reference tables 2020-02-20 09:58:47 +03:00
Onur Tirtir 001089783c Fix null relation name issue in CheckConflictingRelationAccesses 2020-02-19 19:10:35 +03:00
Philip Dubé d7a4ffdc46 Add test for issue, does not reproduce issue 2020-02-18 23:45:17 +00:00
Marco Slot 038e5999cb Implement direct COPY table TO stdout 2020-02-17 15:15:10 +01:00
Jelte Fennema 15f1173b1d
Semmle: Ensure permissions of private keys are 0600 (#3506)
When using the --allow-group-access option from initdb, our keys and
certificates would be created with 0640 permissions, which is a pretty
serious security issue. This changes that. It would not be exploitable
though, since postgres would not actually enable SSL and would output
the following message in the logs:

```
DETAIL:  File must have permissions u=rw (0600) or less if owned by the database user, or permissions u=rw,g=r (0640) or less if owned by root.
```

Since citus still expected the cluster to have SSL enabled, handshakes
between workers and coordinator would fail. So instead of a security
issue, the cluster would simply be unusable.
2020-02-17 12:58:40 +01:00
Markus Sintonen 099e266a6c Force task executor 2020-02-16 01:32:52 +02:00
Markus Sintonen cf8319b992 Add comment, add subquery NOT tests 2020-02-16 01:21:10 +02:00
Markus Sintonen 3d3d615040 Add comment about NOT_EXPR. Treat it as invalid constraint for safety. 2020-02-15 16:54:38 +02:00
Philip Dubé 7382c8be00 Clean up from code review
Only change to behavior is:
- don't ignore array const's constcollid in SAORestrictions
- don't end lines with commas in DebugLogPruningInstance
2020-02-14 17:58:23 +00:00
Markus Sintonen cdedb98c54 Improve shard pruning logic to understand OR-conditions.
Previously a limitation in the shard pruning logic caused multi distribution value queries to always go to all the shards/workers whenever the query also used OR conditions in the WHERE clause.

Related to https://github.com/citusdata/citus/issues/2593 and https://github.com/citusdata/citus/issues/1537
There was no good workaround for this limitation. The limitation caused quite a bit of overhead, with simple queries being sent to all workers/shards (especially with setups having lots of workers/shards).

An example of a previous plan which was inadequately pruned:
```
EXPLAIN SELECT count(*) FROM orders_hash_partitioned
	WHERE (o_orderkey IN (1,2)) AND (o_custkey = 11 OR o_custkey = 22);
                                                          QUERY PLAN
---------------------------------------------------------------------
 Aggregate  (cost=0.00..0.00 rows=0 width=0)
   ->  Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0)
         Task Count: 4
         Tasks Shown: One of 4
         ->  Task
               Node: host=localhost port=xxxxx dbname=regression
               ->  Aggregate  (cost=13.68..13.69 rows=1 width=8)
                     ->  Seq Scan on orders_hash_partitioned_630000 orders_hash_partitioned  (cost=0.00..13.68 rows=1 width=0)
                           Filter: ((o_orderkey = ANY ('{1,2}'::integer[])) AND ((o_custkey = 11) OR (o_custkey = 22)))
(9 rows)
```

After this commit the task count is what one would expect from the query defining multiple distinct values for the distribution column:
```
EXPLAIN SELECT count(*) FROM orders_hash_partitioned
	WHERE (o_orderkey IN (1,2)) AND (o_custkey = 11 OR o_custkey = 22);
                                                          QUERY PLAN
---------------------------------------------------------------------
 Aggregate  (cost=0.00..0.00 rows=0 width=0)
   ->  Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0)
         Task Count: 2
         Tasks Shown: One of 2
         ->  Task
               Node: host=localhost port=xxxxx dbname=regression
               ->  Aggregate  (cost=13.68..13.69 rows=1 width=8)
                     ->  Seq Scan on orders_hash_partitioned_630000 orders_hash_partitioned  (cost=0.00..13.68 rows=1 width=0)
                           Filter: ((o_orderkey = ANY ('{1,2}'::integer[])) AND ((o_custkey = 11) OR (o_custkey = 22)))
(9 rows)
```

"Core" of the pruning logic works as previously where it uses `PrunableInstances` to queue ORable valid constraints for shard pruning.
The difference is that now we build a compact internal representation of the query expression tree with PruningTreeNodes before actual shard pruning is run.

Pruning tree nodes represent boolean operators and the associated constraints of it. This internal format allows us to have compact representation of the query WHERE clauses which allows "core" pruning logic to work with OR-clauses correctly.

For example query having
`WHERE (o_orderkey IN (1,2)) AND (o_custkey=11 OR (o_shippriority > 1 AND o_shippriority < 10))`
gets transformed into:
1. AND(o_orderkey IN (1,2), OR(X, AND(X, X)))
2. AND(o_orderkey IN (1,2), OR(X, X))
3. AND(o_orderkey IN (1,2), X)
Here X is any set of unknown condition(s) for shard pruning.

This allow the final shard pruning to correctly recognize that shard pruning is done with the valid condition of `o_orderkey IN (1,2)`.

Another example with unprunable condition in query
`WHERE (o_orderkey IN (1,2)) OR (o_custkey=11 AND o_custkey=22)`
gets transformed into:
1. OR(o_orderkey IN (1,2), AND(X, X))
2. OR(o_orderkey IN (1,2), X)

Which is recognized as unprunable due to the OR condition between distribution column and unknown constraint -> goes to all shards.

Issue https://github.com/citusdata/citus/issues/1537 originally suggested transforming the query conditions into a full disjunctive normal form (DNF),
but this process of transforming into DNF is quite a heavy operation. It may "blow up" into a really large DNF form with complex queries having non-trivial `WHERE` clauses.

I think the logic for shard pruning could be simplified further but I decided to leave the "core" of the shard pruning untouched.
2020-02-14 17:58:13 +00:00
Jelte Fennema 3d8efe303e
Fix flaky test introduced by #3374 (#3504)
Since #3374, multi_utilities is not safe to run in parallel anymore. This
is because it now also shows locks on shards created outside its own
test. This is not really possible to fix.

Example of flaky test:
- https://circleci.com/gh/citusdata/citus/89995
- https://circleci.com/gh/citusdata/citus/90017
2020-02-14 16:07:33 +01:00
Jelte Fennema 5ef3e83ce4
Make multi_utilities test take 2 seconds instead of 20 (#3507)
On worker 2 it was waiting for dustbunnies_990001 to be
vacuumed/analyzed. This table doesn't actually exist, so that never
happened. Now it waits for the correct table and throws an error if it
waits more than 10 seconds.
2020-02-14 15:38:51 +01:00
Onder Kalaci 975c4c2264 Do not prune shards if the distribution key is NULL
The root of the problem is that standard_planner() converts the following qual

```
   {OPEXPR
   :opno 98
   :opfuncid 67
   :opresulttype 16
   :opretset false
   :opcollid 0
   :inputcollid 100
   :args (
      {VAR
      :varno 1
      :varattno 1
      :vartype 25
      :vartypmod -1
      :varcollid 100
      :varlevelsup 0
      :varnoold 1
      :varoattno 1
      :location 45
      }
      {CONST
      :consttype 25
      :consttypmod -1
      :constcollid 100
      :constlen -1
      :constbyval false
      :constisnull true
      :location 51
      :constvalue <>
      }
   )
   :location 49
   }
```

To

```
(
   {CONST
   :consttype 16
   :consttypmod -1
   :constcollid 0
   :constlen 1
   :constbyval true
   :constisnull true
   :location -1
   :constvalue <>
   }
)
```

So, Citus doesn't deal with NULL values in real-time or non-fast-path router queries.

And, in the FastPathRouter planner, we check constisnull in DistKeyInSimpleOpExpression().
However, in the deferred pruning case, we did not check isnull for the Const.

Thus, the fix consists of two parts:
- Let PruneShards() not crash when NULL parameter is passed
- For deferred shard pruning in fast-path queries, explicitly check that we have CONST which is not NULL
2020-02-13 15:00:31 +01:00
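The statement shape exercised by the fix (names are illustrative); the NULL parameter reaches PruneShards() through the deferred-pruning path:

```sql
PREPARE by_key(text) AS SELECT * FROM dist_table WHERE dist_key = $1;

-- previously this could crash inside PruneShards(); now it simply
-- returns zero rows, since dist_key = NULL matches nothing
EXECUTE by_key(NULL);
```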
Onur Tirtir cd8210d516
Bump citus version to 9.3devel (#3482) 2020-02-13 16:22:05 +03:00
Onur Tirtir 39df51e903
Introduce objects to dist. infrastructure when updating Citus (#3477)
Mark existing objects that are not included in distributed object infrastructure
in older versions of Citus (but now should be) as distributed, after updating
Citus successfully.
2020-02-07 18:07:59 +03:00
Nils Dijk d5433400f9
Fix: Unnecessary repartition on joins with more than 4 tables (#3473)
DESCRIPTION: Fix unnecessary repartition on joins with more than 4 tables

In 9.1 we have introduced support for all CH-benCHmark queries by widening our definitions of joins to include joins with expressions in them. This had the undesired side effect of Q5 regressing on its plan by implementing a repartition join.

It turned out this regression was not directly related to widening of the join clause, nor the schema employed by CH-benCHmark. Instead it had to do with 4 or more tables being joined in a chain. A chain meaning:

```sql
SELECT * FROM a,b,c,d WHERE a.part = b.part AND b.part = c.part AND ....
```

Due to how our join order planner was implemented, it would only keep track of 1 of the partition columns when comparing whether the join could be executed locally. This caused a join chain of 4 tables to _always_ be executed as a repartition join. 3 tables joined in a chain would have the middle table shared by the two outer tables, causing the local join possibility to be found.

With this patch we keep a unique list (or set) of all partition columns participating in the join. When a candidate table is checked for the possibility of a local join, it will check if there is any partition column in that set that matches an equality join clause on the partition column of the candidate table.

By taking into account all partition columns in the left relation it will now find the local join path on >= 4 tables joined in a chain. 

fixes: #3276
2020-02-06 15:07:07 +01:00
Philip Dubé c252811884 dont: don't, wont: won't, acylic: acyclic 2020-02-05 17:32:22 +00:00
Halil Ozan Akgul 8ce4f20061 Fixes the bug of grants on public schema propagation 2020-02-05 18:05:58 +03:00
SaitTalhaNisanci 89dc7d5e41
remove outdated information in citus upgrade readme (#3471) 2020-02-05 13:31:02 +03:00
Marco Slot 64ca5c9acb Add additional INSERT..SELECT repartition tests 2020-02-05 11:06:44 +01:00
Hadi Moshayedi 9dd14fa90d Rename discarded target list items in repartitioned INSERT/SELECT 2020-02-05 11:06:44 +01:00
Onder Kalaci c7e2309f4c Improve single hash-repartitioning with numeric (or non-int) types
We used to treat the shard interval array that we passed as numeric[].
However, it should be int[], as the shard ranges are int[].
2020-02-04 20:30:04 +01:00
Hadi Moshayedi bc1a800f70 Use current user for repartition join temp schemas.
Otherwise when using a less privileged user we might get
errors when trying to create the schema.
2020-02-04 09:48:20 -08:00
Hadi Moshayedi 890e23e734 Update multi_insert_select_non_pushable_queries 2020-02-03 13:13:30 -08:00
Hadi Moshayedi 5818bcd27e Update with_dml 2020-02-03 13:13:30 -08:00
Hadi Moshayedi 46f60e1ac0 Update multi_insert_select_conflict 2020-02-03 13:13:30 -08:00
Hadi Moshayedi 05f58c9ec5 Update multi_insert_select 2020-02-03 13:13:30 -08:00
Hadi Moshayedi 264530311a Don't use distributed insert/select for repartitioned joins 2020-02-03 13:13:30 -08:00
Onder Kalaci 8be1b0112d Add failure test for parallel reference table join 2020-02-03 19:35:07 +01:00
Marco Slot a6bd6c657e Add tests that exercise parallel reference table join logic 2020-02-03 11:54:29 +01:00
Onder Kalaci 2f274a4fce Make sure to go deeper into the functions to search for PARAMs
For example, a PARAM might reside inside a function just because
of a type cast, such as the following:

```
               {FUNCEXPR
               :funcid 1740
               :funcresulttype 1700
               :funcretset false
               :funcvariadic false
               :funcformat 2
               :funccollid 0
               :inputcollid 0
               :args (
                  {PARAM
                  :paramkind 0
                  :paramid 15
                  :paramtype 23
                  :paramtypmod -1
                  :paramcollid 0
                  :location 356
                  }
               )
```

We should recursively check the expression before bailing out.
2020-02-03 09:36:12 +01:00
Philip Dubé db2eac5658 diff-filter: use utf8 encoding, not ascii 2020-01-31 00:03:17 +00:00
Hadi Moshayedi 9d988b3437 Add insert/select connection leak tests 2020-01-30 14:09:07 -08:00
Philip Dubé d43c80d4d8 pullUpIntermediateRows should not be true when groupedByDisjointPartitionColumn is true
This was causing 'SELECT id, stdev(y_int) FROM tbl GROUP BY id' to push down stddev without group by
2020-01-30 21:18:08 +00:00
Philip Dubé 5fccc56d3e Expand the set of aggregates which cannot have LIMIT approximated
Previously we only prevented AVG from being pushed down, but this is incorrect:
- array_agg, while somewhat nonsensical to order by, will potentially be missing values
- combinefunc aggregation will raise errors about cstrings not being comparable (while we also can't know if the aggregate is commutative)

This commit limits approximating LIMIT pushdown when ordering by aggregates to:
min, max, sum, count, bit_and, bit_or, every, any
Which means of those we previously supported, we now exclude:
avg, array_agg, jsonb_agg, jsonb_object_agg, json_agg, json_object_agg, hll_add, hll_union, topn_add, topn_union
2020-01-30 17:45:18 +00:00
Önder Kalacı 8584cb005b
Do not evaluate functions on the coordinator for SELECT queries (#3440)
Previously, the logic for evaluating the functions and the parameters
was the same. That ended up evaluating the functions inaccurately
on the coordinator. Instead, split the function evaluation logic
from the parameter evaluation logic.
2020-01-30 08:47:28 +01:00
Önder Kalacı e9c17b71a4
Add missing ORDER BY (#3441)
As it causes some random failures
2020-01-29 17:36:32 +01:00
Philip Dubé 40ce531850 Update diff-filter to handle lines removed by normalization
Add a script to test our diff logic

pg_regress_multi updated to rely on $PATH having copy_modified to fix testing with VPATH
2020-01-28 15:39:40 +00:00
Jelte Fennema b9eee70fa5 Fix random output ordering in CTE inlining test (#3434) 2020-01-27 16:38:27 +01:00
Jelte Fennema c38446b5f5 Replace denormalized test output with normalized at the end of the run 2020-01-24 11:42:38 +01:00
Önder Kalacı 4519d3411d
Improve the representation of used sub plans (#3411)
Previously, we've identified the usedSubPlans by only looking
to the subPlanId.

With this commit, we're expanding it to also include information
on the location of the subPlan.

This is useful to distinguish the cases where the subPlan is used
either on only HAVING or both HAVING and any other part of the query.
2020-01-24 10:47:14 +01:00
Philip Dubé 69dde460de See what flaky multi_extension test is doing with roles 2020-01-23 21:50:40 +00:00
Philip Dubé 5e55a36172 Avoid obscuring regression test diffs with normalization
First, diff is updated to not update the files in-place

For some reason diff is being called multiple times,
so $file1.unmodified becomes normalized on second invocation

Secondly, diff-filter updates output to come from the unmodified version

Normalization is serving two purposes:
- avoid diff noise in regressions
- avoid diff noise in commits when expected result is updated

The first purpose only wants to reduce the lines which diff registers,
whereas the second wants those changes to be committed
2020-01-23 18:51:23 +00:00
Önder Kalacı ef7d1ea91d
Locally execute queries that don't need any data access (#3410)
* Update shardPlacement->nodeId to uint

As the source of the shardPlacement->nodeId is always workerNode->nodeId,
and that is uint32.

We had this hack because of: 0ea4e52df5 (r266421409)

And, that is gone with: 90056f7d3c (diff-c532177d74c72d3f0e7cd10e448ab3c6L1123)

So, we're safe to do it now.

* Relax the restrictions on using the local execution

Previously, whenever any local execution happens, we disabled further
commands to do any remote queries. The basic motivation for doing that
is to prevent any accesses in the same transaction block to access the
same placements over multiple sessions: one is local session the other
is remote session to the same placement.

However, the current implementation does not distinguish local accesses
being to a placement or not. For example, we could have local accesses
that only touches intermediate results. In that case, we should not
implement the same restrictions as they become useless.

So, this is a pre-requisite for executing the intermediate result only
queries locally.

* Update the error messages

As the underlying implementation has changed, reflect it in the error
messages.

* Keep track of connections to local node

With this commit, we're adding infrastructure to track if any connection
to the same local host is done or not.

The main motivation for doing this is that we were previously more
conservative about not choosing local execution. Simply, we disallowed
local execution if any connection to any remote node had been made. However,
if we want to use local execution for intermediate-result-only queries,
this'd be annoying because we expect all queries to touch a remote node
before the final query.

Note that this approach is still limiting in Citus MX case, but for now
we can ignore that.

* Formalize the concept of Local Node

Also some minor refactoring while creating the dummy placement

* Write intermediate results locally when the results are only needed locally

Before this commit, Citus used to always broadcast all the intermediate
results to remote nodes. However, it is possible to skip pushing
the results to remote nodes always.

There are two notable cases for doing that:

   (a) When the query consists of only intermediate results
   (b) When the query is a zero shard query

In both of the above cases, we don't need to access any data on the shards. So,
it is a valuable optimization to skip pushing the results to remote nodes.

The pattern mentioned in (a) is actually a common pattern that Citus users
use in practice. For example, if you have the following query:

WITH cte_1 AS (...), cte_2 AS (....), ... cte_n (...)
SELECT ... FROM cte_1 JOIN cte_2 .... JOIN cte_n ...;

The final query could be operating only on intermediate results. With this patch,
the intermediate results of the ctes are not unnecessarily pushed to remote
nodes.

* Add specific regression tests

As there are edge cases in Citus MX and with round-robin policy,
use the same queries on those cases as well.

* Fix failure tests

By forcing not to use local execution for intermediate results since
all the tests expects the results to be pushed remotely.

* Fix flaky test

* Apply code-review feedback

Mostly style changes

* Limit the max value of pg_dist_node_seq to reserve for internal use
2020-01-23 18:28:34 +01:00
Hadi Moshayedi be647ad944 Output filenames in ensure_no_intermediate_data_leak
This can be helpful in guiding us where to look when this test fails.
For example, if the result file has repartitioned_results_ prefix,
then we need to look into repartitioned insert/select. Otherwise
it is probably a CTE or a subquery.
2020-01-22 11:12:16 -08:00
Jelte Fennema cd5259a25a
Do not place new shards with shards in TO_DELETE state (#3408)
When creating a new distributed table, the shards would colocate with shards
in SHARD_STATE_TO_DELETE (shardstate = 4). This means that if that state was
because of a shard move, the new shard would be created on two nodes and it
would not get deleted, since its shard state would be 1.
2020-01-22 14:52:12 +01:00
Halil Ozan Akgul b40f067d05 Adds propagation for grant on schema commands 2020-01-20 14:51:28 +03:00
Onder Kalaci fd17e4578e Improve tests 2020-01-17 16:02:57 +01:00
Onder Kalaci 0bf1e81e33 Cache local plans on BeginScan 2020-01-17 16:02:57 +01:00
Onder Kalaci 016f561e45 Ingest data for cte_inline tests 2020-01-17 12:46:00 +01:00
Onder Kalaci 3833a7e686 Fix issues for CTE inlining on Postgres 11
Comment from code:

/*
 * We had to implement this hack because on Postgres11 and below, the originalQuery
 * and the query would have significant differences in terms of CTEs where CTEs
 * would not be inlined on the query (as standard_planner() wouldn't inline CTEs
 * on PG 11 and below).
 *
 * Instead, we prefer to pass the inlined query to the distributed planning. We rely
 * on the fact that the query includes subqueries, and it'd definitely go through
 * query pushdown planning. During query pushdown planning, the only relevant query
 * tree is the original query.
 */
2020-01-17 11:59:02 +01:00
Jelte Fennema 246435be7e
Lazy query deparsing executable queries (#3350)
Deparsing and parsing a query can be heavy on CPU. When locally executing 
the query, in theory we don't need to do this most of the time.

This PR is the first step in allowing to skip deparsing and parsing
the query in these cases, by lazily creating the query string and
storing the query in the task. Future commits will make use of this and
not deparse and parse the query anymore, but use the one from the task
directly.
2020-01-17 11:49:43 +01:00
Hadi Moshayedi 6cf1c01660 Don't use repartitioned INSERT/SELECT for repartition joins 2020-01-16 23:40:31 -08:00
Hadi Moshayedi 5eeb07124f Repartitioned INSERT/SELECT: include job id in result id prefix 2020-01-16 23:24:52 -08:00
Hadi Moshayedi a079278b0c Repartitioned INSERT/SELECT: Add a GUC to enable/disable it 2020-01-16 23:24:52 -08:00
Hadi Moshayedi ce5eea4885 INSERT/SELECT: make SELECT column names unique 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 3258d87f3e Isolation tests for INSERT/SELECT repartition 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 8b27a9a195 More range partitioned tests 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 8635396cea Repartitioned INSERT/SELECT: Test rollback behaviour 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 43218eebf6 Failure tests for INSERT/SELECT repartition 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 665b33dca1 MX tests for INSERT/SELECT repartition 2020-01-16 23:24:52 -08:00
Hadi Moshayedi af2349f21f Repartitioned INSERT/SELECT: Add a prepared statement test 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 97072c9eb1 INSERT/SELECT: show method in EXPLAIN output 2020-01-16 23:24:52 -08:00
Hadi Moshayedi b143d9588a Repartitioned INSERT/SELECT: Test GROUP BY 2020-01-16 23:24:52 -08:00
Hadi Moshayedi fe548b762f Repartitioned INSERT/SELECT: Test CTEs 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 494cc383cc Repartitioned INSERT/SELECT: Enable RETURNING 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 4b14347fc3 Tests for DML followed by insert/select repartition 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 44a2aede16 Don't start a coordinated transaction on workers.
Otherwise transaction hooks of Citus kick in and might cause unwanted errors.
2020-01-16 23:24:52 -08:00
Hadi Moshayedi 42c3c03b85 Handle extra columns added in ExpandWorkerTargetEntry() in repartitioned INSERT/SELECT 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 89463f9760 Repartitioned INSERT/SELECT: cast columns in SELECT targets 2020-01-16 23:24:52 -08:00
Hadi Moshayedi d67a384350 Enable repartitioned INSERT/SELECT ON CONFLICT. 2020-01-16 23:24:52 -08:00
Hadi Moshayedi b4e5f4b10a Implement INSERT ... SELECT with repartitioning 2020-01-16 23:24:52 -08:00
Hadi Moshayedi e30580e2bd Add ORDER BY to multi_row_insert.sql 2020-01-16 15:20:39 -08:00
Jelte Fennema 0ee1eab070 Make tests fail with a useful error message 2020-01-16 18:30:30 +01:00
Jelte Fennema cb5154cf03 Add more failing tests, of which some have bad error messages 2020-01-16 18:30:30 +01:00
Onder Kalaci dc17c2658e Defer shard pruning for fast-path router queries to execution
This is purely to enable better performance with prepared statements.
Before this commit, the fast path queries with prepared statements
where the distribution key includes a parameter always went through
distributed planning. After this change, we only go through distributed
planning on the first 5 executions.
2020-01-16 16:59:36 +01:00
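A sketch of the behaviour described above (names are illustrative); the five-execution threshold lines up with PostgreSQL's custom-plan heuristic for prepared statements:

```sql
PREPARE fp(int) AS SELECT * FROM dist_table WHERE dist_key = $1;

EXECUTE fp(1);  -- executions 1-5: distributed planning on every execution
EXECUTE fp(2);
EXECUTE fp(3);
EXECUTE fp(4);
EXECUTE fp(5);
EXECUTE fp(6);  -- from here on: shard pruning is deferred to execution time
```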
Halil Ozan Akgul c5539d20d9 Adds alter table schema propagation 2020-01-16 17:04:16 +03:00
Nils Dijk b6e09eb691
Fix: distributed function with table reference in declare (#3384)
DESCRIPTION: Fixes a problem when adding a new node due to tables referenced in a functions body

Fixes #3378 

It was reported that `master_add_node` would fail if a distributed function has a table name referenced in the DECLARE section of its body. By default postgres validates the body of a function on creation. This is not a problem in the normal case, as tables are replicated to the workers when we distribute functions.

However, when a new node is added, we first create dependencies on the workers before we try to create any tables, and the original tables get created out of band when the metadata gets synced to the new node. This causes the function body validator to raise an error that the table is not on the worker.

To mitigate this issue we set `check_function_bodies` to `off` right before we are creating the function.

The added test shows this does resolve the issue. (issue can be reproduced on the commit without the fix)
2020-01-16 14:21:54 +01:00
Jelte Fennema e76281500c
Replace shardId lock with lock on colocation+shardIntervalIndex (#3374)
This new locking pattern makes sure that some deadlocks that could
happen during rebalancing cannot occur anymore.
2020-01-16 13:14:01 +01:00
Jelte Fennema 86343bcc8f Re-add test that broke with GUC workaround 2020-01-16 12:34:50 +01:00
Jelte Fennema 6b9b633695 Add more tests for prepared statements 2020-01-16 12:28:15 +01:00
Jelte Fennema 43a3fdd12f Fix comment 2020-01-16 12:28:15 +01:00
Jelte Fennema fe3827e499 Add tests for [NOT] MATERIALIZED 2020-01-16 12:28:15 +01:00
Onder Kalaci 326dfab44a Fix a query which triggers an existing bug, see https://github.com/citusdata/citus/issues/3189#issuecomment-571497051 2020-01-16 12:28:15 +01:00
Onder Kalaci c653923960 Update regression tests 6
Local execution and CTE pushdown
2020-01-16 12:28:15 +01:00
Onder Kalaci 3818be45a6 Update regression tests-5
Failure tests that rely on intermediate results
2020-01-16 12:28:15 +01:00
Onder Kalaci 1e85938b46 Update regression tests-4
Update the MX tests. Similar to the previous commits, prevent CTE
inlining in some cases to prevent divergent test outputs.
2020-01-16 12:28:15 +01:00
Onder Kalaci fc07bd7c5b Update regression tests-3
Update the regression tests which only change in PG 12.
2020-01-16 12:28:15 +01:00
Onder Kalaci 64560b07be Update regression tests-2
In this commit, we're introducing a way to prevent CTE inlining via a GUC.

The GUC is used in all the tests where PG 11 and PG 12 tests would diverge
otherwise.

Note that, in PG 12, the restriction information for CTEs is generated. It
means that for some queries involving CTEs, the Citus planner (router planner/
pushdown planner) may behave differently. So, via the GUC, we prevent
tests from diverging on PG 11 vs PG 12.

When we drop PG 11 support, we should get rid of the GUC, and mark
relevant ctes as MATERIALIZED, which does the same thing.
2020-01-16 12:28:15 +01:00
Onder Kalaci 5cb203b276 Update regression tests-1
These set of tests has changed in both PG 11 and PG 12.
The changes are only about CTE inlining kicking in both
versions, and yielding the exact same distributed planning.
2020-01-16 12:28:15 +01:00
Onder Kalaci 421bf68516 Add the specific regression tests
With this commit, we're adding the specific tests for CTE inlining.
The test has a different output file for pg 11, because as mentioned
in the previous commits, PG 12 generates more restriction information
for CTEs.
2020-01-16 12:28:15 +01:00
Philip Dubé 4d9a733c2f Fix inserting multiple values with row expression partition column causing the insert to be ignored
Raise an error instead of silently inserting nothing if we hit this condition in the future
2020-01-15 21:10:50 +00:00
Marco Slot 06709ee108 Always use NOTICE in log_remote_commands and avoid redaction when possible 2020-01-13 18:24:36 +01:00
Marco Slot 90056f7d3c Remove copy from worker for append-partitioned table 2020-01-13 23:03:40 -08:00
Philip Dubé 62524d152d mitmscripts/fluent.py: use atomic increment 2020-01-13 20:35:08 +00:00
Philip Dubé ccabf19090 Propagate DROP ROUTINE, ALTER ROUTINE
In two places I've made the code more straightforward by using ROUTINE in our own codegen

Two changes which may seem extraneous:

AppendFunctionName was updated to not use pg_get_function_identity_arguments.
This is because that function includes ORDER BY when printing an aggregate like my_rank.
While ALTER AGGREGATE my_rank(x "any" ORDER BY y "any") is accepted by postgres,
ALTER ROUTINE my_rank(x "any" ORDER BY y "any") is not.

Tests were updated to use macaddr over integer. Using integer is flaky: our logic
could sometimes end up on tables like users_table. I originally wanted to use money,
but money isn't hashable.
2020-01-13 15:37:46 +00:00
Philip Dubé 4b5d6c3ebe Rename RelayFileState to ShardState
Replace FILE_ prefix with SHARD_STATE_
2020-01-12 05:57:53 +00:00