4179 Commits (aa465b6de1702fce7e2b5a0e577d085e838c2209)

Author SHA1 Message Date
Önder Kalacı dc6c194916
Show IDLE backends in citus_dist_stat_activity (#5700)
* Break the dependency to CitusInitiatedBackend infrastructure

With this change, we start to show non-distributed backends as well
in citus_dist_stat_activity. I think that
  (a) it is essential for making citus_lock_waits work for backends blocked
      on DDL commands.
  (b) it is more expected from the user's perspective. The name of
      the view is a little inconsistent now (e.g., citus_dist_stat_activity)
      but we are already planning to improve the names with followup
      PRs.

Also, now that we have global pids assigned, the CitusInitiatedBackend
infrastructure becomes obsolete.
2022-02-10 08:59:28 -08:00
Ahmet Gedemenli 76b63a307b Propagate create/drop schema commands 2022-02-10 14:58:09 +03:00
Marco Slot d0711ea9b4 Delegate function calls in FROM outside of transaction block 2022-02-09 20:56:25 +01:00
Onder Kalaci 1c30f61a70 Prevent citus.node_conninfo from using "application_name"
With https://github.com/citusdata/citus/pull/5657, Citus uses
a fixed application_name while connecting to remote nodes
for internal purposes.

It means that we cannot allow users to override it via
citus.node_conninfo.
2022-02-09 13:22:04 +01:00
Teja Mupparti 1e3c8e34c0 Allow create_distributed_function() on a function owned by an extension
Implement #5649
Allow create_distributed_function() on functions owned by extensions

1) Only update pg_dist_object, and do not propagate CREATE FUNCTION.
2) Ensure corresponding extension is in pg_dist_object.
3) Verify that, if dependencies exist on the function, they resolve to the extension.
4) Impact on node-scaling: We build a list of ddl commands based on all objects in
   pg_dist_object. We need to omit the ddl's for the extension-function, as it
   will get propagated by the virtue of the extension creation.
5) Extra checks for functions coming from extensions, to not propagate changes
   via ddl commands, even though the function is marked as distributed in pg_dist_object
2022-02-08 11:52:56 -08:00
Halil Ozan Akgul 8ee02b29d0 Introduce global PID 2022-02-08 16:49:38 +03:00
Burak Velioglu 0a70b78bf5
Add test for dist type 2022-02-07 17:50:49 +03:00
Burak Velioglu c0aece64d0
Add test for checking distributed extension function 2022-02-07 17:50:48 +03:00
Burak Velioglu ab248c1785
Check object ownership while creating pg_dist_object entries on remote 2022-02-07 17:50:48 +03:00
Burak Velioglu 8ae7577581
Use superuser connection while syncing dependent objects' pg_dist_object tuples 2022-02-07 17:50:45 +03:00
Marco Slot 872f0a79db Remove random shard placement policy 2022-02-06 21:55:58 +01:00
Marco Slot 0cae8e7d6b Remove local-node-first shard placement 2022-02-06 21:36:34 +01:00
Teja Mupparti c8e504dd69 Fix the issue #5673
If the expression is simple, such as SELECT function() or PERFORM function()
in PL/pgSQL code, the PL engine does a simple expression evaluation, which can't
interpret the Citus CustomScan node. The code checks for simple expressions when
executing a UDF but missed the DO-block scenario; this commit fixes it.
2022-02-04 15:44:53 -08:00
Ying Xu b5c116449b
Removed dependency from EnsureTableOwner (#5676)
Removed dependency for EnsureTableOwner. Also removed pg_fini() and columnar_tableam_finish(). Still need to remove the CheckCitusVersion dependency to make Columnar_tableam.h dependency-free from Citus.
2022-02-04 12:45:07 -08:00
Onur Tirtir 79442df1b7
Fix coordinator/worker query targetlists for agg. that we cannot push-down (#5679)
Previously, we were wrapping targetlist nodes with Vars that reference
the result of the worker query if the node itself was not a `Const` or
a `Param`. Instead, we should not do that unless the node itself is
a `Var` node or contains a `Var` within it (e.g.: `OpExpr(Var(column_a) > 2)`).
Otherwise, when the worker query returns an empty result set, the combine
query execution would crash since the `Var` would be pointing to an empty
tuple slot, which is not desirable for the node-executor methods.
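
The rule change described above can be illustrated with a small sketch. This is not the Citus planner code (which is C and works on PostgreSQL node trees); the `Node` class and both helper functions are hypothetical and only model the old and new wrapping decisions on a toy expression tree.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy stand-in for a planner expression node (hypothetical)."""
    kind: str                                   # e.g. "Var", "Const", "Param", "OpExpr"
    children: List["Node"] = field(default_factory=list)


def contains_var(node: Node) -> bool:
    """True if the node is a Var or has a Var anywhere beneath it."""
    return node.kind == "Var" or any(contains_var(c) for c in node.children)


def old_rule_wraps(node: Node) -> bool:
    """Old behavior as described: wrap anything that is not a Const or a Param."""
    return node.kind not in ("Const", "Param")


def new_rule_wraps(node: Node) -> bool:
    """New behavior as described: wrap only nodes that are, or contain, a Var."""
    return contains_var(node)


# OpExpr(Var(column_a) > 2) contains a Var, so both rules wrap it.
op_expr = Node("OpExpr", [Node("Var"), Node("Const")])
# A function call over constants has no Var: only the old rule wraps it.
func_of_const = Node("FuncExpr", [Node("Const")])
```

Under these assumptions, `func_of_const` is the kind of node where the two rules disagree: the old rule would point a `Var` at worker output that may be empty, while the new rule leaves the node alone.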
2022-02-04 05:37:25 -08:00
Onder Kalaci 72d7d92611 Apply code review feedback 2022-02-04 10:52:57 +01:00
Onder Kalaci 923bb194a4 Move isolation_multiuser_locking to MX tests 2022-02-04 10:52:57 +01:00
Onder Kalaci bcb00e3318 remove not used files 2022-02-04 10:52:57 +01:00
Onder Kalaci ff234fbfd2 Unify old GUCs into a single one
Replaces citus.enable_object_propagation with citus.enable_metadata_sync

Also, within Citus 11 release cycle, we added citus.enable_metadata_sync_by_default,
that is also replaced with citus.enable_metadata_sync.

In essence, when citus.enable_metadata_sync is set to true, all the objects
and the metadata are sent to the remote node.

We strongly advise that users never change the value of
this GUC.
2022-02-04 10:52:56 +01:00
Teja Mupparti f31bce5b48 Fixes the issue seen in https://github.com/citusdata/citus-enterprise/issues/745
With this commit, rebalancer backends are identified by application_name = citus_rebalancer
and the regular internal backends are identified by application_name = citus_internal
2022-02-03 09:40:46 -08:00
jeff-davis b072b9235e
Columnar: fix checksums, broken in a4067913. (#5669)
Checksums must be set directly before writing the page. log_newpage()
sets the page LSN, and therefore invalidates the checksum.
2022-02-02 13:22:11 -08:00
Onder Kalaci 650243927c Relax some transactional limitations on activate node
We already enforce EnsureSequentialModeMetadataOperations(), and given that the whole activate node operation is transactional, we should be fine.
2022-02-01 15:56:55 +01:00
Onder Kalaci 34d91009ed Update outdated comment
As of the current HEAD, we support sequences as first class objects
2022-02-01 15:37:10 +01:00
Marco Slot 63c6896716 Enable function call pushdown from workers 2022-02-01 14:13:25 +01:00
Önder Kalacı f712dfc558
Add tests coverage (#5672)
For extension owned tables with sequences
2022-02-01 15:39:52 +03:00
Burak Velioglu f88cc230bf
Handle tables and objects as metadata. Update UDFs accordingly
With this commit we've started to propagate sequences and shell
tables within the object dependency resolution. So, ensuring any
dependencies for any object will consider shell tables and sequences
as well. Separate logics for both shell tables and sequences have
been removed.

Since both shell tables and sequences logic were implemented as a
part of the metadata handling before that logic, we were propagating
them while syncing table metadata. With this commit we've divided
metadata (which means anything except shards thereafter) syncing
logic into multiple parts and implemented it either as a part of
ActivateNode. You can check the functions called in ActivateNode
to check definition of different metadata.

Definitions of start_metadata_sync_to_node and citus_activate_node
have also been updated. citus_activate_node will basically create
an active node with all metadata and reference table shards.
start_metadata_sync_to_node will be same with citus_activate_node
except replicating reference tables. stop_metadata_sync_to_node
will remove all the metadata. All of those UDFs need to be called
by superuser.
2022-01-31 16:20:15 +03:00
Önder Kalacı f68ac4a7cf
Consider foreign keys between reference tables (#5659)
On #5071, we avoid edge cases, but below there are foreign key constraints as well

This commit makes sure we cover those as well
2022-01-28 13:38:14 +01:00
Heikki Linnakangas a40679139b
Use smgrextend() when extending relation, and WAL-log first. (#5654)
When creating a new table, we bypass the buffer cache and write the
initial pages directly with smgrwrite(). However, you're supposed to
use smgrextend() when extending a relation, rather than smgrwrite().
There isn't much difference between them, but smgrextend() updates the
relation size cache, which seems important, although I haven't seen
any real bugs caused by that.

Also, write the block to disk only after WAL-logging it, so that we
can include the LSN of the WAL record in the version that we write
out. Currently, the page as written to disk has LSN 0. That doesn't
cause any user-visible issues either, at worst it could make us
WAL-log a full page image of the page earlier than necessary, but that
doesn't matter currently because we WAL-log full page images of all
changes anyway.

I bumped into that issue with LSN 0 in the page header when testing
Citus with Zenith (https://github.com/zenithdb/zenith/issues/1176).
Zenith contains a check that PANICs if you write a block to disk
without WAL-logging it, and it works by checking the LSN of the page
that's written out. In this case, we are WAL-logging the page even
though the LSN on the page is 0, so it was a false alarm, but I'd love
to get this changed in Citus to keep the check in Zenith simple.

A downside of WAL-logging the page first is that if you run out of
disk space, you have already created the WAL record. So if you then
crash and restart, WAL recovery will likely run out of disk space,
too, which is bad. In practice, we have the same problem in other
places, like rewriteheap.c. Also, if you are on the brink of running
out of disk space, you will probably run out at WAL replay anyway,
regardless of which order we write these few pages. But if we wanted
to fix that, we could first extend the relation with zeros, and then
WAL-log the pages. That's how heap extension works.

It would be even nicer to use the buffer cache for this, and skip the
smgrimmedsync() on the relation. However, that would require more
work, because we don't have the Relation struct for the relation here.
We could use ReadBufferWithoutRelcache(), but that doesn't work for
unlogged tables. Unlogged tables are currently not supported
(https://github.com/citusdata/citus/issues/4742), but that would
become a problem if we want to support them in the future.
CreateFakeRelcacheEntry() also doesn't work with unlogged tables. We
could do things differently for logged and unlogged tables, but that
complicates the code further.

Co-authored-by: jeff-davis <Jeffrey.Davis@microsoft.com>
2022-01-27 12:04:08 -08:00
Onder Kalaci 303540e494 Add PGAPPNAME env. variable to arbitrary configs 2022-01-27 11:00:15 +01:00
Onder Kalaci b26eeaecd3 Use a fixed application_name while connecting to remote nodes
Citus heavily relies on application_name, see
`IsCitusInitiatedRemoteBackend()`.

But if the user sets the application name, such as export PGAPPNAME=test_name,
Citus uses that name while connecting to the remote node.

With this commit, we ensure that Citus always connects with
the "citus" application_name to the remote nodes.
2022-01-27 10:46:25 +01:00
Onder Kalaci b9b419ef16 Allow creating distributed tables in sequential mode
With https://github.com/citusdata/citus/pull/2780, we allow
COPY to use any number of connections that the executor used
in a tx block.

Meaning that, while COPYing data to the shards, create_distributed_table
could allow sequential mode.
2022-01-26 12:58:18 +01:00
Onur Tirtir 8c8d696621
Not fail over to local execution when it's not supported (#5625)
We fall back to local execution if we cannot establish any more
connections to local node. However, we should not do that for the
commands that we don't know how to execute locally (or we know we
shouldn't execute locally). To fix that, we take localExecutionSupported
into account in CanFailoverPlacementExecutionToLocalExecution too.

Moreover, we also emit a more accurate hint message to inform the user
about whether the execution failed because local execution is
disabled by them, or because local execution wasn't possible for the given
command.
2022-01-25 16:43:21 +01:00
Onur Tirtir ff3913ad99
Copy errmsg for distributed deadlock error into heap (#5641)
multi_log_hook() is called by EmitErrorReport() when emitting the
ereport either to the frontend or to the server logs. Some callers of
EmitErrorReport() (e.g.: errfinish()) seem to assume that the string fields
of the given ErrorData object need to be freed. For this reason, we copy the
message into heap here.

I don't think we have faced such a problem before, but it seems worth
fixing as it is theoretically possible due to the reasoning above.
2022-01-24 06:27:41 -08:00
Ahmet Gedemenli c838fb428f Refactor GenerateGrantOnSchemaStmtForRights 2022-01-24 11:31:59 +03:00
Ahmet Gedemenli e6fc0c6f36 Turn mx on for test: multi_colocation_utils 2022-01-21 19:31:47 +03:00
Onur Tirtir 4dc38e9e3d
Use EnsureCompatibleLocalExecutionState instead (#5640) 2022-01-21 15:37:59 +01:00
Ahmet Gedemenli 8647682c11 Fix typo: taget/target 2022-01-21 10:35:56 +03:00
Onur Tirtir 181111b84f Drop ruleutils copied for statistics 2022-01-20 17:28:19 +03:00
Onur Tirtir 7b59295af2 Drop ruleutils copied for triggers 2022-01-20 17:28:19 +03:00
Önder Kalacı e8ba9dd9d3
Merge branch 'master' into make_minimal_work_again 2022-01-20 11:48:53 +01:00
Teja Mupparti 54862f8c22 (1) Functions will be delegated even when present in the scope of an explicit
BEGIN/COMMIT transaction block or in a UDF calling another UDF.
(2) Prohibit/Limit the delegated function not to do a 2PC (or any work on a
remote connection).
(3) Have a safety net to ensure the (2) i.e. we should block the connections
from the delegated procedure or make sure that no 2PC happens on the node.
(4) Such delegated functions are restricted to use only the distributed argument
value.

Note: To limit the scope of the project we are considering only Functions(not
procedures) for the initial work.

DESCRIPTION: Introduce a new flag "force_delegation" in create_distributed_function(),
which will allow a function to be delegated in an explicit transaction block.

Fixes #3265

Once the function is delegated to the worker, on that node during the planning

distributed_planner()
TryToDelegateFunctionCall()
CheckDelegatedFunctionExecution()
EnableInForceDelegatedFuncExecution()
Save the distribution argument (Constant)
ExecutorStart()
CitusBeginScan()
IsShardKeyValueAllowed()
Ensure to not use non-distribution argument.

ExecutorRun()
AdaptiveExecutor()
StartDistributedExecution()
EnsureNoRemoteExecutionFromWorkers()
Ensure all the shards are local to the node in the remoteTaskList.
NonPushableInsertSelectExecScan()
InitializeCopyShardState()
EnsureNoRemoteExecutionFromWorkers()
Ensure all the shards are local to the node in the placementList.

This also fixes a minor issue: Properly handle expressions+parameters in distribution arguments
2022-01-19 16:43:33 -08:00
Onder Kalaci 7f30222c90 Fix check-minimal
It seems like we broke check-minimal with the refactor on #5486

This commit fixes the minor issue
2022-01-19 16:21:59 +01:00
Ahmet Gedemenli 9e6ebe4826 Turn mx on for test file citus_local_tables, on multi-1 schedule 2022-01-19 13:55:51 +03:00
Onur Tirtir 4a53967bdd
Remove an outdated comment from RelationIsAKnownShard (#5629) 2022-01-19 11:24:10 +01:00
Ahmet Gedemenli 37b3f50447
Turn mx on for multi-1 schedule (#5627)
For test files: multi_generate_ddl_commands, multi_repair_shards, multi_create_shards, mixed_relkind_tests
2022-01-19 12:05:54 +03:00
Marco Slot 33bfa0b191 Hide shards from application_name's with a specific prefix 2022-01-18 15:20:55 +04:00
Onur Tirtir d98500ac22
Fix a flaky test related with temp columnar table cleanup (#5599)
Wait for the old backend to expire to make sure that temp table cleanup
is complete.
2022-01-17 09:26:30 -08:00
Ahmet Gedemenli e564220dd5
Fix typo: GetRelationTriggerFunctionDependencyList (#5626) 2022-01-17 18:17:07 +03:00
Ahmet Gedemenli 8936543b80
Create wrapper function CreateObjectAddressDependencyDefList (#5623) 2022-01-17 15:35:40 +03:00
Ying Xu 4dca662e97
Making Columnar Dependency Free from Citus (#5622)
* Removed distributed dependency in columnar_metadata.c

* Changed columnar_debug.c so that it no longer needed distributed/tuplestore and made it return a record instead of a tuplestore

* removed distributed/commands.h dependency

* Made columnar_tableam.c dependency-free

* Fixed spacing for columnar_store_memory_stats function

* indentation fix

* fixed test failures
2022-01-14 09:43:05 -08:00
Onur Tirtir 70d8e1fe97
Assert that we will create indexes on shards via local execution (#5620) 2022-01-13 17:09:57 +01:00
Halil Ozan Akgul 63cd90e5dd Add missing library to dependencies.c 2022-01-11 18:36:43 +03:00
Önder Kalacı 46ec7cd5cf Enable MX for rebalancer tests 2022-01-11 12:07:39 +01:00
Önder Kalacı 885601c02c
Require superuser while activating a node (#5609)
* Require superuser while activating a node

With this change, we explicitly require a superuser for ActivateNode()
(hence citus_add_node(), citus_activate_node()).

Before this commit, these functions were designed to work with
non-superuser roles with the relevant GRANTs given.

However, that is not a widely used way of calling the functions
above.

Due to the possibility of a non-superuser calling the UDFs, they were
designed in a way that some commands were using some additional
short-lived superuser connections. That is:
        (a) breaking transactional behavior (e.g., ROLLBACK
            wouldn't fully roll back the whole transaction)
        (b) making it very complicated to reason about which
            parts of the node activation go over which connections,
            and becoming vulnerable to deadlocks / visibility issues.
2022-01-10 08:30:13 -08:00
Onur Tirtir 3cc44ed8b3
Tell other backends it's safe to ignore the backend that concurrently built the shell table index (#5520)
In addition to starting a new transaction, we also need to tell other
backends --including the ones spawned for connections opened to
localhost to build indexes on shards of this relation-- that concurrent
index builds can safely ignore us.

Normally, DefineIndex() only does that if index doesn't have any
predicates (i.e.: where clause) and no index expressions at all.
However, now that we already called standard process utility, index
build on the shell table is finished anyway.

The reason behind doing so is that we cannot guarantee not grabbing any
snapshots via adaptive executor, and the backends creating indexes on
local shards (if any) might block on waiting for current xact of the
current backend to finish, which would cause self deadlocks that are not
detectable.
2022-01-10 10:23:09 +03:00
Marco Slot ee3b50b026 Disallow remote execution from queries on shards 2022-01-07 17:46:21 +01:00
Önder Kalacı 8d1b188620
Enable MX for the remaining failure tests (#5606) 2022-01-07 17:24:31 +01:00
Ahmet Gedemenli 3c834e6693
Disable foreign distributed tables (#5605)
* Disable foreign distributed tables
* Add warning for existing distributed foreign tables
2022-01-07 18:12:23 +03:00
Onder Kalaci 7cb1d6ae06 Improve metadata connections
With https://github.com/citusdata/citus/pull/5493 we introduced
metadata specific connections.

With this connection we guarantee that there is a single metadata connection.
But note that this connection can be used for any other operation.
In other words, this connection is not only reserved for metadata
operations.

However, as https://github.com/citusdata/citus-enterprise/issues/715 showed
us, the logic has a flaw. We allowed ineligible connections to be
picked as metadata connections, such as exclusively claimed connections
or not fully initialized connections.

With this commit, we make sure that we only consider eligible connections
for metadata operations.
2022-01-07 10:36:32 +01:00
Onder Kalaci 9f2d9e1487 Move placement deletion from disable node to activate node
We prefer the background daemon to only sync node metadata. That's
why we move placement metadata changes from disable node to
activate node. With that, we can make sure that disable node
only changes node metadata, whereas activate node syncs all
the metadata changes. In essence, we already expect all
nodes to be up when a node is activated. So, this does not change
the behavior much.
2022-01-07 09:56:03 +01:00
Hanefi Onaldi 9edfbe7718 Fix the default value for DeferShardDeleteOnMove
The default for GUC citus.defer_drop_after_shard_move is true. However
we initialize the global variable with a false value.
2022-01-07 11:01:49 +03:00
Ahmet Gedemenli 45e423136c
Support foreign tables in MX (#5461) 2022-01-06 18:50:34 +03:00
Önder Kalacı 5305aa4246
Do not drop sequences when dropping metadata (#5584)
Dropping sequences means we need to recreate them,
and hence we lose the sequence.

With this commit, we keep the existing sequences
such that resyncing wouldn't drop the sequence.

We do that by breaking the dependency of the sequence
from the table.
2022-01-06 09:48:34 +01:00
Önder Kalacı 8007adda25
Convert the function to a distributed function (#5596)
so that when metadata is synced, the table is on the worker
2022-01-06 11:32:40 +03:00
Önder Kalacı 6d9218540b
Enable single node tests with Citus MX (#5595)
* Enable single node tests with Citus MX

The test already has comment on the changes
2022-01-05 16:00:44 +03:00
jeff-davis 2e03efd91e
Columnar: move DDL hooks to citus to remove dependency. (#5547)
Add a new hook ColumnarTableSetOptions_hook so that citus can get
control when the columnar table options change.
2022-01-04 23:26:46 -08:00
jeff-davis c9292cfad1
Make pg_version_compat.h and listutils.c dependency-free. (#5548)
Split distributed/version_compat.h into dependency-free
pg_version_compat.h, and the original which still has
dependencies. The original doesn't have much purpose, but until other
files have better discipline about including the correct header files,
then it's still needed.

Also make distributed/listutils.h dependency-free. Should be moved
outside of 'distributed' subdirectory, but that will cause significant
code churn, so leave for another cleanup patch.

Now both files can be included in columnar without creating a
dependency on citus.
2022-01-04 23:02:08 -08:00
jeff-davis 1546aa0d9f
Columnar: use proper generic WAL interface. (#5543)
Previously, we cheated by using the RM_GENERIC_ID record type, but not
actually using the generic WAL API. This worked because we always took
a full page image, and saved the extra work of allocating and copying
to a temporary page.

But it introduced complexity, and perhaps fragility, so better to just
use the API properly. The performance penalty for a serial data load
seems to be less than 1%.
2022-01-04 22:42:21 -08:00
Onder Kalaci 22b5175fd1 Make sure that the community and enterprise tests produce the same output 2022-01-04 13:30:31 +01:00
Önder Kalacı 0a8b0b06c6
Do not allow distributed functions on non-metadata synced nodes (#5586)
Before this commit, Citus was triggering metadata syncing
in the background when a function is distributed. However,
with Citus 11, we expect all clusters to have metadata syncing
enabled. So, we do not expect any nodes not to have the metadata.

This change:
        (a) pro: simplifies the code and opens up possibilities
            to simplify further by reducing the scope of the
            bg worker to only sync node metadata
        (b) pro: explicitly asks users to sync the metadata such that
            any unforeseen impact can be easily detected
        (c) con: For distributed functions without a distribution
            argument, we do not necessarily require the metadata
            to be synced. However, for completeness and simplicity, we
            do so.
2022-01-04 13:12:57 +01:00
Halil Ozan Akgul 9547228e8d Add isolation_check_mx test 2021-12-30 14:58:30 +03:00
Halil Ozan Akgul aef2d83c7d Fix metadata sync fails on multi_transaction_recovery 2021-12-29 11:21:32 +03:00
Önder Kalacı d33650d1c1
Record if any partitioned Citus tables during upgrade (#5555)
With Citus 11, the default behavior is to sync the metadata.
However, partitioned tables created pre-Citus 11 might have
index names that are not compatible with metadata syncing.

See https://github.com/citusdata/citus/issues/4962 for the
details.

With this commit, we record the existence of partitioned tables
such that we can fix them later if any exist.
2021-12-27 03:33:34 -08:00
Halil Ozan Akgul 0c292a74f5 Fix metadata sync fails on multi_truncate 2021-12-27 13:54:53 +03:00
Önder Kalacı c9127f921f
Avoid round trips while fixing index names (#5549)
With this commit, fix_partition_shard_index_names()
works significantly faster.

For example,

32 shards, 365 partitions, 5 indexes drop from ~120 seconds to ~44 seconds
32 shards, 1095 partitions, 5 indexes drop from ~600 seconds to ~265 seconds

`queryStringList` can be really long, because it may contain #partitions * #indexes entries.

Before this change, we were actually going through the executor where each command
in the query string triggers 1 round trip per entry in queryStringList.

The aim of this commit is to avoid the round-trips by creating a single query string.

I first simply tried sending `q1;q2;..;qn` . However, the executor is designed to
handle `q1;q2;..;qn` type of query executions via the infrastructure mentioned
above (e.g., by tracking the query indexes in the list and doing 1 statement
per round trip).

Another option could have been to change the executor such that it only tracks
the query index when `queryStringList` is provided, not when queryString
includes multiple `;`s. That (a) is more work, (b) could cause weird edge
cases with failure handling, and (c) felt like coding a special case into the executor.
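
The round-trip arithmetic above can be made concrete with a minimal sketch. It does not show how the commit actually builds or ships the single query string; `send` is a hypothetical callback standing in for one network round trip, and both runner functions are illustrative only.

```python
from typing import Callable, List


def entries_per_shard(partitions: int, indexes: int) -> int:
    """Number of queryStringList entries: #partitions * #indexes."""
    return partitions * indexes


def run_one_by_one(queries: List[str], send: Callable[[str], None]) -> int:
    """One round trip per entry; returns the number of round trips."""
    for query in queries:
        send(query)
    return len(queries)


def run_as_single_string(queries: List[str], send: Callable[[str], None]) -> int:
    """Concatenate all entries and send them in a single round trip."""
    if not queries:
        return 0
    send(";".join(queries))
    return 1
```

With 365 partitions and 5 indexes, `entries_per_shard` gives 1825 entries, so the per-entry approach pays 1825 round trips where the combined string pays one.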
2021-12-27 10:29:37 +01:00
Halil Ozan Akgul bb636e6a29 Fix metadata sync fails on multi_function_evaluation 2021-12-24 19:32:58 +03:00
Halil Ozan Akgul 70e68d5312 Fix metadata sync fails on multi_name_lengths 2021-12-24 14:33:32 +03:00
Halil Ozan Akgul 5c2fb06322 Fix metadata sync fails on multi_sequence_default 2021-12-24 14:33:32 +03:00
Halil Ozan Akgul b9c06a6762 Turn metadata sync on in multi_metadata_sync 2021-12-24 10:58:13 +03:00
Hanefi Onaldi 479b2da740 Fix one flaky failure test 2021-12-23 20:11:45 +03:00
Ahmet Gedemenli 042d45b263 Propagate foreign server ops 2021-12-23 17:54:04 +03:00
Onur Tirtir 61b5fb1cfc
Run failure_test_helpers in base schedule (#5559) 2021-12-23 12:54:12 +01:00
Talha Nisanci e196d23854
Refactor AttributeEquivalenceId (#5006) 2021-12-23 13:19:02 +03:00
Hanefi Onaldi 76176caea7 Fix typo s/exlusive/exclusive/ 2021-12-23 01:35:01 +03:00
Hanefi Onaldi 1af8ca8f7c
Fix static analysis findings (#5550) 2021-12-22 18:16:11 +03:00
Ahmet Gedemenli 8e4ff34a2e Do not include return table params in the function arg list
(cherry picked from commit 90928cfd74)

Fix function signature generation

Fix comment typo

Add test for worker_create_or_replace_object

Add test for recreating distributed functions with OUT/TABLE params

Add test for recreating distributed function that returns setof int

Fix test output

Fix comment
2021-12-21 19:01:42 +03:00
Marco Slot 2eef71ccab Propagate SET TRANSACTION commands 2021-12-18 11:31:39 +01:00
Halil Ozan Akgul 46f718c76d Turn metadata sync on in add_coordinator, foreign_key_to_reference_table and replicate_reference_tables_to_coordinator 2021-12-17 16:33:25 +03:00
Halil Ozan Akgul 25755a7094 Turn ddl propagation off in worker on multi_copy 2021-12-17 15:54:20 +03:00
Onder Kalaci fc98f83af2 Add citus.grep_remote_commands
Simply applies

```SQL
SELECT textlike(command, citus.grep_remote_commands)
```
If it returns true, the command is logged. Otherwise, the command is not logged.

When citus.grep_remote_commands is an empty string, all commands are
logged.
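
The filtering rule can be sketched outside the database as follows. This is only an approximation of SQL `LIKE` semantics (`%` matches any sequence, `_` matches one character, backslash escapes the next character); the function names are hypothetical and this is not the Citus implementation.

```python
import re


def like_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a SQL LIKE pattern into an equivalent compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Escaped character: match it literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")      # any sequence of characters
        elif ch == "_":
            parts.append(".")       # exactly one character
        else:
            # Literal character (a trailing backslash is also treated
            # literally here, which is a simplification).
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def should_log(command: str, grep_pattern: str) -> bool:
    """Log everything for an empty pattern, else only LIKE matches."""
    if grep_pattern == "":
        return True
    return like_to_regex(grep_pattern).fullmatch(command) is not None
```

For example, a pattern of `%CREATE%` would keep only commands containing `CREATE`, while the empty pattern keeps the default log-everything behavior.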
2021-12-17 11:47:40 +01:00
Halil Ozan Akgul df8d0f3db1 Turn metadata sync on in multi_replicate_reference_table and multi_citus_tools 2021-12-17 10:25:57 +03:00
Onur Tirtir cc4c83b1e5
HAVE_LZ4 -> HAVE_CITUS_LZ4 (#5541) 2021-12-16 16:21:52 +03:00
Talha Nisanci c0945d88de
Normalize a debug failure to WARNING failure (#4996) 2021-12-16 13:43:49 +03:00
Halil Ozan Akgul 8943d7b52f Turn metadata sync on in mx_regular_user and remove_coordinator 2021-12-16 11:26:24 +03:00
Halil Ozan Akgul b82af4db3b Turn metadata sync on in multi_size_queries, multi_drop_extension and multi_unsupported_worker_operations 2021-12-16 11:10:54 +03:00
Hanefi Onaldi 9d4d73898a
Move healthcheck logic into new file (#5531)
and add a missing `CheckCitusVersion(ERROR)` call
2021-12-15 15:58:20 -08:00
Hanefi Onaldi acdcd9422c
Fix one flaky failure test (#5528)
Removes flaky test
2021-12-15 18:59:58 +03:00
Hanefi Onaldi 29e4516642 Introduce citus_check_cluster_node_health UDF
This UDF coordinates connectivity checks across the whole cluster.

This UDF gets the list of active readable nodes in the cluster, and
coordinates all connectivity checks in sequential order.

The algorithm is:

for sourceNode in activeReadableWorkerList:
    c = connectToNode(sourceNode)
    for targetNode in activeReadableWorkerList:
        result = c.execute(
            "SELECT citus_check_connection_to_node(targetNode.name,
                                                   targetNode.port)")
        emit sourceNode.name,
             sourceNode.port,
             targetNode.name,
             targetNode.port,
             result

- result -> true  ->  connection attempt from source to target succeeded
- result -> false -> connection attempt from source to target failed
- result -> NULL  -> connection attempt from the current node to source node failed

I suggest you use the following query to get an overview on the connectivity:

SELECT bool_and(COALESCE(result, false))
FROM citus_check_cluster_node_health();

Whenever this query returns false, there is a connectivity issue that should be checked in detail.
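
The algorithm and the summary query can be simulated with a short sketch. The two callbacks are hypothetical stand-ins for real connection attempts, and the sketch assumes that an unreachable source node yields a NULL (`None`) result for every target, as the result legend suggests.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Node = Tuple[str, int]                               # (name, port)
Row = Tuple[str, int, str, int, Optional[bool]]      # source, target, result


def check_cluster_node_health(
    nodes: List[Node],
    can_reach_from_here: Callable[[Node], bool],
    can_connect: Callable[[Node, Node], bool],
) -> Iterator[Row]:
    """Emit one row per (source, target) pair, checked sequentially."""
    for source in nodes:
        source_reachable = can_reach_from_here(source)
        for target in nodes:
            if source_reachable:
                result: Optional[bool] = can_connect(source, target)
            else:
                result = None        # could not even reach the source node
            yield (source[0], source[1], target[0], target[1], result)


def cluster_is_healthy(rows: Iterable[Row]) -> bool:
    """Mirror bool_and(COALESCE(result, false)) for a non-empty row set."""
    return all(row[4] is True for row in rows)
```

A single `None` or `False` row makes `cluster_is_healthy` return `False`, which matches the intent of the overview query: any failed or unverifiable link flags the cluster for a closer look.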
2021-12-15 01:41:51 +03:00
Hanefi Onaldi 13fff9c37a Remove NOOP tuplestore_donestoring calls
PostgreSQL has not required calling this function since the 7.4 release, and it
is a NOOP.

For more details, check PostgreSQL commit below :

commit dd04e958c8b03c0f0512497651678c7816af3198
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Sun Mar 9 03:34:10 2003 +0000

    tuplestore_donestoring() isn't needed anymore, but provide a no-op
    macro definition so as not to create compatibility problems.

diff --git a/src/include/utils/tuplestore.h b/src/include/utils/tuplestore.h
index b46babacd1..76fe9fb428 100644
--- a/src/include/utils/tuplestore.h
+++ b/src/include/utils/tuplestore.h
@@ -17,7 +17,7 @@
  * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $Id: tuplestore.h,v 1.8 2003/03/09 02:19:13 tgl Exp $
+ * $Id: tuplestore.h,v 1.9 2003/03/09 03:34:10 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -41,6 +41,9 @@ extern Tuplestorestate *tuplestore_begin_heap(bool randomAccess,

 extern void tuplestore_puttuple(Tuplestorestate *state, void *tuple);

+/* tuplestore_donestoring() used to be required, but is no longer used */
+#define tuplestore_donestoring(state)  ((void) 0)
+
 /* backwards scan is only allowed if randomAccess was specified 'true' */
 extern void *tuplestore_gettuple(Tuplestorestate *state, bool forward,
                                        bool *should_free);
2021-12-14 18:55:02 +03:00
Halil Ozan Akgul e060720370 Fix metadata sync fails in multi_index_statements 2021-12-14 11:28:08 +03:00
Halil Ozan Akgul a951e52ce8 Fix drop index trying to drop coordinator local indexes on metadata worker nodes 2021-12-14 11:28:08 +03:00
Halil Ozan Akgul 1d7dde2c4c Fix metadata sync fails on multi_copy 2021-12-14 10:59:59 +03:00
Halil Ozan Akgul 98e38e2e4e Fix metadata sync fails on failure_connection_establishment 2021-12-13 11:51:56 +03:00
Halil Ozan Akgul 507df08422 Fix metadata sync fails on propagate_statistics and pg13_propagate_statistics tests 2021-12-09 12:28:11 +03:00
Halil Ozan Akgul 351314f8a1 Turn metadata sync on in base/minimal schedules 2021-12-08 13:34:41 +03:00
Halil Ozan Akgul ee894c9e73 Fix metadata sync fails on multi_follower_schedule 2021-12-08 13:07:37 +03:00
Halil Ozan Akgul 4c8f79d7dd Turn metadata sync on in failure schedule 2021-12-08 11:22:56 +03:00
Halil Ozan Akgul 4f272ea0e5 Fix metadata sync fails in multi_extension 2021-12-08 10:25:43 +03:00
Halil Ozan Akgul a3834edeaa Turn metadata sync on in multi_mx_schedule 2021-12-08 10:25:43 +03:00
Halil Ozan Akgul ea37f4fd29 Turn metadata sync on in upgrade schedules 2021-12-08 10:19:02 +03:00
Hanefi Onaldi 05a3dfa8a9 Remove redundant arbitrary config class
We had 2 class definitions for CitusCacheManyConnectionsConfig, where
one of them was a copy of CitusSmallCopyBuffersConfig.

This commit leaves the intended class definition that configures caching
many connections, and removes the one that is a copy of another class
2021-12-08 04:47:08 +03:00
Burak Velioglu e8534c1dd5
Drop sequence metadata from workers explicitly 2021-12-06 19:25:51 +03:00
Burak Velioglu 21194c3b9d
Mark sequence distributed explicitly while syncing metadata
Since sequences are not marked as distributed while creating table if no
metadata worker node exists, we are marking all sequences distributed
while syncing metadata explicitly.
2021-12-06 19:25:51 +03:00
Burak Velioglu 6d849cf394
Allow delegating function from worker nodes
We've both allowed delegating functions and procedures from worker nodes
and also prevented delegation if a function/procedure has already been
propagated from another node.
2021-12-06 19:25:51 +03:00
Burak Velioglu a8b1ee87f7
Increment command counter after altering the sequence type 2021-12-06 19:25:51 +03:00
Burak Velioglu ed8e32de5e
Sync pg_dist_object on an update and propagate while syncing to a new node
Before that PR we were updating citus.pg_dist_object metadata, which keeps
the metadata related to objects on Citus, only on the coordinator node. In
order to allow using those objects from worker nodes (or erroring out with
proper error message) we've started to propagate that metadata to worker
nodes as well.
2021-12-06 19:25:50 +03:00
Halil Ozan Akgul ef09ba0d06 Fix metadata sync fails of multi_table_ddl 2021-12-06 13:44:30 +03:00
Halil Ozan Akgul a6d0de060c Fix fails with metadata syncing in undistribute_table 2021-12-03 13:58:53 +03:00
Hanefi Onaldi 56e9b1b968 Introduce UDF to check worker connectivity
citus_check_connection_to_node runs a simple query on a remote node and
reports whether this attempt was successful.

This UDF will be used to make sure each worker node can connect to all
the worker nodes in the cluster.

parameters:
nodename: required
nodeport: optional (default: 5432)

return value:
boolean success
2021-12-03 02:30:28 +03:00
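The goal stated in commit 56e9b1b968, checking that each worker can connect to all other workers, amounts to probing every ordered pair of nodes. The sketch below only illustrates that pairing logic. `check_all_pairs` and `fake_probe` are hypothetical names, and `fake_probe` stands in for actually running `citus_check_connection_to_node` on a source node; none of this is Citus code.

```python
from itertools import permutations
from typing import Callable, Dict, List, Tuple

Node = Tuple[str, int]  # (nodename, nodeport)


def check_all_pairs(
    workers: List[Node],
    can_connect: Callable[[Node, Node], bool],
) -> Dict[Tuple[Node, Node], bool]:
    """Return the probe result for every ordered (source, target) pair."""
    return {
        (src, dst): can_connect(src, dst)
        for src, dst in permutations(workers, 2)
    }


def fake_probe(src: Node, dst: Node) -> bool:
    """Hypothetical stand-in for a real connectivity probe.

    In this toy cluster, worker-2 cannot reach worker-3.
    """
    return not (src[0] == "worker-2" and dst[0] == "worker-3")


workers = [("worker-1", 5432), ("worker-2", 5432), ("worker-3", 5432)]
results = check_all_pairs(workers, fake_probe)
failures = [pair for pair, ok in results.items() if not ok]
print(failures)
```

Ordered pairs matter here because connectivity is not necessarily symmetric: a firewall rule may block one direction only.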
Talha Nisanci e4ead8f408
Update broken link for upgrade tests (#5408)
* Update broken link for upgrade tests

* Update src/test/regress/README.md

Co-authored-by: Nils Dijk <nils@citusdata.com>

Co-authored-by: Nils Dijk <nils@citusdata.com>
2021-12-02 15:25:36 +01:00
Onder Kalaci 549edcabb6 Allow disabling node(s) when multiple failures happen
As of master branch, Citus does all the modifications to replicated tables
(e.g., reference tables and distributed tables with replication factor > 1),
via 2PC and avoids any shardstate=3. As a side-effect of those changes,
handling node failures for replicated tables changes.

With this PR, when one (or multiple) node failures happen, the users would
see query errors on modifications. If the problem is intermittent, that's OK,
once the node failure(s) recover by themselves, the modification queries would
succeed. If the node failure(s) are permanent, the users should call
`SELECT citus_disable_node(...)` to disable the node. As soon as the node is
disabled, modifications would start to succeed. However, now the old node falls
behind. It means that, when the node is up again, the placements should be
re-created on the node. First, use `SELECT citus_activate_node()`. Then, use
`SELECT replicate_table_shards(...)` to replicate the missing placements on
the re-activated node.
2021-12-01 10:19:48 +01:00
Halil Ozan Akgul 316274b5f0 Add normalize.sed item for multi_fix_partition_shard_index_names test 2021-11-30 13:28:41 +03:00
Halil Ozan Akgul 11072b4cb8 Normalize create role command in drop_partitioned_table test 2021-11-30 12:46:22 +03:00
Onder Kalaci d405993b57 Make sure to use a dedicated metadata connection
With this commit, we make sure to use a dedicated connection per
node for all the metadata operations within the same transaction.

This is needed because the same metadata (e.g., metadata includes
the distributed table on the workers) can be modified across
multiple connections.

With this connection we guarantee that there is a single metadata connection.
But note that this connection can be used for any other operation.
In other words, this connection is not only reserved for metadata
operations.
2021-11-26 14:36:28 +01:00
Onder Kalaci 38b08ebde9 Generalize the error checks while removing node
The checks that prevent removing a node are very much reference
table centric. We are soon going to add the same checks for replicated
tables. So, make the checks generic such that:
	 (a) replicated tables fit naturally
	 (b) we can use the same checks in `citus_disable_node`.
2021-11-26 14:25:29 +01:00
Hanefi Onaldi 4c135de9e4 Introduce CI checks for hash comments in specs
We do not use comments starting with # in spec files because it creates
errors from C preprocessor that expects directives after this character.
Instead use C style comments, i.e:
// single line comment

You can also use multiline comments as well
/*
 * multi line comment
 */
2021-11-26 14:52:51 +03:00
Halil Ozan Akgul 87a1c760d9 Fix tests in multi-1-schedule that fail with metadata syncing 2021-11-26 12:09:53 +03:00
Onder Kalaci 121f5c4271 Active placements can only be on active nodes
We re-define the meaning of active shard placement. It used
to only be defined via shardstate == SHARD_STATE_ACTIVE.

Now, we also add one more check. The worker node that the
placement is on should be active as well.

This is a preparation for supporting citus_disable_node()
for MX with multiple failures at the same time.

With this change, the maintenance daemon only needs to
sync the "node metadata" (e.g., pg_dist_node), not the
shard metadata.
2021-11-26 09:14:33 +01:00
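The redefinition in commit 121f5c4271 can be read as a conjunction of two conditions. The following is a minimal sketch of that rule, not the Citus implementation: the dataclasses and `is_active_placement` are hypothetical, and the numeric value given to `SHARD_STATE_ACTIVE` is an assumption made for illustration.

```python
from dataclasses import dataclass

SHARD_STATE_ACTIVE = 1  # assumed value, for illustration only


@dataclass
class WorkerNode:
    name: str
    is_active: bool


@dataclass
class ShardPlacement:
    shard_id: int
    shard_state: int
    node: WorkerNode


def is_active_placement(placement: ShardPlacement) -> bool:
    """A placement is active only if its state AND its node are active."""
    return (
        placement.shard_state == SHARD_STATE_ACTIVE
        and placement.node.is_active
    )


healthy = WorkerNode("worker-1", is_active=True)
disabled = WorkerNode("worker-2", is_active=False)

on_healthy = ShardPlacement(102008, SHARD_STATE_ACTIVE, healthy)
on_disabled = ShardPlacement(102008, SHARD_STATE_ACTIVE, disabled)

print(is_active_placement(on_healthy), is_active_placement(on_disabled))
```

Under this rule, disabling a node hides all of its placements without touching any per-shard state, which is consistent with the commit's note that only node metadata needs to be synced.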
Onder Kalaci b4931f7345 Do not acquire locks on reference tables when a node is removed/disabled
Before this commit, we acquire the metadata locks on the reference
tables while removing/disabling a node on all the MX nodes.

Although it has some marginal benefits, such as a concurrent
modification during remove/disable node blocks, instead of erroring
out, the drawbacks seem worse. Both citus_remove_node and citus_disable_node
are not tolerant to multiple node failures.

With this commit, we relax the locks. The implication is that while
a node is removed/disabled, users might see query errors. On the
other hand, this change makes removing/disabling nodes more
tolerant to multiple node failures.
2021-11-26 09:08:25 +01:00
Onur Tirtir 76b8006a9e
Allow overwriting columnar storage pages written by aborted xacts (#5484)
When refactoring storage layer in #4907, we deleted the code that allows
overwriting a disk page previously written but not known by metadata.

Readers can see the change that introduced the code allowing this in
commit a8da9acc63.

The reasoning was that; as of 10.2, we started aligning page
reservations (`AlignReservation`) for subsequent writes right after
allocating pages from disk. That means, even if writer transaction
fails, subsequent writes are guaranteed to allocate a new page and write
to there. For this reason, attempting to write to a page allocated
before is not possible for a columnar table that user created when using
v10.2.x.

However, since the older versions of columnar don't do that, the following
example scenario can still result in writing to such disk page, even if
user now upgraded to v10.2.x. This is because, when upgrading storage to
2.0 (`ColumnarStorageUpdateIfNeeded`), we calculate `reservedOffset` of
the metapage based on the highest used address known by stripe
metadata (`GetHighestUsedAddressAndId`). However, stripe metadata
doesn't have entries for aborted writes. As a result, highest used
address would be computed by ignoring pages that are allocated but not
used.

- User attempts writing to columnar table on Citus v10.0x/v10.1x.
- Write operation fails for some reason.
- User upgrades Citus to v10.2.x.
- When attempting to write to the same columnar table, they hit the "attempt
  to write columnar data .." error since write operation done in the
  older version of columnar already allocated that page, and now we are
  overwriting it.

For this reason, with this commit, we re-do the change done in
a8da9acc63.

And for the reasons given above, it wasn't possible to add a test for
this commit via usual code-paths. For this reason, added a UDF only for
testing purposes so that we can reproduce the exact scenario in our
regression test suite.
2021-11-26 07:51:13 +01:00
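The upgrade scenario in commit 76b8006a9e comes down to computing the next free offset from stripe metadata alone. Below is a toy model of that computation, assuming a flat byte-offset address space. `reserved_offset_from_metadata` is a hypothetical helper that only mirrors the idea behind `GetHighestUsedAddressAndId`; it is not the columnar code.

```python
from typing import List, Tuple

Stripe = Tuple[int, int]  # (start_offset, length) as recorded in stripe metadata


def reserved_offset_from_metadata(stripes: List[Stripe]) -> int:
    """Next free offset, derived only from committed stripe metadata."""
    return max((start + length for start, length in stripes), default=0)


# Two committed stripes are known to metadata.
committed = [(0, 100), (100, 50)]

# An aborted write put bytes on disk at offset 150, but left no metadata row.
aborted_write = (150, 80)

next_offset = reserved_offset_from_metadata(committed)
overlaps = aborted_write[0] <= next_offset < aborted_write[0] + aborted_write[1]
print(next_offset, overlaps)
```

Because the aborted write is invisible to the metadata, the computed offset lands inside the region it already touched, so the next writer has to be allowed to overwrite that page.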
Onur Tirtir 85da4fc2e0
Merge branch 'master' into col/pg-upgrade-dependency 2021-11-26 09:34:43 +03:00
Onur Tirtir 81af605e07
Fix typo: "no sharding pruning constraints" -> "no shard pruning constraints" (#5490) 2021-11-25 21:00:44 +01:00
Onur Tirtir 73f06323d8 Introduce dependencies from columnarAM to columnar metadata objects
During pg upgrades, we have seen that it is not guaranteed that a
columnar table will be created after metadata objects got created.
Prior to changes done in this commit, we had such a dependency
relationship in `pg_depend`:

```
columnar_table ----> columnarAM ----> citus extension
                                           ^  ^
                                           |  |
columnar.storage_id_seq --------------------  |
                                              |
columnar.stripe -------------------------------
```

Since `pg_upgrade` just knows to follow topological sort of the objects
when creating database dump, above dependency graph doesn't imply that
`columnar_table` should be created before metadata objects such as
`columnar.storage_id_seq` and `columnar.stripe` are created.

For this reason, with this commit we add new records to `pg_depend` to
make columnarAM depending on all rel objects living in `columnar`
schema. That way, `pg_upgrade` will know it needs to create those before
creating `columnarAM`, and similarly, before creating any tables using
`columnarAM`.

Note that in addition to inserting those records via installation script,
we also do the same in `citus_finish_pg_upgrade()`. This is because,
`pg_upgrade` rebuilds catalog tables in the new cluster and that means,
we must insert them in the new cluster too.
2021-11-23 13:14:00 +03:00
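The ordering argument in commit 73f06323d8 can be checked with a small dependency-graph model. This sketch uses Python's standard `graphlib` and encodes the graph from the commit message. `is_valid_order` and `risky_order` are hypothetical names, and the model says nothing about how `pg_upgrade` actually walks `pg_depend`.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

Deps = Dict[str, Set[str]]  # object -> objects it depends on

# Dependency records before the commit.
before: Deps = {
    "columnar_table": {"columnarAM"},
    "columnarAM": {"citus extension"},
    "columnar.storage_id_seq": {"citus extension"},
    "columnar.stripe": {"citus extension"},
}

# After the commit, columnarAM also depends on the metadata objects.
after: Deps = {
    **before,
    "columnarAM": {
        "citus extension",
        "columnar.storage_id_seq",
        "columnar.stripe",
    },
}


def is_valid_order(deps: Deps, order: List[str]) -> bool:
    """True if every object appears after everything it depends on."""
    position = {name: index for index, name in enumerate(order)}
    return all(
        position[dep] < position[obj]
        for obj, needed in deps.items()
        for dep in needed
    )


# An order that creates the columnar table before its metadata objects.
risky_order = [
    "citus extension",
    "columnarAM",
    "columnar_table",
    "columnar.storage_id_seq",
    "columnar.stripe",
]

print(is_valid_order(before, risky_order))  # allowed by the old graph
print(is_valid_order(after, risky_order))   # rejected by the new graph
print(is_valid_order(after, list(TopologicalSorter(after).static_order())))
```

The old graph accepts an order in which the table is created first, which is the failure mode the commit describes. With the extra edges, every valid topological order creates the metadata objects before `columnarAM` and therefore before any table that uses it.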
Onur Tirtir ef2ca03f24 Reproduce bug via test suite 2021-11-23 13:14:00 +03:00
Burak Velioglu 6590f12de4
Merge branch 'master' into velioglu/make_object_lock_explicit 2021-11-22 13:55:36 +03:00
Burak Velioglu 12e05ad196
Sorted addresses before getting lock 2021-11-22 11:43:32 +03:00
Marco Slot f49d26fbeb Remove citus_update_table_statistics isolation test 2021-11-19 10:51:15 +01:00
Marco Slot 56eae48daf Stop updating shard range in citus_update_shard_statistics 2021-11-19 10:51:15 +01:00
Burak Velioglu 3a68263cc7
Change lock type 2021-11-19 12:03:17 +03:00
Burak Velioglu baeaca7bc5
Update comment 2021-11-19 10:51:56 +03:00
Hanefi Onaldi c0d43d4905
Prevent cache usage on citus_drop_trigger codepaths 2021-11-18 20:24:51 +03:00
Burak Velioglu 77dd12c09d
Merge branch 'master' into velioglu/make_object_lock_explicit 2021-11-18 20:18:07 +03:00
Hanefi Onaldi e6160ad131
Document failing tests for issue 5099 2021-11-18 20:01:34 +03:00
Hanefi Onaldi a3cc9b4e53
Remove case block that is identical to its neighbor (#5472) 2021-11-18 19:41:39 +03:00
Burak Velioglu b484d9b234
Make object locking explicit while adding dependencies 2021-11-18 19:34:00 +03:00
Marco Slot 9e6ca23286 Remove cstore_fdw-related logic 2021-11-16 13:59:03 +01:00
Önder Kalacı 8c0bc94b51
Enable replication factor > 1 in metadata syncing (#5392)
- [x] Add some more regression test coverage
- [x] Make sure returning works fine in case of
     local execution + remote execution
     (task->partiallyLocalOrRemote works as expected, already added tests)
- [x] Implement locking properly (and add isolation tests)
     - [x] We do #shardcount round-trips on `SerializeNonCommutativeWrites`.
           We made it a single round-trip.
- [x] Acquire locks for subselects on the workers & add isolation tests
- [x] Add a GUC to prevent modification from the workers, hence increase the
      coordinator-only throughput
       - The performance slightly drops (~15%), unless
         `citus.allow_modifications_from_workers_to_replicated_tables`
         is set to false
2021-11-15 15:10:18 +03:00
Onur Tirtir 25024b776e
Skip deleting options if columnar.options is already dropped (#5458)
Drop extension might cascade to columnar.options before dropping a
columnar table. In that case, we were getting below error when opening
columnar.options to delete records for the columnar table that we are
about to drop: "ERROR:  could not open relation with OID 0".

I could reproduce this bug easily when upgrading pg, which is why
I added the test to after_pg_upgrade_schedule.
2021-11-12 12:30:09 +03:00
Ahmet Gedemenli 14a33d4e8e Introduce GUC citus.use_citus_managed_tables 2021-11-11 14:09:06 +03:00
Hanefi Onaldi 3d9cec70fd
Update migration paths from 10.2 to 11.0 (#5459)
We recently introduced a set of patches to 10.2, and introduced 10.2-4
migration version. This migration version only resides on `release-10.2`
branch, and is missing on our default branch. This creates a problem
because we do not have a valid migration path from 10.2 to latest 11.0.

To remedy this issue, I copied the relevant migration files from
`release-10.2` branch, and renamed some of our migration files on
default branch to make sure we have a linear upgrade path.
2021-11-11 13:55:28 +03:00
Önder Kalacı 6f5a343ff4
Make sure that enterprise tests pass (#5451) 2021-11-08 18:11:19 +03:00
Önder Kalacı 98ca6ba6ca
Allow lock_shard_resources to be called by the users with privileges (#5441)
Before this commit, we required the user to be owner of the shard/table
in order to call lock_shard_resources.

However, that is too restrictive. We can have users with GRANTS
to the table who are not owners of the tables/shards.

With this commit, we allow such patterns.
2021-11-08 15:36:51 +01:00
Onder Kalaci d5e89b1132 Unify distributed execution logic for single replicated tables
Citus does not acquire any executor locks for shard replication == 1.
With this commit, we unify this decision and exit early.
2021-11-08 13:52:20 +01:00
Önder Kalacı d5b371b2e0
Merge branch 'master' into naisila/fix-partitioned-index 2021-11-08 10:53:16 +01:00
naisila 385ba94d15 Run fix_partition_shard_index_names after each wrong naming command 2021-11-08 10:43:34 +01:00
Marco Slot 78866df13c Remove master_append_table_to_shard UDF 2021-11-08 10:43:24 +01:00
Marco Slot fba93df4b0 Remove copy into new append shard logic 2021-11-07 21:01:40 +01:00
Marco Slot 27ba19f7e1 Fix a flappy test in drop_column_partitioned_table 2021-11-07 18:25:44 +01:00
Nils Dijk 3fcb456381
Refactor/partitioned result destreceiver (#5432)
This change creates a slightly higher abstraction of the `PartitionedResultDestReceiver` where it decouples the partitioning from writing it to a file. This allows for easier reuse for other `DestReceiver`'s that would like to route different tuples to different `DestReceiver`'s.

Originally there was a lot of state kept in `PartitionedResultDestReceiver` to be able to lazily create `FileDestReceivers` when the first tuple arrived for that target. This convoluted the implementation of the processing of tuples with where they should go.

This refactor changes that where it makes the `PartitionedResultDestReceiver` completely agnostic of what kind of Receivers it is writing to. When constructed you pass it a list of `DestReceiver` compatible pointers with the length of `partitionCount`. Internally the `PartitionedResultDestReceiver` keeps track of which `DestReceiver`'s have been started or not, and start them when they first receive a tuple.

Alternatively, if the instantiating code of the `PartitionedResultDestReceiver` wants, the startup can be turned from lazily to eagerly. When the startup is eager (not lazy) all `rStartup` functions on the list of `DestReceiver`'s are called during the startup of the `PartitionedResultDestReceiver` and marked as such.

A downside of this approach is the following. On highly partitioned destinations we now need to allocate a `FileDestReceiver` for every target, _always_. When the data passed into the `PartitionedResultDestReceiver` is highly skewed to a small set of `FileDestReceiver`'s this will waste some memory. Given the small size of a `FileDestReceiver`, and the fact that actual file handles are only created during the processing of the startup of the `FileDestReceiver` I think this memory waste is not a problem. If this would become a problem we could refactor the source list into some kind of generator object which can generate the `DestReceiver`'s on the fly.
2021-11-05 13:31:18 +01:00
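The lazy versus eager startup behaviour described in commit 3fcb456381 can be sketched as follows. This is a simplified Python model, not the C implementation: `RecordingReceiver` and `PartitionedReceiver` are hypothetical stand-ins, and `partition_of` is an assumed callback that maps a row to a partition index.

```python
from typing import Callable, List, Tuple

Row = Tuple[int, str]


class RecordingReceiver:
    """Hypothetical stand-in for a DestReceiver-compatible object."""

    def __init__(self) -> None:
        self.started = False
        self.rows: List[Row] = []

    def startup(self) -> None:
        self.started = True

    def receive(self, row: Row) -> None:
        self.rows.append(row)


class PartitionedReceiver:
    """Routes rows to one of N receivers without knowing their kind."""

    def __init__(
        self,
        receivers: List[RecordingReceiver],
        partition_of: Callable[[Row], int],
        lazy_startup: bool = True,
    ) -> None:
        self.receivers = receivers
        self.partition_of = partition_of
        self.lazy_startup = lazy_startup
        self.started = [False] * len(receivers)

    def _ensure_started(self, index: int) -> None:
        if not self.started[index]:
            self.receivers[index].startup()
            self.started[index] = True

    def startup(self) -> None:
        # Eager mode starts every target up front; lazy mode defers it.
        if not self.lazy_startup:
            for index in range(len(self.receivers)):
                self._ensure_started(index)

    def receive(self, row: Row) -> None:
        index = self.partition_of(row)
        self._ensure_started(index)
        self.receivers[index].receive(row)


targets = [RecordingReceiver() for _ in range(4)]
router = PartitionedReceiver(targets, partition_of=lambda row: row[0] % 4)
router.startup()
for row in [(0, "a"), (4, "b"), (1, "c")]:  # skewed: only partitions 0 and 1
    router.receive(row)

print([target.started for target in targets])
```

With skewed input, the lazy router still allocates all four receiver objects but starts only the two that receive rows, which matches the memory trade-off discussed in the commit message.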
Nils Dijk 0e7cf9f0ca
reinstate optimization that got unintentionally broken in 366461ccdb (#5418)
DESCRIPTION: Reinstate optimisation for uniform shard interval ranges

During a refactor introduced in #4132 the following change was made, which made the optimisation in `CalculateUniformHashRangeIndex` unreachable: 
366461ccdb (diff-565a339ed3c78bc5a0d4ffeb4e91032150b1dffbeeff59cd3e65981d20b998c7L319-R319)

This PR reinstates the path to the optimisation!
2021-11-05 13:07:51 +01:00
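Commit 0e7cf9f0ca names `CalculateUniformHashRangeIndex` but does not show it. The sketch below is a reconstruction of what such a shortcut could look like, assuming shard ranges split the signed 32-bit hash space into equal increments, with the last shard absorbing any remainder. All function names are hypothetical, and the arithmetic is an assumption rather than the Citus source.

```python
from bisect import bisect_right
from typing import List

INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1
HASH_TOKEN_COUNT = 2**32


def shard_min_values(shard_count: int) -> List[int]:
    """Lower bound of each shard's hash range under uniform splitting."""
    increment = HASH_TOKEN_COUNT // shard_count
    return [INT32_MIN + i * increment for i in range(shard_count)]


def index_by_binary_search(hashed_value: int, shard_count: int) -> int:
    """General path: search the sorted list of range lower bounds."""
    return bisect_right(shard_min_values(shard_count), hashed_value) - 1


def index_by_arithmetic(hashed_value: int, shard_count: int) -> int:
    """Uniform-range shortcut: one division instead of a search."""
    increment = HASH_TOKEN_COUNT // shard_count
    index = (hashed_value - INT32_MIN) // increment
    # The last shard absorbs the remainder when the space does not divide evenly.
    return min(index, shard_count - 1)


for value in (INT32_MIN, -1, 0, 12345, INT32_MAX):
    print(value, index_by_arithmetic(value, 32), index_by_binary_search(value, 32))
```

Both functions agree on every input under these assumptions, so the arithmetic path is purely an optimisation: it replaces an O(log n) search with O(1) arithmetic when the ranges are known to be uniform.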
Önder Kalacı 763176a4d9
Some minor improvements on top of 5314 (#5428)
* Refactor some checks in citus local tables

* all existing citus local tables are auto converted after upgrade

* Update warning messages in CreateCitusLocalTable

* Hide notice msg for auto converting local tables

* Hide hint msg

Co-authored-by: Ahmet Gedemenli <afgedemenli@gmail.com>
2021-11-05 13:59:13 +03:00
Sait Talha Nisanci ab29c25658 Fix missing from entry 2021-11-04 18:54:52 +03:00
Halil Ozan Akgul a8f3f712cc Turns mx on in isolations tests 2021-11-04 17:12:30 +03:00
Ahmet Gedemenli b30ed46068
Fixes ALTER STATISTICS IF EXISTS bug (#5435)
* Fix ALTER STATISTICS IF EXISTS bug
2021-11-04 16:14:05 +03:00
Halil Ozan Akgul 91b377490b Fix multi_cluster_management fails for metadata syncing 2021-11-04 11:09:21 +03:00
Talha Nisanci 19f28eabae
Fix citus upgrade local run issues (#5414)
This PR is fixing 2 separate issues related to the local run of citus upgrade tests.

d3e7c825ab fixes the issue that, with our new testing infrastructure, we moved/renamed some of existing folders. This created a problem for local runs of citus upgrade tests since some paths were sensitive to such changes. This commit tries to make it more generic so that this issue is less likely to happen in the future, while also fixing the current issue.

93de6b60c3 we are fixing an issue that a new environment variable was added for citus upgrade tests, which is defined in the CI. 0cb51f8c37/.circleci/config.yml (L294)
This environment variable wasn't set in our local runs hence it would create problems. Instead of defining this environment variable in the local run, we change the citus_upgrade run command to use an existing env variable, which is now also set in the CI.
2021-11-03 16:17:36 +03:00
Jelte Fennema 9b784e58bf
Add tests for special hash values (#5431)
We fixed some crashes a while back that would only occur in cases where
the value of a distribution column would result in a very high or very
low hash value. This adds a regression test for those crashes.
2021-11-03 13:42:39 +01:00
Jelte Fennema 0cb51f8c37
Test a query that failed on 9.5.8 when coordinator is in metadata (#5412)
This test starts passing because of PR #4508, to be precise commit:
24e60b44a1

When I undo that commit this newly added test starts failing. This adds
this test to make sure we don't regress on this again.
2021-11-03 12:27:28 +01:00
Halil Ozan Akgul c0785d570c Remove EnsureSuperUser from start and stop metadata sync to node 2021-11-01 18:01:49 +03:00
Halil Ozan Akgul c0eb67b24f Skip forceCloseAtTransactionEnd connections only if BEGIN was not sent on them 2021-11-01 17:43:04 +03:00
Jelte Fennema 57a0228c52
Fix string-concatenation warning on Clang 13 (#5425)
Clang 13 complains about a suspicious string concatenation. It thinks we
might have missed a comma. This adds parentheses to make it clear that
concatenation is indeed what we meant.
2021-11-01 13:55:43 +03:00
naisila 796d56a7b1 Rename ddlJob->commandString to ddlJob->metadataSyncCommand 2021-10-29 23:45:43 +03:00
Ahmet Gedemenli 67dca4363d
Dont auto-undistribute user-added citus local tables (#5314)
* Disable auto-undistribute for user-added citus local tables
2021-10-28 12:10:26 +03:00
Nils Dijk f4297f774a
Bump mitmproxy version (#5334)
There is a vulnerability in mitmproxy with the version we are using.

It would be hard to exploit anything with regards to the artifacts we ship as it's only used in our test suite. Still it's good hygiene to _not_ use software with known vulnerabilities.

This PR updates the version of python, mitmproxy and the crypto libraries used.
The latest version of mitmproxy for python 3.6 is not patched, hence the upgrade of python.
For our CI images this cascades into upgrading debian as well :)

For CI we bake these versions in our images so we need to update them as well.

Changes to the CI images: https://github.com/citusdata/the-process/pull/65
2021-10-27 17:57:13 +02:00
Jelte Fennema a8cbeb1047
Fix docs of arbitrary configs (#5413)
The old command would run none of the tests. The new command runs all of
the tests for the given configs.
2021-10-27 17:16:24 +02:00
Philip Dubé cc50682158 Fix typos. Spurred spotting "connectios" in logs 2021-10-25 13:54:09 +00:00
Jelte Fennema 3bdbfc3edf
Fix duplicate typedef which can cause compile failures (#5406)
ColumnarScanDesc is already defined in columnar_tableam.h. Redefining it
again causes a compiler error on some C compilers.

Useful reference: https://bugzilla.redhat.com/show_bug.cgi?id=767538

Fixes #5404
2021-10-25 12:20:13 +00:00
Onder Kalaci ce4c4540c5 Simplify 2PC decision in the executor
It seems like the decision for 2PC is more complicated than
it should be.

With this change, we do one behavioral change. In essence,
before this commit, when a SELECT task with replication factor > 1
is executed, the executor was triggering 2PC. And, in fact,
the transaction manager (`ConnectionModifiedPlacement()`) was
able to understand not to trigger 2PC when no modification happens.

However, for transaction blocks like:
BEGIN;
-- a command that triggers 2PC
-- A SELECT command on replication > 1
..
COMMIT;

The SELECT used to be qualified as requiring 2PC. And, as a side-effect
the executor was setting `xactProperties.errorOnAnyFailure = true;`

So, the commands were failing at the time of execution. Now, they fail at
the end of the transaction.
2021-10-23 09:06:28 +02:00
Onder Kalaci 575bb6dde9 Drop support for Inactive Shard placements
Given that we do all operations via 2PC, there is no way
for any placement to be marked as INACTIVE.
2021-10-22 18:03:35 +02:00
Önder Kalacı b3299de81c
Drop support for citus.multi_shard_commit_protocol (#5380)
In the past, we allowed users to manually switch to 1PC
(e.g., one phase commit). However, with this commit, we
don't. All multi-shard modifications are done via 2PC.
2021-10-21 14:01:28 +02:00
Marco Slot df43868369 Remove PG11 expected upgrade_list_citus_objects output 2021-10-21 12:08:05 +02:00
Marco Slot dafba6c242 Deprecate master_get_table_metadata UDF 2021-10-21 12:08:05 +02:00
Marco Slot defb97b7f5 Support operator class parameters in indexes 2021-10-20 17:03:59 +02:00
Önder Kalacı 3f726c72e0
When replication factor > 1, all modifications are done via 2PC (#5379)
With Citus 9.0, we introduced `citus.single_shard_commit_protocol` which
defaults to 2PC.

With this commit, we prevent users from setting it to 1PC and drop support
for `citus.single_shard_commit_protocol`.

Although this might add some overhead for users, it is already the default
behaviour (so less likely) and marking placements as INVALID is much
worse.
2021-10-20 01:39:03 -07:00
Sait Talha Nisanci a851211dbc Run tests sequentially 2021-10-19 18:35:26 +03:00
Marco Slot 641ef9bd6f Fix flappy subquery_append test 2021-10-19 15:29:01 +02:00
Sait Talha Nisanci 56abd3d501 Increase parallelism 2021-10-19 15:38:58 +03:00
Marco Slot 096660d61d Remove master_apply_delete_command 2021-10-18 22:29:37 +02:00
Marco Slot bece86b2f7 Add some subquery on append-distributed table tests 2021-10-18 21:11:16 +02:00
Marco Slot 93e79b9262 Never allow co-located joins of append-distributed tables 2021-10-18 21:11:16 +02:00
Marco Slot b97e5081c7 Disable co-located joins for append-distributed tables 2021-10-18 21:11:16 +02:00
Marco Slot dfad73d918 Disable implicit single re-partition joins for append tables 2021-10-18 21:11:16 +02:00
Marco Slot 2206e64e42 Disable single-repartition joins for append tables 2021-10-18 21:11:16 +02:00
Sait Talha Nisanci 6ff2083311 Remove base test as it is not useful anymore 2021-10-18 20:31:18 +03:00
Sait Talha Nisanci 7336c03c22 Add local-dist table joins to arbitrary configs 2021-10-18 20:31:18 +03:00
Önder Kalacı 31c8f279ac
Add helper UDFs to inspect object dependencies (#5293)
- citus_get_all_dependencies_for_object: emulate what Citus
                                         would qualify as
					 dependency when adding
					 a new node
- citus_get_dependencies_for_object: emulate what Citus would qualify
				     as dependency when creating an
				     object

Example use:
```SQL
-- find all the dependencies of table test
SELECT
	pg_identify_object(t.classid, t.objid, t.objsubid)
FROM
	(SELECT * FROM pg_get_object_address('table', '{test}', '{}')) as addr
JOIN LATERAL
	citus_get_all_dependencies_for_object(addr.classid, addr.objid, addr.objsubid) as t(classid oid, objid oid, objsubid int)
ON TRUE
	ORDER BY 1;
```
2021-10-18 14:46:49 +03:00
Halil Ozan Akgul e3446692f3 Fix the bug by adding comma before the values 2021-10-15 18:42:23 +03:00
Halil Ozan Akgul 3fb996f6de Fix the tests that fail with MX in columnar_schedule 2021-10-15 13:09:01 +03:00
Halil Ozan Akgul b710e0064d Fix tests that fail with MX in multi_schedule 2021-10-15 12:58:38 +03:00
Ahmet Gedemenli 35f6fe5f9f
Refactor/Improve PreprocessAlterTableStmtAttachPartition (#5366)
* Refactor/Improve PreprocessAlterTableStmtAttachPartition
2021-10-14 11:39:39 +03:00
SaitTalhaNisanci de61a89083
Fix sql_schedule_name problem (#5371) 2021-10-13 13:10:00 +02:00
Hanefi Onaldi 3e64dc44c8
Fix some typos in comments (#5369) 2021-10-13 13:00:39 +03:00
Önder Kalacı af876bf452
Add value materialization test (#5368) 2021-10-13 09:08:24 +02:00
SaitTalhaNisanci a39859bc74
Remove unnecessary output (#5367) 2021-10-13 09:28:01 +03:00
SaitTalhaNisanci 3f65751d43
Add an infrastructure to run same tests with arbitrary configs (#5316)
To run tests in parallel use:

```bash
make check-arbitrary-configs parallel=4
```

To run tests sequentially use:

```bash
make check-arbitrary-configs parallel=1
```

To run only some configs:

```bash
make check-arbitrary-base CONFIGS=CitusSingleNodeClusterConfig,CitusSmallSharedPoolSizeConfig
```

To run only some test files with some config:

```bash
make check-arbitrary-base CONFIGS=CitusSingleNodeClusterConfig EXTRA_TESTS=dropped_columns_1
```

To get a deterministic run, you can give the random's seed:

```bash
make check-arbitrary-configs parallel=4 seed=12312
```

The `seed` will be in the output of the run.

In our regular regression tests, we can see all the details about either planning or execution but this means
we need to run the same query under different configs/cluster setups again and again, which is not really maintainable.

When we don't care about the internals of how planning/execution is done but the correctness, especially with different configs
this infrastructure can be used.

With `check-arbitrary-configs` target, the following happens:

-   a bunch of configs are loaded, which are defined in `config.py`. These configs have different settings such as different shard count, different citus settings, postgres settings, worker amount, or different metadata.
-   For each config, a separate data directory is created for tests in `tmp_citus_test` with the config's name.
-   For each config, `create_schedule` is run on the coordinator to setup the necessary tables.
-   For each config, `sql_schedule` is run. `sql_schedule` is run on the coordinator if it is a non-mx cluster. And if it is mx, it is either run on the coordinator or a random worker.
-   Tests results are checked if they match with the expected.

When tests results don't match, you can see the regression diffs in a config's datadir, such as `tmp_citus_tests/dataCitusSingleNodeClusterConfig`.

We also have a PostgresConfig which runs all the test suite with Postgres.
By default configs use regular user, but we have a config to run as a superuser as well.

So the infrastructure tests:

-   Postgres vs Citus
-   Mx vs Non-Mx
-   Superuser vs regular user
-   Arbitrary Citus configs

When you want to add a new test, you can add the create statements to `create_schedule` and add the sql queries to `sql_schedule`.
If you are adding Citus UDFs that should be a NO-OP for Postgres, make sure to override the UDFs in `postgres.sql`.

You can add your new config to `config.py`. Make sure to extend either `CitusDefaultClusterConfig` or `CitusMXBaseClusterConfig`.

On the CI, upon a failure, all logfiles will be uploaded as artifacts, so you can check the artifacts tab.
All the regressions will be shown as part of the job on CI.

In your local, you can check the regression diffs in config's datadirs as in `tmp_citus_tests/dataCitusSingleNodeClusterConfig`.
2021-10-12 14:24:19 +03:00
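Commit 3f65751d43 states that `sql_schedule` runs on the coordinator for non-MX clusters and on either the coordinator or a random worker for MX clusters, and that passing `seed` makes a run deterministic. The sketch below shows how a seeded random generator makes that node choice reproducible. `pick_sql_schedule_node` and `plan_run` are hypothetical helpers, not functions from the test infrastructure.

```python
import random
from typing import Dict, List


def pick_sql_schedule_node(
    is_mx: bool, workers: List[str], rng: random.Random
) -> str:
    """Choose where sql_schedule runs for one config."""
    if not is_mx:
        return "coordinator"
    return rng.choice(["coordinator"] + workers)


def plan_run(config_names: List[str], workers: List[str], seed: int) -> Dict[str, str]:
    """Assign a node to each (assumed MX) config using a seeded generator."""
    rng = random.Random(seed)
    return {
        name: pick_sql_schedule_node(True, workers, rng)
        for name in config_names
    }


configs = ["ConfigA", "ConfigB", "ConfigC"]  # placeholder config names
workers = ["worker-1", "worker-2"]

first = plan_run(configs, workers, seed=12312)
second = plan_run(configs, workers, seed=12312)
print(first == second)
```

Using a dedicated `random.Random(seed)` instance rather than the module-level generator keeps the sequence of choices independent of any other code that draws random numbers during the run.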
Teja Mupparti a8348047c5
Pushdown procedures with OUT parameters (#5348) 2021-10-11 23:14:36 -07:00
Onur Tirtir f7f4a93073 Remove get_relation_trigger_oid_compat 2021-10-11 11:53:00 +03:00
Onur Tirtir a1e0511583 Remove get_relation_constraint_oid_compat 2021-10-11 11:53:00 +03:00
Ahmet Gedemenli d19793c174 Add partitioning support for citus local tables
Add/fix tests

Fix creating partitions

Add test for mx - partition creating case

Enable cascading to partitioned tables

Fix mx partition adding test

Fix cascading through fkeys

Style

Disable converting with non-inherited fkeys

Fix detach bug

Early return in case of cascade & Add tests

Style

Fix undistribute_table bug & Fix test outputs

Remove RemovePartitionRelationIds

Test with undistribute_table

Add test for mx+convert+undistribute

Remove redundant usage of CreatePartitionedCitusLocalTable

Add some comments

Introduce bulk functions for generating attach/detach partition commands

Fix: Convert partitioned tables after adding fkey

Change the error message for partitions

Introduce function ErrorIfPartitionTableAddedToMetadata

Polish attach/detach command generation functions

Use time_partitions for testing

Move mx tests to citus_local_tables_mx

Add new partitioned table to cascade test

Add test with time series management UDFs

Fix test output

Fix: Assertion fail on relation access tracking

Style

Refactor creating partitioned citus local tables

Remove CreatePartitionedCitusLocalTable

Style

Error out if converting multi-level table

Revert some old tests

Error out adding partitioned partition

Polish

Polish/address

Fix create table partition of case

Use CascadeOperationForRelationIdList if no cascade needed

Fix create partition bug

Revert / Add new tests to mx

Style

Fix dropping fkey bug

Add test with IF NOT EXISTS

Convert to CLT when doing ATTACH PARTITION

Add comments

Add more tests with time series management

Edit the error message for converting the child

Use OR instead of AND in ErrorIfUnsupportedAlterTableStmt

Edit/improve tests

Disable ddl prop when dropping default column definitions

Disable/enable ddl prop just before/after the command

Add comment

Add sequence test

Add trigger test

Remove NeedCascadeViaForeignKeys

Add one more insert to sequence test

Add comment

Style

Fix test output shard ids

Update comments

Disable creating fkey on partitions

Move partition check to CreateCitusLocalTable

Add comment

Add check for attaching multi-level partition

Add test for pg_constraint

Check pg_dist_partition in tests

Add test inserting on the worker
2021-10-11 10:45:07 +03:00
Marco Slot 386d2567d4 Reduce reliance on append tables in regression tests 2021-10-08 21:27:14 +02:00
Halil Ozan Akgul 9c9d4b5eeb Turn MX on by default 2021-10-08 18:17:21 +03:00
Naisila Puka 99d3785b5c
Fix flaky test in multi_fix_partition_shard_index_names.sql (#5364) 2021-10-08 18:03:34 +03:00
Naisila Puka d0390af72d
Add fix_partition_shard_index_names udf to fix currently broken names (#5291)
* Add udf to include shardId in broken partition shard index names

* Address reviews: rename index such that operations can be done on it

* More comprehensive index tests

* Final touches and formatting
2021-10-07 19:34:52 +03:00
Marco Slot 91b647024a Fixes CREATE INDEX deparsing issue 2021-10-06 13:08:16 +02:00
Onur Tirtir 5d8f74bd0b
(Share) Lock buffer page when reading from columnar storage (#5338)
Under high write concurrency, we were sometimes reading columnar
metapage as all zeros.

In `WriteToBlock()`, if `clear == true`, then it will clear the page before
writing the new one, rather than just adding data to the page. That
means any concurrent connection that is holding only a pin will be
able to see the all-zero state between the `InitPage()` and the
`memcpy_s()`.

Moreover, postgres/storage/buffer/README states that:

> Buffer access rules:
>
> 1. To scan a page for tuples, one must hold a pin and either shared or
> exclusive content lock.  To examine the commit status (XIDs and status bits)
> of a tuple in a shared buffer, one must likewise hold a pin and either shared
> or exclusive lock.

For those reasons, we have to make sure to never keep a pin on the
page without (at least) the shared lock, to avoid having such problems.
2021-10-06 11:57:02 +03:00
Halil Ozan Akgul 43d5853b6d Fixes function names in comments 2021-10-06 09:24:43 +03:00
Hanefi Onaldi a74409f24c
Bump Citus to 11.0devel 2021-10-01 22:21:22 +03:00
Onur Tirtir fe72e8bb48
Discard index deletion requests made to columnarAM (#5331)
A write operation might trigger index deletion if index already had
dead entries for the key we are about to insert.
There are two ways of index deletion:
  a) simple deletion
  b) bottom-up deletion (>= pg14)

Since columnar_index_fetch_tuple never sets all_dead to true,
columnarAM doesn't ever expect to receive simple deletion requests
(columnar_index_delete_tuples) as we don't mark any index entries
as dead.

However, since columnarAM doesn't delete any dead entries via simple
deletion, postgres might ask for a more comprehensive deletion
(i.e.: bottom-up) at some point when pg >= 14.

So with this commit, we start gracefully ignoring bottom-up deletion
requests made to columnar_index_delete_tuples.

Given that users can anyway "VACUUM FULL" their columnar tables,
we don't see any problem in ignoring deletion requests.
2021-10-01 14:32:47 +03:00
Önder Kalacı c2311b4c0c
Make (columnar.stripe) first_row_number index a unique constraint (#5324)
* Make (columnar.stripe) first_row_number index a unique constraint

Since stripe_first_row_number_idx is required to scan a columnar
table, we need to make sure that it is created before doing anything
with columnar tables during pg upgrades.

However, a plain btree index is not a dependency of a table, so
pg_upgrade cannot guarantee that stripe_first_row_number_idx gets
created when creating columnar.stripe, unless we make it a unique
"constraint".

To do that, drop stripe_first_row_number_idx and create a unique
constraint with the same name to keep the code change at minimum.

* Add more pg upgrade tests for columnar

* Fix a logic error in upgrade_columnar_after test

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2021-09-30 10:51:56 +03:00
Jelte Fennema 97077c5c4a
Check more exit codes in upgrade tests (#5323)
We were trying to find the cause for a strange update bug. We thought
`pg_upgrade` succeeded and then were surprised that certain data was not
in the database after the upgrade. Instead `pg_upgrade` had failed
halfway through with an actionable error. It took us pretty long to
realise this.

This commit adds checking of exit codes to a lot more subprocess
executions. That should make debugging in the future much easier.
2021-09-24 15:51:00 +02:00
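The idea behind 97077c5c4a above, checking the exit code of every subprocess so that a failed `pg_upgrade` is not mistaken for a success, can be sketched in Python roughly as follows. This is a minimal illustration and not the code from the commit; the helper name `run_checked` is hypothetical, and the commands it runs are stand-ins for the real upgrade steps.

```python
import subprocess
import sys


def run_checked(command):
    """Run a command and raise if it exits with a non-zero status.

    Hypothetical helper, used only to illustrate exit-code checking.
    """
    # check=True makes subprocess.run raise CalledProcessError on a
    # non-zero exit code, so a failing step cannot go unnoticed.
    return subprocess.run(command, check=True)


# A succeeding command returns normally.
run_checked([sys.executable, "-c", "pass"])

# A failing command (exit code 3 here) raises right away instead of
# letting later steps run against a half-finished state.
try:
    run_checked([sys.executable, "-c", "import sys; sys.exit(3)"])
except subprocess.CalledProcessError as error:
    print(f"command failed with exit code {error.returncode}")
```

Failing at the first non-zero exit code keeps the actionable error next to its cause, which is the debugging benefit the commit message describes.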
Jeff Davis d49d321eac Columnar: only call BuildStripeMetadata() with heap tuple.
BuildStripeMetadata() calls HeapTupleHeaderGetXmin(), which must only
be called on a proper heap tuple with MVCC information. Make sure the
caller passes the heap tuple, and not a datum tuple.

Fixes #5318.
2021-09-23 15:51:01 -07:00
tejeswarm a1604a87e6 Partition shards to be colocated with the parent shards 2021-09-22 14:47:04 -07:00
Onur Tirtir 77a2dd68da
Revoke read access to columnar.chunk from unprivileged user (#5313)
Since this could expose chunk min/max values to unprivileged users.
2021-09-22 16:23:02 +03:00
Onur Tirtir 68335285b4 Columnar CustomScan: Pushdown BoolExpr's as we did before 2021-09-22 10:51:34 +03:00
Onur Tirtir e6ed764f63
Check if xact id is in progress before checking if aborted (#5312) 2021-09-21 21:20:31 +03:00
Onur Tirtir f8b1ff7214
Add CheckCitusVersion() calls to columnarAM (#5308)
Considering all code-paths that we might interact with a columnar table,
add `CheckCitusVersion` calls to tableAM callbacks:
- initializing table scan (`columnar_beginscan` & `columnar_index_fetch_begin`)
- setting a new filenode for a relation (storage initialization or a table rewrite)
- truncating the storage
- inserting tuple (single and multi)

Also add `CheckCitusVersion` call to:
- drop hook (`ColumnarTableDropHook`)
- `alter_columnar_table_set` & `alter_columnar_table_reset` UDFs
2021-09-20 17:26:41 +03:00
Onder Kalaci cea937f52f Add missing version checks for citus_internal_XXX functions 2021-09-20 09:54:35 +02:00
SaitTalhaNisanci 35ff513dfe
Give proper error while distributing a temp table (#5269) 2021-09-17 14:34:40 +03:00
jeff-davis 6e8b19984e
Columnar: separate plan and runtime quals. (#5261)
* Columnar: separate plain and exec quals.

Make a clear separation between plain quals, which contain constants
or extern params; and exec quals, which contain exec params and can't
be evaluated until a rescan.

Fixes #5258.

* more vanilla tests

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2021-09-13 10:54:53 -07:00
jeff-davis d48ceee238
Columnar: add method ReparameterizeCustomPathByChild. (#5275)
When performing a partition-wise join, the planner will adjust paths
parameterized by the parent rel to instead parameterize by the child
rel directly. When this reparameterization happens, we also need to
adjust the join quals to reference the child rather than the parent.

Fixes #5257.
2021-09-13 10:33:48 -07:00
Onur Tirtir ea61efb63a
Do not flush writes until we need to read them when doing index-scan on columnar (#5247)
Do not flush pending writes if the given tid belongs to a "flushed" or
"aborted" stripe write, or to an "in-progress" stripe write of
another backend.

That way, we reduce the cases where we flush single-tuple
stripes during an index scan.

To do that, we follow the steps below for index look-ups:

- Do not flush any pending writes and do a stripe metadata look-up for
  the given tid.
  If the tuple with that tid is found, there is no need to do another
  look-up since we already found the tuple without needing to flush
  pending writes.

- If the tuple is not found without flushing pending writes, then we have two
  scenarios:

  - If the given tid belongs to a pending write of my backend, then do a
    stripe metadata look-up for the given tid. But this time first
    **flush any pending writes**.

  - Otherwise, just return false from `index_fetch_tuple` since flushing
    pending writes wouldn't help.
2021-09-13 18:41:20 +02:00
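The look-up order described in ea61efb63a above can be modeled with a short Python sketch. It only mirrors the decision steps from the commit message; the real logic lives in the C implementation of columnarAM, and every name here (`index_fetch_tuple` as a Python function, `lookup_stripe`, `is_my_pending_write`, `flush_pending_writes`) is a hypothetical stand-in.

```python
def index_fetch_tuple(tid, lookup_stripe, is_my_pending_write,
                      flush_pending_writes):
    """Return True if the tuple for tid is found, flushing only if needed.

    All callables are hypothetical stand-ins for columnar internals.
    """
    # Step 1: look up stripe metadata without flushing anything.
    if lookup_stripe(tid):
        return True

    # Step 2a: the tid belongs to a pending write of this backend,
    # so flush pending writes first and then retry the look-up.
    if is_my_pending_write(tid):
        flush_pending_writes()
        return lookup_stripe(tid)

    # Step 2b: flushing would not help, so report "not found".
    return False


# Tiny in-memory model: tids 1 and 2 are flushed, tid 3 is pending.
flushed_tids = {1, 2}
pending_tids = {3}


def flush():
    flushed_tids.update(pending_tids)
    pending_tids.clear()


found = index_fetch_tuple(3,
                          lambda tid: tid in flushed_tids,
                          lambda tid: tid in pending_tids,
                          flush)
print(found)
```

The point of the ordering is that the flush, which may produce a single-tuple stripe, only happens when the first look-up misses and the tid is known to belong to this backend's own pending write.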
Onur Tirtir 4ee0fb2758
Make sure to skip aborted writes when reading the first tuple (#5274)
With 5825c44d5f, we made the changes to
skip aborted writes when scanning a columnar table.

However, looks like we forgot to handle such cases for the very first
call made to columnar_getnextslot. That means, that commit only
considered the intermediate stripe read operations.

However, functions called by columnar_getnextslot to find first stripe
to read (ColumnarBeginRead & ColumnarRescan) were not caring about
those aborted writes.

To fix that, we teach AdvanceStripeRead to find the very first stripe
to read, and then start using it where we were blindly calling
FindNextStripeByRowNumber.
2021-09-13 11:50:53 +03:00
Burak Velioglu ceec5d72e3
Swallow errors while aborting remote transactions 2021-09-10 11:06:16 +03:00
Naisila Puka a69abe3be0
Fixes bug about int and smallint sequences on MX (#5254)
* Introduce worker_nextval udf for int&smallint column defaults

* Fix current tests and add new ones for worker_nextval
2021-09-09 23:41:07 +03:00
Nils Dijk 80a44a7b93
prevent double inclusion of columnar_tableam.h (#5266)
Recently there are some warnings during the compilation of Citus.
Part of the warnings come due to the `columnar_tableam.h` header not being properly guarded with defines and ifndef's.

This PR fixes these warnings.
2021-09-09 17:37:58 +02:00
Onur Tirtir be74518965
Improve memset calls made to reset bool arrays (#5262) 2021-09-09 17:56:03 +03:00
Halil Ozan Akgul 19af1cef2f Errors for CTEs with search clause
Relevant PG commit:
3696a600e2292d43c00949ddf0352e4ebb487e5b
2021-09-09 13:48:24 +03:00
Marco Slot f84164a000 Avoid switch to superuser in worker_merge_files_into_table 2021-09-09 11:00:29 +02:00
Marco Slot 04388e13b0 Add worker_append_table_to_shard permissions tests 2021-09-09 11:00:29 +02:00
Marco Slot 4faa49775b Perform copy command as regular user in worker_append_table_to_shard 2021-09-09 11:00:29 +02:00
Hanefi Onaldi 9ae912a8c8
Prevent C-style comments in all directories (#5250) 2021-09-09 11:54:58 +03:00
SaitTalhaNisanci e3e0a028c7
return early in case we want to skip outer vars (#5259) 2021-09-09 10:53:36 +03:00
Onur Tirtir 32e3e51ed4
Fix a compiler warning that we get on debian (#5260) 2021-09-08 20:03:59 +03:00
Onur Tirtir 9935dfb958 Remove a flaky test from columnar_paths
We already knew that it was flaky. Moreover, now it failed on my
branch too.

So removing it with this commit.
2021-09-08 14:15:22 +03:00
Onur Tirtir be3914ae28 Prevent generating index-only "Path"s for columnar tables
Previously, even when `EXPLAIN` output tells that we will do
index-only scan, it was never the case since columnar tables
don't have the visibility fork that postgres is looking for.

For this reason, visibility check done in
`IndexOnlyNext->VM_ALL_VISIBLE`
code-path was always returning false and postgres was reading
the tuple from the columnar relation itself.
2021-09-08 14:14:24 +03:00
Onur Tirtir cc49e63222
Not read heaptuple after closing pg_rewrite (#5255) 2021-09-08 13:03:17 +02:00
Onur Tirtir 3340f17c4e
Prevent planner from choosing parallel scan for columnar tables (#5245)
Previously, for regular table scans, we were setting `RelOptInfo->partial_pathlist`
to `NIL` via `set_rel_pathlist_hook` to discard scan `Path`s that need to use any
parallel workers, this was working nicely.

However, when building indexes, this hook doesn't get called so we were not
able to prevent spawning parallel workers when building an index. For this
reason, 9b4dc2f804 added basic
implementation for `columnar_parallelscan_*` callbacks but also made some
changes to skip using those workers when building the index.

However, now that we are doing stripe reservation in two stages, we call 
`heap_inplace_update` at some point to complete stripe reservation.
However, postgres throws an error if we call `heap_inplace_update` during
a parallel operation, even if we don't actually make use of those workers.

For this reason, with this pr, we make sure to not generate scan `Path`s that
need to use any parallel workers by using `get_relation_info_hook`.

This is indeed useful to prevent spawning parallel workers during index builds.
2021-09-08 13:53:43 +03:00
Onur Tirtir 5825c44d5f
Handle aborted writes properly when scanning a columnar table (#5244)
If it is certain that we will not use any `parallel_worker`s for a columnar table,
then stripe entries inserted by aborted transactions become visible to
`SnapshotAny` and that causes `REINDEX` to fail by throwing a duplicate key
error.

To fix that:
* consider three states for a stripe write operation:
   "flushed", "aborted", or "in-progress",
* make sure to have a clear separation between them, and
* act according to those three states when reading from a columnar table
2021-09-08 13:26:11 +03:00
Onur Tirtir 5dc619162d
Add valgrind test target for multi-1 (#5251) 2021-09-07 16:27:34 +03:00
Jelte Fennema bb5c494104 Enable binary encoding by default on PG14
Since PG14 we can now use binary encoding for arrays and composite types
that contain user defined types. This was fixed in this commit in
Postgres: 670c0a1d47

This change starts using that knowledge, by not necessarily falling back
to text encoding anymore for those types.

While doing this and testing a bit more I found various cases where
binary encoding would fail that our checks didn't cover. This fixes
those cases and adds tests for those. It also fixes EXPLAIN ANALYZE
never using binary encoding, which was a leftover of a workaround that
was not necessary anymore.

Finally, it changes the default for both `citus.enable_binary_protocol`
and `citus.binary_worker_copy_format` to `true` for PG14 and up. In our
cloud offering `binary_worker_copy_format` already was true by default.
`enable_binary_protocol` had some bug with MX and user defined types,
this bug was fixed by the above mentioned fixes.
2021-09-06 10:27:29 +02:00
Burak Velioglu c3895f35cd
Add helper UDFs for easy time partition management
- get_missing_time_partition_ranges: Gets the ranges of missing partitions for the given table, interval and range unless any existing partition conflicts with calculated missing ranges.

- create_time_partitions: Creates partitions by getting range values from get_missing_time_partition_ranges.

- drop_old_time_partitions: Drops partitions of the table older than given threshold.
2021-09-03 23:03:13 +03:00
Onur Tirtir 2b71263e40
Align columnar path costing functions (#5239)
* Rename RecostColumnarPaths to CostColumnarPaths

* Rename RecostColumnarIndexPath to CostColumnarIndexPath

* Reorder args of CostColumnarScan to align with other two costing functions

* Not adjust index scan start-up cost

* Rename ColumnarIndexScanAddTotalCost to ColumnarIndexScanAdditionalCost

* Reflect that index scan will at least read one stripe in totalCost calculation

* Organize declarations in columnar_customscan.c
2021-09-03 19:37:42 +03:00
jeff-davis cc58b58f73
Columnar: reserve metapage flag for UNLOGGED support. (#5237)
Reserve space in the metapage for a flag to support UNLOGGED tables in
the future without a metapage upgrade.
2021-09-03 08:40:55 -07:00
Halil Ozan Akgul 7fadfb74bb Adds error message for REINDEX TABLE queries on distributed partitioned tables 2021-09-03 16:46:42 +03:00
Sait Talha Nisanci 3ad3bbba84 Apply latest version compat without conflicts 2021-09-03 16:09:59 +03:00
Sait Talha Nisanci 0b67fcf81d Fix style 2021-09-03 16:09:59 +03:00
Halil Ozan Akgul e1f5520e1a Adds propagation of ALTER TABLE .. ALTER COLUMN .. SET COMPRESSION .. 2021-09-03 15:44:28 +03:00
SaitTalhaNisanci 902af39a04 Add join alias tests (#5233)
PG COMMIT: 055fee7eb4dcc78e58672aef146334275e1cc40d
2021-09-03 15:44:28 +03:00
SaitTalhaNisanci 2a2ebab1fa Add tests for jsonb subscripting (#5232)
PG commit: 676887a3b0b8e3c0348ac3f82ab0d16e9a24bd43
2021-09-03 15:44:28 +03:00
Ahmet Gedemenli 2b263f9a2a ALTER STATISTICS .. OWNER TO CURRENT_ROLE (#5225)
(cherry picked from commit 42322caf90ca094777aa01376e02d1187afc1560)
2021-09-03 15:44:28 +03:00
Onder Kalaci 82a3b20fb3 Fix flaky test 2021-09-03 15:44:28 +03:00
Onder Kalaci 5844ab286c Support OUT parameters in procedure pushdown delegation
In PG 14, procedures can have OUT parameters. In Citus' procedure
delegation framework, we need to adjust the function expression
to get the outargs parameters.

Relevant PG change:
e56bce5d43
2021-09-03 15:44:28 +03:00
Ahmet Gedemenli 1ff7186d20 Extended statistics on expressions - PG14 a4d75c8 (#5224)
(cherry picked from commit 1268415f123b5d99cfacfe207c8670240efc1c00)
2021-09-03 15:44:28 +03:00
Halil Ozan Akgul 113d5d6615 Adds support for column compression in table distribution 2021-09-03 15:44:28 +03:00
Ahmet Gedemenli 6fbdeb38a8 ALTER TABLE ... DETACH PARTITION ... CONCURRENTLY - PG14 #71f4c8c (#5223) 2021-09-03 15:44:28 +03:00
Onder Kalaci c431bb2e45 Add support for "COPY dist/ref tables FROM" progress report
Simply call Postgres' function to report the progress on
each row received.

Note that we currently do not support "COPY dist/ref TO .." progress
report nicely. Citus has some specialized logic to support
"COPY dist/ref TO .." such that it either converts the underlying
command into "COPY (SELECT * FROM dist/ref ) ..." or sends COPY
command to shards directly. In the former case, "tuples_processed"
is only updated when the executor returns all the tuples, so the
progress is not accurate. In the latter case, Citus can actually
implement the progress report. But, for the sake of consistency,
we prefer to not implement at all.

Added to PG 14 with https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8a4f618e7ae3cb11b0b37d0f06f05c8ff905833f
2021-09-03 15:44:28 +03:00
Ahmet Gedemenli 66303785f3 Add option PROCESS_TOAST to VACUUM - PG14 #7cb3048 (#5219)
(cherry picked from commit e63bdfc49f9203db14ef77313c1d5e3461a84a32)
2021-09-03 15:44:28 +03:00
Sait Talha Nisanci 35a3f7240d CHANGELOG: Allow REINDEX to change the tablespace of the new index 2021-09-03 15:44:28 +03:00
Sait Talha Nisanci 4e85d9ffce Add empty pg14 sql file 2021-09-03 15:44:28 +03:00
Sait Talha Nisanci 307eb81278 Fix failure for 1pc_copy_hash 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci a6c40ebd14 Fix multi_follower_dml
When the_table is empty, we don't get an error with pg14 anymore so we
replace it with generate_series so that we get the error.
2021-09-03 15:41:28 +03:00
Sait Talha Nisanci b16dadbe7c Avoid NOTICE message to avoid an alternative output with pg14 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci 6ff609fa86 Add alternative output for data_types
It seems like there is a problem with Postgres14 with SELECT DISTINCT
COUNT. The issue is reported to Postgres and an alternative output is
added. We can remove the alternative output when the issue is fixed on
PG. If this is not an issue on PG(which is unlikely) we should consider
some other solution.
2021-09-03 15:41:28 +03:00
Sait Talha Nisanci 2fa1e5ffe3 Use the default max_parallel_workers_per_gather for vanilla 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci 4b951a2ed9 Add alternative output for multi-mx 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci 96964aeee5 Turn off debug for one query to avoid adding an alternative output 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci e7607b6bed Add a helper function to check explain has a single task
In order to avoid adding an alternative output, a function to check if a
given explain plan has a single task is added. This doesn't change what the
changed tests intend to do.
2021-09-03 15:41:28 +03:00
Sait Talha Nisanci e0faf34417 turn off costs in columnar_indexes explain query 2021-09-03 15:41:28 +03:00
Nils Dijk e63302d012 update error messages for libpq 14beta3 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci 2656d885f9 Rewrite AppendColumnNames for Pg14
Postgres changed stats expression types as of PG14. Hence we needed to
rewrite the AppendColumnNames method. Also they removed the error on PG
side so we remove it as well.

Relevant commits on pg14:
a4d75c86bf15220df22de0a92c819ecef9db3849
388e75ad33489b77cfb9a8590a91e9287d8fb960
2021-09-03 15:41:28 +03:00
Sait Talha Nisanci d1c0403055 Disable Query Identifier calculation in tests
When queryId is not 0 and verbose is true, the query identifier is
emitted to the explain output. This is breaking Postgres outputs.
We disable the query identifier calculation in the tests.
Commit on PG that introduced the query identifier in the explain output:
4f0b0966c866ae9f0e15d7cc73ccf7ce4e1af84b
2021-09-03 15:41:28 +03:00
Sait Talha Nisanci 7c0389a7a1 Update propagate extension commands test for pg12
The test file was changed slightly to avoid adding an alternative
output. We update the existing alternative output for pg12 with the new
changes.
2021-09-03 15:41:28 +03:00
Sait Talha Nisanci cd402b6a2b Add alternative output for pg12 for window_functions 2021-09-03 15:41:28 +03:00
Halil Ozan Akgul c31b0c2652 Sets next_shard_id at partition_wise_join test 2021-09-03 15:41:28 +03:00
Halil Ozan Akgul 9fc4c27b08 Readds deleted resultRelInfo changes for previous PG versions
These changes were removed in commit: Introduces ExecSimpleRelationInsert_compat and modifyStateResultRelInfo macros
We shouldn't have removed them but instead kept them for before PG14
2021-09-03 15:41:28 +03:00
Sait Talha Nisanci aca2b8b675 Add alternative output for isolation_master_update_node 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci f3fa133caa Bind seg version to 1.3 in isolation_extension_commands 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci 75fff14792 Turn off VERBOSE to avoid alternative output
With VERBOSE option, as of PG14, we get a line with "Query Identifier".
2021-09-03 15:41:28 +03:00
Sait Talha Nisanci 6b65dbc492 Add partition_wise_join to avoid big alternative output
There was a small part in multi_partitioning that would need an
alternative output for pg14. Instead of adding an alternative for the
whole file, we created a new file, called partition_wise_join.sql and
added the alternative output for that.
2021-09-03 15:41:28 +03:00
Sait Talha Nisanci 375a1adc9e Check if extversion is the same for seg extension
When we check the exact version of the seg extension, it becomes a
problem when its version changes, such as from 1.3 to 1.4. So now we
changed the check to verify that the version is the same across the
whole cluster.
2021-09-03 15:41:28 +03:00
Halil Ozan Akgul ca0d4c3bde Includes pg_version_constants.h in columnar_version_compat.h 2021-09-03 15:41:28 +03:00
Halil Ozan Akgul 7823e49219 Introduces pg_get_statisticsobj_worker_compat macro
Relevant PG commit:
a4d75c86bf15220df22de0a92c819ecef9db3849
2021-09-03 15:41:28 +03:00
Halil Ozan Akgul f16d5e1833 Introduces make_simple_restrictinfo_compat and pull_varnos_compat macros
make_simple_restrictinfo and pull_varnos functions now have a new parameter
These new macros give us the ability to use this new parameter for PG14 and they don't give the parameter for previous versions

Relevant PG commit:
55dc86eca70b1dc18a79c141b3567efed910329d
2021-09-03 15:41:28 +03:00
Halil Ozan Akgul 9b6ce10892 Removes password outputs from alter_role_propagation tests 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci 20c32a7a1d Add alternative output for multi_deparse_function
Postgres tightened up its checks for invalid GUC names hence we started
to get an alternative output for one of our tests. We add an alternative
output since the file is relatively small.

Commit on PG:
3db826bd55cd1df0dd8c3d811f8e5b936d7ba1e4
2021-09-03 15:41:28 +03:00
Sait Talha Nisanci 256e7d1540 Add alternative output for window_functions 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci df9b7149c3 Add some normalization rules for pg14 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci dc81cae18f Turn off COSTS to avoid alternative output for pg14 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci fb8671f291 Change pg13 test to not differ with pg14 to avoid adding alternative output 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci 3f5c178c93 Remove VERBOSE output to make pg14 and pg13 output the same 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci abd3c1089b Use oid_hash in write state management 2021-09-03 15:41:28 +03:00
Halil Ozan Akgul 8ef94dc1f5 Changes array_cat argument type from anyarray to anycompatiblearray
Relevant PG commit:
9e38c2bb5093ceb0c04d6315ccd8975bd17add66

fix array_cat_agg for pg upgrades

array_cat_agg now needs to take anycompatiblearray instead of anyarray
because array_cat changed its type from anyarray to anycompatiblearray
with pg14.

To handle upgrades correctly, we drop the aggregate in
citus_pg_prepare_upgrade. To be able to drop it, we first remove the
dependency from pg_depend.

Then we create the right aggregate in citus_finish_pg_upgrade and we
also add the dependency back to pg_depend.
2021-09-03 15:41:28 +03:00
Sait Talha Nisanci a1bfb4f31b Fix unlimited copy size variable's value 2021-09-03 15:41:28 +03:00
Sait Talha Nisanci 29f5b99951 Use empty string instead of NULL for queryString
Postgres doesn't accept NULL for queryStrings in explain plans anymore.
Internally, there are some places in Postgres where they modified the
NULLS to ""(the empty string). So we do the same on citus side.

Commit on Postgres:
1111b2668d89bfcb6f502789158b1233ab4217a6
2021-09-03 15:27:25 +03:00
Sait Talha Nisanci 96833e2b8f Use HASH_STRINGS explicitly in hash functions
Postgres expects to set the HASH_STRINGS explicitly in case of the
default behavior for string hash function.

Postgres Commit
b3817f5f774663d55931dd4fab9c5a94a15ae7ab
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 5930378f61 Renames shadowing ruleutils_14.c variables 2021-09-03 15:27:25 +03:00
Halil Ozan Akgul b21a00e775 Introduces index_insert_compat macro
index_insert function now has a new parameter, indexUnchanged
This new macro gives us the ability to use this new parameter for PG14 and it doesn't give the parameter for previous versions
Existing parameter is set to false

Relevant PG commit:
9dc718bdf2b1a574481a45624d42b674332e2903
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul fd2ca2825b Introduces ExecSimpleRelationInsert_compat and modifyStateResultRelInfo macros
es_result_relation_info is removed from EState. In this commit we make some changes to handle that.
resultRelationInfo field is added to ModifyState to support the removed field.

Relevant PG commits:
1375422c7826a2bf387be29895e961614f69de4b
a04daa97a4339c38e304cd6164d37da540d665a8
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul b644ac55c6 Introduces GetOldestNonRemovableTransactionId_compat macro
GetOldestXmin function is removed so we use GetOldestNonRemovableTransactionId functions instead
GetOldestNonRemovableTransactionId_compat picks the appropriate one

Relevant PG commit:
dc7420c2c9274a283779ec19718d2d16323640c0
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul cb3b76ed24 Introduces get_partition_parent_compat and RelationGetPartitionDesc_compat macros
get_partition_parent and RelationGetPartitionDesc functions now have new parameters to also include detached partitions
These new macros give us the ability to use these new parameters for PG14 and they don't give the parameters for previous versions
Existing parameters are set to not accept detached partitions

Relevant PG commit:
71f4c8c6f74ba021e55d35b1128d22fb8c6e1629
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 898d3bb8d3 Introduces proc_statusflags_compat macro
In two commits vacuumFlags in PGXACT is moved and then renamed to status flags
This macro uses the appropriate version of the flag

Relevant PG commits:
5788e258bb26495fab65ff3aa486268d1c50b123
cd9c1b3e197a9b53b840dcc87eb41b04d601a5f9
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 287706b717 Introduces SetTuplestoreDestReceiverParams_compat macro
SetTuplestoreDestReceiverParams function now has two new parameters
This new macro gives us the ability to use these new parameters for PG14 and it doesn't give the parameters for previous versions
Existing parameters are set to NULL to keep previous behavior

Relevant PG commit:
2f48ede080f42b97b594fb14102c82ca1001b80c
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul b01e7e884c Pass NULL for plannerInfo as we don't generate PlaceHolderVars 2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 86d9260781 Uses lfirst_node in ruleutils_14.c
Relevant PG commit:
2b00db4fb0c7f02f000276bfadaab65a14059168
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 3b7bcf7555 Adds missing include_out_argument parameter to func_get_detail in ruleutils_14.c
Relevant PG commit:
e56bce5d43789cce95d099554ae9593ada92b3b7
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 2990cfb6c9 Adds SQL-standard function body support to ruleutils_14.c
Relevant PG commit:
e717a9a18b2e34c9c40e5259ad4d31cd7e420750
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 84f0be56c3 Adds EXTRACT cases to get_func_sql_syntax in ruleutils_14.c
Relevant PG commit:
a2da77cdb4661826482ebf2ddba1f953bc74afe4
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 131062d6b5 Removes ModifyTable check from set_deparse_plan in ruleutils_14.c
Relevant PG commit:
86dc90056dfdbd9d1b891718d2e5614e3e432f35
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul f557bae64c Adds JOIN ... USING alias to ruleutils_14.c
Relevant PG commit:
055fee7eb4dcc78e58672aef146334275e1cc40d
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul c3f0528607 Extends statistics on expressions in ruleutils_14.c
Relevant PG commit:
a4d75c86bf15220df22de0a92c819ecef9db3849
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul af2853d1de Adds GROUP BY DISTINCT to ruleutils_14.c
Relevant PG commit:
be45be9c33a85e72cdaeb9967e9f6d2d00199e09
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 5bb538543d Enhances cycle mark values at ruleutils_14.c
Relevant PG commit:
f4adc41c4f92cc91d507b19e397140c35bb9fd71
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 12b3c04fe3 Adds SEARCH and CYCLE clauses to ruleutils_14.c
Relevant PG commit:
3696a600e2292d43c00949ddf0352e4ebb487e5b
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 1174046a33 Adds bytea equivalents of ltrim() and rtrim() to ruleutils_14.c
Relevant PG commit:
a6cf3df4ebdcbc7857910a67f259705645383e9f
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 71691ecf06 Adds HASH_STRINGS flag to ruleutils_14.c
Relevant PG commit:
b3817f5f774663d55931dd4fab9c5a94a15ae7ab
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul e72bd0c1a1 Removes dependency.h from ruleutils_14.c
Relevant PG commit:
8b069ef5dca97cd737a5fd64c420df3cd61ec1c9
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul d4874f5ad2 Removes indexing.h header from ruleutils_14.c
Relevant PG commit:
bdc4edbea6fc847f806e1e7118d730e159512bfc
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 1cb865deb8 Adds SQL syntax function calls related changes to ruleutils_14.c
Relevant PG commit:
40c24bfef92530bd846e111c1742c2a54441c62c
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul b4f76303c6 Updates F_ARRAY_UNNEST to F_UNNEST_ANYARRAY in ruleutils_14.c
Relevant PG commit:
8e1f37c07aafd4bb7aa6e1e1982010af11f8b5c7
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 30f77b29a7 Fixes some appendStringInfos in ruleutils_14.c
Relevant PG commit:
110d81728a0a006abcf654543fc15346f8043dc0
2021-09-03 15:27:25 +03:00
Halil Ozan Akgul 69aa240b99 Adds for_each_from to ruleutils_14.c
Relevant PG commit:
56fe008996bc1a547ce60c8dddd2ca821cac163e
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul beb49f0d53 Updates AlternativeSubPlan comment in ruleutils_14.c
Relevant PG commit:
41efb8340877e8ffd0023bb6b2ef22ffd1ca014d
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul e642f6c97f Removes support for postfix operators from ruleutils_14.c
Relevant PG commit:
1ed6b895634ce0dc5fd4bd040e87252b32182cba
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul a710b3b949 Removes some comments with printf %.*s format from ruleutils_14.c
Relevant PG commit:

c410af098c46949e36607eb13689e697fa2def97
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul e38b75799d Fixes some indentation in ruleutils_14.c
Relevant PG commit:

fa27dd40d5c5f56a1ee837a75c97549e992e32a4
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 1d5053b652 Removes support for old protocols in Copy functions from PG14
Some Copy-related functions copied from Postgres supported both the old and new protocols.
Postgres removed support for the old protocol, so we remove it too.

Relevant PG commit:
3174d69fb96a66173224e60ec7053b988d5ed4d9
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 82858ca8fe Introduces ProcessUtility macros for readOnlyTree parameter
New macros: standard_ProcessUtility_compat, ProcessUtility_compat, ColumnarProcessUtility_compat, PrevProcessUtilityHook_compat

The functions now have a new bool parameter: readOnlyTree.
These new macros pass this parameter on PG14 and omit it on previous versions.

In multi_ProcessUtility and ColumnarProcessUtility, before doing anything else, we check whether the readOnlyTree parameter is true and, if so, create a copy of pstmt.
Existing readOnlyTree arguments are set to false since we already handle the read-only case in multi_ProcessUtility and ColumnarProcessUtility.

Relevant PG commit:
7c337b6b527b7052e6a751f966d5734c56f668b5
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 5df6251619 Removes CopyGetAttnums function definition for PG14
This function was copied from Postgres, but it is no longer static in PG14.
So we keep the definition only for previous versions.

Relevant PG commit:
c532d15dddff14b01fe9ef1d465013cb8ef186df
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul db2d9af863 Introduces BeginCopyFrom_compat macro
The BeginCopyFrom function now has a new whereClause parameter.
Inside the function, this parameter is assigned to the whereClause field of the returned CopyFromState.
Currently in Postgres there is only one place where this argument isn't NULL, and in previous PG versions the whereClause field of the copy state was set right after the function call.
Since we have no such case, all current whereClause arguments are set to NULL.

Relevant PG commit:
c532d15dddff14b01fe9ef1d465013cb8ef186df
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 35cfa5d7b9 Introduces CopyFromState_compat macro
The CopyState struct is divided into parts, and one of them is CopyFromState.
This macro uses the appropriate one for each PG version.

Relevant PG commit:
c532d15dddff14b01fe9ef1d465013cb8ef186df
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 8f34f84ce6 Introduces IsReindexWithParam_compat macro
In ReindexStmt, the concurrent field is moved into options, and options are then converted to a params list.
This macro uses the previous fields on previous versions and, on PG14, the new params list via a new function named IsReindexWithParam.

Relevant PG commits:
844c05abc3f1c1703bf17cf44ab66351ed9711d2
b5913f6120792465f4394b93c15c2e2ac0c08376
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 37ae22ce3e Introduces macros for vacuum options
VacOptTernaryValue enum is renamed to VacOptValue.
In the enum there were three values, VACOPT_TERNARY_DEFAULT, VACOPT_TERNARY_DISABLED, and VACOPT_TERNARY_ENABLED
Now there are four values VACOPTVALUE_UNSPECIFIED, VACOPTVALUE_AUTO, VACOPTVALUE_DISABLED, and VACOPTVALUE_ENABLED

New macros are VacOptValue_compat, VACOPTVALUE_UNSPECIFIED_COMPAT, VACOPTVALUE_DISABLED_COMPAT, and VACOPTVALUE_ENABLED_COMPAT
VACOPTVALUE_UNSPECIFIED_COMPAT maps to VACOPT_TERNARY_DEFAULT on previous versions and to VACOPTVALUE_UNSPECIFIED on PG14. There is no macro for VACOPTVALUE_AUTO.
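
A minimal sketch of how such version-dependent names can be hidden behind `_compat` macros. The enums here are local stand-ins for the ones Postgres headers provide, the `PG_VERSION_NUM` default is an assumption so the snippet compiles on its own, and `ResolveVacuumOption` is a hypothetical caller, not a Citus function.

```c
#include <stdbool.h>

/* Assumption: default to a PG14 version number when built standalone. */
#ifndef PG_VERSION_NUM
#define PG_VERSION_NUM 140000
#endif

#if PG_VERSION_NUM >= 140000
/* Stand-in for the PG14 enum; the real one comes from Postgres headers. */
typedef enum VacOptValue
{
    VACOPTVALUE_UNSPECIFIED = 0,
    VACOPTVALUE_AUTO,
    VACOPTVALUE_DISABLED,
    VACOPTVALUE_ENABLED
} VacOptValue;
#define VacOptValue_compat VacOptValue
#define VACOPTVALUE_UNSPECIFIED_COMPAT VACOPTVALUE_UNSPECIFIED
#define VACOPTVALUE_DISABLED_COMPAT VACOPTVALUE_DISABLED
#define VACOPTVALUE_ENABLED_COMPAT VACOPTVALUE_ENABLED
#else
/* Stand-in for the pre-PG14 enum. */
typedef enum VacOptTernaryValue
{
    VACOPT_TERNARY_DEFAULT = 0,
    VACOPT_TERNARY_DISABLED,
    VACOPT_TERNARY_ENABLED
} VacOptTernaryValue;
#define VacOptValue_compat VacOptTernaryValue
#define VACOPTVALUE_UNSPECIFIED_COMPAT VACOPT_TERNARY_DEFAULT
#define VACOPTVALUE_DISABLED_COMPAT VACOPT_TERNARY_DISABLED
#define VACOPTVALUE_ENABLED_COMPAT VACOPT_TERNARY_ENABLED
#endif

/*
 * Hypothetical caller written once against the _compat names.
 * Anything other than an explicit enabled/disabled value falls back
 * to the given default.
 */
bool
ResolveVacuumOption(VacOptValue_compat value, bool defaultValue)
{
    if (value == VACOPTVALUE_ENABLED_COMPAT)
    {
        return true;
    }
    if (value == VACOPTVALUE_DISABLED_COMPAT)
    {
        return false;
    }
    return defaultValue;
}
```

The caller never mentions a version-specific name, so the same source builds against either enum.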

Relevant PG commit:
3499df0dee8c4ea51d264a674df5b5e31991319a
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul ebf1b7e23f Introduces macros for functions that now have include_out_arguments argument
New macros: FuncnameGetCandidates_compat and expand_function_arguments_compat

The functions (the ones without _compat) now have a new bool include_out_arguments parameter.
These new macros pass this parameter on PG14 and omit it on previous versions.
Existing include_out_arguments parameters are set to 'false' to keep current behavior
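
The general shape of these parameter-dropping macros can be sketched as follows. `CountCandidates` is a hypothetical stand-in for a Postgres function that gained a bool parameter in PG14 (it is not a real Postgres API), and the `PG_VERSION_NUM` default is an assumption for standalone compilation.

```c
#include <stdbool.h>

/* Assumption: default to a PG14 version number when built standalone. */
#ifndef PG_VERSION_NUM
#define PG_VERSION_NUM 140000
#endif

#if PG_VERSION_NUM >= 140000

/* Hypothetical PG14 signature: one extra bool parameter. */
int
CountCandidates(int nargs, bool include_out_arguments)
{
    return include_out_arguments ? nargs + 1 : nargs;
}

/* On PG14 the macro forwards every argument. */
#define CountCandidates_compat(nargs, include_out_arguments) \
    CountCandidates(nargs, include_out_arguments)

#else

/* Hypothetical pre-PG14 signature without the parameter. */
int
CountCandidates(int nargs)
{
    return nargs;
}

/* On older versions the macro silently drops the extra argument. */
#define CountCandidates_compat(nargs, include_out_arguments) \
    CountCandidates(nargs)

#endif

/* Caller code is identical for every version; false keeps old behavior. */
int
CountCandidatesForCall(int nargs)
{
    return CountCandidates_compat(nargs, false);
}
```

Passing `false` at every call site is what keeps the pre-PG14 behavior unchanged, as the commit message notes.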

Relevant PG commit:
e56bce5d43789cce95d099554ae9593ada92b3b7
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 347ae2928f Introduces stats_compat macro for MemoryContextMethods->stats
The stats function now has a new bool print_to_stderr parameter.
This new macro passes this parameter on PG14 and omits it on previous versions.
The existing print_to_stderr argument is set to true to keep current behavior.

Relevant PG commit:
43620e328617c1f41a2a54c8cee01723064e3ffa
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 54ee93885a Introduces getObjectTypeDescription_compat and getObjectIdentity_compat macros
The getObjectTypeDescription and getObjectIdentity functions now have a new bool missing_ok parameter.
These new macros pass this parameter on PG14 and omit it on previous versions.
Currently all missing_ok arguments are set to false to keep current behavior.

Relevant PG commit:
2a10fdc4307a667883f7a3369cb93a721ade9680
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul f8d3e50f25 Introduces STATUS_WAITING_COMPAT macro
The STATUS_WAITING define is removed, and an enum with PROC_WAIT_STATUS_WAITING is added instead.
This macro uses the appropriate one.

Relevant PG commit:
a513f1dfbf2c29a51b0f7cbd5913ce2d2ee452c5
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 3c10e0f568 Introduces ROLE_MONITOR_COMPAT macro
DEFAULT_ROLE_MONITOR is renamed to ROLE_PG_MONITOR
This macro uses the appropriate one.

Relevant PG commit:
c9c41c7a337d3e2deb0b2a193e9ecfb865d8f52b
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 4bc0c80bba Adds index_delete_tuples instead of compute_xid_horizon_for_tuples
Relevant PG commit:
d168b666823b6e0bcf60ed19ce24fb5fb91b8ccf
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul b790ecf180 Introduces F_NEXTVAL_COMPAT macro
Name of F_NEXTVAL_OID is changed to F_NEXTVAL

Relevant PG commit:
8e1f37c07aafd4bb7aa6e1e1982010af11f8b5c7
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul f933d2a57a Includes defrem.h in index.c 2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 63cdb4b70a Adds AlterTableStmtObjType macro
AlterTableStmt's relkind field is changed into objtype
New AlterTableStmtObjType macro uses the appropriate one

Relevant PG commit:
cc35d8933a211d9965eb1c1d2749a903d5735db2
2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 1b6c8348fb Adds PG14 to version_compat.h and columnar_version_compat.h files 2021-09-03 15:27:24 +03:00
Halil Ozan Akgul 7a27d7cee3 Adds copy of ruleutils_13.c as ruleutils_14.c 2021-09-03 15:27:24 +03:00
jeff-davis 4718b6bcdf
Generate parameterized paths for columnar scans. (#5172)
Allow ColumnarScans to push down join quals by generating
parameterized paths. This significantly expands the utility of chunk
group filtering, making a ColumnarScan behave similarly to an index when
on the inner side of a nested loop join.

Also, evaluate all parameters on beginscan/rescan, which also works
for external parameters.

Fixes #4488.
2021-09-02 22:22:48 -07:00
Onur Tirtir 37d0ecfbb7 Show projected cols for columnar tables in EXPLAIN output 2021-09-02 19:05:32 +03:00
Onur Tirtir 42ba82fb67 Comment ColumnarAttrNeeded 2021-09-02 13:20:11 +03:00
Onur Tirtir 9cb5ef5007 Pass ColumnarScanDesc to ColumnarScanChunkGroupsFiltered 2021-09-02 13:20:11 +03:00
Naisila Puka 4fb05efabb
Distributes partition-to-be table before ProcessUtility (#5191)
* Skip ALTER TABLE constraint checks while planning

* Revert previous commit's solution, keep tests

* Distribute partition-to-be table before ProcessUtility

* Acquire locks in PreprocessAlterTableStmtAttachPartition
2021-09-02 13:07:42 +03:00
Onur Tirtir 889a2731cb
Split columnar stripe reservation into two phases (#5188)
Previously, we were doing `first_row_number` reservation for the first
row written to current `WriteState` but were doing `stripe_id`
reservation when flushing the `WriteState` and were inserting the
related record to `columnar.stripe` at that time as well.

However, inserting the `columnar.stripe` record at flush time is
problematic. As described in #5160, if the relation has
any index-based constraints and if there are two concurrent
writes that are inserting conflicting key values for that constraint,
then postgres relies on `tableAM->fetch_index_tuple`
(=`columnar_fetch_index_tuple`) callback to return `true` when
indexAM is checking against possible constraint violations.

However, pending writes of other backends are not visible to concurrent
sessions in columnar since we were not inserting the stripe metadata
record until flushing the stripe.

With this commit, we split stripe reservation into two phases:
i) Reserve `stripe_id` and insert a "dummy" record to `columnar.stripe`
at the very same time we reserve `first_row_number`, i.e. when writing
the first row to the current `WriteState`.
ii) At flush time, do the storage level allocation and complete the
missing fields of the dummy record inserted into `columnar.stripe`
during i).
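
The two phases can be modeled with a small in-memory sketch. This is not the Citus implementation: `StripeCatalog`, `ReserveStripe`, `CompleteStripe` and `RowNumberIsReserved` are hypothetical names, the array stands in for `columnar.stripe`, and snapshot visibility is not modeled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_STRIPES 16
#define STRIPE_ROW_LIMIT 150000 /* default stripe row limit */

typedef struct StripeRecord
{
    uint64_t stripeId;
    uint64_t firstRowNumber;
    uint64_t rowCount;   /* unknown (0) until flush */
    uint64_t fileOffset; /* unknown (0) until flush */
    bool flushed;
} StripeRecord;

/* Hypothetical stand-in for the columnar.stripe metadata table. */
typedef struct StripeCatalog
{
    StripeRecord stripes[MAX_STRIPES];
    int stripeCount;
    uint64_t nextStripeId;
    uint64_t nextRowNumber;
    uint64_t nextFileOffset;
} StripeCatalog;

/* Phase i: reserve ids and insert a "dummy" record on the first row write. */
StripeRecord *
ReserveStripe(StripeCatalog *catalog)
{
    assert(catalog->stripeCount < MAX_STRIPES);

    StripeRecord *stripe = &catalog->stripes[catalog->stripeCount++];
    stripe->stripeId = catalog->nextStripeId++;
    stripe->firstRowNumber = catalog->nextRowNumber;
    stripe->rowCount = 0;
    stripe->fileOffset = 0;
    stripe->flushed = false;

    /* the whole row number range of the stripe is reserved up front */
    catalog->nextRowNumber += STRIPE_ROW_LIMIT;
    return stripe;
}

/* Phase ii: at flush time, allocate storage and fill the missing fields. */
void
CompleteStripe(StripeCatalog *catalog, StripeRecord *stripe,
               uint64_t rowCount, uint64_t dataLength)
{
    assert(!stripe->flushed && rowCount <= STRIPE_ROW_LIMIT);

    stripe->fileOffset = catalog->nextFileOffset;
    catalog->nextFileOffset += dataLength;
    stripe->rowCount = rowCount;
    stripe->flushed = true;
}

/* What a concurrent constraint check can learn: is this row number taken? */
bool
RowNumberIsReserved(const StripeCatalog *catalog, uint64_t rowNumber)
{
    for (int i = 0; i < catalog->stripeCount; i++)
    {
        const StripeRecord *stripe = &catalog->stripes[i];
        uint64_t rowLimit = stripe->flushed ? stripe->rowCount : STRIPE_ROW_LIMIT;

        if (rowNumber >= stripe->firstRowNumber &&
            rowNumber < stripe->firstRowNumber + rowLimit)
        {
            return true;
        }
    }
    return false;
}
```

The point of the model is that `RowNumberIsReserved` can already answer for a stripe that has not been flushed, because the dummy record exists from phase i onward.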

That way, any concurrent writes would be able to check against possible
constraint violations by using `SnapshotDirty` when scanning
`columnar.stripe`.

Note that `columnar_fetch_index_tuple` still wouldn't be able to fill
the output tupleslot for the requested tid, but it would at least return
`true` for such index lookups, and we believe this should be sufficient
for the caller indexAM callback to make the concurrent writer block on
the prior one.

That is how we fix #5160.

The only downside of reserving `stripe_id` at the same time we reserve
`first_row_number` is that aborted writes now also waste `stripe_id`
values, as they already do for `first_row_number`, but they are
wasted only one at a time.

Considering that such cases waste `first_row_number` values by the
stripe row limit (=150k by default), this shouldn't be
important at all.
2021-09-02 11:49:14 +03:00
Onur Tirtir bf4dfad6f7 Update curcid of given snapshot if it is MVCC
Before starting to scan a columnar table, we always flush the pending
writes to disk.

However, we increment command counter after modifying metadata tables.

On the other hand, now that we _don't always use_ xact snapshot to scan
a columnar table, writes that we just flushed might not be visible to
the query that just flushed pending writes to disk since curcid of
provided snapshot would become smaller than the command id being used
when modifying metadata tables.

To give an example, before this change, below was a possible scenario
due to the changes that we made to use the correct snapshot.

```sql
CREATE TABLE t(a int, b int) USING columnar;
BEGIN;
  INSERT INTO t VALUES (5, 10);

  SELECT * FROM t;
  ┌───┬───┐
  │ a │ b │
  ├───┼───┤
  └───┴───┘
  (0 rows)

  SELECT * FROM t;
  ┌───┬────┐
  │ a │ b  │
  ├───┼────┤
  │ 5 │ 10 │
  └───┴────┘
  (1 row)
```
2021-09-02 11:11:59 +03:00
Onur Tirtir 6c26c67ea0 Flush write state when initializing read state
In next commit, we will adjust curcid of the snapshot being used when
scanning the columnar table.

However, for index scan, snapshot is provided not when beginning scan
but within fetch-tuple call.

For this reason, start flushing pending writes in init_columnar_read_state
since this seems to be a prerequisite step that needs to be done before
scanning a columnar table regardless of the scan method being used.
2021-09-02 11:10:11 +03:00
Onur Tirtir db0e4ce889 Increment command counter in FinishModifyRelation instead
Seems that we always increment the command counter right after
finishing metadata table modification.

For this reason, it makes sense to call CommandCounterIncrement
within FinishModifyRelation.
2021-09-02 11:10:11 +03:00
Onur Tirtir 0b4ed075b5 Use correct snapshot when reading a columnar table
Instead of using xact snapshot, use the snapshot provided
to columnarAM when scanning table.
2021-09-02 11:10:11 +03:00
Naisila Puka bd91df298f
Fixes ConnectionModifiedPlacement output for a failed transaction (#5198) 2021-08-31 18:58:46 +03:00
Naisila Puka 7755d5ed3a
Fixes order of citus_drop_all_shards arguments (#5200) 2021-08-31 18:25:38 +03:00
Naisila Puka acb5ae6ab6
Skip dropping shards when we know it's a partition (#5176) 2021-08-31 17:41:37 +03:00
SaitTalhaNisanci 5ae01303d4
Use get_attnum to find the attribute number of target entry (#5220)
* Use get_attnum to find the attribute number of target entry
2021-08-31 16:47:19 +03:00
Jelte Fennema 481f8be084
Fix crash in shard rebalancer when no distributed tables exist (#5205)
The logging of the amount of ignored moves crashed when no distributed
tables existed in a cluster. This also fixes in passing that the logging
of ignored moves logs the correct number of ignored moves if there
exist multiple colocation groups and all are rebalanced at the same time.
2021-08-31 14:15:24 +02:00
SaitTalhaNisanci d50830d4cc
Update failure tests README (#5197)
* Update failure tests README

I keep finding this page when trying to run failure tests, so updating the README that way:
https://github.com/pypa/pipenv/issues/3363#issuecomment-452171564

Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
2021-08-26 12:35:06 +03:00
Hanefi Onaldi 7e39c7ea83
Replace master with citus in logs and comments (#5210)
I replaced 

- master_add_node,
- master_add_inactive_node
- master_activate_node

with

- citus_add_node,
- citus_add_inactive_node
- citus_activate_node

respectively.
2021-08-26 11:31:17 +03:00
SaitTalhaNisanci b923d51fc6
Bump pg12 and pg13 images to pg12.8 and pg13.8 (#5208)
In our testing infrastructure, even though we use pinned versions of postgres, the auxiliary libraries might pull in newer versions. This is, for example, the case for libpq, which will now use the libpq libraries from 14beta3.

Many of the changes in this PR are due to the libpq changes.

We also changed the citus version that is used as a base for the citus upgrades from 10.0 to 10.1. This caused columnar to enforce some extra limits on the settings, which conflicted with our upgrade tests.

The changes in failure tests are due to the libpq changes.

There are also a lot of changes on isolation tests outputs, hence we
updated all of them.

Co-authored-by: Nils Dijk <nils@citusdata.com>
2021-08-25 16:04:57 +03:00
SaitTalhaNisanci c8326df8c0
Fix missing comma in connection options (#5206) 2021-08-25 13:40:42 +03:00
Jelte Fennema a31429aae5
Allow configuring tcp_user_timeout using citus.node_conninfo (#5203)
`tcp_user_timeout` is the awesome relatively unknown big brother of the
TCP keepalive related options. Instead of depending on keepalives being
sent, this determines that a socket is dead by waiting at most N seconds
for an ack of data that it has sent. It's exposed in libpq starting from
PG12.
2021-08-24 11:48:40 +03:00
Onur Tirtir 5af839ada0
Not print metapage.reserved_offset in regression tests (#5168)
* We were anyway not testing reserved_offset in any of those tests
   but other fields.

* This only happens with compressed columnar tables and is because the
   libzstd/liblz4 versions that we have on exttester ci image might be different
   than what we might have on our local environments.
2021-08-23 11:07:10 +03:00
Onur Tirtir 7dcd9380e7 Update index support section of columnar README 2021-08-23 10:35:11 +03:00
Onur Tirtir 3acd3ebae2 Remove temp table limitation from columnar README 2021-08-23 10:35:11 +03:00
Onur Tirtir 4e1201a333 Use RelationGetStatExtList instead of scanning pg_stats_ext 2021-08-18 17:50:58 +03:00
Onur Tirtir 4b03195c06 Use RelationGetStatExtList instead of GetExplicitStatisticsIdList 2021-08-18 17:50:57 +03:00
Onur Tirtir 91544d0191 Use PGIndexProcessor infra to find explicitly created indexes 2021-08-18 17:50:57 +03:00
Onur Tirtir 549ca4de6d Use RelationGetIndexList instead of scanning pg_index 2021-08-18 17:50:57 +03:00
Onur Tirtir fa9933daf3 Use get_am_name to find indexAM name 2021-08-18 00:44:37 +03:00
Nils Dijk dfc950ce1e
Fix a segfault caused by use after free in ConnectionPlacementHash (#5170)
DESCRIPTION: Fix a segfault caused by use after free in ConnectionPlacementHash

Fix a segfault caused by retaining data in any of the hashmaps making up the Placement Connection Management.

We have seen production systems segfault due to random data referenced from ConnectionPlacementHash.
On investigation we found that the backends segfaulting on this had OOM errors closely prior to the segfault.
It has shown there are at least 15 places where an allocation can OOM that would cause ConnectionPlacementHash to retain pointers to memory from contexts that are subsequently freed. This would reproduce the segfault we have observed in production.

Conditions for these allocations are:
 - allocated after first call to `AssociatePlacementWithShard`: https://github.com/citusdata/citus/blob/v10.0.3/src/backend/distributed/connection/placement_connection.c#L880-L881
 - allocated before `StartNodeUserDatabaseConnection`: https://github.com/citusdata/citus/blob/v10.0.3/src/backend/distributed/connection/connection_management.c#L291

At least 15 points of memory allocation (which could fail) are between the callsites of both in a primary key lookup on a reference table - where we have seen an OOM cause a segfault moments later.

Instead of leaving any references in ConnectionPlacementHash, ConnectionShardHash and ColocatedPlacementsHash that could retain pointers that are freed when the TopTransactionContext is reset, we clear all these hashes regardless of the state of CurrentCoordinatedTransactionState.

Downside is that on any transaction abort we will now iterate through 4 hashmaps and clear their contents. Given that they are either already empty, which should cause a quick iteration, or non-empty, causing segfaults in subsequent executions, this overhead seems reasonable.

A better solution would be to move the creation of these hashmaps so they would live in the TopTransactionContext themselves, assuming their contents would never outlive a transaction. This needs more investigation and is an involved refactor, hence the quick fix here.
2021-08-17 17:42:35 +02:00
jeff-davis 4f213f293e
Columnar: use generate_series for test rather than load. (#5181) 2021-08-16 16:12:06 -07:00
Onur Tirtir 68f46c5dc9 Use scan context for intermediate mem allocs too 2021-08-16 11:06:03 +03:00
Onur Tirtir b3d9fc91f8 Always use right mem cxt when creating ColumnarReadState
All the callers except columnar_relation_copy_for_cluster were already
switching to right memory context when creating ColumnarReadState.

With this commit, we embed that logic into init_columnar_read_state
to avoid further such bugs.

That way, we start using the right memory context for
columnar_relation_copy_for_cluster too.
2021-08-16 11:06:03 +03:00
Onur Tirtir 7fcecde203 Use init_columnar_read_state instead of lower level func
Functionally, this doesn't change anything. This is just a preparation
for the next commit.
2021-08-16 11:06:03 +03:00
Burak Velioglu 4355ba0a38
Add CREATE INDEX ... ON ONLY and ALTER INDEX ... ATTACH PARTITION (#4938 #4980)
- Add support for CREATE INDEX ... ON ONLY: Before this commit, we were not sending the "ONLY" option to the worker nodes at all. With this commit, the "ONLY" option is sent to the worker nodes when necessary. (#4938)

- Add support for ALTER INDEX ... ATTACH PARTITION: Attach child_index to parent_index by creating same inheritance on shard level in addition to table level. (#4980)
2021-08-13 13:12:45 +03:00
SaitTalhaNisanci 2ec4e37e45
Fix assert failure in FindReferencedTableColumn (#5175) 2021-08-12 18:21:45 +03:00
Ahmet Gedemenli 9e90894f21
Synchronize hasmetadata flag on mx workers (#5086)
* Synchronize hasmetadata flag on mx workers

* Switch to sequential execution

* Add test

* Use SetWorkerColumn

* Add test for stop_sync

* Remove usage of UpdateHasmetadataOnWorkersWithMetadata

* Remove MarkNodeMetadataSynced

* Fix test for metadatasynced

* Remove MarkNodeMetadataSynced

* Style

* Remove MarkNodeHasMetadata

* Remove UpdateDistNodeBoolAttr

* Refactor SetWorkerColumn

* Use SetWorkerColumnLocalOnly when setting up dependencies

* Use SetWorkerColumnLocalOnly in TriggerSyncMetadataToPrimaryNodes

* Style

* Make update command generator functions static

* Set metadatasynced before syncing

* Call SetWorkerColumn only if the sync is successful

* Try to sync all nodes

* Fix indexno

* Update metadatasynced locally first

* Break if a node fails to sync metadata

* Send worker commands optional

* Style & Rebase

* Add raiseOnError param to SetWorkerColumn

* Style

* Set metadatasynced for all metadata nodes

* Style

* Introduce SetWorkerColumnOptional

* Polish

* Style

* Dont send set command to not synced metadata nodes

* Style

* Polish

* Add test for stop_sync

* Add test for shouldhaveshards

* Add test for isactive flag

* Sort by placementid in the function verify_metadata

* Cover edge cases for failing nodes

* Add comments

* Add nodeport to isactive test

* Add warning if metadata out of sync

* Update warning message
2021-08-12 14:16:18 +03:00
Naisila Puka e5b32b2c3c
Acquire AccessShareLock before updating table statistics (#5155) 2021-08-12 13:58:15 +03:00
Onder Kalaci d4368ff2b3 Make sure that shouldhaveshards is synced to workers 2021-08-11 15:53:31 +02:00
Onder Kalaci 86bd28b92c Guard against hard WaitEventSet errors
In short, add wrappers around Postgres' AddWaitEventToSet() and
ModifyWaitEvent().

AddWaitEventToSet()/ModifyWaitEvent() may throw hard errors. For
example, this happens when the underlying socket for a connection is
closed by the remote server and this is already reflected by the OS,
but Citus hasn't had a chance to learn about it yet. In that case,
if replication factor is >1, Citus can failover to other nodes
for executing the query. Even if replication factor = 1, Citus
can give much nicer errors.

So CitusAddWaitEventSetToSet()/CitusModifyWaitEvent() simply put
AddWaitEventToSet()/ModifyWaitEvent() into a PG_TRY/PG_CATCH block
in order to catch any hard errors, and return this information to
the caller.
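
The wrapper idea can be illustrated outside a Postgres backend with plain `setjmp`/`longjmp`, which is used here only as a stand-in for PG_TRY/PG_CATCH. All names below (`RaiseHardError`, `AddEventOrThrow`, `SafeAddEvent`) are hypothetical and do not correspond to Postgres or Citus functions.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Innermost active "catch" target, or NULL when nothing would catch. */
static jmp_buf *currentHandler = NULL;

/* Stand-in for a hard error: unwind to the innermost handler. */
static void
RaiseHardError(void)
{
    if (currentHandler == NULL)
    {
        abort();
    }
    longjmp(*currentHandler, 1);
}

/* Hypothetical low-level call that throws when the socket is closed. */
static int
AddEventOrThrow(bool socketOpen)
{
    if (!socketOpen)
    {
        RaiseHardError();
    }
    return 0; /* position of the event in the set */
}

/*
 * Wrapper in the spirit of the commit: turn a hard error into a return
 * value so that the caller can decide to fail over or to report a
 * friendlier error.
 */
bool
SafeAddEvent(bool socketOpen, int *eventIndex)
{
    jmp_buf catchPoint;
    jmp_buf *previousHandler = currentHandler;

    currentHandler = &catchPoint;
    if (setjmp(catchPoint) == 0)
    {
        int index = AddEventOrThrow(socketOpen);

        currentHandler = previousHandler;
        *eventIndex = index;
        return true;
    }

    /* reached via longjmp: restore the outer handler and report failure */
    currentHandler = previousHandler;
    return false;
}
```

A caller would check the boolean result and, on failure, try another placement or raise a clearer error message.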
2021-08-10 09:35:03 +02:00
Onder Kalaci 5f02d18ef8 transactional metadata sync for maintenance daemon
As we use the current user to sync the metadata to the nodes
with #5105 (and many other PRs), nothing prevents us from
using the coordinated transaction for metadata syncing.

This commit also renames a few functions to reflect their actual
implementation.
2021-08-09 10:34:55 +02:00
Onder Kalaci 35964c6366 Dropped columns do not diverge distribution column for partitioned tables
Before this commit, creating a partition after a DROP column
on the parent (position before dist. key) caused the
partition to have the wrong distribution column.
2021-08-06 13:36:12 +02:00
jeff-davis deb7ec605b
Columnar: fix misleading comments and useless types. (#5162)
CustomScan and CustomPath structures cannot be extended with
additional fields. Fix comments and type structure that implied that
they can.
2021-08-05 09:22:21 -07:00
Ahmet Gedemenli 51d410bb7b Add check for alphabetically sorted gucs
Move to a separate script

Add the new script to readme
2021-08-05 16:37:49 +03:00
naisila 798a7902bf Fix master_update_table_statistics scripts for 9.5 2021-08-03 18:15:56 +03:00
naisila f9fa5a3d69 Fix master_update_table_statistics scripts for 9.4 2021-08-03 18:15:56 +03:00
Onder Kalaci 482b8096e9 Introduce citus_internal_update_relation_colocation
update_distributed_table_colocation can be called by the relation
owner, and internally it updates pg_dist_partition. With this
commit, update_distributed_table_colocation uses an internal
UDF to access pg_dist_partition.

As a result, this operation can now be done by regular users
on MX.
2021-08-03 11:44:58 +02:00
Onur Tirtir 93ebbb0607 Re-cost SeqPath's as well for columnar tables 2021-08-02 11:32:25 +03:00
Onur Tirtir 453ac40725 Comment why we still remove non IndexPath's when custom scan is off 2021-08-02 11:25:18 +03:00
Onur Tirtir a87405b6ba Not adjust IndexPath cost if indexscan is off 2021-08-02 11:25:18 +03:00
Onur Tirtir 51691a8994 Rename RecostColumnarIndexPaths to RecostColumnarPaths 2021-08-02 11:25:18 +03:00
Onur Tirtir 297f59a70e Re-cost columnar table index paths 2021-08-02 11:16:37 +03:00
Onur Tirtir 8adcf2096b Multiply ColumnarCustomScan cost by tblspace.seqpage cost 2021-08-02 11:16:37 +03:00
Onur Tirtir dba8421453 Refactor ColumnarScanCost into ColumnarPerChunkGroupScanCost 2021-08-02 11:16:37 +03:00
Onur Tirtir d8f92697f2
Free memory used for last stripe read when re-scanning a columnar table (#5143)
Instead of setting stripeReadState to NULL, call ColumnarResetRead
before re-scanning a columnar table since this function is already
designed for doing the necessary clean up when finishing a stripe
read.

Note that this change shouldn't have a great effect on memory usage
since AdvanceStripe was already doing the clean-up for all the
stripes except the last one.
2021-08-02 11:16:01 +03:00
Onur Tirtir 73058d35cc Not free (stripe) chunk buffers after de-serializing
Previously, we were only using chunk group reader for sequential scan.
However, to support index scans on columnar tables, now we use very
same low level functions for index scan too.

Since those low-level functions were only used for sequential scan, it
was guaranteed that we would never read the same chunk group more than
once, so we were freeing chunk buffers after deserializing them into a
separate buffer.

Now that we use those low level functions for index scan, we cannot
free chunk buffers since it's possible to read the same chunk group
again, such that:

- read chunk group 1 of stripe 5
- read chunk group 2 of stripe 5
- read chunk group 1 of stripe 5 again

Here, when we decide to read chunk group 1 for a second time,
chunk group 1 is not cached. Plus, before this commit, we were
freeing the chunk buffers for chunk group 1 after the first
read and then we were getting segfault or errors from low-level
de-compression APIs.
2021-08-02 11:00:12 +03:00
Onur Tirtir 327ae43b83 Get rid of EndStripeRead, since we anyway reset mem cxt 2021-08-02 11:00:12 +03:00
Onur Tirtir 83f5d42365 Use long-lasting mem cxt & optimize correlated index scan 2021-08-02 11:00:12 +03:00
Onur Tirtir c021b82a43 Introduce CreateColumnarScanMemoryContext 2021-08-02 11:00:12 +03:00
Onur Tirtir 84a49cc221 Improve error message for indexAMs not supported by columnar 2021-07-30 16:41:53 +03:00
Onur Tirtir 90e856d6bc Keep supported indexes when converting table to columnar 2021-07-30 16:41:01 +03:00
Onur Tirtir eeecbd2324 Introduce ColumnarSupportsIndexAM 2021-07-30 16:40:27 +03:00
Halil Ozan Akgul 286b0fe0e8 Corrects the endif comment 2021-07-29 17:22:31 +03:00
SaitTalhaNisanci 4559d02c41
Fix union pushdown issue (#5079)
* Fix UNION not being pushed down

Postgres optimizes column fields that are not needed in the output. We
were relying on these fields to understand if it is safe to push down a
union query.

This fix looks at the parse query, which has the original column fields
to detect if it is safe to push down a union query.

* Add more tests

* Simplify code and make it more robust

* Process varlevelsup > 0 in FindReferencedTableColumn

* Only look for outer vars in union path

* Add more comments

* Remove UNION ALL specific logic for pulling up childvars
2021-07-29 13:52:55 +03:00
Jelte Fennema 2aa67421a7
Fix showing target shard size in the rebalance progress monitor (#5136)
The progress monitor wouldn't actually update the size of the shard on
the target node when using "block_writes" as the `shard_transfer_mode`.
The reason for this is that the CREATE TABLE part of the shard creation
would only be committed once all data was moved as well. This caused
our size calculation to always return 0, since the table did not exist
yet in the session that the progress monitor used.

This is fixed by first committing creation of the table, and only then
starting the actual data copy.

The test output changes slightly. Apparently splitting this up into two
transactions instead of one increases the table size after the copy by
about 40kB. The additional size doesn't grow with the amount of data
in the table (it stays ~40kB per shard). So
this small change in test output is not considered an actual problem.
2021-07-23 16:37:00 +02:00
Jelte Fennema 7d0b6dc9be Include data_type and cache in sequence definition on workers
These two options were not included when creating the sequences on the
workers as part of metadata syncing.

The missing `data_type` part of the definition made finding the cause
of #5126 harder than necessary, because of confusing errors.
2021-07-22 11:49:06 +02:00
Onder Kalaci 903489c763 Improve wording of an error message 2021-07-19 14:38:52 +02:00
Onder Kalaci c8368e7929 Introduce citus_internal_delete_shard_metadata
With this function, the owner of the table is allowed to remove
shard metadata. This is going to be useful for tenant-isolation.
2021-07-19 13:25:05 +02:00
Önder Kalacı 87a51ae552
CLUSTER ON deparser should consider schemas (#5122) 2021-07-16 19:13:18 +03:00
Jelte Fennema adf17a8cf1
Add upgrade and downgrade tests for Citus 10.2 (#5120)
It seems we forgot to add this when starting 10.2 development.
2021-07-16 14:39:04 +02:00
Onder Kalaci 2c349e6dfd Use current user to sync metadata
Before this commit, we always synced the metadata with superuser.
However, that creates various edge cases such as visibility errors
or self distributed deadlocks, and complicates user access checks.

Instead, with this commit, we use the current user to sync the metadata.
Note that `start_metadata_sync_to_node` still requires superuser
because accessing certain metadata (like pg_dist_node) always requires
superuser (i.e., the current user should be a superuser).

However, metadata syncing operations regarding the distributed
tables can now be done with regular users, as long as the user
is the owner of the table. A table owner can still insert nonsense
metadata, but it would only affect their own table. So, we cannot do
anything about that.
2021-07-16 13:25:27 +02:00
Onur Tirtir f00c63c33d
Support columnar table index builds with CONCURRENTLY option (#5032)
With this commit, we add (`CREATE INDEX` / `REINDEX`) `CONCURRENTLY` support for columnar tables.

For that, we implement `columnar_index_validate_scan` callback.
The reasoning behind the implementation is as follows:

* Postgres function `validate_index` provides all the TIDs that are currently in the
  index to `columnar_index_validate_scan` callback via a `tupleSort` object.

* We start scanning the table by using `columnar_getnextslot` as usual.
  Before moving forward, note that `columnar_getnextslot` guarantees
  to return tuples in the order of their TIDs.

* For us to use during table scan, postgres provides a snapshot guaranteeing
  that any tuples that are valid according to that snapshot but are not in the
  index must be added to the index.

* Then for each tuple that we read from our table, we continue iterating
  given `tupleSort` to find the first TID that is greater than or equal to our
  tuple's TID.

  If both TID's are equal to each other, then we skip the tuple since it's already
  indexed.

  If the TID that we read from tupleSort is greater than our tuple's TID, then
  we decide to insert this tuple into the index.
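
The merge described above can be sketched over two sorted arrays. This is a simplified illustration, not the actual callback: TIDs are plain integers, `FindUnindexedTids` is a hypothetical name, and a tuple is also treated as missing once the sorted index TIDs are exhausted (an assumption the message does not spell out).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * tableTids:   TIDs in the order the table scan returns them (ascending).
 * indexedTids: TIDs already in the index, sorted ascending (the role
 *              played by the tupleSort object).
 * missingTids: output buffer with room for tableCount entries.
 *
 * Returns the number of TIDs that still have to be inserted into the index.
 */
size_t
FindUnindexedTids(const uint64_t *tableTids, size_t tableCount,
                  const uint64_t *indexedTids, size_t indexedCount,
                  uint64_t *missingTids)
{
    size_t indexPosition = 0;
    size_t missingCount = 0;

    for (size_t i = 0; i < tableCount; i++)
    {
        /* advance to the first indexed TID >= the current table TID */
        while (indexPosition < indexedCount &&
               indexedTids[indexPosition] < tableTids[i])
        {
            indexPosition++;
        }

        if (indexPosition < indexedCount &&
            indexedTids[indexPosition] == tableTids[i])
        {
            /* equal TIDs: the tuple is already indexed, skip it */
            continue;
        }

        /* the indexed TID is greater (or none is left): must be inserted */
        missingTids[missingCount++] = tableTids[i];
    }

    return missingCount;
}
```

Because both inputs are consumed in TID order, the index cursor never moves backwards and each input is read once.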
2021-07-09 13:44:58 +03:00
Onur Tirtir ea5fe022a4
Be more explicit when doing ordered scan on columnar cat. tables (#5026)
systable_getnext already uses ForwardScanDirection if relation has any
open indexes, but let's be more explicit doing ordered scan on columnar
catalog tables.
2021-07-09 13:24:27 +03:00
Hanefi Onaldi efc5776451
Remove public schema dependency for 10.1 upgrades
This commit contains a subset of the changes that should be cherry
picked to 10.1 releases.
2021-07-09 02:08:22 +03:00
Hanefi Onaldi 8e9cc229ff
Remove public schema dependency for 10.0 upgrades
This commit contains a subset of the changes that should be cherry
picked to 10.0 releases.
2021-07-09 02:08:22 +03:00
Ahmet Gedemenli ed3b98a80b
Add failure test for stop_metadata_sync_to_node (#5102) 2021-07-08 18:23:19 +03:00
Nils Dijk 18652ef9ff
fix 10.1-1 upgrade script to adhere to idempotency 2021-07-08 12:24:52 +02:00
Nils Dijk e5517dc7b3
fix 9.5-2 upgrade script to adhere to idempotency 2021-07-08 12:24:52 +02:00
Nils Dijk 366796a72e
Add test for idempotency of citus_prepare_pg_upgrade 2021-07-08 12:24:51 +02:00
Onur Tirtir 7bfd84bc70 Introduce StripeGetHighestRowNumber 2021-07-07 11:01:39 +03:00
Onur Tirtir 8942086506 Remove stripeList & currentStripe from ColumnarReadState 2021-07-07 11:01:39 +03:00
Onur Tirtir 16dee73b10 Refactor FindStripeByRowNumber into StripeMetadataLookupRowNumber
Push most of the logic in FindStripeByRowNumber down to a helper function
to re-use it in the next commit.
2021-07-07 11:01:38 +03:00
Marco Slot 214c674989
Fix PG upgrade scripts for 10.1 2021-07-05 14:38:26 +02:00
Marco Slot b14955c2bd
Fix PG upgrade scripts for 10.0 2021-07-05 14:38:20 +02:00
Marco Slot 3c0dfc12c0
Fix PG upgrade scripts for 9.5 2021-07-05 13:39:35 +02:00
Marco Slot bee202aa39
Fix PG upgrade scripts for 9.4 2021-07-05 13:39:28 +02:00
Onur Tirtir b118d4188e
Fix lower boundary calculation when pruning range dist table shards (#5082)
This happens only when we have a "<" or "<=" filter on distribution
column of a range distributed table and that filter falls in between
two shards.

When the filter falls in between two shards:

  If the filter is ">" or ">=", then UpperShardBoundary was
  returning "upperBoundIndex - 1", where upperBoundIndex is
  exclusive shard index used during binary search.
  This is expected since upperBoundIndex is an exclusive
  index.
 
  If the filter is "<" or "<=", then LowerShardBoundary was
  returning "lowerBoundIndex + 1", where lowerBoundIndex is
  inclusive shard index used during binary search.
  On the other hand, since lowerBoundIndex is an inclusive
  index, we should just return lowerBoundIndex instead of
  doing "+ 1". Before this commit, we were missing leftmost
  shard in such queries.

* Remove useless conditional branches

The branch that we delete from UpperShardBoundary was obviously useless.

The other one in LowerShardBoundary became useless after we remove "+ 1"
from there.

This indeed is another proof of what & how we are fixing with this pr.

* Improve comments and add more

* Add some tests for upper bound calculation too
2021-07-02 14:48:21 +03:00
Ahmet Gedemenli 8bae58fdb7
Add parameter to cleanup metadata (#5055)
* Add parameter to cleanup metadata

* Set clear metadata default to true

* Add test for clearing metadata

* Separate test file for start/stop metadata syncing

* Fix stop_sync bug for secondary nodes

* Use PreventInTransactionBlock

* Remove debugging logs

* Remove relation not found logs from mx test

* Revert localGroupId when doing stop_sync

* Move metadata sync test to mx schedule

* Add test with name that needs to be quoted

* Add test for views and matviews

* Add test for distributed table with custom type

* Add comments to test

* Add test with stats, indexes and constraints

* Fix matview test

* Add test for dropped column

* Add notice messages to stop_metadata_sync

* Add coordinator check to stop metadata sync

* Revert local_group_id only if clearMetadata is true

* Add a final check to see the metadata is sane

* Remove the drop verbosity in test

* Remove table description tests from sync test

* Add stop sync to coordinator test

* Change the order in stop_sync

* Add test for hybrid (columnar+heap) partitioned table

* Change error to notice for stop sync to coordinator

* Sync at the end of the test to prevent any failures

* Add test case in a transaction block

* Remove relation not found tests
2021-07-01 16:23:53 +03:00
Sait Talha Nisanci e7ed16c296 Not include to-be-deleted shards while finding shard placements
Ignore orphaned shards in more places

Only use active shard placements in RouterInsertTaskList

Use IncludingOrphanedPlacements in some more places

Fix comment

Add tests
2021-06-28 13:05:31 +03:00
Jelte Fennema 802225940e
Make clear that IsTableLocallyAccessible is only for citus local tables (#5075)
The name and comment of this function did not indicate that it only
really could detect locally accessible citus local tables. This fixes
that, while also cleaning up the function a bit.
2021-06-28 11:47:21 +02:00
Naisila Puka fe5907ad2d
Adds propagation of ALTER SEQUENCE and other improvements (#5061)
* Alter seq type when we first use the seq in a dist table

* Don't allow type changes when seq is used in dist table

* ALTER SEQUENCE propagation

* Tests for ALTER SEQUENCE propagation

* Relocate AlterSequenceType and ensure dependencies for sequence

* Support for citus local tables, and other fixes

* Final formatting
2021-06-24 21:23:25 +03:00
Jelte Fennema e9bfb8eddd
Fix check to always allow foreign keys to reference tables (#5073)
With the previous version of this check we would disallow distributed
tables that did not have a colocationid, to have a foreign key to a
reference table. This fixes that, since there's no reason to disallow
that.
2021-06-24 12:15:52 +02:00
Jelte Fennema f4a2d99ce9
Harden ReplicateShardToNode to unexpected placements (#5071)
Originally ReplicateShardToNode was meant for
`upgrade_to_reference_table`, which required handling of existing inactive
placements. These days `upgrade_to_reference_table` is deprecated and
cannot be used anymore. Now that we have SHARD_STATE_TO_DELETE too, this
left over code seemed error prone. So this removes support for
activating inactive reference table placements, since these should not
be possible. If it finds a non active reference table placement anyway
it now errors out.

This also removes a few outdated comments related to `upgrade_to_reference_table`.
2021-06-24 13:11:02 +03:00
Jelte Fennema d1d386a904
Only allow moves of shards of distributed tables (#5072)
Moving shards of reference tables was possible in at least one case:
```sql
select citus_disable_node('localhost', 9702);
create table r(x int);
select create_reference_table('r');
set citus.replicate_reference_tables_on_activate = off;
select citus_activate_node('localhost', 9702);
select citus_move_shard_placement(102008, 'localhost', 9701, 'localhost', 9702);
```

This would then remove the reference table shard on the source, causing
all kinds of issues. This fixes that by disallowing all shard moves
except for shards of distributed tables.

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2021-06-23 16:25:46 +02:00
Onder Kalaci 75847d10b5 Add regression tests for changing column type with fkey
closes https://github.com/citusdata/citus/issues/2337 as it doesn't
apply anymore.
2021-06-23 09:03:55 +03:00
Onder Kalaci 55ed93bf0d fix regression tests to avoid any conflicts in enterprise 2021-06-22 08:45:17 +03:00
Jelte Fennema ca00b63272
Avoid two race conditions in the rebalance progress monitor (#5050)
The first and main issue was that we were putting absolute pointers into
shared memory for the `steps` field of the `ProgressMonitorData`. This
pointer was being overwritten every time a process requested the monitor
steps, which is the only reason why this even worked in the first place.

To quote a part of a relevant stack overflow answer:

> First of all, putting absolute pointers in shared memory segments is
> terrible terrible idea - those pointers would only be valid in the
> process that filled in their values. Shared memory segments are not
> guaranteed to attach at the same virtual address in every process.
> On the contrary - they attach where the system deems it possible when
> `shmaddr == NULL` is specified on call to `shmat()`

Source: https://stackoverflow.com/a/10781921/2570866

In this case a race condition occurred when a second process overwrote
the pointer in between the first process's write and read of the steps
field.

This issue is fixed by not storing the pointer in shared memory anymore.
Instead we now calculate its position every time we need it.

The second race condition I have not been able to trigger, but I found
it while investigating this. This issue was that we published the handle
of the shared memory segment, before we initialized the data in the
steps. This means that during initialization of the data, a call to
`get_rebalance_progress()` could read partial data in an unsynchronized
manner.
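
As a rough illustration of the first fix, the sketch below stores only a header in a byte buffer and recomputes the position of each step from the start of the buffer whenever it is needed, instead of keeping an absolute pointer inside the segment. The header layout, the fixed-size step record, and all function names are hypothetical; a `bytearray` merely stands in for a shared memory segment here.

```python
import struct

# Hypothetical layouts, not the real ProgressMonitorData definition.
HEADER = struct.Struct("<QI")  # process id, number of steps
STEP = struct.Struct("<Q")     # one fixed-size step record


def create_monitor(step_count: int) -> bytearray:
    """Allocate a segment holding the header followed by the steps."""
    segment = bytearray(HEADER.size + step_count * STEP.size)
    HEADER.pack_into(segment, 0, 0, step_count)
    return segment


def step_offset(index: int) -> int:
    """Compute where a step lives relative to the segment start."""
    return HEADER.size + index * STEP.size


def write_step(segment: bytearray, index: int, value: int) -> None:
    STEP.pack_into(segment, step_offset(index), value)


def read_step(segment: bytearray, index: int) -> int:
    return STEP.unpack_from(segment, step_offset(index))[0]
```

Since nothing inside the segment refers to an address, a copy of the bytes attached at a different location reads back the same steps, and no process has to overwrite a shared pointer before reading.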
2021-06-21 14:03:42 +00:00
Onder Kalaci 76ae5dd0db Improve regression tests for prepared statements
With a recent commit (644b266dee), the behaviour of prepared statements
for local cached plans has slightly changed.

Now, Citus caches the plans when they are re-used. This triggers local
plan caching on the 7th execution, and the 8th execution is the
first time the plan is used from the cache.

So, the tests are improved to cover 8th execution.
2021-06-21 13:34:44 +03:00
Onder Kalaci 69ca943e58 Deparse/parse the local cached queries
With local query caching, we try to avoid deparse/parse stages as the
operation is too costly.

However, we can do deparse/parse operations once per cached query, right
before we put the plan into the cache. With that, we avoid edge
cases like (4239) or (5038).

In a sense, we are making local plan caching behave similarly to non-cached
local/remote queries, by forcing the query to be deparsed once.
2021-06-21 12:24:29 +03:00
Onur Tirtir 82e58c91f3
Use correct test schedule name in columnar vg test target (#5027) 2021-06-18 11:31:16 +03:00
Onur Tirtir 6215a3aa93 Merge remote-tracking branch 'origin/master' into columnar-index 2021-06-17 14:31:12 +03:00
Hanefi Onaldi c4f50185e0
Ignore pl/pgsql line numbers in regression outputs (#4411) 2021-06-17 14:11:17 +03:00
SaitTalhaNisanci 3edef11a9f
Fix a test in hyperscale schedule (#5042) 2021-06-17 13:40:05 +03:00
Onder Kalaci bc09288651 Get ready for Improve index backed constraint creation for online rebalancer
See:
https://github.com/citusdata/citus-enterprise/issues/616
2021-06-17 13:05:56 +03:00
Onur Tirtir 681f700321 Fix first_row_number test for stripe_row_limit enforcement 2021-06-17 10:51:43 +03:00
Onur Tirtir 18fe0311c0 Move rest of the schema changes to 10.2-1 2021-06-16 20:43:41 +03:00
Onur Tirtir 07117b0454 Move sql files for upgrade/downgrade_columnar_storage to 10.2-1 2021-06-16 20:40:26 +03:00
Onur Tirtir 3d11c0f9ef Merge remote-tracking branch 'origin/master' into columnar-index
Conflicts:
	src/test/regress/expected/columnar_empty.out
	src/test/regress/expected/multi_extension.out
2021-06-16 20:23:50 +03:00
Onur Tirtir b6b969971a Error out for CLUSTER commands on columnar tables 2021-06-16 20:06:33 +03:00
Onur Tirtir 5adab2a3ac Report progress when building index on columnar tables 2021-06-16 20:06:33 +03:00
Onur Tirtir 9b4dc2f804 Prevent using parallel scan for columnar index builds 2021-06-16 19:59:32 +03:00
Onur Tirtir 82ea1b5daf Not remove all paths, keep IndexPath's 2021-06-16 19:59:32 +03:00
Onur Tirtir 1af50e98b3 Fix a comment in ColumnarMetapageRead 2021-06-16 19:59:32 +03:00
Onur Tirtir 10a762aa88 Implement columnar index support functions 2021-06-16 19:59:32 +03:00
Halil Ozan Akgul db03afe91e Bump citus version to 10.2devel 2021-06-16 17:44:05 +03:00
Ahmet Gedemenli 5115100db0
Set table size to zero if no size is read (#5049)
* Set table size to zero if no size is read

* Add comment to relation size bug fix
2021-06-16 17:23:19 +03:00
SaitTalhaNisanci 1784c7ef85
Merge branch 'master' into split_multi 2021-06-16 15:26:09 +03:00
Sait Talha Nisanci c7d04e7f40 swap multi_schedule and multi_schedule_1 2021-06-16 14:40:14 +03:00
Sait Talha Nisanci c55e44a4af Drop table if exists 2021-06-16 14:19:59 +03:00
Sait Talha Nisanci fc89487e93 Split check multi 2021-06-16 14:19:59 +03:00
Naisila Puka e26b29d3bb
Fix nextval('seq_name'::text) bug, and schema for seq tests (#5046) 2021-06-16 13:58:49 +03:00
Marco Slot a7e4d6c94a Fix a bug that causes worker_create_or_alter_role to crash with NULL input 2021-06-15 20:07:08 +02:00
Jelte Fennema 4c3934272f
Improve performance of citus_shards (#5036)
We were effectively joining on a calculated column because of our calls
to `shard_name`. This caused a really bad plan to be generated. In my
specific case it was taking ~18 seconds to show the output of
citus_shards. It had this explain plan:

```
                                                                                                       QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Subquery Scan on citus_shards  (cost=18369.74..18437.34 rows=5408 width=124) (actual time=18277.461..18278.509 rows=5408 loops=1)
   ->  Sort  (cost=18369.74..18383.26 rows=5408 width=156) (actual time=18277.457..18277.726 rows=5408 loops=1)
         Sort Key: ((pg_dist_shard.logicalrelid)::text), pg_dist_shard.shardid
         Sort Method: quicksort  Memory: 1629kB
         CTE shard_sizes
           ->  Function Scan on citus_shard_sizes  (cost=0.00..10.00 rows=1000 width=40) (actual time=71.137..71.934 rows=5413 loops=1)
         ->  Hash Join  (cost=177.62..18024.42 rows=5408 width=156) (actual time=77.985..18257.237 rows=5408 loops=1)
               Hash Cond: ((pg_dist_shard.logicalrelid)::oid = (pg_dist_partition.logicalrelid)::oid)
               ->  Hash Join  (cost=169.81..371.98 rows=5408 width=48) (actual time=1.415..13.166 rows=5408 loops=1)
                     Hash Cond: (pg_dist_placement.groupid = pg_dist_node.groupid)
                     ->  Hash Join  (cost=168.68..296.49 rows=5408 width=16) (actual time=1.403..10.011 rows=5408 loops=1)
                           Hash Cond: (pg_dist_placement.shardid = pg_dist_shard.shardid)
                           ->  Seq Scan on pg_dist_placement  (cost=0.00..113.60 rows=5408 width=12) (actual time=0.004..3.684 rows=5408 loops=1)
                                 Filter: (shardstate = 1)
                           ->  Hash  (cost=101.08..101.08 rows=5408 width=12) (actual time=1.385..1.386 rows=5408 loops=1)
                                 Buckets: 8192  Batches: 1  Memory Usage: 318kB
                                 ->  Seq Scan on pg_dist_shard  (cost=0.00..101.08 rows=5408 width=12) (actual time=0.003..0.688 rows=5408 loops=1)
                     ->  Hash  (cost=1.06..1.06 rows=6 width=40) (actual time=0.007..0.007 rows=6 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 9kB
                           ->  Seq Scan on pg_dist_node  (cost=0.00..1.06 rows=6 width=40) (actual time=0.004..0.005 rows=6 loops=1)
               ->  Hash  (cost=5.69..5.69 rows=169 width=130) (actual time=0.070..0.071 rows=169 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 36kB
                     ->  Seq Scan on pg_dist_partition  (cost=0.00..5.69 rows=169 width=130) (actual time=0.009..0.041 rows=169 loops=1)
               SubPlan 2
                 ->  Limit  (cost=0.00..3.25 rows=1 width=8) (actual time=3.370..3.370 rows=1 loops=5408)
                       ->  CTE Scan on shard_sizes  (cost=0.00..32.50 rows=10 width=8) (actual time=3.369..3.369 rows=1 loops=5408)
                             Filter: ((shard_name(pg_dist_shard.logicalrelid, pg_dist_shard.shardid) = table_name) OR (('public.'::text || shard_name(pg_dist_shard.logicalrelid, pg_dist_shard.shardid)) = table_name))
                             Rows Removed by Filter: 2707
 Planning Time: 0.705 ms
 Execution Time: 18278.877 ms
```

With the changes it only takes 180ms to show the same output:
```
                                                                              QUERY PLAN
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Sort  (cost=904.59..918.11 rows=5408 width=156) (actual time=182.508..182.960 rows=5408 loops=1)
   Sort Key: ((pg_dist_shard.logicalrelid)::text), pg_dist_shard.shardid
   Sort Method: quicksort  Memory: 1629kB
   ->  Hash Join  (cost=418.03..569.27 rows=5408 width=156) (actual time=136.333..146.591 rows=5408 loops=1)
         Hash Cond: ((pg_dist_shard.logicalrelid)::oid = (pg_dist_partition.logicalrelid)::oid)
         ->  Hash Join  (cost=410.22..492.83 rows=5408 width=56) (actual time=136.231..140.132 rows=5408 loops=1)
               Hash Cond: (pg_dist_placement.groupid = pg_dist_node.groupid)
               ->  Hash Right Join  (cost=409.09..417.34 rows=5408 width=24) (actual time=136.218..138.890 rows=5408 loops=1)
                     Hash Cond: ((((regexp_matches(citus_shard_sizes.table_name, '_(\d+)$'::text))[1])::integer) = pg_dist_shard.shardid)
                     ->  HashAggregate  (cost=45.00..48.50 rows=200 width=12) (actual time=131.609..132.481 rows=5408 loops=1)
                           Group Key: ((regexp_matches(citus_shard_sizes.table_name, '_(\d+)$'::text))[1])::integer
                           Batches: 1  Memory Usage: 737kB
                           ->  Result  (cost=0.00..40.00 rows=1000 width=12) (actual time=107.786..129.831 rows=5408 loops=1)
                                 ->  ProjectSet  (cost=0.00..22.50 rows=1000 width=40) (actual time=107.780..128.492 rows=5408 loops=1)
                                       ->  Function Scan on citus_shard_sizes  (cost=0.00..10.00 rows=1000 width=40) (actual time=107.746..108.107 rows=5414 loops=1)
                     ->  Hash  (cost=296.49..296.49 rows=5408 width=16) (actual time=4.595..4.598 rows=5408 loops=1)
                           Buckets: 8192  Batches: 1  Memory Usage: 339kB
                           ->  Hash Join  (cost=168.68..296.49 rows=5408 width=16) (actual time=1.702..3.783 rows=5408 loops=1)
                                 Hash Cond: (pg_dist_placement.shardid = pg_dist_shard.shardid)
                                 ->  Seq Scan on pg_dist_placement  (cost=0.00..113.60 rows=5408 width=12) (actual time=0.004..0.837 rows=5408 loops=1)
                                       Filter: (shardstate = 1)
                                 ->  Hash  (cost=101.08..101.08 rows=5408 width=12) (actual time=1.683..1.685 rows=5408 loops=1)
                                       Buckets: 8192  Batches: 1  Memory Usage: 318kB
                                       ->  Seq Scan on pg_dist_shard  (cost=0.00..101.08 rows=5408 width=12) (actual time=0.004..0.824 rows=5408 loops=1)
               ->  Hash  (cost=1.06..1.06 rows=6 width=40) (actual time=0.007..0.008 rows=6 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 9kB
                     ->  Seq Scan on pg_dist_node  (cost=0.00..1.06 rows=6 width=40) (actual time=0.004..0.006 rows=6 loops=1)
         ->  Hash  (cost=5.69..5.69 rows=169 width=130) (actual time=0.079..0.079 rows=169 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 36kB
               ->  Seq Scan on pg_dist_partition  (cost=0.00..5.69 rows=169 width=130) (actual time=0.011..0.046 rows=169 loops=1)
 Planning Time: 0.789 ms
 Execution Time: 184.095 ms
 ```
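
The shape of the faster plan can be mimicked outside SQL: extract the shard id from each shard table name once with the `_(\d+)$` pattern that appears in the plan, build a hash table keyed on it, and probe that table per shard. The Python sketch below only illustrates that idea; `sizes_by_shard_id`, `attach_sizes`, and the sample names are hypothetical and are not the view definition.

```python
import re
from typing import Dict, Iterable, List, Optional, Tuple

# Same pattern as in the plan above: a trailing "_<digits>" is the shard id.
SHARD_ID_SUFFIX = re.compile(r"_(\d+)$")


def sizes_by_shard_id(shard_sizes: Iterable[Tuple[str, int]]) -> Dict[int, int]:
    """Build a lookup from shard id to size, parsing each name only once."""
    lookup: Dict[int, int] = {}
    for table_name, size in shard_sizes:
        match = SHARD_ID_SUFFIX.search(table_name)
        if match:
            lookup[int(match.group(1))] = size
    return lookup


def attach_sizes(shard_ids: Iterable[int],
                 shard_sizes: Iterable[Tuple[str, int]]
                 ) -> List[Tuple[int, Optional[int]]]:
    """Pair every shard id with its size, or None when no size was reported."""
    lookup = sizes_by_shard_id(shard_sizes)
    return [(shard_id, lookup.get(shard_id)) for shard_id in shard_ids]
```

Compared with rebuilding the shard name for every candidate pair, this does one pass over the sizes and one constant-time probe per shard, which matches the hash join in the second plan.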
2021-06-14 13:32:30 +02:00
Onur Tirtir a209999618
Enforce table opt constraints when using alter_columnar_table_set (#5029) 2021-06-08 17:39:16 +03:00
Hanefi Onaldi 5c6069a74a
Do not rely on fk cache when truncating local data (#5018) 2021-06-07 11:56:48 +03:00
Marco Slot e81d25a7be Refactor RelationIsAKnownShard to remove onlySearchPath argument 2021-06-02 14:30:27 +02:00
Ahmet Gedemenli 089ef35940 Disable dropping and truncating known shards
Add test for disabling dropping and truncating known shards
2021-06-02 14:30:27 +02:00
Jelte Fennema 1a83628195 Use "orphaned shards" naming in more places
We were not very consistent in how we named these shards.
2021-06-04 11:39:19 +02:00
Jelte Fennema 3f60e4f394 Add ExecuteCriticalCommandInDifferentTransaction function
We use this pattern multiple times throughout the codebase now. Seems
like a good moment to abstract it away.
2021-06-04 11:30:27 +02:00
Jelte Fennema 503c70b619 Cleanup orphaned shards before moving when necessary
A shard move would fail if there was an orphaned version of the shard on
the target node. With this change, before actually failing, we try to clean
up orphaned shards to see if that fixes the issue.
2021-06-04 11:23:07 +02:00
Jelte Fennema 280b9ae018 Cleanup orphaned shards at the start of a rebalance
In case the background daemon hasn't cleaned up shards yet, we do this
manually at the start of a rebalance.
2021-06-04 11:23:07 +02:00
Jelte Fennema 7015049ea5 Add citus_cleanup_orphaned_shards UDF
Sometimes the background daemon doesn't clean up orphaned shards quickly
enough. It's useful to have a UDF to trigger this removal when needed.
We already had a UDF like this but it was only used during testing. This
exposes that UDF to users. As a safety measure it cannot be run in a
transaction, because that would cause the background daemon to stop
cleaning up shards while this transaction is running.
2021-06-04 11:23:07 +02:00
Naisila Puka 0f37ab5f85
Fixes column default coming from a sequence (#4914)
* Add user-defined sequence support for MX

* Remove default part when propagating to workers

* Fix ALTER TABLE with sequences for mx tables

* Clean up and add tests

* Propagate DROP SEQUENCE

* Removing function parts

* Propagate ALTER SEQUENCE

* Change sequence type before propagation & cleanup

* Revert "Propagate ALTER SEQUENCE"

This reverts commit 2bef64c5a29f4e7224a7f43b43b88e0133c65159.

* Ensure sequence is not used in a different column with different type

* Insert select tests

* Propagate rename sequence stmt

* Fix issue with group ID cache invalidation

* Add ALTER TABLE ALTER COLUMN TYPE .. precaution

* Fix attnum inconsistency and add various tests

* Add ALTER SEQUENCE precaution

* Remove Citus hook

* More tests

Co-authored-by: Marco Slot <marco.slot@gmail.com>
2021-06-03 23:02:09 +03:00
Marco Slot c03729ad03 Only warn about reference tables when removing last node 2021-06-01 10:53:12 +02:00
Hanefi Onaldi 056005db4d
Improve tests for truncating local data (#5012)
We have a slightly different behavior when using truncate_local_data_after_distributing_table UDF on metadata synced clusters. This PR aims to add tests to cover such cases.

We allow distributing tables with data that have foreign keys to reference tables only on metadata synced clusters. This is the reason why some of my earlier tests failed when run on a single node Citus cluster.
2021-06-03 08:51:32 +03:00
Hanefi Onaldi fa29d6667a
Accept invalidation before fk graph validity check (#5017)
InvalidateForeignKeyGraph sends an invalidation via shared memory to all
backends, including the current one.

However, we might not call AcceptInvalidationMessages before reading
from the cache below. It would be better to also add a call to
AcceptInvalidationMessages in IsForeignConstraintRelationshipGraphValid.
2021-06-02 14:45:35 +03:00
Ahmet Gedemenli 103cf34418 Sort GUCs in alphabetical order 2021-06-02 12:52:18 +03:00
Jelte Fennema b1cad26ebc Move CheckCitusVersion to the top of each function
Previously this was usually done after argument parsing. This can cause
SEGFAULTs if the number or type of arguments changes in a new version.
By checking that Citus version is correct before doing any argument
parsing we protect against these types of issues. Issues like this have
occurred in pg_auto_failover, so it's not just a theoretical issue.

The main reason why these calls were not at the top of functions is
really just historical. It was because in the past we didn't allow
statements before declarations. Thus having this check before the
argument parsing would have only been possible if we first declared all
variables.

In addition to moving existing CheckCitusVersion calls it also adds
these calls to rebalancer related functions (they were missing there).
2021-06-01 17:43:46 +02:00
Ahmet Gedemenli 0fbddc740d Fix shard id difference for enterprise 2021-06-01 17:17:46 +03:00
Jelte Fennema 4c20bf7a36
Remove pg_dist_rebalence_strategy_enterprise_check (#5014)
This is not necessary anymore now that the rebalancer is open source.
2021-06-01 06:16:46 -07:00
Ahmet Gedemenli 69d39c0e8b Fix relname null bug when parallel execution 2021-06-01 14:14:35 +03:00
Ahmet Gedemenli 9638933d9d Remove function GenerateNewTargetEntriesForSortClauses 2021-06-01 12:35:36 +03:00
Jelte Fennema d3feee37ea
Add a simple python script to generate a new test (#3972)
The current default citus settings for tests are not really best
practice anymore. However, we keep them because lots of tests depend on
them.

I noticed that I created the same test harness for new tests I added all
the time. This is a simple script that generates that harness, given a
name for the test.

To run:

src/test/regress/bin/create_test.py my_awesome_test
2021-06-01 11:22:11 +02:00
Onur Tirtir 94f30a0428 Refactor index check in ColumnarProcessUtility 2021-06-01 11:12:28 +03:00
SaitTalhaNisanci c72d2b479b
Add tests for union pushdown workaround (#5005) 2021-05-31 20:02:20 +02:00
Jelte Fennema 3271f1bd13
Fix data race in get_rebalance_progress (#5008)
To be able to report progress of the rebalancer, the rebalancer updates
the state of a shard move in a shared memory segment. To then fetch the
progress, `get_rebalance_progress` can be called which reads this shared
memory.

Without this change it did so without using any synchronization
primitives, allowing for data races. This fixes that by using atomic
operations to update and read from the parts of the shared memory that
can be changed after initialization.
2021-05-31 15:27:32 +02:00
SaitTalhaNisanci 8c3f85692d
Not consider old placements when disabling or removing a node (#4960)
* Not consider old placements when disabling or removing a node

* update cluster test
2021-05-28 22:38:20 +02:00
SaitTalhaNisanci 40a229976f
Fix flaky test because of parallel metadata syncing (#5004) 2021-05-28 13:19:15 +03:00
SaitTalhaNisanci a20cc3b36a
Only consider shard state 1 in citus shards (#4970) 2021-05-28 11:33:48 +03:00
SaitTalhaNisanci a4944a2102
Rename CoordinatedTransactionShouldUse2PC (#4995) 2021-05-21 18:57:42 +03:00
Hanefi Onaldi 4941f00a95
Do not run ref2ref tests in parallel 2021-05-21 16:14:59 +03:00
Hanefi Onaldi c160325d07
Use streaming replication when repl factor = 1 2021-05-21 16:14:59 +03:00
Hanefi Onaldi 878513f325
Remove all occurrences of replication_model GUC 2021-05-21 16:14:59 +03:00
SaitTalhaNisanci 87e3a5e24a
Use 2PC when using a node connection (#4997) 2021-05-21 14:58:53 +03:00
SaitTalhaNisanci 82f34a8d88
Enable citus.defer_drop_after_shard_move by default (#4961)
Enable citus.defer_drop_after_shard_move by default
2021-05-21 10:48:32 +03:00
Nils Dijk d7dd247fb5
fix shared dependencies that are not resident in a database (#4992)
DESCRIPTION: fix shared dependencies that are not resident in a database

e.g. databases depend on users (their owners), and neither of them resides
in a database. These dependencies are recorded in pg_shdepend
with a `dbid` of `InvalidOid`. When we fetch our shared dependencies we don’t take
these links into account.

With this patch we use logic inspired by `classIdGetDbId` to decide when to use `MyDatabaseId` vs `InvalidOid` to correctly resolve dependencies between shared objects.
2021-05-20 08:55:02 -07:00
Jelte Fennema 10f06ad753 Fetch shard size on the fly for the rebalance monitor
Without this change the rebalancer progress monitor gets the shard sizes
from the `shardlength` column in `pg_dist_placement`. This column needs to
be updated manually by calling `citus_update_table_statistics`.
However, `citus_update_table_statistics` could lead to distributed
deadlocks while database traffic is on-going (see #4752).

To work around this we don't use `shardlength` column anymore. Instead
for every rebalance we now fetch all shard sizes on the fly.

Three additional things this does are:
1. It adds tests for the rebalance progress function.
2. If a shard move cannot be done because a source or target node is
   unreachable, then we error out and stop the rebalance, instead of showing
   a warning and continuing. When using the by_disk_size rebalance
   strategy it's not safe to continue with other moves if a specific
   move failed. It's possible that the failed move made space for the
   next move, and because the failed move never happened this space now
   does not exist.
3. Adds two new columns to the result of `get_rebalance_progress` which
   show the size of the shard on the source and target node.

Fixes #4930
2021-05-20 16:38:17 +02:00
Nils Dijk a6c2d2a4c4
Feature: alter database owner (#4986)
DESCRIPTION: Add support for ALTER DATABASE OWNER

This adds support for changing the database owner. It achieves this by marking the database as a distributed object. By marking the database as a distributed object it will look for its dependencies and order the user creation commands (enterprise only) before the alter of the database owner. This is mostly important when adding new nodes.

By having the database marked as a distributed object, Citus can easily determine which `ALTER DATABASE ... OWNER TO ...` commands to propagate: it resolves the object address of the database and verifies that it is a distributed object, and hence should propagate changes of ownership to all workers.

Given the ownership of the database might have implications on subsequent commands in transactions we force sequential mode for transactions that have an `ALTER DATABASE ... OWNER TO ...` command in them. This will fail the transaction with meaningful help when the transaction already executed parallel statements.

By default the feature is turned off since roles are not automatically propagated; having it turned on would cause hard-to-understand errors for the user. It can be turned on by the user by setting `citus.enable_alter_database_owner`.
2021-05-20 13:27:44 +02:00
Onder Kalaci d07db99ea4 Make sure that target node in shard moves is eligible for shard move 2021-05-20 10:51:01 +02:00
Onder Kalaci 926069a859 Wait until all connections are successfully established
Comment from the code:
/*
 * Iterate until all the tasks are finished. Once all the tasks
 * are finished, ensure that that all the connection initializations
 * are also finished. Otherwise, those connections are terminated
 * abruptly before they are established (or failed). Instead, we let
 * the ConnectionStateMachine() to properly handle them.
 *
 * Note that we could have the connections that are not established
 * as a side effect of slow-start algorithm. At the time the algorithm
 * decides to establish new connections, the execution might have tasks
 * to finish. But, the execution might finish before the new connections
 * are established.
 */

 Note that the abruptly terminated connections lead to the following errors:

2020-11-16 21:09:09.800 CET [16633] LOG:  could not accept SSL connection: Connection reset by peer
2020-11-16 21:09:09.872 CET [16657] LOG:  could not accept SSL connection: Undefined error: 0
2020-11-16 21:09:09.894 CET [16667] LOG:  could not accept SSL connection: Connection reset by peer

To easily reproduce the issue:

- Create a single node Citus
- Add the coordinator to the metadata
- Create a distributed table with shards on the coordinator
- f.sql:  select count(*) from test;
- pgbench -f /tmp/f.sql postgres -T 12 -c 40 -P 1  or pgbench -f /tmp/f.sql postgres -T 12 -c 40 -P 1 -C
2021-05-19 15:59:13 +02:00
Onder Kalaci 995adf1a19 Executor takes connection establishment and task execution costs into account
With this commit, the executor becomes smarter about refraining from opening
new connections. A very basic example: if connection
establishment takes 1000 ms and task execution takes 5 ms, the executor
becomes smart enough not to establish new connections.
2021-05-19 15:48:07 +02:00
Onder Kalaci 28b0b4ebd1 Move slow start increment to generic place 2021-05-19 14:31:20 +02:00
Marco Slot 715dce1eea Reduce local insert memory usage during deparsing 2021-05-18 16:11:43 +02:00
Marco Slot 644b266dee Only cache local plans when reusing a distributed plan 2021-05-18 16:11:43 +02:00
Marco Slot 00792831ad Add execution memory contexts and free after local query execution 2021-05-18 16:11:43 +02:00
Jelte Fennema 924959fdb1
Include result type in upgrade diff test (#4987)
We often change result types of functions slightly. Our downgrade tests
wouldn't notice these changes. This change adds them to the description
of these items.

An example of an SQL change that isn't caught without this change and is
caught with the get_rebalance_progress change in this PR:
https://github.com/citusdata/citus/pull/4963
2021-05-18 16:25:39 +02:00
SaitTalhaNisanci ff2a125a5b
Lookup hostname before execution (#4976)
We look up the hostname just before execution so that, even if there are cached entries in the prepared statement cache, we use the updated entries.
2021-05-18 16:46:31 +03:00
SaitTalhaNisanci eaa7d2bada
Not block maintenance daemon (#4972)
It was possible to block the maintenance daemon by taking a SHARE ROW
EXCLUSIVE lock on pg_dist_placement. Until the lock was released, the
maintenance daemon would be blocked.

We should not block the maintenance daemon in any case, hence we now
try to get the pg_dist_placement lock without waiting; if we cannot get
it, we don't try to drop the old placements.
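
The "try the lock without waiting, otherwise skip" pattern can be sketched as follows. This is an analogy using an in-process lock, not the Citus implementation; `placement_lock` and `try_drop_old_placements` are hypothetical names:

```python
import threading

# Stand-in for the pg_dist_placement lock (assumption for illustration).
placement_lock = threading.Lock()


def try_drop_old_placements(drop_fn) -> bool:
    """Run drop_fn only if the lock is free; never wait for it."""
    if not placement_lock.acquire(blocking=False):
        # Someone else holds the lock: skip this round instead of blocking.
        return False
    try:
        drop_fn()
        return True
    finally:
        placement_lock.release()
```

The caller can simply retry on its next scheduled run, which is why skipping is acceptable for a periodic background task.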
2021-05-17 03:22:35 -07:00
Nils Dijk c91f8d8a15
Feature: localhost guc (#4836)
DESCRIPTION: introduce `citus.local_hostname` GUC for connections to the current node

Citus once in a while needs to connect to itself for some system operations. This used to be hardcoded to `localhost`. The hardcoded hostname causes some issues, for example in environments where `sslmode=verify-full` is required. It is not always desirable or even feasible to get `localhost` as an alt name on the certificate.

By introducing a GUC to use when connecting to the current instance the user has more control what network path is used and what hostname is required to be present in the server certificate.
2021-05-12 16:59:44 +02:00
Hanefi Onaldi 13808b60cf
Update gitignore files 2021-05-12 09:49:07 +03:00
Jelte Fennema cbbd10b974
Implement an improvement threshold in the rebalancer (#4927)
Every move in the rebalancer algorithm results in an improvement in the
balance. However, even if the improvement in the balance was very small
the move was still chosen. This is especially problematic if the shard
itself is very big and the move will take a long time.

This changes the rebalancer algorithm to take the relative size of the
balance improvement into account when choosing moves. By default a move
will not be chosen if it improves the balance by less than half of the
size of the shard. An extra argument is added to the rebalancer
functions so that the user can decide to lower the default threshold if
the ignored move is wanted anyway.
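
As a rough sketch of the rule described above (assuming the improvement and the shard size are expressed in the same unit, and with hypothetical names throughout), the check might look like:

```python
# "half of the size of the shard" per the message; the constant name is made up.
DEFAULT_IMPROVEMENT_THRESHOLD = 0.5


def should_choose_move(balance_improvement: float,
                       shard_size: float,
                       improvement_threshold: float = DEFAULT_IMPROVEMENT_THRESHOLD) -> bool:
    """Accept a move only if its improvement is large relative to the shard."""
    if balance_improvement <= 0:
        # A move that does not improve the balance is never chosen.
        return False
    return balance_improvement >= improvement_threshold * shard_size
```

Lowering `improvement_threshold` corresponds to the extra argument mentioned in the message, letting a user accept moves that would otherwise be ignored.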
2021-05-11 14:24:59 +02:00
Onder Kalaci cc4870a635 Remove wrong PG_USED_FOR_ASSERTS_ONLY 2021-05-11 12:58:37 +02:00
Onder Kalaci a231ff29b0 Get prepared for some improvements for online rebalancer
To see all the changes, see https://github.com/citusdata/citus-enterprise/pull/586/files
2021-05-10 19:54:31 +02:00
Onur Tirtir 4f3c672ebe Re-consider VALID_ITEMPOINTER_OFFSETS wrt bitmap scan logic 2021-05-10 20:16:50 +03:00
Onur Tirtir 0f4c97e0d0 Improve the constants around row number mapping 2021-05-10 20:16:50 +03:00
Onur Tirtir 181848cc80 Implement ErrorIfInvalidRowNumber
To use the same logic when mapping tids to row numbers
2021-05-10 20:16:50 +03:00
Onur Tirtir 7ae90b7f96 Rename ColumnarStripeIndexRelationId to ColumnarStripePKeyIndexRelationId
Since now we have another index on columnar.stripe
2021-05-10 20:16:50 +03:00
Onur Tirtir f846c16514 Implement BuildStripeMetadata 2021-05-10 20:16:50 +03:00
Onur Tirtir 2552aee404 Handle old versioned columnar metapage after binary upgrade (#4956)
* Make VACUUM hint for upgrade scenario actually work

* Suggest using VACUUM if metapage doesn't exist

Plus, suggest upgrading sql version as another option.

* Always force read metapage block

* Fix two typos
2021-05-10 20:16:50 +03:00
Onur Tirtir 2e419ea177 Add first_row_number column to columnar.stripe for tid mapping 2021-05-10 20:16:50 +03:00
Onur Tirtir 9c1ac3127f Implement ColumnarOverwriteMetapage 2021-05-10 20:16:50 +03:00
jeff-davis 7b9aecff21 Columnar: metapage changes. (#4907)
* Columnar: introduce columnar storage API.

This new API is responsible for the low-level storage details of
columnar; translating large reads and writes into individual block
reads and writes that respect the page headers and emit WAL. It's also
responsible for the columnar metapage, resource reservations (stripe
IDs, row numbers, and data), and truncation.

This new API is not used yet, but will be used in subsequent
commits.

* Columnar: add columnar_storage_info() for debugging purposes.

* Columnar: expose ColumnarMetadataNewStorageId().

* Columnar: always initialize metapage at creation time.

This avoids the complexity of dealing with tables where the metapage
has not yet been initialized.

* Columnar: columnar storage upgrade/downgrade UDFs.

Necessary upgrade/downgrade step so that new code doesn't see an old
metapage.

* Columnar: improve metadata.c comment.

* Columnar: make ColumnarMetapage internal to the storage API.

Callers should not have or need direct access to the metapage.

* Columnar: perform resource reservation using storage API.

* Columnar: implement truncate using storage API.

* Columnar: implement read/write paths with storage API.

* Columnar: add storage tests.

* Revert "Columnar: don't include stripe reservation locks in lock graph."

This reverts commit c3dcd6b9f8.

No longer needed because the columnar storage API takes care of
concurrency for resource reservation.

* Columnar: remove unnecessary lock when reserving.

No longer necessary because the columnar storage API takes care of
concurrent resource reservation.

* Add simple upgrade tests for storage/ branch

* fix multi_extension.out

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2021-05-10 20:16:46 +03:00
Onur Tirtir 7def297a3b
Move the logic that builds relation col list into a function (#4964) 2021-05-10 20:01:28 +03:00
Onur Tirtir 59fea712e2
Implement a helper to create memory cxt for stripe read (#4965) 2021-05-10 19:55:47 +03:00
SaitTalhaNisanci 5a941814fd
Close connection after each shard move (#4967) 2021-05-10 16:57:19 +03:00
Ahmet Gedemenli 8cb505d6e1
Fix matview access method change issue (#4959)
* Fix matview access method change issue

* Use pg function get_am_name

* Split view generation command into pieces
2021-05-07 15:47:24 +03:00
SaitTalhaNisanci 6b1904d37a
When moving a shard to a new node ensure there is enough space (#4929)
* When moving a shard to a new node ensure there is enough space

* Add WaitForMiliseconds time utility

* Add more tests and increase readability

* Remove the retry loop and use a single udf for disk stats

* Address review

* address review

Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2021-05-06 17:28:02 +03:00
Ahmet Gedemenli bc818e76e2 Add notice log message for skipping child tables for optimization 2021-05-06 16:49:37 +03:00
Ahmet Gedemenli 2e0bb5c0c8 Fix nested select query with union bug 2021-05-05 20:35:00 +03:00
Jelte Fennema 0e6c080e81
Run copy_modified in upgrade tests (#4952)
This allows running the following command to update the expected files
with normalized output files for upgrade tests too:

```bash
cp src/test/regress/{results,expected}/upgrade_rebalance_strategy_before.out
```
2021-05-05 12:28:05 +02:00
Jelte Fennema 50357db957
Simplify code that tests the shard rebalancer algorithm (#4925)
This modifies the test code to use sane defaults instead of requiring
all values to be specified in the test.
2021-05-03 15:47:19 +02:00
Hanefi Onaldi 23a505d41f
Bump PG versions in CI (#4941)
Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
Co-authored-by: Sait Talha Nisanci <s.talhanisanci@gmail.com>
2021-05-03 13:51:20 +03:00
Jelte Fennema 2f29d4e53e Continue to remove shards after first failure in DropMarkedShards
The comment of DropMarkedShards described the behaviour that after a
failure we would continue trying to drop other shards. However the code
did not do this and would stop after the first failure. Instead of
simply fixing the comment I fixed the code, because the described
behaviour is more useful. Now a single shard that cannot be removed yet
does not block others from being removed.
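
The behaviour change can be illustrated with a small sketch. It is not the C implementation; `drop_marked_shards` and `drop_shard` here are hypothetical Python stand-ins:

```python
def drop_marked_shards(shards, drop_shard):
    """Try to drop every shard; a failure does not stop the loop."""
    dropped, failed = [], []
    for shard in shards:
        try:
            drop_shard(shard)
        except Exception:
            # Remember the failure and keep going with the remaining shards.
            failed.append(shard)
            continue
        dropped.append(shard)
    return dropped, failed
```

Shards that end up in `failed` can be retried on a later call, so one shard that cannot be removed yet no longer holds up the rest.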
2021-04-30 15:42:09 +03:00
Sait Talha Nisanci 8cabd2e822 Decrease memory usage with rebalancer
We decrease memory usage by:
- Freeing temporary buffers
- Using a separate memory context for blocks that use a "small" amount of
memory but can be repeated many times, such as loops
2021-04-29 13:40:47 +03:00
Hanefi Onaldi 2f90ce931b
Fix minor issues with makefile targets (#4717) 2021-04-28 15:46:55 +03:00
Marco Slot 4b49cb112f Fix FROM ONLY queries on partitioned tables 2021-04-27 16:10:07 +02:00
Ahmet Gedemenli fe65be993e Sort GUCs in alphabetic order 2021-04-26 15:05:42 +03:00
Onur Tirtir 889ad6fa8c Run some upgrade tests only when old version=9.0 2021-04-26 14:53:53 +03:00
Ahmet Gedemenli 332c5ce4ad
Fix worker partitioned size functions (#4922) 2021-04-26 10:29:46 +03:00
Jelte Fennema 763fa1cf41 Fix diff-filter to search the whole line for matches
Recently two new normalization line deletion rules have been added that
don't match the start of a line:
```
/local tables that are added to metadata but not chained with reference tables via foreign keys might be automatically converted back to postgres tables$/d
/Consider setting citus.enable_local_reference_table_foreign_keys to 'off' to disable this behavior$/d
```

Because `diff-filter` used `regex.match` these lines were not removed
when creating a new diff. This could cause some confusing diffs, where
the wrong lines were shown as changed. This fixes that by using
`regex.search` instead of `regex.match`.
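
The difference can be reproduced with Python's standard `re` module. The pattern below is one of the rules quoted above; the sample line and its `HINT:  ` prefix are assumptions for illustration:

```python
import re

# Anchored at the end of the line ($) but not at the start.
PATTERN = re.compile(
    r"Consider setting citus.enable_local_reference_table_foreign_keys "
    r"to 'off' to disable this behavior$"
)

# Hypothetical output line with a prefix before the matched text.
LINE = (
    "HINT:  Consider setting citus.enable_local_reference_table_foreign_keys "
    "to 'off' to disable this behavior"
)


def matches_with_match(line: str) -> bool:
    # match() only succeeds if the pattern matches at position 0.
    return PATTERN.match(line) is not None


def matches_with_search(line: str) -> bool:
    # search() scans the whole line for a position where the pattern matches.
    return PATTERN.search(line) is not None
```

Here `matches_with_match(LINE)` is false because of the prefix, while `matches_with_search(LINE)` is true, which is the behaviour the deletion rules need.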
2021-04-23 12:43:49 +02:00
Onder Kalaci 918838e488 Allow constant VALUES clauses in pushdown queries
As long as the VALUES clause contains constant values, we should not
recursively plan the queries/CTEs.

This is a follow-up work of #1805. So, we can easily apply OUTER join
checks as if VALUES clause is a reference table/immutable function.
2021-04-21 14:28:08 +02:00
SaitTalhaNisanci 93c2dcf3d2
Fix data-race with concurrent calls of DropMarkedShards (#4909)
* Fix problems with concurrent calls of DropMarkedShards

When trying to enable `citus.defer_drop_after_shard_move` by default it
turned out that DropMarkedShards was not safe to call concurrently.
This could especially cause big problems when also moving shards at the
same time. During tests it was possible to trigger a state where a shard
that was moved would not be available on any of the nodes anymore after
the move.

Currently DropMarkedShards is only called in production by the
maintenance daemon. Since this is only a single process, triggering such
a race is currently impossible in production settings. In future changes
we will want to call DropMarkedShards from other places too though.

* Add some isolation tests

Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2021-04-21 10:59:48 +03:00
Ahmet Gedemenli 33c620f232
Optimize partitioned disk size calculation (#4905)
* Optimize partitioned disk size calculation

* Polish

* Fix test for citus_shard_cost_by_disk_size

Try optimizing if not CSTORE
2021-04-19 13:30:56 +03:00
Onur Tirtir 96278822d9
Move columnar test helpers to a separate file (#4908)
* Move columnar test helpers to another file

* Rename column_store_memory_stats to columnar_store_memory_stats
2021-04-16 18:56:21 +03:00
Onder Kalaci 5482d5822f Keep more statistics about connection establishment times
When DEBUG4 is enabled, Citus now prints per-connection establishment
times.
2021-04-16 14:56:31 +02:00
Onder Kalaci 5b78f6cd63 Keep more execution statistics
When DEBUG4 is enabled, Citus now prints per-task execution times.
2021-04-16 14:45:00 +02:00
jeff-davis 9ed56928d3
Columnar: fix use-after-free. (#4906)
Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-04-15 01:00:00 -07:00
Hanefi Onaldi 9919fbe3f8 Switch to sequential mode on long partition names
This commit adds support for long partition names for distributed tables:
- ALTER TABLE dist_table ATTACH PARTITION ..
- CREATE TABLE .. PARTITION OF dist_table ..

Note: create_distributed_table UDF does not support long table and
partition names, and is not covered in this commit
2021-04-14 15:27:50 +03:00
Ahmet Gedemenli e445e3d39c
Introduce 3 partitioned size udfs (#4899)
* Introduce 3 partitioned size udfs

* Add tests for new partition size udfs

* Fix type incompatibilities

* Convert UDFs into pure sql functions

* Fix function comment
2021-04-13 17:36:27 +03:00
Onur Tirtir fe5c985e1d
Remove HAS_TABLEAM config since we dropped pg11 support (#4862)
* Remove HAS_TABLEAM config

* Drop columnar_ensure_objects_exist

* Not call columnar_ensure_objects_exist in citus_finish_pg_upgrade
2021-04-13 10:51:26 +03:00
Onur Tirtir 716cc629f1
Refactor ColumnarReadNextRow for better readability (#4823) 2021-04-13 10:44:00 +03:00
jeff-davis 3efdfdd791
Columnar: make projectedColumnList an integer list. (#4869)
Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-04-12 19:07:21 -07:00
Ahmet Gedemenli d74d358a45
Refactor size queries with new enum SizeQueryType (#4898)
* Refactor size queries with new enum SizeQueryType

* Polish
2021-04-12 17:14:29 +03:00
SaitTalhaNisanci b453563e88
Warm up connections params hash (#4872)
ConnParams(AuthInfo and PoolInfo) gets a snapshot, which will block the
remote connections to localhost. And the release of snapshot will be
blocked by the snapshot. This leads to a deadlock.

We warm up the conn params hash before starting a new transaction so
that the entries will already be there when we start a new transaction.
Hence GetConnParams will not get a snapshot.
2021-04-12 13:08:38 +03:00
Ahmet Gedemenli caef0463b0 Update func comment for PostprocessCreateTableStmt 2021-04-09 13:41:59 +03:00
Ahmet Gedemenli 52e467a9a0
Error out if inheriting a distributed table (#4871)
* Error out if inheriting a distributed table

* Add test inheriting a distributed table
2021-04-07 11:21:06 +03:00
Ahmet Gedemenli e4c4a9b683
Fix error message for local table joins (#4870)
* Fix error message for local table joins

* Fix error messages for regression tests expected outputs
2021-04-06 16:18:28 +03:00
Ahmet Gedemenli 48a6a5b128 Add test for public shard not found issue 2021-04-06 10:29:17 +03:00
Ahmet Gedemenli d530d79d73 Fix tests for public schema 2021-04-06 10:29:17 +03:00
Ahmet Gedemenli 840c879572 Remove redundant if statement for schema name 2021-04-06 10:29:17 +03:00
jeff-davis 063e673038
Columnar: use clause Vars for chunk group filtering. (#4856)
* Columnar: use clause Vars for chunk group filtering.

This solves #4780 and also provides a cleaner separation between chunk
group filtering and projection pushdown.

* Columnar: sort and deduplicate Vars pulled from clauses.

* Columnar: cleanup variable names.

* Columnar: remove alternate test output.

* Columnar: do not recurse when looking for whereClauseVars.

Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-04-01 12:27:28 -07:00
Halil Ozan Akgul a5038046f9 Adds shard_count parameter to create_distributed_table 2021-03-29 16:22:49 +03:00
Hanefi Önaldı 797538750f Delete all upgrade test artifacts before citus-upgrade-local 2021-03-27 00:46:06 +03:00
SaitTalhaNisanci 03832f353c Drop postgres 11 support 2021-03-25 09:20:28 +03:00
Onur Tirtir 7081690480
Add check-columnar-vg regression test target (#4737) 2021-03-25 11:55:58 +03:00
SaitTalhaNisanci 3a3171cd04 Ignore temporary output files 2021-03-25 09:59:21 +03:00
jeff-davis 248c6cb91a
Columnar: do not bother building unnecessary RestrictInfo. (#4852)
Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-03-24 16:05:08 -07:00
Onur Tirtir c01507a91b
Remove columnar/.gitignore (#4825) 2021-03-24 13:04:14 +03:00
Nils Dijk 1c1999ed7b
incorporate the fixopen fix for osx users on bigsur (#4837)
comparable to https://github.com/citusdata/tools/pull/88

This patch adds checks to the Perl script running the Citus testing harness so that it starts the postgres instances via the fixopen binary, when present, to work around `Interrupted system call` errors on OSX Big Sur.
2021-03-22 16:22:08 +01:00
Nils Dijk 787ee97867
Tests: foreign key non colocated tests (#4841)
Earlier versions of Citus (pre 9.0) had a bug where a user could get into a situation where a foreign key between two non-colocated tables was allowed. This was caused by wrong scoping of a boolean variable that was only ever set to `true` inside a loop, causing the `true` from an earlier iteration to leak into a new iteration.

This was 'by accident' solved in a refactor that was executed in the preparation of the 9.0 release. Only recently we had a user running into this and it was tracked down to this behaviour.

Given the dire situation users could get themselves into when running into this bug, we have backported a fix to the latest 8.3 release branch.

To make sure this regression does not happen anymore in the future I propose we add the tests from the backport to our mainline.

For reference: https://github.com/citusdata/citus/pull/4840
2021-03-22 15:33:56 +01:00
dependabot[bot] a1aedc41f1
Bump jinja2 from 2.11.2 to 2.11.3 in /src/test/regress
Bumps [jinja2](https://github.com/pallets/jinja) from 2.11.2 to 2.11.3.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/master/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/2.11.2...2.11.3)

Signed-off-by: dependabot[bot] <support@github.com>
2021-03-20 05:51:26 +00:00
Önder Kalacı b5f4320164
Make sure that single task local executions start coordinated transaction (#4831)
With https://github.com/citusdata/citus/pull/4806 we enabled
2PC for any non-read-only local task. However, if the execution
is a single task, enabling 2PC (CoordinatedTransactionShouldUse2PC)
hits an assertion as we are not in a coordinated transaction.

There is no downside of using a coordinated transaction for single
task local queries.
2021-03-17 12:20:57 +01:00
Ahmet Gedemenli 5e5db9eefa Add udf citus_get_active_worker_nodes 2021-03-17 13:15:59 +03:00
Marco Slot fbc2147e11 Replace MAX_PUT_COPY_DATA_BUFFER_SIZE by citus.remote_copy_flush_threshold GUC 2021-03-16 06:00:38 +01:00
Marco Slot 1646fca445 Add GUC to set maximum connection lifetime 2021-03-16 01:57:57 +01:00
jeff-davis 3b12556401
Columnar: cleanup (#4814)
* Columnar: fix misnamed file.

* Columnar: make compression not dependent on columnar.h.

* Columnar: rename columnar_metadata_tables.c to columnar_metadata.c.

* Columnar: make customscan not depend on columnar.h.

Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-03-15 11:34:39 -07:00
Onur Tirtir b2a7bafcc4
Fix flaky test in multi_foreign_key_relation_graph (#4819) 2021-03-15 17:55:04 +03:00
Marco Slot 6c5d263b7a Remove unnecessary AtEOXact_Files call 2021-03-15 09:34:02 +01:00
Onur Tirtir 1d3e075e62
Support temporary columnar tables (#4766) 2021-03-12 12:01:36 +03:00
Onder Kalaci e65e72130d Rename use -> shouldUse
Because setting the flag doesn't necessarily mean that we'll
use 2PC. If connections are read-only, we will not use 2PC.
In other words, we'll use 2PC only for connections that modified
any placements.
2021-03-12 08:29:43 +00:00
Onder Kalaci 6a7ed7b309 Do not trigger 2PC for reads on local execution
Before this commit, Citus used 2PC no matter what kind of
local query execution happens.

For example, if the coordinator has shards (and the workers as well),
even a simple SELECT query could start 2PC:
```SQL

WITH cte_1 AS (SELECT * FROM test LIMIT 10) SELECT count(*) FROM cte_1;
```

In this query, the local execution of the shards (and also intermediate
result reads) triggers the 2PC.

To prevent that, Citus now distinguishes local reads and local writes.
And, Citus switches to 2PC only if a modification happens. This may
still lead to unnecessary 2PCs when there is a local modification
and remote SELECTs only. Though, we handle that separately
via #4587.
2021-03-12 08:29:43 +00:00
Onur Tirtir 874d5fd962
Remove foreign keys between columnar metadata tables (#4791)
Postgres keeps AFTER trigger state for each transaction, because we can have deferred AFTER triggers which will be fired at the end of a transaction. Postgres cleans up this state at the end of transaction.

Postgres processes ON COMMIT triggers after cleaning-up the AFTER trigger states. So if we fire any triggers in ON COMMIT, the AFTER trigger state won't be cleaned-up properly and the transaction state will be left in an inconsistent state, which might result in assertion failure.

So with this commit, we remove foreign keys between columnar metadata tables and enforce constraints between them manually when dropping columnar tables.
2021-03-12 11:28:17 +03:00
Naisila Puka 71a9f45513
Fix upgrade and downgrade paths for master/citus_update_table_statistics (#4805) 2021-03-11 14:52:40 +03:00
Naisila Puka 196064836c
Skip 2PC for readonly connections in a transaction (#4587)
* Skip 2PC for readonly connections in a transaction

* Use ConnectionModifiedPlacement() function

* Remove the second check of ConnectionModifiedPlacement()

* Add order by to prevent flaky output

* Test using pg_dist_transaction
2021-03-10 20:01:37 +03:00
Marco Slot 9c0d7f5c26 Add tests for modifying CTE and SELECT without FROM 2021-03-09 10:39:33 +01:00
Marco Slot 58f85f55c0 Fixes a crash in queries with a modifying CTE and a SELECT without FROM 2021-03-09 10:39:33 +01:00
SaitTalhaNisanci aef7fc3a51
Ignore columnar generated test files (#4796) 2021-03-09 10:52:08 +03:00
Philip Dubé 4e22f02997 Fix various typos due to zealous repetition 2021-03-04 19:28:15 +00:00
Onur Tirtir 1bb7a0a268
Fix chunk_group_consistency regression test view (#4765) 2021-03-04 12:20:25 +03:00
Onur Tirtir 9728ce1167
Add tests for concurrent index deadlock issue (#4775) 2021-03-04 11:56:54 +03:00
Marco Slot f25de6a0e3 Try to return earlier in idempotent master_add_node 2021-03-02 21:22:47 +01:00
Hadi Moshayedi affe38eac6 Populate DATABASEOID cache before CREATE INDEX CONCURRENTLY 2021-03-03 12:59:46 -08:00
Onder Kalaci 54ee96470e Pass pointer of AttributeEquivalenceClass instead of pointer of pointer
AttributeEquivalenceClass seems to be unnecessarily used with multiple
pointers. Just use a single pointer for ease of reading.
2021-03-03 12:27:26 +01:00
Onder Kalaci d1cd198655 Prevent infinite recursion for queries that involve UNION ALL and JOIN
With this commit, we make sure to prevent infinite recursion for queries
in the format: [subquery with a UNION ALL] JOIN [table or subquery]

Also fixes a bug where we push down UNION ALL below a JOIN even if the
UNION ALL is not safe to push down.
2021-03-03 12:27:26 +01:00
Hadi Moshayedi 1a05131331
Use chunk groups to read columnar data (#4768) 2021-03-02 23:53:24 -08:00
Naisila Puka 2f30614fe3
Reimplement citus_update_table_statistics to detect dist. deadlocks (#4752)
* Reimplement citus_update_table_statistics

* Update stats for the given table not colocation group

* Add tests for reimplemented citus_update_table_statistics

* Use coordinated transaction, merge with citus_shard_sizes functions

* Update the old master_update_table_statistics as well
2021-03-03 04:12:30 +03:00
Marco Slot dca615c5aa Normalize the ConvertTable notices 2021-03-01 10:36:12 +01:00
jeff-davis 9da9bd3dfd
Columnar: rename files and tests. (#4751)
* Columnar: rename files and tests.

* Columnar: Rename Table*State to Columnar*State.
2021-03-01 08:34:24 -08:00
SaitTalhaNisanci feee25dfbd
Use translated vars in postgres 13 as well (#4746)
* Use translated vars in postgres 13 as well

Postgres 13 removed translated vars, so we had special logic for pg 13.
However, it had a bug, so now we copy the translated vars before
postgres deletes them. This also simplifies the logic.

* fix rtoffset with pg >= 13
2021-02-26 19:41:29 +03:00
Halil Ozan Akgul 5c5cb200f7 Adds GRANT for public to citus_tables 2021-02-26 16:24:33 +03:00
Önder Kalacı 0fe26a216c
Prevent cross join without any target list entries (#4750)
/*
 * The physical planner assumes that all worker queries would have
 * target list entries based on the fact that at least the column
 * on the JOINs have to be on the target list. However, there is
 * an exception to that if there is a cartesian product join and
 * there are no additional target list entries belonging to one side
 * of the JOIN. Once we support cartesian product join, we should
 * remove this error.
 */
2021-02-26 11:04:21 +01:00
Onur Tirtir 54ac924bef Grant read access for columnar metadata tables to unprivileged user 2021-02-26 12:31:09 +03:00
Onur Tirtir dcc0207605 Add 10.0-2 schema version 2021-02-26 12:31:09 +03:00
Onur Tirtir 5ed954844c
Ensure table owner when using alter_columnar_table_set/alter_columnar_table_reset (#4748) 2021-02-26 12:27:51 +03:00
jeff-davis fbeb747006
Columnar: refactor read path and fix zero-column tables. (#4668)
Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-02-25 09:04:54 -08:00
Naisila Puka 5ebd4eac7f
Preserve colocation with procedures in alter_distributed_table (#4743) 2021-02-25 19:52:47 +03:00
Hanefi Onaldi 5aff18b573 Fix flaky test 2021-02-24 17:09:08 +03:00
Hanefi Onaldi 9a792ef841 Remove length limitations for table renames 2021-02-24 03:35:27 +03:00
Hanefi Onaldi 7bebeb872d Failing long table name tests 2021-02-24 03:35:27 +03:00
Onur Tirtir 495096ef5e
Remove useless pg version checks (#4741) 2021-02-23 21:20:18 +03:00
Naisila Puka dbb88f6f8b
Fix insert query with CTEs/sublinks/subqueries etc (#4700)
* Fix insert query with CTE

* Add more cases with deferred pruning but false fast path

* Add more tests

* Better readability with if statements
2021-02-23 18:00:47 +03:00
Naisila Puka 105bb580e1
Add columnar regression tests (#4727)
* Add cursor tests for columnar tables

* Add columnar tests for data types w/out comp. operators

* Add more prepared statements with columnar tables

* Add constraint tests for columnar tables

* Add row level security, detach partition and rename columnar tests

* Add some ORDER BYs
2021-02-23 14:16:38 +03:00
Hadi Moshayedi 2fca5ff3b5 Fix alignment issue in DatumToBytea 2021-02-22 16:04:30 -08:00
SaitTalhaNisanci dcf54eaf2a Use PROCESS_UTILITY_QUERY in utility calls
When we use PROCESS_UTILITY_TOPLEVEL it causes some problems when
combined with other extensions such as pg_audit. With this commit we use
PROCESS_UTILITY_QUERY in the codebase to fix those problems.
2021-02-19 13:55:59 +03:00
Sait Talha Nisanci bbf6132226 Revert "wip (#4730)"
This reverts commit 62e6d54a4e.
2021-02-19 13:55:59 +03:00
SaitTalhaNisanci 62e6d54a4e
wip (#4730) 2021-02-19 13:42:19 +03:00
Marco Slot 972a8bc0b7 Rewrite time_partitions join clause to avoid smallint[] operator 2021-02-18 12:01:18 +01:00
Ahmet Gedemenli 1f345f65b4 Support dropping local table indexes along with a distributed index 2021-02-18 13:30:12 +03:00
Onur Tirtir 676d9a9726 Bump Citus to 10.1devel 2021-02-17 11:54:33 +03:00
jeff-davis 0227317002
Columnar: better specification for microbenchmark. (#4711)
Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-02-16 15:28:25 -08:00
Onur Tirtir d61fd6e478
Decide changing sequence dependencies on MX nodes according to resulting relation (#4713)
When executing alter_table / undistribute_table udf's, we should not try
to change sequence dependencies on MX workers if new table wouldn't
require syncing metadata.

Previously, we were checking that for input table. But in some cases, the
fact that input table requires syncing metadata doesn't imply the same
for resulting table (e.g when undistributing a Citus table).

Even more, doing that was giving an unexpected error when undistributing
a Citus table so this commit actually fixes that.
2021-02-15 19:20:26 +03:00
SaitTalhaNisanci bcbd24f8de
Only consider pseudo constants for shortcuts (#4712)
It seems that we need to consider only pseudo constants while doing some
shortcuts in planning. For example there could be a false clause but it
can contribute to the result in which case it will not be a pseudo
constant.
2021-02-15 18:39:37 +03:00
SaitTalhaNisanci 0f1ce7a913
Not skip relation in conversion if it doesn't have RelationRestriction (#4685)
We would exclude tables without relationRestriction from conversion
candidates in local-distributed table joins. This could leave a leftover
local table which should have been converted to a subquery.

Ideally I would expect that in each call to CreateDistributedPlan we
would pass a new plan id, but that seems like a bigger change.
2021-02-12 12:33:55 +03:00
Hadi Moshayedi e690d8b79b Move stripe.chunk_count to last position 2021-02-11 17:00:44 -08:00
Jeff Davis b96673de69 Columnar: update README to compare with cstore_fdw. 2021-02-11 10:47:27 -08:00
Jeff Davis 1f1c3c362b Columnar: rename chunk_num -> chunk_group_num. 2021-02-11 09:27:00 -08:00
Onder Kalaci f297c96ec5 Add regression tests for COPY into colocated intermediate results
To add the tests without too much data, make the copy switchover
configurable.
2021-02-11 15:41:06 +01:00
Onder Kalaci 5d5a357487 Do not re-use connections for intermediate results
/*
 * Colocated intermediate results are just files and not required to use
 * the same connections with their co-located shards. So, we are free to
 * use any connection we can get.
 *
 * Also, the current connection re-use logic does not know how to handle
 * intermediate results as the intermediate results always truncate the
 * existing files. That's why, we use one connection per intermediate
 * result.
 */
2021-02-11 15:41:06 +01:00
Ahmet Gedemenli c8e83d1f26 Fix dropping fkey when distributing table 2021-02-11 15:48:35 +03:00
SaitTalhaNisanci 847b79078f
Not consider subplans in restriction list (#4679)
* Not consider subplans in restriction list

* Not consider sublink, alternative subplan in restrictions
2021-02-11 15:04:07 +03:00
Hadi Moshayedi c3dcd6b9f8 Columnar: don't include stripe reservation locks in lock graph. 2021-02-10 10:20:20 -08:00
Hadi Moshayedi 841d25bae9 Release metadata locks early 2021-02-10 10:20:12 -08:00
Onur Tirtir ec7ab68f3b Test adding local table with long name to metadata 2021-02-10 18:05:04 +03:00
Onur Tirtir 9f619a85d6
Fix EXPLAIN ANALYZE exec when query returns no cols (#4672)
We do not include dummy column if original task didn't return any
columns.
Otherwise, number of columns that original task returned wouldn't
match number of columns returned by worker_save_query_explain_analyze.
2021-02-10 17:59:47 +03:00
Hadi Moshayedi 52297804ae Fix zero column tables 2021-02-09 23:05:11 -08:00
Hadi Moshayedi 2d09c76b76 Rename storageid to storage_id 2021-02-09 19:57:04 -08:00
Hadi Moshayedi 8270b598b6 Rename stripeid, chunkid, and attnum 2021-02-09 19:50:50 -08:00
Hadi Moshayedi 9114fd4050 Move chunk.value_count to last position 2021-02-09 19:43:34 -08:00
Hadi Moshayedi be90c20457 Fix write path for zero column tables 2021-02-09 14:14:06 -08:00
Hadi Moshayedi c8d61a31e2 Columnar: chunk_group metadata table 2021-02-09 14:11:58 -08:00
Onder Kalaci c804c9aa21 Allow local execution for intermediate results in COPY
When COPY is used for copying into co-located files, it was
not allowed to use local execution. The primary reason was
Citus treating co-located intermediate results as co-located
shards, and COPY into the distributed table was done via
"format result". And, local execution of such COPY commands
was not implemented.

With this change, we implement support for local execution with
"format result". To do that, we use the buffer for every file
on shardState->copyOutState, similar to how local copy on
shards are implemented. In fact, the logic is similar to
local copy on shards, but instead of writing to the shards,
Citus writes the results to a file.

The logic relies on LOCAL_COPY_FLUSH_THRESHOLD, and flushes
only when the size exceeds the threshold. But, unlike local
copy on shards, in this case we write the headers and footers
just once.
2021-02-09 15:00:06 +01:00
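The buffering scheme described in commit c804c9aa21 above can be sketched roughly as follows. This is only an illustration in Python, not the Citus C implementation: `ResultFileWriter`, `FLUSH_THRESHOLD_BYTES`, and the header/footer byte strings are hypothetical names and values chosen for the example. The commit message only states that rows are buffered per file, flushed once the buffer exceeds LOCAL_COPY_FLUSH_THRESHOLD, and that headers and footers are written just once.

```python
import io

# Hypothetical stand-in for LOCAL_COPY_FLUSH_THRESHOLD; the real value
# is not given in the commit message.
FLUSH_THRESHOLD_BYTES = 16


class ResultFileWriter:
    """Buffers rows for one result file and flushes past a threshold."""

    def __init__(self, sink):
        self.sink = sink
        self.buffer = bytearray()
        self.flush_count = 0
        # The header is written exactly once, when the file is opened.
        self.sink.write(b"HEADER\n")

    def append_row(self, row: bytes) -> None:
        self.buffer += row
        # Flush only when the buffered size exceeds the threshold.
        if len(self.buffer) > FLUSH_THRESHOLD_BYTES:
            self._flush()

    def _flush(self) -> None:
        if self.buffer:
            self.sink.write(bytes(self.buffer))
            self.buffer.clear()
            self.flush_count += 1

    def close(self) -> None:
        # Write whatever is still buffered, then the footer exactly once.
        self._flush()
        self.sink.write(b"FOOTER\n")


sink = io.BytesIO()
writer = ResultFileWriter(sink)
for i in range(5):
    writer.append_row(f"row{i}\n".encode())
writer.close()
output = sink.getvalue()
```

With five 5-byte rows and a 16-byte threshold, the buffer is flushed once mid-stream and once on close, while the header and footer each appear a single time in `output`.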
Jeff Davis 2ea31c899e Columnar: make read and write state private. 2021-02-08 10:11:57 -08:00
Hanefi Onaldi 353b080474
Fix Semmle errors (#4636)
Co-authored-by: Halil Ozan Akgül <hozanakgul@gmail.com>
2021-02-08 18:37:44 +03:00
SaitTalhaNisanci e96da4886f
Sort results in citus_shards and give raw size (#4649)
* Sort results in citus_shards and give raw size

Sort results so that it is consistent and also similar to citus_tables.

Use raw size in the output so that doing operations on the size is
easier.

* Change column ordering
2021-02-08 15:29:42 +03:00
Hadi Moshayedi 3e6b54b964 Normalize isolation_metadata_sync_deadlock 2021-02-06 15:59:28 -08:00
Hadi Moshayedi eff8cffaf3
Columnar: improve naming of limit config variables. (#4653)
* Rename chunk_row_count to chunk_group_row_limit

* Rename stripe_row_count to stripe_row_limit

* Undo couple of renames
2021-02-06 09:04:04 -08:00
Jeff Davis b1882d4400 Columnar: Call nextval_internal instead of DirectFunctionCall. 2021-02-06 01:45:30 -08:00
Hadi Moshayedi 4e53314e3f Make isolation_metadata_sync_deadlock more resilient 2021-02-06 01:05:24 -08:00
Hadi Moshayedi 0a9fd91d8f Use 'Chunk Groups' in EXPLAIN ANALYZE of columnar scan 2021-02-05 10:58:01 -08:00
Hadi Moshayedi 1d311b0709 Columnar: don't double count chunks filtered 2021-02-05 10:58:01 -08:00
Ahmet Gedemenli 5dd2a3da03 Convert RelabelTypes into CollateExprs in get_rule_expr function 2021-02-05 12:06:46 +03:00
Ahmet Gedemenli 503171d2f2
Merge branch 'master' into rename-master-parameter-for-dist-stat-activity 2021-02-04 15:37:13 +03:00
Ahmet Gedemenli 2443b20b2c Rename master to distributed for worker stat activity 2021-02-04 12:20:06 +03:00
Onder Kalaci fc9a23792c COPY uses adaptive connection management on local node
With #4338, the executor is smart enough to fail over to the
local node if there is not enough space in max_connections
for remote connections.

For COPY, the logic is different. With #4034, we made COPY
work with the adaptive connection management slightly
differently. The cause of the difference is that COPY doesn't
know which placements are going to be accessed and hence needs
to get connections up-front.

Similarly, COPY decides to use local execution up-front.

With this commit, we change the logic for COPY on local nodes:

* Try to reserve a connection to local host. This logic follows
  the same logic (e.g., citus.local_shared_pool_size) as the
  executor because COPY also relies on TryToIncrementSharedConnectionCounter().
* If reservation to local node fails, switch to local execution.

Apart from this, if local execution is disabled, we follow the
exact same logic as for multi-node Citus. It means that if we are
out of connections, we give an error.
2021-02-04 09:45:07 +01:00
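The decision flow described in commit fc9a23792c above can be sketched as below. This is a simplified Python illustration, not the actual Citus code: `SharedConnectionCounter`, `choose_local_copy_strategy`, and the error text are hypothetical, and the counter only loosely mimics what TryToIncrementSharedConnectionCounter() and citus.local_shared_pool_size are described as doing.

```python
class SharedConnectionCounter:
    """Hypothetical model of a shared pool with a fixed size."""

    def __init__(self, pool_size: int):
        self.pool_size = pool_size
        self.in_use = 0

    def try_increment(self) -> bool:
        # Reserve a slot only if the pool still has room.
        if self.in_use >= self.pool_size:
            return False
        self.in_use += 1
        return True


def choose_local_copy_strategy(counter, local_execution_enabled: bool) -> str:
    # Step 1: try to reserve a connection to the local node.
    if counter.try_increment():
        return "connection"
    # Step 2: if the reservation fails, switch to local execution.
    if local_execution_enabled:
        return "local_execution"
    # Local execution is disabled: behave like multi-node Citus and error out.
    raise RuntimeError("could not reserve a connection to the local node")


demo_counter = SharedConnectionCounter(pool_size=1)
first = choose_local_copy_strategy(demo_counter, local_execution_enabled=True)
second = choose_local_copy_strategy(demo_counter, local_execution_enabled=True)
```

Here `first` gets the only available slot, and `second` falls back to local execution because the pool is exhausted.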
Ahmet Gedemenli 34840ddc5c Rename master to citus for dist stat activity cols 2021-02-04 11:12:23 +03:00
Hadi Moshayedi 5fde617229 Columnar: disallow CREATE INDEX CONCURRENTLY 2021-02-03 12:10:00 -08:00
Jeff Davis 4043731c41 Columnar: fix inheritance planning. 2021-02-03 10:41:21 -08:00
Sait Talha Nisanci ff82e85ea2 Replace workerNodeCount -> nodeCount 2021-02-03 20:02:03 +03:00
Sait Talha Nisanci eb5be579e3 Set previous cell inside a for loop 2021-02-03 20:02:03 +03:00
Sait Talha Nisanci 9ba3f70420 Remove unused method 2021-02-03 20:02:03 +03:00
Sait Talha Nisanci 24e60b44a1 Consider coordinator in intermediate result optimization
It seems that we were not considering the case where coordinator was
added to the cluster as a worker in the optimization of intermediate
results.

This could lead to errors when coordinator was added as a worker.
2021-02-03 20:02:03 +03:00
Onur Tirtir c0f2817b70
Disallow using alter_table udfs with tables having any identity cols (#4635)
pg_get_tableschemadef_string doesn't know how to deparse identity
columns so we cannot reflect those columns when creating table
from scratch. For this reason, we don't allow using alter_table udfs
with tables having any identity cols.
2021-02-03 19:33:54 +03:00
Onur Tirtir 3a403090fd
Disallow adding local table with identity column to metadata (#4633)
pg_get_tableschemadef_string doesn't know how to deparse identity
columns so we cannot reflect those columns when creating shell
relation.
For this reason, we don't allow adding local tables -having identity cols-
to metadata.
2021-02-03 19:05:17 +03:00
Onur Tirtir 5efb742f8a
Skip copying GENERATED ALWAYS AS STORED cols in ReplaceTable (#4616)
Postgres doesn't allow inserting into columns having GENERATED ALWAYS
AS (...) STORED expressions.
For this reason, when executing undistribute_table or an alter_* udf,
we should skip copying such columns.
This is not bad since Postgres would already generate such columns.
2021-02-03 17:55:16 +03:00
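The idea behind commit 5efb742f8a above, skipping generated stored columns when copying data, can be illustrated with a small Python sketch. The `Column` type and `build_copy_statement` helper are hypothetical and only show the column filtering; the actual ReplaceTable logic lives in Citus' C code, and identifier quoting is omitted here for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Column:
    name: str
    # True for GENERATED ALWAYS AS (...) STORED columns.
    generated_stored: bool = False


def build_copy_statement(source: str, target: str, columns: List[Column]) -> str:
    # Postgres rejects explicit inserts into generated stored columns,
    # so leave them out and let Postgres compute them on the target.
    names = [c.name for c in columns if not c.generated_stored]
    column_list = ", ".join(names)
    return f"INSERT INTO {target} ({column_list}) SELECT {column_list} FROM {source}"


statement = build_copy_statement(
    "old_table",
    "new_table",
    [Column("a"), Column("b"), Column("a_plus_b", generated_stored=True)],
)
```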
jeff-davis e03246dd45
Columnar: mark custom scan path parallel_safe. (#4619)
Enables an overall plan to be parallel (e.g. over a partition
hierarchy), even though an individual ColumnarScan is not
parallel-aware.

Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-02-02 11:56:00 -08:00
jeff-davis e195af7e72
Columnar: always disable parallel paths. (#4617)
Previously, if columnar.enable_custom_scan was false, parallel paths
could remain, leading to an unexpected error.

Also, ensure that cheapest_parameterized_paths is cleared if a custom
scan is used.

Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-02-02 11:37:42 -08:00
Onur Tirtir 53b1888cac Rename DropAndMoveDefaultSequenceOwnerships 2021-02-02 18:17:42 +03:00
Onur Tirtir 93c3f30024 Rename ExtractColumnsOwningSequences 2021-02-02 18:17:42 +03:00
Onur Tirtir 912d829757 Skip GENERATED ALWAYS AS STORED cols when processing cols owning sequences
When finding columns owning sequences, we shouldn't rely on atthasdef
since it might be true when column has GENERATED ALWAYS AS (...)
STORED expression.
2021-02-02 18:17:42 +03:00
Onur Tirtir c8a48c6eee
Do not try to sync metadata for local tables (#4625) 2021-02-02 15:12:12 +03:00
Onur Tirtir c5d4e7081b
Fix invalid read issue in deprecated create_citus_local_table udf (#4611)
Since create_citus_local_table doesn't specify cascadeViaForeignKeys
option, we can't directly call citus_add_local_table_to_metadata
from create_citus_local_table.
Instead, implement an internal method and call it from deprecated udf
too.
2021-02-02 12:53:27 +03:00
Jeff Davis f417510a7f Columnar: properly initialize rowNumber. 2021-02-01 21:15:14 -08:00
Hadi Moshayedi bcb162976f Fix #4608 2021-02-01 16:23:16 -08:00
Hadi Moshayedi f5b1e49b79 Columnar: Fix lateral joins 2021-02-01 11:59:36 -08:00
Hadi Moshayedi ef927688fa Columnar: Fix ALTER TABLE ... ADD COLUMN. 2021-02-01 11:40:17 -08:00
Brian Bergeron 1253eeb9ff
Don't propagate ALTER ROLE SET when scoped to a different database (#4471)
Co-authored-by: brberger <brberger@microsoft.com>
2021-02-01 15:49:26 +03:00
Hanefi Önaldı cab17afce9 Introduce UDFs for fixing partitioned table constraint names 2021-01-29 17:32:20 +03:00
Hanefi Önaldı 92cf49b7e9 Limit shardId in partitioned table constraint names to only CHECK 2021-01-29 17:29:53 +03:00
SaitTalhaNisanci 738825cc38
Fix partition column index issue (#4591)
* Fix partition column index issue

We send column names to worker_hash/range_partition_table methods, and
in these methods we check the column name index from tuple descriptor.
Then this index is used to decide the bucket that the current row will
be sent for the repartition.

This becomes a problem when the tupleDescriptor contains duplicate
column names, because we can then choose the wrong index. The
partitioned data is then sent to the wrong workers, and the result
could miss some data because workers might contain different ranges of data.

An example:
TupleDescriptor contains "trip_id", "car_id", "car_id" for one table.
It contains only "car_id" for the other table. And assuming that the
tables will be partitioned by car_id, it is not certain what should be
used for deciding the bucket number for the first table. Assuming value
2 goes to bucket 2 and value 3 goes to bucket 3, it is not certain which
bucket "1 2 3" (trip_id, car_id, car_id)  row will go to.

As a solution we send the index of partition column in targetList
instead of the column name.

The old API is kept so that it still works while workers are being
upgraded (though it will have the same bug).

* Use the same method so that backporting is easier
2021-01-29 14:40:40 +03:00
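The ambiguity described in commit 738825cc38 above can be reproduced with a few lines of Python. This is only a sketch of the lookup problem, using the column names and row from the commit's own example; `bucket_for` is a hypothetical identity mapping that mirrors "value 2 goes to bucket 2 and value 3 goes to bucket 3", and the assumption that the second "car_id" is the intended partition column is made purely for illustration.

```python
column_names = ["trip_id", "car_id", "car_id"]
row = (1, 2, 3)


def bucket_for(value: int) -> int:
    # Hypothetical mapping taken from the commit's example:
    # value 2 -> bucket 2, value 3 -> bucket 3.
    return value


# Old approach: look the partition column up by name. With duplicate
# names, this always resolves to the first match.
index_from_name = column_names.index("car_id")
bucket_by_name = bucket_for(row[index_from_name])

# New approach: pass the index of the partition column directly.
# Assume, for illustration, that the second "car_id" is the intended one.
partition_column_index = 2
bucket_by_index = bucket_for(row[partition_column_index])
```

The name-based lookup picks index 1 and bucket 2, while the explicit index selects bucket 3, so the two approaches route the same row differently.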
SaitTalhaNisanci 1ba399f5ca
Fix a flaky behaviour in shared_connection_stats (#4596)
With the previous query, we were not pushing down the pg_sleep hence the
number of connections to a worker could be different from run to run.
2021-01-28 18:42:49 +03:00
Onder Kalaci c7ea46067f Add regression tests 2021-01-28 12:45:57 +01:00
Onder Kalaci 04fcd73eb6 When it reaches the shared pool size, COPY sets the placement accesses
It looks like we forgot to set the placement accesses, and
this could lead to self-deadlocks on complex transaction blocks.
2021-01-28 12:45:57 +01:00
Onder Kalaci 36bdeef1bb When it reaches the executor pool size, COPY sets the placement accesses
It looks like we forgot to set the placement accesses, and
this could lead to self-deadlocks on complex transaction blocks.
2021-01-28 12:45:57 +01:00
Onur Tirtir bb5962ee79
Early error out when creating citus local from a temp table (#4592) 2021-01-28 14:18:06 +03:00
Halil Ozan Akgul 913aa91449 Adds error message to AlterTableSetAccessMethod for below PG12 2021-01-28 11:32:02 +03:00
jeff-davis 15297cab49
Columnar: add GUC to control qual pushdown. (#4586) 2021-01-27 09:57:40 -08:00
jeff-davis 62e0383150
Columnar readme. (#4585)
Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2021-01-27 09:33:35 -08:00
Nils Dijk 07d3b4fd04
fix NaN cost estimate on empty columnar tables (#4593)
Fixing a division by zero in the cost calculations for scanning a columnar table.

Due to how the columns in a columnar table are counted, an empty table would result in a division by zero. Instead, this patch keeps the column selection ratio at zero when this happens, resulting in an accurate cost of zero pages to scan a columnar table.

fixes #4589
2021-01-27 17:32:17 +01:00
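The guard described in commit 07d3b4fd04 above can be sketched as follows. The function names and the cost formula are hypothetical simplifications, not the actual columnar cost model; the sketch only shows the ratio staying at zero when the column count is zero. Note that in C a floating-point 0.0 / 0.0 yields NaN, whereas Python would raise ZeroDivisionError, so the guard matters in both languages.

```python
def column_selection_ratio(selected_columns: int, total_columns: int) -> float:
    # Keep the ratio at zero instead of dividing by zero.
    if total_columns == 0:
        return 0.0
    return selected_columns / total_columns


def scan_cost(total_pages: int, selected_columns: int, total_columns: int,
              per_page_cost: float = 1.0) -> float:
    # Hypothetical cost: pages to scan are scaled by the share of
    # columns that the query actually reads.
    ratio = column_selection_ratio(selected_columns, total_columns)
    return total_pages * ratio * per_page_cost


empty_table_cost = scan_cost(total_pages=0, selected_columns=0, total_columns=0)
regular_cost = scan_cost(total_pages=10, selected_columns=1, total_columns=4)
```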