Commit Graph

3540 Commits (f0536fa299d0cbbb2831cca0a6b620212be3f3cd)

Author SHA1 Message Date
Burak Velioglu 596e49db8c
Citus Indent 2021-12-31 13:16:30 +03:00
Burak Velioglu 3c1361dd4d
Move detach partition command list 2021-12-31 13:08:00 +03:00
Halil Ozan Akgul 9547228e8d Add isolation_check_mx test 2021-12-30 14:58:30 +03:00
Burak Velioglu 848d13f6eb
Changes depending on the discussion 2021-12-30 14:38:31 +03:00
Halil Ozan Akgul aef2d83c7d Fix metadata sync fails on multi_transaction_recovery 2021-12-29 11:21:32 +03:00
Burak Velioglu a348b6b097
Comment out failing test 2021-12-28 16:28:28 +03:00
Burak Velioglu cc68e87903
Citus indent 2021-12-28 12:48:20 +03:00
Burak Velioglu e610a147a1
Remove unnecessary comments on tests 2021-12-28 11:37:02 +03:00
Burak Velioglu d912cd9a9b
Update todos 2021-12-28 11:27:12 +03:00
Burak Velioglu cfe3f97faf
Create shell always for dist and reference table 2021-12-28 11:08:27 +03:00
Burak Velioglu d18525a906
Create shell always for local table 2021-12-28 10:58:35 +03:00
Burak Velioglu 416196b9be
Unmark dropped tabled on worker 2021-12-28 10:52:31 +03:00
Burak Velioglu 880533a609
Divide object and metadata handling 2021-12-27 18:14:51 +03:00
Önder Kalacı d33650d1c1
Record if any partitioned Citus tables during upgrade (#5555)
With Citus 11, the default behavior is to sync the metadata.
However, partitioned tables created pre-Citus 11 might have
index names that are not compatiable with metadata syncing.

See https://github.com/citusdata/citus/issues/4962 for the
details.

With this commit, we record the existence of partitioned tables
such that we can fix it later if any exists.
2021-12-27 03:33:34 -08:00
Halil Ozan Akgul 0c292a74f5 Fix metadata sync fails on multi_truncate 2021-12-27 13:54:53 +03:00
Önder Kalacı c9127f921f
Avoid round trips while fixing index names (#5549)
With this commit, fix_partition_shard_index_names()
works significantly faster.

For example,

32 shards, 365 partitions, 5 indexes drop from ~120 seconds to ~44 seconds
32 shards, 1095 partitions, 5 indexes drop from ~600 seconds to ~265 seconds

`queryStringList` can be really long, because it may contain #partitions * #indexes entries.

Before this change, we were actually going through the executor where each command
in the query string triggers 1 round trip per entry in queryStringList.

The aim of this commit is to avoid the round-trips by creating a single query string.

I first simply tried sending `q1;q2;..;qn` . However, the executor is designed to
handle `q1;q2;..;qn` type of query executions via the infrastructure mentioned
above (e.g., by tracking the query indexes in the list and doing 1 statement
per round trip).

One another option could have been to change the executor such that only track
the query index when `queryStringList` is provided not with queryString
including multiple `;`s . That is (a) more work (b) could cause weird edge
cases with failure handling (c) felt like coding a special case in to the executor
2021-12-27 10:29:37 +01:00
Halil Ozan Akgul bb636e6a29 Fix metadata sync fails on multi_function_evaluation 2021-12-24 19:32:58 +03:00
Halil Ozan Akgul 70e68d5312 Fix metadata sync fails on multi_name_lengths 2021-12-24 14:33:32 +03:00
Halil Ozan Akgul 5c2fb06322 Fix metadata sync fails on multi_sequence_default 2021-12-24 14:33:32 +03:00
Halil Ozan Akgul b9c06a6762 Turn metadata sync on in multi_metadata_sync 2021-12-24 10:58:13 +03:00
Hanefi Onaldi 479b2da740 Fix one flaky failure test 2021-12-23 20:11:45 +03:00
Ahmet Gedemenli 042d45b263 Propagate foreign server ops 2021-12-23 17:54:04 +03:00
Burak Velioglu 6598a23963
Dependency update 2021-12-23 17:46:37 +03:00
Onur Tirtir 61b5fb1cfc
Run failure_test_helpers in base schedule (#5559) 2021-12-23 12:54:12 +01:00
Talha Nisanci e196d23854
Refactor AttributeEquivalenceId (#5006) 2021-12-23 13:19:02 +03:00
Hanefi Onaldi 76176caea7 Fix typo s/exlusive/exclusive/ 2021-12-23 01:35:01 +03:00
Burak Velioglu 48c5ce8960
Dist table refactor 2021-12-23 00:16:38 +03:00
Burak Velioglu a448ca01bc
Add TODOs 2021-12-22 21:40:04 +03:00
Burak Velioglu 6705f0c612
Undistributed removed sequences while undistributing table 2021-12-22 20:26:02 +03:00
Hanefi Onaldi 1af8ca8f7c
Fix statical analysis findings (#5550) 2021-12-22 18:16:11 +03:00
Burak Velioglu 81a6cb47d6
Merge branch 'master' into velioglu/table_wo_seq_prototype 2021-12-22 11:24:40 +03:00
Burak Velioglu d66f1243bf
Object distribution change 2021-12-22 11:19:34 +03:00
Burak Velioglu 9fe1c9646d
Uncomment test 2021-12-21 22:31:45 +03:00
Burak Velioglu 50f52101b8
Remove metadata object from ref table prop 2021-12-21 22:27:36 +03:00
Burak Velioglu 9fec89d70b
Citus_disable_node 2021-12-21 21:36:32 +03:00
Ahmet Gedemenli 8e4ff34a2e Do not include return table params in the function arg list
(cherry picked from commit 90928cfd74)

Fix function signature generation

Fix comment typo

Add test for worker_create_or_replace_object

Add test for recreating distributed functions with OUT/TABLE params

Add test for recreating distributed function that returns setof int

Fix test output

Fix comment
2021-12-21 19:01:42 +03:00
Burak Velioglu 2e61c3e6b8
Add todos 2021-12-20 15:44:01 +03:00
Burak Velioglu ab29b939b2
Indentation fix 2021-12-20 12:39:03 +03:00
Burak Velioglu 303c7e230e
Ensure dependency craetion on for adding/activating node 2021-12-20 12:28:43 +03:00
Burak Velioglu b9d1ab38af
Move shell table creation out 2021-12-18 21:28:24 +03:00
Burak Velioglu 3d828bc7b6
Sync metadata first before propagating any 2021-12-18 20:25:09 +03:00
Marco Slot 2eef71ccab Propagate SET TRANSACTION commands 2021-12-18 11:31:39 +01:00
Burak Velioglu f66b1d5116
Fix list creation 2021-12-17 17:17:49 +03:00
Halil Ozan Akgul 46f718c76d Turn metadata sync on in add_coordinator, foreign_key_to_reference_table and replicate_reference_tables_to_coordinator 2021-12-17 16:33:25 +03:00
Halil Ozan Akgul 25755a7094 Turn ddl propagation off in worker on multi_copy 2021-12-17 15:54:20 +03:00
Onder Kalaci fc98f83af2 Add citus.grep_remote_commands
Simply applies

```SQL
SELECT textlike(command, citus.grep_remote_commands)
```
And, if returns true, the command is logged. Else, the log is ignored.

When citus.grep_remote_commands is empty string, all commands are
logged.
2021-12-17 11:47:40 +01:00
Halil Ozan Akgul df8d0f3db1 Turn metadata sync on in multi_replicate_reference_table and multi_citus_tools 2021-12-17 10:25:57 +03:00
Onur Tirtir cc4c83b1e5
HAVE_LZ4 -> HAVE_CITUS_LZ4 (#5541) 2021-12-16 16:21:52 +03:00
Burak Velioglu 3636b7c9c5
Start handling local tables 2021-12-16 14:57:46 +03:00
Talha Nisanci c0945d88de
Normalize a debug failure to WARNING failure (#4996) 2021-12-16 13:43:49 +03:00
Halil Ozan Akgul 8943d7b52f Turn metadata sync on in mx_regular_user and remove_coordinator 2021-12-16 11:26:24 +03:00
Halil Ozan Akgul b82af4db3b Turn metadata sync on in multi_size_queries, multi_drop_extension and multi_unsupported_worker_operations 2021-12-16 11:10:54 +03:00
Hanefi Onaldi 9d4d73898a
Move healthcheck logic into new file (#5531)
and add a missing `CheckCitusVersion(ERROR)` call
2021-12-15 15:58:20 -08:00
Burak Velioglu a6cdd43d42
Fix metadata changes and use same connection for all 2021-12-16 00:04:53 +03:00
Hanefi Onaldi acdcd9422c
Fix one flaky failure test (#5528)
Removes flaky test
2021-12-15 18:59:58 +03:00
Burak Velioglu fea68a43ad
Start moving table dependent metadata 2021-12-15 18:04:59 +03:00
Hanefi Onaldi 29e4516642 Introduce citus_check_cluster_node_health UDF
This UDF coordinates connectivity checks accross the whole cluster.

This UDF gets the list of active readable nodes in the cluster, and
coordinates all connectivity checks in sequential order.

The algorithm is:

for sourceNode in activeReadableWorkerList:
    c = connectToNode(sourceNode)
    for targetNode in activeReadableWorkerList:
        result = c.execute(
            "SELECT citus_check_connection_to_node(targetNode.name,
                                                   targetNode.port")
        emit sourceNode.name,
             sourceNode.port,
             targetNode.name,
             targetNode.port,
             result

- result -> true  ->  connection attempt from source to target succeeded
- result -> false -> connection attempt from source to target failed
- result -> NULL  -> connection attempt from the current node to source node failed

I suggest you use the following query to get an overview on the connectivity:

SELECT bool_and(COALESCE(result, false))
FROM citus_check_cluster_node_health();

Whenever this query returns false, there is a connectivity issue, check in detail.
2021-12-15 01:41:51 +03:00
Hanefi Onaldi 13fff9c37a Remove NOOP tuplestore_donestoring calls
PostgreSQL does not need calling this function since 7.4 release, and it
is a NOOP.

For more details, check PostgreSQL commit below :

commit dd04e958c8b03c0f0512497651678c7816af3198
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Sun Mar 9 03:34:10 2003 +0000

    tuplestore_donestoring() isn't needed anymore, but provide a no-op
    macro definition so as not to create compatibility problems.

diff --git a/src/include/utils/tuplestore.h b/src/include/utils/tuplestore.h
index b46babacd1..76fe9fb428 100644
--- a/src/include/utils/tuplestore.h
+++ b/src/include/utils/tuplestore.h
@@ -17,7 +17,7 @@
  * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $Id: tuplestore.h,v 1.8 2003/03/09 02:19:13 tgl Exp $
+ * $Id: tuplestore.h,v 1.9 2003/03/09 03:34:10 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -41,6 +41,9 @@ extern Tuplestorestate *tuplestore_begin_heap(bool randomAccess,

 extern void tuplestore_puttuple(Tuplestorestate *state, void *tuple);

+/* tuplestore_donestoring() used to be required, but is no longer used */
+#define tuplestore_donestoring(state)  ((void) 0)
+
 /* backwards scan is only allowed if randomAccess was specified 'true' */
 extern void *tuplestore_gettuple(Tuplestorestate *state, bool forward,
                                        bool *should_free);
2021-12-14 18:55:02 +03:00
Halil Ozan Akgul e060720370 Fix metadata sync fails in multi_index_statements 2021-12-14 11:28:08 +03:00
Halil Ozan Akgul a951e52ce8 Fix drop index trying to drop coordinator local indexes on metadata worker nodes 2021-12-14 11:28:08 +03:00
Halil Ozan Akgul 1d7dde2c4c Fix metadata sync fails on multi_copy 2021-12-14 10:59:59 +03:00
Halil Ozan Akgul 98e38e2e4e Fix metadata sync fails on failure_connection_establishment 2021-12-13 11:51:56 +03:00
Burak Velioglu 14f8cd5a75
Handle sequences 2021-12-10 17:57:19 +03:00
Halil Ozan Akgul 507df08422 Fix metadata sync fails on propagate_statistics and pg13_propagate_statistics tests 2021-12-09 12:28:11 +03:00
Burak Velioglu 5762bfb454
Shell table work prototype 2021-12-09 12:09:32 +03:00
Halil Ozan Akgul 351314f8a1 Turn metadata sync on in base/minimal schedules 2021-12-08 13:34:41 +03:00
Halil Ozan Akgul ee894c9e73 Fix metadata sync fails on multi_follower_schedule 2021-12-08 13:07:37 +03:00
Halil Ozan Akgul 4c8f79d7dd Turn metadata sync on in failure schedule 2021-12-08 11:22:56 +03:00
Halil Ozan Akgul 4f272ea0e5 Fix metadata sync fails in multi_extension 2021-12-08 10:25:43 +03:00
Halil Ozan Akgul a3834edeaa Turn metadata sync on in multi_mx_schedule 2021-12-08 10:25:43 +03:00
Halil Ozan Akgul ea37f4fd29 Turn metadata sync on in upgrade schedules 2021-12-08 10:19:02 +03:00
Hanefi Onaldi 05a3dfa8a9 Remove redundant arbitrary config class
We had 2 class definitions for CitusCacheManyConnectionsConfig, where
one of them was a copy of CitusSmallCopyBuffersConfig.

This commit leaves the intended class definition that configures caching
many connections, and removes the one that is a copy of another class
2021-12-08 04:47:08 +03:00
Burak Velioglu e8534c1dd5
Drop sequence metadata from workers explicitly 2021-12-06 19:25:51 +03:00
Burak Velioglu 21194c3b9d
Mark sequence distributed explicitly while syncing metadata
Since sequences are not marked as distributed while creating table if no
metadata worker node exists, we are marking all sequences distributed
while syncing metadata explicitly.
2021-12-06 19:25:51 +03:00
Burak Velioglu 6d849cf394
Allow delegating function from worker nodes
We've both allowed delegating functions and procedures from worker nodes
and also prevented delegation if a function/procedure has already been
propagated from another node.
2021-12-06 19:25:51 +03:00
Burak Velioglu a8b1ee87f7
Increment command counter after altering the sequence type 2021-12-06 19:25:51 +03:00
Burak Velioglu ed8e32de5e
Sync pg_dist_object on an update and propagate while syncing to a new node
Before that PR we were updating citus.pg_dist_object metadata, which keeps
the metadata related to objects on Citus, only on the coordinator node. In
order to allow using those object from worker nodes (or erroring out with
proper error message) we've started to propagate that metedata to worker
nodes as well.
2021-12-06 19:25:50 +03:00
Halil Ozan Akgul ef09ba0d06 Fix metadata sync fails of multi_table_ddl 2021-12-06 13:44:30 +03:00
Halil Ozan Akgul a6d0de060c Fix fails with metadata syncing in undistribute_table 2021-12-03 13:58:53 +03:00
Hanefi Onaldi 56e9b1b968 Introduce UDF to check worker connectivity
citus_check_connection_to_node runs a simple query on a remote node and
reports whether this attempt was successful.

This UDF will be used to make sure each worker node can connect to all
the worker nodes in the cluster.

parameters:
nodename: required
nodeport: optional (default: 5432)

return value:
boolean success
2021-12-03 02:30:28 +03:00
Talha Nisanci e4ead8f408
Update broken link for upgrade tests (#5408)
* Update broken link for upgrade tests

* Update src/test/regress/README.md

Co-authored-by: Nils Dijk <nils@citusdata.com>

Co-authored-by: Nils Dijk <nils@citusdata.com>
2021-12-02 15:25:36 +01:00
Onder Kalaci 549edcabb6 Allow disabling node(s) when multiple failures happen
As of master branch, Citus does all the modifications to replicated tables
(e.g., reference tables and distributed tables with replication factor > 1),
via 2PC and avoids any shardstate=3. As a side-effect of those changes,
handling node failures for replicated tables change.

With this PR, when one (or multiple) node failures happen, the users would
see query errors on modifications. If the problem is intermitant, that's OK,
once the node failure(s) recover by themselves, the modification queries would
succeed. If the node failure(s) are permenant, the users should call
`SELECT citus_disable_node(...)` to disable the node. As soon as the node is
disabled, modification would start to succeed. However, now the old node gets
behind. It means that, when the node is up again, the placements should be
re-created on the node. First, use `SELECT citus_activate_node()`. Then, use
`SELECT replicate_table_shards(...)` to replicate the missing placements on
the re-activated node.
2021-12-01 10:19:48 +01:00
Halil Ozan Akgul 316274b5f0 Add normalize.sed item for multi_fix_partition_shard_index_names test 2021-11-30 13:28:41 +03:00
Halil Ozan Akgul 11072b4cb8 Normalize create role command in drop_partitioned_table test 2021-11-30 12:46:22 +03:00
Onder Kalaci d405993b57 Make sure to use a dedicated metadata connection
With this commit, we make sure to use a dedicated connection per
node for all the metadata operations within the same transaction.

This is needed because the same metadata (e.g., metadata includes
the distributed table on the workers) can be modified accross
multiple connections.

With this connection we guarantee that there is a single metadata connection.
But note that this connection can be used for any other operation.
In other words, this connection is not only reserved for metadata
operations.
2021-11-26 14:36:28 +01:00
Onder Kalaci 38b08ebde9 Generalize the error checks while removing node
The checks for preventing to remove a node are very much reference
table centric. We are soon going to add the same checks for replicated
tables. So, make the checks generic such that:
	 (a) replicated tables fit naturally
	 (b) we can the same checks in `citus_disable_node`.
2021-11-26 14:25:29 +01:00
Hanefi Onaldi 4c135de9e4 Introduce CI checks for hash comments in specs
We do not use comments starting with # in spec files because it creates
errors from C preprocessor that expects directives after this character.
Instead use C style comments, i.e:
// single line comment

You can also use multiline comments as well
/*
 * multi line comment
 */
2021-11-26 14:52:51 +03:00
Halil Ozan Akgul 87a1c760d9 Fix tests in multi-1-schedule that fail with metadata syncing 2021-11-26 12:09:53 +03:00
Onder Kalaci 121f5c4271 Active placements can only be on active nodes
We re-define the meaning of active shard placement. It used
to only be defined via shardstate == SHARD_STATE_ACTIVE.

Now, we also add one more check. The worker node that the
placement is on should be active as well.

This is a preparation for supporting citus_disable_node()
for MX with multiple failures at the same time.

With this change, the maintanince daemon only needs to
sync the "node metadata" (e.g., pg_dist_node), not the
shard metadata.
2021-11-26 09:14:33 +01:00
Onder Kalaci b4931f7345 Do not acquire locks on reference tables when a node is removed/disabled
Before this commit, we acquire the metadata locks on the reference
tables while removing/disabling a node on all the MX nodes.

Although it has some marginal benefits, such as a concurrent
modification during remove/disable node blocks, instead of erroring
out, the drawbacks seems worse. Both citus_remove_node and citus_disable_node
are not tolerant to multiple node failures.

With this commit, we relax the locks. The implication is that while
a node is removed/disabled, users might see query errors. On the
other hand, this change becomes removing/disabling nodes more
tolerant to multiple node failures.
2021-11-26 09:08:25 +01:00
Onur Tirtir 76b8006a9e
Allow overwriting columnar storage pages written by aborted xacts (#5484)
When refactoring storage layer in #4907, we deleted the code that allows
overwriting a disk page previously written but not known by metadata.

Readers can see the change that introduced the code allows doing so in
commit a8da9acc63.

The reasoning was that; as of 10.2, we started aligning page
reservations (`AlignReservation`) for subsequent writes right after
allocating pages from disk. That means, even if writer transaction
fails, subsequent writes are guaranteed to allocate a new page and write
to there. For this reason, attempting to write to a page allocated
before is not possible for a columnar table that user created when using
v10.2.x.

However, since the older versions of columnar doesn't do that, following
example scenario can still result in writing to such disk page, even if
user now upgraded to v10.2.x. This is because, when upgrading storage to
2.0 (`ColumnarStorageUpdateIfNeeded`), we calculate `reservedOffset` of
the metapage based on the highest used address known by stripe
metadata (`GetHighestUsedAddressAndId`). However, stripe metadata
doesn't have entries for aborted writes. As a result, highest used
address would be computed by ignoring pages that are allocated but not
used.

- User attempts writing to columnar table on Citus v10.0x/v10.1x.
- Write operation fails for some reason.
- User upgrades Citus to v10.2.x.
- When attempting to write to same columnar table, they hit to "attempt
  to write columnar data .." error since write operation done in the
  older version of columnar already allocated that page, and now we are
  overwriting it.

For this reason, with this commit, we re-do the change done in
a8da9acc63.

And for the reasons given above, it wasn't possible to add a test for
this commit via usual code-paths. For this reason, added a UDF only for
testing purposes so that we can reproduce the exact scenario in our
regression test suite.
2021-11-26 07:51:13 +01:00
Onur Tirtir 85da4fc2e0
Merge branch 'master' into col/pg-upgrade-dependency 2021-11-26 09:34:43 +03:00
Onur Tirtir 81af605e07
Fix typo: "no sharding pruning constraints" -> "no shard pruning constraints" (#5490) 2021-11-25 21:00:44 +01:00
Onur Tirtir 73f06323d8 Introduce dependencies from columnarAM to columnar metadata objects
During pg upgrades, we have seen that it is not guaranteed that a
columnar table will be created after metadata objects got created.
Prior to changes done in this commit, we had such a dependency
relationship in `pg_depend`:

```
columnar_table ----> columnarAM ----> citus extension
                                           ^  ^
                                           |  |
columnar.storage_id_seq --------------------  |
                                              |
columnar.stripe -------------------------------
```

Since `pg_upgrade` just knows to follow topological sort of the objects
when creating database dump, above dependency graph doesn't imply that
`columnar_table` should be created before metadata objects such as
`columnar.storage_id_seq` and `columnar.stripe` are created.

For this reason, with this commit we add new records to `pg_depend` to
make columnarAM depending on all rel objects living in `columnar`
schema. That way, `pg_upgrade` will know it needs to create those before
creating `columnarAM`, and similarly, before creating any tables using
`columnarAM`.

Note that in addition to inserting those records via installation script,
we also do the same in `citus_finish_pg_upgrade()`. This is because,
`pg_upgrade` rebuilds catalog tables in the new cluster and that means,
we must insert them in the new cluster too.
2021-11-23 13:14:00 +03:00
Onur Tirtir ef2ca03f24 Reproduce bug via test suite 2021-11-23 13:14:00 +03:00
Burak Velioglu 6590f12de4
Merge branch 'master' into velioglu/make_object_lock_explicit 2021-11-22 13:55:36 +03:00
Burak Velioglu 12e05ad196
Sorted addresses before getting lock 2021-11-22 11:43:32 +03:00
Marco Slot f49d26fbeb Remove citus_update_table_statistics isolation test 2021-11-19 10:51:15 +01:00
Marco Slot 56eae48daf Stop updating shard range in citus_update_shard_statistics 2021-11-19 10:51:15 +01:00
Burak Velioglu 3a68263cc7
Change lock type 2021-11-19 12:03:17 +03:00