84 Commits (6fd5f8e65836d17a5e7ca83fe6a16cf93348b3d9)

Author SHA1 Message Date
Burak Velioglu 6fd5f8e658
Move functions 2022-01-16 23:54:04 +03:00
Burak Velioglu aedd09ffdf
Separate command list generation and execution 2022-01-16 23:49:31 +03:00
Burak Velioglu 381871890f
Move DetachPartitionCommandList back 2022-01-16 21:20:39 +03:00
Burak Velioglu e107fb7519
Move DistributedObjectMetadataSyncCommandList back 2022-01-16 21:02:16 +03:00
Burak Velioglu 04511cc5e0
Address reviews 2022-01-16 16:44:21 +03:00
Burak Velioglu 9fc09947ed
Citus indent 2022-01-12 15:26:01 +03:00
Burak Velioglu ad67942cda
Sequence and add node fix 2022-01-12 15:24:09 +03:00
Burak Velioglu 2b513c4100
Merge branch 'master' into velioglu/table_wo_seq_prototype 2022-01-11 11:35:32 +03:00
Burak Velioglu 697d1468fe
Use coordinated transaction for object prop 2022-01-10 22:08:43 +03:00
Önder Kalacı 885601c02c
Require superuser while activating a node (#5609)
* Require superuser while activating a node

With this change, ActiveNode() (and hence citus_add_node() and
citus_activate_node()) explicitly requires a superuser.

Before this commit, these functions were designed to work with
non-superuser roles with the relevant GRANTs given.

However, that is not a widely used way for calling the functions
above.

Because non-superusers could call the UDFs, they were
designed in a way that some commands used additional
short-lived superuser connections. That was:
	(a) breaking transactional behavior (e.g., ROLLBACK
	    wouldn't fully roll back the whole transaction)
	(b) making it very complicated to reason about which
	    parts of the node activation go over which connections,
	    and leaving it vulnerable to deadlocks / visibility issues.
2022-01-10 08:30:13 -08:00
Burak Velioglu 76e1e1fd6b
Add adjust sequence settings and update tests 2022-01-09 19:35:07 +03:00
Burak Velioglu 8006765504
Handle foreign tables and update tests 2022-01-07 15:48:20 +03:00
Burak Velioglu 36fb662bf4
Merge branch 'master' into velioglu/table_wo_seq_prototype 2022-01-07 12:43:35 +03:00
Burak Velioglu 1383c442ea
Update multiple table integration sync 2022-01-07 12:18:06 +03:00
Onder Kalaci 9f2d9e1487 Move placement deletion from disable node to activate node
We prefer the background daemon to only sync node metadata. That's
why we move placement metadata changes from disable node to
activate node. With that, we can make sure that disable node
only changes node metadata, whereas activate node syncs all
the metadata changes. In essence, we already expect all
nodes to be up when a node is activated. So, this does not change
the behavior much.
2022-01-07 09:56:03 +01:00
Burak Velioglu 299043dfaa
Update activate node 2022-01-05 16:33:11 +03:00
Önder Kalacı 0a8b0b06c6
Do not allow distributed functions on non-metadata synced nodes (#5586)
Before this commit, Citus was triggering metadata syncing
in the background when a function was distributed. However,
with Citus 11, we expect all clusters to have metadata syncing
enabled. So, we do not expect any nodes not to have the metadata.

This change:
	(a) pro: simplifies the code and opens up possibilities
	    to simplify further by reducing the scope of the
	    bg worker to only sync node metadata
	(b) pro: explicitly asks users to sync the metadata such that
	    any unforeseen impact can be easily detected
	(c) con: for distributed functions without a distribution
	    argument, we do not necessarily require the metadata
	    to be synced. However, for completeness and simplicity,
	    we do so.
2022-01-04 13:12:57 +01:00
Burak Velioglu 7e3f2486f3
Remove metadata by checking isactive 2022-01-03 17:55:26 +03:00
Burak Velioglu 070e2afbe5
Remove placements wisely 2021-12-31 17:50:35 +03:00
Burak Velioglu 596e49db8c
Citus Indent 2021-12-31 13:16:30 +03:00
Burak Velioglu 3c1361dd4d
Move detach partition command list 2021-12-31 13:08:00 +03:00
Burak Velioglu 848d13f6eb
Changes depending on the discussion 2021-12-30 14:38:31 +03:00
Burak Velioglu cc68e87903
Citus indent 2021-12-28 12:48:20 +03:00
Burak Velioglu 880533a609
Divide object and metadata handling 2021-12-27 18:14:51 +03:00
Burak Velioglu 6598a23963
Dependency update 2021-12-23 17:46:37 +03:00
Burak Velioglu 9fec89d70b
Citus_disable_node 2021-12-21 21:36:32 +03:00
Burak Velioglu ab29b939b2
Indentation fix 2021-12-20 12:39:03 +03:00
Burak Velioglu 303c7e230e
Ensure dependency creation for adding/activating node 2021-12-20 12:28:43 +03:00
Burak Velioglu 3d828bc7b6
Sync metadata first before propagating any 2021-12-18 20:25:09 +03:00
Burak Velioglu f66b1d5116
Fix list creation 2021-12-17 17:17:49 +03:00
Burak Velioglu 3636b7c9c5
Start handling local tables 2021-12-16 14:57:46 +03:00
Burak Velioglu a6cdd43d42
Fix metadata changes and use same connection for all 2021-12-16 00:04:53 +03:00
Burak Velioglu fea68a43ad
Start moving table dependent metadata 2021-12-15 18:04:59 +03:00
Burak Velioglu 14f8cd5a75
Handle sequences 2021-12-10 17:57:19 +03:00
Burak Velioglu 5762bfb454
Shell table work prototype 2021-12-09 12:09:32 +03:00
Onder Kalaci 549edcabb6 Allow disabling node(s) when multiple failures happen
As of master branch, Citus does all the modifications to replicated tables
(e.g., reference tables and distributed tables with replication factor > 1),
via 2PC and avoids any shardstate=3. As a side-effect of those changes,
handling node failures for replicated tables changes.

With this PR, when one (or multiple) node failures happen, the users would
see query errors on modifications. If the problem is intermittent, that's OK:
once the node failure(s) recover by themselves, the modification queries would
succeed. If the node failure(s) are permanent, the users should call
`SELECT citus_disable_node(...)` to disable the node. As soon as the node is
disabled, modifications would start to succeed. However, now the old node gets
behind. It means that, when the node is up again, the placements should be
re-created on the node. First, use `SELECT citus_activate_node()`. Then, use
`SELECT replicate_table_shards(...)` to replicate the missing placements on
the re-activated node.
2021-12-01 10:19:48 +01:00
Onder Kalaci d405993b57 Make sure to use a dedicated metadata connection
With this commit, we make sure to use a dedicated connection per
node for all the metadata operations within the same transaction.

This is needed because the same metadata (e.g., metadata includes
the distributed table on the workers) can be modified across
multiple connections.

With this change we guarantee that there is a single metadata connection.
But note that this connection can be used for any other operation.
In other words, this connection is not only reserved for metadata
operations.
2021-11-26 14:36:28 +01:00
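
The guarantee described in d405993b57 above (one metadata connection per node within a transaction) can be illustrated with a small Python model. This is only a sketch of the idea, not Citus's C implementation: `MetadataConnectionCache`, `open_connection`, and the node names in the usage note are hypothetical.

```python
from typing import Callable, Dict


class MetadataConnectionCache:
    """Hands out at most one metadata connection per node (toy model)."""

    def __init__(self, open_connection: Callable[[str], object]) -> None:
        # open_connection is a hypothetical factory; a real system would
        # open a network connection to the worker node here.
        self._open_connection = open_connection
        self._by_node: Dict[str, object] = {}

    def get(self, node: str) -> object:
        """Return the node's metadata connection, opening it on first use."""
        if node not in self._by_node:
            self._by_node[node] = self._open_connection(node)
        return self._by_node[node]

    def reset(self) -> None:
        """Forget all connections, e.g. at the end of the transaction."""
        self._by_node.clear()
```

Calling `get("worker-1")` twice returns the same object, so every metadata change for that node in the transaction travels over one connection. As the commit notes, nothing stops other operations from using that connection too.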
Onder Kalaci 38b08ebde9 Generalize the error checks while removing node
The checks that prevent removing a node are very much reference
table centric. We are soon going to add the same checks for replicated
tables. So, make the checks generic such that:
	 (a) replicated tables fit naturally
	 (b) we can use the same checks in `citus_disable_node`.
2021-11-26 14:25:29 +01:00
Halil Ozan Akgul 91b377490b Fix multi_cluster_management fails for metadata syncing 2021-11-04 11:09:21 +03:00
Halil Ozan Akgul 9c9d4b5eeb Turn MX on by default 2021-10-08 18:17:21 +03:00
Halil Ozan Akgul 43d5853b6d Fixes function names in comments 2021-10-06 09:24:43 +03:00
Hanefi Onaldi 7e39c7ea83
Replace master with citus in logs and comments (#5210)
I replaced 

- master_add_node,
- master_add_inactive_node
- master_activate_node

with

- citus_add_node,
- citus_add_inactive_node
- citus_activate_node

respectively.
2021-08-26 11:31:17 +03:00
Ahmet Gedemenli 9e90894f21
Synchronize hasmetadata flag on mx workers (#5086)
* Synchronize hasmetadata flag on mx workers

* Switch to sequential execution

* Add test

* Use SetWorkerColumn

* Add test for stop_sync

* Remove usage of UpdateHasmetadataOnWorkersWithMetadata

* Remove MarkNodeMetadataSynced

* Fix test for metadatasynced

* Remove MarkNodeMetadataSynced

* Style

* Remove MarkNodeHasMetadata

* Remove UpdateDistNodeBoolAttr

* Refactor SetWorkerColumn

* Use SetWorkerColumnLocalOnly when setting up dependencies

* Use SetWorkerColumnLocalOnly in TriggerSyncMetadataToPrimaryNodes

* Style

* Make update command generator functions static

* Set metadatasynced before syncing

* Call SetWorkerColumn only if the sync is successful

* Try to sync all nodes

* Fix indexno

* Update metadatasynced locally first

* Break if a node fails to sync metadata

* Send worker commands optional

* Style & Rebase

* Add raiseOnError param to SetWorkerColumn

* Style

* Set metadatasynced for all metadata nodes

* Style

* Introduce SetWorkerColumnOptional

* Polish

* Style

* Dont send set command to not synced metadata nodes

* Style

* Polish

* Add test for stop_sync

* Add test for shouldhaveshards

* Add test for isactive flag

* Sort by placementid in the function verify_metadata

* Cover edge cases for failing nodes

* Add comments

* Add nodeport to isactive test

* Add warning if metadata out of sync

* Update warning message
2021-08-12 14:16:18 +03:00
Onder Kalaci 5f02d18ef8 transactional metadata sync for maintenance daemon
As we use the current user to sync the metadata to the nodes
with #5105 (and many other PRs), there is no reason that
prevents us from using the coordinated transaction for metadata syncing.

This commit also renames a few functions to reflect their actual
implementation.
2021-08-09 10:34:55 +02:00
Marco Slot c03729ad03 Only warn about reference tables when removing last node 2021-06-01 10:53:12 +02:00
Jelte Fennema b1cad26ebc Move CheckCitusVersion to the top of each function
Previously this was usually done after argument parsing. This can cause
SEGFAULTs if the number or type of arguments changes in a new version.
By checking that Citus version is correct before doing any argument
parsing we protect against these types of issues. Issues like this have
occurred in pg_auto_failover, so it's not just a theoretical issue.

The main reason why these calls were not at the top of functions is
really just historical. It was because in the past we didn't allow
statements before declarations. Thus having this check before the
argument parsing would have only been possible if we first declared all
variables.

In addition to moving existing CheckCitusVersion calls it also adds
these calls to rebalancer related functions (they were missing there).
2021-06-01 17:43:46 +02:00
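
The ordering argument in b1cad26ebc above can be shown with a toy Python analogy. It is a sketch under loud assumptions: the function names are hypothetical, the version strings are made up, and Python's `ValueError` merely stands in for the SEGFAULT that mis-parsed arguments can cause in C.

```python
def check_version(loaded: str, installed: str) -> None:
    """Raise a clear error when the loaded and installed versions differ."""
    if loaded != installed:
        raise RuntimeError(
            f"loaded version {loaded} does not match installed version {installed}"
        )


def parse_then_check(args: tuple, loaded: str, installed: str) -> tuple:
    """Old ordering: arguments are parsed before the version check."""
    name, port = args  # fails first if the argument count changed
    check_version(loaded, installed)
    return name, int(port)


def check_then_parse(args: tuple, loaded: str, installed: str) -> tuple:
    """New ordering: the version check runs before any argument parsing."""
    check_version(loaded, installed)
    name, port = args
    return name, int(port)
```

When the versions differ and the argument list has a different shape, `check_then_parse` reports the version mismatch, whereas `parse_then_check` fails while unpacking the arguments and never reaches the check.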
SaitTalhaNisanci 8c3f85692d
Not consider old placements when disabling or removing a node (#4960)
* Not consider old placements when disabling or removing a node

* update cluster test
2021-05-28 22:38:20 +02:00
Nils Dijk c91f8d8a15
Feature: localhost guc (#4836)
DESCRIPTION: introduce `citus.local_hostname` GUC for connections to the current node

Citus once in a while needs to connect to itself for some system operations. This used to be hardcoded to `localhost`. The hardcoded hostname causes some issues, for example in environments where `sslmode=verify-full` is required. It is not always desirable or even feasible to get `localhost` as an alt name on the certificate.

By introducing a GUC to use when connecting to the current instance, the user has more control over which network path is used and which hostname is required to be present in the server certificate.
2021-05-12 16:59:44 +02:00
Marco Slot f25de6a0e3 Try to return earlier in idempotent master_add_node 2021-03-02 21:22:47 +01:00
Onder Kalaci fc9a23792c COPY uses adaptive connection management on local node
With #4338, the executor is smart enough to failover to
local node if there is not enough space in max_connections
for remote connections.

For COPY, the logic is different. With #4034, we made COPY
work with the adaptive connection management slightly
differently. The cause of the difference is that COPY doesn't
know which placements are going to be accessed and hence needs
to get connections up-front.

Similarly, COPY decides to use local execution up-front.

With this commit, we change the logic for COPY on local nodes:

- Try to reserve a connection to local host. This logic follows
  the same logic (e.g., citus.local_shared_pool_size) as the
  executor because COPY also relies on TryToIncrementSharedConnectionCounter().
- If reservation to local node fails, switch to local execution.

Apart from this, if local execution is disabled, we follow the
exact same logic for multi-node Citus. It means that if we are
out of connections, we'd give an error.
2021-02-04 09:45:07 +01:00
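
The COPY decision described in fc9a23792c above can be modelled in a few lines of Python. This is a minimal sketch, not the actual C code: `SharedPool`, `try_reserve`, and `choose_copy_strategy` are hypothetical names, and the `limit` field only plays the role that `citus.local_shared_pool_size` has in the commit message.

```python
from dataclasses import dataclass


@dataclass
class SharedPool:
    """Toy stand-in for the shared connection counter of the local node."""

    limit: int       # hypothetical cap on connections to the local node
    in_use: int = 0  # connections currently reserved

    def try_reserve(self) -> bool:
        """Reserve one slot if any is free; never blocks."""
        if self.in_use < self.limit:
            self.in_use += 1
            return True
        return False


def choose_copy_strategy(pool: SharedPool, local_execution_enabled: bool) -> str:
    """Decide up-front how COPY reaches placements on the local node."""
    if pool.try_reserve():
        return "connection"
    if local_execution_enabled:
        # Reservation failed: fall back to local execution.
        return "local_execution"
    # Local execution disabled and no slot left: behave as in multi-node Citus.
    raise RuntimeError("out of connections and local execution is disabled")
```

With `SharedPool(limit=1)`, the first call reserves the only slot, a second call with local execution enabled falls back to local execution, and a call with local execution disabled raises an error.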