Commit Graph

7037 Commits (98d95a9b9d58c6db080321ffe3e2ab343672ab21)

Author SHA1 Message Date
Gürkan İndibay 9a0cdbf5af
Fixes granted by cascade/restrict statements for revoke (#7517)
DESCRIPTION: Fixes incorrect propagating of `GRANTED BY` and
`CASCADE/RESTRICT` clauses for `REVOKE` statements

There are two issues fixed in this PR
1. granted by statement will appear for revoke statements as well
2. revoke/cascade statement will appear after granted by

Since granted by statements does not appear in statements, this bug
hasn't been visible until now. However, after activating the granted by
statement for revoke, order problem arised and this issue was fixed
order problem for cascade/revoke as well
In summary, this PR provides usage of granted by statements properly now
with the correct order of statements.
We can verify the both errors, fixed with just single statement
REVOKE dist_role_3 from non_dist_role_3 granted by test_admin_role
cascade;
2024-02-19 15:44:21 +03:00
Onur Tirtir 74b55d0546
Enforce using werkzeug 2.3.7 for failure tests and update Postgres versions to latest minors (#7491)
Let's use version 2.3.7 to fix the following error as we do in docker
images created in https://github.com/citusdata/the-process/ repo.
```
ImportError: cannot import name 'url_quote' from 'werkzeug.urls' (/home/onurctirtir/.local/share/virtualenvs/regress-ffZKpSmO/lib/python3.9/site-packages/werkzeug/urls.py)
```

And changing werkzeug version required rebuilding Pipfile.lock file in
src/test/regress. Before updating this Pipfile.lock file, we want to
make sure that versions specified there don't break any tests. And to
ensure that this is the case,
https://github.com/citusdata/the-process/pull/155 synchronizes
requirements.txt file based on new Pipfile.lock and hence this PR
updates test image suffix accordingly.

Also, while updating https://github.com/citusdata/the-process/pull/155,
I also had to update Postgres versions to latest minors to make image
builds passing again and updating Postgres versions in images requires
updating Postgres versions in this repo too. While doing that, we also
update Postgres version used in devcontainer too.
2024-02-16 14:38:32 +00:00
eaydingol 15a3adebe8
Support SECURITY LABEL ON ROLE from any node (#7508)
DESCRIPTION: Propagates SECURITY LABEL ON ROLE statement from any node
2024-02-15 20:34:15 +03:00
Gürkan İndibay 59da0633bb
Fixes invalid grantor field parsing in grant role propagation (#7451)
DESCRIPTION: Resolves an issue that disrupts distributed GRANT
statements with the grantor option

In this issue 3 issues are being solved:
1.Correcting the erroneous appending of multiple granted by in the
deparser.
2Adding support for grantor (granted by) in grant role propagation.
3. Implementing grantor (granted by) support during the metadata sync
grant role propagation phase.

Limitations: Currently, the grantor must be created prior to the
metadata sync phase. During metadata sync, both the creation of the
grantor and the grants given by that role cannot be performed, as the
grantor role is not detected during the dependency resolution phase.

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2024-02-15 08:27:29 +00:00
Gürkan İndibay c665cb8af3
Adds changelog for 11.0.9,11.1.7,11.2.2,11.3.1,12.0.1,12.1.2 (#7507) 2024-02-14 08:40:28 +03:00
Ivan Vyazmitinov 2fae91c5df
Force LC_COLLATE=C for sort in check_gucs_are_alphabetically_sorted.sh (#7489)
Fixed gucs check, as described
[here](https://github.com/citusdata/citus/pull/7286#discussion_r1481049261)
2024-02-08 12:21:21 +01:00
Onur Tirtir 689c6897a4
Refactor CREATE / DROP database functions for better readability (#7486) 2024-02-08 01:55:50 +03:00
eaydingol f01c5f2593
Move remaining citus_internal functions (#7478)
Moves the following functions to the Citus internal schema: 

citus_internal_local_blocked_processes
citus_internal_global_blocked_processes
citus_internal_mark_node_not_synced
citus_internal_unregister_tenant_schema_globally
citus_internal_update_none_dist_table_metadata
citus_internal_update_placement_metadata
citus_internal_update_relation_colocation
citus_internal_start_replication_origin_tracking
citus_internal_stop_replication_origin_tracking
citus_internal_is_replication_origin_tracking_active


#7405

---------

Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
2024-02-07 16:58:17 +03:00
Filip Sedlák 6869b3ad10
Fail early when shard can't be safely moved to a new node (#7467)
DESCRIPTION: citus_move_shard_placement now fails early when shard
cannot be safely moved

The implementation is quite simplistic -
`citus_move_shard_placement(...)` will fail with an error if there's any
new node in the cluster that doesn't have reference tables yet.

It could have been finer-grained, i.e. erroring only when trying to move
a shard to an unitialized node. Looking at the related functions -
`replicate_reference_tables()` or `citus_rebalance_start()`, I think
it's acceptable behaviour. These other functions also treat "any"
unitialized node as a temporary anomaly.

Fixes #7426

---------

Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
2024-02-07 12:04:52 +00:00
Karina 9ff8436f14
Create directories and files with pg_file_create_mode and pg_dir_create_mode permissions (#7479)
Since Postgres commit da9b580d files and directories are supposed to
be created with pg_file_create_mode and pg_dir_create_mode permissions
when default permissions are expected.

This fixes a failure of one of the postgres tests:
If we create file add.conf containing
```
shared_preload_libraries='citus'
```
and run postgres tests
```
TEMP_CONFIG=/path/to/add.conf make installcheck -C src/bin/pg_ctl/
```
then 001_start_stop.pl fails with
```
.../data/base/pgsql_job_cache mode must be 0750
```
in the log.

In passing this also stops creating directories that we haven't used
since Citus 7.4

This change explicitely doesn't change permissions of certificates/keys
that we create.

---------

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
2024-02-07 12:48:31 +01:00
eaydingol 594cb6f274
Move more citus internal functions (#7473)
Moves the following functions:

 citus_internal_delete_colocation_metadata 
 citus_internal_delete_partition_metadata 
 citus_internal_delete_placement_metadata 
 citus_internal_delete_shard_metadata 
 citus_internal_delete_tenant_schema
2024-01-31 23:00:04 +03:00
eaydingol d05174093b
Move citus internal functions (#7470)
Move more functions to citus_internal schema, the list:

citus_internal_add_placement_metadata
citus_internal_add_shard_metadata
citus_internal_add_tenant_schema
citus_internal_adjust_local_clock_to_remote
citus_internal_database_command

#7405
2024-01-31 11:45:19 +00:00
Onur Tirtir 3ce731d497
Make multi_metadata_sync runnable via run_test.py (#7472) 2024-01-31 09:50:16 +00:00
Onur Tirtir 6f43d5c02f
Enhance technical README for DDL propagation (#7471) 2024-01-31 10:30:14 +01:00
Onur Tirtir 5aedec4242
Improve error message for recursive CTEs (#7407)
Fixes #2870
2024-01-30 15:12:48 +00:00
eaydingol f6ea619e27
Move citus internal functions (#7466)
Move the following functions from pg_catalog to citus_internal:

citus_internal_add_object_metadata
citus_internal_add_partition_metadata


#7405
2024-01-30 12:27:10 +03:00
Onur Tirtir 9c243d4477
Improve check_gucs_are_alphabetically_sorted.sh (#7460)
Apparently https://github.com/citusdata/citus/pull/7452 was not enough,
need to consider the GUC-like expressions only within
RegisterCitusConfigVariables
function.
2024-01-26 12:10:35 +00:00
eaydingol 5d673874f7
Move citus internal functions (#7456)
Move citus_internal_acquire_citus_advisory_object_class_lock and
citus_internal_add_colocation_metadata functions from pg_catalog to
citus_internal.

#7405
2024-01-26 11:46:05 +03:00
Onur Tirtir 24188959ed
Improve the script that sorts GUCs in alphabetical order (#7452)
Soon we will have occurrences of "citus.X" in shared_library_init.c that
are not part of GUC defs, so we need to use a more precise regular
expression.
2024-01-25 11:22:39 +03:00
eaydingol 542212c3d8
Make citus_internal schema public (#7450)
DESCRIPTION: Makes citus_internal schema public



#7405
2024-01-24 17:11:10 +03:00
Onur Tirtir 3de5601bcc
Replace LOCAL_HOST_NAME with LocalHostName (#7449)
The only usages of LOCAL_HOST_NAME were in functions that are only used
during regression tests and in places where it was used incorrectly.
2024-01-24 13:50:39 +00:00
Onur Tirtir 1d096df7f4
Not use hardcoded LOCAL_HOST_NAME but citus.local_hostname to distinguish loopback connections (#7436)
Fixes a bug that breaks queries from non-maindbs when
citus.local_hostname is set to a value different than "localhost".

This is a very old bug doesn't cause a problem as long as Citus catalog
is available to FindWorkerNode(). And the catalog is always available
unless we're in non-main database, which might be the case on main but
not on older releases, hence not adding a `DESCRIPTION`. For this
reason, I don't see a reason to backport this.

Maybe we should totally refrain using LOCAL_HOST_NAME in all code-paths,
but not doing that in this PR as the other paths don't seem to be
breaking something that is user-facing.

```c
char *
GetAuthinfo(char *hostname, int32 port, char *user)
{
	char *authinfo = NULL;
	bool isLoopback = (strncmp(LOCAL_HOST_NAME, hostname, MAX_NODE_LENGTH) == 0 &&
					   PostPortNumber == port);

	if (IsTransactionState())
	{
		int64 nodeId = WILDCARD_NODE_ID;

		/* -1 is a special value for loopback connections (task tracker) */
		if (isLoopback)
		{
			nodeId = LOCALHOST_NODE_ID;
		}
		else
		{
			WorkerNode *worker = FindWorkerNode(hostname, port);
			if (worker != NULL)
			{
				nodeId = worker->nodeId;
			}
		}

		authinfo = GetAuthinfoViaCatalog(user, nodeId);
	}

	return (authinfo != NULL) ? authinfo : "";
}
```
2024-01-24 12:58:55 +00:00
Filip Sedlák 8b48d6ab02
Log username in the failed connection message (#7432)
This patch includes the username in the reported error message.
This makes debugging easier when certain commands open connections
as other users than the user that is executing the command.

```
monitora_snapshot=# SELECT citus_move_shard_placement(102030, 'monitora.db-dev-worker-a', 6005, 'monitora.db-dev-worker-a', 6017);
ERROR:  connection to the remote node monitora_user@monitora.db-dev-worker-a:6017 failed with the following error: fe_sendauth: no password supplied
Time: 40,198 ms
```
2024-01-24 11:24:23 +00:00
Halil Ozan Akgül 1cb2e1e4e8
Fixes create user queries from Citus non-main databases with other users (#7442)
This PR makes the connections to other nodes for
`mark_object_distributed` use the same user as
`execute_command_on_remote_nodes_as_user` so they'll use the same
connection.
2024-01-24 12:57:54 +03:00
Gokhan Gulbiz 3ffb831beb
Update contributing docs (#7447)
This is a minor change to use a generic name instead of our legacy CI
provider name in the contributing documentation.
2024-01-24 09:50:49 +01:00
Gürkan İndibay 863713e9b7
Refactors ExtendedTaskList methods (#7372)
ExecuteTaskListIntoTupleDestWithParam and ExecuteTaskListIntoTupleDest
are nearly the same. I parameterized and a made a reusable structure
here

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2024-01-24 06:00:19 +00:00
Teja Mupparti 11d7c27352 Fix assertions in other PG versions too, the original fix is in PR-7379 2024-01-23 15:10:06 -08:00
Jelte Fennema-Nio 9683bef2ec
Replace more spurious strdups with pstrdups (#7441)
DESCRIPTION: Remove a few small memory leaks

In #7440 one instance of a strdup was removed. But there were a few
more. This removes the ones that are left over, or adds a comment why
strdup is on purpose.
2024-01-23 13:28:26 +01:00
Marco Slot 72fbea20c4
Replace spurious strdup with pstrdup (#7440)
Not sure why we never found this using valgrind, but using strdup will
cause memory leaks because the pointer is not tracked in a memory
context.
2024-01-23 11:55:03 +01:00
eaydingol ee11492a0e
Generate qualified relation name (#7427)
This change refactors the code by using generate_qualified_relation_name
from id instead of using a sequence of functions to generate the
relation name.


Fixes #6602
2024-01-22 17:32:49 +03:00
zhjwpku 4b295cc857
Simplify CitusNewNode (#7434)
postgres refactored newNode() in PG 17, the main point for doing this is
the original tricks is no longer neccessary for modern compilers[1].

This does the same for Citus.

This should have no backward compatibility issues since it just replaces
palloc0fast with palloc0.

This is good for forward compatibility since palloc0fast no longer
exists in PG 17.

[1]
https://www.postgresql.org/message-id/b51f1fa7-7e6a-4ecc-936d-90a8a1659e7c@iki.fi
2024-01-22 14:55:14 +01:00
Jelte Fennema-Nio 14ecebe47c
Fix problems with make check (#7433)
This fixes two problems:
1. Allow `make check -j20` to work, by disabling parallelism. This was
   reported by a user in #7432
2. Actually run all the tests by forwarding to `make check` instead of
   `check-full`, because confusingly `check-full` does not run all the
   tests.
2024-01-19 17:11:29 +01:00
Gürkan İndibay 188614512f
Adds comment on database and role propagation (#7388)
DESCRIPTION: Adds comment on database and role propagation.
Example commands are as below

comment on database <db_name> is '<comment_text>'
comment on database <db_name> is NULL
comment on role <role_name> is '<comment_text>'
comment on role <role_name> is NULL

---------

Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
2024-01-18 20:58:44 +03:00
Jelte Fennema-Nio 5ec056a172
Add pytest test example about connecting to a worker (#7386)
I noticed while reviewing #7203 that there as no example of executing
sql on a worker for the pytest README. Since this is a pretty common
thing that people want to do, this PR adds that.
2024-01-18 15:05:24 +03:00
Jelte Fennema-Nio fcfedff8d1
Support running isolation_update_node in flaky test detection (#7425)
I noticed in #7423 that `isolation_update_node` could not be run using
flaky test detection. This fixes that.
2024-01-17 15:36:26 +00:00
Valery 6cf6cf37fd
Adds information to explain output when using citus.explain_distributed_queries=false (#7412)
Fixes https://github.com/citusdata/citus/issues/6490
2024-01-17 15:04:42 +00:00
zhjwpku 51e607878b
remove a duplicate forward declaration and polish some comments (#7371)
remove a duplicate forward declaration and polish some comments

Signed-off-by: Zhao Junwang <zhjwpku@gmail.com>
2024-01-17 14:30:23 +00:00
Karina 21464adfec
Make isolation_update_node test system independent (#7423)
Test isolation_update_node fails on some systems with the following error:
```
-s2: WARNING:  connection to the remote node non-existent:57637 failed with the following error: could not translate host name "non-existent" to address: Name or service not known
+s2: WARNING:  connection to the remote node non-existent:57637 failed with the following error: could not translate host name "non-existent" to address: Temporary failure in name resolution
```

This slightly modifies an already existing [normalization
rule](739c6d26df/src/test/regress/bin/normalize.sed (L217-L218))
to fix it.

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
2024-01-17 13:39:07 +00:00
Onur Tirtir 04b374fc01
Fix upgrade tests (#7413)
Adding upgrade_basic_before_non_mixed.sql file because while
upgrade_basic_after_non_mixed exist, its before variation didn't exist
as we don't have any "before" steps. However, run_test.py assumes that
all "after" files do have a "before" variation as well. So this PR adds
an empty upgrade_basic_before_non_mixed.sql file.

Also, given that we don't have such a version called as 12.1devel
anymore, change it to 12.1.1.

And finally, let CI skip testing flakyness for upgrade tests both
because it's quite hard to get flaky-test-detection job working for
upgrade tests and also because in the end it is not much useful to test
upgrade tests against flakyness.
2024-01-16 12:37:18 +00:00
Halil Ozan Akgül 739c6d26df
Fix inserting to pg_dist_object for queries from other nodes (#7402)
Running a query from a Citus non-main database that inserts to
pg_dist_object requires a new connection to the main database itself.
This PR adds that connection to the main database.

---------

Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl>
2024-01-11 16:05:14 +03:00
Teja Mupparti 00068e07c5 Fix the incorrect column count after ALTER TABLE, this fixes the bug #7378 (please read the analysis in the bug for more information) 2024-01-10 12:49:44 -08:00
LightDB Enterprise Postgres 9a91136a3d
Fix timeout when underlying socket is changed in a MultiConnection (#7377)
When there are multiple localhost entries in /etc/hosts like following
/etc/hosts:
```
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
127.0.0.1   localhost
```

multi_cluster_management check will failed:
```

@@ -857,20 +857,21 @@
 ERROR:  group 14 already has a primary node
 -- check that you can add secondaries and unavailable nodes to a group
 SELECT groupid AS worker_2_group FROM pg_dist_node WHERE nodeport = :worker_2_port \gset
 SELECT 1 FROM master_add_node('localhost', 9998, groupid => :worker_1_group, noderole => 'secondary');
  ?column?
 ----------
         1
 (1 row)

 SELECT 1 FROM master_add_node('localhost', 9997, groupid => :worker_1_group, noderole => 'unavailable');
+WARNING:  could not establish connection after 5000 ms
  ?column?
 ----------
         1
 (1 row)
```

This actually isn't just a problem in test environments, but could occur
as well during actual usage when a hostname in pg_dist_node
resolves to multiple IPs and one of those IPs is unreachable.
Postgres will then automatically continue with the next IP, but
Citus should listen for events on the new socket. Not on the
old one.

Co-authored-by: chuhx43211 <chuhx43211@hundsun.com>
2024-01-10 10:49:53 +00:00
zhjwpku 8e979f7ac6
[performance improvement] remove duplicate LoadShardList call (#7380)
LoadShardList is called twice, which is not neccessary, and there is no
need to sort the shard placement list since we only want to know the list
length.
2024-01-10 11:15:19 +01:00
Onur Tirtir 1d55debb98
Support CREATE / DROP database commands from any node (#7359)
DESCRIPTION: Adds support for issuing `CREATE`/`DROP` DATABASE commands
from worker nodes

With this commit, we allow issuing CREATE / DROP DATABASE commands from
worker nodes too.
As in #7278, this is not allowed when the coordinator is not added to
metadata because we don't ever sync metadata changes to coordinator
when adding coordinator to the metadata via
`SELECT citus_set_coordinator_host('<hostname>')`, or equivalently, via
`SELECT citus_add_node(<coordinator_node_name>, <coordinator_node_port>, 0)`.

We serialize database management commands by acquiring a Citus specific
advisory lock on the first primary worker node if there are any workers in the
cluster. As opposed to what we've done in https://github.com/citusdata/citus/pull/7278
for role management commands, we try to avoid from running into distributed deadlocks
as much as possible. This is because, while distributed deadlocks that can happen around
role management commands can be detected by Citus, this is not the case for database
management commands because most of them cannot be run inside in a transaction block.
In that case, Citus cannot even detect the distributed deadlock because the command is not
part of a distributed transaction at all, then the command execution might not return the
control back to the user for an indefinite amount of time.
2024-01-08 16:47:49 +00:00
Karina 20dc58cf5d
Fix getting heap tuple size (#7387)
This fixes #7230. 

First of all, using HeapTupleHeaderGetDatumLength(heapTuple) is
definetly wrong, it gives a number that's 4 times less than the correct
tuple size (heapTuple.t_len). See

https://github.com/postgres/postgres/blob/REL_16_0/src/include/access/htup_details.h#L455-L456

https://github.com/postgres/postgres/blob/REL_16_0/src/include/varatt.h#L279

https://github.com/postgres/postgres/blob/REL_16_0/src/include/varatt.h#L225-L226

When I fixed it, the limit_intermediate_size test failed, so I tried to
understand what's going on there. In original commit fd546cf these
queries were supposed to fail. Then in b3af63c three of the queries that
were supposed to fail suddenly worked and tests were changed to pass
without understanding why the output had changed or how to keep test
testing what it had to test. Even comments saying that these queries
should fail were left untouched. Commit message gives no clue about why
exactly test has changed:

> It seems that when we use adaptive executor instead of task tracker,
we
> exceed the intermediate result size less in the test. Therefore
updated
> the tests accordingly.

Then 3fda2c3 also blindly raised the limit for one of the queries to
keep it working:


3fda2c3254 (diff-a9b7b617f9dfd345318cb8987d5897143ca1b723c87b81049bbadd94dcc86570R19)

When in fe3caf3 that HeapTupleHeaderGetDatumLength(heapTuple) call was
finally added, one of those test queries became failing again.

The other two of them now also failing after the fix. I don't understand
how exactly the calculation of "intermediate result size" that is
limited by citus.max_intermediate_result_size had changed through
b3af63c and fe3caf3, but these numbers are now closer to what
they originally were when this limitation was added in
fd546cf. So these queries should fail, like in the original
version of the limit_intermediate_size test.

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
2024-01-08 17:09:30 +01:00
Onur Tirtir 968ac74cde
Fix foreign_key_to_reference_shard_rebalance test (#7400)
foreign_key_to_reference_shard_rebalance failed because partition of
2024 year does not exist, fixed by add default partition.

Replaces https://github.com/citusdata/citus/pull/7396 by adding a rule
that allows properly testing foreign_key_to_reference_shard_rebalance
via run_test.py.

Closes #7396

Co-authored-by: chuhx <148182736+cstarc1@users.noreply.github.com>
2024-01-04 13:16:45 +01:00
Onur Tirtir d940cfa992
Do nothing if the database is not distributed (#7392)
Fixes the remaining cases reported in
https://github.com/citusdata/citus/issues/7370.
2024-01-03 17:03:06 +03:00
Gürkan İndibay c3579eef06
Adds REASSIGN OWNED BY propagation (#7319)
DESCRIPTION: Adds REASSIGN OWNED BY propagation

This pull request introduces the propagation of the "Reassign owned by"
statement. It accommodates both local and distributed roles for both the
old and new assignments. However, when the old role is a local role, it
undergoes filtering and is not propagated. On the other hand, if the new
role is a local role, the process involves first creating the role on
worker nodes before propagating the "Reassign owned" statement.
2023-12-28 15:15:58 +03:00
Gürkan İndibay 181b8ab6d5
Adds additional alter database propagation support (#7253)
DESCRIPTION: Adds database connection limit, rename and set tablespace
propagation
In this PR, below statement propagations are added

alter database <database_name> with allow_connections = <boolean_value>;
alter database <database_name> rename to <database_name2>;
alter database <database_name> set TABLESPACE <table_space_name>

---------

Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl>
Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2023-12-26 14:55:04 +03:00
Halil Ozan Akgül b877d606c7
Adds 2PC distributed commands from other databases (#7203)
DESCRIPTION: Adds support for 2PC from non-Citus main databases

This PR only adds support for `CREATE USER` queries, other queries need
to be added. But it should be simple because this PR creates the
underlying structure.

Citus main database is the database where the Citus extension is
created. A non-main database is all the other databases that are in the
same node with a Citus main database.

When a `CREATE USER` query is run on a non-main database we:

1. Run `start_management_transaction` on the main database. This
function saves the outer transaction's xid (the non-main database
query's transaction id) and marks the current query as main db command.
2. Run `execute_command_on_remote_nodes_as_user("CREATE USER
<username>", <username to run the command>)` on the main database. This
function creates the users in the rest of the cluster by running the
query on the other nodes. The user on the current node is created by the
query on the outer, non-main db, query to make sure consequent commands
in the same transaction can see this user.
3. Run `mark_object_distributed` on the main database. This function
adds the user to `pg_dist_object` in all of the nodes, including the
current one.

This PR also implements transaction recovery for the queries from
non-main databases.
2023-12-22 19:19:41 +03:00