citus

Commit Graph

Author	SHA1	Message	Date
Hanefi Onaldi	5c6069a74a	Do not rely on fk cache when truncating local data (#5018 )	2021-06-07 11:56:48 +03:00
Marco Slot	e81d25a7be	Refactor RelationIsAKnownShard to remove onlySearchPath argument	2021-06-02 14:30:27 +02:00
Ahmet Gedemenli	089ef35940	Disable dropping and truncating known shards Add test for disabling dropping and truncating known shards	2021-06-02 14:30:27 +02:00
Jelte Fennema	1a83628195	Use "orphaned shards" naming in more places We were not very consistent in how we named these shards.	2021-06-04 11:39:19 +02:00
Jelte Fennema	3f60e4f394	Add ExecuteCriticalCommandInDifferentTransaction function We use this pattern multiple times throughout the codebase now. Seems like a good moment to abstract it away.	2021-06-04 11:30:27 +02:00
Jelte Fennema	503c70b619	Cleanup orphaned shards before moving when necessary A shard move would fail if there was an orphaned version of the shard on the target node. With this change before actually fail, we try to clean up orphaned shards to see if that fixes the issue.	2021-06-04 11:23:07 +02:00
Jelte Fennema	280b9ae018	Cleanup orphaned shards at the start of a rebalance In case the background daemon hasn't cleaned up shards yet, we do this manually at the start of a rebalance.	2021-06-04 11:23:07 +02:00
Jelte Fennema	7015049ea5	Add citus_cleanup_orphaned_shards UDF Sometimes the background daemon doesn't cleanup orphaned shards quickly enough. It's useful to have a UDF to trigger this removal when needed. We already had a UDF like this but it was only used during testing. This exposes that UDF to users. As a safety measure it cannot be run in a transaction, because that would cause the background daemon to stop cleaning up shards while this transaction is running.	2021-06-04 11:23:07 +02:00
Naisila Puka	0f37ab5f85	Fixes column default coming from a sequence (#4914 ) * Add user-defined sequence support for MX * Remove default part when propagating to workers * Fix ALTER TABLE with sequences for mx tables * Clean up and add tests * Propagate DROP SEQUENCE * Removing function parts * Propagate ALTER SEQUENCE * Change sequence type before propagation & cleanup * Revert "Propagate ALTER SEQUENCE" This reverts commit 2bef64c5a29f4e7224a7f43b43b88e0133c65159. * Ensure sequence is not used in a different column with different type * Insert select tests * Propagate rename sequence stmt * Fix issue with group ID cache invalidation * Add ALTER TABLE ALTER COLUMN TYPE .. precaution * Fix attnum inconsistency and add various tests * Add ALTER SEQUENCE precaution * Remove Citus hook * More tests Co-authored-by: Marco Slot <marco.slot@gmail.com>	2021-06-03 23:02:09 +03:00
Marco Slot	c03729ad03	Only warn about reference tables when removing last node	2021-06-01 10:53:12 +02:00
Hanefi Onaldi	fa29d6667a	Accept invalidation before fk graph validity check (#5017 ) InvalidateForeignKeyGraph sends an invalidation via shared memory to all backends, including the current one. However, we might not call AcceptInvalidationMessages before reading from the cache below. It would be better to also add a call to AcceptInvalidationMessages in IsForeignConstraintRelationshipGraphValid.	2021-06-02 14:45:35 +03:00
Ahmet Gedemenli	103cf34418	Sort GUCs in alphabetical order	2021-06-02 12:52:18 +03:00
Jelte Fennema	b1cad26ebc	Move CheckCitusVersion to the top of each function Previously this was usually done after argument parsing. This can cause SEGFAULTs if the number or type of arguments changes in a new version. By checking that Citus version is correct before doing any argument parsing we protect against these types of issues. Issues like this have occurred in pg_auto_failover, so it's not just a theoretical issue. The main reason why these calls were not at the top of functions is really just historical. It was because in the past we didn't allow statements before declarations. Thus having this check before the argument parsing would have only been possible if we first declared all variables. In addition to moving existing CheckCitusVersion calls it also adds these calls to rebalancer related functions (they were missing there).	2021-06-01 17:43:46 +02:00
Jelte Fennema	4c20bf7a36	Remove pg_dist_rebalence_strategy_enterprise_check (#5014 ) This is not necessary anymore now that the rebalancer is open source.	2021-06-01 06:16:46 -07:00
Ahmet Gedemenli	69d39c0e8b	Fix relname null bug when parallel execution	2021-06-01 14:14:35 +03:00
Ahmet Gedemenli	9638933d9d	Remove function GenerateNewTargetEntriesForSortClauses	2021-06-01 12:35:36 +03:00
Onur Tirtir	94f30a0428	Refactor index check in ColumnarProcessUtility	2021-06-01 11:12:28 +03:00
Jelte Fennema	3271f1bd13	Fix data race in get_rebalance_progress (#5008 ) To be able to report progress of the rebalancer, the rebalancer updates the state of a shard move in a shared memory segment. To then fetch the progress, `get_rebalance_progress` can be called which reads this shared memory. Without this change it did so without using any synchronization primitives, allowing for data races. This fixes that by using atomic operations to update and read from the parts of the shared memory that can be changed after initialization.	2021-05-31 15:27:32 +02:00
SaitTalhaNisanci	8c3f85692d	Not consider old placements when disabling or removing a node (#4960 ) * Not consider old placements when disabling or removing a node * update cluster test	2021-05-28 22:38:20 +02:00
SaitTalhaNisanci	a20cc3b36a	Only consider shard state 1 in citus shards (#4970 )	2021-05-28 11:33:48 +03:00
SaitTalhaNisanci	a4944a2102	Rename CoordinatedTransactionShouldUse2PC (#4995 )	2021-05-21 18:57:42 +03:00
Hanefi Onaldi	c160325d07	Use streaming replication when repl factor = 1	2021-05-21 16:14:59 +03:00
Hanefi Onaldi	878513f325	Remove all occurences of replication_model GUC	2021-05-21 16:14:59 +03:00
SaitTalhaNisanci	87e3a5e24a	Use 2PC when using a node connection (#4997 )	2021-05-21 14:58:53 +03:00
SaitTalhaNisanci	82f34a8d88	Enable citus.defer_drop_after_shard_move by default (#4961 ) Enable citus.defer_drop_after_shard_move by default	2021-05-21 10:48:32 +03:00
Nils Dijk	d7dd247fb5	fix shared dependencies that are not resident in a database (#4992 ) DESCRIPTION: fix shared dependencies that are not resident in a database eg. databases depend on users (their owners) that both don’t have a database they reside in. These dependencies are recorded in pg_shdepend with a `dbid` of `InvalidOid` When we fetch our shared dependencies we don’t take these links in account. With this patch we use logic inspired by `classIdGetDbId` to decide when to use `MyDatabaseId` vs `InvalidOid` to correctly resolve dependencies between shared objects.	2021-05-20 08:55:02 -07:00
Jelte Fennema	10f06ad753	Fetch shard size on the fly for the rebalance monitor Without this change the rebalancer progress monitor gets the shard sizes from the `shardlength` column in `pg_dist_placement`. This column needs to be updated manually by calling `citus_update_table_statistics`. However, `citus_update_table_statistics` could lead to distributed deadlocks while database traffic is on-going (see #4752). To work around this we don't use `shardlength` column anymore. Instead for every rebalance we now fetch all shard sizes on the fly. Two additional things this does are: 1. It adds tests for the rebalance progress function. 2. If a shard move cannot be done because a source or target node is unreachable, then we error in stop the rebalance, instead of showing a warning and continuing. When using the by_disk_size rebalance strategy it's not safe to continue with other moves if a specific move failed. It's possible that the failed move made space for the next move, and because the failed move never happened this space now does not exist. 3. Adds two new columns to the result of `get_rebalancer_progress` which shows the size of the shard on the source and target node. Fixes #4930	2021-05-20 16:38:17 +02:00
Nils Dijk	a6c2d2a4c4	Feature: alter database owner (#4986 ) DESCRIPTION: Add support for ALTER DATABASE OWNER This adds support for changing the database owner. It achieves this by marking the database as a distributed object. By marking the database as a distributed object it will look for its dependencies and order the user creation commands (enterprise only) before the alter of the database owner. This is mostly important when adding new nodes. By having the database marked as a distributed object it can easily understand for which `ALTER DATABASE ... OWNER TO ...` commands to propagate by resolving the object address of the database and verifying it is a distributed object, and hence should propagate changes of owner ship to all workers. Given the ownership of the database might have implications on subsequent commands in transactions we force sequential mode for transactions that have a `ALTER DATABASE ... OWNER TO ...` command in them. This will fail the transaction with meaningful help when the transaction already executed parallel statements. By default the feature is turned off since roles are not automatically propagated, having it turned on would cause hard to understand errors for the user. It can be turned on by the user via setting the `citus.enable_alter_database_owner`.	2021-05-20 13:27:44 +02:00
Onder Kalaci	d07db99ea4	Make sure that target node in shard moves is eligable for shard move	2021-05-20 10:51:01 +02:00
Onder Kalaci	926069a859	Wait until all connections are successfully established Comment from the code: /* * Iterate until all the tasks are finished. Once all the tasks * are finished, ensure that that all the connection initializations * are also finished. Otherwise, those connections are terminated * abruptly before they are established (or failed). Instead, we let * the ConnectionStateMachine() to properly handle them. * * Note that we could have the connections that are not established * as a side effect of slow-start algorithm. At the time the algorithm * decides to establish new connections, the execution might have tasks * to finish. But, the execution might finish before the new connections * are established. / Note that the abruptly terminated connections lead to the following errors: 2020-11-16 21:09:09.800 CET [16633] LOG: could not accept SSL connection: Connection reset by peer 2020-11-16 21:09:09.872 CET [16657] LOG: could not accept SSL connection: Undefined error: 0 2020-11-16 21:09:09.894 CET [16667] LOG: could not accept SSL connection: Connection reset by peer To easily reproduce the issue: - Create a single node Citus - Add the coordinator to the metadata - Create a distributed table with shards on the coordinator - f.sql: select count() from test; - pgbench -f /tmp/f.sql postgres -T 12 -c 40 -P 1 or pgbench -f /tmp/f.sql postgres -T 12 -c 40 -P 1 -C	2021-05-19 15:59:13 +02:00
Onder Kalaci	995adf1a19	Executor takes connection establishment and task execution costs into account With this commit, the executor becomes smarter about refrain to open new connections. The very basic example is that, if the connection establishments take 1000ms and task executions as 5 msecs, the executor becomes smart enough to not establish new connections.	2021-05-19 15:48:07 +02:00
Onder Kalaci	28b0b4ebd1	Move slow start increment to generic place	2021-05-19 14:31:20 +02:00
Marco Slot	715dce1eea	Reduce local insert memory usage during deparsing	2021-05-18 16:11:43 +02:00
Marco Slot	644b266dee	Only cache local plans when reusing a distributed plan	2021-05-18 16:11:43 +02:00
Marco Slot	00792831ad	Add execution memory contexts and free after local query execution	2021-05-18 16:11:43 +02:00
SaitTalhaNisanci	ff2a125a5b	Lookup hostname before execution (#4976 ) We lookup the hostname just before the execution so that even if there are cached entries in the prepared statement cache we use the updated entries.	2021-05-18 16:46:31 +03:00
SaitTalhaNisanci	eaa7d2bada	Not block maintenance daemon (#4972 ) It was possible to block maintenance daemon by taking an SHARE ROW EXCLUSIVE lock on pg_dist_placement. Until the lock is released maintenance daemon would be blocked. We should not block the maintenance daemon under any case hence now we try to get the pg_dist_placement lock without waiting, if we cannot get it then we don't try to drop the old placements.	2021-05-17 03:22:35 -07:00
Nils Dijk	c91f8d8a15	Feature: localhost guc (#4836 ) DESCRIPTION: introduce `citus.local_hostname` GUC for connections to the current node Citus once in a while needs to connect to itself for some systems operations. This used to be hardcoded to `localhost`. The hardcoded hostname causes some issues, for example in environments where `sslmode=verify-full` is required. It is not always desirable or even feasible to get `localhost` as an alt name on the certificate. By introducing a GUC to use when connecting to the current instance the user has more control what network path is used and what hostname is required to be present in the server certificate.	2021-05-12 16:59:44 +02:00
Jelte Fennema	cbbd10b974	Implement an improvement threshold in the rebalancer (#4927 ) Every move in the rebalancer algorithm results in an improvement in the balance. However, even if the improvement in the balance was very small the move was still chosen. This is especially problematic if the shard itself is very big and the move will take a long time. This changes the rebalancer algorithm to take the relative size of the balance improvement into account when choosing moves. By default a move will not be chosen if it improves the balance by less than half of the size of the shard. An extra argument is added to the rebalancer functions so that the user can decide to lower the default threshold if the ignored move is wanted anyway.	2021-05-11 14:24:59 +02:00
Onder Kalaci	cc4870a635	Remove wrong PG_USED_FOR_ASSERTS_ONLY	2021-05-11 12:58:37 +02:00
Onder Kalaci	a231ff29b0	Get prepared for some improvements for online rebalancer To see all the changes, see https://github.com/citusdata/citus-enterprise/pull/586/files	2021-05-10 19:54:31 +02:00
jeff-davis	7b9aecff21	Columnnar: metapage changes. (#4907 ) * Columnar: introduce columnar storage API. This new API is responsible for the low-level storage details of columnar; translating large reads and writes into individual block reads and writes that respect the page headers and emit WAL. It's also responsible for the columnar metapage, resource reservations (stripe IDs, row numbers, and data), and truncation. This new API is not used yet, but will be used in subsequent forthcoming commits. * Columnar: add columnar_storage_info() for debugging purposes. * Columnar: expose ColumnarMetadataNewStorageId(). * Columnar: always initialize metapage at creation time. This avoids the complexity of dealing with tables where the metapage has not yet been initialized. * Columnar: columnar storage upgrade/downgrade UDFs. Necessary upgrade/downgrade step so that new code doesn't see an old metapage. * Columnar: improve metadata.c comment. * Columnar: make ColumnarMetapage internal to the storage API. Callers should not have or need direct access to the metapage. * Columnar: perform resource reservation using storage API. * Columnar: implement truncate using storage API. * Columnar: implement read/write paths with storage API. * Columnar: add storage tests. * Revert "Columnar: don't include stripe reservation locks in lock graph." This reverts commit `c3dcd6b9f8`. No longer needed because the columnar storage API takes care of concurrency for resource reservation. * Columnar: remove unnecessary lock when reserving. No longer necessary because the columnar storage API takes care of concurrent resource reservation. * Add simple upgrade tests for storage/ branch * fix multi_extension.out Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2021-05-10 20:16:46 +03:00
SaitTalhaNisanci	5a941814fd	Close connection after each shard move (#4967 )	2021-05-10 16:57:19 +03:00
Ahmet Gedemenli	8cb505d6e1	Fix matview access method change issue (#4959 ) * Fix matview access method change issue * Use pg function get_am_name * Split view generation command into pieces	2021-05-07 15:47:24 +03:00
SaitTalhaNisanci	6b1904d37a	When moving a shard to a new node ensure there is enough space (#4929 ) * When moving a shard to a new node ensure there is enough space * Add WairForMiliseconds time utility * Add more tests and increase readability * Remove the retry loop and use a single udf for disk stats * Address review * address review Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>	2021-05-06 17:28:02 +03:00
Ahmet Gedemenli	bc818e76e2	Add notice log message for skipping child tables for optimization	2021-05-06 16:49:37 +03:00
Ahmet Gedemenli	2e0bb5c0c8	Fix nested select query with union bug	2021-05-05 20:35:00 +03:00
Jelte Fennema	50357db957	Simplify code that tests the shard rebalancer algorithm (#4925 ) This modifies the test code to use sane defaults instead of requiring all values to be specified in the test.	2021-05-03 15:47:19 +02:00
Jelte Fennema	2f29d4e53e	Continue to remove shards after first failure in DropMarkedShards The comment of DropMarkedShards described the behaviour that after a failure we would continue trying to drop other shards. However the code did not do this and would stop after the first failure. Instead of simply fixing the comment I fixed the code, because the described behaviour is more useful. Now a single shard that cannot be removed yet does not block others from being removed.	2021-04-30 15:42:09 +03:00
Sait Talha Nisanci	8cabd2e822	Decrease memory usage with rebalancer We decrease memory usage by: - Freeing temporary buffers - Using separate memory context for blocks that uses "small" amount of memory but can be repeated many times such as loops	2021-04-29 13:40:47 +03:00
Hanefi Onaldi	2f90ce931b	Fix minor issues with makefile targets (#4717 )	2021-04-28 15:46:55 +03:00
Marco Slot	4b49cb112f	Fix FROM ONLY queries on partitioned tables	2021-04-27 16:10:07 +02:00
Ahmet Gedemenli	fe65be993e	Sort GUCs in alphabetic order	2021-04-26 15:05:42 +03:00
Ahmet Gedemenli	332c5ce4ad	Fix worker partitioned size functions (#4922 )	2021-04-26 10:29:46 +03:00
Onder Kalaci	918838e488	Allow constant VALUES clauses in pushdown queries As long as the VALUES clause contains constant values, we should not recursively plan the queries/CTEs. This is a follow-up work of #1805. So, we can easily apply OUTER join checks as if VALUES clause is a reference table/immutable function.	2021-04-21 14:28:08 +02:00
SaitTalhaNisanci	93c2dcf3d2	Fix data-race with concurrent calls of DropMarkedShards (#4909 ) * Fix problews with concurrent calls of DropMarkedShards When trying to enable `citus.defer_drop_after_shard_move` by default it turned out that DropMarkedShards was not safe to call concurrently. This could especially cause big problems when also moving shards at the same time. During tests it was possible to trigger a state where a shard that was moved would not be available on any of the nodes anymore after the move. Currently DropMarkedShards is only called in production by the maintenaince deamon. Since this is only a single process triggering such a race is currently impossible in production settings. In future changes we will want to call DropMarkedShards from other places too though. * Add some isolation tests Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>	2021-04-21 10:59:48 +03:00
Ahmet Gedemenli	33c620f232	Optimize partitioned disk size calculation (#4905 ) * Optimize partitioned disk size calculation * Polish * Fix test for citus_shard_cost_by_disk_size Try optimizing if not CSTORE	2021-04-19 13:30:56 +03:00
Onder Kalaci	5482d5822f	Keep more statistics about connection establishment times When DEBUG4 enabled, Citus now prints per connection establishment time.	2021-04-16 14:56:31 +02:00
Onder Kalaci	5b78f6cd63	Keep more execution statistics When DEBUG4 enabled, Citus now prints per task execution times.	2021-04-16 14:45:00 +02:00
Hanefi Onaldi	9919fbe3f8	Switch to sequential mode on long partition names This commit adds support for long partition names for distributed tables: - ALTER TABLE dist_table ATTACH PARTITION .. - CREATE TABLE .. PARTITION OF dist_table .. Note: create_distributed_table UDF does not support long table and partition names, and is not covered in this commit	2021-04-14 15:27:50 +03:00
Ahmet Gedemenli	e445e3d39c	Introduce 3 partitioned size udfs (#4899 ) * Introduce 3 partitioned size udfs * Add tests for new partition size udfs * Fix type incompatibilities * Convert UDFs into pure sql functions * Fix function comment	2021-04-13 17:36:27 +03:00
Onur Tirtir	fe5c985e1d	Remove HAS_TABLEAM config since we dropped pg11 support (#4862 ) * Remove HAS_TABLEAM config * Drop columnar_ensure_objects_exist * Not call columnar_ensure_objects_exist in citus_finish_pg_upgrade	2021-04-13 10:51:26 +03:00
Ahmet Gedemenli	d74d358a45	Refactor size queries with new enum SizeQueryType (#4898 ) * Refactor size queries with new enum SizeQueryType * Polish	2021-04-12 17:14:29 +03:00
SaitTalhaNisanci	b453563e88	Warm up connections params hash (#4872 ) ConnParams(AuthInfo and PoolInfo) gets a snapshot, which will block the remote connectinos to localhost. And the release of snapshot will be blocked by the snapshot. This leads to a deadlock. We warm up the conn params hash before starting a new transaction so that the entries will already be there when we start a new transaction. Hence GetConnParams will not get a snapshot.	2021-04-12 13:08:38 +03:00
Ahmet Gedemenli	caef0463b0	Update func comment for PostprocessCreateTableStmt	2021-04-09 13:41:59 +03:00
Ahmet Gedemenli	52e467a9a0	Error out if inheriting a distributed table (#4871 ) * Error out if inheriting a distributed table * Add test inheriting a distirbuted table	2021-04-07 11:21:06 +03:00
Ahmet Gedemenli	e4c4a9b683	Fix error message for local table joins (#4870 ) * Fix error message for local table joins * Fix error messages for regression tests expected outputs	2021-04-06 16:18:28 +03:00
Ahmet Gedemenli	840c879572	Remove redundant if statement for schema name	2021-04-06 10:29:17 +03:00
Halil Ozan Akgul	a5038046f9	Adds shard_count parameter to create_distributed_table	2021-03-29 16:22:49 +03:00
SaitTalhaNisanci	03832f353c	Drop postgres 11 support	2021-03-25 09:20:28 +03:00
Önder Kalacı	b5f4320164	Make sure that single task local executions start coordinated transaction (#4831 ) With https://github.com/citusdata/citus/pull/4806 we enabled 2PC for any non-read-only local task. However, if the execution is a single task, enabling 2PC (CoordinatedTransactionShouldUse2PC) hits an assertion as we are not in a coordinated transaction. There is no downside of using a coordinated transaction for single task local queries.	2021-03-17 12:20:57 +01:00
Ahmet Gedemenli	5e5db9eefa	Add udf citus_get_active_worker_nodes	2021-03-17 13:15:59 +03:00
Marco Slot	fbc2147e11	Replace MAX_PUT_COPY_DATA_BUFFER_SIZE by citus.remote_copy_flush_threshold GUC	2021-03-16 06:00:38 +01:00
Marco Slot	1646fca445	Add GUC to set maximum connection lifetime	2021-03-16 01:57:57 +01:00
Marco Slot	6c5d263b7a	Remove unnecessary AtEOXact_Files call	2021-03-15 09:34:02 +01:00
Onder Kalaci	e65e72130d	Rename use -> shouldUse Because setting the flag doesn't necessarily mean that we'll use 2PC. If connections are read-only, we will not use 2PC. In other words, we'll use 2PC only for connections that modified any placements.	2021-03-12 08:29:43 +00:00
Onder Kalaci	6a7ed7b309	Do not trigger 2PC for reads on local execution Before this commit, Citus used 2PC no matter what kind of local query execution happens. For example, if the coordinator has shards (and the workers as well), even a simple SELECT query could start 2PC: ```SQL WITH cte_1 AS (SELECT * FROM test LIMIT 10) SELECT count(*) FROM cte_1; ``` In this query, the local execution of the shards (and also intermediate result reads) triggers the 2PC. To prevent that, Citus now distinguishes local reads and local writes. And, Citus switches to 2PC only if a modification happens. This may still lead to unnecessary 2PCs when there is a local modification and remote SELECTs only. Though, we handle that separately via #4587.	2021-03-12 08:29:43 +00:00
Onur Tirtir	874d5fd962	Remove foreign keys between columnar metadata tables (#4791 ) Postgres keeps AFTER trigger state for each transaction, because we can have deferred AFTER triggers which will be fired at the end of a transaction. Postgres cleans up this state at the end of transaction. Postgres processes ON COMMIT triggers after cleaning-up the AFTER trigger states. So if we fire any triggers in ON COMMIT, the AFTER trigger state won't be cleaned-up properly and the transaction state will be left in an inconsistent state, which might result in assertion failure. So with this commit, we remove foreign keys between columnar metadata tables and enforce constraints between them manually when dropping columnar tables.	2021-03-12 11:28:17 +03:00
Naisila Puka	71a9f45513	Fix upgrade and downgrade paths for master/citus_update_table_statistics (#4805 )	2021-03-11 14:52:40 +03:00
Naisila Puka	196064836c	Skip 2PC for readonly connections in a transaction (#4587 ) * Skip 2PC for readonly connections in a transaction * Use ConnectionModifiedPlacement() function * Remove the second check of ConnectionModifiedPlacement() * Add order by to prevent flaky output * Test using pg_dist_transaction	2021-03-10 20:01:37 +03:00
Marco Slot	58f85f55c0	Fixes a crash in queries with a modifying CTE and a SELECT without FROM	2021-03-09 10:39:33 +01:00
Philip Dubé	4e22f02997	Fix various typos due to zealous repetition	2021-03-04 19:28:15 +00:00
Marco Slot	f25de6a0e3	Try to return earlier in idempotent master_add_node	2021-03-02 21:22:47 +01:00
Hadi Moshayedi	affe38eac6	Populate DATABASEOID cache before CREATE INDEX CONCURRENTLY	2021-03-03 12:59:46 -08:00
Onder Kalaci	54ee96470e	Pass pointer of AttributeEquivalenceClass instead of pointer of pointer AttributeEquivalenceClass seems to be unnecessarily used with multiple pointers. Just use a single pointer for ease of read.	2021-03-03 12:27:26 +01:00
Onder Kalaci	d1cd198655	Prevent infinite recursion for queries that involve UNION ALL and JOIN With this commit, we make sure to prevent infinite recursion for queries in the format: [subquery with a UNION ALL] JOIN [table or subquery] Also, fixes a bug where we pushdown UNION ALL below a JOIN even if the UNION ALL is not safe to pushdown.	2021-03-03 12:27:26 +01:00
Naisila Puka	2f30614fe3	Reimplement citus_update_table_statistics to detect dist. deadlocks (#4752 ) * Reimplement citus_update_table_statistics * Update stats for the given table not colocation group * Add tests for reimplemented citus_update_table_statistics * Use coordinated transaction, merge with citus_shard_sizes functions * Update the old master_update_table_statistics as well	2021-03-03 04:12:30 +03:00
Marco Slot	dca615c5aa	Normalize the ConvertTable notices	2021-03-01 10:36:12 +01:00
SaitTalhaNisanci	feee25dfbd	Use translated vars in postgres 13 as well (#4746 ) * Use translated vars in postgres 13 as well Postgres 13 removed translated vars with pg 13 so we had a special logic for pg 13. However it had some bug, so now we copy the translated vars before postgres deletes it. This also simplifies the logic. * fix rtoffset with pg >= 13	2021-02-26 19:41:29 +03:00
Halil Ozan Akgul	5c5cb200f7	Adds GRANT for public to citus_tables	2021-02-26 16:24:33 +03:00
Önder Kalacı	0fe26a216c	Prevent cross join without any target list entries (#4750 ) /* * The physical planner assumes that all worker queries would have * target list entries based on the fact that at least the column * on the JOINs have to be on the target list. However, there is * an exception to that if there is a cartesian product join and * there is no additional target list entries belong to one side * of the JOIN. Once we support cartesian product join, we should * remove this error. */	2021-02-26 11:04:21 +01:00
Onur Tirtir	54ac924bef	Grant read access for columnar metadata tables to unprivileged user	2021-02-26 12:31:09 +03:00
Onur Tirtir	dcc0207605	Add 10.0-2 schema version	2021-02-26 12:31:09 +03:00
Naisila Puka	5ebd4eac7f	Preserve colocation with procedures in alter_distributed_table (#4743 )	2021-02-25 19:52:47 +03:00
Hanefi Onaldi	9a792ef841	Remove length limitations for table renames	2021-02-24 03:35:27 +03:00
Naisila Puka	dbb88f6f8b	Fix insert query with CTEs/sublinks/subqueries etc (#4700 ) * Fix insert query with CTE * Add more cases with deferred pruning but false fast path * Add more tests * Better readability with if statements	2021-02-23 18:00:47 +03:00
SaitTalhaNisanci	dcf54eaf2a	Use PROCESS_UTILITY_QUERY in utility calls When we use PROCESS_UTILITY_TOPLEVEL it causes some problems when combined with other extensions such as pg_audit. With this commit we use PROCESS_UTILITY_QUERY in the codebase to fix those problems.	2021-02-19 13:55:59 +03:00
Sait Talha Nisanci	bbf6132226	Revert "wip (#4730 )" This reverts commit `62e6d54a4e`.	2021-02-19 13:55:59 +03:00
SaitTalhaNisanci	62e6d54a4e	wip (#4730 )	2021-02-19 13:42:19 +03:00
Marco Slot	972a8bc0b7	Rewrite time_partitions join clause to avoid smallint[] operator	2021-02-18 12:01:18 +01:00
Ahmet Gedemenli	1f345f65b4	Support dropping local table indexes along with a distributed index	2021-02-18 13:30:12 +03:00
Onur Tirtir	676d9a9726	Bump Citus to 10.1devel	2021-02-17 11:54:33 +03:00
Onur Tirtir	d61fd6e478	Decide changing sequence dependencies on MX nodes according to resulting relation (#4713 ) When executing alter_table / undistribute_table udf's, we should not try to change sequence dependencies on MX workers if new table wouldn't require syncing metadata. Previously, we were checking that for input table. But in some cases, the fact that input table requires syncing metadata doesn't imply the same for resulting table (e.g when undistributing a Citus table). Even more, doing that was giving an unexpected error when undistributing a Citus table so this commit actually fixes that.	2021-02-15 19:20:26 +03:00
SaitTalhaNisanci	bcbd24f8de	Only consider pseudo constants for shortcuts (#4712 ) It seems that we need to consider only pseudo constants while doing some shortcuts in planning. For example there could be a false clause but it can contribute to the result in which case it will not be a pseudo constant.	2021-02-15 18:39:37 +03:00
SaitTalhaNisanci	0f1ce7a913	Not skip relation in conversion if it doesn't have RelationRestriction (#4685 ) We would exclude tables without relationRestriction from conversion candidates in local-distributed table joins. This could leave a leftover local table which should have been converted to a subquery. Ideally I would expect that in each call to CreateDistributedPlan we would pass a new plan id, but that seems like a bigger change.	2021-02-12 12:33:55 +03:00
Onder Kalaci	f297c96ec5	Add regression tests for COPY into colocated intermediate results To add the tests without too much data, make the copy switchover configurable.	2021-02-11 15:41:06 +01:00
Onder Kalaci	5d5a357487	Do not connection re-use for intermediate results /* * Colocated intermediate results are just files and not required to use * the same connections with their co-located shards. So, we are free to * use any connection we can get. * * Also, the current connection re-use logic does not know how to handle * intermediate results as the intermediate results always truncates the * existing files. That's why, we use one connection per intermediate * result. */	2021-02-11 15:41:06 +01:00
Ahmet Gedemenli	c8e83d1f26	Fix dropping fkey when distributing table	2021-02-11 15:48:35 +03:00
SaitTalhaNisanci	847b79078f	Not consider subplans in restriction list (#4679 ) * Not consider subplans in restriction list * Not consider sublink, alternative subplan in restrictions	2021-02-11 15:04:07 +03:00
Hadi Moshayedi	c3dcd6b9f8	Columnar: don't include stripe reservation locks in lock graph.	2021-02-10 10:20:20 -08:00
Onur Tirtir	9f619a85d6	Fix EXPLAIN ANALYZE exec when query returns no cols (#4672 ) We do not include dummy column if original task didn't return any columns. Otherwise, number of columns that original task returned wouldn't match number of columns returned by worker_save_query_explain_analyze.	2021-02-10 17:59:47 +03:00
Onder Kalaci	c804c9aa21	Allow local execution for intermediate results in COPY When COPY is used for copying into co-located files, it was not allowed to use local execution. The primary reason was Citus treating co-located intermediate results as co-located shards, and COPY into the distributed table was done via "format result". And, local execution of such COPY commands was not implemented. With this change, we implement support for local execution with "format result". To do that, we use the buffer for every file on shardState->copyOutState, similar to how local copy on shards are implemented. In fact, the logic is similar to local copy on shards, but instead of writing to the shards, Citus writes the results to a file. The logic relies on LOCAL_COPY_FLUSH_THRESHOLD, and flushes only when the size exceeds the threshold. But, unlike local copy on shards, in this case we write the headers and footers just once.	2021-02-09 15:00:06 +01:00
Hanefi Onaldi	353b080474	Fix Semmle errors (#4636 ) Co-authored-by: Halil Ozan Akgül <hozanakgul@gmail.com>	2021-02-08 18:37:44 +03:00
SaitTalhaNisanci	e96da4886f	Sort results in citus_shards and give raw size (#4649 ) * Sort results in citus_shards and give raw size Sort results so that it is consistent and also similar to citus_tables. Use raw size in the output so that doing operations on the size is easier. * Change column ordering	2021-02-08 15:29:42 +03:00
Ahmet Gedemenli	5dd2a3da03	Convert RelabelTypes into CollateExprs in get_rule_expr function	2021-02-05 12:06:46 +03:00
Ahmet Gedemenli	503171d2f2	Merge branch 'master' into rename-master-parameter-for-dist-stat-activity	2021-02-04 15:37:13 +03:00
Ahmet Gedemenli	2443b20b2c	Rename master to distributed for worker stat activity	2021-02-04 12:20:06 +03:00
Onder Kalaci	fc9a23792c	COPY uses adaptive connection management on local node With #4338, the executor is smart enough to failover to local node if there is not enough space in max_connections for remote connections. For COPY, the logic is different. With #4034, we made COPY work with the adaptive connection management slightly differently. The cause of the difference is that COPY doesn't know which placements are going to be accessed hence requires to get connections up-front. Similarly, COPY decides to use local execution up-front. With this commit, we change the logic for COPY on local nodes: Try to reserve a connection to local host. This logic follows the same logic (e.g., citus.local_shared_pool_size) as the executor because COPY also relies on TryToIncrementSharedConnectionCounter(). If reservation to local node fails, switch to local execution Apart from this, if local execution is disabled, we follow the exact same logic for multi-node Citus. It means that if we are out of the connection, we'd give an error.	2021-02-04 09:45:07 +01:00
Ahmet Gedemenli	34840ddc5c	Rename master to citus for dist stat activity cols	2021-02-04 11:12:23 +03:00
Sait Talha Nisanci	ff82e85ea2	Replace workerNodeCount -> nodeCount	2021-02-03 20:02:03 +03:00
Sait Talha Nisanci	eb5be579e3	Set previous cell inside a for loop	2021-02-03 20:02:03 +03:00
Sait Talha Nisanci	9ba3f70420	Remove unused method	2021-02-03 20:02:03 +03:00
Sait Talha Nisanci	24e60b44a1	Consider coordinator in intermediate result optimization It seems that we were not considering the case where coordinator was added to the cluster as a worker in the optimization of intermediate results. This could lead to errors when coordinator was added as a worker.	2021-02-03 20:02:03 +03:00
Onur Tirtir	c0f2817b70	Disallow using alter_table udfs with tables having any identity cols (#4635 ) pg_get_tableschemadef_string doesn't know how to deparse identity columns so we cannot reflect those columns when creating table from scratch. For this reason, we don't allow using alter_table udfs with tables having any identity cols.	2021-02-03 19:33:54 +03:00
Onur Tirtir	3a403090fd	Disallow adding local table with identity column to metadata (#4633 ) pg_get_tableschemadef_string doesn't know how to deparse identity columns so we cannot reflect those columns when creating shell relation. For this reason, we don't allow adding local tables -having identity cols- to metadata.	2021-02-03 19:05:17 +03:00
Onur Tirtir	5efb742f8a	Skip copying GENERATED ALWAYS AS STORED cols in ReplaceTable (#4616 ) Postgres doesn't allow inserting into columns having GENERATED ALWAYS AS (...) STORED expressions. For this reason, when executing undistribute_table or an alter_* udf, we should skip copying such columns. This is not bad since Postgres would already generate such columns.	2021-02-03 17:55:16 +03:00
Onur Tirtir	53b1888cac	Rename DropAndMoveDefaultSequenceOwnerships	2021-02-02 18:17:42 +03:00
Onur Tirtir	93c3f30024	Rename ExtractColumnsOwningSequences	2021-02-02 18:17:42 +03:00
Onur Tirtir	912d829757	Skip GENERATED AS ALWAYS STORED cols when processing cols owning sequences When finding columns owning sequences, we shouldn't rely on atthasdef since it might be true when column has GENERATED ALWAYS AS (...) STORED expression.	2021-02-02 18:17:42 +03:00
Onur Tirtir	c8a48c6eee	Not try to sync metadata for local tables (#4625 )	2021-02-02 15:12:12 +03:00
Onur Tirtir	c5d4e7081b	Fix invalid read issue in deprecated create_citus_local_table udf (#4611 ) Since create_citus_local_table doesn't specify cascadeViaForeignKeys option, we can't directly call citus_add_local_table_to_metadata from create_citus_local_table. Instead, implement an internal method and call it from deprecated udf too.	2021-02-02 12:53:27 +03:00
Brian Bergeron	1253eeb9ff	Don't propagate ALTER ROLE SET when scoped to a different database (#4471 ) Co-authored-by: brberger <brberger@microsoft.com>	2021-02-01 15:49:26 +03:00
Hanefi Önaldı	cab17afce9	Introduce UDFs for fixing partitioned table constraint names	2021-01-29 17:32:20 +03:00
Hanefi Önaldı	92cf49b7e9	Limit shardId in partitioned table constraint names to only CHECK	2021-01-29 17:29:53 +03:00
SaitTalhaNisanci	738825cc38	Fix partition column index issue (#4591 ) * Fix partition column index issue We send column names to worker_hash/range_partition_table methods, and in these methods we check the column name index from tuple descriptor. Then this index is used to decide the bucket that the current row will be sent for the repartition. This becomes a problem when there are the same column names in the tupleDescriptor. Then we can choose the wrong index. Hence the partitioned data will be put to wrong workers. Then the result could miss some data because workers might contain different range of data. An example: TupleDescriptor contains "trip_id", "car_id", "car_id" for one table. It contains only "car_id" for the other table. And assuming that the tables will be partitioned by car_id, it is not certain what should be used for deciding the bucket number for the first table. Assuming value 2 goes to bucket 2 and value 3 goes to bucket 3, it is not certain which bucket "1 2 3" (trip_id, car_id, car_id) row will go to. As a solution we send the index of partition column in targetList instead of the column name. The old API is kept so that if workers upgrade work, it still works (though it will have the same bug) * Use the same method so that backporting is easier	2021-01-29 14:40:40 +03:00
Onder Kalaci	04fcd73eb6	When reaches to shared pool size, COPY sets the placement access It looks like we forgot to set the placement accesses, and this could lead to self-deadlocks on complex transaction blocks.	2021-01-28 12:45:57 +01:00
Onder Kalaci	36bdeef1bb	When reaches to executor pool size, COPY sets the placement access It looks like we forgot to set the placement accesses, and this could lead to self-deadlocks on complex transaction blocks.	2021-01-28 12:45:57 +01:00
Onur Tirtir	bb5962ee79	Early error out when creating citus local from a temp table (#4592 )	2021-01-28 14:18:06 +03:00
Halil Ozan Akgul	913aa91449	Adds error message to AlterTableSetAccessMethod for below PG12	2021-01-28 11:32:02 +03:00
Onur Tirtir	b20615cbbe	Advise dropping foreign key in addition to create_reference_table hint (#4590 )	2021-01-27 17:59:06 +03:00
Onur Tirtir	8151c4b443	Merge remote-tracking branch 'origin/master' into rename-create_citus_local_table	2021-01-27 17:08:58 +03:00
Ahmet Gedemenli	b2c1bbddd4	Merge branch 'master' into fix-dropping-mat-views-when-alter-table	2021-01-27 16:33:10 +03:00
Ahmet Gedemenli	35043c56f1	Fix dropping materialized views while doing alter table	2021-01-27 16:32:09 +03:00
Onur Tirtir	93a83d5472	Rename create_citus_local_table.c to citus_add_local_table_to_metadata.c	2021-01-27 15:52:37 +03:00
Onur Tirtir	1a4482a37c	Get rid of the sql dir for new udf	2021-01-27 15:52:37 +03:00
Onur Tirtir	2f30be823e	Rename create_citus_local_table to citus_add_local_table_to_metadata For simplicity in downgrade test in multi_extension, didn't actually remove create_citus_local_table udf.	2021-01-27 15:52:36 +03:00
Onur Tirtir	c06fcc26e5	Hide notice messages when implicitly undistributing citus local tables	2021-01-27 13:42:06 +03:00
Onur Tirtir	458a81f93d	Add suppressNoticeMessages to TableConversionState	2021-01-27 12:53:58 +03:00
Onur Tirtir	cacb76d2c6	Not mention citus local tables in error messages (#4579 )	2021-01-27 12:36:53 +03:00
Naisila Puka	94bc2703bc	Make undistribute_table() and citus_create_local_table() work with columnar (#4563 ) * Make undistribute_table() and citus_create_local_table() work with columnar * Rename and use LocallyExecuteUtilityTask for UDF check * Remove 'local' references in ExecuteUtilityCommand	2021-01-27 01:17:20 +03:00
Halil Ozan Akgul	bafa692fc1	Adds error messages with names of indexes that will be dropped	2021-01-26 18:18:26 +03:00
Ahmet Gedemenli	e99f052904	Fix index renaming when creating citus local tables	2021-01-26 15:52:48 +03:00
Ahmet Gedemenli	14bf9d85d6	Merge branch 'master' into fix-maintenance-daemon-crash	2021-01-26 12:52:28 +03:00
Onur Tirtir	6a28f62239	Remove stale comment	2021-01-25 18:55:57 +03:00
Onur Tirtir	9e0150e9e2	Drop notify_constraint_dropped beforehand when downgrading	2021-01-25 18:55:57 +03:00
Nils Dijk	d127516dc8	Mitigate segfault in connection statemachine (#4551 ) As described in the comment, we have observed crashes in production due to a segfault caused by the dereference of a NULL pointer in our connection statemachine. As a mitigation, preventing system crashes, we provide an error with a small explanation of the issue. Unfortunately the case is not reliably reproduced yet, hence the inability to add tests. DESCRIPTION: Prevent segfaults when SAVEPOINT handling cannot recover from connection failures	2021-01-25 15:55:04 +01:00
Onur Tirtir	b5ea033a0b	Convert postgres tables to citus local when creating reference table having fkeys	2021-01-25 11:02:50 +03:00
Onur Tirtir	8e02375aa3	Some refactor as a preparation	2021-01-25 11:01:33 +03:00
Onur Tirtir	253c19062a	Rename IsCitusInitiatedBackend to IsCitusInitiatedRemoteBackend (#4562 )	2021-01-23 01:07:43 +03:00
Jeff Davis	53f7b019d5	Columnar: clean up old references to cstore.	2021-01-22 11:08:36 -08:00
Onur Tirtir	941c8fbf32	Automatically undistribute citus local tables when no more fkeys with reference tables (#4538 )	2021-01-22 18:15:41 +03:00
Ahmet Gedemenli	5022fc8301	Remove failing assertions	2021-01-22 17:09:24 +03:00
Marco Slot	03328e9679	Rename citus_tables column names to be query-friendly	2021-01-21 18:58:30 +01:00
Ahmet Gedemenli	63fab1b7d9	Merge branch 'master' into remove-deprecated-gucs-udfs	2021-01-22 13:29:07 +03:00
SaitTalhaNisanci	3d69ab5576	Choose the smallest colocation id among all matches (#4559 ) Currently we choose an arbitrary colocation id from all the matches for a colocation id. This could mean that 2 distributed tables, which have the same scheme could go into different colocation groups. This fix makes sure that the same match will go to the same colocation group.	2021-01-22 13:28:43 +03:00
Ahmet Gedemenli	3ac30ef9d8	Merge branch 'master' into remove-deprecated-gucs-udfs	2021-01-22 13:06:13 +03:00
Ahmet Gedemenli	76354ff563	Merge branch 'master' into remove-deprecated-gucs-udfs	2021-01-22 12:47:06 +03:00
Ahmet Gedemenli	887b67953b	Merge branch 'master' into fix-bug-create-citus-local-table-with-stats	2021-01-22 12:46:47 +03:00
Hadi Moshayedi	222fb4d589	Don't use 'cstore' in function names	2021-01-21 18:32:21 -08:00
Önder Kalacı	9b39b25390	Prevent citus local table creation via remote execution (#4540 ) /* * Creating Citus local tables relies on functions that accesses * shards locally (e.g., ExecuteAndLogDDLCommand()). As long as * we don't teach those functions to access shards remotely, we * cannot relax this check. */	2021-01-21 11:26:45 +03:00
Ahmet Gedemenli	89a6fe83f7	Replace to update_distributed_table_colocation for tests	2021-01-20 17:30:06 +03:00
Ahmet Gedemenli	ceb6b503c0	Remove unused UDF mark_tables_colocated	2021-01-20 17:29:23 +03:00
Ahmet Gedemenli	2fa060a32d	Fix bug creating citus local table with stats	2021-01-20 17:17:13 +03:00
Onder Kalaci	8129ce472f	Refactor Utility Hook We want to be able to find the "top-level" DDL commands (not internal/cascading ones). To achieve that, we have some refactoring.	2021-01-20 15:54:00 +03:00
Onder Kalaci	8df58926c5	Rename CitusProcessUtility -> ProcessUtilityForNode	2021-01-20 15:54:00 +03:00
Halil Ozan Akgul	434f5af030	Adds same access method check	2021-01-20 15:18:03 +03:00
Hadi Moshayedi	bc01c795a2	Reland #4419	2021-01-19 07:48:47 -08:00
Halil Ozan Akgul	27c2bd1599	Moves creation of ALTER INDEX STATISTICS commands next to index commands	2021-01-18 16:55:53 +03:00
Naisila Puka	7124a7715d	Skip 'already exists' in CREATE TABLE IF NOT EXISTS PARTITION OF (#4507 ) * Just skip 'already exists' in CT IF NOT EXISTS PARTITION OF * Generalize to tables that are not already distributed partitions	2021-01-18 15:56:02 +03:00
Onur Tirtir	f1ecbc3a53	Fix segfault when adding/dropping fkey from ref to citus local via remote exec (#4528 )	2021-01-17 20:43:33 +03:00
Onur Tirtir	5a3e8a6e24	Skip postgres tables for UndistributeTable(cascadeViaFKeys) (#4530 ) The reason behind skipping postgres tables is that we support foreign keys between postgres tables and reference tables (without converting postgres tables to citus local tables) when enable_local_reference_table_foreign_keys is false or when coordinator is not added to metadata.	2021-01-17 20:32:30 +03:00
Ahmet Gedemenli	107097ee28	Fix assert failure when creating statistics	2021-01-15 19:36:58 +03:00
Onur Tirtir	7dddfa2d0b	Not invalidate fkey cache if citus not installed (#4521 )	2021-01-15 18:31:43 +03:00
Onder Kalaci	c35e22d75d	Skip validation for foreign key creation commands For certaion purposes, we drop and recreate the foreign keys. As we acquire exclusive locks on the tables in between drop and re-create, we can safely skip validation phase of the foreign keys. The reason is purely being performance as foreign key validation could take a long value.	2021-01-15 18:04:52 +03:00
Onder Kalaci	ae0b92233d	Rename function	2021-01-15 18:04:52 +03:00
Onder Kalaci	30d0a65f40	Adds citus.enable_local_reference_table_foreign_keys When enabled any foreign keys between local tables and reference tables supported by converting the local table to a citus local table. When the coordinator is not in the metadata, the logic is disabled as foreign keys are not allowed in this configuration.	2021-01-15 18:04:52 +03:00
Onder Kalaci	ed58a404d5	Release lock on CoordinatorAddedAsWorkerNode() Because master_add_node(or others) might acquire ExclusiveLock and their initiated sessions may call CoordinatorAddedAsWorkerNode(). With this we prevent potential deadlocks.	2021-01-15 18:04:42 +03:00
Onur Tirtir	e718d24868	Add support for CREATE TABLE commands defining foreign keys	2021-01-15 17:46:06 +03:00
Ahmet Gedemenli	9a100bcdb9	Remove unused GUCs Remove deprecated variables Remove GUC citus.sslmode Remove GUC citus.expire_cached_shards Remove GUC citus.task_tracker_delay Remove GUC citus.max_assign_task_batch_size Remove GUC citus.max_tracked_tasks_per_node Remove GUC citus.max_running_tasks_per_node Remove GUC citus.large_table_shard_count Remove GUC citus.max_task_string_size Remove GUC citus.binary_master_copy_format	2021-01-15 13:30:45 +03:00
Onur Tirtir	787ed643dd	Undistribute table when cascade_via_foreign_keys=true even if rel has no fkeys (#4516 ) If relation is not involved in any foreign key relationships, foreign key graph would not return any relations for given relationId as expected. But even if it's the case, we should still undistribute the table itself.	2021-01-15 12:45:44 +03:00
Halil Ozan Akgul	9407965817	Moves struct to the header	2021-01-15 11:50:11 +03:00
Onur Tirtir	36b418982f	Add support for ALTER TABLE commands defining foreign keys	2021-01-14 17:12:00 +03:00
Onur Tirtir	05931b8fe2	Pass ProcessUtilityContext to .preprocess	2021-01-14 17:12:00 +03:00
Onur Tirtir	ac7bccd847	Skip citus tables for CreateCitusLocalTable(cascadeViaFKeys)	2021-01-14 17:12:00 +03:00
Marco Slot	b840e97cd6	Add a alter_old_partitions_set_access_method UDF	2021-01-14 10:44:14 +01:00
Ahmet Gedemenli	9b56ad48cb	Recreate invalidation functions for Citus10 Fix multi_create_table Add schema name to altered functions Recreate invalidation functions when downgrading	2021-01-13 23:18:07 +03:00
Marco Slot	de6aaaa648	Expand support for subqueries in target list through recursive planning	2021-01-13 17:26:09 +01:00
Onur Tirtir	ccbc3de535	Enable reference/distributed table creation from citus local tables	2021-01-13 17:14:26 +03:00
Onur Tirtir	7180ef5df1	Increment command counter in UndistributeTable	2021-01-13 16:54:35 +03:00
Onur Tirtir	00da1eed20	Some refactor as a preparation	2021-01-13 16:50:09 +03:00
Halil Ozan Akgul	2be14cce2e	Adds alter_distributed_table and alter_table_set_access_method UDFs	2021-01-13 16:02:39 +03:00
Onur Tirtir	1299895e71	Give hint to use ref table for unsupported fkeys between citus local & ref (#4501 )	2021-01-13 15:33:46 +03:00
SaitTalhaNisanci	724d56f949	Add citus shard helper view (#4361 ) With citus shard helper view, we can easily see: - where each shard is, which node, which port - what kind of table it belongs to - its size With such a view, we can see shards that have a size bigger than some value, which could be useful. Also debugging can be easier in production as well with this view. Fetch shards in one go per node The previous implementation was slow because it would do a lot of round trips, one per shard to be exact. Hence it is improved so that we fetch all the shard_name, shard-size pairs per node in one go. Construct shards_names, sizes query on coordinator	2021-01-13 13:58:47 +03:00
Ahmet Gedemenli	436c9d9d79	Remove the word 'master' from Citus UDFs (#4472 ) * Replace master_add_node with citus_add_node * Replace master_activate_node with citus_activate_node * Replace master_add_inactive_node with citus_add_inactive_node * Use master udfs in old scripts * Replace master_add_secondary_node with citus_add_secondary_node * Replace master_disable_node with citus_disable_node * Replace master_drain_node with citus_drain_node * Replace master_remove_node with citus_remove_node * Replace master_set_node_property with citus_set_node_property * Replace master_unmark_object_distributed with citus_unmark_object_distributed * Replace master_update_node with citus_update_node * Replace master_update_shard_statistics with citus_update_shard_statistics * Replace master_update_table_statistics with citus_update_table_statistics * Rename master_conninfo_cache_invalidate to citus_conninfo_cache_invalidate Rename master_dist_local_group_cache_invalidate to citus_dist_local_group_cache_invalidate * Replace master_copy_shard_placement with citus_copy_shard_placement * Replace master_move_shard_placement with citus_move_shard_placement * Rename master_dist_node_cache_invalidate to citus_dist_node_cache_invalidate * Rename master_dist_object_cache_invalidate to citus_dist_object_cache_invalidate * Rename master_dist_partition_cache_invalidate to citus_dist_partition_cache_invalidate * Rename master_dist_placement_cache_invalidate to citus_dist_placement_cache_invalidate * Rename master_dist_shard_cache_invalidate to citus_dist_shard_cache_invalidate * Drop master_modify_multiple_shards * Rename master_drop_all_shards to citus_drop_all_shards * Drop master_create_distributed_table * Drop master_create_worker_shards * Revert old function definitions * Add missing revoke statement for citus_disable_node	2021-01-13 12:10:43 +03:00
Onur Tirtir	2ef5879bcc	Fix error thrown for foreign keys from citus local to dist tables (#4490 )	2021-01-13 10:15:12 +03:00
Onur Tirtir	dd55ab394e	Disallow cascade_via_foreign_keys if any partition rel has non-inherited fkeys (#4487 )	2021-01-11 21:50:09 +03:00
Naisila Puka	7b05777682	Add ALTER TABLE .. SET LOGGED/UNLOGGED support (#4486 )	2021-01-11 20:39:06 +03:00
Marco Slot	d900a7336e	Automatically add placeholder record for coordinator	2021-01-08 15:09:53 +01:00
Marco Slot	597533b1ff	Add citus_set_coordinator_host	2021-01-08 13:36:26 +01:00
Marco Slot	e7f13978b5	Add a view for simple (time) partitions and their access methods	2021-01-08 11:28:15 +01:00
Onur Tirtir	5289785da4	Add cascade_via_foreign_keys option to create_citus_local_table (#4462 )	2021-01-08 15:13:26 +03:00
Marco Slot	011283122b	Add the shard rebalancer implementation	2021-01-07 16:51:55 +01:00
Onur Tirtir	f3801143fb	Add cascade option to undistribute_table	2021-01-07 15:41:49 +03:00
Onur Tirtir	2e3e680ba9	Add infra to cascade citus table functions	2021-01-07 15:41:48 +03:00
Marco Slot	47c1b19174	Revert "Do metadata sync in a separate background worker." This reverts commit `4df723cf9b`.	2021-01-07 10:30:04 +01:00
Marco Slot	d9f175532b	Revert "Trigger metadata sync at transaction commit" This reverts commit `a2c73bef27`.	2021-01-07 10:30:00 +01:00
Marco Slot	5de3337b2f	Support local execution for INSERT..SELECT with re-partitioning	2021-01-06 16:15:53 +01:00
Onder Kalaci	2fe158961b	Remove "WarnAboutLeakedPreparedTransaction" function We used to need WarnAboutLeakedPreparedTransaction() as we didn't have auto 2PC recovery. But, we long have 2PC recovery by https://github.com/citusdata/citus/pull/1574 So, we don't need anymore.	2021-01-06 15:48:58 +03:00
Naisila Puka	bcfc0aa4e9	Rethrow original concurrent index creation failure message (#4469 ) * Rethrow original concurrent index creation failure message * Alter test outputs for concurrent index creation * Detect duplicate table failure in concurrent index creation * Add test for conc. index creation w/out duplicates	2021-01-06 15:27:13 +03:00
Onur Tirtir	0d7aea3a22	Move pre undistribute_table chekcs into C API (#4456 )	2021-01-06 10:49:35 +03:00
Ahmet Gedemenli	1f36ff7c17	Prevent deadlock for long named partitioned index creation on single node (#4461 ) * Prevent deadlock for long named partitioned index creation on single node * Create IsSingleNodeCluster function * Use both local and sequential execution	2021-01-05 13:39:13 +03:00
Ahmet Gedemenli	f27649754b	Add alter index set statistics support (#4455 ) * Add alter index set statistics support * Use attNum instead of attName	2021-01-05 13:23:11 +03:00
Onur Tirtir	e91e745dbc	Implement ConstraintWithNameIsOfType (#4451 )	2020-12-29 11:53:06 +03:00
Onur Tirtir	feda8bdd37	Now that we use tuples after closing pg_depend, don't release lock	2020-12-25 18:03:28 +03:00
Onur Tirtir	04a4167a8a	Implement GetPgDependTuplesForDependingObjects	2020-12-25 18:03:28 +03:00
Halil Ozan Akgül	a8626d1944	Fixes the table used in the error message (#4449 )	2020-12-25 16:48:50 +03:00
Naisila Puka	04aeb6938b	Merge branch 'master' into issue4237	2020-12-25 12:36:40 +03:00
Hadi Moshayedi	a2c73bef27	Trigger metadata sync at transaction commit	2020-12-24 08:28:38 -08:00
Hadi Moshayedi	4df723cf9b	Do metadata sync in a separate background worker.	2020-12-24 08:25:55 -08:00
Naisila Puka	0bb2c991f9	Merge branch 'master' into issue4237	2020-12-24 18:05:27 +03:00
Ahmet Gedemenli	5af585269a	Add separate pg13 test for stats targets	2020-12-24 18:01:25 +03:00
naisila	59a81491e8	Add test for master_create_empty_shard on coordinator	2020-12-24 17:59:40 +03:00
Ahmet Gedemenli	d4bc17f6f0	Propagate statistics with altered targets	2020-12-24 17:10:12 +03:00
Ahmet Gedemenli	48ca1637a4	Propagate alter stats owner	2020-12-24 17:10:12 +03:00
Ahmet Gedemenli	f7c70f9a63	Propagate alter stats target	2020-12-24 17:10:12 +03:00
Ahmet Gedemenli	5a1607b6c0	Propagate alter stats schema	2020-12-24 17:10:12 +03:00
Ahmet Gedemenli	bdce4a7e67	Propagate rename statistics	2020-12-24 17:10:12 +03:00
Onur Tirtir	5ed9197041	Implement infra to get foreign key connected relations (#4439 ) On top of our foreign key graph, implement the infrastructure to get list of relations that are connected to input relation via a foreign key graph. We need this to support cascading create_citus_local_table & undistribute_table operations. Also add regression tests to see what our foreign key graph is able to capture currently.	2020-12-24 16:42:40 +03:00
Onur Tirtir	0db21bbe14	Remove fkey graph visited flags & rework GetConnectedListHelper (#4446 ) With this commit, we remove visited flags from ForeignConstraintRelationshipNode struct since keeping local state in global object is both dangerous and meaningless. Also to improve readability, this commit also converts needless recursion to iterative DFS to avoid passing local hash-map as another parameter to GetConnectedListHelper function.	2020-12-24 12:38:48 +03:00
Onur Tirtir	57e7defa3c	Support CREATE INDEX commands without index name on citus tables (#4273 )	2020-12-23 23:15:39 +03:00
Marco Slot	e3dcc278e0	Remove upgrade_to_reference_table UDF	2020-12-23 00:40:14 +01:00
Halil Ozan Akgül	9fd3f62cb6	Refactor foreign key functions to use table types (#4424 ) * Reuses extractReferencing/Referenced variables * Refactors GetForeignKeyOids function to check table types * Converts flags to inclusive	2020-12-23 17:05:09 +03:00
Onur Tirtir	d1b3eaf767	Refactor ColumnAppearsInForeignKeyToReferenceTable (#4441 )	2020-12-23 11:44:02 +03:00
naisila	5234caecca	Prevent empty placement creation in the coordinator	2020-12-22 19:39:05 +03:00
Ahmet Gedemenli	874fa1fc09	Propagate Drop Statistics	2020-12-22 18:34:46 +03:00
Onur Tirtir	3f60b08b11	Refactor foreign_key_relationship.c (#4438 )	2020-12-22 18:12:02 +03:00
Marco Slot	321cc784c7	Collapse Citus 7.* scripts into Citus 8.0-1	2020-12-21 22:55:51 +01:00
Marco Slot	f2056e553f	Expose partition column of subqueries in optimizer (#4355 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2020-12-18 20:32:52 +01:00
SaitTalhaNisanci	145112f3a0	Fix attribute numbers in subquery conversions (#4426 ) Attribute number in a subquery RTE and relation RTE means different things. In a relation attribute number will point to the column number in the table definition including the dropped columns as well however in subquery, it means the index in the target list. When we convert a relation RTE to subquery RTE we should either correct all the relevant attribute numbers or we can just add a dummy column for the dropped columns. We choose the latter in this commit because it is practically too vulnerable to update all the vars in a query. Another thing this commit fixes is that in case a join restriction clause list contains a false clause, we should just returns a false clause instead of the whole list, because the whole list will contain restrictions from other RTEs as well and this breaks the query, which can be seen from the output changes, now it is much simpler. Also instead of adding single tests for dropped columns, we choose to run the whole mixed queries with tables with dropped columns, this revealed some bugs already, which are fixed in this commit.	2020-12-18 20:25:41 +03:00
Ahmet Gedemenli	770d3da1ca	Add dependencies for stat schemas	2020-12-18 17:04:13 +03:00
Ahmet Gedemenli	6c0465566a	Propagate create statistics	2020-12-17 20:38:36 +03:00
Marco Slot	100e5d3196	Address review feedback	2020-12-15 15:23:38 +01:00
Marco Slot	707a6554b1	Support co-located/recurring correlated subqueries	2020-12-15 14:17:16 +01:00
Sait Talha Nisanci	181a7e1d36	Skip dropped columns	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	7951273f74	Refactor WrapRteRelationIntoSubquery	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	0e53aa5d3b	Add more tests	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	d5b0f02a64	Decide what group to convert, then convert them all in one go	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	c4d3927956	Not allow local table updates with remote citus local tables	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	f5dd5379b2	Add more tests	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	f7c1509fed	Not check if the query is routable for converting It seems that there are only very few cases where that is useful, and for now we prefer not having that check. This means that we might perform some unnecessary checks, but that should be rare and not performance critical.	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	1d82972ff4	Increase the performance with a trick Instead of sending NULL's over a network, we now convert the subqueries in the form of: SELECT t.a, NULL, NULL FROM (SELECT a FROM table)t; And we recursively plan the inner part so that we don't send the NULL's over network. We still need the NULLs in the outer subquery because we currently don't have an easy way of updating all the necessary places in the query. Add some documentation for how the conversion is done	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	3aed6c3ad0	Rename containsOnlyLocalTable as isLocalTableModification Update error message in Modify View	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	13c43d5744	Improve table conversion logic in dist-local joins	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	5618f3a3fc	Use BaseRestrictInfo for finding equality columns Baseinfo also has pushed down filters etc, so it makes more sense to use BaseRestrictInfo to determine what columns have constant equality filters. Also RteIdentity is used for removing conversion candidates instead of rteIndex.	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	28c5b6a425	Convert some hard coded errors to deferred errors in router planner	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	69992d58f9	Add broken local-dist table modifications tests It seems that most of the updates were broken, we weren't aware of it because there wasn't any data in the tables. They are broken mostly because local tables do not have a shard id and some code paths should be updated with that information, currently when there is an invalid shard id, it is assumed to be pruned. Consider local tables in router planner In case there is a local table, the shard id will not be valid and there are some checks that rely on shard id, we should skip these in case of local tables, which is handled with a dummy placement. Add citus local table dist table join tests add local-dist table mixed joins tests	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	a34504d7bf	Move recursive planning related function to recursive_planning	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	2a44029aaf	Simplify ContainsTableToBeConvertedToSubquery AllDataLocallyAccessible and ContainsLocalTableSubqueryJoin are removed. We can possibly remove ModifiesLocalTableWithRemoteCitusLocalTable as well. Though this removal has a side effect that now when all the data is locally available, we could still wrap a relation into a subquery, I guess that should be resolved in the router planner itself. Add more tests	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	26d9f0b457	Use auto mode in tests and fix debug message	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	3bd53a24a3	Support update on postgres table from citus local table	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	4b6611460a	Support foreign table joins as well	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	7e9204eba9	Update vars in quals while wrapping RTE to subquery When we wrap an RTE to subquery we are updating the variables varno's as 1, however we should also update the varno's of vars in quals. Also some other small code quality improvements are done.	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	0689f2ac1a	Recursively plan distributed tables only if all have unique filters The previous algorithm was not consistent and it could convert different RTEs based on the table orders in the query. Now we convert local tables if there is a distributed table which doesn't have a unique index. So if there are 4 tables, local1, local2, dist1, dist2_with_pkey then we will convert local1 and local2 in `auto` mode. Converting a distributed table is not that logical because as there is a distributed table without a unique index, we will need to convert the local tables anyway. So converting the distributed table with pkey is redundant.	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	a008fc611c	Support materialized view joins as well	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	ff4f3b2f3c	Use PlannerRestrictionContext instead of RecursivePlannerContext	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	3fe3c55023	Use ShouldConvertLocalTableJoinsToSubqueries Remove FillLocalAndDistributedRTECandidates and use ShouldConvertLocalTableJoinsToSubqueries, which simplifies things as we rely on a single function to decide whether we should continue converting RTE to subquery.	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	eebcd995b3	Add some more tests	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	5693cabc41	Not convert an already routable plannable query We should not recursively plan an already routable plannable query. An example of this is (SELECT * FROM local JOIN (SELECT * FROM dist) d1 USING(a)); So we let the recursive planner do all of its work and at the end we convert the final query to to handle unsupported joins. While doing each conversion, we check if it is router plannable, if so we stop. Only consider range table entries that are in jointree If a range table is not in jointree then there is no point in considering that because we are trying to convert range table entries to subqueries for join use case.	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	2ff65f3630	Enable partitioned distributed tables in local-dist table joins	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	44953579cf	Enable citus-local distributed table joins Check equality in quals We want to recursively plan distributed tables only if they have an equality filter on a unique column. So '>' and '<' operators will not trigger recursive planning of distributed tables in local-distributed table joins. Recursively plan distributed table only if the filter is constant If the filter is not a constant then the join might return multiple rows and there is a chance that the distributed table will return huge data. Hence if the filter is not constant we choose to recursively plan the local table.	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	f3d55448b3	Choose distributed table if it has a unique index in filter When doing local-distributed table joins we convert one of them to subquery. The current policy is that we convert distributed tables to subquery if it has a unique index on a column that has unique index(primary key also has a unique index).	2020-12-15 18:17:10 +03:00
Onder Kalaci	3f4952cc2b	Pushdown projections when relations are recursively planned This is important to limit the data transfer size.	2020-12-15 18:17:10 +03:00
Onder Kalaci	594e001f3b	Add filter pushdown regression tests Also handle WHERE false	2020-12-15 18:17:10 +03:00
Onder Kalaci	82a4830c7d	Adjust the existing regression tests	2020-12-15 18:17:10 +03:00
Onder Kalaci	7a4d6b2984	Handle modifications as well	2020-12-15 18:17:10 +03:00
Onder Kalaci	8f8390ed6e	Recursively plan local table joins The logical planner cannot handle joins between local and distributed table. Instead, we can recursively plan one side of the join and let the logical planner handle the rest. Our algorithm is a little smart, trying not to recursively plan distributed tables, but favors local tables.	2020-12-15 18:17:10 +03:00
Onder Kalaci	7cc25c9125	Add ability to fetch the restrictions per relation With this commit, we add the ability to add restrictions per relation. We simply rely on the restrictions that Postgres keeps per relation.	2020-12-15 18:17:10 +03:00
Onur Tirtir	0eb5701658	Not consider single shard hash dist. tables as replicated (#4413 )	2020-12-15 14:33:01 +03:00
Marco Slot	f2538a456f	Support co-located/recurring sublinks in the target list	2020-12-13 15:45:24 +01:00
Marco Slot	8e8adcd92a	Harden citus_tables against node failure	2020-12-13 15:10:40 +01:00
Jeff Davis	5b3c32eb38	fixup tests	2020-12-07 13:18:22 -08:00
Ahmet Gedemenli	7577821920	Fix transaction name length calculation	2020-12-07 12:34:15 +03:00
Ahmet Gedemenli	936775e8e3	Delete transactions when removing node With this commit, we delete entries in pg_dist_transaction for the primary nodes that are removed by `master_remove_node`.	2020-12-07 11:35:20 +03:00
Marco Slot	c9b658daea	Add a public.citus_tables view	2020-12-03 17:31:40 +01:00
Marco Slot	4098d33acb	Allow citus size functions on replicated tables	2020-12-03 16:33:24 +01:00
SaitTalhaNisanci	f164575524	Add a utility to process each table index (#4382 ) A utility function is added so that each caller can implement a handler for each index on a given table. This means that the caller doesn't need to worry about how to access each index, the only thing that it needs to do each to implement a function to which each index on the table is passed iteratively.	2020-12-03 16:33:13 +03:00
Onder Kalaci	c546ec5e78	Local node connection management When Citus needs to parallelize queries on the local node (e.g., the node executing the distributed query and the shards are the same), we need to be mindful about the connection management. The reason is that the client backends that are running distributed queries are competing with the client backends that Citus initiates to parallelize the queries in order to get a slot on the max_connections. In that regard, we implemented a "failover" mechanism where if the distributed queries cannot get a connection, the execution failovers the tasks to the local execution. The failover logic is follows: - As the connection manager if it is OK to get a connection - If yes, we are good. - If no, we fail the workerPool and the failure triggers the failover of the tasks to local execution queue The decision of getting a connection is follows: /* * For local nodes, solely relying on citus.max_shared_pool_size or * max_connections might not be sufficient. The former gives us * a preview of the future (e.g., we let the new connections to establish, * but they are not established yet). The latter gives us the close to * precise view of the past (e.g., the active number of client backends). * * Overall, we want to limit both of the metrics. The former limit typically * kics in under regular loads, where the load of the database increases in * a reasonable pace. The latter limit typically kicks in when the database * is issued lots of concurrent sessions at the same time, such as benchmarks. */	2020-12-03 14:16:13 +03:00
Ahmet Gedemenli	5242dcfe99	Add tests for propagating alter schema rename	2020-12-02 15:18:26 +03:00
Ahmet Gedemenli	514c6a76ac	Propagate alter schema rename	2020-12-02 15:18:26 +03:00
Nils Dijk	6f9c040f76	DESCRIPTION: Propagate columnar table settings for distributed tables When distributing a columnar table, as well as changing options on a distributed columnar table, this patch will forward the settings from the coordinator to the workers. For propagating options changes on an already distributed table this change is pretty straight forward. Before applying the change in options locally we will create a `DDLJob` that contains a call to `alter_columnar_table_set(...)` for every shard placement with all settings of the current table. This goes both for setting an option as well as resetting. This will reset the values to the defaults configured on the coordinator. Having the effect that the coordinator is authoritative on the settings and makes sure the shards have the same settings set as the table on the coordinator. When a columnar table is distributed it is using the `TableDDLCommand` infra structure to create a new kind of `TableDDLCommand`. This new type, called a `TableDDLCommandFunction` contains a context and 2 function pointers to execute. One function returns the command as applied on the table, the second function will return the sql command to apply to a shard with a given shard id. The schema name is ignored as it will use the fully qualified name of the shard in the same schema as the base table.	2020-12-02 13:02:42 +01:00
Onder Kalaci	f7e1aa3f22	Multi-row INSERTs use local execution when placements are local Multi-row execution already uses sequential execution. When shards are local, using local execution is profitable as it avoids an extra connection establishment to the local node.	2020-12-01 21:37:59 +03:00
Onur Tirtir	03bcccdee0	Fix hostname length check in StartNodeUserDatabaseConnection (#4363 ) Copying string before hostname length check makes the check useless	2020-11-30 20:00:35 +03:00
Onur Tirtir	7f3d1182ed	Handle invalid connection hash entries (#4362 ) If MemoryContextAlloc errors out -e.g. during an OOM-, ConnectionHashEntry->connections stays as NULL. With this commit, we add isValid flag to ConnectionHashEntry that should be set to true right after we allocate & initialize ConnectionHashEntry->connections list properly, and we check it before accesing to ConnectionHashEntry->connections.	2020-11-30 19:44:03 +03:00
SaitTalhaNisanci	af02ac6cf5	Refactor MultiRouterPlannableQuery (#4350 ) The name of the function is different than the implemantation. Because the function is designed to only consider SELECT queries. Also this changes the assert with an error.	2020-11-27 18:44:38 +03:00
Nils Dijk	326e6afa53	refactor table ddl events scoped for shards (#4342 ) Refactor internals on how Citus creates the SQL commands it sends to recreate shards. Before Citus collected solely ddl commands as `char `'s to recreate a table. If they were used to create a shard they were wrapped with `worker_apply_shard_ddl_command` and send to the workers. On the workers the UDF wrapping the ddl command would rewrite the parsetree to replace tables names with their shard name equivalent. This worked well, but poses an issue when adding columnar. Due to limitations in Postgres on creating custom options on table access methods we need to fall back on a UDF to set columnar specific options. Now, to recreate the table, we can not longer rely on having solely DDL statements to recreate a table. A prototype was made to run this UDF wrapped in `worker_apply_shard_ddl_command`. This became pretty messy, hard to understand and subsequently hard to maintain. This PR proposes a refactor of the internal representation of table ddl commands into a `TableDDLCommand` structure. The current implementation only supports a `char ` as its contents. Based on the use of the DDL statement (eg. creating the table -mx- or creating a shard) one of two different functions can be called to get the statement to send to the worker: - `GetTableDDLCommand(TableDDLCommand command)`: This function returns that ddl command to create the table. In this implementation it will just return the `char `. This has the same functionality as getting the old list and not wrapping it. - `GetShardedTableDDLCommand(TableDDLCommand command, uint64 shardId, char schemaName)`: This function returns the ddl command wrapped in `worker_apply_shard_ddl_command` with the `shardId` as an argument. Due to backwards compatibility it also accepts a. `schemaName`. The exact purpose is not directly clear. Ideally new implementations would work with fully qualified statements and ignore the `schemaName`. A future implementation could accept 2.function pointers and a `void *` for context to let the two pointers work on. This gives greater flexibility in controlling what commands get send in which situations. Also, in a future, we could implement the intermediate step of creating the `parsetree` datastructure of statements based on the contents in the catalog with a corresponding deparser. For sharded queries a mutator could be ran over the parsetree to rewrite the tablenames to the names with the shard identifier. This will completely omit the requirement for `worker_apply_shard_ddl_command`.	2020-11-26 13:31:59 +01:00
SaitTalhaNisanci	83020f444e	Initialize fast planner restriction context (#4349 ) We initialize fast planner restriction context so that code paths that rely on this being not NULL will operate without a problem.	2020-11-26 13:45:27 +03:00
Onder Kalaci	629ecc3dee	Add the infrastructure to count the number of client backends Considering the adaptive connection management improvements that we plan to roll soon, it makes it very helpful to know the number of active client backends. We are doing this addition to simplify yhe adaptive connection management for single node Citus. In single node Citus, both the client backends and Citus parallel queries would compete to get slots on Postgres' `max_connections` on the same Citus database. With adaptive connection management, we have the counters for Citus parallel queries. That helps us to adaptively decide on the remote executions pool size (e.g., throttle connections if necessary). However, we do not have any counters for the total number of client backends on the database. For single node Citus, we should consider all the client backends, not only the remote connections that Citus does. Of course Postgres internally knows how many client backends are active. However, to get that number Postgres iterates over all the backends. For examaple, see [pg_stat_get_db_numbackends](`8e90ec5580/src/backend/utils/adt/pgstatfuncs.c (L1240)`) where Postgres iterates over all the backends. For our purpuses, we need this information on every connection establishment. That's why we cannot affort to do this kind of iterattion.	2020-11-25 19:19:24 +01:00
SaitTalhaNisanci	180195b445	Remove unused parameter from VarConstOpExprClause (#4348 )	2020-11-25 21:00:22 +03:00
Ahmet Gedemenli	a64dc8a72b	Fixes a bug preventing INSERT SELECT .. ON CONFLICT with a constraint name on local shards Separate search relation shard function Add tests	2020-11-25 15:10:46 +03:00
Onur Tirtir	46be63d76b	Refactor PreprocessIndexStmt (#4272 )	2020-11-25 12:19:37 +03:00
Onder Kalaci	7accbff3f6	Do not cache all the distributed table metadata during CitusTableTypeIdList() CitusTableTypeIdList() function iterates on all the entries of pg_dist_partition and loads all the metadata in to the cache. This can be quite memory intensive especially when there are lots of distributed tables. When partitioned tables are used, it is common to have many distributed tables given that each partition also becomes a distributed table. CitusTableTypeIdList() is used on every CREATE TABLE .. PARTITION OF.. command as well. It means that, anytime a partition is created, Citus loads all the metadata to the cache. Note that Citus typically only loads the accessed table's metadata to the cache.	2020-11-24 17:44:06 +01:00
Önder Kalacı	c760cd3470	Move local execution after remote execution (#4301 ) * Move local execution after the remote execution Before this commit, when both local and remote tasks exist, the executor was starting the execution with local execution. There is no strict requirements on this. Especially considering the adaptive connection management improvements that we plan to roll soon, moving the local execution after to the remote execution makes more sense. The adaptive connection management for single node Citus would look roughly as follows: - Try to connect back to the coordinator for running parallel queries. - If succeeds, go on and execute tasks in parallel - If fails, fallback to the local execution So, we'll use local execution as a fallback mechanism. And, moving it after to the remote execution allows us to implement such further scenarios.	2020-11-24 13:43:38 +01:00
Önder Kalacı	532b457554	Solidify the slow-start algorithm (#4318 ) The adaptive executor emulates the TCP's slow start algorithm. Whenever the executor needs new connections, it doubles the number of connections established in the previous iteration. This approach is powerful. When the remote queries are very short (like index lookup with < 1ms), even a single connection is sufficent most of the time. When the remote queries are long, the executor can quickly establish necessary number of connections. One missing piece on our implementation seems that the executor keeps doubling the number of connections even if the previous connection attempts have been finalized. Instead, we should wait until all the attempts are finalized. This is how TCP's slow-start works. Plus, it decreases the unnecessary pressure on the remote nodes.	2020-11-23 19:20:13 +01:00
Onder Kalaci	c433c66f2b	Do not execute subplans multiple times with cursors Before this commit, we let AdaptiveExecutorPreExecutorRun() to be effective multiple times on every FETCH on cursors. That does not affect the correctness of the query results, but adds significant overhead.	2020-11-20 10:43:56 +01:00
Önder Kalacı	b0ddbbd33a	Enable parallel query on EXPLAIN ANALYZE (#4325 ) It seems that we forgot to pass the revelant flag to enable Postgres' parallel query capabilities on the shards when user does EXPLAIN ANALYZE on a distributed table.	2020-11-20 09:54:04 +01:00
SaitTalhaNisanci	9c44911226	Improve error messages in shard pruning (#4324 )	2020-11-18 17:16:06 +03:00
Nils Dijk	7c891a01a9	create missing objects during upgrade path	2020-11-17 19:01:51 +01:00
Nils Dijk	d065bb495d	Prepare downgrade script and bump development version to 10.0-1	2020-11-17 18:55:35 +01:00
Nils Dijk	213eb93e6d	make columnar compile and functionally working	2020-11-17 18:55:34 +01:00
Önder Kalacı	0c0fc69f2a	Remove unused field (#4275 )	2020-11-17 11:41:57 +01:00
Nils Dijk	7d14800071	add placeholder for enterprise modules	2020-11-11 15:43:04 +01:00
Onur Tirtir	4bf754b245	Fix location of citus--10.0-1--9.5-1.sql downgrade script (#4306 )	2020-11-09 16:43:56 +03:00
Onur Tirtir	5e3dc9d707	Bump citus version to 10.0devel	2020-11-09 13:16:54 +03:00
Hanefi Onaldi	d3019f1b6d	Introduce foreach_ptr_modify macro (#4303 ) If one wishes to iterate through a List and insert list elements in PG13, it is not safe to use for_each_ptr as the List representation in PostgreSQL no longer linked lists, but arrays, and it is possible that the whole array is repalloc'ed if ther is not sufficient space available. See postgres commit 1cff1b95ab6ddae32faa3efe0d95a820dbfdc164 for more information	2020-11-09 12:03:59 +03:00
Onder Kalaci	e0d2ac7620	Do not rely on set_rel_pathlist_hook for finding local relations When a relation is used on an OUTER JOIN with FALSE filters, set_rel_pathlist_hook may not be called for the table. There might be other cases as well, so do not rely on the hook for classification of the tables.	2020-11-06 11:14:30 +01:00
Onur Tirtir	cc8be422ce	Fix relkind checks in planner for relkinds other than RELKIND_RELATION (#4294 ) We were qualifying relations with relkind != RELKIND_RELATION as non-relations due to the strict checks around RangeTblEntry->relkind in planner.	2020-11-05 14:21:02 +03:00
SaitTalhaNisanci	25de5b1290	Fix uninitilized variable (#4293 ) Valgrind found that, we were doing an if check on uninitialized variable and it seems that this is on context.appendparents. `ac22929a26/src/backend/utils/adt/ruleutils.c (L1054)`	2020-11-04 12:08:15 +03:00
Hanefi Önaldı	d6f19e2298	Honor error message conventions	2020-11-03 18:11:18 +03:00
Hanefi Önaldı	85a4b61a0e	Prevent undistribute_table calls for partitions	2020-11-03 18:10:20 +03:00
Hanefi Önaldı	5db380f33a	Prevent undistribute_table calls for foreign tables	2020-11-03 17:33:29 +03:00
Halil Ozan Akgul	77b3be8b6d	Turn RelOptInfos to only used field of them, relids, to be able to copy	2020-10-22 13:42:28 +03:00
Onur Tirtir	ef49b75cd6	Fix memory issues around deparsing index commands (#4270 )	2020-10-22 13:17:13 +03:00
Onder Kalaci	5c4c9304ba	Remove RemoveDuplicateJoinRestrictions() function RemoveDuplicateJoinRestrictions() function was introduced with the aim of decrasing the overall planning times by eliminating the duplicate JOIN restriction entries (#1989). However, it turns out that the function itself is so CPU intensive with a very high algorithmic complexity, it hurts a lot more than it helps. The function is a clear example of premature optimization. The table below shows the difference clearly: "distributed query planning time master" RemoveDuplicateJoinRestrictions() execution time on master "Remove the function RemoveDuplicateJoinRestrictions() this PR" 5 table INNER JOIN 9 msec 2msec 7 msec 10 table INNER JOIN 227 msec 194 msec 29 msec 20 table INNER JOIN 1 sec 235 msec 1 sec 139 msec 90 msecs 50 table INNER JOIN 24 seconds 21 seconds 1.5 seconds 100 table INNER JOIN 2 minutes 16 secods 1 minute 53 seconds 23 seconds 250 table INNER JOIN Bottleneck on JoinClauseList 18 minutes 52 seconds Bottleneck on JoinClauseList 5 table INNER JOIN in subquery 9 msec 0 msec 6 msec 10 table INNER JOIN subquery 33 msec 10 msec 32 msec 20 table INNER JOIN subquery 132 msec 67 msec 123 msec 50 table INNER JOIN subquery 1.2 seconds 900 msec 500 msec 100 table INNER JOIN subquery 6 seconds 5 seconds 2 seconds 250 table INNER JOIN subquery 54 seconds 37 seconds 20 seconds 5 table LEFT JOIN 5 msec 0 msec 5 msec 10 table LEFT JOIN 11 msec 0 msec 13 msec 20 table LEFT JOIN 26 msec 2 msec 30 msec 50 table LEFT JOIN 150 msec 15 msec 193 msec 100 table LEFT JOIN 757 msec 71 msec 722 msec 250 table LEFT JOIN 8 seconds 600 msec 8 seconds 5 JOINs among 2 table JOINs 37 msec 11 msec 25 msec 10 JOINs among 2 table JOINs 536 msec 306 msec 352 msec 20 JOINs among 2 table JOINs 794 msec 181 msec 640 msec 50 JOINs among 2 table JOINs 25 seconds 2 seconds 22 seconds 100 JOINs among 2 table JOINs Bottleneck on JoinClauseList 9 seconds Bottleneck on JoinClauseList 150 JOINs among 2 table JOINs Bottleneck on JoinClauseList 46 seconds Bottleneck on JoinClauseList On top of the performance penalty, the function had a critical bug #4255, and with #4254 we hit one more important bug. It should be fixed by adding the followig check to the ContextCoversJoinRestriction(): ``` static bool JoinRelIdsSame(JoinRestriction leftRestriction, JoinRestriction rightRestriction) { Relids leftInnerRelIds = leftRestriction->innerrel->relids; Relids rightInnerRelIds = rightRestriction->innerrel->relids; if (!bms_equal(leftInnerRelIds, rightInnerRelIds)) { return false; } Relids leftOuterRelIds = leftRestriction->outerrel->relids; Relids rightOuterRelIds = rightRestriction->outerrel->relids; if (!bms_equal(leftOuterRelIds, rightOuterRelIds)) { return false; } return true; } ``` However, adding this eliminates all the benefits tha RemoveDuplicateJoinRestrictions() brings. I've used the commands here to generate the JOINs mentioned in the PR: https://gist.github.com/onderkalaci/fe8654f9df5916c7af4c7c5eb892561e#file-gistfile1-txt Inner and outer JOINs behave roughly the same, to simplify the table only added INNER joins.	2020-10-21 10:29:39 +02:00
SaitTalhaNisanci	0f209377c4	Fix incorrect join related fields (#4242 ) * Fix incorrect join related fields Ruleutils expect to give the original index of join columns hence we should consider the dropped columns while setting the fields in SetJoinRelatedFieldsCompat. * add some more tests for joins * Move tests to join.sql and create a utility function	2020-10-19 18:28:39 +03:00
Onur Tirtir	c49077d594	Disallow outer joins `ON TRUE` with ref & dist tables when ref table is outer relation (#4255 ) Disallow `ON TRUE` outer joins with reference & distributed tables when reference table is outer relation by fixing the logic bug made when calling `LeftListIsSubset` function. Also, be more defensive when removing duplicate join restrictions when join clause is empty for non-inner joins as they might still contain useful information for non-inner joins.	2020-10-19 16:58:11 +03:00
Onur Tirtir	f80f4839ad	Remove unused functions that cppcheck found	2020-10-19 13:50:52 +03:00
Onder Kalaci	bbedfca761	Improve the relation restriction counters It seems like Postgres could call set_rel_pathlist() for the same relation multiple times. This breaks the logic where we assume relationCount eqauls to the number of entries in relationRestrictionList. In summary, relationRestrictionList may contain duplicate entries.	2020-10-19 08:51:16 +02:00
Nils Dijk	caabbf4b84	Table access method support for distributed tables	2020-10-16 12:02:25 -07:00
Onur Tirtir	7cb07c70fa	Move hasSemiJoin to JoinRestrictionContext (#4256 )	2020-10-16 18:37:39 +03:00
Marco Slot	8976f245ab	Support reference table view in reference table modification	2020-10-16 11:31:24 +02:00
Onur Tirtir	de6f2d3f42	Refactor JoinRestrictionListExistsInContext to improve readability (#4249 )	2020-10-16 12:24:56 +03:00
Onder Kalaci	fe3caf3bc8	Local execution considers intermediate result size limit With this commit, we make sure that local execution adds the intermediate result size as the distributed execution adds. Plus, it enforces the citus.max_intermediate_result_size value.	2020-10-15 17:18:55 +02:00
Marco Slot	31858c8a29	Check table existence in EnsureRelationKindSupported	2020-10-15 17:05:06 +02:00
Sait Talha Nisanci	ecde6c6eef	Introduce GetCurrentLocalExecutionStatus wrapper We should not access CurrentLocalExecutionStatus directly because that would mean that we could also set it directly, which we shouldn't because we have checks to see if the new state is possible, otherwise we error.	2020-10-15 15:38:19 +03:00
Simon Kelly	4f94e544b7	create 9.5-1 udfs and update citus--9.4-1--9.5-1.sql	2020-10-15 13:50:36 +02:00
Simon Kelly	2a6c867cb0	Make citus_prepare_pg_upgrade idempotent https://github.com/citusdata/citus/issues/3527	2020-10-15 13:49:50 +02:00
Onder Kalaci	15e724c073	Add regression tests for outer/cross JOINs	2020-10-14 15:17:30 +02:00
Onder Kalaci	de33079065	Improve outer join checks Before this commit, the logic was: - As long as the outer side of the JOIN is not a JOIN (e.g., relation or subquery etc.), we check for the existence of any recurring tuples. There were two implications of this decision. First, even if a subquery which is on the outer side contains distributed table JOIN reference table, Citus would unnecessarily throw an error. Note that, the JOIN inside the subquery would already be going to be tested recursively. But, as long as that check passes, there is no reason for the upper JOIN to fail. An example, which used to fail and now works: SELECT * FROM (SELECT * FROM dist JOIN ref) as foo LEFT JOIN dist; Second, certain JOINs, especially with ON (true) conditions were not represented as Citus expects the JOINs to be in the format DeferredErrorIfUnsupportedRecurringTuplesJoin().	2020-10-14 15:17:30 +02:00
Onur Tirtir	1a28858c47	Disallow field indirection in INSERT/UPDATE queries (#4241 )	2020-10-14 14:11:59 +03:00
Onur Tirtir	8efca3b60a	Fix a crash with inserting domain composite types in coord. evaluation (#4231 ) Use short lived per-tuple context in citus_evaluate_expr like (pg) evaluate_expr does. We should not use planState->ExprContext when evaluating expressions as it might lead to freeing the same executor twice (first one happens in citus_evaluate_expr itself and the other one happens when postgres doing clean-up for the top level executor state), which in turn might cause seg.faults. However, now as we don't have necessary planState info to evaluate prepared statements, we also add planState->es_param_list_info to per-tuple ExprContext.	2020-10-13 14:19:59 +03:00

... 5 6 7 8 9 ...

2583 Commits (65bd540943cfa36ca886ef3025c7e4920a3e8ad9)