Commit Graph

1200 Commits (a38428b665032d7c96f74796bd35fa8c1927e8c5)

Author SHA1 Message Date
Naisila Puka 4fb05efabb
Distributes partition-to-be table before ProcessUtility (#5191)
* Skip ALTER TABLE constraint checks while planning

* Revert previous commit's solution, keep tests

* Distribute partition-to-be table before ProcessUtility

* Acquire locks in PreprocessAlterTableStmtAttachPartition
2021-09-02 13:07:42 +03:00
Hanefi Onaldi 7e39c7ea83
Replace master with citus in logs and comments (#5210)
I replaced 

- master_add_node,
- master_add_inactive_node
- master_activate_node

with

- citus_add_node,
- citus_add_inactive_node
- citus_activate_node

respectively.
2021-08-26 11:31:17 +03:00
Onur Tirtir 4b03195c06 Use RelationGetStatExtList instead of GetExplicitStatisticsIdList 2021-08-18 17:50:57 +03:00
Ahmet Gedemenli 9e90894f21
Synchronize hasmetadata flag on mx workers (#5086)
* Synchronize hasmetadata flag on mx workers

* Switch to sequential execution

* Add test

* Use SetWorkerColumn

* Add test for stop_sync

* Remove usage of UpdateHasmetadataOnWorkersWithMetadata

* Remove MarkNodeMetadataSynced

* Fix test for metadatasynced

* Remove MarkNodeMetadataSynced

* Style

* Remove MarkNodeHasMetadata

* Remove UpdateDistNodeBoolAttr

* Refactor SetWorkerColumn

* Use SetWorkerColumnLocalOnly when setting up dependencies

* Use SetWorkerColumnLocalOnly in TriggerSyncMetadataToPrimaryNodes

* Style

* Make update command generator functions static

* Set metadatasynced before syncing

* Call SetWorkerColumn only if the sync is successful

* Try to sync all nodes

* Fix indexno

* Update metadatasynced locally first

* Break if a node fails to sync metadata

* Send worker commands optional

* Style & Rebase

* Add raiseOnError param to SetWorkerColumn

* Style

* Set metadatasynced for all metadata nodes

* Style

* Introduce SetWorkerColumnOptional

* Polish

* Style

* Don't send set command to not-synced metadata nodes

* Style

* Polish

* Add test for stop_sync

* Add test for shouldhaveshards

* Add test for isactive flag

* Sort by placementid in the function verify_metadata

* Cover edge cases for failing nodes

* Add comments

* Add nodeport to isactive test

* Add warning if metadata out of sync

* Update warning message
2021-08-12 14:16:18 +03:00
Onder Kalaci 5f02d18ef8 transactional metadata sync for maintenance daemon
As we use the current user to sync the metadata to the nodes
with #5105 (and many other PRs), there is no reason that
prevents us from using the coordinated transaction for metadata syncing.

This commit also renames a few functions to reflect their actual
implementation.
2021-08-09 10:34:55 +02:00
Onder Kalaci 35964c6366 Dropped columns do not diverge distribution column for partitioned tables
Before this commit, creating a partition after a DROP column
on the parent (at a position before the dist. key) could lead to
the partition having the wrong distribution column.
2021-08-06 13:36:12 +02:00
Onder Kalaci 482b8096e9 Introduce citus_internal_update_relation_colocation
update_distributed_table_colocation can be called by the relation
owner, and internally it updates pg_dist_partition. With this
commit, update_distributed_table_colocation uses an internal
UDF to access pg_dist_partition.

As a result, this operation can now be done by regular users
on MX.
2021-08-03 11:44:58 +02:00
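As a sketch of what the commit above enables (table names hypothetical), a regular user who owns both tables can now run this on MX:

```sql
-- Run as the table owner; Citus routes the pg_dist_partition update
-- through the internal UDF instead of requiring superuser access.
SELECT update_distributed_table_colocation('orders', colocate_with := 'customers');
```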
SaitTalhaNisanci 4559d02c41
Fix union pushdown issue (#5079)
* Fix UNION not being pushed down

Postgres optimizes column fields that are not needed in the output. We
were relying on these fields to understand if it is safe to push down a
union query.

This fix looks at the parse query, which has the original column fields
to detect if it is safe to push down a union query.

* Add more tests

* Simplify code and make it more robust

* Process varlevelsup > 0 in FindReferencedTableColumn

* Only look for outer vars in union path

* Add more comments

* Remove UNION ALL specific logic for pulling up childvars
2021-07-29 13:52:55 +03:00
Jelte Fennema 7d0b6dc9be Include data_type and cache in sequence definition on workers
These two options were not included when creating the sequences on the
workers as part of metadata syncing.

The missing `data_type` part of the definition made finding the cause
of #5126 harder than necessary, because of confusing errors.
2021-07-22 11:49:06 +02:00
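For illustration, a sketch of a sequence definition whose `data_type` and `cache` clauses were previously dropped during metadata sync (sequence name hypothetical):

```sql
-- With this fix, both the AS (data_type) and CACHE parts of the
-- definition are included when the sequence is recreated on the workers.
CREATE SEQUENCE user_id_seq AS int CACHE 100;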
Onder Kalaci 2c349e6dfd Use current user to sync metadata
Before this commit, we always synced the metadata with superuser.
However, that creates various edge cases such as visibility errors
or self distributed deadlocks or complicates user access checks.

Instead, with this commit, we use the current user to sync the metadata.
Note that `start_metadata_sync_to_node` still requires superuser
because accessing certain metadata (like pg_dist_node) always requires
superuser (e.g., the current user should be a superuser).

However, metadata syncing operations regarding the distributed
tables can now be done with regular users, as long as the user
is the owner of the table. A table owner can still insert nonsense
metadata; however, it'd only affect its own table, so we cannot do
anything about that.
2021-07-16 13:25:27 +02:00
Sait Talha Nisanci e7ed16c296 Not include to-be-deleted shards while finding shard placements
Ignore orphaned shards in more places

Only use active shard placements in RouterInsertTaskList

Use IncludingOrphanedPlacements in some more places

Fix comment

Add tests
2021-06-28 13:05:31 +03:00
Naisila Puka fe5907ad2d
Adds propagation of ALTER SEQUENCE and other improvements (#5061)
* Alter seq type when we first use the seq in a dist table

* Don't allow type changes when seq is used in dist table

* ALTER SEQUENCE propagation

* Tests for ALTER SEQUENCE propagation

* Relocate AlterSequenceType and ensure dependencies for sequence

* Support for citus local tables, and other fixes

* Final formatting
2021-06-24 21:23:25 +03:00
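A hedged example of the kind of statement this commit starts propagating (sequence name hypothetical):

```sql
-- With propagation, an ALTER SEQUENCE on a sequence used by a
-- distributed table is applied on the workers as well, not just locally.
ALTER SEQUENCE user_id_seq INCREMENT BY 10 RESTART WITH 1000;
```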
Jelte Fennema d1d386a904
Only allow moves of shards of distributed tables (#5072)
Moving shards of reference tables was possible in at least one case:
```sql
select citus_disable_node('localhost', 9702);
create table r(x int);
select create_reference_table('r');
set citus.replicate_reference_tables_on_activate = off;
select citus_activate_node('localhost', 9702);
select citus_move_shard_placement(102008, 'localhost', 9701, 'localhost', 9702);
```

This would then remove the reference table shard on the source, causing
all kinds of issues. This fixes that by disallowing all shard moves
except for shards of distributed tables.

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2021-06-23 16:25:46 +02:00
Jelte Fennema ca00b63272
Avoid two race conditions in the rebalance progress monitor (#5050)
The first and main issue was that we were putting absolute pointers into
shared memory for the `steps` field of the `ProgressMonitorData`. This
pointer was being overwritten every time a process requested the monitor
steps, which is the only reason why this even worked in the first place.

To quote a part of a relevant stack overflow answer:

> First of all, putting absolute pointers in shared memory segments is
> terrible, terrible idea - those pointers would only be valid in the
> process that filled in their values. Shared memory segments are not
> guaranteed to attach at the same virtual address in every process.
> On the contrary - they attach where the system deems it possible when
> `shmaddr == NULL` is specified on call to `shmat()`

Source: https://stackoverflow.com/a/10781921/2570866

In this case a race condition occurred when a second process overwrote
the pointer in between the first process's write and its read of the steps
field.

This issue is fixed by not storing the pointer in shared memory anymore.
Instead we now calculate its position every time we need it.

The second race condition I have not been able to trigger, but I found
it while investigating this. This issue was that we published the handle
of the shared memory segment, before we initialized the data in the
steps. This means that during initialization of the data, a call to
`get_rebalance_progress()` could read partial data in an unsynchronized
manner.
2021-06-21 14:03:42 +00:00
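For context, the reader side of this progress monitor is the `get_rebalance_progress()` UDF; a minimal call looks like:

```sql
-- Reads the rebalancer's shared-memory progress state; with this fix the
-- steps position is recomputed per call instead of stored as a pointer.
SELECT * FROM get_rebalance_progress();
```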
Onder Kalaci 69ca943e58 Deparse/parse the local cached queries
With local query caching, we try to avoid deparse/parse stages as the
operation is too costly.

However, we can do deparse/parse operations once per cached query, right
before we put the plan into the cache. With that, we avoid edge
cases like (4239) or (5038).

In a sense, we are making the local plan caching behave similarly to non-cached
local/remote queries, by forcing to deparse the query once.
2021-06-21 12:24:29 +03:00
Onur Tirtir 6215a3aa93 Merge remote-tracking branch 'origin/master' into columnar-index 2021-06-17 14:31:12 +03:00
Onder Kalaci bc09288651 Get ready for Improve index backed constraint creation for online rebalancer
See:
https://github.com/citusdata/citus-enterprise/issues/616
2021-06-17 13:05:56 +03:00
Onur Tirtir 3d11c0f9ef Merge remote-tracking branch 'origin/master' into columnar-index
Conflicts:
	src/test/regress/expected/columnar_empty.out
	src/test/regress/expected/multi_extension.out
2021-06-16 20:23:50 +03:00
Hanefi Onaldi 5c6069a74a
Do not rely on fk cache when truncating local data (#5018) 2021-06-07 11:56:48 +03:00
Marco Slot e81d25a7be Refactor RelationIsAKnownShard to remove onlySearchPath argument 2021-06-02 14:30:27 +02:00
Ahmet Gedemenli 089ef35940 Disable dropping and truncating known shards
Add test for disabling dropping and truncating known shards
2021-06-02 14:30:27 +02:00
Jelte Fennema 1a83628195 Use "orphaned shards" naming in more places
We were not very consistent in how we named these shards.
2021-06-04 11:39:19 +02:00
Jelte Fennema 3f60e4f394 Add ExecuteCriticalCommandInDifferentTransaction function
We use this pattern multiple times throughout the codebase now. Seems
like a good moment to abstract it away.
2021-06-04 11:30:27 +02:00
Jelte Fennema 280b9ae018 Cleanup orphaned shards at the start of a rebalance
In case the background daemon hasn't cleaned up shards yet, we do this
manually at the start of a rebalance.
2021-06-04 11:23:07 +02:00
Naisila Puka 0f37ab5f85
Fixes column default coming from a sequence (#4914)
* Add user-defined sequence support for MX

* Remove default part when propagating to workers

* Fix ALTER TABLE with sequences for mx tables

* Clean up and add tests

* Propagate DROP SEQUENCE

* Removing function parts

* Propagate ALTER SEQUENCE

* Change sequence type before propagation & cleanup

* Revert "Propagate ALTER SEQUENCE"

This reverts commit 2bef64c5a29f4e7224a7f43b43b88e0133c65159.

* Ensure sequence is not used in a different column with different type

* Insert select tests

* Propagate rename sequence stmt

* Fix issue with group ID cache invalidation

* Add ALTER TABLE ALTER COLUMN TYPE .. precaution

* Fix attnum inconsistency and add various tests

* Add ALTER SEQUENCE precaution

* Remove Citus hook

* More tests

Co-authored-by: Marco Slot <marco.slot@gmail.com>
2021-06-03 23:02:09 +03:00
Hanefi Onaldi fa29d6667a
Accept invalidation before fk graph validity check (#5017)
InvalidateForeignKeyGraph sends an invalidation via shared memory to all
backends, including the current one.

However, we might not call AcceptInvalidationMessages before reading
from the cache below. It would be better to also add a call to
AcceptInvalidationMessages in IsForeignConstraintRelationshipGraphValid.
2021-06-02 14:45:35 +03:00
Onur Tirtir 94f30a0428 Refactor index check in ColumnarProcessUtility 2021-06-01 11:12:28 +03:00
Jelte Fennema 3271f1bd13
Fix data race in get_rebalance_progress (#5008)
To be able to report progress of the rebalancer, the rebalancer updates
the state of a shard move in a shared memory segment. To then fetch the
progress, `get_rebalance_progress` can be called which reads this shared
memory.

Without this change it did so without using any synchronization
primitives, allowing for data races. This fixes that by using atomic
operations to update and read from the parts of the shared memory that
can be changed after initialization.
2021-05-31 15:27:32 +02:00
SaitTalhaNisanci 8c3f85692d
Not consider old placements when disabling or removing a node (#4960)
* Not consider old placements when disabling or removing a node

* update cluster test
2021-05-28 22:38:20 +02:00
SaitTalhaNisanci a4944a2102
Rename CoordinatedTransactionShouldUse2PC (#4995) 2021-05-21 18:57:42 +03:00
Hanefi Onaldi 878513f325
Remove all occurrences of replication_model GUC 2021-05-21 16:14:59 +03:00
Jelte Fennema 10f06ad753 Fetch shard size on the fly for the rebalance monitor
Without this change the rebalancer progress monitor gets the shard sizes
from the `shardlength` column in `pg_dist_placement`. This column needs to
be updated manually by calling `citus_update_table_statistics`.
However, `citus_update_table_statistics` could lead to distributed
deadlocks while database traffic is on-going (see #4752).

To work around this we don't use the `shardlength` column anymore. Instead
for every rebalance we now fetch all shard sizes on the fly.

Three additional things this does are:
1. It adds tests for the rebalance progress function.
2. If a shard move cannot be done because a source or target node is
   unreachable, then we error and stop the rebalance, instead of showing
   a warning and continuing. When using the by_disk_size rebalance
   strategy it's not safe to continue with other moves if a specific
   move failed. It's possible that the failed move made space for the
   next move, and because the failed move never happened this space now
   does not exist.
3. Adds two new columns to the result of `get_rebalancer_progress` which
   show the size of the shard on the source and target node.

Fixes #4930
2021-05-20 16:38:17 +02:00
Nils Dijk a6c2d2a4c4
Feature: alter database owner (#4986)
DESCRIPTION: Add support for ALTER DATABASE OWNER

This adds support for changing the database owner. It achieves this by marking the database as a distributed object. By marking the database as a distributed object it will look for its dependencies and order the user creation commands (enterprise only) before the alter of the database owner. This is mostly important when adding new nodes.

By having the database marked as a distributed object it can easily understand for which `ALTER DATABASE ... OWNER TO ...` commands to propagate by resolving the object address of the database and verifying it is a distributed object, and hence should propagate changes of ownership to all workers.

Given the ownership of the database might have implications on subsequent commands in transactions we force sequential mode for transactions that have a `ALTER DATABASE ... OWNER TO ...` command in them. This will fail the transaction with meaningful help when the transaction already executed parallel statements.

By default the feature is turned off since roles are not automatically propagated; having it turned on would cause hard-to-understand errors for the user. It can be turned on by the user via the `citus.enable_alter_database_owner` setting.
2021-05-20 13:27:44 +02:00
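A sketch of enabling and using the feature, per the GUC named in the message (database and role names hypothetical):

```sql
-- Off by default because roles are not automatically propagated.
SET citus.enable_alter_database_owner TO on;
-- Forces sequential mode for the rest of the transaction.
ALTER DATABASE app_db OWNER TO app_admin;
```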
Onder Kalaci d07db99ea4 Make sure that target node in shard moves is eligible for shard move 2021-05-20 10:51:01 +02:00
Onder Kalaci 926069a859 Wait until all connections are successfully established
Comment from the code:
/*
 * Iterate until all the tasks are finished. Once all the tasks
 * are finished, ensure that all the connection initializations
 * are also finished. Otherwise, those connections are terminated
 * abruptly before they are established (or failed). Instead, we let
 * the ConnectionStateMachine() properly handle them.
 *
 * Note that we could have the connections that are not established
 * as a side effect of slow-start algorithm. At the time the algorithm
 * decides to establish new connections, the execution might have tasks
 * to finish. But, the execution might finish before the new connections
 * are established.
 */

 Note that the abruptly terminated connections lead to the following errors:

2020-11-16 21:09:09.800 CET [16633] LOG:  could not accept SSL connection: Connection reset by peer
2020-11-16 21:09:09.872 CET [16657] LOG:  could not accept SSL connection: Undefined error: 0
2020-11-16 21:09:09.894 CET [16667] LOG:  could not accept SSL connection: Connection reset by peer

To easily reproduce the issue:

- Create a single node Citus
- Add the coordinator to the metadata
- Create a distributed table with shards on the coordinator
- f.sql:  select count(*) from test;
- pgbench -f /tmp/f.sql postgres -T 12 -c 40 -P 1  or pgbench -f /tmp/f.sql postgres -T 12 -c 40 -P 1 -C
2021-05-19 15:59:13 +02:00
Onder Kalaci 995adf1a19 Executor takes connection establishment and task execution costs into account
With this commit, the executor becomes smarter about refraining from opening
new connections. The very basic example is that, if connection
establishment takes 1000 ms and task execution 5 ms, the executor
becomes smart enough to not establish new connections.
2021-05-19 15:48:07 +02:00
Marco Slot 644b266dee Only cache local plans when reusing a distributed plan 2021-05-18 16:11:43 +02:00
SaitTalhaNisanci eaa7d2bada
Not block maintenance daemon (#4972)
It was possible to block the maintenance daemon by taking a SHARE ROW
EXCLUSIVE lock on pg_dist_placement. Until the lock is released the
maintenance daemon would be blocked.

We should not block the maintenance daemon in any case, hence we now
try to get the pg_dist_placement lock without waiting; if we cannot get
it, then we don't try to drop the old placements.
2021-05-17 03:22:35 -07:00
Nils Dijk c91f8d8a15
Feature: localhost guc (#4836)
DESCRIPTION: introduce `citus.local_hostname` GUC for connections to the current node

Citus once in a while needs to connect to itself for some system operations. This used to be hardcoded to `localhost`. The hardcoded hostname causes some issues, for example in environments where `sslmode=verify-full` is required. It is not always desirable or even feasible to get `localhost` as an alt name on the certificate.

By introducing a GUC to use when connecting to the current instance, the user has more control over what network path is used and what hostname is required to be present in the server certificate.
2021-05-12 16:59:44 +02:00
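A sketch of setting the GUC so self-connections match the server certificate (hostname hypothetical):

```sql
-- Used instead of the hardcoded 'localhost' when Citus connects to
-- itself, e.g. under sslmode=verify-full.
ALTER SYSTEM SET citus.local_hostname TO 'node-1.internal.example.com';
SELECT pg_reload_conf();
```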
Jelte Fennema cbbd10b974
Implement an improvement threshold in the rebalancer (#4927)
Every move in the rebalancer algorithm results in an improvement in the
balance. However, even if the improvement in the balance was very small
the move was still chosen. This is especially problematic if the shard
itself is very big and the move will take a long time.

This changes the rebalancer algorithm to take the relative size of the
balance improvement into account when choosing moves. By default a move
will not be chosen if it improves the balance by less than half of the
size of the shard. An extra argument is added to the rebalancer
functions so that the user can decide to lower the default threshold if
the ignored move is wanted anyway.
2021-05-11 14:24:59 +02:00
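The message doesn't name the extra argument; assuming it is `improvement_threshold` (as in current Citus releases), lowering it might look like:

```sql
-- A threshold of 0 restores the old behavior: any move that improves
-- the balance is chosen, however small the improvement.
SELECT rebalance_table_shards(improvement_threshold := 0);
```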
Onder Kalaci a231ff29b0 Get prepared for some improvements for online rebalancer
To see all the changes, see https://github.com/citusdata/citus-enterprise/pull/586/files
2021-05-10 19:54:31 +02:00
jeff-davis 7b9aecff21 Columnar: metapage changes. (#4907)
* Columnar: introduce columnar storage API.

This new API is responsible for the low-level storage details of
columnar; translating large reads and writes into individual block
reads and writes that respect the page headers and emit WAL. It's also
responsible for the columnar metapage, resource reservations (stripe
IDs, row numbers, and data), and truncation.

This new API is not used yet, but will be used in subsequent
forthcoming commits.

* Columnar: add columnar_storage_info() for debugging purposes.

* Columnar: expose ColumnarMetadataNewStorageId().

* Columnar: always initialize metapage at creation time.

This avoids the complexity of dealing with tables where the metapage
has not yet been initialized.

* Columnar: columnar storage upgrade/downgrade UDFs.

Necessary upgrade/downgrade step so that new code doesn't see an old
metapage.

* Columnar: improve metadata.c comment.

* Columnar: make ColumnarMetapage internal to the storage API.

Callers should not have or need direct access to the metapage.

* Columnar: perform resource reservation using storage API.

* Columnar: implement truncate using storage API.

* Columnar: implement read/write paths with storage API.

* Columnar: add storage tests.

* Revert "Columnar: don't include stripe reservation locks in lock graph."

This reverts commit c3dcd6b9f8.

No longer needed because the columnar storage API takes care of
concurrency for resource reservation.

* Columnar: remove unnecessary lock when reserving.

No longer necessary because the columnar storage API takes care of
concurrent resource reservation.

* Add simple upgrade tests for storage/ branch

* fix multi_extension.out

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2021-05-10 20:16:46 +03:00
SaitTalhaNisanci 6b1904d37a
When moving a shard to a new node ensure there is enough space (#4929)
* When moving a shard to a new node ensure there is enough space

* Add WairForMiliseconds time utility

* Add more tests and increase readability

* Remove the retry loop and use a single udf for disk stats

* Address review

* address review

Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2021-05-06 17:28:02 +03:00
Jelte Fennema 2f29d4e53e Continue to remove shards after first failure in DropMarkedShards
The comment of DropMarkedShards described the behaviour that after a
failure we would continue trying to drop other shards. However the code
did not do this and would stop after the first failure. Instead of
simply fixing the comment I fixed the code, because the described
behaviour is more useful. Now a single shard that cannot be removed yet
does not block others from being removed.
2021-04-30 15:42:09 +03:00
Marco Slot 4b49cb112f Fix FROM ONLY queries on partitioned tables 2021-04-27 16:10:07 +02:00
Onder Kalaci 918838e488 Allow constant VALUES clauses in pushdown queries
As long as the VALUES clause contains constant values, we should not
recursively plan the queries/CTEs.

This is follow-up work to #1805. So, we can easily apply OUTER join
checks as if the VALUES clause were a reference table/immutable function.
2021-04-21 14:28:08 +02:00
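A sketch of a query that no longer needs recursive planning (table and column names hypothetical):

```sql
-- The constant VALUES clause is treated like a reference table /
-- immutable function, so the join can be pushed down.
SELECT d.user_id
FROM dist_table d
JOIN (VALUES (1), (2), (3)) v(id) ON d.user_id = v.id;
```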
SaitTalhaNisanci 93c2dcf3d2
Fix data-race with concurrent calls of DropMarkedShards (#4909)
* Fix problems with concurrent calls of DropMarkedShards

When trying to enable `citus.defer_drop_after_shard_move` by default it
turned out that DropMarkedShards was not safe to call concurrently.
This could especially cause big problems when also moving shards at the
same time. During tests it was possible to trigger a state where a shard
that was moved would not be available on any of the nodes anymore after
the move.

Currently DropMarkedShards is only called in production by the
maintenance daemon. Since this is only a single process, triggering such
a race is currently impossible in production settings. In future changes
we will want to call DropMarkedShards from other places too, though.

* Add some isolation tests

Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2021-04-21 10:59:48 +03:00
Ahmet Gedemenli 33c620f232
Optimize partitioned disk size calculation (#4905)
* Optimize partitioned disk size calculation

* Polish

* Fix test for citus_shard_cost_by_disk_size

Try optimizing if not CSTORE
2021-04-19 13:30:56 +03:00
Onder Kalaci 5482d5822f Keep more statistics about connection establishment times
When DEBUG4 is enabled, Citus now prints per-connection establishment
times.
2021-04-16 14:56:31 +02:00
Hanefi Onaldi 9919fbe3f8 Switch to sequential mode on long partition names
This commit adds support for long partition names for distributed tables:
- ALTER TABLE dist_table ATTACH PARTITION ..
- CREATE TABLE .. PARTITION OF dist_table ..

Note: create_distributed_table UDF does not support long table and
partition names, and is not covered in this commit
2021-04-14 15:27:50 +03:00
Ahmet Gedemenli d74d358a45
Refactor size queries with new enum SizeQueryType (#4898)
* Refactor size queries with new enum SizeQueryType

* Polish
2021-04-12 17:14:29 +03:00
SaitTalhaNisanci b453563e88
Warm up connections params hash (#4872)
ConnParams (AuthInfo and PoolInfo) gets a snapshot, which will block the
remote connections to localhost. And the release of the snapshot will be
blocked by those connections. This leads to a deadlock.

We warm up the conn params hash before starting a new transaction so
that the entries will already be there when we start a new transaction.
Hence GetConnParams will not get a snapshot.
2021-04-12 13:08:38 +03:00
Halil Ozan Akgul a5038046f9 Adds shard_count parameter to create_distributed_table 2021-03-29 16:22:49 +03:00
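Hedged usage sketch of the new parameter (table and column names hypothetical):

```sql
-- Overrides the citus.shard_count setting for this table only.
SELECT create_distributed_table('events', 'user_id', shard_count := 48);
```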
SaitTalhaNisanci 03832f353c Drop postgres 11 support 2021-03-25 09:20:28 +03:00
Marco Slot fbc2147e11 Replace MAX_PUT_COPY_DATA_BUFFER_SIZE by citus.remote_copy_flush_threshold GUC 2021-03-16 06:00:38 +01:00
Marco Slot 1646fca445 Add GUC to set maximum connection lifetime 2021-03-16 01:57:57 +01:00
Onder Kalaci e65e72130d Rename use -> shouldUse
Because setting the flag doesn't necessarily mean that we'll
use 2PC. If connections are read-only, we will not use 2PC.
In other words, we'll use 2PC only for connections that modified
any placements.
2021-03-12 08:29:43 +00:00
Onder Kalaci 6a7ed7b309 Do not trigger 2PC for reads on local execution
Before this commit, Citus used 2PC no matter what kind of
local query execution happened.

For example, if the coordinator has shards (and the workers as well),
even a simple SELECT query could start 2PC:
```SQL

WITH cte_1 AS (SELECT * FROM test LIMIT 10) SELECT count(*) FROM cte_1;
```

In this query, the local execution of the shards (and also intermediate
result reads) triggers the 2PC.

To prevent that, Citus now distinguishes local reads and local writes.
And, Citus switches to 2PC only if a modification happens. This may
still lead to unnecessary 2PCs when there is a local modification
and remote SELECTs only. Though, we handle that separately
via #4587.
2021-03-12 08:29:43 +00:00
Onder Kalaci d1cd198655 Prevent infinite recursion for queries that involve UNION ALL and JOIN
With this commit, we make sure to prevent infinite recursion for queries
in the format: [subquery with a UNION ALL] JOIN [table or subquery]

Also fixes a bug where we push down a UNION ALL below a JOIN even if the
UNION ALL is not safe to push down.
2021-03-03 12:27:26 +01:00
Naisila Puka 2f30614fe3
Reimplement citus_update_table_statistics to detect dist. deadlocks (#4752)
* Reimplement citus_update_table_statistics

* Update stats for the given table not colocation group

* Add tests for reimplemented citus_update_table_statistics

* Use coordinated transaction, merge with citus_shard_sizes functions

* Update the old master_update_table_statistics as well
2021-03-03 04:12:30 +03:00
SaitTalhaNisanci feee25dfbd
Use translated vars in postgres 13 as well (#4746)
* Use translated vars in postgres 13 as well

Postgres 13 removed translated vars, so we had special logic
for pg 13. However it had a bug, so now we copy the translated vars
before postgres deletes them. This also simplifies the logic.

* fix rtoffset with pg >= 13
2021-02-26 19:41:29 +03:00
Naisila Puka 5ebd4eac7f
Preserve colocation with procedures in alter_distributed_table (#4743) 2021-02-25 19:52:47 +03:00
Hanefi Onaldi 9a792ef841 Remove length limitations for table renames 2021-02-24 03:35:27 +03:00
SaitTalhaNisanci bcbd24f8de
Only consider pseudo constants for shortcuts (#4712)
It seems that we need to consider only pseudo constants while doing some
shortcuts in planning. For example, there could be a false clause that
still contributes to the result, in which case it will not be a pseudo
constant.
2021-02-15 18:39:37 +03:00
Onder Kalaci f297c96ec5 Add regression tests for COPY into colocated intermediate results
To add the tests without too much data, make the copy switchover
configurable.
2021-02-11 15:41:06 +01:00
Ahmet Gedemenli c8e83d1f26 Fix dropping fkey when distributing table 2021-02-11 15:48:35 +03:00
Hadi Moshayedi c3dcd6b9f8 Columnar: don't include stripe reservation locks in lock graph. 2021-02-10 10:20:20 -08:00
Onder Kalaci c804c9aa21 Allow local execution for intermediate results in COPY
When COPY is used for copying into co-located files, it was
not allowed to use local execution. The primary reason was
Citus treating co-located intermediate results as co-located
shards, and COPY into the distributed table was done via
"format result". And, local execution of such COPY commands
was not implemented.

With this change, we implement support for local execution with
"format result". To do that, we use the buffer for every file
on shardState->copyOutState, similar to how local copy on
shards is implemented. In fact, the logic is similar to
local copy on shards, but instead of writing to the shards,
Citus writes the results to a file.

The logic relies on LOCAL_COPY_FLUSH_THRESHOLD, and flushes
only when the size exceeds the threshold. But, unlike local
copy on shards, in this case we write the headers and footers
just once.
2021-02-09 15:00:06 +01:00
Ahmet Gedemenli 5dd2a3da03 Convert RelabelTypes into CollateExprs in get_rule_expr function 2021-02-05 12:06:46 +03:00
Onder Kalaci fc9a23792c COPY uses adaptive connection management on local node
With #4338, the executor is smart enough to fail over to
the local node if there is not enough space in max_connections
for remote connections.

For COPY, the logic is different. With #4034, we made COPY
work with the adaptive connection management slightly
differently. The cause of the difference is that COPY doesn't
know which placements are going to be accessed, hence it needs
to get connections up-front.

Similarly, COPY decides to use local execution up-front.

With this commit, we change the logic for COPY on local nodes:

Try to reserve a connection to the local host. This follows the same
logic (e.g., citus.local_shared_pool_size) as the executor, because COPY
also relies on TryToIncrementSharedConnectionCounter(). If reservation
to the local node fails, switch to local execution. Apart from this, if
local execution is disabled, we follow the exact same logic as
multi-node Citus: if we are out of connections, we give an error.
2021-02-04 09:45:07 +01:00
Sait Talha Nisanci 9ba3f70420 Remove unused method 2021-02-03 20:02:03 +03:00
Onur Tirtir 3a403090fd
Disallow adding local table with identity column to metadata (#4633)
pg_get_tableschemadef_string doesn't know how to deparse identity
columns, so we cannot reflect those columns when creating the shell
relation.
For this reason, we don't allow adding local tables that have identity
columns to metadata.
2021-02-03 19:05:17 +03:00
Onur Tirtir 93c3f30024 Rename ExtractColumnsOwningSequences 2021-02-02 18:17:42 +03:00
Hanefi Önaldı cab17afce9 Introduce UDFs for fixing partitioned table constraint names 2021-01-29 17:32:20 +03:00
SaitTalhaNisanci 738825cc38
Fix partition column index issue (#4591)
* Fix partition column index issue

We send column names to worker_hash/range_partition_table methods, and
in these methods we check the column name index from the tuple descriptor.
Then this index is used to decide the bucket that the current row will
be sent to for the repartition.

This becomes a problem when there are duplicate column names in the
tupleDescriptor. Then we can choose the wrong index. Hence the
partitioned data will be put on the wrong workers. Then the result could
miss some data because workers might contain different ranges of data.

An example:
TupleDescriptor contains "trip_id", "car_id", "car_id" for one table.
It contains only "car_id" for the other table. And assuming that the
tables will be partitioned by car_id, it is not certain what should be
used for deciding the bucket number for the first table. Assuming value
2 goes to bucket 2 and value 3 goes to bucket 3, it is not certain which
bucket "1 2 3" (trip_id, car_id, car_id)  row will go to.

As a solution we send the index of the partition column in the targetList
instead of the column name.

The old API is kept so that it still works while workers are being upgraded
(though it will have the same bug)

* Use the same method so that backporting is easier
2021-01-29 14:40:40 +03:00
Onur Tirtir 2f30be823e Rename create_citus_local_table to citus_add_local_table_to_metadata
For simplicity in the downgrade test in multi_extension, we didn't
actually remove the create_citus_local_table UDF.
2021-01-27 15:52:36 +03:00
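Post-rename usage, as a sketch (table name hypothetical):

```sql
-- New name; the old create_citus_local_table UDF is kept for the
-- downgrade test, but citus_add_local_table_to_metadata is preferred.
SELECT citus_add_local_table_to_metadata('local_events');
```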
Onur Tirtir 458a81f93d Add suppressNoticeMessages to TableConversionState 2021-01-27 12:53:58 +03:00
Naisila Puka 94bc2703bc
Make undistribute_table() and citus_create_local_table() work with columnar (#4563)
* Make undistribute_table() and citus_create_local_table() work with columnar

* Rename and use LocallyExecuteUtilityTask for UDF check

* Remove 'local' references in ExecuteUtilityCommand
2021-01-27 01:17:20 +03:00
Halil Ozan Akgul bafa692fc1 Adds error messages with names of indexes that will be dropped 2021-01-26 18:18:26 +03:00
Onur Tirtir b5ea033a0b Convert postgres tables to citus local when creating reference table having fkeys 2021-01-25 11:02:50 +03:00
Onur Tirtir 253c19062a
Rename IsCitusInitiatedBackend to IsCitusInitiatedRemoteBackend (#4562) 2021-01-23 01:07:43 +03:00
Onur Tirtir 941c8fbf32
Automatically undistribute citus local tables when no more fkeys with reference tables (#4538) 2021-01-22 18:15:41 +03:00
Ahmet Gedemenli 2fa060a32d Fix bug creating citus local table with stats 2021-01-20 17:17:13 +03:00
Onder Kalaci 8df58926c5 Rename CitusProcessUtility -> ProcessUtilityForNode 2021-01-20 15:54:00 +03:00
Hadi Moshayedi bc01c795a2 Reland #4419 2021-01-19 07:48:47 -08:00
Halil Ozan Akgul 27c2bd1599 Moves creation of ALTER INDEX STATISTICS commands next to index commands 2021-01-18 16:55:53 +03:00
Onur Tirtir f1ecbc3a53
Fix segfault when adding/dropping fkey from ref to citus local via remote exec (#4528) 2021-01-17 20:43:33 +03:00
Onder Kalaci c35e22d75d Skip validation for foreign key creation commands
For certain purposes, we drop and recreate the foreign
keys. As we acquire exclusive locks on the tables in between
drop and re-create, we can safely skip the validation phase of
the foreign keys. The reason is purely performance, as
foreign key validation can take a long time.
2021-01-15 18:04:52 +03:00
Onder Kalaci ae0b92233d Rename function 2021-01-15 18:04:52 +03:00
Onder Kalaci 30d0a65f40 Adds citus.enable_local_reference_table_foreign_keys
When enabled, any foreign keys between local tables and reference
tables are supported by converting the local table to a citus local
table.

When the coordinator is not in the metadata, the logic is disabled
as foreign keys are not allowed in this configuration.
2021-01-15 18:04:52 +03:00
Halil Ozan Akgul 9407965817 Moves struct to the header 2021-01-15 11:50:11 +03:00
Onur Tirtir 36b418982f Add support for ALTER TABLE commands defining foreign keys 2021-01-14 17:12:00 +03:00
Onur Tirtir 05931b8fe2 Pass ProcessUtilityContext to .preprocess 2021-01-14 17:12:00 +03:00
Onur Tirtir 00da1eed20 Some refactor as a preparation 2021-01-13 16:50:09 +03:00
Halil Ozan Akgul 2be14cce2e Adds alter_distributed_table and alter_table_set_access_method UDFs 2021-01-13 16:02:39 +03:00
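A sketch of the two new UDFs (argument names per current Citus releases; table names hypothetical):

```sql
-- Change shard count and colocation of an existing distributed table.
SELECT alter_distributed_table('events', shard_count := 32, colocate_with := 'users');
-- Switch a table's access method, e.g. from heap to columnar.
SELECT alter_table_set_access_method('events_2020', 'columnar');
```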
SaitTalhaNisanci 724d56f949
Add citus shard helper view (#4361)
With citus shard helper view, we can easily see:
- where each shard is, which node, which port
- what kind of table it belongs to
- its size

With such a view, we can see shards that have a size bigger than some
value, which could be useful. Also, debugging can be easier in production
with this view.

Fetch shards in one go per node

The previous implementation was slow because it would do a lot of round
trips, one per shard to be exact. Hence it is improved so that we fetch
all the shard name/shard size pairs per node in one go.

Construct shards_names, sizes query on coordinator
2021-01-13 13:58:47 +03:00
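Assuming the helper view is exposed as `citus_shards` with a `shard_size` column, the size-based query mentioned above might be:

```sql
-- Shards larger than 1 GB, with the node and port hosting them.
SELECT shard_name, nodename, nodeport, shard_size
FROM citus_shards
WHERE shard_size > 1024 * 1024 * 1024
ORDER BY shard_size DESC;
```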
Ahmet Gedemenli 436c9d9d79
Remove the word 'master' from Citus UDFs (#4472)
* Replace master_add_node with citus_add_node

* Replace master_activate_node with citus_activate_node

* Replace master_add_inactive_node with citus_add_inactive_node

* Use master udfs in old scripts

* Replace master_add_secondary_node with citus_add_secondary_node

* Replace master_disable_node with citus_disable_node

* Replace master_drain_node with citus_drain_node

* Replace master_remove_node with citus_remove_node

* Replace master_set_node_property with citus_set_node_property

* Replace master_unmark_object_distributed with citus_unmark_object_distributed

* Replace master_update_node with citus_update_node

* Replace master_update_shard_statistics with citus_update_shard_statistics

* Replace master_update_table_statistics with citus_update_table_statistics

* Rename master_conninfo_cache_invalidate to citus_conninfo_cache_invalidate

Rename master_dist_local_group_cache_invalidate to citus_dist_local_group_cache_invalidate

* Replace master_copy_shard_placement with citus_copy_shard_placement

* Replace master_move_shard_placement with citus_move_shard_placement

* Rename master_dist_node_cache_invalidate to citus_dist_node_cache_invalidate

* Rename master_dist_object_cache_invalidate to citus_dist_object_cache_invalidate

* Rename master_dist_partition_cache_invalidate to citus_dist_partition_cache_invalidate

* Rename master_dist_placement_cache_invalidate to citus_dist_placement_cache_invalidate

* Rename master_dist_shard_cache_invalidate to citus_dist_shard_cache_invalidate

* Drop master_modify_multiple_shards

* Rename master_drop_all_shards to citus_drop_all_shards

* Drop master_create_distributed_table

* Drop master_create_worker_shards

* Revert old function definitions

* Add missing revoke statement for citus_disable_node
2021-01-13 12:10:43 +03:00
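A before/after sketch for one of the renamed pairs (hostnames hypothetical; the old names remain available per "Revert old function definitions"):

```sql
-- Old spelling: SELECT master_add_node('worker-1', 5432);
-- New spelling:
SELECT citus_add_node('worker-1', 5432);
SELECT citus_activate_node('worker-1', 5432);
```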
Onur Tirtir dd55ab394e
Disallow cascade_via_foreign_keys if any partition rel has non-inherited fkeys (#4487) 2021-01-11 21:50:09 +03:00
Marco Slot d900a7336e Automatically add placeholder record for coordinator 2021-01-08 15:09:53 +01:00
Marco Slot 597533b1ff Add citus_set_coordinator_host 2021-01-08 13:36:26 +01:00
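A usage sketch for the new UDF (hostname hypothetical):

```sql
-- Registers the coordinator itself in the metadata so that workers
-- and tools can reach it by this host and port.
SELECT citus_set_coordinator_host('coordinator.internal', 5432);
```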
Onur Tirtir 5289785da4
Add cascade_via_foreign_keys option to create_citus_local_table (#4462) 2021-01-08 15:13:26 +03:00
Marco Slot 011283122b Add the shard rebalancer implementation 2021-01-07 16:51:55 +01:00
Onur Tirtir f3801143fb Add cascade option to undistribute_table 2021-01-07 15:41:49 +03:00
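Assuming the option is exposed as `cascade_via_foreign_keys`, matching the neighboring commits, usage might be:

```sql
-- Undistributes the table and every table connected to it through
-- foreign keys, instead of erroring out on the constraints.
SELECT undistribute_table('orders', cascade_via_foreign_keys := true);
```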
Onur Tirtir 2e3e680ba9 Add infra to cascade citus table functions 2021-01-07 15:41:48 +03:00
Marco Slot 47c1b19174 Revert "Do metadata sync in a separate background worker."
This reverts commit 4df723cf9b.
2021-01-07 10:30:04 +01:00
Marco Slot d9f175532b Revert "Trigger metadata sync at transaction commit"
This reverts commit a2c73bef27.
2021-01-07 10:30:00 +01:00
Marco Slot 5de3337b2f Support local execution for INSERT..SELECT with re-partitioning 2021-01-06 16:15:53 +01:00
Onur Tirtir e91e745dbc
Implement ConstraintWithNameIsOfType (#4451) 2020-12-29 11:53:06 +03:00
Onur Tirtir 04a4167a8a Implement GetPgDependTuplesForDependingObjects 2020-12-25 18:03:28 +03:00
Hadi Moshayedi a2c73bef27 Trigger metadata sync at transaction commit 2020-12-24 08:28:38 -08:00
Hadi Moshayedi 4df723cf9b Do metadata sync in a separate background worker. 2020-12-24 08:25:55 -08:00
Ahmet Gedemenli 48ca1637a4 Propagate alter stats owner 2020-12-24 17:10:12 +03:00
Ahmet Gedemenli f7c70f9a63 Propagate alter stats target 2020-12-24 17:10:12 +03:00
Ahmet Gedemenli 5a1607b6c0 Propagate alter stats schema 2020-12-24 17:10:12 +03:00
Ahmet Gedemenli bdce4a7e67 Propagate rename statistics 2020-12-24 17:10:12 +03:00
Onur Tirtir 5ed9197041
Implement infra to get foreign key connected relations (#4439)
On top of our foreign key graph, implement the infrastructure to get
list of relations that are connected to input relation via a foreign key
graph.
We need this to support cascading create_citus_local_table &
undistribute_table operations.

Also add regression tests to see what our foreign key graph is able to
capture currently.
2020-12-24 16:42:40 +03:00
Halil Ozan Akgül 9fd3f62cb6
Refactor foreign key functions to use table types (#4424)
* Reuses extractReferencing/Referenced variables

* Refactors GetForeignKeyOids function to check table types

* Converts flags to inclusive
2020-12-23 17:05:09 +03:00
Onur Tirtir d1b3eaf767
Refactor ColumnAppearsInForeignKeyToReferenceTable (#4441) 2020-12-23 11:44:02 +03:00
Ahmet Gedemenli 874fa1fc09 Propagate Drop Statistics 2020-12-22 18:34:46 +03:00
Marco Slot f2056e553f
Expose partition column of subqueries in optimizer (#4355)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2020-12-18 20:32:52 +01:00
Ahmet Gedemenli 770d3da1ca Add dependencies for stat schemas 2020-12-18 17:04:13 +03:00
Ahmet Gedemenli 6c0465566a Propagate create statistics 2020-12-17 20:38:36 +03:00
Marco Slot 100e5d3196 Address review feedback 2020-12-15 15:23:38 +01:00
Sait Talha Nisanci 7951273f74 Refactor WrapRteRelationIntoSubquery 2020-12-15 18:18:36 +03:00
Sait Talha Nisanci 0e53aa5d3b Add more tests 2020-12-15 18:18:36 +03:00
Sait Talha Nisanci f7c1509fed Not check if the query is routable for converting
It seems that there are only very few cases where that is useful, and
for now we prefer not having that check. This means that we might
perform some unnecessary checks, but that should be rare and not
performance critical.
2020-12-15 18:18:36 +03:00
Sait Talha Nisanci 1d82972ff4 Increase the performance with a trick
Instead of sending NULLs over the network, we now convert the subqueries
into the form:

SELECT t.a, NULL, NULL FROM (SELECT a FROM table) t;

And we recursively plan the inner part so that we don't send the NULLs
over the network. We still need the NULLs in the outer subquery because we
currently don't have an easy way of updating all the necessary places in
the query.

Add some documentation for how the conversion is done
2020-12-15 18:18:36 +03:00
Sait Talha Nisanci 3aed6c3ad0 Rename containsOnlyLocalTable as isLocalTableModification
Update error message in Modify View
2020-12-15 18:18:36 +03:00
Sait Talha Nisanci 5618f3a3fc Use BaseRestrictInfo for finding equality columns
Baseinfo also has pushed down filters etc, so it makes more sense to use
BaseRestrictInfo to determine what columns have constant equality
filters.

Also RteIdentity is used for removing conversion candidates instead of
rteIndex.
2020-12-15 18:18:36 +03:00
Sait Talha Nisanci 28c5b6a425 Convert some hard coded errors to deferred errors in router planner 2020-12-15 18:18:36 +03:00
Sait Talha Nisanci 69992d58f9 Add broken local-dist table modifications tests
It seems that most of the updates were broken; we weren't aware of it
because there wasn't any data in the tables. They are broken mostly
because local tables do not have a shard id and some code paths should
be updated with that information: currently, when there is an invalid
shard id, it is assumed to be pruned.

Consider local tables in router planner

In case there is a local table, the shard id will not be valid and there
are some checks that rely on the shard id; we should skip these in the
case of local tables, which is handled with a dummy placement.

Add citus local table dist table join tests

add local-dist table mixed joins tests
2020-12-15 18:18:36 +03:00
Sait Talha Nisanci a34504d7bf Move recursive planning related function to recursive_planning 2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 2a44029aaf Simplify ContainsTableToBeConvertedToSubquery
AllDataLocallyAccessible and ContainsLocalTableSubqueryJoin are removed.
We can possibly remove ModifiesLocalTableWithRemoteCitusLocalTable as
well. Though this removal has a side effect: now, when all the data
is locally available, we could still wrap a relation into a subquery; I
guess that should be resolved in the router planner itself.

Add more tests
2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 26d9f0b457 Use auto mode in tests and fix debug message 2020-12-15 18:17:10 +03:00
Sait Talha Nisanci eebcd995b3 Add some more tests 2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 5693cabc41 Not convert an already router-plannable query
We should not recursively plan an already router-plannable query. An
example of this is (SELECT * FROM local JOIN (SELECT * FROM dist) d1
USING(a));

So we let the recursive planner do all of its work and at the end we
convert the final query to handle unsupported joins. While doing each
conversion, we check if it is router plannable; if so, we stop.

Only consider range table entries that are in jointree

If a range table is not in the jointree then there is no point in
considering it, because we are trying to convert range table entries to
subqueries for the join use case.
2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 2ff65f3630 Enable partitioned distributed tables in local-dist table joins 2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 44953579cf Enable citus-local distributed table joins
Check equality in quals

We want to recursively plan distributed tables only if they have an
equality filter on a unique column. So '>' and '<' operators will not
trigger recursive planning of distributed tables in local-distributed
table joins.

Recursively plan distributed table only if the filter is constant

If the filter is not a constant, then the join might return multiple rows
and there is a chance that the distributed table will return a huge amount
of data. Hence, if the filter is not constant, we choose to recursively
plan the local table.
2020-12-15 18:17:10 +03:00
Sait Talha Nisanci f3d55448b3 Choose distributed table if it has a unique index in filter
When doing local-distributed table joins we convert one of them to a
subquery. The current policy is that we convert the distributed table to
a subquery if it has a unique index on the filter column (a primary key
also has a unique index).
2020-12-15 18:17:10 +03:00
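A sketch of a join that falls under this policy (table and column names hypothetical):

```sql
-- dist_table.id has a unique index (e.g. a primary key) and a constant
-- equality filter, so the distributed side is the one converted to a
-- subquery rather than the local table.
SELECT *
FROM local_table l
JOIN dist_table d ON l.id = d.id
WHERE d.id = 42;
```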
Onder Kalaci 3f4952cc2b Pushdown projections when relations are recursively planned
This is important to limit the data transfer size.
2020-12-15 18:17:10 +03:00
Onder Kalaci 594e001f3b Add filter pushdown regression tests
Also handle WHERE false
2020-12-15 18:17:10 +03:00
Onder Kalaci 7a4d6b2984 Handle modifications as well 2020-12-15 18:17:10 +03:00
Onder Kalaci 8f8390ed6e Recursively plan local table joins
The logical planner cannot handle joins between a local and a distributed
table. Instead, we can recursively plan one side of the join and let the
logical planner handle the rest.

Our algorithm is a little smart, trying not to recursively plan distributed
tables, favoring local tables instead.
2020-12-15 18:17:10 +03:00
Onder Kalaci 7cc25c9125 Add ability to fetch the restrictions per relation
With this commit, we add the ability to fetch restrictions
per relation. We simply rely on the restrictions that Postgres
keeps per relation.
2020-12-15 18:17:10 +03:00
Marco Slot f2538a456f Support co-located/recurring sublinks in the target list 2020-12-13 15:45:24 +01:00
Ahmet Gedemenli 936775e8e3 Delete transactions when removing node
With this commit, we delete entries in pg_dist_transaction
for the primary nodes that are removed by `master_remove_node`.
2020-12-07 11:35:20 +03:00
SaitTalhaNisanci f164575524
Add a utility to process each table index (#4382)
A utility function is added so that each caller can implement a handler
for each index on a given table. This means that the caller doesn't need
to worry about how to access each index; the only thing it needs to do is
implement a function to which each index on the table is passed
iteratively.
2020-12-03 16:33:13 +03:00
Onder Kalaci c546ec5e78 Local node connection management
When Citus needs to parallelize queries on the local node (e.g., the node
executing the distributed query and the shards are the same), we need to
be mindful about the connection management. The reason is that the client
backends that are running distributed queries are competing with the client
backends that Citus initiates to parallelize the queries in order to get
a slot on the max_connections.

In that regard, we implemented a "failover" mechanism where, if the
distributed queries cannot get a connection, the execution fails the tasks
over to local execution.

The failover logic is as follows:

- Ask the connection manager if it is OK to get a connection
	- If yes, we are good.
	- If no, we fail the workerPool and the failure triggers
	  the failover of the tasks to the local execution queue

The decision of getting a connection is as follows:

/*
 * For local nodes, solely relying on citus.max_shared_pool_size or
 * max_connections might not be sufficient. The former gives us
 * a preview of the future (e.g., we let the new connections establish,
 * but they are not established yet). The latter gives us the close to
 * precise view of the past (e.g., the active number of client backends).
 *
 * Overall, we want to limit both of the metrics. The former limit typically
 * kicks in under regular loads, where the load of the database increases in
 * a reasonable pace. The latter limit typically kicks in when the database
 * is issued lots of concurrent sessions at the same time, such as benchmarks.
 */
2020-12-03 14:16:13 +03:00
Ahmet Gedemenli 514c6a76ac Propagate alter schema rename 2020-12-02 15:18:26 +03:00
Nils Dijk 6f9c040f76
DESCRIPTION: Propagate columnar table settings for distributed tables
When distributing a columnar table, as well as changing options on a distributed columnar table, this patch will forward the settings from the coordinator to the workers.

For propagating options changes on an already distributed table this change is pretty straightforward. Before applying the change in options locally we will create a `DDLJob` that contains a call to `alter_columnar_table_set(...)` for every shard placement with all settings of the current table. This goes both for setting an option as well as resetting one, which will reset the values to the defaults configured on the coordinator. This has the effect that the coordinator is authoritative on the settings and makes sure the shards have the same settings set as the table on the coordinator.

When a columnar table is distributed it is using the `TableDDLCommand` infrastructure to create a new kind of `TableDDLCommand`. This new type, called a `TableDDLCommandFunction`, contains a context and 2 function pointers to execute. One function returns the command as applied on the table; the second function will return the sql command to apply to a shard with a given shard id. The schema name is ignored as it will use the fully qualified name of the shard in the same schema as the base table.
2020-12-02 13:02:42 +01:00
Onder Kalaci f7e1aa3f22 Multi-row INSERTs use local execution when placements are local
Multi-row execution already uses sequential execution. When shards
are local, using local execution is profitable as it avoids
an extra connection establishment to the local node.
2020-12-01 21:37:59 +03:00
Onur Tirtir 7f3d1182ed
Handle invalid connection hash entries (#4362)
If MemoryContextAlloc errors out, e.g. during an OOM, ConnectionHashEntry->connections
stays NULL.

With this commit, we add an isValid flag to ConnectionHashEntry that should be set to true
right after we allocate & initialize the ConnectionHashEntry->connections list properly, and we
check it before accessing ConnectionHashEntry->connections.
2020-11-30 19:44:03 +03:00
Nils Dijk 326e6afa53
refactor table ddl events scoped for shards (#4342)
Refactor internals on how Citus creates the SQL commands it sends to recreate shards.

Before, Citus collected solely ddl commands as `char *`'s to recreate a table. If they were used to create a shard they were wrapped with `worker_apply_shard_ddl_command` and sent to the workers. On the workers the UDF wrapping the ddl command would rewrite the parsetree to replace table names with their shard name equivalent.

This worked well, but poses an issue when adding columnar. Due to limitations in Postgres on creating custom options on table access methods we need to fall back on a UDF to set columnar specific options. Now, to recreate the table, we can no longer rely on having solely DDL statements to recreate a table.

A prototype was made to run this UDF wrapped in `worker_apply_shard_ddl_command`. This became pretty messy, hard to understand and subsequently hard to maintain.

This PR proposes a refactor of the internal representation of table ddl commands into a `TableDDLCommand` structure. The current implementation only supports a `char *` as its contents. Based on the use of the DDL statement (eg. creating the table -mx- or creating a shard) one of two different functions can be called to get the statement to send to the worker:
 - `GetTableDDLCommand(TableDDLCommand *command)`: This function returns that ddl command to create the table. In this implementation it will just return the `char *`. This has the same functionality as getting the old list and not wrapping it.
 - `GetShardedTableDDLCommand(TableDDLCommand *command, uint64 shardId, char *schemaName)`: This function returns the ddl command wrapped in `worker_apply_shard_ddl_command` with the `shardId` as an argument. Due to backwards compatibility it also accepts a `schemaName`. The exact purpose is not directly clear. Ideally new implementations would work with fully qualified statements and ignore the `schemaName`.

A future implementation could accept two function pointers and a `void *` context for them to work on. This gives greater flexibility in controlling what commands get sent in which situations. Also, in the future, we could implement the intermediate step of creating the `parsetree` datastructure of statements based on the contents in the catalog with a corresponding deparser. For sharded queries a mutator could be run over the parsetree to rewrite the table names to the names with the shard identifier. This would completely omit the requirement for `worker_apply_shard_ddl_command`.
2020-11-26 13:31:59 +01:00
Onder Kalaci 629ecc3dee Add the infrastructure to count the number of client backends
Considering the adaptive connection management
improvements that we plan to roll soon, it makes it
very helpful to know the number of active client
backends.

We are doing this addition to simplify the adaptive connection
management for single node Citus. In single node Citus, both the
client backends and Citus parallel queries would compete to get
slots on Postgres' `max_connections` on the same Citus database.

With adaptive connection management, we have the counters for
Citus parallel queries. That helps us to adaptively decide
on the remote executions pool size (e.g., throttle connections
if necessary).

However, we do not have any counters for the total number of
client backends on the database. For single node Citus, we
should consider all the client backends, not only the remote
connections that Citus does.

Of course Postgres internally knows how many client
backends are active. However, to get that number Postgres
iterates over all the backends. For examaple, see [pg_stat_get_db_numbackends](8e90ec5580/src/backend/utils/adt/pgstatfuncs.c (L1240))
where Postgres iterates over all the backends.

For our purposes, we need this information on every connection
establishment. That's why we cannot afford to do this kind of
iteration.
2020-11-25 19:19:24 +01:00
Onur Tirtir 46be63d76b
Refactor PreprocessIndexStmt (#4272) 2020-11-25 12:19:37 +03:00
Onder Kalaci c433c66f2b Do not execute subplans multiple times with cursors
Before this commit, we let AdaptiveExecutorPreExecutorRun()
be effective multiple times on every FETCH on cursors.
That does not affect the correctness of the query results,
but adds significant overhead.
2020-11-20 10:43:56 +01:00
Önder Kalacı 0c0fc69f2a
Remove unused field (#4275) 2020-11-17 11:41:57 +01:00
Hanefi Onaldi d3019f1b6d
Introduce foreach_ptr_modify macro (#4303)
If one wishes to iterate through a List and insert list elements in
PG13, it is not safe to use for_each_ptr, as the List representation
in PostgreSQL is no longer a linked list but an array, and it is possible
that the whole array is repalloc'ed if there is not sufficient space
available.

See postgres commit 1cff1b95ab6ddae32faa3efe0d95a820dbfdc164 for more
information
2020-11-09 12:03:59 +03:00
Onder Kalaci e0d2ac7620 Do not rely on set_rel_pathlist_hook for finding local relations
When a relation is used on an OUTER JOIN with FALSE filters,
set_rel_pathlist_hook may not be called for the table.

There might be other cases as well, so do not rely on the hook
for classification of the tables.
2020-11-06 11:14:30 +01:00
Halil Ozan Akgul 77b3be8b6d Reduce RelOptInfos to their only used field, relids, to be able to copy them 2020-10-22 13:42:28 +03:00
Onder Kalaci 5c4c9304ba Remove RemoveDuplicateJoinRestrictions() function
RemoveDuplicateJoinRestrictions() function was introduced with the aim of decreasing the overall planning times by eliminating the duplicate JOIN restriction entries (#1989). However, it turns out that the function itself is so CPU intensive, with a very high algorithmic complexity, that it hurts a lot more than it helps. The function is a clear example of premature optimization.

The table below shows the difference clearly:

"distributed query planning
 time master"	RemoveDuplicateJoinRestrictions() execution time on master	"Remove the function RemoveDuplicateJoinRestrictions()
this PR"
5 table INNER JOIN	9 msec	2msec	7 msec
10 table INNER JOIN	227 msec	194 msec	29  msec
20 table INNER JOIN	1 sec 235 msec	1  sec 139  msec	90 msecs
50 table INNER JOIN	24 seconds	21 seconds	1.5 seconds
100 table INNER JOIN	2 minutes 16 secods	1 minute 53 seconds	23 seconds
250 table INNER JOIN	Bottleneck on JoinClauseList	18 minutes 52 seconds	Bottleneck on JoinClauseList

5 table INNER JOIN in subquery	9 msec	0 msec	6 msec
10 table INNER JOIN subquery	33 msec	10 msec	32 msec
20 table INNER JOIN subquery	132 msec	67 msec	123 msec
50 table INNER JOIN subquery	1.2  seconds	900 msec	500 msec
100 table INNER JOIN subquery	6 seconds	5  seconds	2 seconds
250 table INNER JOIN subquery	54 seconds	37 seconds	20  seconds

5 table LEFT JOIN	5 msec	0 msec	5 msec
10 table LEFT JOIN	11 msec	0 msec	13 msec
20 table LEFT JOIN	26 msec	2 msec	30 msec
50 table LEFT JOIN	150 msec	15 msec	193 msec
100 table LEFT JOIN	757 msec	71 msec	722 msec
250 table LEFT JOIN	8 seconds	600 msec	8 seconds

5 JOINs among 2 table JOINs 	37 msec	11 msec	25 msec
10 JOINs among 2 table JOINs 	536 msec	306 msec	352 msec
20 JOINs among 2 table JOINs 	794 msec	181 msec	640 msec
50 JOINs among 2 table JOINs 	25 seconds	2 seconds	22 seconds
100 JOINs among 2 table JOINs 	Bottleneck on JoinClauseList	9 seconds	Bottleneck on JoinClauseList
150 JOINs among 2 table JOINs 	Bottleneck on JoinClauseList	46 seconds	Bottleneck on JoinClauseList

On top of the performance penalty, the function had a critical bug (#4255), and with #4254 we hit one more important bug. That one could be fixed by adding the following check to ContextCoversJoinRestriction():
```
static bool
JoinRelIdsSame(JoinRestriction *leftRestriction, JoinRestriction *rightRestriction)
{
	Relids leftInnerRelIds = leftRestriction->innerrel->relids;
	Relids rightInnerRelIds = rightRestriction->innerrel->relids;
	if (!bms_equal(leftInnerRelIds, rightInnerRelIds))
	{
		return false;
	}

	Relids leftOuterRelIds = leftRestriction->outerrel->relids;
	Relids rightOuterRelIds = rightRestriction->outerrel->relids;
	if (!bms_equal(leftOuterRelIds, rightOuterRelIds))
	{
		return false;
	}

	return true;
}
```

However, adding this eliminates all the benefits that RemoveDuplicateJoinRestrictions() brings.

I've used the commands here to generate the JOINs mentioned in the PR: https://gist.github.com/onderkalaci/fe8654f9df5916c7af4c7c5eb892561e#file-gistfile1-txt

Inner and outer JOINs behave roughly the same; to simplify the table, only INNER JOINs were added.
2020-10-21 10:29:39 +02:00
SaitTalhaNisanci 0f209377c4
Fix incorrect join related fields (#4242)
* Fix incorrect join related fields

Ruleutils expects the original index of join columns, hence we
should consider the dropped columns while setting the fields in
SetJoinRelatedFieldsCompat.

* add some more tests for joins

* Move tests to join.sql and create a utility function
2020-10-19 18:28:39 +03:00
Onur Tirtir c49077d594
Disallow outer joins `ON TRUE` with ref & dist tables when ref table is outer relation (#4255)
Disallow `ON TRUE` outer joins with reference & distributed tables
when reference table is outer relation by fixing the logic bug made
when calling `LeftListIsSubset` function.

Also, be more defensive when removing duplicate join restrictions
when join clause is empty for non-inner joins as they might still
contain useful information for non-inner joins.
2020-10-19 16:58:11 +03:00
Onur Tirtir f80f4839ad Remove unused functions that cppcheck found 2020-10-19 13:50:52 +03:00
Onder Kalaci bbedfca761 Improve the relation restriction counters
It seems like Postgres could call set_rel_pathlist() for
the same relation multiple times. This breaks the logic
where we assume relationCount equals the number of
entries in relationRestrictionList.

In summary, relationRestrictionList may contain duplicate
entries.
2020-10-19 08:51:16 +02:00
Nils Dijk caabbf4b84 Table access method support for distributed tables 2020-10-16 12:02:25 -07:00
Onur Tirtir 7cb07c70fa
Move hasSemiJoin to JoinRestrictionContext (#4256) 2020-10-16 18:37:39 +03:00
Onur Tirtir de6f2d3f42
Refactor JoinRestrictionListExistsInContext to improve readability (#4249) 2020-10-16 12:24:56 +03:00
Onder Kalaci fe3caf3bc8 Local execution considers intermediate result size limit
With this commit, we make sure that local execution accounts for the
intermediate result size just as distributed execution does. Plus,
it enforces the citus.max_intermediate_result_size value, as sketched below.
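
A simplified sketch of the check this implies (assuming MaxIntermediateResult is the int GUC backing citus.max_intermediate_result_size, in kB, with -1 meaning unlimited):

```
static void
ErrorIfIntermediateResultTooLarge(uint64 bytesWritten)
{
	if (MaxIntermediateResult >= 0 &&
		bytesWritten > (uint64) MaxIntermediateResult * 1024)
	{
		ereport(ERROR, (errmsg("the intermediate result size exceeds "
							   "citus.max_intermediate_result_size (%d kB)",
							   MaxIntermediateResult)));
	}
}
```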
2020-10-15 17:18:55 +02:00
Marco Slot 31858c8a29 Check table existence in EnsureRelationKindSupported 2020-10-15 17:05:06 +02:00
Sait Talha Nisanci ecde6c6eef Introduce GetCurrentLocalExecutionStatus wrapper
We should not access CurrentLocalExecutionStatus directly, because that
would mean that we could also set it directly, which we shouldn't:
we have checks to see if the new state is possible, and otherwise we
error.
2020-10-15 15:38:19 +03:00
Halil Ozan Akgul e2736c25bd Adds support for WITH TIES option 2020-10-12 19:34:18 +03:00
Marco Slot 73fc054c27 Rename DDL command functions 2020-10-06 11:30:56 +02:00
Marco Slot dbc348b7e0 Create sequence dependency during metadata syncing 2020-10-06 10:57:39 +02:00
Ahmet Gedemenli 81db4dca5c Degrade gracefully when no background workers available 2020-10-05 16:55:00 +03:00
Hanefi Önaldı 6d8e83d24f
Replace worker_hash calls with partkey IS NOT NULL filters 2020-10-02 18:16:24 +03:00
Önder Kalacı df5aa0f0cc
Switch to sequential execution if the index name is long (#4209)
Citus has the logic to truncate long shard names to prevent
various issues, including self-deadlocks. However, for partitioned
tables, when an index is created on the parent table, the index names
on the partitions are auto-generated by Postgres. We use the same
Postgres function to generate the index names on the shards of the
partitions. If the length exceeds the limit, we switch to sequential
execution mode.
2020-10-02 13:39:34 +03:00
Onder Kalaci 56ca256374 Forcefully terminate connections after citus.node_connection_timeout
After the connection timeout, we fail the session/pool. However, the
underlying connection can still be trying to connect. That is dangerous
because the new placement executions are already in place. The
executor cannot handle the situation where multiple EXECUTION_ORDER_ANY
task executions succeed.

Adding a regression test doesn't seem easily doable. To reproduce the issue:
- Add 2 worker nodes
- Create a reference table
- Set citus.node_connection_timeout to 1ms (requires code change)
- Continuously execute `SELECT count(*) FROM ref_table`
- Sometime later, you hit an out-of-bounds array access in
  `ScheduleNextPlacementExecution()`, hence crashing.
- The reason is that sometimes the first connection is
  successfully established while the executor is already
  trying to execute the query on the second node.
2020-09-30 18:24:24 +02:00
Marco Slot b905c8043d Fix create index concurrently crash with local execution 2020-09-25 11:49:09 +02:00
Ahmet Gedemenli abfb79bda6 Sort explain analyze output by task time
Add sort method parameter for regression tests

Fix check-style

Change sorting method parameters to enum

Polish

Add task fields to OutTask

Add test into multi_explain

Fix isolation test
2020-09-24 11:38:40 +03:00
SaitTalhaNisanci e7cd1ed0ee
Do not take ShareUpdateExclusiveLock on pg_dist_transaction (#4184)
* Do not take ShareUpdateExclusiveLock on pg_dist_transaction

We were taking ShareUpdateExclusiveLock on pg_dist_transaction during
recovery to prevent multiple recoveries happening concurrently. VACUUM
(not FULL) also takes ShareUpdateExclusiveLock, and they can conflict. It
seems that VACUUM will skip the table if there is a conflicting lock
already taken, unless it is doing the vacuum to prevent id wraparound, in
which case there can be a deadlock. I guess the deadlock happens if:

- VACUUM takes a lock on pg_dist_transaction and is done for id
wraparound problem
- The transaction in the maintenance tries to take a lock but
cannot as that conflicts with the lock acquired by VACUUM
- The transaction in the maintenance daemon has a very old xid hence
VACUUM cannot proceed.

If we take a row exclusive lock in transaction recovery instead, it
wouldn't conflict with VACUUM, so VACUUM could proceed and the deadlock
would be resolved. To prevent concurrent transaction recoveries, an
advisory lock is still taken in ShareUpdateExclusiveLock mode, as sketched
below.

* Use CITUS_OPERATIONS tag
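
An illustrative sketch of the resulting locking scheme (the advisory lock key and the helper name are made up; DistTransactionRelationId() is assumed to resolve the catalog's OID):

```
#include "postgres.h"
#include "access/table.h"
#include "miscadmin.h"
#include "storage/lock.h"

static Relation
OpenTransactionCatalogForRecovery(void)
{
	LOCKTAG tag;

	/* serialize concurrent recoveries without locking the table itself */
	SET_LOCKTAG_ADVISORY(tag, MyDatabaseId, 0, 0, 17);
	(void) LockAcquire(&tag, ShareUpdateExclusiveLock, false, false);

	/* RowExclusiveLock does not conflict with VACUUM's
	 * ShareUpdateExclusiveLock, so the deadlock described above is avoided */
	return table_open(DistTransactionRelationId(), RowExclusiveLock);
}
```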
2020-09-21 15:20:38 +03:00
Onur Tirtir 1b31b22635 Refactor the functions that return OID lists for citus tables 2020-09-18 16:42:46 +03:00
SaitTalhaNisanci dae2c69fd7
Not allow removing a single node with ref tables (#4127)
* Not allow removing a single node with ref tables

We should not allow removing a node if it is the only node in the
cluster and there is data on it. We had this check for distributed
tables, but we didn't have it for reference tables.

* Update src/test/regress/expected/single_node.out

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>

* Update src/test/regress/sql/single_node.sql

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2020-09-18 15:35:59 +03:00
Onur Tirtir 4118560b75
Prevent citus local table creation from a catalog table (#4158) 2020-09-15 14:30:48 +03:00
Marco Slot bd12555b16 Fix distributing tables owned by extensions 2020-09-10 04:46:11 +02:00
Onur Tirtir 3a73fba810 Apply planner changes for citus local tables 2020-09-09 11:51:18 +03:00
Onur Tirtir a58a4395ab Extend citus local table utility command support
This commit brings the following features:

* Foreign key support from citus local tables to reference tables
* Foreign key support from reference tables to citus local tables
  (only with RESTRICT & NO ACTION behavior)
* ALTER TABLE ENABLE/DISABLE trigger command support
* CREATE/DROP/ALTER trigger command support

and disallows:
* ALTER TABLE ATTACH/DETACH PARTITION commands
* CREATE TABLE <postgres table> ATTACH PARTITION <citus local table>
  commands
* Foreign keys from postgres tables to citus local tables
  (the other way was already disallowed)

for citus local tables.
2020-09-09 11:50:55 +03:00
Onur Tirtir 17cc810372 Implement "citus local table" creation logic 2020-09-09 11:50:48 +03:00
Onur Tirtir ba208eae4d
Record non-distributed table accesses in local executor (#4139) 2020-09-07 18:19:08 +03:00
Nils Dijk bbf42063a7
export LookupShardTransferMode 2020-09-03 16:06:38 +02:00
Nils Dijk 6e4862c57f
expose transfer mode for ensuring reference table existence 2020-09-03 16:06:37 +02:00
SaitTalhaNisanci 366461ccdb
Introduce cache entry/table utilities (#4132)
Introduce table entry utility functions

Citus table cache entry utilities are introduced so that we can easily
extend existing functionality with minimum changes, specifically changes
to these functions. For example IsNonDistributedTableCacheEntry can be
extended for citus local tables without the need to scan the whole
codebase and update each relevant part.

* Introduce utility functions to find the type of tables

A table type can be a reference table or a hash/range/append distributed
table. Utility methods are created so that we don't have to worry about
how a table is classified as a reference table etc. This also makes it
easy to extend the table types (a hypothetical shape of these utilities
is sketched below).

* Add IsCitusTableType utilities

* Rename IsCacheEntryCitusTableType -> IsCitusTableTypeCacheEntry

* Change citus table types in some checks
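
A hypothetical shape of these utilities (the enum values and the signature are illustrative, not necessarily the merged ones):

```
typedef enum CitusTableType
{
	HASH_DISTRIBUTED,
	RANGE_DISTRIBUTED,
	APPEND_DISTRIBUTED,
	REFERENCE_TABLE,
	ANY_CITUS_TABLE_TYPE
} CitusTableType;

/* answers "is this relation a Citus table of the given type?" in one place */
extern bool IsCitusTableType(Oid relationId, CitusTableType tableType);
```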
2020-09-02 22:26:05 +03:00
Jelte Fennema 451ea04508 Rename ForceXxx functions to XxxOrError
This clearer naming was suggested in https://github.com/citusdata/citus/pull/4001
2020-09-01 11:19:17 +02:00
Hanefi Önaldı 024d398cd7
Allow distribution of functions that read from reference tables
create_distributed_function(function_name,
                            distribution_arg_name,
                            colocate_with text)

This UDF did not allow the colocate_with parameter when no
distribution_arg_name was supplied. This commit changes the behaviour to
allow a missing distribution_arg_name parameter when the function should
be colocated with a reference table.
2020-09-01 07:28:34 +03:00
Hanefi Onaldi f47b3a7e7d
Remove unused parameters from round robin reordering and friends (#4120) 2020-08-20 12:45:01 +03:00
SaitTalhaNisanci 679bf0d2b2
Create CanPushdownSubqery wrapper for better readability (#4108) 2020-08-12 17:28:20 +03:00
SaitTalhaNisanci 73ef40886b
Rename FindNodeCheckXXX functions (#4106)
The FindNodeCheckXXX names are not clear about what the functions do. They
are renamed to FindNodeMatchingCheckFunctionXXX. Also, for choosing elements
in these functions, the CheckNodeFunc type is introduced.
2020-08-11 15:01:23 +03:00
Hadi Moshayedi 7b74eca22d Support EXPLAIN EXECUTE ANALYZE. 2020-08-10 13:44:30 -07:00
Halil Ozan Akgul 375310b7f1 Adds support for table undistribution 2020-08-05 14:36:03 +03:00
Sait Talha Nisanci fe4ac51d8c Normalize Output:.. since it changes with pg13
Fix indentation for better readability
2020-08-04 15:38:13 +03:00
Sait Talha Nisanci 63ed126ad4 Set buffer usage with explain
It seems that currently we process even postgres tables in explain
commands. This is because we register a hook for explain and we don't
have any check to see if the query has any citus table.

With this commit, we now send the buffer usage as well to the relevant
API. There is some duplication in the code, but it is because of the
existing structure; we can refactor it separately.
2020-08-04 15:38:13 +03:00
Sait Talha Nisanci fe1e1c9b68 Replace Set_ptr_value as SetListCellPtr to be more explicit
Move header to right place and fix comment style
2020-08-04 15:38:13 +03:00
Sait Talha Nisanci 8e9b52971c Use new var field names in the codebase
The codebase is updated to use varattnosync and varnosyn and we defined
the macros for older versions. This way we can just remove the macros
when we drop an older version.
2020-08-04 15:38:13 +03:00
Sait Talha Nisanci b641f63bfd Use CMDTAG_SELECT_COMPAT
CMDTAG_SELECT already exists as an enum value in PG13, hence defining a
macro such as CMDTAG_SELECT -> "SELECT" is not possible. I chose
CMDTAG_SELECT_COMPAT because with the COMPAT suffix it is explicit that it
maps to different things in different versions, and it also has less chance
of mapping to something irrelevant. For example, if we used SELECT as a
macro, then it would map every SELECT to whatever it is mapping to, which
might have unexpected/undesired behaviour.
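
A minimal sketch of the likely shape of the macro, based on the description above:

```
#if PG_VERSION_NUM >= 130000
#define CMDTAG_SELECT_COMPAT CMDTAG_SELECT
#else
#define CMDTAG_SELECT_COMPAT "SELECT"
#endif
```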
2020-08-04 15:38:13 +03:00
Sait Talha Nisanci d68bfc5687 Improve error for index operator class parameters
The error message when index has opclassopts is improved and the commit
from postgres side is also included for future reference.

Also some minor style related changes are applied.
2020-08-04 15:38:13 +03:00
Sait Talha Nisanci 1070828465 update cte inline output for pg13
Make some macros in version_compat more robust
Remove commented code in ruleutils
Remove unnecessary variable assignments
2020-08-04 15:18:27 +03:00
Sait Talha Nisanci 1112b254a7 adapt recently added code for pg13
This commit mostly adds pg_get_triggerdef_command to our ruleutils_13.
This doesn't add anything extra for ruleutils 13 so it is basically a copy
of the change on ruleutils_12
2020-08-04 15:18:27 +03:00
Sait Talha Nisanci 38aaf1faba use QueryCompletion struct
Postgres introduced QueryCompletion struct. Hence a compat utility is
added to finish query completion for older versions and pg >= 13.

The commit on Postgres side:
2f9661311b83dc481fc19f6e3bda015392010a40
2020-08-04 15:10:22 +03:00
Sait Talha Nisanci 9f1ec792b3 add queryString to distributed_planner
distributed_planner now takes query string as a parameter.

related commit on PG side:
6aba63ef3e606db71beb596210dd95fa73c44ce2
2020-08-04 15:10:22 +03:00
Sait Talha Nisanci 1a7ccac6ef Add RangeTableEntryFromNSItem macro
addRangeTableEntryXXX methods return a ParseNamespaceItem with pg >= 13.
RangeTableEntryFromNSItem macro is added so that we return the range
table entry from the ParseNamespaceItem in pg>=13 and for pg < 13 rte
would already be returned with addRangeTableEntryXXX methods.

Commit on Postgres side:
5815696bc66b3092f6361f53e0394909647042c8
2020-08-04 15:10:22 +03:00
Sait Talha Nisanci 4ed30a0824 create Set_ptr_value
Since PG13 changed the List implementation, a ListCell no longer wraps its
payload in a `data` member. Therefore the Set_ptr_value macro is created,
so that depending on the version it uses either cell->data.ptr_value or
cell->ptr_value, as sketched below.

Commit on Postgres side:
1cff1b95ab6ddae32faa3efe0d95a820dbfdc164
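
A sketch of the version-dependent macro described above:

```
#if PG_VERSION_NUM >= 130000
#define Set_ptr_value(cell, value) ((cell)->ptr_value = (value))
#else
#define Set_ptr_value(cell, value) ((cell)->data.ptr_value = (value))
#endif
```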
2020-08-04 15:10:22 +03:00
Sait Talha Nisanci ab85a8129d map varoattno and varnoold fields in Var
With PG13, the varoattno and varnoold fields were renamed to varattnosyn
and varnosyn. Macros are defined for these.

Commit on Postgres side:
9ce77d75c5ab094637cc4a446296dc3be6e3c221

Command on Postgres side:
git log --all --grep="varoattno"
2020-08-04 15:10:22 +03:00
Sait Talha Nisanci 688ab16bba Introduce ExplainOnePlanCompat
Since ExplainOnePlan expects BufferUsage as well with PG >= 13,
ExplainOnePlanCompat is added.

Commit on Postgres side:
ed7a5095716ee498ecc406e1b8d5ab92c7662d10
2020-08-04 15:10:22 +03:00
Sait Talha Nisanci 6314eba5df introduce standard_planner_compat
standard_planner now takes the query string as a parameter as well with
pg >= 13.

Commit on Postgres Side:
66888f7424f7d6c7cea2c26e181054d1455d4e7a
2020-08-04 15:10:22 +03:00
Sait Talha Nisanci 991f49efc9 introduce getOwnedSequencesCompat macro
Commit on Postgres side:
19781729f789f3c6b2540e02b96f8aa500460322
2020-08-04 15:10:22 +03:00
Sait Talha Nisanci 00e7386007 introduce PortalDefineQuerySelectCompat
PortalDefineQuery doesn't accept char* for command tag anymore with PG
>= 13. We are currently only using it with Select, therefore a Portal
define query compat for select is created.

Commit on PG side:
2f9661311b83dc481fc19f6e3bda015392010a40
2020-08-04 15:10:22 +03:00
Sait Talha Nisanci 62879ee8c1 introduce planner_compat and pg_plan_query_compat macros
As the new planner and pg_plan_query methods expect the query
string as well, macros are defined to stay compatible across
different versions of postgres; see the sketch below.

Relevant commit on Postgres:
6aba63ef3e606db71beb596210dd95fa73c44ce2

Command on Postgres:
git log --all --grep="pg_plan_query"
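
A sketch of the likely shape of these macros; pg >= 13 threads the query string through, while older versions drop it:

```
#if PG_VERSION_NUM >= 130000
#define planner_compat(parse, query_string, options, params) \
	planner(parse, query_string, options, params)
#define pg_plan_query_compat(parse, query_string, options, params) \
	pg_plan_query(parse, query_string, options, params)
#else
#define planner_compat(parse, query_string, options, params) \
	planner(parse, options, params)
#define pg_plan_query_compat(parse, query_string, options, params) \
	pg_plan_query(parse, options, params)
#endif
```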
2020-08-04 15:10:22 +03:00
Sait Talha Nisanci bf831d2e59 Use table_openXXX methods in the codebase
With PG13 heap_* (heap_open, heap_close etc) are replaced with table_*
(table_open, table_close etc).

It is better to use the new table access methods in the codebase and
define the macros for the previous versions as we can easily remove the
macro without having to change the codebase when we drop the support for
the old version.

Commits that introduced this change on Postgres:
f25968c49697db673f6cd2a07b3f7626779f1827
e0c4ec07284db817e1f8d9adfb3fffc952252db0
4b21acf522d751ba5b6679df391d5121b6c4a35f

Command to see relevant commits on Postgres side:
git log --all --grep="heap_open"
2020-08-04 15:10:22 +03:00
Sait Talha Nisanci 0819b79631 introduce list compat macros
Pass the list to lnext API
lnext API now expects the list as well.
The commit on Postgres that introduced the change: 1cff1b95ab6ddae32faa3efe0d95a820dbfdc164

lnext_compat and list_delete_cell_compat macros are introduced so that
we can use these macros in the codebase without having to use #if
directives in the codebase.

Related commit on postgres:
1cff1b95ab6ddae32faa3efe0d95a820dbfdc164

Command to search in postgres:
git log --all --grep="list_delete_cell"

add ListCellAndListWrapper

When iterating a list in separate function calls, we need both the list
and the current cell starting from PG13, therefore
ListCellAndListWrapper is added to store both as a wrapper.

Use ListCellAndListWrapper in foreign key test udfs

As we iterate a list in these udfs using a functionContext, we need to
use the wrapper to be able to access both the list and the current cell.
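
A sketch of the wrapper described above (field names may differ from the merged version):

```
typedef struct ListCellAndListWrapper
{
	List *list;         /* the list being iterated */
	ListCell *listCell; /* current position within the list */
} ListCellAndListWrapper;
```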
2020-08-04 15:10:22 +03:00
Sait Talha Nisanci 30549dc0e2 add copy of ruleutils_12 as ruleutils_13 2020-08-04 13:34:13 +03:00
Onder Kalaci eeb8c81de2 Implement shared connection count reservation & enable `citus.max_shared_pool_size` for COPY
With this patch, we introduce `locally_reserved_shared_connections.c/h` files
which are responsible for reserving some space in shared memory counters
upfront.

We sometimes need to reserve connections, but not necessarily
establish them. For example:
-  COPY command should reserve connections as it cannot know which
   connections it needs in which order. COPY establishes connections
   as any input data hits the workers. For example, for router COPY
   command, it only establishes 1 connection.

   As discussed here (https://github.com/citusdata/citus/pull/3849#pullrequestreview-431792473),
   COPY needs to reserve connections up front, otherwise we can end
   up with resource starvation/undetected deadlocks; a sketch of the
   reservation idea follows.
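
A minimal sketch of reserving against a shared counter up front (illustrative names, not the actual locally_reserved_shared_connections API):

```
#include "postgres.h"
#include "port/atomics.h"

static bool
TryReserveSharedConnectionSlot(pg_atomic_uint32 *connectionCount,
							   uint32 maxSharedPoolSize)
{
	uint32 currentCount = pg_atomic_read_u32(connectionCount);

	while (currentCount < maxSharedPoolSize)
	{
		/* on failure, currentCount is refreshed and the loop retries */
		if (pg_atomic_compare_exchange_u32(connectionCount, &currentCount,
										   currentCount + 1))
		{
			return true; /* reserved; COPY may establish it later or release it */
		}
	}

	return false; /* pool exhausted; the caller must wait or error out */
}
```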
2020-08-03 18:51:40 +02:00
SaitTalhaNisanci ef841115de
Fix int32 overflow and use PG macros for INT32_XX (#4061)
* Use CalculateUniformHashRangeIndex in HashPartitionId

The INT32_MIN definition can change among different platforms, hence it is
possible to get an overflow; we would see crashes because of this in Debian
distros. We have already solved a similar problem by introducing the
CalculateUniformHashRangeIndex method, hence we can use the same method
here too. This also removes some duplication and leaves a single place
to decide that (see the sketch below).

* Use PG_INT32_XX instead of INT32_XX to be safer
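
A sketch of the overflow-safe idea (not the actual function body): widen to 64 bits before shifting the hash into a zero-based range, instead of doing INT32_MIN arithmetic in 32 bits:

```
#include "postgres.h"

/* assumes rangeCount > 0 */
static uint32
UniformHashRangeIndexSketch(int32 hashValue, uint32 rangeCount)
{
	/* (int64) hashValue - PG_INT32_MIN is always in [0, 2^32), no overflow */
	uint64 zeroBasedHash = (uint64) ((int64) hashValue - PG_INT32_MIN);
	uint64 rangeSize = (UINT64CONST(1) << 32) / rangeCount;
	uint32 rangeIndex = (uint32) (zeroBasedHash / rangeSize);

	/* the last range absorbs the remainder when 2^32 % rangeCount != 0 */
	if (rangeIndex >= rangeCount)
	{
		rangeIndex = rangeCount - 1;
	}

	return rangeIndex;
}
```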
2020-07-23 18:30:08 +03:00
Onder Kalaci cfb633601d Minor refactorings in COPY command execution
1) Rename CONNECTION_PER_PLACEMENT to REQUIRE_CLEAN_CONNECTION. This is
mostly to make things clear as the new name reveals more.

2) We also make sure to mark all the COPY connections critical,
even if they are accessed earlier in the transaction.
2020-07-23 15:36:19 +02:00
Onder Kalaci 52c0fccb08 Move executor specific logic to a function
Because as we're planning to use the same logic, it'd be nice to
use the exact same functions.
2020-07-22 15:09:47 +02:00
Onder Kalaci ff6555299c Unify node sort ordering
The executor relies on WorkerPool, and many other places rely on WorkerNode.
With this commit, we make sure that they are sorted via the same function/logic.
2020-07-22 11:03:25 +02:00
Hanefi Önaldı e534dbae4a
Accept list of values in a supported ALTER ROLE .. SET statement
Some GUCs support a list of values which is indicated by GUC_LIST_INPUT flag.

When an ALTER ROLE .. SET statement is executed, the new configuration
default for the affected users and databases is stored in the
setconfig(text[]) column of a pg_db_role_setting record.

If a GUC that supports a list of values is used in an ALTER ROLE .. SET
statement, we need to split the text into items delimited by commas.
2020-07-21 03:49:57 +03:00
Onder Kalaci c25de2cf22 Remove flag from
As it doesn't make any sense anymore
2020-07-20 12:45:05 +02:00
SaitTalhaNisanci b3af63c8ce
Remove task tracker executor (#3850)
* use adaptive executor even if task-tracker is set

* Update check-multi-mx tests for adaptive executor

Basically, repartition joins are enabled where necessary. For parallel
tests, the max adaptive executor pool size is decreased to 2, otherwise we
would get a too many clients error.

* Update limit_intermediate_size test

It seems that when we use adaptive executor instead of task tracker, we
exceed the intermediate result size less often in the test. Therefore the
tests are updated accordingly.

* Update multi_router_planner

It seems that there is one problem with multi_router_planner when we use
adaptive executor, we should fix the following error:
+ERROR:  relation "authors_range_840010" does not exist
+CONTEXT:  while executing command on localhost:57637

* update repartition join tests for check-multi

* update isolation tests for repartitioning

* Error out if shard_replication_factor > 1 with repartitioning

As we are removing the task tracker, we cannot switch to it if
shard_replication_factor > 1. In that case, we simply error out.

* Remove MULTI_EXECUTOR_TASK_TRACKER

* Remove multi_task_tracker_executor

Some utility methods are moved to task_execution_utils.c.

* Remove task tracker protocol methods

* Remove task_tracker.c methods

* remove unused methods from multi_server_executor

* fix style

* remove task tracker specific tests from worker_schedule

* comment out task tracker udf calls in tests

We were using task tracker udfs to test permissions in
multi_multiuser.sql. We should find some other way to test them, then we
should remove the commented out task tracker calls.

* remove task tracker test from follower schedule

* remove task tracker tests from multi mx schedule

* Remove task-tracker specific functions from worker functions

* remove multi task tracker extra schedule

* Remove unused methods from multi physical planner

* remove task_executor_type related things in tests

* remove LoadTuplesIntoTupleStore

* Do initial cleanup for repartition leftovers

During startup, task tracker would call TrackerCleanupJobDirectories and
TrackerCleanupJobSchemas to clean up leftover directories and job
schemas. With adaptive executor, while doing repartitions it is possible
to leak these things as well. We don't retry cleanups, so it is possible
to have leftovers in case of errors.

TrackerCleanupJobDirectories is renamed to
RepartitionCleanupJobDirectories since it is repartition specific now;
however, TrackerCleanupJobSchemas cannot be used currently because it is
task tracker specific. The thing is that this function is currently a
no-op anyway.

We should add cleaning up intermediate schemas to the DoInitialCleanup
method when that problem is solved (we might want to solve it in this PR
as well).

* Revert "remove task tracker tests from multi mx schedule"

This reverts commit 03ecc0a681.

* update multi mx repartition parallel tests

* not error with task_tracker_conninfo_cache_invalidate

* not run 4 repartition queries in parallel

It seems that when we run 4 repartition queries in parallel, we get a too
many clients error on CI even though we don't get it locally. Our guess
is that it is because we open/close many connections without doing much
work, and postgres has some delay closing connections. Hence even
though connections are removed from pg_stat_activity, they might
still not be closed. If the above assumption is correct, it is unlikely
for this to happen in practice because:
- There is some network latency in clusters, so this leaves some times
for connections to be able to close
- Repartition joins return some data and that also leaves some time for
connections to be fully closed.

As we don't get this error in our local, we currently assume that it is
not a bug. Ideally this wouldn't happen when we get rid of the
task-tracker repartition methods because they don't do any pruning and
might be opening more connections than necessary.

If this still gives us "too many clients" error, we can try to increase
the max_connections in our test suite(which is 100 by default).

Also there are different places where this error is given in postgres,
but adding some backtrace it seems that we get this from
ProcessStartupPacket. The backtraces can be found in this link:
https://circleci.com/gh/citusdata/citus/138702

* Set distributePlan->relationIdList when it is needed

It seems that we were setting the distributedPlan->relationIdList after
JobExecutorType is called, which would choose task-tracker if
replication factor > 1 and there is a repartition query. However, it
uses relationIdList to decide if the query has a repartition query, and
since it was not set yet, it would always think it is not a repartition
query and would choose adaptive executor when it should choose
task-tracker.

* use adaptive executor even with shard_replication_factor > 1

It seems that we were already using adaptive executor when
replication_factor > 1. So this commit removes the check.

* remove multi_resowner.c and deprecate some settings

* remove TaskExecution related leftovers

* change deprecated API error message

* not recursively plan single relation repartition subquery

* recursively plan single relation repartition subquery

* test deprecated task tracker functions

* fix overlapping shard intervals in range-distributed test

* fix error message for citus_metadata_container

* drop task-tracker deprecated functions

* put the implementation back to worker_cleanup_job_schema_cache since citus cloud uses it

* drop some functions, add downgrade script

Some deprecated functions are dropped.
Downgrade script is added.
Some gucs are deprecated.
A new guc for repartition joins bucket size is added.

* order by a test to fix flappiness
2020-07-18 13:11:36 +03:00
Hadi Moshayedi 13003d8d05 Use TupleDestination API for partitioning in insert/select. 2020-07-17 09:43:46 -07:00
Nils Dijk d0b6e62c9a
change wording to allowlist and the likes (#3906)
In the same line as #3904

Change wording to better reflect use and remove words that enforce/maintain bias.
2020-07-15 16:24:40 +02:00
Sait Talha Nisanci 510535f558 address feedback 2020-07-13 19:45:02 +03:00
Sait Talha Nisanci db1b78148c send schema creation/cleanup to coordinator in repartitions
We were using the ALL_WORKERS TargetWorkerSet while sending temporary schema
creation and cleanup. We (well, mostly I) thought that ALL_WORKERS would also include the coordinator when it is added as a worker. It turns out that it was FILTERING OUT the coordinator even if it is added as a worker to the cluster.

So to have some context here: in repartitions, for each jobId we create
(at least we were supposed to) a schema in each worker node in the cluster. Then we partition each shard table into some intermediate files, which is called the PARTITION step. After this partition step, each node has some intermediate files holding tuples. Then we fetch the partition files to the necessary worker nodes, which is called the FETCH step. Then from the files we create intermediate tables in the temporarily created schemas, which is called the MERGE step. Then after evaluating the result, we remove the temporary schemas (one for each job ID in each node) and files.

If node 1 has file1, and node 2 has file2 after PARTITION step, it is
enough to either move file1 from node1 to node2 or vice versa. So we
prune one of them.

In the MERGE step, if the schema for a given jobID doesn't exist, the
node tries to use the `public` schema if it is a superuser, which was
actually added for testing purposes in the past.

So when we were not sending schema creation commands for each job ID to
the coordinator (because we were using the ALL_WORKERS flag, and it doesn't
include the coordinator), we would basically not have any schemas for
repartitions on the coordinator. The PARTITION step would be executed on
the coordinator (because the tasks are generated in the planner part)
and it wouldn't give us any error because it doesn't have anything to do
with the temporary schemas (which we didn't create). But later two things
would happen:

- If by chance the fetch is pruned on the coordinator side, the other
nodes would fetch the partitioned files from the coordinator and execute
the query as expected, because they have all the information.
- If the fetch tasks are not pruned in the coordinator, in the MERGE
step, the coordinator would either error out saying that the necessary
schema doesn't exist, or it would try to create the temporary tables
under public schema ( if it is a superuser). But then if we had the same
task ID with different jobID it would fail saying that the table already
exists, which is an error we were getting.

In the first case, the query would work okay, but it would still not do
the cleanup, so we would leave the partitioned files from the
PARTITION step behind. Hence ensure_no_intermediate_data_leak would fail.

To make things more explicit and prevent such bugs in the future,
ALL_WORKERS is renamed to ALL_NON_COORD_WORKERS, and a new flag that
returns all the active nodes is added as ALL_DATA_NODES. For the
repartition case, we don't need the nodes that only hold reference
tables, but this version makes the code simpler and there shouldn't be
any significant performance issue with that.
2020-07-13 19:20:15 +03:00
SaitTalhaNisanci 15290bc43b
remove unused worker methods (#4017) 2020-07-10 13:45:55 +03:00
SaitTalhaNisanci 3f50165365
rename TargetWorkerSet enums (#4015)
Rename TargetWorkerSet enums to make them more explicit about what they
mean. Ideally it would be good to treat everything as a node without the
'worker' concept because it makes things complicated. Another
improvement could be to rename TargetWorkerSet to TargetNodeSet, but that
requires renaming many occurrences of Worker, which is probably too big
for this PR.
2020-07-10 11:21:27 +03:00
Jelte Fennema 759e628dd5
Handle some NULL issues that static analysis found (#4001)
Static analysis found some issues where we used the result from
ExtractResultRelationRTE, without checking that it wasn't NULL. It seems
like in all these cases it can never actually be NULL, since we have checked
before that it isn't a SELECT query. So, this PR is mostly to make static
analysis happy (and protect a bit against future changes of the code).
2020-07-09 15:46:42 +02:00
SaitTalhaNisanci 96adce77d6
rename node/worker utilities (#4003)
The names were not explicit about what they do, and we have many
misusages in the codebase, so they are renamed to be more explicit.
2020-07-09 15:30:35 +03:00
Jelte Fennema f6e2f1b1cb
Replace words that have bad associations (#3992)
We had a few words in our codebase that static analysis flagged as having bad
associations.
2020-07-08 14:57:48 +02:00
Hadi Moshayedi 23fa421639 Fix task->fetchedExplainAnalyzePlan memory issue. 2020-07-07 07:58:02 -07:00
citus bot f0693e2f75 Remove unused MaxMasterConnectionCount function 2020-07-07 10:37:57 +02:00
citus bot bdfeb380d3 Fix some more master->coordinator comments 2020-07-07 10:37:53 +02:00
Marco Slot b4fec63bc0 Rename master evaluation to coordinator evaluation 2020-07-07 10:37:41 +02:00
Marco Slot eeffbde8bd Fix pushdown of constants in aggregate queries 2020-06-30 11:41:16 -07:00
Jelte Fennema 392c5e2c34
Fix wrong cancellation message about distributed deadlocks (#3956) 2020-06-30 14:57:46 +02:00
Marco Slot 634d6cf9d7
Improve performance of metadata cache (#3924)
#3866 removed the shard ID hash in metadata_cache.c to simplify cache management,
but we observed a significant performance regression in our benchmarks that was
being masked by the performance improvement provided by #3654; however, #3654
only applies to specific workloads.

This PR brings back the shard ID cache as it existed before #3866 with some extra
 measures to handle invalidation. When we load a table entry, we overwrite 
ShardIdCacheEntry->tableEntry pointers for all the shards in that table, though 
it's possible that the table no longer contains the old shard ID or the table 
entry is never reloaded, which would leave a dangling pointer once the table 
entry is freed. To handle that case, we remove all shard ID cache entries that 
point exactly to that table entry when a table is freed (at the end of the 
transaction or any call to CitusTableCacheFlushInvalidatedEntries).

Co-authored-by: SaitTalhaNisanci <s.talhanisanci@gmail.com>
Co-authored-by: Marco Slot <marco.slot@gmail.com>
Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2020-06-30 12:10:10 +02:00
Hadi Moshayedi 4ed59d2db3 Move more from insert_select_executor to insert_select_planner 2020-06-26 08:08:26 -07:00
Hadi Moshayedi d34c21890f Rename CoordinatorInsertSelect... to NonPushableInsertSelect 2020-06-25 08:55:48 -07:00
Hadi Moshayedi 4e8d79998e Save INSERT/SELECT method in DistributedPlan.
This is so we don't need to calculate it twice in
insert_select_executor.c and multi_explain.c, which can
cause discrepancy if an update in one of them is not
reflected in the other site.
2020-06-25 08:55:48 -07:00
SaitTalhaNisanci f458d1fd1c
Fix/task execution (#3941)
* Not set TaskExecution with adaptive executor

The adaptive executor uses a utility method from the task tracker for
repartition joins; however, the adaptive executor doesn't need taskExecution,
which is only used by the task tracker. This causes a problem when explain
analyze is used, because what taskExecution points to might be
random.

We solve this by not setting taskExecution from adaptive executor. So it
will stay NULL as set by CreateTask.

* use same memory context as task for taskExecution

Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2020-06-24 12:10:00 +03:00
Marco Slot 2a3234ca26 Rename masterQuery to combineQuery 2020-06-17 14:14:37 +02:00
Jelte Fennema 0259815d3a
Fix EXPLAIN ANALYZE received data counter issues (#3917)
In #3901 the "Data received from worker(s)" sections were added to EXPLAIN
ANALYZE. After merging, @pykello posted some review comments. This addresses
those comments, as well as fixing other issues that I found while addressing
them. The things this does:

1. Fix `EXPLAIN ANALYZE EXECUTE p1` to not increase received data on every
   execution
2. Fix `EXPLAIN ANALYZE EXECUTE p1(1)` to not always return 0 bytes as
   received data.
3. Move `EXPLAIN ANALYZE` specific logic to `multi_explain.c` from
   `adaptive_executor.c`
4. Change naming of new explain sections to `Tuple data received from node(s)`.
   Firstly because a task can reference the coordinator too, so "worker(s)" was
   incorrect. Secondly to indicate that this is tuple data and not all network
   traffic that was performed.
5. Rename `totalReceivedData` in our codebase to `totalReceivedTupleData` to
   make it clearer that it's a tuple data counter, not all network traffic.
6. Actually add `binary_protocol` test to `multi_schedule` (woops)
7. Fix a randomly failing test in `local_shard_execution.sql`.
2020-06-17 11:33:38 +02:00