Commit Graph

285 Commits (98b99634f3844c2b399fb6bc0e2841111ece0ad1)

Author SHA1 Message Date
Marco Slot fd4ff29f2f Add a debug message with distribution column value 2018-06-05 15:09:17 +03:00
Dimitri Fontaine 8b258cbdb0 Lock reads and writes only to the node being updated in master_update_node
Rather than locking out all the writes in the cluster, the function now only
locks out writes that target shards hosted by the node we're updating.
2018-05-09 15:14:20 +02:00
Hadi Moshayedi 86b12bc2d0
Always prefix operators with their namespace. (#2147)
Previously we checked if an operator is in pg_catalog, and if it wasn't we prefixed it with namespace in worker queries. This can have a huge impact on performance of physical planner when using custom data types.

This happened regardless of current search_path config, because Citus overrides the search path in get_query_def_extended(). When we do so, the check for existence of the operator in current search path in generate_operator_name() fails for any operators outside pg_catalog. This means that nothing gets cached, and in the following calls we will again recheck the system tables for existence of the operators, which took an additional 40-50ms for some of the usecases we were seeing.

In this change we skip the pg_catalog check, and always prefix the operator with its namespace.
2018-05-05 13:27:26 -04:00
Onder Kalaci 317dd02a2f Implement single repartitioning on hash distributed tables
* Change worker_hash_partition_table() such that the
     divergence between Citus planner's hashing and
     worker_hash_partition_table() becomes the same.

   * Rename single partitioning to single range partitioning.

   * Add single hash repartitioning. Basically, logical planner
     treats single hash and range partitioning almost equally.
     Physical planner, on the other hand, treats single hash and
     dual hash repartitioning almost equally (except for JoinPruning).

   * Add a new GUC to enable this feature
2018-05-02 18:50:55 +03:00
velioglu 32bcd610c1 Support modify queries with multiple tables
With this commit we begin to support modify queries with multiple
tables if these queries are pushdownable.
2018-05-02 16:22:26 +03:00
Marco Slot 304b3a41ba Cache the partition column Var 2018-04-26 14:58:16 -06:00
Murat Tuncer a6fe5ca183 PG11 compatibility update
- changes in ruleutils_11.c is reflected
- vacuum statement api change is handled. We now allow
  multi-table vacuum commands.
- some other function header changes are reflected
- api conflicts between PG11 and earlier versions
  are handled by adding shims in version_compat.h
- various regression tests are fixed due output and
  functionality in PG1
- no change is made to support new features in PG11
  they need to be handled by new commit
2018-04-26 11:29:43 +03:00
Hadi Moshayedi 966f01fad3
Fix write and copy functions for TaskExecution. (#2120)
We were missing criticalErrorOccurred from CopyNodeTaskExecution() and OutTaskExecution(). This PR fixes it.
2018-04-23 09:07:52 -04:00
Marco Slot ee132c5ead Prune shards once per relation in subquery pushdown 2018-04-10 20:33:07 +02:00
Burak Yucesoy 0c283fa8a3 Add partitioning support to MX tables
Previously, we prevented creation of partitioned tables on Citus MX.
We decided to not focus on this feature until there is a need. Since
now there are requests for this feature, we are implementing support
for partitioned tables on Citus MX.
2018-04-06 12:47:06 +03:00
velioglu 698d585fb5 Remove broadcast join logic
After this change all the logic related to shard data fetch logic
will be removed. Planner won't plan any ShardFetchTask anymore.
Shard fetch related steps in real time executor and task-tracker
executor have been removed.
2018-03-30 11:45:19 +03:00
Brian Cloutier f8f0d4aedc Add Windows replacement for uname 2018-03-21 20:35:56 -07:00
Metin Doslu bcf660475a Add support for modifying CTEs 2018-02-27 15:08:32 +02:00
Marco Slot 0cba4ab588 Refactor worker node hash initialisation 2018-02-12 23:36:43 +01:00
Marco Slot 40d715d494 Cache worker node array for faster iteration 2018-02-12 23:36:43 +01:00
Marco Slot 6f7c3bd73b Skip JSON validation on coordinator during COPY 2018-02-02 15:33:27 +01:00
Brian Cloutier e6ebfc1f53 Remove VLA from UpdateNodeLocation 2018-02-01 10:30:41 -08:00
Brian Cloutier a2ed45e206 Remove variable length arrays
VLAs aren't supported by Visual Studio.

- Remove all existing instances of VLAs.
- Add a flag, -Werror=vla, which makes gcc refuse to compile if we add
  VLAs in the future.
2018-02-01 10:30:41 -08:00
Brian Cloutier 457f570b77 Small refactor, we were using incompatible types 2018-01-31 11:05:59 -08:00
Brian Cloutier b864d014ab
GetNextNodeId() incorrectly called PG_RETURN_DATUM
- Also stabilize the output of a multi_router_planner test
2018-01-29 15:32:36 -08:00
Brian Cloutier 76d1edc3fd
Don't rely on gcc-specific features (#1963)
* Don't use expressions inside compound statements
* Don't depend on __builtin_constant_p
* Remove reliance on S_ISLNK
* Replace use of __func__: older mcvs doesn't support this builtin
2018-01-23 17:03:29 -08:00
Dimitri Fontaine c9760fbb64 Fix CREATE INDEX with storage options on distributed tables.
By sharing the implementation of the function AppendOptionListToString on
three call sites, we would expand an extra OPTIONS keyword in a create index
statement, and omit other bits of the specific syntax here.

This patch introduces an AppendStorageParametersToString() function that is
very similar to AppendOptionListToString() but handles WITH(a="foo",...)
syntax that is used in reloptions (aka Storage Parameters).

Fixes #1747.
2018-01-17 21:56:40 +01:00
Dimitri Fontaine 952da72c55 Implement ALTER TABLE|INDEX ... SET|RESET ().
PostgreSQL implements support for several relation kinds in a single
statement, such as in the AlterTableStmt case, which supports both tables
and indexes and more (see ATExecSetRelOptions in PostgreSQL source code file
src/backend/commands/tablecmds.c for an example of that).

As a consequence, this patch implements support for setting and resetting
storage parameters on both relation kinds.
2018-01-17 21:56:40 +01:00
Brian Cloutier fb7b86fa14 Replace strtoull with pg_strtouint64
The macro we were using to detect strtoull isn't set on Windows, and
just in case there are differences use a portable function from PG
instead of calling strtoull directly.
2017-12-21 14:28:51 +01:00
mehmet furkan şahin fd546cf322 Intermediate result size limitation
This commit introduces a new GUC to limit the intermediate
result size which we handle when we use read_intermediate_result
function for CTEs and complex subqueries.
2017-12-21 14:26:56 +03:00
Marco Slot cbbd418af2 Add citus.copy_format OIDs to metadata cache 2017-12-14 09:32:55 +01:00
Marco Slot 7d1191954d Add DistributedSubPlan node 2017-12-14 09:32:55 +01:00
Marco Slot f8550b8c85 Fix issues with read_intermediate_result signature 2017-12-07 13:47:56 +01:00
Marco Slot eab15aa035 Avoid deadlock in ColocatedTableId 2017-12-06 11:49:34 +01:00
Marco Slot 7279d42849 Treat read_intermediate_result as recurring tuples 2017-12-04 14:50:11 +01:00
Murat Tuncer 2d66bf5f16
Fix hard coded formatting strings for 64 bit numbers (#1831)
Postgres provides OS agnosting formatting macros for
formatting 64 bit numbers. Replaced %ld %lu with
INT64_FORMAT and UINT64_FORMAT respectively.

Also found some incorrect usages of formatting
flags and fixed them.
2017-12-04 14:11:06 +03:00
Marco Slot 20a526d5c4 Fix memory leak in ListToHashSet 2017-11-22 11:26:58 +01:00
Marco Slot 8486f76e15 Auto-recover 2PC transactions 2017-11-22 11:26:58 +01:00
Marco Slot 6ba3f42d23 Rename MultiPlan to DistributedPlan 2017-11-22 09:36:24 +01:00
Andres Freund d063658d6d Protect some initializations from being called during backend startup.
On EXEC_BACKEND builds these functions shouldn't be called at every
backend start.
2017-11-20 15:29:51 -08:00
Brian Cloutier d267e0f9fa EXEC_BACKEND: don't put pointers to shared hashes into shared memory
Store pointers to shared hashes in process-local variables. Previously
pointers to shared hashes were put into shared memory. This causes
problems on EXEC_BACKEND because everybody calls execve and receives a
brand new address space; the shared hash will be in a different place
for every backend. (normally we call fork, which gives you a copy of the
address space, so these pointers remain constant)
2017-11-20 15:29:51 -08:00
Marco Slot ae47df01ea Observe prepared xacts twice in RecoverWorkerTransactions to avoid race condition 2017-11-20 11:44:08 +01:00
Marco Slot 2410c2e450 Rewrite recover_prepared_transactions to be fast, non-blocking 2017-11-20 11:27:40 +01:00
Brian Cloutier 0f3230170f Pull in INT32_MAXINT and INT32_MININT 2017-11-14 14:03:46 -08:00
Hadi Moshayedi 6d79d25101 Fix a relcache reference leak in stats collection.
In DistributedTablesSize() we didn't close the relations that had
replication factor > 2. This caused relcache reference leaks, and
warning messages like following in logs:

    WARNING:  relcache reference leak: relation "researchers" not closed
2017-11-06 23:16:43 -05:00
Marco Slot 6883a09cdd Allow distributed partitioned table creation in Cloud 2017-11-03 10:09:18 +01:00
Hadi Moshayedi 7280774cf4 Use list_length() != 1 in SingleReplicatedTable().
ShardPlacementList's implementation can return NIL. In previous implementation
we got a segmentation fault in this case. The relation can be dropped after
getting distributed table list but before calling SingleReplicatedTable().
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 7691991cb5 Do PG_TRY() inside a subtransaction block.
If we don't propagate the errors we are catching in PG_CATCH(), database's
internal state might not be clean. So we do PG_TRY() inside a subtransaction
so we can rollback to it after catching errors.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 9bfbbf8a04 Make reports hostname configurable and enable stats collection in tests.
This patch adds --with-reports-host configure option, which sets the
REPORTS_BASE_URL constant. The default is reports.citusdata.com.

It also enables stats collection in tests.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi acaf085a80 Add callback function for request by CollectBasicUsageStatistics().
Curl writes the received response to stdout if we don't specify a response
callback or an output file. This can pollute the PostgreSQL log. In this change
we add a callback function so the response messages aren't added to the log file.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 747e439601 Limit number of stats collection retries to once a day. 2017-10-31 21:51:43 -04:00
Hadi Moshayedi 78a2cd9052 Check for Citus updates.
Sends a request to /v1/releases/latest?flavor=$CITUS_EDITION once a day,
which returns a response similar to {"version": "7.1.0", "major": 7,
"minor": 1, "patch": 0}. Then compares it with current Citus version,
and if the latest release is newer, logs a LOG message.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 34f3ec0961 Call FlushDistTableCache() before stats collection. 2017-10-31 21:51:43 -04:00
Hadi Moshayedi c18c6625d9 Lock relations before calling citus_table_size().
This is to make sure they don't get dropped.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 97d544b75c Follow the patterns used in Deadlock Detection in Stats Collection.
This includes:

(1) Wrap everything inside a StartTransactionCommand()/CommitTransactionCommand().
This is so we can access the database. This also switches to a new memory context
and releases it, so we don't have to do our own memory management.

(2) LockCitusExtension() so the extension cannot be dropped or created concurrently.

(3) Check CitusHasBeenLoaded() && CheckCitusVersion() before doing any work.

(4) Do not PG_TRY() inside a loop.
2017-10-31 21:51:43 -04:00