Commit Graph

1695 Commits (04aeb6938b452da03a03b7c9a47bee39bc5340b4)

Author SHA1 Message Date
Sait Talha Nisanci 5693cabc41 Not convert an already routable plannable query
We should not recursively plan an already router plannable query. An
example of this is (SELECT * FROM local JOIN (SELECT * FROM dist) d1
USING(a));

So we let the recursive planner do all of its work and at the end we
convert the final query to handle unsupported joins. While doing each
conversion, we check if it is router plannable, if so we stop.

Only consider range table entries that are in jointree

If a range table is not in jointree then there is no point in
considering that because we are trying to convert range table entries to
subqueries for join use case.
2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 2ff65f3630 Enable partitioned distributed tables in local-dist table joins 2020-12-15 18:17:10 +03:00
Sait Talha Nisanci 44953579cf Enable citus-local distributed table joins
Check equality in quals

We want to recursively plan distributed tables only if they have an
equality filter on a unique column. So '>' and '<' operators will not
trigger recursive planning of distributed tables in local-distributed
table joins.

Recursively plan distributed table only if the filter is constant

If the filter is not a constant then the join might return multiple rows
and there is a chance that the distributed table will return huge data.
Hence if the filter is not constant we choose to recursively plan the
local table.
2020-12-15 18:17:10 +03:00
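The policy described in commit 44953579cf can be read as a small decision function. The following is a minimal Python sketch of that reading, not Citus code: the `Filter` class, its fields, and `choose_table_to_recursively_plan` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Filter:
    """A simplified qual on a distributed table (hypothetical model)."""
    column: str
    operator: str       # e.g. "=", "<", ">"
    is_constant: bool   # True when the compared value is a constant


def choose_table_to_recursively_plan(filters: List[Filter],
                                     unique_columns: Set[str]) -> str:
    """Return "distributed" or "local", following the rules in the commit message."""
    for f in filters:
        if f.column in unique_columns and f.operator == "=" and f.is_constant:
            # Equality with a constant on a unique column matches at most
            # one row, so recursively planning the distributed table is cheap.
            return "distributed"
    # Range operators or non-constant filters may return many rows,
    # so the local table is recursively planned instead.
    return "local"
```

For instance, a filter such as `id = 5` on a unique column `id` would select the distributed table, while `id > 5` or `id = other.col` would select the local table.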
Sait Talha Nisanci f3d55448b3 Choose distributed table if it has a unique index in filter
When doing local-distributed table joins we convert one of them to a
subquery. The current policy is that we convert the distributed table to
a subquery if it has a filter on a column that has a unique index
(a primary key also has a unique index).
2020-12-15 18:17:10 +03:00
Onder Kalaci f0aef67ed2 Update existing regression tests 2020-12-15 18:17:10 +03:00
Onder Kalaci 3f4952cc2b Pushdown projections when relations are recursively planned
This is important to limit the data transfer size.
2020-12-15 18:17:10 +03:00
Onder Kalaci 945193555b add basic regression tests 2020-12-15 18:17:10 +03:00
Onder Kalaci 594e001f3b Add filter pushdown regression tests
Also handle WHERE false
2020-12-15 18:17:10 +03:00
Onder Kalaci 82a4830c7d Adjust the existing regression tests 2020-12-15 18:17:10 +03:00
Marco Slot f2538a456f Support co-located/recurring sublinks in the target list 2020-12-13 15:45:24 +01:00
Marco Slot 8e8adcd92a Harden citus_tables against node failure 2020-12-13 15:10:40 +01:00
Hadi Moshayedi 4dd22cc4e4 Columnar: Fix ANALYZE for large number of rows. 2020-12-10 09:52:33 -08:00
Hadi Moshayedi b3dac5e9d1 Columnar: set default compression as zstd if available 2020-12-09 14:32:08 -08:00
Hadi Moshayedi 4668fe51a6 Columnar: Make compression level configurable 2020-12-09 08:48:50 -08:00
Hadi Moshayedi f5a4a4bc74 Columnar: Support zstd compression 2020-12-09 08:30:55 -08:00
Hadi Moshayedi 3f81ee26fd Columnar: Support LZ4 compression 2020-12-09 08:29:07 -08:00
jeff-davis 260a02180b
Add tests for unsupported columnar storage features (#4397)
Add negative tests:
 * Deletes
 * Sample scan
 * Special columns
 * Tuple locks
 * Indexes
2020-12-09 00:08:45 -08:00
Jeff Davis c91e5b052b more test fixups 2020-12-07 13:43:27 -08:00
Jeff Davis 7169ba21c4 more test fixes 2020-12-07 13:36:46 -08:00
Jeff Davis e26fdeb706 fixup tests some more 2020-12-07 13:22:16 -08:00
Jeff Davis 5b3c32eb38 fixup tests 2020-12-07 13:18:22 -08:00
Jeff Davis 068af7f38e fixup upgrade tests 2020-12-07 13:11:51 -08:00
Jeff Davis 3758e83850 Rename cstore->columnar in SQL objects and errors. 2020-12-07 13:01:53 -08:00
Jeff Davis ad919ff220 Tests for UPDATE and error message improvement.
UPDATEs on partitioned tables that affect only row partitions should
succeed, the rest should fail.

Also rename CStoreScan to ColumnarScan to make the error message more
relevant.
2020-12-07 11:25:30 -08:00
Ahmet Gedemenli 936775e8e3 Delete transactions when removing node
With this commit, we delete entries in pg_dist_transaction
for the primary nodes that are removed by `master_remove_node`.
2020-12-07 11:35:20 +03:00
Hadi Moshayedi 01da2a1c73 Columnar: track decompressed length in metadata 2020-12-04 09:09:39 -08:00
Onder Kalaci bd9827aed9 Add regression tests with different data types
We typically do not test Citus with these uncommon
data types. Now, we already have the tests for ADF
integration, add it to regression tests as well.
2020-12-04 10:25:00 +03:00
Hadi Moshayedi 4a9aebaa7b Columnar: rename block to chunk 2020-12-03 08:50:19 -08:00
Hadi Moshayedi 24bfd368a9 Columnar: Fix VACUUM for empty tables 2020-12-03 08:46:09 -08:00
Marco Slot c9b658daea Add a public.citus_tables view 2020-12-03 17:31:40 +01:00
Marco Slot 4098d33acb Allow citus size functions on replicated tables 2020-12-03 16:33:24 +01:00
Marco Slot c69ea2512a Fix flappy failure test 2020-12-03 13:54:02 +01:00
Onder Kalaci c546ec5e78 Local node connection management
When Citus needs to parallelize queries on the local node (e.g., the node
executing the distributed query and the shards are the same), we need to
be mindful about the connection management. The reason is that the client
backends that are running distributed queries are competing with the client
backends that Citus initiates to parallelize the queries in order to get
a slot on the max_connections.

In that regard, we implemented a "failover" mechanism where, if the distributed
queries cannot get a connection, the execution fails over the tasks to local
execution.

The failover logic is as follows:

- Ask the connection manager if it is OK to get a connection
	- If yes, we are good.
	- If no, we fail the workerPool and the failure triggers
	  the failover of the tasks to the local execution queue

The decision of getting a connection is as follows:

/*
 * For local nodes, solely relying on citus.max_shared_pool_size or
 * max_connections might not be sufficient. The former gives us
 * a preview of the future (e.g., we let the new connections to establish,
 * but they are not established yet). The latter gives us the close to
 * precise view of the past (e.g., the active number of client backends).
 *
 * Overall, we want to limit both of the metrics. The former limit typically
 * kicks in under regular loads, where the load of the database increases in
 * a reasonable pace. The latter limit typically kicks in when the database
 * is issued lots of concurrent sessions at the same time, such as benchmarks.
 */
2020-12-03 14:16:13 +03:00
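The two-limit decision and the failover described in commit c546ec5e78 can be sketched roughly as below. This is a hypothetical Python model, not the Citus implementation: `LocalNodeConnectionState`, its fields, and both functions are assumed names, and the real limits are derived from `citus.max_shared_pool_size` and `max_connections` in ways the commit message does not spell out.

```python
from dataclasses import dataclass


@dataclass
class LocalNodeConnectionState:
    """Hypothetical counters; names are illustrative, not Citus identifiers."""
    reserved_connections: int     # connections allowed to establish (forward-looking)
    shared_pool_limit: int        # stands in for citus.max_shared_pool_size
    active_client_backends: int   # currently active backends (backward-looking)
    backend_limit: int            # stands in for a limit derived from max_connections


def may_open_local_connection(state: LocalNodeConnectionState) -> bool:
    """Both limits must hold before a new connection to the local node is allowed."""
    within_pool = state.reserved_connections < state.shared_pool_limit
    within_backends = state.active_client_backends < state.backend_limit
    return within_pool and within_backends


def choose_execution(state: LocalNodeConnectionState) -> str:
    """Return "remote" when a connection is allowed, else fail over to "local"."""
    if may_open_local_connection(state):
        return "remote"
    # Failing the worker pool moves its tasks to the local execution queue.
    return "local"
```

Checking both counters means a burst of concurrent sessions is caught by the backend count even when the shared pool still appears to have room.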
Hadi Moshayedi c2f60b6422
Columnar: pg_upgrade support (#4354) 2020-12-02 08:46:59 -08:00
Ahmet Gedemenli 5242dcfe99 Add tests for propagating alter schema rename 2020-12-02 15:18:26 +03:00
Nils Dijk 6f9c040f76
DESCRIPTION: Propagate columnar table settings for distributed tables
When distributing a columnar table, as well as changing options on a distributed columnar table, this patch will forward the settings from the coordinator to the workers.

For propagating option changes on an already distributed table this change is pretty straightforward. Before applying the change in options locally we will create a `DDLJob` that contains a call to `alter_columnar_table_set(...)` for every shard placement with all settings of the current table. This goes both for setting an option as well as resetting one; resetting will reset the values to the defaults configured on the coordinator. This has the effect that the coordinator is authoritative on the settings and makes sure the shards have the same settings as the table on the coordinator.

When a columnar table is distributed it uses the `TableDDLCommand` infrastructure to create a new kind of `TableDDLCommand`. This new type, called a `TableDDLCommandFunction`, contains a context and 2 function pointers to execute. One function returns the command as applied on the table; the second function returns the SQL command to apply to a shard with a given shard id. The schema name is ignored as it will use the fully qualified name of the shard in the same schema as the base table.
2020-12-02 13:02:42 +01:00
Halil Ozan Akgül ef0914a7f8
Adds ORDER BY to flaky test (#4305)
Co-authored-by: Önder Kalacı <onder@citusdata.com>
2020-12-02 14:24:05 +03:00
Onder Kalaci f7e1aa3f22 Multi-row INSERTs use local execution when placements are local
Multi-row execution already uses sequential execution. When shards
are local, using local execution is profitable as it avoids
an extra connection establishment to the local node.
2020-12-01 21:37:59 +03:00
Marco Slot 04cffdd925 Run master_copy_shard_placement separately 2020-11-30 20:34:03 +01:00
Marco Slot 48caca4084 Improve regression test settings 2020-11-30 20:34:03 +01:00
Ahmet Gedemenli 8e5f0487eb Add order by for flaky test 2020-12-01 10:54:52 +03:00
Ahmet Gedemenli 67761897ab Add test for citus table size func in transaction with modification
Add test for citus_relation_size
2020-12-01 10:38:15 +03:00
Hadi Moshayedi feecb7b423
Columnar: few fixes (#4371)
* Columnar: fix a memory issue

* Columnar: no need for deferred triggers

* Columnar: relax memory growth constraints
2020-11-30 18:09:43 -08:00
Hadi Moshayedi a94e8c9cda
Associate column store metadata with storage id (#4347) 2020-11-30 18:01:43 -08:00
Marco Slot ecbc1ab008 Run subquery_prepared_statements by itself 2020-11-30 08:53:06 +01:00
Sait Talha Nisanci 8b0aed521f Isolate join test
Join test gets too many clients error too frequently hence we should
not run anything concurrently with that. Hopefully this will fix the
flakiness of test.
2020-12-01 00:00:17 +03:00
SaitTalhaNisanci c31a8df380
Call 6 times not 7 in subquery_prepared_statements (#4357) 2020-11-30 21:20:51 +03:00
SaitTalhaNisanci 8c3dd6338e
Run pg12 and pg13 separately (#4352)
It seems that sometimes we get `too many clients errors` with this set
of parallel tests, hence two of them are separated.
2020-11-30 19:32:49 +03:00
Hadi Moshayedi 7f43804dae
Normalize VACUUM VERBOSE output (#4353)
This is to avoid flaky changes like the following in test outputs:

-CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
+CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.02 s.
2020-11-27 12:07:25 -08:00
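The kind of normalization commit 7f43804dae describes can be illustrated with a regular expression that replaces the run-dependent timings with a fixed placeholder. This is a Python stand-in written for illustration; the actual normalization rule used by the test suite is not shown in the commit message, and `normalize_vacuum_output` is a hypothetical name.

```python
import re

# Matches the timing line from the diff above, with any numeric values.
CPU_LINE = re.compile(
    r"CPU: user: \d+\.\d+ s, system: \d+\.\d+ s, elapsed: \d+\.\d+ s\."
)


def normalize_vacuum_output(text: str) -> str:
    """Replace run-dependent timings with a fixed placeholder."""
    return CPU_LINE.sub(
        "CPU: user: X.XX s, system: X.XX s, elapsed: X.XX s.", text
    )
```

After normalization, the two lines from the diff above compare equal, so the expected test output no longer depends on how long the command took.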
Nils Dijk 383e334023
refactor options to their own table linked to the regclass (#4346)
Columnar options were by accident linked to the relfilenode instead of the regclass/relation oid. This PR moves everything related to columnar options to their own catalog table.
2020-11-27 11:22:08 -08:00
Onder Kalaci 629ecc3dee Add the infrastructure to count the number of client backends
Considering the adaptive connection management
improvements that we plan to roll soon, it makes it
very helpful to know the number of active client
backends.

We are doing this addition to simplify the adaptive connection
management for single node Citus. In single node Citus, both the
client backends and Citus parallel queries would compete to get
slots on Postgres' `max_connections` on the same Citus database.

With adaptive connection management, we have the counters for
Citus parallel queries. That helps us to adaptively decide
on the remote executions pool size (e.g., throttle connections
if necessary).

However, we do not have any counters for the total number of
client backends on the database. For single node Citus, we
should consider all the client backends, not only the remote
connections that Citus does.

Of course Postgres internally knows how many client
backends are active. However, to get that number Postgres
iterates over all the backends. For example, see [pg_stat_get_db_numbackends](8e90ec5580/src/backend/utils/adt/pgstatfuncs.c (L1240))
where Postgres iterates over all the backends.

For our purposes, we need this information on every connection
establishment. That's why we cannot afford to do this kind of
iteration.
2020-11-25 19:19:24 +01:00
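The trade-off in commit 629ecc3dee is between scanning every backend on each lookup and keeping a running count that can be read in constant time. The sketch below models the second option in Python; `ClientBackendCounter` is a hypothetical class, and the commit message does not show how Citus actually stores or synchronizes the counter.

```python
import threading


class ClientBackendCounter:
    """Hypothetical shared counter of active client backends."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._count = 0

    def increment(self) -> None:
        """Call when a client backend starts."""
        with self._lock:
            self._count += 1

    def decrement(self) -> None:
        """Call when a client backend exits."""
        with self._lock:
            self._count -= 1

    def value(self) -> int:
        """Constant-time read, cheap enough for every connection establishment."""
        with self._lock:
            return self._count
```

The cost moves from the read path (a scan over all backends) to the write path (one update at backend start and one at exit), which suits a value that is consulted far more often than it changes.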
Ahmet Gedemenli a64dc8a72b Fixes a bug preventing INSERT SELECT .. ON CONFLICT with a constraint name on local shards
Separate search relation shard function

Add tests
2020-11-25 15:10:46 +03:00
Onder Kalaci 7accbff3f6 Do not cache all the distributed table metadata during CitusTableTypeIdList()
CitusTableTypeIdList() function iterates on all the entries of pg_dist_partition
and loads all the metadata in to the cache. This can be quite memory intensive
especially when there are lots of distributed tables.

When partitioned tables are used, it is common to have many distributed tables
given that each partition also becomes a distributed table.

CitusTableTypeIdList() is used on every CREATE TABLE .. PARTITION OF.. command
as well. It means that, anytime a partition is created, Citus loads all the
metadata to the cache. Note that Citus typically only loads the accessed table's
metadata to the cache.
2020-11-24 17:44:06 +01:00
Önder Kalacı c760cd3470
Move local execution after remote execution (#4301)
* Move local execution after the remote execution

Before this commit, when both local and remote tasks
exist, the executor was starting the execution with
local execution. There are no strict requirements on
this.

Especially considering the adaptive connection management
improvements that we plan to roll soon, moving the local
execution after the remote execution makes more sense.

The adaptive connection management for single node Citus
would look roughly as follows:

   - Try to connect back to the coordinator for running
     parallel queries.
        - If succeeds, go on and execute tasks in parallel
        - If fails, fallback to the local execution

So, we'll use local execution as a fallback mechanism. And,
moving it after the remote execution allows us to implement
such further scenarios.
2020-11-24 13:43:38 +01:00
Hadi Moshayedi 40b52ab757 Fix memory leaks in column store 2020-11-23 11:26:12 -08:00
Jeff Davis ba6ec610e2 address review comment 2020-11-20 10:03:12 -08:00
Jeff Davis 8cee2b092b remove columnar FDW code 2020-11-20 10:03:12 -08:00
Onder Kalaci c433c66f2b Do not execute subplans multiple times with cursors
Before this commit, we let AdaptiveExecutorPreExecutorRun()
to be effective multiple times on every FETCH on cursors.
That does not affect the correctness of the query results,
but adds significant overhead.
2020-11-20 10:43:56 +01:00
Hadi Moshayedi b182a95389 Fix ALTER COLUMN ... SET TYPE for columnar 2020-11-19 15:36:45 -08:00
Jeff Davis cef1d0e915 fixup test output 2020-11-19 12:45:52 -08:00
Jeff Davis 91015deb9d rename UDFs also 2020-11-19 12:27:40 -08:00
Jeff Davis a2b698a766 rename cstore_tableam -> columnar 2020-11-19 12:15:51 -08:00
SaitTalhaNisanci 9c44911226
Improve error messages in shard pruning (#4324) 2020-11-18 17:16:06 +03:00
Hadi Moshayedi 2747fd80ff Add prepared materialized view tests for columnar 2020-11-17 20:13:20 -08:00
Hadi Moshayedi 6711340ea6 Add prepared xact & stmt tests for columnar 2020-11-17 20:00:57 -08:00
Hadi Moshayedi 97cba2d5b6 Implements write state management for tuple inserts.
TableAM API doesn't allow us to pass around a state variable along all of the tuple inserts belonging to the same command. We require this in columnar store, since we batch them, and when we have enough rows we flush them as stripes.

To do that, we keep a (relfilenode) -> stack of (subxact id, TableWriteState) global mapping.

**Inserts**

Whenever we want to insert a tuple, we look up the relation's relfilenode in this mapping. If the top of the stack matches the current subtransaction, we use the existing TableWriteState. Otherwise, we allocate a new TableWriteState and push it on top of the stack.

**(Sub)Transaction Commit/Aborts**

When the subtransaction or transaction is committed, we flush and pop all entries matching current SubTransactionId.

When the subtransaction or transaction is aborted, we pop all entries matching current SubTransactionId and discard them without flushing.

**Reads**

Since we might have unwritten rows which need to be read by a table scan, we flush write states on SELECTs. Since flushing the write state of upper transactions in a subtransaction would cause metadata to be written in the wrong subtransaction, we ERROR out if any of the upper subtransactions have unflushed rows.

**Table Drops**

We record in which subtransaction the table was dropped. When committing a subtransaction in which table was dropped, we propagate the drop to upper transaction. When aborting a subtransaction in which table was dropped, we mark table as not deleted.
2020-11-17 12:07:16 -08:00
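The mapping from relfilenode to a stack of (subtransaction id, write state) in commit 97cba2d5b6 can be modeled as below. This is a simplified Python sketch under stated assumptions, not the columnar implementation: the `TableWriteState` class here only buffers rows, the `flushed` list stands in for stripes written to storage, and the function names are hypothetical. Batching by row count, reads, and table drops are left out.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class TableWriteState:
    """Buffers rows for one relation within one subtransaction (illustrative)."""

    def __init__(self) -> None:
        self.rows: List[tuple] = []


# relfilenode -> stack of (subtransaction id, write state)
write_states: Dict[int, List[Tuple[int, TableWriteState]]] = defaultdict(list)
flushed: List[tuple] = []  # stands in for rows flushed to storage


def insert_row(relfilenode: int, subxact_id: int, row: tuple) -> None:
    """Reuse the top write state if it belongs to this subtransaction, else push."""
    stack = write_states[relfilenode]
    if not stack or stack[-1][0] != subxact_id:
        stack.append((subxact_id, TableWriteState()))
    stack[-1][1].rows.append(row)


def end_subtransaction(subxact_id: int, commit: bool) -> None:
    """Pop entries for this subtransaction; flush on commit, discard on abort."""
    for stack in write_states.values():
        while stack and stack[-1][0] == subxact_id:
            _, state = stack.pop()
            if commit:
                flushed.extend(state.rows)
```

With this layout, aborting an inner subtransaction drops only its own buffered rows, while rows buffered by the enclosing transaction stay on the stack until that transaction ends.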
Nils Dijk 725f4a37d0
change configure to not have options 2020-11-17 19:01:54 +01:00
Nils Dijk 22df8027b0
add extra output for multi_extension targeting pg11 2020-11-17 19:01:54 +01:00
Nils Dijk 7c891a01a9 create missing objects during upgrade path 2020-11-17 19:01:51 +01:00
Nils Dijk 2987535172
add pg upgrade tests verifying table am is created 2020-11-17 18:55:36 +01:00
Nils Dijk d065bb495d
Prepare downgrade script and bump development version to 10.0-1 2020-11-17 18:55:35 +01:00
Nils Dijk b6d4a1bbe2
fix style 2020-11-17 18:55:35 +01:00
Nils Dijk 3bb6554976
make tests run 2020-11-17 18:55:35 +01:00
Nils Dijk f89bd3eeb5
move columnar test files 2020-11-17 18:55:34 +01:00
SaitTalhaNisanci 34de1f645c
Update failure test dependencies (#4284)
* Update failure test dependencies

There was a security alert for cryptography. The vulnerability was fixed
in 3.2.0. The vulnerability:

"RSA decryption was vulnerable to Bleichenbacher timing vulnerabilities,
which would impact people using RSA decryption in online scenarios."

The fix:
58494b41d6

It wasn't enough to only update cryptography because mitm was
incompatible with the new version, so mitm is also upgraded.

The steps to do in local:
python -m pip install -U cryptography
python -m pip install -U mitmproxy
2020-11-17 19:16:08 +03:00
Onur Tirtir 5e3dc9d707 Bump citus version to 10.0devel 2020-11-09 13:16:54 +03:00
Onur Tirtir 5d5966f700
Fix a flaky test in mixed_relkind_tests (#4300) 2020-11-06 14:53:30 +03:00
Onder Kalaci e0d2ac7620 Do not rely on set_rel_pathlist_hook for finding local relations
When a relation is used on an OUTER JOIN with FALSE filters,
set_rel_pathlist_hook may not be called for the table.

There might be other cases as well, so do not rely on the hook
for classification of the tables.
2020-11-06 11:14:30 +01:00
Onur Tirtir 0556952607
Normalize partitioned table aliases in explain output (#4295)
Aliases that postgres choose for partitioned tables in explain output
might change in different pg versions, so normalize them and remove
the alternative test output
2020-11-06 10:44:01 +03:00
Onur Tirtir d912d4bc38
Print full file path in valgrind testing (#4299) 2020-11-06 10:26:53 +03:00
Onur Tirtir cc8be422ce
Fix relkind checks in planner for relkinds other than RELKIND_RELATION (#4294)
We were qualifying relations with relkind != RELKIND_RELATION as
non-relations due to the strict checks around RangeTblEntry->relkind
in planner.
2020-11-05 14:21:02 +03:00
Hanefi Önaldı d6f19e2298
Honor error message conventions 2020-11-03 18:11:18 +03:00
Hanefi Önaldı 85a4b61a0e
Prevent undistribute_table calls for partitions 2020-11-03 18:10:20 +03:00
Hanefi Önaldı 5db380f33a
Prevent undistribute_table calls for foreign tables 2020-11-03 17:33:29 +03:00
Halil Ozan Akgul 77b3be8b6d Turn RelOptInfos to only used field of them, relids, to be able to copy 2020-10-22 13:42:28 +03:00
Onur Tirtir 790beea59f
Add intermediate result tests with unsupported outer joins (#4262) 2020-10-20 12:11:18 +03:00
SaitTalhaNisanci 0f209377c4
Fix incorrect join related fields (#4242)
* Fix incorrect join related fields

Ruleutils expect to give the original index of join columns hence we
should consider the dropped columns while setting the fields in
SetJoinRelatedFieldsCompat.

* add some more tests for joins

* Move tests to join.sql and create a utility function
2020-10-19 18:28:39 +03:00
Onur Tirtir c49077d594
Disallow outer joins `ON TRUE` with ref & dist tables when ref table is outer relation (#4255)
Disallow `ON TRUE` outer joins with reference & distributed tables
when reference table is outer relation by fixing the logic bug made
when calling `LeftListIsSubset` function.

Also, be more defensive when removing duplicate join restrictions
when join clause is empty for non-inner joins as they might still
contain useful information for non-inner joins.
2020-10-19 16:58:11 +03:00
Onder Kalaci bbedfca761 Improve the relation restriction counters
It seems like Postgres could call set_rel_pathlist() for
the same relation multiple times. This breaks the logic
where we assume relationCount equals the number of
entries in relationRestrictionList.

In summary, relationRestrictionList may contain duplicate
entries.
2020-10-19 08:51:16 +02:00
Hadi Moshayedi 663549db33 Set explicit transfer_mode in tableam tests 2020-10-16 12:40:37 -07:00
Nils Dijk caabbf4b84 Table access method support for distributed tables 2020-10-16 12:02:25 -07:00
Marco Slot 8976f245ab Support reference table view in reference table modification 2020-10-16 11:31:24 +02:00
Onder Kalaci 596f7bf4a9 Add more regression test for single node Citus
Tests on commands with SCHEMA.
2020-10-15 17:32:32 +02:00
Onder Kalaci fe3caf3bc8 Local execution considers intermediate result size limit
With this commit, we make sure that local execution adds the
intermediate result size as the distributed execution adds. Plus,
it enforces the citus.max_intermediate_result_size value.
2020-10-15 17:18:55 +02:00
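The accounting described in commit fe3caf3bc8 amounts to adding every intermediate result to one running total, regardless of whether it was produced by local or distributed execution, and failing once the total passes the configured limit. A minimal Python sketch follows, assuming hypothetical names (`IntermediateResultTracker`, `IntermediateResultLimitExceeded`) and a plain byte count in place of the `citus.max_intermediate_result_size` setting.

```python
class IntermediateResultLimitExceeded(Exception):
    """Raised when accumulated intermediate results exceed the limit (hypothetical)."""


class IntermediateResultTracker:
    def __init__(self, max_bytes: int) -> None:
        # max_bytes stands in for citus.max_intermediate_result_size.
        self.max_bytes = max_bytes
        self.total_bytes = 0

    def add(self, size_bytes: int) -> None:
        """Account for one intermediate result, local or remote."""
        self.total_bytes += size_bytes
        if self.total_bytes > self.max_bytes:
            raise IntermediateResultLimitExceeded(
                f"intermediate results use {self.total_bytes} bytes, "
                f"limit is {self.max_bytes}"
            )
```

Sharing one tracker between both execution paths is what keeps the limit meaningful when a query mixes local and remote tasks.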
Marco Slot 31858c8a29 Check table existence in EnsureRelationKindSupported 2020-10-15 17:05:06 +02:00
Onder Kalaci 15e724c073 Add regression tests for outer/cross JOINs 2020-10-14 15:17:30 +02:00
Onder Kalaci de33079065 Improve outer join checks
Before this commit, the logic was:
    - As long as the outer side of the JOIN is not a JOIN (e.g., relation
      or subquery etc.), we check for the existence of any recurring
      tuples. There were two implications of this decision.

      First, even if a subquery which is on the outer side contains
      distributed table JOIN reference table, Citus would unnecessarily throw
      an error. Note that, the JOIN inside the subquery would already
      be going to be tested recursively. But, as long as that check
      passes, there is no reason for the upper JOIN to fail. An example, which
      used to fail and now works:

	SELECT * FROM (SELECT * FROM dist JOIN ref) as foo LEFT JOIN dist;

      Second, certain JOINs, especially with ON (true) conditions were not
      represented as Citus expects the JOINs to be in the format
      DeferredErrorIfUnsupportedRecurringTuplesJoin().
2020-10-14 15:17:30 +02:00
Onur Tirtir 1a28858c47
Disallow field indirection in INSERT/UPDATE queries (#4241) 2020-10-14 14:11:59 +03:00
Onur Tirtir 8efca3b60a
Fix a crash with inserting domain composite types in coord. evaluation (#4231)
Use short lived per-tuple context in citus_evaluate_expr like
(pg) evaluate_expr does.

We should not use planState->ExprContext when evaluating expressions
as it might lead to freeing the same executor twice (first one happens
in citus_evaluate_expr itself and the other one happens when postgres
doing clean-up for the top level executor state), which in turn might
cause seg.faults.

However, now as we don't have necessary planState info to evaluate
prepared statements, we also add planState->es_param_list_info to
per-tuple ExprContext.
2020-10-13 14:19:59 +03:00
Halil Ozan Akgul e2736c25bd Adds support for WITH TIES option 2020-10-12 19:34:18 +03:00