Commit Graph

174 Commits (7e8fd49b94f4719c238194fb14797dc4d9ae5e18)

Author SHA1 Message Date
Hanefi Onaldi 7e8fd49b94 Create Schemas as superuser on all shard/table creation UDFs
- All the schema creations on the workers will now be  via superuser connections
- If a shard is being repaired or a shard is replicated, we will create the
  schema only in the relevant worker; and in all the other cases where a schema
  creation is needed, we will block operations until we ensure the schema exists
  in all the workers
2019-06-26 17:12:28 +02:00
Jason Petersen d4e1172247 Implement propagation of SET LOCAL commands
Adds support for propagation of SET LOCAL commands to all workers
involved in a query. For now, SET SESSION (i.e. plain SET) is not
supported whatsoever, though this code is intended as somewhat of a
base for implementing such support in the future.

As SET LOCAL modifications are scoped to the body of a BEGIN/END xact
block, queries wishing to use SET LOCAL propagation must be within such
a block. In addition, subsequent modifications after e.g. any SAVEPOINT
or ROLLBACK statements will correspondingly push or pop variable mod-
ifications onto an internal stack such that the behavior of changed
values across the cluster will be identical to such behavior on e.g.
single-node PostgreSQL (or equivalently, what values are visible to
the end user by running SHOW on such variables on the coordinator).

If nodes enter the set of participants at some point after SET LOCAL
modifications (or SAVEPOINT, ROLLBACK, etc.) have occurred, the SET
variable state is eagerly propagated to them upon their entrance (this
is identical to, and indeed just augments, the existing logic for the
propagation of the SAVEPOINT "stack").

A new GUC (citus.propagate_set_commands) has been added to control this
behavior. Though the code suggests the valid settings are 'none', 'local',
'session', and 'all', only 'none' (the default) and 'local' are presently
implemented: attempting to use other values will result in an error.
2019-06-20 16:15:43 -07:00
Hadi Moshayedi 4bbae02778 Make COPY compatible with unified executor. 2019-06-20 19:53:40 +02:00
Philip Dubé 4bfcf5b665 Enable Werror for all warnings
Changes to ruleutils match changes made upstream to silence gcc fallthrough warnings
2019-06-18 14:43:54 -07:00
Hadi Moshayedi dee5bc31b4 Refactor ShardIdForTuple() to a separate function. 2019-06-02 09:48:15 -07:00
Philip Dubé b8871d9ff4 Propagate more ALTER FOREIGN TABLE to workers 2019-05-24 12:54:05 -07:00
Philip Dubé 16886b3c63 Fix misc typos 2019-05-23 17:23:27 -07:00
Hadi Moshayedi 8ae47e1244 Fix comments for RemoteFileDestReceiverStartup and CitusCopyDestReceiverStartup 2019-05-21 09:03:22 -07:00
Hadi Moshayedi b5c0ca45f1 Remove stopOnFailure flag from EndRemoteCopy() 2019-05-11 06:18:34 -07:00
Hadi Moshayedi 32ecb6884c Test ROLLBACK TO SAVEPOINT with multi-shard CTE failures 2019-05-01 09:33:43 -07:00
Hadi Moshayedi aafd22dffa Fix savepoint rollback for INSERT INTO ... SELECT. 2019-05-01 09:33:43 -07:00
Jason Petersen 71d5d1c865 Enable variable shadowing warnings; fix all
Rather than wait for another place like the previous commit to bite us,
I think we should turn on this warning.
2019-04-30 13:24:25 -06:00
Jason Petersen 1125fc9da0 Fix self-strncmp in ConstrIsFKToReferenceTable
Make the function do what I assume was intended.
2019-04-30 13:24:25 -06:00
Marco Slot 0ea4e52df5 Add nodeId to shardPlacements and use it for shard placement comparisons
Before this commit, shardPlacements were identified with shardId, nodeName
and nodeport. Instead of using nodeName and nodePort, we now use nodeId
since it apparently has performance benefits in several places in the
code.
2019-03-20 12:14:46 +03:00
Marco Slot f2abf2b8e5 Functions are treated as transaction blocks 2019-03-15 16:34:08 -06:00
Hadi Moshayedi a9e6d06a98 Skip execution of ALTER TABLE constraint checks on the coordinator 2019-03-14 15:40:56 -07:00
Hadi Moshayedi cdd3b15ac8 Fix distributed deadlock for ALTER TABLE ... ATTACH PARTITION.
Following scenario resulted in distributed deadlock before this commit:

CREATE TABLE partitioning_test(id int, time date) PARTITION BY RANGE (time);
CREATE TABLE partitioning_test_2009 (LIKE partitioning_test);
CREATE TABLE partitioning_test_reference(id int PRIMARY KEY, subid int);

SELECT create_distributed_table('partitioning_test_2009', 'id'),
       create_distributed_table('partitioning_test', 'id'),
       create_reference_table('partitioning_test_reference');

ALTER TABLE partitioning_test ADD CONSTRAINT partitioning_reference_fkey FOREIGN KEY (id) REFERENCES partitioning_test_reference(id) ON DELETE CASCADE;
ALTER TABLE partitioning_test_2009 ADD CONSTRAINT partitioning_reference_fkey_2009 FOREIGN KEY (id) REFERENCES partitioning_test_reference(id) ON DELETE CASCADE;

ALTER TABLE partitioning_test ATTACH PARTITION partitioning_test_2009 FOR VALUES FROM ('2009-01-01') TO ('2010-01-01');
2019-03-14 15:28:37 -07:00
Hadi Moshayedi f19feb742c
Remove never assigned colocatedRelation from CreateDistributedTable (#2479) 2019-03-12 14:50:18 -07:00
Murat Tuncer cd5213abee Set sequential mode execution GUC for alter partitioned table
PG recently started propagating foreign key constraints
to partition tables. This came with a select query
to validate the the constaint.

We are already setting sequential mode execution for this
command. In order for validation select query to respect
this setting we need to explicitly set the GUC.

This commit also handles detach partition part.
2019-01-25 15:28:07 +03:00
Jason Petersen 339e6e661e
Remove 9.6 (#2554)
Removes support and code for PostgreSQL 9.6

cr: @velioglu
2019-01-16 13:11:24 -07:00
Marco Slot 1b1c6374f7
Execute CREATE INDEX CONCURRENTLY concurrently 2018-12-21 14:02:59 -07:00
Marco Slot 5b9376a7f8 Check ownership before taking locks in distributed table creation 2018-12-18 15:32:07 +01:00
Marco Slot 9cf91c438b Only allow transmit from pgsql_job_cache directory 2018-12-05 10:18:27 +01:00
Marco Slot 8893cc141d Support INSERT...SELECT with ON CONFLICT or RETURNING via coordinator
Before this commit, Citus supported INSERT...SELECT queries with
ON CONFLICT or RETURNING clauses only for pushdownable ones, since
queries supported via coordinator were utilizing COPY infrastructure
of PG to send selected tuples to the target worker nodes.

After this PR, INSERT...SELECT queries with ON CONFLICT or RETURNING
clauses will be performed in two phases via coordinator. In the first
phase selected tuples will be saved to the intermediate table which
is colocated with target table of the INSERT...SELECT query. Note that,
a utility function to save results to the colocated intermediate result
also implemented as a part of this commit. In the second phase, INSERT..
SELECT query is directly run on the worker node using the intermediate
table as the source table.
2018-11-30 15:29:12 +03:00
Hanefi Onaldi 7db6991dc0 propagate validate queries to workers 2018-11-26 14:04:51 +03:00
Marco Slot 6aa5592e52 Add user ID suffix to intermediate files in re-partition jobs 2018-11-23 08:36:11 +01:00
Marco Slot caf402d506 COPY to a task file no longer switches to superuser 2018-11-22 18:15:33 +01:00
Onder Kalaci 7f0a57a153 Make sure to prevent unauthorized users to drop tables in Citus MX 2018-11-15 18:07:03 +03:00
Marco Slot f383e4f307
Description: Refactor code that handles DDL commands from one file into a module
The file handling the utility functions (DDL) for citus organically grew over time and became unreasonably large. This refactor takes that file and refactored the functionality into separate files per command. Initially modeled after the directory and file layout that can be found in postgres.

Although the size of the change is quite big there are barely any code changes. Only one two functions have been added for readability purposes:

- PostProcessIndexStmt which is extracted from PostProcessUtility
- PostProcessAlterTableStmt which is extracted from multi_ProcessUtility

A README.md has been added to `src/backend/distributed/commands` describing the contents of the module and every file in the module.
We need more documentation around the overloading of the COPY command, for now the boilerplate has been added for people with better knowledge to fill out.
2018-11-14 13:36:27 +01:00
Onder Kalaci c1b5a04f6e Allow partitioned tables with replication factor > 1
With this commit, we all partitioned distributed tables with
replication factor > 1. However, we also have many restrictions.

In summary, we disallow all kinds of modifications (including DDLs)
on the partition tables. Instead, the user is allowed to run the
modifications over the parent table.

The necessity for such a restriction have two aspects:
   - We need to acquire shard resource locks appropriately
   - We need to handle marking partitions INVALID in case
     of any failures. Note that, in theory, the parent table
     should also become INVALID, which is too aggressive.
2018-09-21 14:40:41 +03:00
Onder Kalaci 5cf8fbe7b6 Add infrastructure to relation if exists 2018-09-07 14:49:36 +03:00
Onder Kalaci 1b3257816e Make sure that table is dropped before shards are dropped
This commit fixes a bug where a concurrent DROP TABLE deadlocks
with SELECT (or DML) when the SELECT is executed from the workers.

The problem was that Citus used to remove the metadata before
droping the table on the workers. That creates a time window
where the SELECT starts running on some of the nodes and DROP
table on some of the other nodes.
2018-09-04 08:57:20 +03:00
velioglu bd30e3e908 Add support for writing to reference tables from MX nodes 2018-08-27 18:15:04 +03:00
mehmet furkan şahin ef9f38b68d ApplyLogRedaction noop func is added 2018-08-17 14:48:54 -07:00
Onder Kalaci 7fb529aab9 Some stylistic improvements in the foreign keys to reference table
changes.
2018-07-05 23:23:34 +03:00
Nils Dijk c1c8c38dc9 create placeholder for policy ddl 2018-07-05 11:07:01 +02:00
Onder Kalaci d83be3a33f Enforce foreign key restrictions inside transaction blocks
When a hash distributed table have a foreign key to a reference
table, there are few restrictions we have to apply in order to
prevent distributed deadlocks or reading wrong results.

The necessity to apply the restrictions arise from cascading
nature of foreign keys. When a foreign key on a reference table
cascades to a distributed table, a single operation over a single
connection can acquire locks on multiple shards of the distributed
table. Thus, any parallel operation on that distributed table, in the
same transaction should not open parallel connections to the shards.
Otherwise, we'd either end-up with a self-distributed deadlock or
read wrong results.

As briefly described above, the restrictions that we apply is done
by tracking the distributed/reference relation accesses inside
transaction blocks, and act accordingly when necessary.

The two main rules are as follows:
   - Whenever a parallel distributed relation access conflicts
     with a consecutive reference relation access, Citus errors
     out
   - Whenever a reference relation access is followed by a
     conflicting parallel relation access, the execution mode
     is switched to sequential mode.

There are also some other notes to mention:
   - If the user does SET LOCAL citus.multi_shard_modify_mode
     TO 'sequential';, all the queries should simply work with
     using one connection per worker and sequentially executing
     the commands. That's obviously a slower approach than Citus'
     usual parallel execution. However, we've at least have a way
     to run all commands successfully.

   - If an unrelated parallel query executed on any distributed
     table, we cannot switch to sequential mode. Because, the essense
     of sequential mode is using one connection per worker. However,
     in the presence of a parallel connection, the connection manager
     picks those connections to execute the commands. That contradicts
     with our purpose, thus we error out.

   - COPY to a distributed table cannot be executed in sequential mode.
     Thus, if we switch to sequential mode and COPY is executed, the
     operation fails and there is currently no way of implementing that.
     Note that, when the local table is not empty and create_distributed_table
     is used, citus uses COPY internally. Thus, in those cases,
     create_distributed_table() will also fail.

   - There is a GUC called citus.enforce_foreign_key_restrictions
     to disable all the checks. We added that GUC since the restrictions
     we apply is sometimes a bit more restrictive than its necessary.
     The user might want to relax those. Similarly, if you don't have
     CASCADEing reference tables, you might consider disabling all the
     checks.
2018-07-03 17:05:55 +03:00
velioglu 6be6911ed9 Create foreign key relation graph and functions to query on it 2018-07-03 17:05:55 +03:00
mehmet furkan şahin 2c5d59f3a8 create_distributed_table in transaction is fixed 2018-07-03 17:05:01 +03:00
Onder Kalaci 2f01894589 Track relation accesses using the connection management infrastructure 2018-06-25 18:40:30 +03:00
mehmet furkan şahin 2b2ce036eb create_distributed_table honors sequential mode 2018-06-19 17:33:45 +03:00
mehmet furkan şahin d1a3b20115 foreign_constraint_utils is created 2018-06-07 18:19:24 +03:00
Marco Slot 2559b84049 Drop shards as current user instead of super user 2018-05-01 09:57:20 +02:00
Marco Slot 3d3c19a717
Improve messages for essential connection failures 2018-04-26 12:58:47 -06:00
Murat Tuncer a6fe5ca183 PG11 compatibility update
- changes in ruleutils_11.c is reflected
- vacuum statement api change is handled. We now allow
  multi-table vacuum commands.
- some other function header changes are reflected
- api conflicts between PG11 and earlier versions
  are handled by adding shims in version_compat.h
- various regression tests are fixed due output and
  functionality in PG1
- no change is made to support new features in PG11
  they need to be handled by new commit
2018-04-26 11:29:43 +03:00
Brian Cloutier 42ddfa176d Fix crash on Windows where there is no detail 2018-04-13 12:54:22 -07:00
Burak Yucesoy 0c283fa8a3 Add partitioning support to MX tables
Previously, we prevented creation of partitioned tables on Citus MX.
We decided to not focus on this feature until there is a need. Since
now there are requests for this feature, we are implementing support
for partitioned tables on Citus MX.
2018-04-06 12:47:06 +03:00
Metin Doslu 3b7b64a8b6 Remove skip_jsonb_validation_in_copy GUC 2018-03-13 10:33:27 +02:00
Marco Slot 6f7c3bd73b Skip JSON validation on coordinator during COPY 2018-02-02 15:33:27 +01:00
Marco Slot 36ee21c323 Make CanUseBinaryCopyFormatForType public 2017-12-14 09:32:55 +01:00