citus

Commit Graph

Author	SHA1	Message	Date
Hanefi Onaldi	6254f30305	Add arbitrary config tests for function DDL statements (#5885 )	2022-04-12 16:03:10 +03:00
SaitTalhaNisanci	3f65751d43	Add an infrastructure to run same tests with arbitrary configs (#5316 ) To run tests in parallel use: ```bash make check-arbitrary-configs parallel=4 ``` To run tests sequentially use: ```bash make check-arbitrary-configs parallel=1 ``` To run only some configs: ```bash make check-arbitrary-base CONFIGS=CitusSingleNodeClusterConfig,CitusSmallSharedPoolSizeConfig ``` To run only some test files with some config: ```bash make check-arbitrary-base CONFIGS=CitusSingleNodeClusterConfig EXTRA_TESTS=dropped_columns_1 ``` To get a deterministic run, you can give the random's seed: ```bash make check-arbitrary-configs parallel=4 seed=12312 ``` The `seed` will be in the output of the run. In our regular regression tests, we can see all the details about either planning or execution but this means we need to run the same query under different configs/cluster setups again and again, which is not really maintanable. When we don't care about the internals of how planning/execution is done but the correctness, especially with different configs this infrastructure can be used. With `check-arbitrary-configs` target, the following happens: - a bunch of configs are loaded, which are defined in `config.py`. These configs have different settings such as different shard count, different citus settings, postgres settings, worker amount, or different metadata. - For each config, a separate data directory is created for tests in `tmp_citus_test` with the config's name. - For each config, `create_schedule` is run on the coordinator to setup the necessary tables. - For each config, `sql_schedule` is run. `sql_schedule` is run on the coordinator if it is a non-mx cluster. And if it is mx, it is either run on the coordinator or a random worker. - Tests results are checked if they match with the expected. When tests results don't match, you can see the regression diffs in a config's datadir, such as `tmp_citus_tests/dataCitusSingleNodeClusterConfig`. We also have a PostgresConfig which runs all the test suite with Postgres. By default configs use regular user, but we have a config to run as a superuser as well. So the infrastructure tests: - Postgres vs Citus - Mx vs Non-Mx - Superuser vs regular user - Arbitrary Citus configs When you want to add a new test, you can add the create statements to `create_schedule` and add the sql queries to `sql_schedule`. If you are adding Citus UDFs that should be a NO-OP for Postgres, make sure to override the UDFs in `postgres.sql`. You can add your new config to `config.py`. Make sure to extend either `CitusDefaultClusterConfig` or `CitusMXBaseClusterConfig`. On the CI, upon a failure, all logfiles will be uploaded as artifacts, so you can check the artifacts tab. All the regressions will be shown as part of the job on CI. In your local, you can check the regression diffs in config's datadirs as in `tmp_citus_tests/dataCitusSingleNodeClusterConfig`.	2021-10-12 14:24:19 +03:00

Author

SHA1

Message

Date

Hanefi Onaldi

6254f30305

Add arbitrary config tests for function DDL statements (#5885 )

2022-04-12 16:03:10 +03:00

SaitTalhaNisanci

3f65751d43

Add an infrastructure to run same tests with arbitrary configs (#5316 )

To run tests in parallel use:

```bash
make check-arbitrary-configs parallel=4
```

To run tests sequentially use:

```bash
make check-arbitrary-configs parallel=1
```

To run only some configs:

```bash
make check-arbitrary-base CONFIGS=CitusSingleNodeClusterConfig,CitusSmallSharedPoolSizeConfig
```

To run only some test files with some config:

```bash
make check-arbitrary-base CONFIGS=CitusSingleNodeClusterConfig EXTRA_TESTS=dropped_columns_1
```

To get a deterministic run, you can give the random's seed:

```bash
make check-arbitrary-configs parallel=4 seed=12312
```

The `seed` will be in the output of the run.

In our regular regression tests, we can see all the details about either planning or execution but this means
we need to run the same query under different configs/cluster setups again and again, which is not really maintanable.

When we don't care about the internals of how planning/execution is done but the correctness, especially with different configs
this infrastructure can be used.

With `check-arbitrary-configs` target, the following happens:

- a bunch of configs are loaded, which are defined in `config.py`. These configs have different settings such as different shard count, different citus settings, postgres settings, worker amount, or different metadata.
- For each config, a separate data directory is created for tests in `tmp_citus_test` with the config's name.
- For each config, `create_schedule` is run on the coordinator to setup the necessary tables.
- For each config, `sql_schedule` is run. `sql_schedule` is run on the coordinator if it is a non-mx cluster. And if it is mx, it is either run on the coordinator or a random worker.
- Tests results are checked if they match with the expected.

When tests results don't match, you can see the regression diffs in a config's datadir, such as `tmp_citus_tests/dataCitusSingleNodeClusterConfig`.

We also have a PostgresConfig which runs all the test suite with Postgres.
By default configs use regular user, but we have a config to run as a superuser as well.

So the infrastructure tests:

- Postgres vs Citus
- Mx vs Non-Mx
- Superuser vs regular user
- Arbitrary Citus configs

When you want to add a new test, you can add the create statements to `create_schedule` and add the sql queries to `sql_schedule`.
If you are adding Citus UDFs that should be a NO-OP for Postgres, make sure to override the UDFs in `postgres.sql`.

You can add your new config to `config.py`. Make sure to extend either `CitusDefaultClusterConfig` or `CitusMXBaseClusterConfig`.

On the CI, upon a failure, all logfiles will be uploaded as artifacts, so you can check the artifacts tab.
All the regressions will be shown as part of the job on CI.

In your local, you can check the regression diffs in config's datadirs as in `tmp_citus_tests/dataCitusSingleNodeClusterConfig`.

2021-10-12 14:24:19 +03:00

2 Commits (1d7dda991f54a2da75febf9040a4356221e9a4ba)