# How our testing works

We use the test tooling of postgres to run our tests. This tooling is very
simple but effective: it runs a series of `.sql` scripts, captures their
output, and stores that in `results/$sqlfilename.out`. It then compares the
actual output to the expected output with a simple `diff` command:

```bash
diff results/$sqlfilename.out expected/$sqlfilename.out
```
## Schedules

Which sql scripts to run is defined in a schedule file, e.g. `multi_schedule`,
`multi_mx_schedule`.
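Schedule files use the plain pg_regress schedule format. As a minimal sketch
(with hypothetical test names, not entries from an actual schedule):

```
# Lines starting with '#' are comments.
# Each "test:" line is one group; tests in the same group run in parallel.
test: my_setup_test
test: my_first_test my_second_test
```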
## Makefile

In our `Makefile` we have rules to run the different types of test schedules.
You can run them from the root of the repository like so:

```bash
# e.g. the multi_schedule
make install -j9 && make -C src/test/regress/ check-multi
```

Take a look at the makefile for a list of all the testing targets.
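For a quick overview, one way to list those targets is to grep the Makefile
for rules following the `check-*` naming convention used above (a convenience
one-liner, not something the Makefile itself provides):

```bash
grep -E '^check-[a-z-]+:' src/test/regress/Makefile
```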
## Running a specific test

Often you want to run a specific test and don't want to run everything. You
can use one of the following commands to do so:

```bash
# If your test needs almost no setup you can use check-minimal
make install -j9 && make -C src/test/regress/ check-minimal EXTRA_TESTS='multi_utility_warnings'

# Often tests need some testing data; if you get missing table errors using
# check-minimal you should try check-base
make install -j9 && make -C src/test/regress/ check-base EXTRA_TESTS='with_prepare'

# Sometimes this is still not enough and some other test needs to be run before
# the test you want to run. You can do so by adding it to EXTRA_TESTS too.
make install -j9 && make -C src/test/regress/ check-base EXTRA_TESTS='add_coordinator coordinator_shouldhaveshards'
```
## Normalization

The output of tests is sadly not completely predictable. Still we want to
compare the output of different runs and error when the important things are
different. We do this by not using the regular system `diff` to compare files.
Instead we use `src/test/regress/bin/diff`, which does the following things:

- Change the `$sqlfilename.out` file by running it through `sed` using the
  `src/test/regress/bin/normalize.sed` file. This does stuff like replacing
  numbers that keep changing across runs with an `XXX` string, e.g. port
  numbers or transaction numbers.
- Backup the original output to `$sqlfilename.out.unmodified` in case it's
  needed for debugging
- Compare the changed `results` and `expected` files with the system `diff`
  command.
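To give a feel for what such rules look like, here is an illustrative `sed`
snippet in the spirit of `normalize.sed` (made-up rules, not the actual
contents of the file):

```sed
# Mask port numbers, which differ between test runs
s/port=[0-9][0-9]*/port=xxxxx/g
# Mask transaction numbers, which depend on execution order
s/transactionNumber = [0-9][0-9]*/transactionNumber = xxx/g
```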
## Updating the expected test output

Sometimes you add a test to an existing file, or test output changes in a way
that's not bad (possibly even good if support for queries is added). In those
cases you want to update the expected test output.

The way to do this is very simple: you run the test and copy the new `.out`
file in the `results` directory to the `expected` directory, e.g.:

```bash
make install -j9 && make -C src/test/regress/ check-minimal EXTRA_TESTS='multi_utility_warnings'
cp src/test/regress/{results,expected}/multi_utility_warnings.out
```
## Adding a new test file

Adding a new test file is quite simple (a sketch of the full workflow follows
the list):

- Write the SQL file in the `sql` directory
- Add it to a schedule file, to make sure it's run in CI
- Run the test
- Check that the output is as expected
- Copy the `.out` file from `results` to `expected`
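A minimal sketch of that workflow, assuming a new test named `my_new_test`
(a hypothetical name used only for illustration):

```bash
# Write the test
$EDITOR src/test/regress/sql/my_new_test.sql

# Add it to a schedule file so it runs in CI
echo 'test: my_new_test' >> src/test/regress/multi_schedule

# Run it; the first run reports a diff because there is no
# expected/my_new_test.out yet
make install -j9 && make -C src/test/regress/ check-base EXTRA_TESTS='my_new_test'

# After checking results/my_new_test.out looks right, promote it
cp src/test/regress/{results,expected}/my_new_test.out
```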
## Isolation testing

See `src/test/regress/spec/README.md`

## Upgrade testing

See `src/test/regress/upgrade/README.md`

## Failure testing

See `src/test/regress/mitmscripts/README.md`
## Perl test setup script

To automatically set up a citus cluster in tests we use our
`src/test/regress/pg_regress_multi.pl` script. This sets up a citus cluster
and then starts the standard postgres test tooling. You almost never have to
change this file.