Distributed systems are hard to test, which means that creating a solid set of tests for a distributed database is a substantial software project in itself. Software developer Kyle Kingsbury has invented a full-featured tool for testing distributed systems, called Jepsen. (It’s named after Carly Rae Jepsen.)
Jepsen is a flexible tool that can be set up to test a variety of distributed systems, including Apache Cassandra. Jepsen and Cassandra have both made a lot of progress since the original set of tests. Joel Knighton, a developer working at DataStax, has updated and enhanced the Jepsen tests for Cassandra and made the code available.
Jepsen runs a bunch of tasks at the same time.
- A control node starts and manages all the others.
- Database nodes run the service being tested.
- “Processes” carry out client tasks: there is a generator that creates new operations for the worker processes to perform.
- Finally, a checker checks for correct state at the end of the test.
A generator may create different types of operations:
- Client operations that in Cassandra context would eventually result in some CQL transactions.
- Nemesis operations.
- Other operations such as bootstrapping or decommissioning nodes.
The logical block that is responsible for bad things is also called nemesis. Here are some of the partitioning possibilities it supports:
- Split off a subset of the nodes in the cluster
- Split a single node
- Bridge: cut the network into two parts, but preserves a node in the middle which remains connected to both sides.
- Scramble the time on a node.
Nemesis can also change the time on a node, and impose network delays, using the
tc utility. Using
tc you can also reorder network packets. Jepsen may be extended to utilize this feature too in the future.
The Jepsen tests for Cassandra are organized into groups. Each test group consists of a few tests that do something specific to this test group, such as batch transactions or map transactions. During the test, nemesis imposes problems on the cluster, such as “node isolated”.
Adding functionality to Jepsen
The ScyllaDB version of Jepsen uses faketime to enable ScyllaDB processes in different containers on the same host to have different ideas of the correct time. The original version of Jepsen used the
date command to change time, but this cannot make the times in two containers different. When you change the time it changes the time for the host, including all nodes.
Kyle proposed using faketime instead, and the ScyllaDB project will submit the change upstream, for testing other software.
We have also added an option to run a “loader” in the background – a sidekick process that will do something to the cluster while the original test runs. For instance, delaying the traffic on specific links can help test problems when there is pressure on the internal server queues. But the original test case load is far from challenging ScyllaDB. One way to help make sure the system is correct under extreme load is to run another client to constantly keep the internal queues heavily occupied during the original test. This is where the new sidekick option comes in handy. It lets us add a cassandra-stress process to a test, to simulate higher loads.
The Jepsen test suite for Cassandra and ScyllaDB is thorough. A single test test takes 5-10 minutes, and the whole suite takes several hours. We are now in the process of adding ‘batch’ group based tests to cover batch operations while
tc is disrupting the network, with cassandra-stress running in the background during the test.
So far, 11 bugs have been filed on ScyllaDB based on Jepsen tests: ScyllaDB bugs with Jepsen keyword. Bootstrap and decommission have most of the issues. ScyllaDB development is moving rapidly, and the remaining bugs may be closed by the time you read this.
We will need to improve the Jepsen checkers to verify the state of the cluster at end of the test. The current Jepsen checker just verifies that the data can be successfully read, which can happen even if a node has failed. So we want to verify the state of all nodes.
We also plan to add schema change tests, involving a schema change and network disruption happening at the same time. The Cassandra tests we started with are only changing data, not schema. We need more tests that cover network disruption during a repair.
We’re also planning to simulate more network problems, for example making a network connection down for connections in one direction but delayed for connections in the other direction.
And of course we’re planning to run more combinations of events, including testing a simultaneous repair, schema change, and network partition. Jepsen gives us a flexible test framework to build on, and we’re going to use it to put ScyllaDB through as many tough tests as we can. For more info on ScyllaDB testing with Jepsen, see ScyllaDB wiki: Running Jepsen with ScyllaDB and the scylladb/jepsen repository.
Next in the ScyllaDB testing series: Distributed tests, longevity tests, and a new framework for filesystem fault injection.