Feb 4

Scylla testing part 1: Apache Cassandra compatibility testing


This series of blog posts covers how we test Scylla: first for stability and correctness, and also to ensure that Scylla stays compatible with Apache Cassandra as the project evolves. Software testing is critical, and especially so for a project that must remain compatible with existing software.

Scylla must pass three categories of tests: Scylla native tests, Apache Cassandra tests, and third-party tests. First, of course, are the project’s own built-in tests.

But, of course, that’s not all. Scylla must be compatible with Apache Cassandra. So, besides Scylla’s own tests, the project must pass the Apache Cassandra test suite, including both unit tests and distributed tests. Tests for the CQL query language are part of the Apache Cassandra unit tests.

Example of a unit test

public class TimeuuidTest extends CQLTester
{
    /**
     * Migrated from cql_tests.py:TestCQL.timeuuid_test()
     */
    @Test
    public void testTimeuuid() throws Throwable
    {
        createTable("CREATE TABLE %s (k int, t timeuuid, PRIMARY KEY(k, t))");

        assertInvalidSyntaxMessage(null, "INSERT INTO %s (k, t) VALUES (0, 2012-11-07 18:18:22-0800)");

        for (int i = 0; i < 4; i++)
            execute("INSERT INTO %s (k, t) VALUES (0, now())");

        Object[][] rows = getRows(execute("SELECT * FROM %s"));
        assertEquals(4, rows.length);

        assertRowCount(execute("SELECT * FROM %s WHERE k = 0 AND t >= ?", rows[0][1]), 4);
        assertEmpty(execute("SELECT * FROM %s WHERE k = 0 AND t < ?", rows[0][1]));
        assertRowCount(execute("SELECT * FROM %s WHERE k = 0 AND t > ? AND t <= ?", rows[0][1], rows[2][1]), 2);
        assertRowCount(execute("SELECT * FROM %s WHERE k = 0 AND t = ?", rows[0][1]), 1);

        assertInvalid("SELECT dateOf(k) FROM %s WHERE k = 0 AND t = ?", rows[0][1]);

        for (int i = 0; i < 4; i++)
        {
            long timestamp = UUIDs.unixTimestamp((UUID) rows[i][1]);
            assertRows(execute("SELECT dateOf(t), unixTimestampOf(t) FROM %s WHERE k = 0 AND t = ?", rows[i][1]),
                       row(new Date(timestamp), timestamp));
        }

        assertEmpty(execute("SELECT t FROM %s WHERE k = 0 AND t > maxTimeuuid(1234567) AND t < minTimeuuid('2012-11-07 18:18:22-0800')"));
    }

    /**
     * Test for 5386,
     * migrated from cql_tests.py:TestCQL.function_and_reverse_type_test()
     */
    @Test
    public void testDescClusteringOnTimeuuid() throws Throwable
    {
        createTable("CREATE TABLE %s (k int, c timeuuid, v int, PRIMARY KEY (k, c)) WITH CLUSTERING ORDER BY (c DESC)");

        execute("INSERT INTO %s (k, c, v) VALUES (0, now(), 0)");
    }
}

The above code is an example of an Apache Cassandra unit test. It performs the following steps.

  1. Create a table with two columns: int and timeuuid.
  2. Insert n (n=4) values in it with the now() function, which is guaranteed to return unique results.
  3. Verify that there are indeed 4 rows in the table. If now() returns a unique UUID on every call, there will be.
  4. Perform queries that verify that the timeuuid values can be compared appropriately.
  5. Verify that the functions dateOf and unixTimestampOf return correct values by comparing them with a java.util.Date and with the timestamp that UUIDs.unixTimestamp returns for the same UUID values.
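
To make the timestamp check in the last step more concrete, here is a minimal sketch of how a Unix timestamp can be recovered from a version 1 (time-based) UUID using only java.util.UUID. The class TimeuuidSketch and its methods are hypothetical names made up for this illustration; they are not the helper the test uses, and not Cassandra's or Scylla's implementation. The sketch also zeroes the clock sequence and node fields, which a real generator would not do.

```java
import java.util.Date;
import java.util.UUID;

/** Hypothetical illustration only; not part of Cassandra, Scylla, or any driver. */
final class TimeuuidSketch {
    /** Number of 100 ns intervals between the UUID epoch (1582-10-15) and the Unix epoch. */
    private static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    /** Builds a version 1 UUID whose timestamp field encodes the given Unix time in milliseconds. */
    static UUID fromUnixMillis(long unixMillis) {
        long ts = unixMillis * 10_000L + UUID_EPOCH_OFFSET; // 100 ns units since 1582-10-15
        long timeLow = ts & 0xFFFFFFFFL;
        long timeMid = (ts >>> 32) & 0xFFFFL;
        long timeHi = (ts >>> 48) & 0x0FFFL;
        // Layout of the most significant 64 bits: time_low | time_mid | version (1) | time_hi
        long msb = (timeLow << 32) | (timeMid << 16) | 0x1000L | timeHi;
        // Assumption for the sketch: IETF variant bits set, clock sequence and node zeroed.
        long lsb = 0x8000000000000000L;
        return new UUID(msb, lsb);
    }

    /** Converts the 100 ns UUID timestamp back to Unix milliseconds. */
    static long unixTimestamp(UUID uuid) {
        // UUID.timestamp() throws UnsupportedOperationException for non-version-1 UUIDs.
        return (uuid.timestamp() - UUID_EPOCH_OFFSET) / 10_000L;
    }

    public static void main(String[] args) {
        long millis = 1352341102000L; // 2012-11-07 18:18:22-0800
        UUID t = fromUnixMillis(millis);
        System.out.println("version: " + t.version());
        System.out.println("date:    " + new Date(unixTimestamp(t)));
    }
}
```

Comparing the values returned by UUID.timestamp() is what gives a time ordering here. The natural ordering of java.util.UUID (compareTo) compares the raw 64-bit halves and does not sort by time.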

Apache Cassandra developers have written hundreds of such tests, which means that the Scylla project is starting with a solid foundation for QA. However, the Apache Cassandra tests require quite a bit of setup. The Cassandra distributed tests, or dtest suite, are more of a functional cluster test than actual unit testing; the suite just happens to be written using Python unittest classes. It’s a common trend in Python testing that most types of test classes are derived from the unittest base class.

To run the Apache Cassandra unit tests against Scylla, you first need to start a Scylla cluster with ccm, the Cassandra Cluster Manager.

ccm create scylla-3 --scylla --vnodes -n 3 --install-dir=[your-install-dir]
ccm start

The Scylla project has a fork of ccm that enables starting Scylla. We’re in the process of contributing this extra functionality to the ccm project, and will post to the scylladb-dev mailing list and the upstream project list. We’ll be ready to share this repo with the rest of the world soon, when we have worked out some remaining issues with Git history. We’re still working out if we want to generate a single clean patch to submit, or keep some messy history.

The Cassandra unit tests are a mix of two categories: some depend only on the correct behavior of the software, and some are tied to the internals of the original implementation. Anything that directly depends on the original Java implementation is not useful for Scylla testing purposes. The tests that don’t depend on internals, including CQL unit tests, are in Scylla’s cassandra-unit-tests repository. So far, these tests have been a good way for us to show how compliant and solid our support for CQL is.

Once you have cloned the test repo, run the tests with:

cd cassandra-unit-tests
mvn test

What’s the test failure?

The mvn test command above will show one failed test. What’s the bug?

The current development version of Scylla does not always return a unique timestamp. This is a known Scylla bug and we’re working on it right now.
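
For readers wondering how a timestamp source can guarantee uniqueness at all, the following is a minimal sketch of one common technique: remember the last value handed out and never return anything less than or equal to it. This is not Scylla's or Cassandra's actual code; the class name UniqueTimestampSketch and the choice of 100 ns units are assumptions made for the example.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration only; not the generator used by Scylla or Cassandra. */
final class UniqueTimestampSketch {
    /** Last value handed out, shared by all threads. */
    private final AtomicLong last = new AtomicLong();

    /** Returns a strictly increasing value in 100 ns units, even if the clock has not advanced. */
    long next() {
        long now = System.currentTimeMillis() * 10_000L;
        // Either jump forward to the current clock reading or step past the previous value.
        return last.updateAndGet(prev -> Math.max(prev + 1, now));
    }

    public static void main(String[] args) throws InterruptedException {
        UniqueTimestampSketch gen = new UniqueTimestampSketch();
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        int threads = 4;
        int perThread = 10_000;

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    seen.add(gen.next());
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        // Every call returned a distinct value if the set holds threads * perThread entries.
        System.out.println("distinct values: " + seen.size() + " of " + (threads * perThread));
    }
}
```

Reading the clock alone is not enough, because several calls can land within the same clock tick. The atomic update is what keeps values distinct when calls arrive faster than the clock resolution or from several threads at once.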

Scylla has a Jenkins setup with many jobs that run daily or on every merge. We execute all the tests covered above using Jenkins, and as we develop more features, everything is continually tested. Constantly running and expanding the test suite is a top priority. If you’re interested in joining us, and high software quality is your thing, check out the QA automation developer listing on our jobs page.

Next: Jepsen tests

In the next article in this series, we’ll cover the Jepsen distributed testing tool, and how we’re extending Jepsen’s Cassandra testing support to cover Scylla. Follow @ScyllaDB on Twitter or subscribe to this site’s RSS feed for more testing info, release announcements, and other project news.

Apache® and Apache Cassandra® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.

About Scylla Team

Scylla is the world’s fastest column-store database: the functionality of Apache Cassandra with the speed of a light key/value store.


Tags: deep-dive, testing