Introduction The most common operations with ScyllaDB are inserting, updating, and retrieving rows within a single partition: each operation specifies a single partition key, and the operation applies to that partition. While less commonly used, reads of all partitions, also known as full table scans are also useful, often in the context of data analytics. This post describes how to efficiently perform full table scans with ScyllaDB 1.6 and above.
ScyllaDB strives to offer its users predictable low latencies. However, in real life, things do not always go according to plan, and sometimes predictable low latencies become unpredictable big latencies. When that happens, it’s time to go into detective mode and go figure out what’s going on.
Scylla 1.3 has introduced better support for large partitions. It is an important feature which simplifies data modeling so that it can be more focused on the client’s needs and less on the server limitations and ways to work around them. Moreover, issues related to large partitions are not just failed requests and server crashes caused by the node running out of memory, before reaching that point cluster may experience various performance problems, something much harder to diagnose.
Block devices sometimes do bad things (or just fill up), so sometimes bad things happen to good software. CharybdeFS makes it easy to do integration testing that covers hard-to test filesystem errors. And good error handling is a sign of well-thought-out software. For example, your program will make a much better impression on users if you have it show a nice “insufficient space” message than if it just crashes for no apparent reason. The CharybdeFS filesystem lets you inject arbitrary file errors for testing. This article covers some common examples for getting started.
This is the second and last part of this article. If you haven’t read the first part, you can do it here. In this part, we will look at the design of the Seastar I/O Scheduler that Scylla uses to manage its disk I/O and discuss how it can be used to not only provide predictable latencies as we saw in our previous installment, but to guarantee fairness and proper balancing among different actors.
In a datastore like Scylla, there are many actors competing for disk I/O. Examples of such actors are data writers (in Scylla’s parlance they can be either memtable or commitlog writers), and a disk reader fetching the data to serve a cache miss. To illustrate the role that competition plays, if we are just issuing disk I/O without resorting to any fairness or balancing consideration, a reader, for instance, could find itself behind a storm of writes. By the time it has the opportunity to run, all that wait would have translated into increased latency for the read.
This is the fifth part in our blog series on Scylla testing. Part 1 covers Apache Cassandra compatibilty testing, Part 2 covers Jepsen tests, Part 3 covers the CharybdeFS fault-injecting filesystem, and Part 4 covers distributed testing.
This is the fourth part in our blog series on Scylla testing. Part 1 covers Apache Cassandra compatibility testing, Part 2 covers Jepsen tests, and Part 3 covers the CharybdeFS fault-injecting filesystem.
Full disclosure: I’m a casual user of Ansible, Python and Bash. I’m not an expert on any of them. The notes below are my personal impression after playing around with Ansible for a few months, mostly to run Scylla and Cassandra clusters on EC2.
This is the third part in our blog series on Scylla testing. Part 1 covers Apache Cassandra compatibility testing, and Part 2 covers Jepsen tests.