Scylla Blog

Stay up to date with recent news and updates on our Users Blog, and get under the hood on our Developers Blog.

Posts by Glauber Costa


Selecting Compression Chunk Sizes for Scylla

By default, Scylla SSTables will be compressed when they are written to disk. As mandated by the file format, data is compressed in chunks of a certain size – 4kB if not explicitly set. The size of the chunk is one of the parameters for the compression property to be set at table creation. Chunk-based compression presents trade-offs that users may not be aware of. In this post, I will try to explore what those trade-offs are and how to set them correctly for maximum benefit. As trade-offs imply different results for different loads, we will focus on single-partition read […]



Faster and better: What to expect running Scylla on AWS i3 instances


Amazon recently unveiled a new class of machines: the AWS i3 family. Targeted at I/O-intensive applications and featuring up to 15TB of fast storage, these machines offer unprecedented power with a great balance between I/O and CPU. At a lower price than the previous i2 family, we expect the i3 family to become the default class for NoSQL workloads. This article will cover i3 instances and provide information about the status of Scylla support for the hardware. Although we don’t yet officially provide i3 AMIs, customers are already running them in production with positive results. Scylla’s native architecture takes advantage […]



Performance report: Scylla vs Apache Cassandra on low-end hardware

What to expect from Scylla’s performance on low-end hardware

Scylla is a reimplementation of Apache Cassandra that has been demonstrated by us and third parties to perform up to 10x better than Apache Cassandra. These performance advantages stem from Scylla’s modern hardware-friendly and ultra-scalable architecture. As a result, Scylla’s performance grows as the hardware size grows. Scaling both up and out offers many advantages: from simplified cluster management to access to generally better hardware and economies of scale. We will address those choices in detail in an upcoming blog post. However, many users have compelling reasons to stay on low-end hardware, […]



Scylla Workload Conditioning part one: write request rate determination


What is Workload Conditioning? What is the best request rate I should throw at my cluster? What disk bandwidth should I make available for compactions? How many reader or writer threads should I have? What is the best size for my memtables?



Big latencies? It’s open season for kernel bug-hunting!


ScyllaDB strives to offer its users predictable low latencies. However, in real life, things do not always go according to plan, and sometimes predictable low latencies become unpredictable big latencies. When that happens, it’s time to go into detective mode and figure out what’s going on.



Monitoring Deep Dive: The best tools for the job and the metrics exported by Scylla


Last month we gave a talk at Scylla Summit that described the caveats and best practices for monitoring a live Scylla cluster. Once the cluster is ready to serve your requests, you will need to monitor it to understand its performance characteristics and its overall health and, should anything go wrong, to understand what it was that upset the cluster’s behavior.



Designing a Userspace Disk I/O Scheduler for Modern Datastores: the Scylla example (Part 2)

This is the second and last part of this article. If you haven’t read the first part, you can do it here. In this part, we will look at the design of the Seastar I/O Scheduler that Scylla uses to manage its disk I/O and discuss how it can be used not only to provide predictable latencies, as we saw in our previous installment, but also to guarantee fairness and proper balancing among different actors.



Designing a Userspace Disk I/O Scheduler for Modern Datastores: the Scylla example (Part 1)

In a datastore like Scylla, there are many actors competing for disk I/O. Examples of such actors are data writers (in Scylla’s parlance they can be either memtable or commitlog writers), and a disk reader fetching the data to serve a cache miss. To illustrate the role that competition plays: if we just issue disk I/O without any fairness or balancing considerations, a reader, for instance, could find itself stuck behind a storm of writes. By the time it has the opportunity to run, all that waiting will have translated into increased latency for the read.



Choosing EC2 instances for NoSQL

Amazon EC2 is a virtual computer store with all sizes and types of server on display. We researched the top choices to find the best balanced, best-performing server for NoSQL.

