Why Switch from Apache Cassandra to Scylla?

Six reasons why it’s time to make the move to Scylla (and the features that make us the better choice)

Price Performance

Everything in Scylla is written with performance in mind. Scylla squeezes every cycle from your CPU — from analyzing C++ compiled assembler code, to using the best kernel async interfaces for system calls. Scylla even caches your paged query pointer. It has its own memory allocator and its own schedulers for CPU and IO. And Scylla is designed to run at 100% CPU utilization, with every operation classified to a priority class. There’s no need to overprovision. 

FireEye found Scylla to be the best option as a back-end to their massive graph database.

Consistent Performance

Thanks to Scylla’s built-in schedulers, foreground operations (reads and writes) are prioritized over maintenance tasks such as repairs and compactions. Scylla has none of the garbage collection stalls that hinder Cassandra performance. Scylla adopts a strict shared-nothing design: not a single lock is taken, so lock contention cannot affect latency.

Comcast was able to reduce P99 response times by 95% after migrating to Scylla. 

Less Complexity

Speed doesn’t have to come at the price of complexity. Scylla simplifies everything for its users. It automatically configures the RAID device for you with the right striping, and automatically assigns the NIC’s network queues to shards. Scylla installs daemons in an isolated Linux control group to cap their memory/CPU usage. Scylla setup runs a disk benchmark to find the operating point that maximizes throughput while keeping latency low.

GE Predix was able to greatly reduce the administrative burden to meet their SLAs after switching to Scylla.

Better Maintainability

Stability and ease of maintenance are often more important than performance/cost. Scylla has a notable maintainability advantage as a distributed database. Since Scylla scales up to any number of cores and can stream data to a 60TB(!) meganode (at the same speed it streams to smaller nodes), you can reduce the number of nodes in your cluster by 10x. So, for example, rolling restarts become 10x faster. Scylla add-node and decommission operations are *restartable*: you can pause them and resume them from the previous point. Compaction is a solved problem in Scylla.

Fanatics was able to replace 43 nodes of Cassandra with just 3 nodes of Scylla.

Better Functionality

Do more with Scylla than you can with Cassandra. Scylla supports global and local indexes — even at the same time. Finally, real, scalable indexes can be used with your model. Scylla supports workload prioritization, enabling you to provide a different priority to different user workloads in a simple role-based fashion. You can provide a superior SLA to your production queries and run your dev queries with the lowest priority. Scylla supports change-data-capture as a CQL table, thus you can easily track your DB changes in a consistent way with the same query language you already know.

Grab found it very easy to use Scylla for their real-time threat detection system.

Easy Migration

Teams can easily migrate applications that use Apache Cassandra and enjoy fundamentally better technology. With Scylla there’s no lock-in. Everything’s the same: the CQL protocol and queries, nodetool, SSTables and compaction strategies; even JMX is supported. What’s more, Scylla supports a DynamoDB-compatible API, so you can consolidate more use cases. Work with the same open source projects such as JanusGraph, Spark, Kafka (using our optimized Scylla connector), Presto, KairosDB, Kong and many others. Scylla chooses the best open source projects and has selected Prometheus and Grafana for metrics, Wireshark for packet analysis, systemd for Linux daemons and a Kubernetes operator for provisioning.

The team at SAS was shocked that they didn’t need to change any code for their application.

Scylla Features vs Cassandra

Scylla’s innovative shard-per-core design divides a server’s resources into shared-nothing units of CPU core, RAM, direct storage and network queue. Scylla runs on machines with very high core counts and on multiple CPU architectures, from x86 to Arm, IBM Power and even mainframe. Scylla has an end-to-end sharded architecture, so each server core sends RPC to the matching CPU core on the remote replica machine. Additionally, Scylla’s shard-aware drivers make the client topology aware, so it reaches the CPU core shard that owns the data, which eliminates hot shards and removes extra hops.

At the heart of Scylla lies its core engine, Seastar. Seastar is a standalone Apache-licensed library developed by ScyllaDB. Several storage companies and the Ceph open source storage engine make use of Seastar technology. Seastar has a specialized scheduler, is built on a fully async programming paradigm of futures and promises and can run a million lambda functions per core per second. Seastar is responsible for the schedulers, the networking (it has a TCP stack in userspace but usually the Linux kernel is good enough), DMA to the disks, sharded memory allocators and so forth. Seastar is written in C++20 and uses every innovative trick and paradigm.

Beyond Seastar, Scylla uses C++20 and the best compiler techniques to maximize the CPU benefits. Scylla automatically configures your network card interrupts to balance IRQ processing across your cores. Scylla explicitly chooses to read ahead data from the drive when it expects a follow-on disk access instead of blindly relying on the disk, as is the case with Cassandra. Scylla controls all aspects of CPU execution and uses CPU idle time to run procedures that optimize memory layout.

As with Cassandra, the client driver is topology aware and will prefer a node that owns the key range under query. Scylla takes the design one step further and allows the client to reach the specific CPU core within the replica that owns the data. This improves load balancing among the servers and improves latency. Shard-aware Scylla drivers are available for Java, Go and Python.
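To make the idea of routing a request to the owning core concrete, here is a minimal sketch of mapping a partition token to a shard index. It is an illustration only: the function name `shard_for_token` and the default of 12 ignored most-significant bits are assumptions made for this example, and a real driver learns the shard count and related parameters from the node it connects to.

```python
def shard_for_token(token: int, shard_count: int, ignore_msb: int = 12) -> int:
    """Map a signed 64-bit partition token to a shard index in [0, shard_count).

    Illustrative sketch; `ignore_msb` is an assumed parameter for this example.
    """
    mask = 2**64 - 1
    # Shift the signed token into the unsigned 64-bit range.
    biased = (token + 2**63) & mask
    # Discard the top bits so neighbouring token ranges spread across shards.
    biased = (biased << ignore_msb) & mask
    # Scale the 64-bit value down to a shard index.
    return (biased * shard_count) >> 64


# A client could use the result to pick the connection bound to that core.
print(shard_for_token(-2**63, shard_count=8))
```

With a mapping like this, a client that keeps one connection per shard can send each request straight to the core that owns the partition instead of letting the node forward it internally.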

Scylla’s lightweight transactions are compatible with Cassandra’s but have one less round trip, and are therefore more efficient with better latency. Scylla’s LWT has a special commitlog mode that automatically balances between the transaction durability flush requirement and fast, non-transactional operations.

Per-query cache bypass allows range scan queries to skip the cache, so the scanned data is not stored in it. Bypass cache hints let you squeeze more performance from your cluster and keep your working set in cache, so real-time queries receive the best latency.
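As a rough illustration, the hint is expressed in CQL by appending a `BYPASS CACHE` clause to a `SELECT`. The helper below only builds the statement text; the function name `with_bypass_cache` and the table `ks.events` are hypothetical, and executing the statement would require a driver session that is not shown here.

```python
def with_bypass_cache(select_cql: str) -> str:
    """Append Scylla's BYPASS CACHE clause to a SELECT statement (sketch)."""
    stmt = select_cql.strip().rstrip(";").rstrip()
    if stmt.upper().endswith("BYPASS CACHE"):
        return stmt + ";"          # already hinted, leave it as is
    return stmt + " BYPASS CACHE;"


# Hypothetical analytics scan that should not evict the hot working set.
print(with_bypass_cache("SELECT * FROM ks.events;"))
```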

Scylla implements a different repair checksum algorithm that resembles rsync and runs the checksum at row granularity instead of partition granularity. The new algorithm reconciles replicas faster, sends less data over the wire and is less sensitive to large partitions.
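The benefit of row granularity can be shown with a toy comparison of two replicas of one partition. This is not Scylla's repair implementation; the names `row_digest` and `rows_to_sync`, the use of SHA-256 and the in-memory dictionaries are all assumptions chosen for the sketch.

```python
import hashlib


def row_digest(row: dict) -> str:
    """Checksum one row from a canonical rendering of its columns."""
    canonical = repr(sorted(row.items())).encode()
    return hashlib.sha256(canonical).hexdigest()


def rows_to_sync(local: dict, remote: dict) -> set:
    """Return clustering keys whose rows differ or exist on one replica only."""
    local_d = {key: row_digest(row) for key, row in local.items()}
    remote_d = {key: row_digest(row) for key, row in remote.items()}
    all_keys = local_d.keys() | remote_d.keys()
    return {key for key in all_keys if local_d.get(key) != remote_d.get(key)}


# Two replicas of one partition, keyed by clustering key.
replica_a = {1: {"v": "a"}, 2: {"v": "b"}, 3: {"v": "c"}}
replica_b = {1: {"v": "a"}, 2: {"v": "B"}}
print(sorted(rows_to_sync(replica_a, replica_b)))
```

Only the mismatched rows need to cross the wire; a partition-level checksum would flag the whole partition and stream every row in it.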

Scylla is designed with highly optimized memory management down to the application binaries. Since each shard owns a chunk of RAM and a CPU core, Scylla binds the CPU to the RAM within the same socket and makes sure that all accesses are done within the same socket. A NUMA-unfriendly deployment causes memory access to be twice as expensive.

Scylla can linearly scale up performance, even on the largest machines you throw at it, such as the AWS i3en.24xlarge with 60TB of storage. It takes the same amount of time to compact or stream as a small i3en.xlarge. Cassandra has issues with nodes larger than 2TB. The JVM cannot scale.

Heat-weighted load balancing smooths rolling node upgrades and reboots by letting a cold node ramp up slowly into serving requests while its cache is being populated.
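One way to picture this is a coordinator that weights each replica by how warm its cache is. The sketch below is a simplified model rather than Scylla's algorithm: the cache hit rate as the heat signal, the `floor` value and the function names are assumptions made for illustration.

```python
import random


def replica_weights(hit_rates: dict, floor: float = 0.05) -> dict:
    """Turn per-node cache hit rates into normalized routing weights.

    The floor keeps a cold node receiving a trickle of reads so that its
    cache can warm up (the floor value is an assumption for this sketch).
    """
    raw = {node: max(rate, floor) for node, rate in hit_rates.items()}
    total = sum(raw.values())
    return {node: weight / total for node, weight in raw.items()}


def pick_replica(hit_rates: dict, rng=random) -> str:
    """Choose a replica at random, in proportion to its weight."""
    weights = replica_weights(hit_rates)
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[n] for n in nodes], k=1)[0]


# Node "c" has just restarted, so its cache is empty.
hit_rates = {"a": 0.9, "b": 0.9, "c": 0.0}
print(replica_weights(hit_rates))
```

As the restarted node's hit rate climbs, its weight rises with it, so traffic shifts back gradually instead of hitting a cold cache all at once.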

Scylla’s incremental compaction strategy (ICS) enhances the existing STCS strategy by dividing the SSTables into increments, thus eliminating the typical requirement of 50% free space in your drive. ICS reduces your total storage by 37%.

Stop optimizing flags: no more Garbage Collection (GC) tuning and surprises. JVMs are good for management applications but not for high speed infrastructure. No need to compute the heap size, no need to divide the RAM between the JVM, the off-heap and the page cache. Cassandra suffers from the worst of all worlds: having to manage memory (pools, off heap), ongoing tuning and slowdowns due to the JVM.

Stop worrying about and tracking compaction. Scylla’s I/O scheduler prioritizes compaction below the read/write operation class. When there’s a spike in queries, Scylla automatically queues compaction activity. When there is CPU/disk idle time, Scylla will run compaction at full speed. All cores run compaction in parallel. No tuning is required. Maximize your disk speed and improve your query latency.

Using control theory, Scylla makes the database less fragile by dynamically tuning the way resources are used instead of requiring an operator to adjust an overwhelming number of configurations on the fly. Forget about tuning your database! Scylla runs a benchmark to measure your disk and will make all of the Linux configurations on your behalf — from RAID setup to clock drift and fstrim disk scheduling.

Scylla allows OLTP and OLAP workloads to share a cluster. The built-in scheduler prioritizes transactions and tasks based on shares of system resources assigned per user, balancing requests to maintain the desired service level agreements (SLAs) for each service. This allows you to run a single cluster scaled to support both types of operations, simplifying your architecture and saving you on hardware provisioning.
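The arithmetic behind share-based prioritization is simple proportional division. The sketch below assumes hypothetical share values for two workloads and shows only how shares translate into resource fractions under contention; it does not model the scheduler itself.

```python
def resource_fractions(shares: dict) -> dict:
    """Convert per-workload shares into fractions of contended resources."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one workload needs a positive share")
    return {name: value / total for name, value in shares.items()}


# Hypothetical share assignment: production OLTP gets 4x the analytics share.
fractions = resource_fractions({"oltp": 800, "olap": 200})
print(fractions)
```

Under these assumed shares the OLTP workload is entitled to 80% of contended resources and the analytics workload to 20%.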

Cassandra uses several separate caches (key cache, row cache, on/off heap, and Linux OS cache) that require an operator to analyze and correctly size, a manual process that will never be able to keep up with users’ dynamic workloads. Scylla eliminates competing caches with a unified cache system that automatically tunes itself. There is no need for external caches, either.

Scylla’s metrics are based on Prometheus for collection and Grafana dashboards for presentation. Scylla contributed to Wireshark to add support for CQL and also for its internal RPC for better traceability. Scylla uses systemd and automatically configures Linux on your behalf.

Scylla allows tables to have global secondary indexes, not just indexes local to a node. In Cassandra, only local indexes are supported, and they aren’t scalable. With Scylla you can query your cluster in more ways and have a richer data model.

Scylla employs the best open source experts and has a legacy of consistent open source contributions. We are committed to open source. Seastar, Scylla’s core engine, is used by the Ceph storage engine and many others. The GoCQLX driver was developed by the Scylla Manager team. Scylla has enhanced the Linux XFS filesystem in order to make it more asynchronous. We contributed kernel code for system call efficiency and have made numerous other contributions.

Starting with Scylla 4.0, node operations such as streaming and decommission are based on repair algorithms under the hood. This allows you to pause or restart them and resume from the position they had reached before the restart. It saves a lot of time just when you need it the most.

Scylla allows you to track and stream table changes in a consistent and easy manner. Change data is stored in Scylla as a table that developers can query like any other table. The data is consistent across the replica set and can provide the previous version of the data changed. It surpasses Cassandra’s CDC in terms of ease of use and functionality.

Scylla allows you to run on fewer, larger nodes. With Cassandra, when the time comes to scale your deployment, it will take at least 10x longer to expand your cluster since Cassandra can add only one node at a time. That’s too long to react to changes, so you’re forced to overprovision.

Scylla allows you more choices with more compatible APIs. You are free to choose among multiple DB APIs and at any point change your physical deployment or even your database vendor and protocol. Our DynamoDB API is now GA.

Let’s do this

Getting started takes only a few minutes. Scylla has an installer for every major platform. If you get stuck, we’re here to help.