Scylla vs. Cassandra at Samsung: YCSB Benchmark

Performance Report: YCSB Benchmark Results by Samsung

Scylla vs Apache Cassandra Benchmark Summary

Samsung MSL (Memory Solutions Lab) benchmarked cloud serving systems with YCSB to evaluate NoSQL database performance. We are thrilled to share the Samsung benchmark results, which reiterate previous benchmark findings that ScyllaDB performance is 10X better than Apache Cassandra. If you have high-end hardware, you can expect the same results. On smaller machines, the difference is in the range of 1.5X to 3X. We recommend using larger machines to reduce both your node count and your Total Cost of Ownership.

YCSB Test Methodology (tools, setup and configuration)

The test setup consisted of a three-server ScyllaDB cluster and nine machines acting as YCSB clients. Each server was equipped with four NVMe SSDs, organized into a software RAID 0 array with an XFS filesystem. The database was populated with a 2TB dataset, replicated across the three servers, with compression disabled. An explicit effort was made to set up a tuned Apache Cassandra 3.9 on Java 1.8 with a G1 garbage collection configuration. Four different YCSB workloads were used:

Workload | Operations | Application Examples
A – Update Heavy | Read: 50%, Update: 50% | Session store recording recent actions in a user session
B – Read Heavy | Read: 95%, Update: 5% | Photo tagging: can add a tag in an update, but most operations are to read tags
C – Read Only | Read: 100% | User profile cache, where profiles are constructed elsewhere (e.g. Hadoop)
D – Read Latest | Read: 95%, Insert: 5% | User status update
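
To make the operation mixes in the table above concrete, here is a minimal Python sketch that draws operations according to each workload's proportions. It is an illustration only, not YCSB's own code: the `WORKLOADS` dict and the `pick_operation` and `sample_mix` helpers are hypothetical names introduced here. In YCSB itself, these mixes are set through workload properties such as `readproportion`, `updateproportion` and `insertproportion`.

```python
import random
from collections import Counter

# Operation mixes copied from the workload table (fractions of all requests).
# This dict is an illustrative stand-in, not a YCSB data structure.
WORKLOADS = {
    "A": {"read": 0.50, "update": 0.50, "insert": 0.00},  # Update Heavy
    "B": {"read": 0.95, "update": 0.05, "insert": 0.00},  # Read Heavy
    "C": {"read": 1.00, "update": 0.00, "insert": 0.00},  # Read Only
    "D": {"read": 0.95, "update": 0.00, "insert": 0.05},  # Read Latest
}


def pick_operation(workload, rng):
    """Draw one operation type according to the workload's mix."""
    mix = WORKLOADS[workload]
    ops = list(mix)
    return rng.choices(ops, weights=[mix[op] for op in ops], k=1)[0]


def sample_mix(workload, n=100_000, seed=42):
    """Hypothetical helper: empirical operation frequencies over n draws."""
    rng = random.Random(seed)
    counts = Counter(pick_operation(workload, rng) for _ in range(n))
    return {op: counts[op] / n for op in WORKLOADS[workload]}


if __name__ == "__main__":
    for name in WORKLOADS:
        print(name, sample_mix(name))
```

With enough draws, the sampled frequencies converge to the proportions in the table, which is how a load generator ends up issuing, for example, roughly one update for every read in workload A.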

YCSB Benchmark Results

With both databases using the same 2TB dataset in runs over two hours, YCSB shows ScyllaDB outperforming Apache Cassandra by a factor of 10X to 37X. The Samsung team also ran Apache Cassandra with a small 50GB dataset that fits in server RAM and compared it to ScyllaDB running the 2TB dataset with a 100% hit rate. In that comparison, ScyllaDB is faster than Apache Cassandra by a factor of 4.4X to 8.6X while storing 40X the data. The team then repeated the test with ScyllaDB running the 2TB dataset at only a 60% hit rate (i.e. NVMe SSDs serve 40% of the requests), and ScyllaDB is still faster than Apache Cassandra by a factor of 2.3X to 3X while storing 40X more data.
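
The "40X the data" figure follows directly from the two dataset sizes. The short Python sketch below reproduces that arithmetic and adds a rough record-count estimate. It assumes decimal units (1 TB = 10^12 bytes) and records of about 1 KB, based on the 10-column, 1KB schema mentioned in the latency section; neither assumption is confirmed by the report, so treat the record count as an order-of-magnitude estimate.

```python
GB = 10**9   # assumption: decimal gigabytes
TB = 10**12  # assumption: decimal terabytes

scylla_dataset_bytes = 2 * TB       # ScyllaDB dataset size from the report
cassandra_dataset_bytes = 50 * GB   # small Apache Cassandra dataset that fits in RAM

# Ratio behind the "40X the data" claim.
data_ratio = scylla_dataset_bytes / cassandra_dataset_bytes

# Assumption: each record is about 1 KB (10 columns), ignoring storage overhead.
RECORD_BYTES = 1_000
approx_records = scylla_dataset_bytes // RECORD_BYTES

if __name__ == "__main__":
    print(f"data ratio: {data_ratio:.0f}X")
    print(f"approximate records in 2TB: {approx_records:,}")
```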

Measuring ScyllaDB Latency

The Samsung team selected a load of 50% of the maximum throughput, the top of the anticipated working range. They measured latency for each workload, starting from a ~60-80% hit rate and going up to a ~100% hit rate. Average latency varies from about 0.5ms to 2.6ms depending on the hit rate and the workload. Yes, a cluster of 3 machines with a replication factor of 3 can deliver several hundred thousand IOPS with millisecond latency using a 10-column, 1KB schema.

Workload | Hit Rate % and Avg. Latency (lower hit rate) | Hit Rate % and Avg. Latency (higher hit rate)
A | Hit rate: 58%, Latency: 2.64 ms | Hit rate: 97%, Latency: 1.56 ms
B | Hit rate: 64%, Latency: 1.72 ms | Hit rate: 97%, Latency: 0.69 ms
C | Hit rate: 64%, Latency: 1.71 ms | Hit rate: 99%, Latency: 0.46 ms
D | Hit rate: 82%, Latency: 1.25 ms | Hit rate: 98%, Latency: 0.61 ms
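
To see how much the hit rate matters for each workload, the following Python sketch computes the ratio between the lower-hit-rate and higher-hit-rate average latencies from the table above. The `LATENCY` dict and the `latency_ratio` helper are hypothetical names used only for this illustration; the numbers are copied from the table.

```python
# (hit rate in %, average latency in ms) pairs copied from the latency table.
LATENCY = {
    "A": {"low": (58, 2.64), "high": (97, 1.56)},
    "B": {"low": (64, 1.72), "high": (97, 0.69)},
    "C": {"low": (64, 1.71), "high": (99, 0.46)},
    "D": {"low": (82, 1.25), "high": (98, 0.61)},
}


def latency_ratio(workload):
    """How many times higher the average latency is at the lower hit rate."""
    _, low_hit_ms = LATENCY[workload]["low"]
    _, high_hit_ms = LATENCY[workload]["high"]
    return low_hit_ms / high_hit_ms


if __name__ == "__main__":
    for name, points in LATENCY.items():
        low_rate, _ = points["low"]
        high_rate, _ = points["high"]
        print(f"{name}: {low_rate}% -> {high_rate}% hit rate, "
              f"latency ratio {latency_ratio(name):.2f}x")
```

The read-only workload C shows the largest gap between the two measurements, and the update-heavy workload A the smallest.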

Download the full Samsung benchmark report, ‘ScyllaDB and Samsung NVMe SSDs Accelerate NoSQL Database Performance’, to get all the details.
