Samsung Benchmark

ScyllaDB vs Cassandra – Performance Benchmark by Samsung

Background

Samsung MSL (Memory Solutions Lab) recently released benchmark results from a YCSB evaluation they conducted. We are thrilled to share Samsung’s results, which reiterate previous benchmark findings that Scylla performs 10X better than Cassandra. If you have high-end hardware, you can expect the same results. On smaller machines, the difference is in the range of 1.5X to 3X. We recommend using larger machines to reduce both your node count and your Total Cost of Ownership.

Test Methodology (tools, setup and configuration)

The test bed consisted of a three-server ScyllaDB cluster and nine machines acting as YCSB clients. Each server was equipped with four NVMe SSDs, organized into a level 0 software RAID and formatted with an XFS filesystem. The database was populated with a 2TB dataset, replicated across the three servers, with compression disabled. An explicit effort was made to set up a well-tuned Cassandra 3.9 on Java 1.8 with a G1 garbage collection configuration. Four different YCSB workloads were used:

| Workload | Operations | Application Examples |
| --- | --- | --- |
| A – Update Heavy | Read: 50%, Update: 50% | Session store recording recent actions in a user session |
| B – Read Heavy | Read: 95%, Update: 5% | Photo tagging: can add a tag in an update, but most operations are to read tags |
| C – Read Only | Read: 100% | User profile cache, where profiles are constructed elsewhere (e.g. Hadoop) |
| D – Read Latest | Read: 95%, Insert: 5% | User status update |
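To make the operation mixes concrete, here is a minimal Python sketch that draws operations in the proportions listed for each workload. It is not YCSB itself and is not taken from Samsung's report: the names `WORKLOADS` and `sample_operations` are hypothetical, and the sketch ignores how YCSB chooses which keys to touch.

```python
import random
from collections import Counter

# Operation mixes copied from the workload table (fractions sum to 1.0).
WORKLOADS = {
    "A": {"read": 0.50, "update": 0.50},
    "B": {"read": 0.95, "update": 0.05},
    "C": {"read": 1.00},
    "D": {"read": 0.95, "insert": 0.05},
}


def sample_operations(workload, count, seed=0):
    """Draw `count` operation names according to the workload's mix.

    Hypothetical helper for illustration; a fixed seed keeps runs repeatable.
    """
    mix = WORKLOADS[workload]
    rng = random.Random(seed)
    return rng.choices(list(mix), weights=list(mix.values()), k=count)


if __name__ == "__main__":
    for name in WORKLOADS:
        counts = Counter(sample_operations(name, 100_000))
        print(name, dict(counts))
```

Running it prints per-workload operation counts that should land close to the percentages in the table, which is a quick way to sanity-check a client configuration before a long run.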

The Results

Comparing ScyllaDB with Cassandra on the same 2TB dataset over a two-hour run shows that ScyllaDB outperforms Cassandra by a factor of 10X to 37X. The Samsung team also ran Cassandra with a small 50GB dataset that fits in server RAM and compared it to ScyllaDB running the 2TB dataset with a 100% hit rate. In that comparison, ScyllaDB is faster than Cassandra by a factor of 4.4X to 8.6X while storing 40X the data. The team then repeated the test with ScyllaDB running the 2TB dataset at only a 60% hit rate (i.e. the NVMe SSDs serve 40% of the requests). Even then, ScyllaDB is faster than Cassandra by a factor of 2.3X to 3X while storing 40X the data.
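The "40X the data" and "40% of the requests" figures follow directly from the numbers quoted above. A small Python sketch of the arithmetic, assuming decimal units (1 TB = 1000 GB); the constant and function names are illustrative only:

```python
# Figures quoted in the results paragraph (decimal units assumed: 1 TB = 1000 GB).
SCYLLA_DATASET_GB = 2000
CASSANDRA_DATASET_GB = 50
HIT_RATE_PERCENT = 60


def dataset_ratio(large_gb, small_gb):
    """How many times larger the big dataset is than the small one."""
    return large_gb / small_gb


def disk_served_percent(hit_rate_percent):
    """Percentage of requests that miss the cache and are served by the SSDs."""
    return 100 - hit_rate_percent


if __name__ == "__main__":
    ratio = dataset_ratio(SCYLLA_DATASET_GB, CASSANDRA_DATASET_GB)
    print(f"{ratio:.0f}X the data")
    print(f"{disk_served_percent(HIT_RATE_PERCENT)}% of requests served from NVMe")
```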

Measuring ScyllaDB Latency

The Samsung team selected a load of 50% of the maximum throughput, the top of the anticipated working range. They measured average latency for each workload, from a ~60-80% hit rate up to a ~100% hit rate. The results range from roughly 0.5ms to 2.6ms, depending on the hit rate and the workload. In other words, a cluster of 3 machines with a replication factor of 3 can sustain several hundred thousand IOPS at millisecond latency using a 10-column, 1KB schema.

| Workload | Hit Rate (low) | Avg. Latency (low hit rate) | Hit Rate (high) | Avg. Latency (high hit rate) |
| --- | --- | --- | --- | --- |
| A | 58% | 2.64 ms | 97% | 1.56 ms |
| B | 64% | 1.72 ms | 97% | 0.69 ms |
| C | 64% | 1.71 ms | 99% | 0.46 ms |
| D | 82% | 1.25 ms | 98% | 0.61 ms |
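To see how much the hit rate matters for each workload, the table's values can be turned into a simple ratio of low-hit-rate latency to high-hit-rate latency. The Python sketch below does only that; `LATENCY` and `latency_reduction` are hypothetical names, and the ratio is a rough comparison rather than a metric from Samsung's report.

```python
# (hit rate %, average latency in ms) pairs copied from the latency table:
# first pair is the lower hit rate, second pair is the higher hit rate.
LATENCY = {
    "A": ((58, 2.64), (97, 1.56)),
    "B": ((64, 1.72), (97, 0.69)),
    "C": ((64, 1.71), (99, 0.46)),
    "D": ((82, 1.25), (98, 0.61)),
}


def latency_reduction(workload):
    """Ratio of average latency at the lower hit rate to the higher hit rate."""
    (_, low_hit_ms), (_, high_hit_ms) = LATENCY[workload]
    return low_hit_ms / high_hit_ms


if __name__ == "__main__":
    for name in LATENCY:
        ratio = latency_reduction(name)
        print(f"{name}: latency is {ratio:.2f}x lower at the higher hit rate")
```

On these numbers the read-only workload C benefits most from a high hit rate, while the update-heavy workload A benefits least.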

Download Samsung’s full report, ‘ScyllaDB and Samsung NVMe SSDs Accelerate NoSQL Database Performance’, to get all the details.

Let’s do this

Getting started takes only a few minutes. Scylla has an installer for every major platform and is well documented. If you get stuck, we’re here to help.