
ScyllaDB vs Apache Cassandra – Performance Benchmark by Samsung

Background

Samsung MSL (Memory Solutions Lab) recently released benchmark results from a YCSB evaluation they conducted. We are thrilled to share Samsung’s results, which reiterate previous benchmark findings that ScyllaDB performs 10X better than Apache Cassandra. If you have high-end hardware, you can expect the same results. On smaller machines, the difference is in the range of 1.5X to 3X. We recommend using larger machines to reduce both your node count and your Total Cost of Ownership.

Test Methodology (tools, setup, and configuration)

The ScyllaDB cluster consisted of three servers, with nine additional machines acting as YCSB clients. Each server was equipped with four NVMe SSDs organized into a software RAID 0 array and formatted with an XFS filesystem. The database was populated with a 2TB dataset, replicated across the three servers, with compression disabled. An explicit effort was made to set up a tuned Apache Cassandra 3.9 on Java 1.8 with a G1 garbage collection configuration. Four different YCSB workloads were used:

Workload | Operations | Application Examples
A – Update Heavy | Read: 50%, Update: 50% | Session store recording recent actions in a user session
B – Read Heavy | Read: 95%, Update: 5% | Photo tagging: can add a tag in an update, but most operations are to read tags
C – Read Only | Read: 100% | User profile cache, where profiles are constructed elsewhere (e.g. Hadoop)
D – Read Latest | Read: 95%, Insert: 5% | User status update
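
To make the operation mixes above concrete, here is a minimal Python sketch that encodes the four workloads as read/update/insert proportions and draws a random stream of operations from one of them. It is only an illustration of what the percentages mean: the names `WORKLOAD_MIX` and `sample_operations` are hypothetical, and this is not how YCSB itself is implemented or how Samsung drove the test.

```python
import random

# Operation mix per workload, copied from the table above.
# The dictionary layout and names are illustrative assumptions.
WORKLOAD_MIX = {
    "A": {"read": 0.50, "update": 0.50},  # Update Heavy
    "B": {"read": 0.95, "update": 0.05},  # Read Heavy
    "C": {"read": 1.00},                  # Read Only
    "D": {"read": 0.95, "insert": 0.05},  # Read Latest
}


def sample_operations(workload, count, seed=0):
    """Draw `count` operation names according to the workload's mix."""
    mix = WORKLOAD_MIX[workload]
    rng = random.Random(seed)  # seeded so repeated runs give the same stream
    return rng.choices(list(mix), weights=list(mix.values()), k=count)


if __name__ == "__main__":
    ops = sample_operations("B", 10_000)
    # Fraction of reads in the sampled stream; should be close to 0.95.
    print(ops.count("read") / len(ops))
```

Sampling from a fixed mix like this shows why workload C never writes, while A splits its requests evenly between reads and updates.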


The Results

A comparison of ScyllaDB and Apache Cassandra, using the same 2TB dataset and run over two hours, shows that ScyllaDB outperforms Apache Cassandra by a staggering factor of 10X to 37X.

The Samsung team ran Apache Cassandra with a small 50GB dataset that fits in server RAM and compared it to ScyllaDB running a 2TB dataset with a 100% hit rate. The results show that ScyllaDB is 4.4X to 8.6X faster than Apache Cassandra while storing 40X the data.

Moreover, the Samsung team repeated the test, this time with ScyllaDB running the 2TB dataset with only a 60% hit rate (i.e. the NVMe SSDs serve 40% of the requests). Even so, ScyllaDB is 2.3X to 3X faster than Apache Cassandra while storing 40X the data.
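
The 40X figure follows directly from the two dataset sizes. A minimal sketch of the arithmetic, assuming decimal units (1TB = 1000GB); the function name `dataset_ratio` is hypothetical:

```python
GB_PER_TB = 1000  # assumption: decimal units, 1TB = 1000GB


def dataset_ratio(scylla_tb, cassandra_gb):
    """Return how many times larger the ScyllaDB dataset is."""
    return scylla_tb * GB_PER_TB / cassandra_gb


if __name__ == "__main__":
    # 2TB for ScyllaDB versus 50GB for Apache Cassandra.
    print(dataset_ratio(2, 50))  # 40.0
```

With binary units (1TB = 1024GB) the ratio would come out slightly higher, at about 41X.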

Measuring ScyllaDB Latency

The Samsung team selected a load of 50% of the maximum throughput, the top anticipated working range.
They measured latency for each workload, starting from a ~60-80% hit rate and going up to a ~100% hit rate. The results vary from 0.6ms to 2.6ms depending on the hit rate and the workload. Yes, a cluster of 3 machines with a replication factor of 3 can deliver hundreds of thousands of IOPS at millisecond latency using a 10-column, 1KB schema.

Workload | Hit rate (lower) | Avg. latency | Hit rate (higher) | Avg. latency
A | 58% | 2.64 ms | 97% | 1.56 ms
B | 64% | 1.72 ms | 97% | 0.69 ms
C | 64% | 1.71 ms | 99% | 0.46 ms
D | 82% | 1.25 ms | 98% | 0.61 ms
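
One way to read the latency table is to ask how much slower each workload gets when the hit rate drops. The sketch below takes the figures from the table and computes the ratio of the lower-hit-rate latency to the higher-hit-rate latency for each workload. The names `LATENCY` and `slowdown_at_lower_hit_rate` are hypothetical, and the ratio is simply a convenient summary, not a metric from Samsung's report.

```python
# (lower hit rate %, avg latency ms, higher hit rate %, avg latency ms),
# copied from the table above.
LATENCY = {
    "A": (58, 2.64, 97, 1.56),
    "B": (64, 1.72, 97, 0.69),
    "C": (64, 1.71, 99, 0.46),
    "D": (82, 1.25, 98, 0.61),
}


def slowdown_at_lower_hit_rate(workload):
    """Ratio of average latency at the lower hit rate to the higher one."""
    _, low_hit_ms, _, high_hit_ms = LATENCY[workload]
    return low_hit_ms / high_hit_ms


if __name__ == "__main__":
    for name in LATENCY:
        print(f"{name}: {slowdown_at_lower_hit_rate(name):.2f}x")
```

In every workload the ratio is above 1, since more requests are served from the NVMe SSDs at the lower hit rate. Note that the hit-rate gap differs between workloads, so the ratios are not directly comparable with one another.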

Download Samsung’s full report, ‘ScyllaDB and Samsung NVMe SSDs Accelerate NoSQL Database Performance’, to get all the details.

Apache® and Apache Cassandra® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.

About Tomer Sandler

Tomer Sandler joined ScyllaDB as a solution architect after a 12-year career in software quality engineering, mostly in the storage and telecom lawful interception domains. Prior to ScyllaDB, Tomer held various QA management roles at Dell EMC, leading a group of QA engineers and information developers for ScaleIO storage.