ScyllaDB vs Apache Cassandra – Performance Benchmark by Samsung

Background

Samsung MSL (Memory Solutions Lab) recently released benchmark results from a YCSB evaluation they conducted. We are thrilled to share Samsung’s results, which confirm previous benchmark findings that Scylla performs 10X better than Apache Cassandra. On high-end hardware, you can expect similar results. On smaller machines, the difference is in the range of 1.5X to 3X. We recommend using larger machines to reduce both your node count and your Total Cost of Ownership.

Test Methodology (tools, setup and configuration)

The test setup consisted of a three-server ScyllaDB cluster and nine machines acting as YCSB clients. Each server was equipped with four NVMe SSDs combined into a software RAID 0 array with an XFS filesystem. The database was populated with a 2TB dataset, replicated across the three servers, with compression disabled. For the comparison, an explicit effort was made to set up a tuned Apache Cassandra 3.9 on Java 1.8 with a G1 garbage collection configuration. Four different YCSB workloads were used:

Workload | Operations | Application Examples
A – Update Heavy | Read: 50%, Update: 50% | Session store recording recent actions in a user session
B – Read Heavy | Read: 95%, Update: 5% | Photo tagging: can add a tag in an update, but most operations are to read tags
C – Read Only | Read: 100% | User profile cache, where profiles are constructed elsewhere (e.g. Hadoop)
D – Read Latest | Read: 95%, Insert: 5% | User status update
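
To make the four operation mixes concrete, here is a minimal Python sketch that draws a random stream of operations with the proportions listed in the table. It is only an illustration: the names WORKLOADS and sample_operations are hypothetical, it is not part of YCSB, and it ignores key distribution and record contents, which the real benchmark also controls.

```python
import random
from collections import Counter

# Operation mix per workload, copied from the table above (fractions sum to 1).
WORKLOADS = {
    "A": {"read": 0.50, "update": 0.50},  # Update Heavy
    "B": {"read": 0.95, "update": 0.05},  # Read Heavy
    "C": {"read": 1.00},                  # Read Only
    "D": {"read": 0.95, "insert": 0.05},  # Read Latest
}


def sample_operations(workload, count, seed=0):
    """Draw `count` operation names according to the workload's mix.

    Hypothetical helper for illustration only; a fixed seed keeps runs repeatable.
    """
    mix = WORKLOADS[workload]
    rng = random.Random(seed)
    return rng.choices(list(mix), weights=list(mix.values()), k=count)


if __name__ == "__main__":
    for name in WORKLOADS:
        counts = Counter(sample_operations(name, 10_000))
        print(name, dict(counts))
```

With a large enough sample, the observed counts for each workload should land close to the percentages in the table.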

The Results

Comparing ScyllaDB and Apache Cassandra on the same 2TB dataset over a two-hour run shows ScyllaDB outperforming Apache Cassandra by a staggering factor of 10X to 37X. The Samsung team also ran Apache Cassandra with a small 50GB dataset that fits in server RAM and compared it to ScyllaDB running the 2TB dataset with a 100% hit rate. In that test, ScyllaDB was faster than Apache Cassandra by a factor of 4.4X to 8.6X while storing 40X the data. The team then repeated the test with ScyllaDB running the 2TB dataset at only a 60% hit rate (i.e., the NVMe SSDs serve 40% of the requests). Even then, ScyllaDB was faster than Apache Cassandra by a factor of 2.3X to 3X while storing 40X the data.
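
The 40X figure follows directly from the two dataset sizes. The short Python sketch below reproduces that arithmetic and lists the three comparison scenarios side by side. It assumes decimal units (1TB = 1000GB), which the page does not state, and the names used (GB_PER_TB, size_ratio, SCENARIOS) are illustrative only.

```python
GB_PER_TB = 1000  # assumption: decimal units; the source does not specify

scylla_dataset_gb = 2 * GB_PER_TB  # 2TB dataset used with ScyllaDB
cassandra_dataset_gb = 50          # 50GB in-RAM dataset used with Apache Cassandra


def size_ratio(large_gb, small_gb):
    """Return how many times larger the first dataset is than the second."""
    return large_gb / small_gb


# (scenario, lower speedup bound, upper speedup bound), as quoted in the text.
SCENARIOS = [
    ("Both on the same 2TB dataset", 10.0, 37.0),
    ("Cassandra 50GB in RAM vs ScyllaDB 2TB, 100% hit rate", 4.4, 8.6),
    ("Cassandra 50GB in RAM vs ScyllaDB 2TB, 60% hit rate", 2.3, 3.0),
]

ratio = size_ratio(scylla_dataset_gb, cassandra_dataset_gb)
print(f"Dataset size ratio: {ratio:.0f}X")
for label, low, high in SCENARIOS:
    print(f"{label}: {low}X to {high}X faster")
```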

Measuring ScyllaDB Latency

The Samsung team selected a load of 50% of the maximum throughput, which they considered the upper end of the anticipated working range. They measured latency for each workload at hit rates ranging from ~60-80% up to ~100%. Average latency ranges from roughly 0.5ms to 2.6ms, depending on the hit rate and the workload. Yes, a cluster of 3 machines with a replication factor of 3 can deliver several hundred thousand IOPS at around a millisecond of latency using a 10-column, 1KB schema.

Workload | Hit Rate % and Avg. Latency (lower hit rate) | Hit Rate % and Avg. Latency (higher hit rate)
A | Hit rate: 58%, Latency: 2.64 ms | Hit rate: 97%, Latency: 1.56 ms
B | Hit rate: 64%, Latency: 1.72 ms | Hit rate: 97%, Latency: 0.69 ms
C | Hit rate: 64%, Latency: 1.71 ms | Hit rate: 99%, Latency: 0.46 ms
D | Hit rate: 82%, Latency: 1.25 ms | Hit rate: 98%, Latency: 0.61 ms
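
As a quick way to read the table, the Python sketch below computes how much the average latency drops for each workload when moving from the lower to the higher hit rate. The numbers are copied from the table. The names LATENCY and latency_reduction are illustrative, and the ratio is a simple division, not a metric taken from Samsung's report.

```python
# ((hit rate %, avg latency ms) at lower hit rate, same at higher hit rate),
# copied from the table above.
LATENCY = {
    "A": ((58, 2.64), (97, 1.56)),
    "B": ((64, 1.72), (97, 0.69)),
    "C": ((64, 1.71), (99, 0.46)),
    "D": ((82, 1.25), (98, 0.61)),
}


def latency_reduction(workload):
    """Ratio of average latency at the lower hit rate to that at the higher one."""
    (_, at_low_hit_ms), (_, at_high_hit_ms) = LATENCY[workload]
    return at_low_hit_ms / at_high_hit_ms


for name, ((low_hit, low_hit_ms), (high_hit, high_hit_ms)) in LATENCY.items():
    print(
        f"{name}: hit rate {low_hit}% -> {high_hit}%, "
        f"latency {low_hit_ms:.2f} ms -> {high_hit_ms:.2f} ms "
        f"({latency_reduction(name):.2f}x lower)"
    )
```

In this data, the read-only workload C shows the largest relative drop in latency as the hit rate approaches 100%.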

Download Samsung’s full report, ‘ScyllaDB and Samsung NVMe SSDs Accelerate NoSQL Database Performance’, to get all the details.

Let’s do this

Getting started takes only a few minutes. Scylla has an installer for every major platform and is well documented. If you get stuck, we’re here to help.

Apache® and Apache Cassandra® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.