CockroachDB vs ScyllaDB - Database Benchmark

An ‘apples-to-oranges’ benchmark reveals the relative performance of two distributed databases, ScyllaDB and CockroachDB. No surprise: The database built for low-latency outperforms the database built for strong consistency.

ScyllaDB vs CockroachDB Benchmark Overview

In this ScyllaDB vs Cockroach benchmark, we compare what we consider the best-of-breed in NoSQL versus the best-in-class in NewSQL CockroachDB database. Admittedly biased, we selected ourselves for NoSQL. For NewSQL, we chose CockroachDB. The latter represents distributed SQL, the segment which represents not only the SQL API but also the distributed relational database model.

Obviously, the ScyllaDB vs CockroachDB comparison is of the apples-and-oranges variety. We expect ScyllaDB will be faster — providing lower latencies and greater throughputs — while CockroachDB should have stronger consistency and be friendlier to use with its SQL interface. Our goal was to put forward a rational analysis and put a price tag on the differences in workloads that could be addressed by the two databases. So we won’t cover SQL JOINs, which are supported by CockroachDB alone, and we won’t cover time-series workloads, where ScyllaDB has a clear design advantage.

It’s worth noting that the distinction between NoSQL and NewSQL will evolve over time. And speaking of evolution, please visit our Project Circe page to see what we’re doing at ScyllaDB to bring stronger consistency (and more) to our NoSQL database.

For this ScyllaDB vs CockroachDB performance comparison, we ran the well-known and widely adopted Yahoo! Serving Benchmark (YCSB). The YCSB is an open-source specification and benchmarking framework that is commonly used to measure scalability and performance (defined as latency) in databases.

Summary of ScyllaDB vs CockroachDB Benchmark Results

The following benchmark tests demonstrate that ScyllaDB achieved superior, consistent p95 and p99 latencies compared to CockroachDB performance, with up to 10x better throughput. Additionally, ScyllaDB loaded a dataset of 1 billion keys, which CockroachDB failed to load. ScyllaDB outperformed CockroachDB even while querying the 10x larger dataset.

NOTE: This benchmark report distills the findings described in our original blog post, CockroachDB vs. ScyllaDB Benchmark.

Test Methodology and Configuration

We conducted tests for both databases using similar clusters, each consisting of 3 nodes running on storage optimized i3.4xlarge AWS EC2 instances within a single geographic region (eu-north-1), evenly spread across three availability zones with the standard replication factor (RF=3) and default configuration.

The initial goal was to populate both databases with a dataset of 1 billion keys. We were unable to successfully load 1 billion keys into the CockroachDB cluster. After 3-5 hours of loading, the CockroachDB cluster became unresponsive and generated critical errors in the logs. In response, we reduced the size of the dataset for the CockroachDB to 100 million keys. We loaded the 1 billion keys into ScyllaDB as planned. We also loaded 100 million keys to measure load time and performance. However, for ScyllaDB, we decided to use 1 billion key results in this report to highlight performance differences at a 10x larger scale.

To measure database performance under various workloads, we executed a series of benchmarks defined in the Yahoo! Cloud Serving Benchmark (YCSB) test suite. The YCSB tests encompass a variety of workloads (A through F), each of which represents a different database access pattern. The six benchmarks represent the following workloads:

Workload A — Update Heavy, 50/50 read/write ratio
Workload B — Read Mostly, 95/5 read/write ratio
Workload C — Read Only, 100/0 read/write ratio
Workload D — Read Latest, 95/0/5 read/update/insert ratio
Workload E — Short Range, 95/5 scan/insert ratio
Workload F — Read-Modify-Write, 50/50 read/read-modify-write ratio

ScyllaDB & CockroachDB Configuration Setup

	ScyllaDB Cluster	CockroachDB Cluster
AWS EC2 Instance type	i3.4xlarge	i3.4xlarge
Storage (ephemeral disks)	2 x 1.9TB NVMe SSD	2 x 1.9TB NVMe SSD
Cluster size	3-node cluster in a single region across 3 availability zones	3-node cluster in a single region across 3 availability zones
Total CPU and RAM	16 vCPU \| 122 GiB RAM	16 vCPU \| 122 GiB RAM
Database versions	ScyllaDB 4.2.0	CockroachDB 20.1.6
Benchmarking tools versions	brianfrankcooper/YCSB 0.18.0 SNAPSHOT with a ScyllaDB-native binding Token Aware load-balancing policy	brianfrankcooper/YCSB 0.17.0 benchmark with PostgreNoSQL binding CockroachDB v20.1.6 YCSB port to the Go programming language

Table 1 – The setup for running YCSB against ScyllaDB and CockroachDB

Initial Data Load Results

As stated above, the initial data load for both databases was meant to be 1 billion keys. While the loading ran smoothly for ScyllaDB, CockroachDB latency and throughput degraded from 12K operations per second (OPS) to 2.5K OPS. By our estimates, loading 1billion keys would require at least 111 hours to complete. In our test, the CockroachDB cluster became unresponsive after 5 hours, and also generated critical errors in the log exposing some CockroachDB limitations.

As a workaround, a smaller dataset of 100 million keys was used with CockroachDB. The smaller dataset loaded in 7 hours without any errors. Figure 1, below, demonstrates the slowing rate of insert statements on the CockroachDB cluster as service latency increased.

Figure 1: Initial Data Load performance for CockroachDB

YCSB Test Results

With datasets loaded, we were ready to run YCSB workloads against the ScyllaDB and CockroachDB clusters.

Tests were performed against ScyllaDB from a dedicated AWS EC2 c5.9xlarge instance acting as a client, with 32 vCPU, 72 GB RAM, 10GiB . The following command was used for all YCSB workloads, A, B, C, D, E and F:

$ bin/ycsb run scylla -P workloads/workloada -target 120000 -threads 840 -p recordcount=100000000 -p fieldcount=10 -p fieldlength=128 -p operationcount=300000000 -p scylla.coreconnections=280 -p scylla.maxconnections=280 -p scylla.username=cassandra -p scylla.password=cassandra -p scylla.tokenaware=true -p hosts=10.0.2.51,10.0.3.133,10.0.3.67 -p scylla.readconsistencylevel=ONE -p scylla.writeconsistencylevel=ONE

Tests were performed against CockroachDB from a dedicated AWS EC2 c5.9xlarge instance acting as a client, with 32 vCPU, 72 GB RAM, 10GiB . The following command was used for all YCSB workloads:

$ cockroach workload run ycsb 'postgresql://[email protected]:5000?sslmode=disable' --workload A --insert-count 100000000 --concurrency 840 --max-rate 16000 --max-ops 30000000 --tolerate-errors

For the majority of YCSB workloads, ScyllaDB demonstrated better performance with lower, predictable latencies, even while working on a dataset 10x larger than CockroachDB.

Figure 2, below, provides a side-by-side overview of ScyllaDB and CockroachDB performance results for YCSB workloads A – F. Overall, ScyllaDB’s throughput ranges from 100K to 180K, with p99 latencies between 5 ms and 30 ms. CockroachDB’s throughput ranges between 16K and 40K, while p99 latencies range between 52 ms and 530 ms.

Figure 2: Throughput and latency of ScyllaDB and CockroachDB per YCSB workload.

In the following sections, we examine CockroachDB vs ScyllaDB test results for each YCSB workload.

Workload A

YCSB workload A performs 50% single-row lookups and 50% single-column updates. According to YCSB documentation, workload A simulates a session store that records recent actions. As such, it is a highly contended workload.

For the 1 billion key dataset, ScyllaDB managed to serve 150K-200K OPS on most of the workloads at 75-80% CPU utilization with reasonable latency. As seen in Figure 3, on average, ScyllaDB achieved 150K OPS with p99 latencies under 12 ms at 75% load.

Meanwhile, for the 100 million key dataset, the CockroachDB managed to serve 16K OPS. As seen in Figure 3, on average, CockroachDB achieved 16K OPS with p99 latencies around 52 ms at 73% load.

Figure 3: ScyllaDB and CockroachDB results for YCSB workload A

ScyllaDB execution results for workload A for 100 million and 1 billion key datasets:

Dataset Size	Overall OPS	CPU Utilization	P95 ms	P99 ms
100M	120,000	60%	3.6	8.5
1B	150,000	80%	4	12

CockroachDB Execution results for Workload A for the 100 million key dataset:

Dataset Size	Overall OPS	CPU Utilization	P95 ms	P99 ms
100M	16,000	73%	27.3	52.4

Workload B

Workload B performs 95% single-row lookups and 5% single-column updates. While less contended than workload A, workload B still bottlenecks on contention as concurrency grows. Using column families slows down the single-row lookups but speeds up the updates. A real-world use case involving workload B is photo tagging. A tag is an update, but the majority of operations are reads.

As demonstrated in Figure 4, ScyllaDB achieved 150K OPS against the dataset of 1 billion keys. ScyllaDB’s p95 latencies were 1.9 ms, while p99 latencies were 7 ms.

Running against the dataset of 100 million keys, CockroachDB managed 35K OPS, with p95 latencies of 50.3 ms, and p99 latencies of 125.8 ms.

Figure 4: Detailed results for YCSB workload B

ScyllaDB results for workload B:

Dataset Size	Overall OPS	CPU Utilization	P95 ms	P99 ms
100M	120,000	60%	1	2.2
1B	150,000	50%	1.9	7

CockroachDB results for workload B:

Dataset Size	Overall OPS	CPU Utilization	P95 ms	P99 ms
100M	35,000	80%	50.3	125.8

Workload C

Workload C consists entirely of single-row lookups. A real-world application of workload C is a user profile cache, where profiles are constructed elsewhere – for example, in Hadoop. Using column families slows down single-row lookups, so, by default, they are not used.

As demonstrated in Figure 5, ScyllaDB achieved 150K OPS against the dataset of 1 billion keys. ScyllaDB’s p95 latencies were 2.4 ms, while p99 latencies were 7 ms.

Running against the dataset of 100 million keys, CockroachDB managed 38K OPS, with p95 latencies of 31.5 ms, and p99 latencies of 56.6 ms.

Figure 5: Detailed results for YCSB workload C

ScyllaDB results for workload C:

Dataset Size	Overall OPS	CPU Utilization	P95 ms	P99 ms
100M	140,000	55%	1.6	3.7
1B	150,000	50%	2.4	7

CockroachDB results for workload C:

Dataset Size	Overall OPS	CPU Utilization	P95 ms	P99 ms
100M	38,000	80%	31.5	56.6

Workload D

Workload D performs 95% single-row lookups and 5% single-row insertion. As such, it has no contention. A real-world example of workload D is user status updates, where users generally want to read the latest updates. Using column families slows down single-row lookups and single-row insertion, so, by default, they are unused.

As demonstrated in Figure 6, ScyllaDB achieved 180K OPS against the dataset of 1 billion keys. ScyllaDB’s p95 latencies were 3 ms, while p99 latencies were 5.5 ms. Of all workloads, this produced ScyllaDB’s best performance.

Running against the dataset of 100 million keys, CockroachDB achieved its best throughput results, managing 40K OPS, with p95 latencies of 100 ms, and p99 latencies of 176.2 ms. Further increasing the load did not increase throughput. Increased load only contributed to higher latency and its variance.

Figure 6: Detailed results for YCSB workload D

ScyllaDB results for workload D:

Dataset Size	Overall OPS	CPU Utilization	P95 ms	P99 ms
100M	130,000	50%	1.2	3
1B	180,000	75%	3	5.5

CockroachDB results for workload D:

Dataset Size	Overall OPS	CPU Utilization	P95 ms	P99 ms
100M	40,000	80%	100	176.2

Workload E

Workload E performs 95% multi-row scans and 5% single-row insertion and therefore has moderate contention. Using column families slows down multi-row scans and single-row insertion, so those are not used by default.

Workload E defines operations that are short-range scans, rather than queries against individual records. According to the YCSB benchmark, the workload involves “threaded conversations, where each scan is for the posts in a given thread assumed to be clustered by thread ID.” As expected, this workload produced the worst performance – only 10k operations per second.

YCSB Workload E is implemented in a way that does not play to the strengths of either ScyllaDB or CockroachDB features. In ScyllaDB, range partition scans are token-based, and tokens are randomly distributed throughout the cluster. As such, many random reads across multiple nodes are required to satisfy a single scan request. Unsurprisingly, this is not an efficient way to do range-scans. Yet while range scans are not the main advantage of ScyllaDB, ScyllaDB still did well enough and outperformed CockroachDB by 5x.

As demonstrated in Figure 7, ScyllaDB achieved 10K OPS against the dataset of 1 billion keys. ScyllaDB’s p95 latencies were 14 ms, while p99 latencies were 21 ms.

Running against the dataset of 100 million keys, CockroachDB managed 2K OPS, with p95 latencies of 352.9 ms, and p99 latencies of 536.9 ms.

Figure 7: Detailed results for YCSB workload E

ScyllaDB results for workload E:

Dataset Size	Overall OPS	CPU Utilization	P95 ms	P99 ms
100M	10,000	50%	12	18
1B	10,000	50%	14	21

CockroachDB results for workload E:

Dataset Size	Overall OPS	CPU Utilization	P95 ms	P99 ms
100M	2,000	50%	352.9	536.9

Workload F

And finally, workload F performs 50% single-row lookups and 50% single-column updates expressed as multi-statement read-modify-write transactions. As such, it is highly contended. A real-world example of workload F is a user database, in which user records are read and modified by the user, and which also records user activity.

As demonstrated in Figure 8, ScyllaDB achieved 100K OPS against the dataset of 1 billion keys. ScyllaDB’s p95 latencies were 10 ms, while p99 latencies were 26 ms.

Running against the dataset of 100 million keys, CockroachDB managed 6K OPS, with p95 latencies of 41.9 ms, and p99 latencies of 385.9 ms.

Figure 8: Detailed results for YCSB workload F

ScyllaDB results for workload F:

Dataset Size	Overall OPS	CPU Utilization	P95 ms	P99 ms
100M	100,000	50	12	18
1B	100,000	80	10	26

CockroachDB results for workload F:

Dataset Size	Overall OPS	CPU Utilization	P95 ms	P99 ms
100M	6,000	70	41.9	385.9

The Bottom Line

To summarize the results of the YCSB CockroachDB vs Sylla benchmark comparison:

Loading 10x the data into ScyllaDB took less than half the time it took for CockroachDB to load the much lesser dataset.
ScyllaDB handled 10x the amount of data.
ScyllaDB achieved 9.3x the throughput of CockroachDB latency at 1/4th the latency.

Modern, cloud-native applications often require high availability and predictable low latency. Such workloads are ideal for ScyllaDB. Requirements for strong consistency are less common. Workloads characterized by modest dataset size, and which require strong consistency guarantees and transactions along with a relational database model, involving JOINs, are ideal for a NewSQL database such as CockroachDB.

Apache® and Apache Cassandra® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. Amazon DynamoDB® and Dynamo Accelerator® are trademarks of Amazon.com, Inc. No endorsements by The Apache Software Foundation or Amazon.com, Inc. are implied by the use of these marks.

Why ScyllaDB?

Is ScyllaDB right for me?

ScyllaDB University

Check out the ScyllaDB Blog