CockroachDB vs ScyllaDB - Database Benchmark

An ‘apples-to-oranges’ benchmark reveals the relative performance of two distributed databases, ScyllaDB and CockroachDB. No surprise: The database built for low-latency outperforms the database built for strong consistency.

ScyllaDB vs CockroachDB Benchmark Overview

In this ScyllaDB vs Cockroach benchmark, we compare what we consider the best-of-breed in NoSQL versus the best-in-class in NewSQL CockroachDB database. Admittedly biased, we selected ourselves for NoSQL. For NewSQL, we chose CockroachDB. The latter represents distributed SQL, the segment which represents not only the SQL API but also the distributed relational database model.

Obviously, the ScyllaDB vs CockroachDB comparison is of the apples-and-oranges variety. We expect ScyllaDB will be faster — providing lower latencies and greater throughputs — while CockroachDB  should have stronger consistency and be friendlier to use with its SQL interface. Our goal was to put forward a rational analysis and put a price tag on the differences in workloads that could be addressed by the two databases. So we won’t cover SQL JOINs, which are supported by CockroachDB alone, and we won’t cover time-series workloads, where ScyllaDB has a clear design advantage.

It’s worth noting that the distinction between NoSQL and NewSQL will evolve over time. And speaking of evolution, please visit our Project Circe page to see what we’re doing at ScyllaDB to bring stronger consistency (and more) to our NoSQL database.

For this ScyllaDB vs CockroachDB performance comparison, we ran the well-known and widely adopted Yahoo! Serving Benchmark (YCSB). The YCSB is an open-source specification and benchmarking framework that is commonly used to measure scalability and performance (defined as latency) in databases.

Summary of ScyllaDB vs CockroachDB Benchmark Results

The following benchmark tests demonstrate that ScyllaDB achieved superior, consistent p95 and p99 latencies compared to CockroachDB performance, with up to 10x better throughput. Additionally, ScyllaDB loaded a dataset of 1 billion keys, which CockroachDB failed to load. ScyllaDB outperformed CockroachDB  even while querying the 10x larger dataset.

NOTE: This benchmark report distills the findings described in our original blog post, CockroachDB vs. ScyllaDB Benchmark.

Test Methodology and Configuration

We conducted tests for both databases using similar clusters, each consisting of 3 nodes running on storage optimized i3.4xlarge AWS EC2 instances within a single geographic region (eu-north-1), evenly spread across three availability zones with the standard replication factor (RF=3) and default configuration.

The initial goal was to populate both databases with a dataset of 1 billion keys. We were unable to successfully load 1 billion keys into the CockroachDB cluster. After 3-5 hours of loading, the CockroachDB cluster became unresponsive and generated critical errors in the logs. In response, we reduced the size of the dataset for the CockroachDB to 100 million keys. We loaded the 1 billion keys into ScyllaDB as planned. We also loaded 100 million keys to measure load time and performance. However, for ScyllaDB, we decided to use 1 billion key results in this report to highlight performance differences at a 10x larger scale.

To measure database performance under various workloads, we executed a series of benchmarks defined in the Yahoo! Cloud Serving Benchmark (YCSB) test suite. The YCSB tests encompass a variety of workloads (A through F), each of which represents a different database access pattern. The six benchmarks represent the following workloads:

  • Workload A — Update Heavy, 50/50 read/write ratio
  • Workload B — Read Mostly, 95/5 read/write ratio
  • Workload C — Read Only, 100/0 read/write ratio
  • Workload D — Read Latest, 95/0/5 read/update/insert ratio
  • Workload E — Short Range, 95/5 scan/insert ratio
  • Workload F — Read-Modify-Write, 50/50 read/read-modify-write ratio

ScyllaDB & CockroachDB Configuration Setup

 ScyllaDB ClusterCockroachDB Cluster
AWS EC2 Instance typei3.4xlargei3.4xlarge
Storage (ephemeral disks)2 x 1.9TB NVMe SSD2 x 1.9TB NVMe SSD
Cluster size3-node cluster in a single region across 3 availability zones3-node cluster in a single region across 3 availability zones
Total CPU and RAM16 vCPU | 122 GiB RAM16 vCPU | 122 GiB RAM
Database versions ScyllaDB 4.2.0CockroachDB 20.1.6
Benchmarking tools versions

brianfrankcooper/YCSB 0.18.0 SNAPSHOT with a ScyllaDB-native binding

Token Aware load-balancing policy

brianfrankcooper/YCSB 0.17.0 benchmark with PostgreNoSQL binding

CockroachDB v20.1.6 YCSB port to the Go programming language

Table 1 – The setup for running YCSB against ScyllaDB and CockroachDB

Initial Data Load Results

As stated above, the initial data load for both databases was meant to be 1 billion keys. While the loading ran smoothly for ScyllaDB, CockroachDB latency and throughput degraded from 12K operations per second (OPS) to 2.5K OPS. By our estimates, loading 1billion keys would require at least 111 hours to complete. In our test, the CockroachDB cluster became unresponsive after 5 hours, and also generated critical errors in the log exposing some CockroachDB limitations. 

As a workaround, a smaller dataset of 100 million keys was used with CockroachDB. The smaller dataset loaded in 7 hours without any errors. Figure 1, below, demonstrates the slowing rate of insert statements on the CockroachDB cluster as service latency increased.

crdb-load-2-overview-cluster

Figure 1: Initial Data Load performance for CockroachDB

YCSB Test Results

With datasets loaded, we were ready to run YCSB workloads against the ScyllaDB and CockroachDB clusters

Tests were performed against ScyllaDB from a dedicated AWS EC2 c5.9xlarge instance acting as a client, with 32 vCPU, 72 GB RAM, 10GiB . The following command was used for all YCSB workloads, A, B, C, D, E and F:

$ bin/ycsb run scylla -P workloads/workloada -target 120000 -threads 840 -p recordcount=100000000 -p fieldcount=10 -p fieldlength=128 -p operationcount=300000000 -p scylla.coreconnections=280 -p scylla.maxconnections=280 -p scylla.username=cassandra -p scylla.password=cassandra -p scylla.tokenaware=true -p hosts=10.0.2.51,10.0.3.133,10.0.3.67 -p scylla.readconsistencylevel=ONE -p scylla.writeconsistencylevel=ONE

Tests were performed against CockroachDB from a dedicated AWS EC2 c5.9xlarge instance acting as a client, with 32 vCPU, 72 GB RAM, 10GiB . The following command was used for all YCSB workloads:

$ cockroach workload run ycsb 'postgresql://[email protected]:5000?sslmode=disable' --workload A --insert-count 100000000 --concurrency 840 --max-rate 16000 --max-ops 30000000 --tolerate-errors

For the majority of YCSB workloads, ScyllaDB demonstrated better performance with lower, predictable latencies, even while working on a dataset 10x larger than CockroachDB.  

Figure 2, below, provides a side-by-side overview of ScyllaDB and CockroachDB performance results for YCSB workloads A – F. Overall, ScyllaDB’s throughput ranges from 100K to 180K, with p99 latencies between 5 ms and 30 ms. CockroachDB’s throughput ranges between 16K and 40K, while p99 latencies range between 52 ms and 530 ms.

Figure 2: Throughput and latency of ScyllaDB and CockroachDB per YCSB workload.

In the following sections, we examine CockroachDB vs ScyllaDB test results for each YCSB workload.

Workload A

YCSB workload A performs 50% single-row lookups and 50% single-column updates. According to YCSB documentation, workload A simulates a session store that records recent actions. As such, it is a highly contended workload.

For the 1 billion key dataset, ScyllaDB managed to serve 150K-200K OPS on most of the workloads at 75-80% CPU utilization with reasonable latency. As seen in Figure 3, on average, ScyllaDB achieved 150K OPS with p99 latencies under 12 ms at 75% load.

Meanwhile, for the 100 million key dataset, the CockroachDB managed to serve 16K OPS. As seen in Figure 3, on average, CockroachDB achieved 16K OPS with p99 latencies around 52 ms at 73% load.

Figure 3: ScyllaDB and CockroachDB results for YCSB workload A

ScyllaDB execution results for workload A for 100 million and 1 billion key datasets:

Dataset SizeOverall OPSCPU UtilizationP95 msP99 ms
100M120,00060%3.68.5
1B150,00080%412

CockroachDB Execution results for Workload A for the 100 million key dataset:

Dataset SizeOverall OPSCPU UtilizationP95 msP99 ms
100M16,00073%27.352.4

Workload B

Workload B performs 95% single-row lookups and 5% single-column updates. While less contended than workload A, workload B still bottlenecks on contention as concurrency grows. Using column families slows down the single-row lookups but speeds up the updates. A real-world use case involving workload B is photo tagging. A tag is an update, but the majority of operations are reads.

As demonstrated in Figure 4, ScyllaDB achieved 150K OPS against the dataset of 1 billion keys. ScyllaDB’s p95 latencies were 1.9 ms, while p99 latencies were 7 ms.

Running against the dataset of 100 million keys, CockroachDB managed 35K OPS, with p95 latencies of 50.3 ms, and p99 latencies of 125.8 ms.

Figure 4: Detailed results for YCSB workload B

ScyllaDB results for workload B:

Dataset SizeOverall OPSCPU UtilizationP95 msP99 ms
100M120,00060%12.2
1B150,00050%1.97

CockroachDB results for workload B:

Dataset SizeOverall OPSCPU UtilizationP95 msP99 ms
100M35,00080%50.3125.8

Workload C

Workload C consists entirely of single-row lookups. A real-world application of workload C is a user profile cache, where profiles are constructed elsewhere – for example, in Hadoop. Using column families slows down single-row lookups, so, by default, they are not used.

As demonstrated in Figure 5, ScyllaDB achieved 150K OPS against the dataset of 1 billion keys. ScyllaDB’s p95 latencies were 2.4 ms, while p99 latencies were 7 ms.

Running against the dataset of 100 million keys, CockroachDB managed 38K OPS, with p95 latencies of 31.5 ms, and p99 latencies of 56.6 ms.

Figure 5: Detailed results for YCSB workload C

ScyllaDB results for workload C:

Dataset SizeOverall OPSCPU UtilizationP95 msP99 ms
100M140,00055%1.63.7
1B150,00050%2.47

CockroachDB results for workload C:

Dataset SizeOverall OPSCPU UtilizationP95 msP99 ms
100M38,00080%31.556.6

Workload D

Workload D performs 95% single-row lookups and 5% single-row insertion. As such, it has no contention. A real-world example of workload D is user status updates, where users generally want to read the latest updates. Using column families slows down single-row lookups and single-row insertion, so, by default, they are unused. 

As demonstrated in Figure 6, ScyllaDB achieved 180K OPS against the dataset of 1 billion keys. ScyllaDB’s p95 latencies were 3 ms, while p99 latencies were 5.5 ms. Of all workloads, this produced ScyllaDB’s best performance. 

Running against the dataset of 100 million keys, CockroachDB achieved its best throughput results, managing 40K OPS, with p95 latencies of 100 ms, and p99 latencies of 176.2 ms. Further increasing the load did not increase throughput. Increased load only contributed to higher latency and its variance.

Figure 6: Detailed results for YCSB workload D

ScyllaDB results for workload D:

Dataset SizeOverall OPSCPU UtilizationP95 msP99 ms
100M130,00050%1.23
1B180,00075%35.5

CockroachDB results for workload D:

Dataset SizeOverall OPSCPU UtilizationP95 msP99 ms
100M40,00080%100176.2

Workload E

Workload E performs 95% multi-row scans and 5% single-row insertion and therefore has moderate contention. Using column families slows down multi-row scans and single-row insertion, so those are not used by default.

Workload E defines operations that are short-range scans, rather than queries against individual records. According to the YCSB benchmark, the workload involves “threaded conversations, where each scan is for the posts in a given thread assumed to be clustered by thread ID.” As expected, this workload produced the worst performance – only 10k operations per second.

YCSB Workload E is implemented in a way that does not play to the strengths of either ScyllaDB or CockroachDB features. In ScyllaDB, range partition scans are token-based, and tokens are randomly distributed throughout the cluster. As such, many random reads across multiple nodes are required to satisfy a single scan request. Unsurprisingly, this is not an efficient way to do range-scans. Yet while range scans are not the main advantage of ScyllaDB, ScyllaDB still did well enough and outperformed CockroachDB by 5x.

As demonstrated in Figure 7, ScyllaDB achieved 10K OPS against the dataset of 1 billion keys. ScyllaDB’s p95 latencies were 14 ms, while p99 latencies were 21 ms.

Running against the dataset of 100 million keys, CockroachDB managed 2K OPS, with p95 latencies of 352.9 ms, and p99 latencies of 536.9 ms.

Figure 7: Detailed results for YCSB workload E

ScyllaDB results for workload E:

Dataset SizeOverall OPSCPU UtilizationP95 msP99 ms
100M10,00050%1218
1B10,00050%1421

CockroachDB results for workload E:

Dataset SizeOverall OPSCPU UtilizationP95 msP99 ms
100M2,00050%352.9536.9

Workload F

And finally, workload F performs 50% single-row lookups and 50% single-column updates expressed as multi-statement read-modify-write transactions. As such, it is highly contended. A real-world example of workload F is a user database, in which user records are read and modified by the user, and which also records user activity.

As demonstrated in Figure 8, ScyllaDB achieved 100K OPS against the dataset of 1 billion keys. ScyllaDB’s p95 latencies were 10 ms, while p99 latencies were 26 ms.

Running against the dataset of 100 million keys, CockroachDB managed 6K OPS, with p95 latencies of 41.9 ms, and p99 latencies of 385.9 ms.

Figure 8: Detailed results for YCSB workload F

ScyllaDB results for workload F:

Dataset SizeOverall OPSCPU UtilizationP95 msP99 ms
100M100,000501218
1B100,000801026

CockroachDB results for workload F:

Dataset SizeOverall OPSCPU UtilizationP95 msP99 ms
100M6,0007041.9385.9

The Bottom Line

To summarize the results of the YCSB CockroachDB vs Sylla benchmark comparison:

  • Loading 10x the data into ScyllaDB took less than half the time it took for CockroachDB to load the much lesser dataset.
  • ScyllaDB handled 10x the amount of data.
  • ScyllaDB achieved 9.3x the throughput of CockroachDB latency at 1/4th the latency.

Modern, cloud-native applications often require high availability and predictable low latency. Such workloads are ideal for ScyllaDB. Requirements for strong consistency are less common. Workloads characterized by modest dataset size, and which require strong consistency guarantees and transactions along with a relational database model, involving JOINs, are ideal for a NewSQL database such as CockroachDB.