
Storing and retrieving large data sets in ScyllaDB 1.6 vs. Apache Cassandra 3.0.9


How much data can you store in a single ScyllaDB node?

A reduced node count translates to easier operations and lower capital expenses. Using ScyllaDB, developers and database operators can store and retrieve at least twice as much data per node as they can with Apache Cassandra-based systems.

In our recent benchmarks, ScyllaDB handled twice the amount of ingested data, while Apache Cassandra-based systems failed to sustain the workload or keep up with the volume of data being ingested.
Furthermore, a closer look at the workload shows that ScyllaDB ingested 1.3TB of data in less than 12 hours, including compaction. Apache Cassandra 3.0.9 took over 28 hours to ingest the same data, plus an additional 37 hours to complete the compaction process. Measured end to end (under 12 hours versus more than 65), ScyllaDB was more than five times faster than Apache Cassandra.
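The speed comparison can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the benchmark itself. It treats the quoted durations as bounds (an upper bound for ScyllaDB, a lower bound for Apache Cassandra), and it assumes decimal units (1 TB = 1,000,000 MB) when estimating average throughput. The function names are illustrative only.

```python
# Figures quoted in the post. Both durations are bounds, not exact measurements.
DATA_TB = 1.3
SCYLLA_HOURS = 12                # "less than 12 hours", including compaction
CASSANDRA_INGEST_HOURS = 28      # "over 28 hours" to ingest
CASSANDRA_COMPACTION_HOURS = 37  # additional time to finish compaction


def speedup(slow_hours, fast_hours):
    """Return how many times faster the fast run finished."""
    return slow_hours / fast_hours


def avg_mb_per_s(data_tb, hours):
    """Average end-to-end rate, assuming 1 TB = 1,000,000 MB (an assumption)."""
    return data_tb * 1_000_000 / (hours * 3600)


cassandra_total_hours = CASSANDRA_INGEST_HOURS + CASSANDRA_COMPACTION_HOURS
ratio = speedup(cassandra_total_hours, SCYLLA_HOURS)

print(f"Cassandra end to end: at least {cassandra_total_hours} hours")
print(f"Speedup: at least {ratio:.1f}x")
print(f"ScyllaDB average rate: at least {avg_mb_per_s(DATA_TB, SCYLLA_HOURS):.1f} MB/s")
print(f"Cassandra average rate: at most {avg_mb_per_s(DATA_TB, cassandra_total_hours):.1f} MB/s")
```

Because each input is a bound in ScyllaDB's disfavor, the computed ratio is a floor on the real speedup rather than an exact figure.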

ScyllaDB CPU profile for ingesting 1.3TB of data

Apache Cassandra CPU profile for ingesting 1.3TB of data

The benchmarks above ran on Amazon Web Services (AWS) Elastic Compute Cloud (EC2) instances of type i2.8xlarge: 32 vCPUs, 244GB of DRAM, and 8x800GB SSDs. Running ScyllaDB on larger instances lets operators shrink their clusters’ footprint and reduce the management burden without compromising performance or high availability. (Note: performance and pricing are even better with i3. Watch this space for a new blog post soon!)

You can read much more about the benchmark here.

Apache® and Apache Cassandra® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.

About Eyal Gutkind

Eyal Gutkind is a solution architect for ScyllaDB. Prior to ScyllaDB, Eyal held product management roles at Mirantis and DataStax. Prior to DataStax, Eyal spent 12 years with Mellanox Technologies in various engineering management and product marketing roles. Eyal holds a BSc. degree in Electrical and Computer Engineering from Ben Gurion University, Israel, and an MBA from the Fuqua School of Business at Duke University, North Carolina.