See all blog posts

Benchmarking MongoDB vs ScyllaDB: IoT Sensor Workload Deep Dive

benchANT’s comparison of ScyllaDB vs MongoDB in terms of throughput, latency, scalability, and cost for an IoT sensor workload

BenchANT recently benchmarked the performance and scalability of the market-leading general-purpose NoSQL database MongoDB and its performance-oriented challenger ScyllaDB. You can read a summary of the results in the blog Benchmarking MongoDB vs ScyllaDB: Performance, Scalability & Cost, see the key takeaways for various workloads in this technical summary,  and access all results (including the raw data) from the benchANT site.

This blog offers a deep dive into the tests performed for the IoT sensor workload. The IoT sensor workload is based on the YCSB and its default data model, but with an operation distribution of 90% insert operations and 10% read operations that simulate a real-world IoT application. The workload is executed with the latest request distribution patterns. This workload is executed against the small database scaling size with a data set of 250GB and against the medium scaling size with a data set of 500GB.

Before we get into the benchmark details, here is a summary of key insights for this workload.

  • ScyllaDB outperforms MongoDB with higher throughput and lower latency results for the sensor workload except for the read latency in the small scaling size
  • ScyllaDB provides constantly higher throughput that increases with growing data sizes up to 19 times
  • ScyllaDB provides lower (down to 20 times) update latency results compared to MongoDB
  • MongoDB provides lower read latency for the small scaling size, but ScyllaDB provides lower read latencies for the medium scaling size

Throughput Results for MongoDB vs ScyllaDB

The throughput results for the sensor workload show that the small ScyllaDB cluster is able to serve 60 kOps/s with a cluster utilization of ~89% while the small MongoDB cluster serves only 8 kOps/s under a comparable cluster utilization of 85-90%. For the medium cluster sizes, ScyllaDB achieves an average throughput of 236 kOps/s with ~88% cluster utilization and MongoDB 21 kOps/s with a cluster utilization of 75%-85%.

Scalability Results for MongoDB vs ScyllaDB

Analogous to the previous workloads, the throughput results allow us to compare the theoretical scale up factor for throughput with the actually achieved scalability. For ScyllaDB the maximal theoretical throughput scaling factor is 400% when scaling from small to medium. For MongoDB, the theoretical maximal throughput scaling factor is 600% when scaling size from small to medium.

The ScyllaDB scalability results show that ScyllaDB is able to nearly achieve linear scalability by achieving a throughput scalability of 393% of the theoretically possible 400%.

The scalability results for MongoDB show that it achieves a throughput scalability factor of 262% out of the theoretically possible 600%.

 

Throughput per Cost Ratio

In order to compare the costs/month in relation to the provided throughput, we take the MongoDB Atlas throughput/$ as baseline (i.e. 100%) and compare it with the provided ScyllaDB Cloud throughput/$.

The results show that ScyllaDB provides 6 times more operations/$ compared to MongoDB Atlas for the small scaling size and 11 times more operations/$ for the medium scaling size. Similar to the caching workload, MongoDB is able to scale the throughput with growing instance/cluster sizes, but the preserved operations/$ are decreasing.

 

Latency Results for MongoDB vs ScyllaDB

The P99 latency results for the sensor workload show that ScyllaDB and MongoDB provide constantly low P99 read latencies for the small and medium scaling size. MongoDB provides the lowest read latency for the small scaling size, while ScyllaDB provides the lowest read latency for the medium scaling size.

For the insert latencies, the results show a similar trend as for the previous workloads. ScyllaDB provides stable and low insert latencies, while MongoDB experiences up to 21 times higher update latencies.

Technical Nugget – Performance Impact of the Data Model

The default YCSB data model is composed of a primary key and a data item with 10 fields of strings that results in documents with 10 attributes for MongoDB and a table with 10 columns for ScyllaDB. We analyze how performance changes if a pure key-value data model is applied for both databases: a table with only one column for ScyllaDB and a document with only one field for MongoDB keeping the same record size of 1 KB.

Compared to the data model impact for the social workload, the throughput improvements for the sensor workload are clearly lower. ScyllaDB improves the throughput by 8% while for MongoDB there is no throughput improvement. In general, this indicates that using a pure k-v improves the performance of read-heavy workloads rather than write-heavy workloads.

Continue Comparing ScyllaDB vs MongoDB

Here are some additional resources for learning about the differences between MongoDB and ScyllaDB:

ScyllaDB Summit 2023 Speaker – Daniel Seybold, benchANT, Chief Technical Officer

About Daniel Seybold

Daniel started his career as PhD student in the area of cloud computing with a focus on distributed databases in the cloud. Further interests cover cloud orchestration, model-driven engineering, and performance evaluations of distributed systems. After completing his PhD, Daniel has co-founded the Benchmarking-as-a-Service platform benchANT where he is responsible for the product development.