ScyllaDB Elastic Scaling in Action [Demo]

Watch along to see how fast ScyllaDB X Cloud scales from 100K to 1M ops/sec and back down again – with single-digit millisecond P99 latency

ScyllaDB X Cloud is ScyllaDB’s fully managed database-as-a-service. It’s an elastic database designed to support variable or unpredictable workloads with consistently low latency and low costs.

We’ve previously blogged about how users can scale out and scale in almost instantly to match actual usage. For example, you can scale all the way from 100K ops/sec to 2M ops/sec in just minutes, with consistent single-digit millisecond P99 latency. This means you don’t need to overprovision for the worst-case scenario or suffer the lag traditionally associated with ramping up capacity in response to a sudden surge.

In this post, I want to show you how it looks in action: increasing capacity 10X, as well as scaling it back, in minutes.

Part 1: Scaling 10X Fast, with Single-Digit Millisecond P99 Latency

This first video provides a quick look at how fast ScyllaDB X Cloud scales out to increase capacity. It shows how ScyllaDB’s new tablets architecture lets you scale a cluster to support 10x or more workload capacity in minutes (vs. the usual hours or days). Simulating a massive sales event, we scale a cluster from a moderate 100K ops/sec up to 1M ops/sec.

As we start, the cluster is currently managing a moderate load of 100K ops/sec across three small nodes. Knowing that a surge of 1M ops/sec is imminent, we use the built-in calculator to precisely size our needs. By simply entering the desired read and write throughput and selecting the schema complexity, the system automatically determines the necessary vCPU requirement. In this case, we add three larger nodes to our existing setup.
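The post doesn’t describe the calculator’s actual formula, so the following is only a minimal sketch of the kind of arithmetic such a sizing step involves. The function name `required_vcpus`, the per-vCPU throughput figure, and the schema complexity multiplier are all hypothetical placeholders, not ScyllaDB X Cloud values. The 80/20 read/write split mirrors the baseline mix reported in Part 2 and is assumed to hold at peak.

```python
import math


def required_vcpus(read_ops: int, write_ops: int,
                   ops_per_vcpu: int, complexity_factor: float = 1.0) -> int:
    """Estimate vCPUs needed for a target throughput (illustrative only).

    ops_per_vcpu and complexity_factor are hypothetical inputs; a real
    sizing calculator may weigh reads, writes, and schema differently.
    """
    total_ops = read_ops + write_ops
    return math.ceil(total_ops * complexity_factor / ops_per_vcpu)


# Peak target from the demo: 1M ops/sec, assumed to keep an 80/20 read/write mix.
peak_vcpus = required_vcpus(read_ops=800_000, write_ops=200_000,
                            ops_per_vcpu=8_000)  # 8_000 is a made-up figure
print(peak_vcpus)
```

The point of the sketch is only that the inputs are throughput targets and a schema setting, and the output is a vCPU count that then maps onto node sizes.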

Once the new scaling policy is saved, you can watch the scaling happen as the nodes join and tablets are automatically streamed and rebalanced in parallel. In this demo, the entire scale-out process, including data rebalancing, completed in roughly 23 minutes, all while the cluster remained under load. You’ll see that the new nodes start serving requests immediately, even before the rebalancing is fully finished.

Finally, we simulate the 10x load jump to 1 million operations per second. You can see that even with mixed instance sizes, ScyllaDB balances the workload across the cluster, with the larger nodes serving more requests as expected. Most importantly, despite this massive increase in traffic, the cluster maintains single-digit millisecond P99 latencies throughout the entire event.
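To make "larger nodes serving more requests" concrete, here is a rough sketch that assumes load is spread in proportion to each node’s vCPU count. The 8 vCPUs per small node follows from the 24 vCPU, three-node baseline described in Part 2. The 32 vCPUs per larger node is an assumption based on the "8x" size mentioned there, and the node names and `expected_share` helper are hypothetical.

```python
def expected_share(node_vcpus: dict[str, int], total_ops: int) -> dict[str, float]:
    """Split total_ops across nodes in proportion to their vCPU counts."""
    total_vcpus = sum(node_vcpus.values())
    return {name: total_ops * vcpus / total_vcpus
            for name, vcpus in node_vcpus.items()}


# Mixed cluster after scale-out: three small nodes plus three larger ones.
# The 32-vCPU figure for the larger nodes is an assumption, not from the post.
nodes = {
    "small-1": 8, "small-2": 8, "small-3": 8,
    "large-1": 32, "large-2": 32, "large-3": 32,
}

shares = expected_share(nodes, total_ops=1_000_000)
for name, ops in shares.items():
    print(f"{name}: {ops:,.0f} ops/sec")
```

Under these assumptions each larger node would carry four times the traffic of a small one. The actual split in the demo is whatever the monitoring dashboard shows.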

Part 2: Achieving Rapid Parallel Scale-Down After Peak Workload

This next video demonstrates the process of scaling the ScyllaDB cluster back down to its original size following a simulated high-traffic sales event. You can see how the system handles a drop from 1M ops/sec down to its baseline load.

After running at 1M ops/sec for about 20 minutes, our simulated sales event has concluded, and the load drops back to its original 100K ops/sec. Once the load stabilizes and the monitoring overview panel confirms that we are back to 20K writes and 80K reads per second, we’re ready to scale the cluster back to its original size of 24 vCPUs. To do this, we simply update the scaling policy back to 24 vCPUs. That leaves us with the same three 2xlarge nodes we had before the simulated sales event started.

As the scaling process begins, we can watch the nodes leave the cluster in real time. By viewing the monitoring dashboard’s detailed panel, we can see an animation of the tablets streaming from the larger 8xlarge nodes back to the original nodes. Once that’s completed, the cluster is back to its original configuration of three nodes. The scale-down process took about 22 to 23 minutes, which is nearly identical to the time it took to scale out earlier (in the first video).

While scaling out has always been fast with tablets, scaling back down used to be a sequential process. Now, starting with ScyllaDB version 2026.1.3, the cluster can scale in parallel in both directions, out and back in. That makes it possible to handle a massive workload spike and return to baseline capacity all within about an hour.
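The "about an hour" figure follows from the timings reported in the two videos. A quick tally, using the upper end of the 22 to 23 minute scale-down estimate (the phase labels are just illustrative names):

```python
# Durations in minutes, as reported in the two demo videos.
phases = {
    "scale_out": 23,      # roughly 23 minutes, including rebalancing
    "peak_at_1m_ops": 20, # about 20 minutes at 1M ops/sec
    "scale_in": 23,       # about 22 to 23 minutes; upper end used here
}

total_minutes = sum(phases.values())
print(f"Spike handled and cluster back to baseline in ~{total_minutes} minutes")
```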

About Faisal Saeed

Faisal Saeed is a Senior Solutions Architect at ScyllaDB with more than 30 years of hands-on experience in database architecture and software development. He is dedicated to solving complex enterprise data problems and optimizing large-scale data systems.