Monster Scale Summit

Extreme scale engineering

Extreme Elasticity with Tablets, Raft and Kubernetes

Maciej Zimnoch, 10 minutes

In this Monster Scale Summit Presentation

Recent ScyllaDB versions have improved elasticity using Tablets and Raft-based Consistent Topology Changes, allowing for fast bootstrapping and parallel scaling. A demo presents doubling cluster size and autoscaling after crossing 90% disk utilization.

Monster Scale Summit 2025

Maciej Zimnoch, Senior Software Engineer, ScyllaDB

Maciej Zimnoch is a core developer of ScyllaDB Operator – a Kubernetes Operator for ScyllaDB.

Video Transcript

Maciej Zimnoch shows how tablets, Raft, and the Scylla Operator give ScyllaDB elasticity on Kubernetes. Previously, scaling required adding one node at a time, with gossip coordination and full data streaming, which took hours or days. With Raft, the cluster now reaches consensus on a topology change in seconds, and new nodes start serving before the background stream finishes. One demo doubles cluster size in about a minute and finishes rebalancing within three. A second demo autoscales when disk utilization crosses 90%, keeping utilization high while cutting waste.

Topics discussed

  • What single‑node bootstrap with gossip and full streaming looks like in pre‑tablet ScyllaDB clusters
  • How tablets, Raft, and consistent topology changes let nodes join in seconds and scale in parallel
  • How the Scylla Operator on Kubernetes doubles cluster size in one minute and finishes load balancing in three
  • How a Horizontal Pod Autoscaler adds one node per rack as soon as disk utilization exceeds 90%
  • Why tablets allow safe 90 % SSD usage, lowering hardware costs while keeping clusters responsive
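
The disk-triggered scale-out described above can be pictured with a minimal Python sketch. It is not the Scylla Operator's or the Horizontal Pod Autoscaler's actual logic: the `Rack` type, the `scale_out` function, and the assumption that every node has an identical disk are hypothetical, introduced only for illustration. Only the 90% threshold and the "one node per rack" step come from the talk summary.

```python
from dataclasses import dataclass
from typing import List

# From the talk summary: scale out once disk utilization crosses 90%.
DISK_THRESHOLD = 0.90


@dataclass
class Rack:
    """Hypothetical model of a rack; not an Operator or Kubernetes type."""
    name: str
    nodes: int
    used_bytes: int
    capacity_bytes_per_node: int  # assumption: identical disks on every node

    def utilization(self) -> float:
        # Fraction of the rack's total disk capacity currently in use.
        return self.used_bytes / (self.nodes * self.capacity_bytes_per_node)


def scale_out(racks: List[Rack]) -> List[Rack]:
    """Return racks with one extra node each if any rack exceeds the threshold."""
    if not any(r.utilization() > DISK_THRESHOLD for r in racks):
        return racks  # nothing to do, utilization is at or below the threshold
    return [
        Rack(r.name, r.nodes + 1, r.used_bytes, r.capacity_bytes_per_node)
        for r in racks
    ]
```

In this sketch a single rack crossing the threshold grows every rack by one node, which keeps racks the same size. Whether the real autoscaler evaluates racks jointly or independently is not stated in the summary.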

Takeaways

  • Tablets and Raft replace gossip and per‑node streaming, so new nodes reach quorum in seconds and start coordinating traffic even while data backfills, slashing bootstrap overhead.
  • A Kubernetes‑managed cluster scales horizontally in parallel; the demo adds three nodes, doubles capacity in ~60 s, and rebalances traffic by the three‑minute mark with only minor latency upticks.
  • Disk‑aware autoscaling combines the 90% disk utilization that tablets make safe with an HPA to add capacity without manual action, cutting scaling lag from hours to minutes and avoiding under‑used SSDs.
  • Combining rapid node readiness with background file‑based streaming keeps the cluster online during expansions, making maintenance windows shorter and protecting P95/P99 latency.
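
To make the hardware-cost point concrete, here is a small arithmetic sketch of how the target disk utilization changes the node count for a fixed data set. The function name `nodes_needed`, the 90 TB data set, the 10 TB disks, and the 50% comparison target are all hypothetical numbers chosen for illustration; only the 90% figure comes from the talk. The calculation deliberately ignores replication and compaction overhead.

```python
def nodes_needed(data_gb: int, disk_gb_per_node: int, target_pct: int) -> int:
    """Smallest node count that keeps disk utilization at or below target_pct.

    Uses integer ceiling division to avoid floating-point rounding.
    """
    if not 0 < target_pct <= 100:
        raise ValueError("target_pct must be in (0, 100]")
    usable_gb_x100 = disk_gb_per_node * target_pct  # usable space per node, times 100
    return -(-(data_gb * 100) // usable_gb_x100)    # ceil(a / b) for positive ints


# Hypothetical cluster: 90 TB of data on nodes with 10 TB disks.
at_90 = nodes_needed(90_000, 10_000, 90)  # target from the talk
at_50 = nodes_needed(90_000, 10_000, 50)  # assumed conservative target, for contrast
print(at_90, at_50)
```

With these assumed inputs, the 90% target needs 10 nodes and the 50% target needs 18, which is the kind of gap behind the claim that higher safe utilization avoids under-used SSDs.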

Top takeaway: Tablets and Raft let ScyllaDB clusters on Kubernetes double capacity in under three minutes while keeping latency in check.
