We are happy to announce the first release of our new product, Scylla Manager, a management system that automates maintenance tasks on a Scylla cluster. This release provides a managed repair feature that automatically runs repairs on a cluster.
A Smarter Repair
Repairing a full cluster is not an easy task and multiple strategies exist. We provide a dedicated solution that leverages Scylla’s thread-per-core architecture for the best results. Clusters are repaired node by node, ensuring that each database shard performs exactly one repair task at a time. This gives the best repair parallelism on a node, shortens the overall repair time, and does not introduce unnecessary load. On a shard level, a repair is performed in chunks and can be stopped and resumed.
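The per-shard, chunked approach described above can be illustrated with a minimal sketch. This is not Scylla Manager's actual implementation: the names `NodeRepair`, `ShardProgress`, `run`, and the `(start_token, end_token)` segment representation are hypothetical, and the shards are visited one after another here for simplicity, whereas a real system would run them in parallel.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Segment = Tuple[int, int]  # hypothetical (start_token, end_token) range


@dataclass
class ShardProgress:
    """Progress of one shard: index of the next segment to repair."""
    next_segment: int = 0


@dataclass
class NodeRepair:
    """Hypothetical repair plan for one node: a list of segments per shard."""
    segments_by_shard: Dict[int, List[Segment]]
    chunk_size: int = 2
    progress: Dict[int, ShardProgress] = field(default_factory=dict)

    def step(self, repair_chunk: Callable[[int, List[Segment]], None]) -> bool:
        """Repair one chunk on every unfinished shard.

        Each shard gets at most one repair task per step.
        Returns True if any shard still has segments left.
        """
        remaining = False
        for shard, segments in self.segments_by_shard.items():
            state = self.progress.setdefault(shard, ShardProgress())
            start = state.next_segment
            chunk = segments[start:start + self.chunk_size]
            if not chunk:
                continue  # this shard is already finished
            repair_chunk(shard, chunk)
            state.next_segment += len(chunk)
            remaining = remaining or state.next_segment < len(segments)
        return remaining


def run(plan: NodeRepair,
        repair_chunk: Callable[[int, List[Segment]], None],
        should_stop: Callable[[], bool] = lambda: False) -> bool:
    """Run the plan until it finishes (True) or is asked to stop (False).

    When stopped, plan.progress records where each shard should resume.
    """
    while not should_stop():
        if not plan.step(repair_chunk):
            return True
    return False
```

Because the only state is the per-shard index in `plan.progress`, calling `run` again on a stopped plan continues from the next unrepaired chunk rather than starting over.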
We want to make operating Scylla clusters simpler. Our solution discovers the cluster topology and available keyspaces, and automatically schedules a repair task for the whole cluster. The user has full control over the process and can provide a custom schedule or select the tables or keyspaces that are to be repaired. Users can track the progress of a repair in real time using the REST API or a CLI tool that we provide.
Scylla Manager is a centralized and highly available product. The management data is stored in the Scylla database. Upon restart, all running repairs are resumed from the point where they left off. If a repair fails, it is automatically retried.
The product comes as a separate server and the diagram below shows how it fits into the deployment environment.
The Scylla Manager server can manage multiple clusters and connects to the nodes using SSH. The SSH daemon on each Scylla node opens a connection to the local Scylla API port and proxies it to the Scylla Manager server. No JMX is involved.
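To make the SSH proxying concrete, here is a small sketch of how a node-local API port can be reached through an SSH tunnel using the standard OpenSSH client. It only illustrates the idea and is not how the server itself is implemented. The helper name `ssh_tunnel_command`, the user name, and the addresses are hypothetical, and the API port is assumed to be Scylla's default of 10000.

```python
from typing import List


def ssh_tunnel_command(host: str, user: str, local_port: int,
                       api_port: int = 10000) -> List[str]:
    """Build an OpenSSH command that forwards local_port to the node's API port.

    api_port=10000 is an assumption (Scylla's default REST API port).
    "localhost" in the forward spec is resolved on the remote node, so the
    API only needs to listen on the node's loopback interface.
    """
    forward = f"{local_port}:localhost:{api_port}"
    # -N: do not run a remote command; -L: set up local port forwarding
    return ["ssh", "-N", "-L", forward, f"{user}@{host}"]


# Example with a hypothetical user and a documentation-range address
print(" ".join(ssh_tunnel_command("192.0.2.10", "scylla-manager", 20000)))
```

With such a tunnel open, requests sent to the chosen local port reach the node's API without exposing that port on the network.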
Users interact with the Scylla Manager server using our REST API or the command line tool, sctool. Sctool is a convenient tool that lets you register clusters and manage repair schedules, and it produces human-readable output.
The demo below presents a session where sctool is used to register a cluster and run a repair.
To learn more about managed repair, please check out the Scylla Manager Documentation.