See all blog posts

Scylla Manager 2.2: Repair Revisited

We released Scylla Manager 1.0 to help our users manage repairs in February 2018. Since then a lot of things have changed. We added the row-level repair to Scylla Open Source Scylla, and to the Enterprise version this year. We just released Scylla Manager 2.2 with a new repair method optimized for row-level repair.

New repair features in Scylla Manager 2.2

Parallel repairs

Scylla Manager chooses the optimal number of parallel repairs for a keyspace. This is beneficial for big clusters. For example, given a 12 node cluster and a keyspace with replication factor 3, it can be repaired up to 4 times faster than with Scylla Manager 2.1. This is done by repairing distinct replica sets in a token ring in parallel. Scylla Manager ensures that each node takes part in at most one Scylla repair job at all times.

The following diagram shows a model of how token ranges are replicated in a token ring.

Token ranges 11-20 are replicated by nodes N1, N2, N3. When they are repaired we can still repair token ranges 41-50 replicated by nodes N4, N5, N6 and token ranges 71-80 replicated by nodes N7, N8, N9. We could also repair ranges 101-110 on nodes N10, N11, and N0. Otherwise Node N0 is idle; we can only repair token ranges 1-10 when nodes N1, N2 (and N3) are done repairing. The process continues until the whole ring is repaired.

The parallel limit can be described by the following formula:

max_parallel = floor(# of nodes / keyspace RF)

In a multi DC setting the keyspace RF in the above formula is a sum of replication factors in different DCs.

Adjustment of repair intensity to a node capabilities

Repair intensity feature was added in a previous version of Scylla Manager. It lets you specify how many token ranges (per shard) to repair in a single Scylla repair job. For Scylla clusters that do not support row-level repair, intensity can also be a decimal between (0,1). In that case it specifies percent of shards that can be repaired in parallel on a repair master node.

Scylla Manager 2.2 adds support for intensity=0. In that case the number of token ranges is calculated based on node memory and adjusted to the Scylla maximal number of ranges that can be repaired in parallel (see max_repair_ranges_in_parallel in Scylla logs). If you want to repair faster, try setting --intensity 0.

Changing the repair speed in flight

In Scylla Manager 2.2 we added two new sctool repair subcommands: sctool repair update and sctool repair control. The former replaces the task update command and extends it with the capabilities to update any repair parameter in an existing repair task. The latter allows you to change the intensity and parallelism as you repair without ever restarting the task. In contrast to the update command the control command does not persist the changes for future runs. The current values for intensity and parallel can be checked in sctool task progress.

Example:

Run the following command to adjust the number of token ranges per Scylla repair job to the maximum supported (in parallel) by a repair master node.

sctool repair control -c prod-cluster --intensity 0

Support for schema changes during repair

In Scylla Manager 2.2 repair goes table by table. This has many benefits. On a node Scylla repair jobs need to read fewer files, and we keep coming back to the same files before we move on to the next table. When there is a new table added, it is not a concern of Scylla Manager since the list of tables to repair is created in the repair init phase. When a table is deleted during repair, Scylla Manager gracefully handles that and moves on to the next table.

Small table optimization

Tables that contain less than 1G of data would be repaired in a few Scylla repair jobs, one per replica set. This saves time on Scylla repair job creation and status checking. The small table threshold can be changed as a repair command parameter.

Graceful stop

Whenever a user decides to stop a repair there is a grace period to wait for the ongoing repair jobs before killing them. The duration of the grace period by default is 30s, the default can be changed in the configuration file.

New sctool repair progress

Repair progress and duration is displayed per table. In detailed view users may now see the actual repair progress of a keyspace or a table on each node.

Benchmark
In the benchmark we run 9 Scylla 2020.1 nodes on AWS i3.2xlarge machines. Each node has 8 cores, 61GB of memory, and holds approximately 1.8TB of data. During the tests the nodes are 50% loaded with a read workflow, and 10% of data needs to be repaired, all the nodes have missing data. The following dashboard shows the system in the stabilisation phase before repair starts.

The following chart presents effects of changing the parallel values between full parallelism (default), two parallel threads and one parallel thread.

The following chart presents effects of changing intensity while running at full parallelism. One can observe that when the system is loaded adding more repairs does not speed things up too much but it’s still in the range of 10%.

If we take out parallel from the equation changing intensity from 1 (default) to 2 gives 20% improvement. On an idle cluster intensity is more relevant.

Summary

Scylla Manager 2.2’s parallel repair mechanism is a great supplement to Scylla row-level repair. Give it a try today! You can download it from the Scylla Download Center. Also, if you are interested in learning more about Scylla Manager and repairs, check out this free lesson in Scylla University!

Michal Matczuk

About Michal Matczuk

Michał is a software engineer working on Scylla management. He's a Go enthusiast and contributor to many open source projects. He has a background in network programming. Prior joining ScyllaDB, he worked with StratoScale and NTT.