ScyllaDB Manager Supports Google Cloud Storage

ScyllaDB Manager automates the backup process and allows you to configure how and when backups occur. Until now, ScyllaDB Manager supported only AWS S3-compatible APIs as backup storage. Starting with ScyllaDB Manager 2.2, Google Cloud Storage (GCS) is supported as well.

Support for backup to Google Cloud Storage allows ScyllaDB clusters deployed on Google Compute Engine to be protected from disasters, and also minimizes the time needed to clone or quickly migrate your clusters.

This article shows how to back up your data to a GCS bucket.

Backup

Let’s assume we already have a ScyllaDB cluster up and running. We will use three n1-standard-4 nodes with local NVMe drives.

First let’s ensure that all ScyllaDB Manager Agents are upgraded to at least version 2.2. We can use sctool status to check Agent version:

Let’s schedule a daily backup of the entire cluster using a “manager-prod-backup” bucket. You have to use the “gcs” prefix in the location parameter in order to tell ScyllaDB Manager that you want to upload to Google Cloud Storage.

$ sctool backup --cluster prod-cluster --location gcs:manager-prod-backup --retention 7 --interval 1d
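The location value pairs a provider prefix with a bucket name. As a minimal sketch, the same command can be assembled from variables, using only the flags shown above. The variable names are illustrative, and the command is printed rather than executed so it can be reviewed first:

```shell
# Illustrative variable names; the values match the example above.
CLUSTER="prod-cluster"
PROVIDER="gcs"                      # prefix that selects Google Cloud Storage
BUCKET="manager-prod-backup"
LOCATION="${PROVIDER}:${BUCKET}"    # becomes gcs:manager-prod-backup

# Keep 7 backups and repeat the task every day, as in the command above.
BACKUP_CMD="sctool backup --cluster ${CLUSTER} --location ${LOCATION} --retention 7 --interval 1d"

# Print the command for review; paste it into a shell to schedule the backup.
echo "${BACKUP_CMD}"
```

Keeping the provider and bucket separate makes it harder to forget the prefix when the bucket name changes.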

After a while, we can check if backup is running:

And after around 20 minutes the backup is finished:

Let’s check the bucket content. You can see there are three directories under the “backup” directory:

  1. meta – contains metadata about backup like cluster node IDs, names of keyspaces and tables etc.
  2. schema – contains schema definitions for each backed up keyspace.
  3. sst – contains SSTable files.
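The layout above corresponds to three object prefixes in the bucket. The sketch below only builds and prints those prefixes; `backup_prefix` is a hypothetical helper, not part of ScyllaDB Manager. Assuming the `gsutil` CLI is installed and authenticated (not covered in this article), each printed URL can be passed to `gsutil ls` to inspect that directory:

```shell
BUCKET="manager-prod-backup"

# Hypothetical helper: build the gs:// URL for one backup subdirectory.
# $1 is the bucket name, $2 is the subdirectory (meta, schema or sst).
backup_prefix() {
  echo "gs://$1/backup/$2/"
}

# Print one URL per directory described above.
for dir in meta schema sst; do
  backup_prefix "${BUCKET}" "${dir}"
done
```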

If you’re interested in more details about backup management, make sure to check our recent blog about backups.

Efficiency

Let’s check how ScyllaDB Manager performs during the backup procedure and how it affects latency.

Our cluster was under ~30% mixed load. As you can see on the graphs, the effect of running the backup is negligible (the backup started at 16:43 and ended at 17:03):

ScyllaDB Manager Agent tries to be efficient in terms of CPU and memory usage so that it does not affect workload latency. You can see that each Agent used around 300 MiB of memory during the backup and released it once the backup was done.

Summary

Ease of configuration and efficiency of the ScyllaDB Manager Agent are very important for achieving the best results. That is why we constantly work on improving our support for storage providers and expanding the list of supported ones.
If you have a suggestion on what we should support next, we would like to hear your feedback. Enterprise customers can open a ticket with our support team. Open Source users can always reach us on Slack, or contact us privately.

Maciej Zimnoch

About Maciej Zimnoch

Maciej is a Go and C++ enthusiast. He is a software engineer working on ScyllaDB management tools. Previously he worked in network companies where he delivered multiple features to SDN solutions and LTE networks.