Kubernetes and ScyllaDB: 10 Questions and Answers


We recently hosted the webinar Steering the Sea Monster: Integrating ScyllaDB with Kubernetes. We received a lot of great questions during the live session, so here they are along with our answers. Missed the live webinar? You can access the on-demand version here.

What’s the difference between Helm and the YAML manifest files?

Kubernetes manifest files (written in YAML) describe the specification of Kubernetes API objects such as StatefulSets and Services. Helm is a packaging system that combines many manifest files, automates their deployment, and helps an administrator manage them effectively. Helm calls a group of manifests (and their configuration) a “Helm Chart”.
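
To make the distinction concrete, here is a toy Python sketch of the idea behind a chart: one manifest template plus a set of values yields a finished manifest. This is only an illustration and not how Helm is implemented (Helm uses Go templates). The render_manifest helper, the abbreviated template, and the values are all hypothetical.

```python
from string import Template

# A manifest "template" with placeholders, loosely analogous to a file in a
# Helm chart. Abbreviated for illustration: a real StatefulSet also needs a
# selector and a pod template.
STATEFULSET_TEMPLATE = Template("""\
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: $name
spec:
  serviceName: $name
  replicas: $replicas
""")


def render_manifest(values: dict) -> str:
    """Fill the template with user-supplied values (hypothetical helper)."""
    return STATEFULSET_TEMPLATE.substitute(values)


if __name__ == "__main__":
    # The same template can be reused with different values per environment.
    print(render_manifest({"name": "scylla", "replicas": 3}))
```

A plain manifest corresponds to the rendered output; the chart corresponds to the template together with its default values.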

If I lose a ScyllaDB pod in a StatefulSet and, rather than trying to bring its node back, I create a new node, does the StatefulSet replace the old ScyllaDB pod by rescheduling it on the new node? Or does it create a new pod replica in the sequence?

The new ScyllaDB pod will replace the downed pod. For example, let’s imagine that we initially had scylla-0, scylla-1, and scylla-2 pods. If pod scylla-1 goes down, the StatefulSet will create a replacement pod, and it will still be called scylla-1.
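
The naming behavior can be sketched in a few lines of Python. This is a simplified model of the idea, not the StatefulSet controller itself, and the function names are hypothetical: the desired pod names are fixed by the StatefulSet name and replica count, so any missing pod is recreated under its original name.

```python
from typing import List, Set


def desired_pod_names(statefulset_name: str, replicas: int) -> List[str]:
    """Pod names follow the pattern <statefulset-name>-<ordinal>."""
    return [f"{statefulset_name}-{i}" for i in range(replicas)]


def pods_to_create(statefulset_name: str, replicas: int, running: Set[str]) -> List[str]:
    """Return the names of pods that are missing and must be recreated."""
    return [
        name
        for name in desired_pod_names(statefulset_name, replicas)
        if name not in running
    ]


if __name__ == "__main__":
    # scylla-1 went down; the replacement keeps the same name.
    print(pods_to_create("scylla", 3, {"scylla-0", "scylla-2"}))
```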

Are multi-datacenters supported when ScyllaDB is deployed with Kubernetes?

Yes, multi-datacenter deployments are supported with ScyllaDB on Kubernetes. To deploy a ScyllaDB cluster across regions, create one StatefulSet per region; each region can then be scaled separately. Each StatefulSet includes the definition of the endpoint snitch, the seed node information, and the prefix information for its datacenter.
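
As a rough sketch of the one-StatefulSet-per-region layout, the Python below builds a small description for each datacenter. Everything specific here is an assumption made for illustration: the datacenter_layout helper, the <cluster>-<datacenter> naming scheme, the choice of GossipingPropertyFileSnitch, and the use of each StatefulSet's first pod as its seed.

```python
from typing import Dict, List


def datacenter_layout(cluster: str, datacenters: Dict[str, int]) -> List[dict]:
    """Describe one StatefulSet per datacenter (hypothetical helper).

    `datacenters` maps a datacenter name to its replica count, so each
    datacenter can be scaled independently of the others.
    """
    layout = []
    for dc_name, replicas in datacenters.items():
        statefulset = f"{cluster}-{dc_name}"  # assumed naming scheme
        layout.append({
            "statefulset": statefulset,
            "datacenter": dc_name,
            "replicas": replicas,
            "snitch": "GossipingPropertyFileSnitch",  # assumed snitch choice
            "seed": f"{statefulset}-0",  # assumed: first pod acts as seed
        })
    return layout


if __name__ == "__main__":
    for entry in datacenter_layout("scylla", {"us-east": 3, "eu-west": 2}):
        print(entry)
```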

How does Kubernetes know that a pod is ready?

A pod is ready when its readiness probe succeeds; in our case, the probe runs nodetool status. Once nodetool reports that the node is up and running, Kubernetes marks the pod as ready.
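
The check a probe like this performs can be sketched in Python. This is a minimal illustration, not the probe shipped with the image: the is_node_ready helper is hypothetical, and the sample output is made up to resemble typical nodetool status output, where a leading UN means Up and Normal.

```python
# Illustrative nodetool status output (addresses and host IDs are made up).
SAMPLE_STATUS = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load     Tokens  Owns  Host ID                               Rack
UN  10.0.0.5   1.2 MB   256     ?     11111111-2222-3333-4444-555555555555  rack1
DN  10.0.0.6   1.1 MB   256     ?     66666666-7777-8888-9999-000000000000  rack1
"""


def is_node_ready(status_output: str, address: str) -> bool:
    """Return True only if the node at `address` is reported as UN."""
    for line in status_output.splitlines():
        fields = line.split()
        # Node rows start with a two-letter status followed by the address.
        if len(fields) >= 2 and fields[1] == address:
            return fields[0] == "UN"
    return False  # node not listed at all: treat as not ready


if __name__ == "__main__":
    print(is_node_ready(SAMPLE_STATUS, "10.0.0.5"))
    print(is_node_ready(SAMPLE_STATUS, "10.0.0.6"))
```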

Is the StatefulSet responsible for creating DNS records for nodes?

Yes. The StatefulSet, together with its governing headless Service, gives each of its pods a stable name in the cluster's internal DNS. For example, if the StatefulSet includes four instances and one of its pods moves to a different host, the DNS record for that pod is updated accordingly.
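
The resulting names follow a predictable pattern, sketched below in Python. The pod_dns_name helper is hypothetical, and the example assumes a headless Service named scylla in the default namespace with the default cluster domain, cluster.local.

```python
def pod_dns_name(
    pod: str,
    service: str,
    namespace: str = "default",
    cluster_domain: str = "cluster.local",
) -> str:
    """Build the stable DNS name of a StatefulSet pod.

    Pattern: <pod>.<service>.<namespace>.svc.<cluster-domain>
    """
    return f"{pod}.{service}.{namespace}.svc.{cluster_domain}"


if __name__ == "__main__":
    # The name stays the same even if the pod is rescheduled to another host;
    # only the address it resolves to changes.
    for ordinal in range(4):
        print(pod_dns_name(f"scylla-{ordinal}", "scylla"))
```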

Is there a performance penalty when running ScyllaDB on Kubernetes + containers?

At the time of writing this blog post, there is a performance degradation when running ScyllaDB in Docker containers. For Kubernetes, we use the default ScyllaDB Docker container as the base image. In recent testing, we have seen a 25% to 40% degradation in performance. ScyllaDB is actively working to improve its performance in Docker containers.

Why use Docker for ScyllaDB if performance is not as good as using the AMI?

Some users prefer to have a single deployment platform that can be used across multiple infrastructure systems, e.g., cloud and on-prem. Kubernetes provides users with a single view of the platform. AMIs are specific to AWS deployments. While it is possible to create preconfigured images for many on-prem and cloud solutions, Kubernetes and Docker offer a single image that can be used on any infrastructure.

How would you upgrade ScyllaDB in this deployment?

Users should first take a snapshot of the current data and of the image's configuration. Once the persistent storage directory is backed up, we can remove the current container and start a container running the upgraded version, attached to the same persistent storage that the previous container used.

Will the Helm Chart be available?

Yes, the Helm Charts are currently available in the ScyllaDB code examples, and we are working to submit the charts to the main Helm project repositories.

How is the deployment/decommission affected by the amount of data?

When pods are added or decommissioned, ScyllaDB redistributes the data. Streaming data between nodes leaves less CPU, I/O, and network bandwidth available for normal operations. The more data there is, the more throughput is required from the I/O and network systems to maintain the SLA.
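
A back-of-the-envelope estimate shows how data volume drives streaming time. The Python below is a rough sketch only: the streaming_hours helper and the numbers are hypothetical, and it assumes a constant sustained throughput with no competing traffic, which real clusters will not have.

```python
def streaming_hours(data_gb: float, throughput_mb_per_s: float) -> float:
    """Estimate hours needed to stream `data_gb` at a sustained throughput.

    Uses 1 GB = 1024 MB and ignores compaction, throttling, and contention
    with normal read/write traffic.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = data_gb * 1024 / throughput_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical node holding 1 TB, streamed at a sustained 100 MB/s.
    print(f"{streaming_hours(1024, 100):.1f} hours")
```

With these made-up numbers the estimate comes to roughly 2.9 hours; doubling the data at the same throughput doubles the time during which normal operations share resources with streaming.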

Our next webinar, ‘Analytics Showtime: Powering Spark with ScyllaDB’, is on June 27th. We will cover best practices, use cases, and performance tuning. We hope you can join us!

About Eyal Gutkind

Eyal Gutkind is a solution architect for ScyllaDB. Prior to ScyllaDB, Eyal held product management roles at Mirantis and DataStax. Prior to DataStax, Eyal spent 12 years with Mellanox Technologies in various engineering management and product marketing roles. Eyal holds a BSc degree in Electrical and Computer Engineering from Ben Gurion University, Israel, and an MBA from the Fuqua School of Business at Duke University, North Carolina.