We recently hosted the webinar, Steering the Sea Monster: Integrating Scylla with Kubernetes. We received a lot of great questions during the live webinar, so here they are along with our answers. Miss the live webinar? You can access the on-demand version here.
What’s the difference between Helm and the YAML manifest files?
Kubernetes manifest files (written in YAML) describe the specification of Kubernetes API objects such as StatefulSets and Services. Helm is a packaging system that combines many manifest files, automates their deployment, and helps an administrator manage them effectively. Helm calls a group of manifests (and their configuration) a “Helm Chart”.
If I lose a Scylla pod of a StatefulSet, and rather than trying to bring that Node back, I create a new Node, does the StatefulSet replace the old Scylla pod from the old node by rescheduling it on a new node? Or does it generate a new pod replica in the sequence?
The new Scylla pod will replace the downed pod. For example, let’s imagine that we initially had scylla-0, scylla-1, and scylla-2 pods. If pod scylla-1 goes down, the StatefulSet will create a replacement pod, and it will still be called scylla-1. It does not add a new ordinal such as scylla-3 to the sequence.
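The stable-name behavior can be illustrated with a small simulation. This is a toy model, not Kubernetes code: the class `ToyStatefulSet`, its methods, and the node names are hypothetical and exist only for this sketch. The one fact it borrows from Kubernetes is that StatefulSet pods are named `<name>-<ordinal>` and a missing ordinal is recreated under the same name.

```python
class ToyStatefulSet:
    """Toy model of StatefulSet pod naming; not real Kubernetes code."""

    def __init__(self, name, replicas):
        self.name = name
        self.replicas = replicas
        self.pods = {}  # pod name -> node the pod currently runs on

    def expected_names(self):
        # Pod identities are fixed by ordinal: scylla-0, scylla-1, ...
        return [f"{self.name}-{i}" for i in range(self.replicas)]

    def reconcile(self, available_nodes):
        """Recreate every missing pod under its original name."""
        created = []
        for pod in self.expected_names():
            if pod not in self.pods:
                # Naive placement (an assumption): pick the least loaded node.
                placed = list(self.pods.values())
                self.pods[pod] = min(available_nodes, key=placed.count)
                created.append(pod)
        return created

    def lose_node(self, node):
        """Drop every pod that was running on a failed node."""
        self.pods = {p: n for p, n in self.pods.items() if n != node}


sts = ToyStatefulSet("scylla", replicas=3)
sts.reconcile(["node-a", "node-b", "node-c"])
sts.lose_node("node-b")  # scylla-1 goes down with its node
recreated = sts.reconcile(["node-a", "node-c", "node-d"])
print(recreated)         # names of the pods that were recreated
print(sorted(sts.pods))  # the full set of pod names after recovery
```

In this sketch the replacement lands on the new node but keeps the name scylla-1, so the set of pod names is unchanged after recovery.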
Are multi-datacenters supported when Scylla is deployed with Kubernetes?
Yes, multi-datacenter deployments are supported with Scylla on Kubernetes. To deploy a Scylla cluster in each region, create one StatefulSet per region; each region can then be scaled separately. Each StatefulSet includes the endpoint snitch definition, the seed node information, and the prefix information for its datacenter.
How does Kubernetes know that a pod is ready?
A pod is ready when its readiness probe succeeds. In our case, the probe runs nodetool status. Once nodetool reports that the node is up and running, Kubernetes marks the pod as ready.
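As a rough illustration of what such a check looks at, the sketch below parses text in the usual nodetool status layout, where each node line begins with a two-letter state code and `UN` means Up/Normal. The function name `node_is_up_normal` and the sample output are made up for this example; the probe script actually shipped with the deployment may work differently.

```python
def node_is_up_normal(status_output, address):
    """Return True if `address` is listed with state UN (Up/Normal)."""
    for line in status_output.splitlines():
        fields = line.split()
        # Node lines look like: "<state> <address> <load> ..."
        if len(fields) >= 2 and fields[1] == address:
            return fields[0] == "UN"
    return False  # Address not listed: treat the node as not ready.


# Hand-written sample resembling `nodetool status` output (not real data).
SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens  Owns  Host ID  Rack
UN  10.0.0.11   1.2 GB     256     ?     aaaa     rack1
UJ  10.0.0.12   300 MB     256     ?     bbbb     rack1
"""

print(node_is_up_normal(SAMPLE, "10.0.0.11"))  # node in state UN
print(node_is_up_normal(SAMPLE, "10.0.0.12"))  # node still joining (UJ)
```

A node that is still joining (`UJ`) or is missing from the listing is reported as not ready, which is the behavior a readiness check needs so that clients are not routed to it too early.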
Is the StatefulSet responsible for creating DNS records for nodes?
Yes, the StatefulSet is responsible for registering each of its instances with the cluster’s internal DNS server. For example, if the StatefulSet includes four instances and one of its pods moves to a different host, the DNS information for that pod is updated accordingly.
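To make the naming concrete, here is a minimal sketch of the per-pod DNS names Kubernetes conventionally assigns to StatefulSet pods, in the form `<pod>.<service>.<namespace>.svc.<cluster-domain>`. The helper `pod_dns_names` is hypothetical, and the service name `scylla`, the namespace `default`, and the cluster domain `cluster.local` are assumptions chosen for illustration.

```python
def pod_dns_names(statefulset, service, namespace, replicas,
                  cluster_domain="cluster.local"):
    """Build the stable DNS name for each pod ordinal of a StatefulSet."""
    return [
        f"{statefulset}-{i}.{service}.{namespace}.svc.{cluster_domain}"
        for i in range(replicas)
    ]


# Four instances, as in the example above; names are assumptions.
names = pod_dns_names("scylla", service="scylla",
                      namespace="default", replicas=4)
for name in names:
    print(name)
```

The names depend only on the pod ordinal, not on the host, so when a pod is rescheduled its name stays the same and only the address behind it changes.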
Is there a performance penalty when running Scylla on Kubernetes + containers?
At the time of writing this blog post, Scylla performs worse in Docker containers than on bare instances. For Kubernetes, we use the default Scylla Docker container as the base image. In recent tests, we have seen a 25% to 40% degradation in performance. Scylla is actively working to improve its performance in Docker containers.
Why use Docker for Scylla if performance is not as good as using the AMI?
Some users prefer to have a single deployment platform that can be used across multiple infrastructure systems, e.g., cloud and on-prem. Kubernetes provides users with a single view of the platform. AMIs are specific to AWS deployments. While it is possible to create preconfigured images for many on-prem and cloud solutions, Kubernetes and Docker offer a single image that can be used on any infrastructure.
How would you upgrade Scylla in this deployment?
Users are required to take a snapshot of the current data and back up the configuration of the image. Once the persistent storage directory is backed up, we can remove the current container and attach a container running the upgraded version to the same persistent storage that the previous container used.
Will the Helm Chart be available?
Yes, the Helm Charts are currently available from Scylla code examples, and we are working to submit the charts into the main Helm project repositories.
How is the deployment/decommission affected by the amount of data?
When pods are decommissioned or added, Scylla redistributes the data. Streaming data between nodes leaves less CPU, I/O, and network bandwidth for normal operations. The more data there is, the more throughput is required from the I/O and network systems to maintain the SLA.
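A back-of-the-envelope calculation shows how the required throughput grows with the data size. The sketch below is deliberately naive and every number in it is hypothetical: it assumes a fixed amount of data must be streamed within a fixed time window, and it ignores compaction, compression, and competing workload. The function name `required_throughput_mb_s` is invented for this example.

```python
def required_throughput_mb_s(data_gb, window_hours):
    """Average throughput (MB/s) needed to stream data_gb within window_hours."""
    data_mb = data_gb * 1024
    window_seconds = window_hours * 3600
    return data_mb / window_seconds


# Hypothetical figures: data to stream and the time allowed for it.
small = required_throughput_mb_s(data_gb=500, window_hours=2)
large = required_throughput_mb_s(data_gb=1000, window_hours=2)
print(f"500 GB in 2 h needs about {small:.1f} MB/s")
print(f"1000 GB in 2 h needs about {large:.1f} MB/s")
```

Under these assumptions, doubling the data doubles the sustained throughput that streaming takes from the I/O and network budget for the same time window.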