Hello, and welcome to this session about ScyllaDB Operator for Kubernetes. To introduce ourselves, my name is Tomas Nozicka. I am a Principal Software Engineer at ScyllaDB where I lead the development of ScyllaDB Operator. Before joining ScyllaDB, I used to work for Red Hat writing operators for the OpenShift control plane and also contributing to Kubernetes upstream. Together with me today, there is Maciej Zimnoch.
He’s a Senior Software Engineer with ScyllaDB as well. He is one of the maintainers of the ScyllaDB Operator project. Before that, he used to work on ScyllaDB Manager, and before joining ScyllaDB, he used to write software for instant messaging servers, SDN and LTE networks.
So for those who don’t know about ScyllaDB Operator yet, it is a Kubernetes Operator for managing and automating tasks related to managing ScyllaDB clusters. That is a complicated sentence, but what it actually means is that ScyllaDB Operator extends the Kubernetes API, which allows you to declare your intent, what your ScyllaDB clusters should look like, and it will take any action necessary to create such a cluster and any other resources in the Kubernetes cluster for you. Here are two important links. One is for GitHub, where the source code is located and where you can report issues should you hit any. There’s also a link to the Operator docs. If you want to go through that, that’s great.
For those who don’t know about the Operator, here’s an example of the ScyllaDBCluster object that defines the ScyllaDB cluster. You specify the version and the Manager agent version, and you define a datacenter. In this case it has one rack and three members with eight CPUs and 16 gigabytes of memory. That’s what you send to the API, and the ScyllaDB Operator will take any action necessary to actually create a cluster for you.
We introduced ScyllaDB Operator 1.0 at the last summit, and since then, we have made six releases, hopefully seven by the time that you actually watch this video because 1.7 is scheduled to be GA just before the ScyllaDB Summit. Generally we are aiming to ship a release every 6 weeks. We support the latest two releases, so that gives everybody time to update, and you should stay up to date to be on a supported release. We support N-1 compatibility, which means you have to update in steps, say from 1.3 to 1.4 to 1.5, so as not to skip any important migration steps.
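To make the in-step upgrade rule concrete, here is a minimal Python sketch that lists the releases you would pass through on the way to a target version. The function name `upgrade_path` is made up for this illustration, and it assumes versions are plain `major.minor` strings within a single major version; it is not part of the Operator or its tooling.

```python
def upgrade_path(current: str, target: str) -> list[str]:
    """Return each minor release to install, in order, to get from current to target.

    Hypothetical helper: assumes "major.minor" version strings and that
    every minor release between the two versions must be visited.
    """
    cur_major, cur_minor = (int(part) for part in current.split("."))
    tgt_major, tgt_minor = (int(part) for part in target.split("."))
    if cur_major != tgt_major:
        raise ValueError("this sketch only covers upgrades within one major version")
    if tgt_minor < cur_minor:
        raise ValueError("downgrades are not covered by this sketch")
    # Step through every minor version; skipping one could skip a migration step.
    return [f"{cur_major}.{minor}" for minor in range(cur_minor + 1, tgt_minor + 1)]


print(upgrade_path("1.3", "1.5"))  # ['1.4', '1.5']
```

So going from 1.3 to 1.5 means installing 1.4 first and only then 1.5, rather than jumping straight to the target.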
Here I made a screenshot of our GitHub. This graph represents the number of commits, and you can see that we have spent a lot of effort over the last year to make ScyllaDB Operator even better than it was before. And with that, I would like to give the floor to Maciej who is going to talk about some of the improvements that went into the Operator over the last year.
So this meme describes what we have been working on recently. We already have a Kubernetes presence thanks to ScyllaDB Operator, but we missed the ScyllaDB speed that is available on non-Kubernetes deployments. Because Kubernetes brings yet another virtualization layer, some of the optimizations we have in the ScyllaDB image are not that trivial to apply. So in ScyllaDB Operator 1.6, we added automated optimizations. It’s currently an experimental feature, so the API may change, but we encourage you to try it on your development cluster right now. It improves performance very much and gives repeatable results. I’ll show you the numbers we get on a later slide.
In 1.6, we added a Custom Resource Definition called NodeConfig which allows you to tune your Kubernetes nodes by providing placement options. They control where our tuning pod is deployed. Usually, we recommend dedicating entire nodes to ScyllaDB. So in this example, we set the node selector to nodes which have a label called node type with the value ScyllaDB. Of course, you can choose any other label, but it basically means that these nodes will be tuned by the Operator, and it’s as simple as that. The rest is automated.
To verify these optimizations are indeed working, we ran a benchmark to compare VM and Kubernetes deployments in terms of performance. We did it on a three-node cluster using i3.4xlarge machines both on VMs and Kubernetes, and we ran three workloads: read, write and mixed. All of them were disk-intensive. We can see that the numbers are getting closer to our VM baseline. We lose around 1 to 2 milliseconds in terms of P99 latency in each of the workloads. The throughput is still high, but there is a difference. So we need to chase what else we can tune to get these numbers even closer. So make sure to follow our resources because we are continuously working on improving the performance.
Those who are already experienced with ScyllaDB and Kubernetes may [Indistinguishable] in IO Tune.
It’s a 2-minute disk benchmark which is run on ScyllaDB startup to evaluate the disk capabilities. Since 1.2, users can save those 2 minutes on every startup by providing their precomputed values. And in 1.7, the benchmark results are automatically cached in a persistent location and reused on every ScyllaDB restart.
From 1.4, ScyllaDB Operator supports seedless mode. This means that seed nodes are no longer special, and every node is symmetric. The deployment model hasn’t changed, but some maintenance procedures no longer require additional manual steps around seed nodes. This also means that seed nodes can be automatically replaced, but you have to keep in mind that your ScyllaDB version needs to support this mode.
We also added a way to use private image registries with authorization enabled. We extended the ScyllaDBCluster CRD with image pull secrets, where a user can specify their private registries of ScyllaDB images and use a secret to pass the credentials. In 1.3, we added autogenerated auth tokens to secure communication between ScyllaDB Operator and ScyllaDB. Previously, users had to set this up manually, which is no longer needed because the Operator automatically provisions the secret.
We are also working on improving the stability of your deployments. Since 1.2, the Operator automatically provisions a pod disruption budget to protect against evicting too many pods during disruptive operations like Kubernetes upgrades. This improves the high availability of your ScyllaDB cluster. We also significantly changed the Operator deployment model. The previous stateful set was replaced with a multi-replica deployment with leader election. In 1.4, we extracted the webhook into a separate pod to improve the high availability of the Operator. And in 1.5, the Operator is also protected from too many disruptions by its own pod disruption budget.
Let’s get back to Tomas who will describe even more changes that happened during the last year.
Another big feature that we introduced in 1.4 is the rewrite of the Operator to use informers and the other Kubernetes machinery. For those who don’t know what informers are, those are essentially the caches that all the upstream Kubernetes controllers use. For us, that has made a significant impact: it has reduced the number of API calls made by the sidecar to 6 percent and the calls from the controller to 18 percent. The machinery is also less bug-prone because it’s typed, and as I said before, it’s battle tested by the Kubernetes controllers and the community.
Together with the informers, we have also introduced full reconciliation in 1.4, which means that any change made to a ScyllaDBCluster object will propagate to the underlying stateful set without any special handling. Previously, our code was structured so that each kind of change needed specific support: if you made a change in the ScyllaDBCluster resource, there was dedicated code to change something in the stateful set, and we didn’t support changing as much as we could. This has now been fixed. We have also enabled changing the resources, placement and repository specs, which had previously been forbidden, so now you can change the CPU count, the memory for the ScyllaDBClusters, the placement details that say on which nodes it should land, or the repository spec. Also the controller now prunes the old resources, so if you had a cluster of five nodes that you scaled down to three, previously there were, say, services that were left hanging in the cluster, and now the Operator will correctly delete them.
We’ve also worked on the user experience to help you better know what state the ScyllaDBCluster is in. There are now two new fields: the updatedMembers field, which tells you how many ScyllaDBCluster members have been updated to the new version, and the stale field, which tells you if the status is up to date or if there is a change happening in the underlying status that hasn’t been reported back yet.
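As a rough illustration of how a script might use those two fields, here is a small Python sketch that decides whether a rollout has finished. Only `updatedMembers` and `stale` come from the talk; the surrounding layout (a `racks` mapping with a `members` count per rack) and the function names are assumptions made for this example, so check the actual status schema before relying on it.

```python
def rack_rollout_done(rack_status: dict) -> bool:
    """True when a rack's status is fresh and every member has been updated."""
    # A stale status may not reflect the latest change yet, so do not trust it.
    # A missing "stale" field is treated as stale to stay on the safe side.
    if rack_status.get("stale", True):
        return False
    # "members" is an assumed field holding the desired member count.
    return rack_status.get("updatedMembers", 0) == rack_status.get("members", 0)


def cluster_rollout_done(status: dict) -> bool:
    """True when the (assumed) "racks" mapping is non-empty and all racks are done."""
    racks = status.get("racks", {})
    return bool(racks) and all(rack_rollout_done(rack) for rack in racks.values())


status = {
    "racks": {
        "rack-a": {"members": 3, "updatedMembers": 3, "stale": False},
        "rack-b": {"members": 3, "updatedMembers": 2, "stale": False},
    }
}
print(cluster_rollout_done(status))  # False, rack-b still has one member to update
```

The point of checking `stale` first is that a count of updated members only means something when the status has caught up with the latest change.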
Also, the status now supports the observedGeneration API concept, as all the upstream Kubernetes controllers do, which lets you assess whether the status is for the same generation of your ScyllaDBCluster resource.
Another big thing that hopefully will be important for a lot of our users is that you can now force-redeploy the whole ScyllaDBCluster in a rolling update fashion. Previously, you had to roll out each stateful set individually and manually. Now the Operator will manage it for you if you set the forceRedeploymentReason. That is very useful, say, if you change a ScyllaDB config that needs to be reloaded, and you need to restart ScyllaDB. This is how you do it.
Another feature that we have introduced since is that our validating webhooks now chain the errors, which means you don’t have to iterate one by one. If you, say, create a ScyllaDBCluster, and you mistype a field or set an invalid value, you don’t have to go through the cycle repeatedly, and it will just tell you all the errors that it sees at once.
We have also invested a lot into testing so we ship a much more stable product. In 1.2, we introduced an integrated end-to-end suite, in addition to our existing QA coverage, and we are gradually growing it because we care about stability. Every new feature that ships with the ScyllaDB Operator is mandated to have an end-to-end test with it. We have also switched the end-to-end tests to run in parallel, which brings us closer to a real-world scenario, and it also helps us run them in a more reasonable time.
So what’s next for the ScyllaDB Operator? Maciej has talked about a lot of performance improvements we have made in the past releases, but we are still looking into how we can close that gap even further and one day be very close to or at the same level as the non-Kubernetes deployments. We are also looking into introducing persistent storage support because currently we support only the local SSDs, but that’s for performance reasons, right?
But if you want to set up a toy cluster or a dev cluster, or you just want to get started, you might not care about the performance, but you want something that’s very easy to set up, and persistent storage is something you get with every cloud that you use. We are also looking to introduce managed TLS to set up the certificates for the internal node communication and also to have an API way for you to define the external TLS.
We are also looking into supporting MultiDC with the ScyllaDB Operator. First, we would do a manual MultiDC where we would connect to existing ScyllaDBClusters through a seed, and eventually, we would like to evolve that into a managed MultiDC where you choose one object, and the Operator will manage the Kubernetes clusters for you and create ScyllaDBClusters in those. We are also looking to manage the ScyllaDB credentials, so you will have a ScyllaDB secret that you define the credentials in, and the ScyllaDB Operator would configure ScyllaDB with those.
Also, we are trying to expand our deployment methods. We are looking into Operator Lifecycle Manager to publish the bundles to OperatorHub or the OpenShift marketplace. There’s also Azure Cloud on our road map, and we’ve already started some work on autoscaling, and there is a lot more that you can expect from us.
And with that, I’d like to thank you for coming to this presentation, and enjoy ScyllaDB Summit. If you want to reach out to us, there is a scylladb-users Slack with a ScyllaDB Operator channel that we hang out on, or you can write to the mailing list. Thanks.