ScyllaDB Cloud Goes Serverless

Yaniv Kaul18 minutesFebruary 15, 2023

Learn how ScyllaDB Cloud is moving to serverless, transforming its single tenant deployment model into a multi-tenant architecture based on Kubernetes. Discover the engineering innovation required, and the user value of the new architecture, including use of encryption (both at flight and at rest), performance isolation, and the capability to scale elastically.

Share this

Video Slides

Video Transcript

Hello! I’ll be talking on ScyllaDB Cloud serverless, a new important feature coming to ScyllaDB Cloud. We’ve already released it as a beta just a month or so ago for the trial tier and we are very excited about it. Today I wish to share more details about it with you but let me start by introducing myself.

I’ve only been at ScyllaDB for a few months now and I can tell you how energized I am with the work we are doing I’ve been part of two major releases already five one on the open source side and 2022.2 on the Enterprise side as well as this initiative serverless which is an exciting journey by itself I’ve been working with and contributing to open source for more than 20 years now and it’s always been a great experience I’ve moved to working with storage about 10 years ago I still remember someone advising me against it storage is slow moving huh I personally find the challenges with performance and reliability motivating what’s the B doing here well it’s one of my hobbies taking closer pictures of bees and butterflies and flowers this one was taking in the northern part of Israel on a lovely day just a month ago

so what are you going to hear today from me I’ll give you a short introduction a tour of how it looks like and how you work with serverless then I’ll provide a behind the scenes overview of its architecture in between I’ll mention sound implementation challenges we’ve had along the way and our design choices let’s get started

I’d like to begin my introduction with a quick reminder of what Silla cloud is then what is serverless and we’ll move over quickly to a visual tour of the user experience

so let’s start with a short recap of what Silla cloud is this is a ScyllaDB offering for customers who do not wish to run steel Enterprise by themselves and own the day 1 and day 2 operations and prefer that we manage the Clusters according to best practices monitor their health provide data operations such as upgrade or preemptive expansion and so on it is currently available in multiple flavors specifically it is running on both Amazon and Google clouds in multiple regions around the globe today in sync cloud every tenant gets a dedicated set of complete virtual machines for their clusters we picked them according to our sizing guide taking into account performance and capacity requirements when needed we make adjustments for actively replace nodes and if needed we change the architecture we deploy more nodes we do different sizing anything to it that needs to happen or reactively when a node becomes faulty we also scale up and scale down regularly based on customer demands so what is serverless all about serverless is a major step we have taken into our next Generation architecture running ScyllaDB clusters on top of kubernetes infrastructure infrastructure component management such as storage networks security monitoring logging and more is handled by kubernetes and Via standard kubernetes Primitives operators crds and could control if needed there is also a seamless integration with the cloud services as an example VPC for networking instant provisioning for scaling security features and so on as for the core sealer components themselves cylinder nodes have been delivered via containers for quite some times along with other distribution format of course like this RPM and Dev packages or Mi images the silicubernetes operator has matured quickly gaining more and more day one and day two operations and integration it is one of the Magic ingredients of this effort let’s take a tour of how the user facing part of it looks like upon logging to ScyllaDB Cloud you can create a cluster and right now serverless isn’t better let’s take a look at how it works

during cluster creation the flow behind the scenes is that our Fila Cloud management software has the operator to create a full circle cluster it provides it with the input from the user and reached with the default settings for example replication factor and the cluster is formed initialized and eventually launched are a default for example there’s three replicas each and its own availability Zone in a single region and there we go we launched our cluster our cluster is ready it is fully deployed and accept can accept user input we can now go ahead and proceed to connect to it let’s proceed to see how the UI looks like for a running cluster just a note before that all the cluster creation flow can also be done via a rest API or a terraform provider which of course uses our rest API so this process can be easily integrated into a customer’s automated workflow

here we can see the overview of the cluster we can get some of the data on the running cluster and very basic statistics of operation from here we can go to the connect tab which we will in second we can launch the external monitoring view which I’ll show later or we can take some actions on the cluster let’s move to the connect Tab and learn how we connect using cql to the Run cluster the first step in connecting with the cluster is downloading the connection bundle to the client the bundle is a simple Yahoo file containing all the connection details needed for a serverless aware driver to connect to it let’s take a concrete example the SQL shell has an example of how we connect with containerized circulation so it will be easier to use I’ve copy pasted the command line from the example in the UI let’s examine it in details

we call Docker as the container execution engine to launch a container and we test the container a file from the host the connection bundle we’ve downloaded earlier that file is passed as a configuration file to situation this provides security all the information needs to securely connect to the cluster

and voila we are connected as I’ve mentioned earlier all we had to do is pass the connection bundle file to the virus drivers in order to connect to the class this provides a very smooth experience to the user

here we can see an example written in Python I promise you I won’t go over it line you can see from this plane example that all we do is construct a cluster object with that aforementioned bundle via the path and then it simply calls the connect method to connect to it that’s all there is to it

last user interface element I’d like to show you is the monitoring stack it’s the usual stack of grafana with Prometheus reading our exported metrics notably missing is the node OS level metrics that we used to have in VM or bare metal based clusters as they are simply Irrelevant for servers however we do monitor the kubernetes nodes and we do collect those metrics in our style as our devops team watch the kubernetes node Health continuously

now that we have seen how it looks like that let’s do some deep dive and to how we’ve built it I’ll explain the high level architecture of it in a few slides

I’ll be honest kubernetes was not designed from the ground up to be running stateful application and probably at least initially not for high performance latency sensitive applications so I’ll admit it has made steady Improvement in both areas ScyllaDB is a stateful application at database requires more processes and steps than a status application as an example rolling upgrade of the application must be properly coordinated and executed one node at a time making sure they’re all responsive after outbreak thoughts have a persistent identity for example ScyllaDB also has strict requirements for the operating system itself in order to achieve very high performance CPU scheduling higher requirements are to specific examples we also have deployment requirements we support multiple data centers in different regions and we use nvme disk for high performance

as mentioned earlier ScyllaDB Cloud management system already supports multiply cloud provider including Amazon with ScyllaDB AWS account or bring your own account as well as Google Cloud platform so extending it by treating service is just another Cloud flavor with a natural fit for the architecture for us we do run multiple kubernetes clusters we separate infrastructure Services used for management monitoring and alerting to a separate kubernetes cluster than the one that is running the workloads themselves

we deploy the ScyllaDB clusters themselves on kubernetes worker nodes the ScyllaDB operator takes care of it in two steps the first is initialize the new kubernetes node if needed and one is not already available we run a privileged operator that we prepares some parts of the node for our own usage specifically in case the kubernetes node have multiple disks we set it up in a RAID 0 configuration for market and set some hours configuration items the second step is to run a disk IO benchmarking software so we’ll know how much iops we can squeeze from the local disk I’ll show in details later how we use this information

now the silica kubernetes operator using a crd deploys a complete sealer cluster number of nodes size which is the number of cores the memory and of course the disk the distribution the replication Factor specifically replication Factor equals three for the time being it configures and initialize the ScyllaDB nodes and waits for the cluster formation it installs the sealer manager agent and configurus it sets up a monitoring stack lastly it deploys the driver proxy also known as snip proxy which I’ll discuss later and we’re done the cluster is ready for use

so what changed with virtual machine you have a fixed set of relationship between number of course amount of memory and size of storage each node of a cluster can have scaling can be performed either by adding or removing nodes or by replacing nodes with different node types somewhat heavy operations using kubernetes we abstract underlying hardware and provide much more flexibility and elasticity for interviews we match CPU and memory resources but both can scale and so can the storage economically this means customers no longer need to commit to Hardware resources ahead of time they can scale much more easily and quickly with resources that match their performance and capacity requirements in a granular and efficient manner

let’s discuss in more depth a critical feature that’s part of the architecture tenant isolation with multiple tenants running side by side on the same physical node we have to ensure all resources are shared but isolated memory and and CPU is kind of built in to kubernetes we set a request and limit on the resource definition and we schedule them using CPU sets on different V chords so they don’t impact each other these resources are shared and it’s the more interesting part let’s look into how it’s achieved

first we have to ensure no customer uses the whole storage or just more than it was allocated for that use the xfs file system quota feature we Define a quota for each month to scale we use the CSI volume expansion API and it adjusts the quarter accordingly and dynamically of course CSI by the way is the container storage interface it’s a standard API in kubernetes to handle storage

quota itself is not enough just as important is to ensure there’s no Noisy Neighbor impacts that different tenants don’t use all the iops the underlying storage can provide I will not go right now into the details about ScyllaDBs IO scheduler works but if you recall I mentioned earlier we run a benchmarking software on the underlying storage um from its result we know how many iops we have overall and we can provision it across tenant relative to the number of V cores each tenant was assigned so we split the resources over all the all iops that we have in turn node into the different tenants each tenant gets a slice of it and its higher scheduler knows to adjust to it

here’s a very preliminary results of our performance test on a small scale system just to explain in a high level what the test is running multiple workloads for multiple client against multiple tenants running side by side we measure throughput operations per second and tail latency the famous P99 for all workloads as you can see we have two different workloads of ltp and Outlet an interactive and a non-interactive workload the overall plan here was to measure we can run multiple tenants and indeed witnessed three key achievement the first is tenant isolation tenants do not suffer from Noisy Neighbor impact the second is there is fairness between evenly set up tenants each get their own share of the entire horsepower that the kubernetes node can deliver the third is that aggregated we are close to what this node had been a single VM can deliver we’re still not there we still have some known gaps and we’re working on those items

lastly let’s look at how we connect to the cluster two important facts that we need to keep in mind that Silla nodes have no public IP and in fact can change their IP on a pod restart therefore we need a different way than the traditional way to connect we are using a feature of TLS encryption protocol called Sni which allows the proxy to redirect the connection the previously mentioned bundle file has the cluster TLS relevant details as well as the Sni proxy address needed to connect to the cluster as mentioned earlier SI is a TLS protocol feature it allows you to connect to a specific node behind the TLs endpoint initially it was used to connect to specific virtual hosts that may be deployed together with other virtual hosts on a single physical cluster ScyllaDB is using to redirect traffic to the correct Zilla node it wants to reach we are still shot aware we still connect to everything in the cluster but we do it via the SMI proxy

if you wish to know more about our architecture and implementation and specifically on our kubernetes operator I invite you to go to ScyllaDB University and get more information including some hands-on experience using the operator

with that I’d like to thank you for listening and encourage you to get in touch with me if you have any questions on serverless or operator or Silla cloud thank you [Applause]

Read More