Scylla Alternator: The Open Source DynamoDB-compatible API
About Scylla’s Alternator Project
Alternator is an open source project that gives Scylla compatibility with Amazon DynamoDB™.
Our goal is that any application written for Amazon DynamoDB can run, unmodified, against Scylla with Alternator enabled. Scylla began as a re-implementation of Apache Cassandra and has since proven to be a solid database engine with key performance and TCO benefits over Cassandra. However, we always considered Cassandra to be just a starting point. Now a 5-year-old project, Scylla is able to scale to hundreds of machines, petabytes of data, and many regions and availability zones.
Scylla can easily run millions of operations per second at ≤1 msec latency for 99% of requests on a single node. We can map different workloads to roles and prioritize them, thus allowing users to combine analytics and real-time transactional workloads on the same cluster.
Now that we’ve met our first major goal of creating a better Cassandra (along with a fully managed cloud version of Scylla), we’ve decided it’s time to add new APIs. A DynamoDB-compatible API makes a lot of sense, as the Dynamo paper played a major role in the design of DynamoDB, Cassandra and, of course, Scylla. (Aren’t we all one big competitive family?) All three databases share a high availability design and offered similar functionality even before this recent enhancement.
There are three key benefits Scylla’s Alternator brings to DynamoDB users:
- Cost: DynamoDB is very convenient when consumed as a service. When you run it at small scale, it just works and you don’t have to worry about database administration. At scale, however, the sub-cent per-operation charges add up to a shockingly large amount of money. Scylla’s efficiency allows you to use significantly fewer resources for the same task or workload. According to our benchmark, one can expect to save 80% to 93% when supporting the same workload (5x to 14x less expensive)!
- Performance: Scylla was implemented in modern C++ with a lot of expertise dedicated to details such as instructions per cycle, lock-free algorithms, a log-structured cache, heat-based load balancing, workload prioritization, userspace schedulers and much more. It all adds up to huge improvements in latency and throughput. This lets Scylla scale vertically on modern servers with hundreds of cores per machine. Even when data is not evenly balanced (the “hot node” and “hot partition” problems) you can still enjoy solid performance. It’s also worth noting that Scylla does not require an expensive cache such as DAX in front of it.
- Openness! Scylla is open source. You can run it anywhere for free, in any deployment model: on-prem, hybrid cloud, multi-cloud, containerized, virtualized, bare metal, etc. You won’t be locked in to one vendor’s expensive cloud platform. Read more below.
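The two ways the cost benefit is quoted above (a percentage saved and an "Nx less expensive" multiple) describe the same quantity. A minimal sketch of the conversion follows; the function name is ours for illustration, and only the 5x and 14x figures come from the benchmark claim above.

```python
def savings_fraction(cost_ratio: float) -> float:
    """Fraction of spend saved when an alternative is `cost_ratio` times cheaper."""
    if cost_ratio <= 0:
        raise ValueError("cost_ratio must be positive")
    # Paying 1/cost_ratio of the original bill saves the remainder.
    return 1.0 - 1.0 / cost_ratio


for ratio in (5, 14):
    print(f"{ratio}x less expensive -> {savings_fraction(ratio):.1%} saved")
```

Running it shows that 5x corresponds to 80% saved and 14x to roughly 93% saved, matching the range quoted in the Cost bullet.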
My co-founder Avi Kivity and I are going to give an in-depth look into Scylla Alternator via a webinar on Wednesday, September 25, 2019 at 10 AM Pacific, 1 PM Eastern. You won’t want to miss it!
Open Source First
Our goal at Scylla is to become the default NoSQL database by giving developers more and better choices for their deployments. With years of open source development experience, we are strong believers in this path and have thus decided to release our Alternator code to the world first, before an official product release. This is a proven method to get a better product out the door with constant, bidirectional feedback from knowledgeable developers.
A global project should be open, a property that all parties enjoy and benefit from. In a world where cloud providers routinely commercialize Open Source Software (OSS), leaving little space for the OSS vendor who then, in turn, begins to blur the lines of open source licenses, Scylla remains devoted to OSS. Our chosen OSS licensing model, AGPL, encourages contributions back and provides reasonable support for the retention and longevity of the OSS business model.
It’s not just about avoiding lock-in. Because Scylla is open, you can trace the system and analyze the query path. Scylla has wonderful observability, with hundreds of different Prometheus metrics. You no longer need to treat the database as a black box. Plus, if you need a feature or see a bug and you want to contribute, you can extend or fix the code! Now, if your team doesn’t have that sort of technical depth, you can always run on Scylla Cloud, our fully managed cloud platform. But if you do have a deep bench on your team (and many of our customers do), we give you all the tooling you need to get maximum performance out of your database.
Open source traditionally disrupts commercial vendors. Not only is disruption what we do best, truth be told we also enjoy it!
The Current Status of Alternator
Alternator is still in development. It is not yet generally available and we haven’t created a product release with it. However, it is part of the Scylla source code right now.
Our alternator.md and design doc provide detailed information on what is and is not yet supported today. In short, most standard applications will just work. The JSON HTTP API is mostly implemented, indexing works, multi-zone support is implemented, and many more features will work. There are consistency differences that arise from the fact that DynamoDB itself has a leader/follower model, whereas Scylla implements an active-active model. This may be an issue in certain cases. For anyone looking to use Alternator, we advise you to first take a close look at the documentation and your code.
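To make "JSON HTTP API" concrete, here is a rough sketch of what a DynamoDB-style request looks like on the wire, built with the Python standard library only. The `X-Amz-Target` header and `application/x-amz-json-1.0` content type are part of the DynamoDB low-level API; the endpoint address, table name, and item are placeholders we made up for illustration, and request signing is omitted. Nothing is sent over the network, so this says nothing about which operations Alternator currently accepts.

```python
import json
import urllib.request

# Assumption: an Alternator-enabled node reachable at this address.
# Host and port are placeholders, not values taken from the article.
ENDPOINT = "http://localhost:8000/"


def build_request(operation: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a DynamoDB-style JSON-over-HTTP request."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/x-amz-json-1.0",
        # The operation name travels in a header, not in the URL path.
        "X-Amz-Target": f"DynamoDB_20120810.{operation}",
    }
    return urllib.request.Request(ENDPOINT, data=body, headers=headers, method="POST")


# Hypothetical table and item, using DynamoDB's typed attribute encoding.
put_item = build_request(
    "PutItem",
    {"TableName": "users", "Item": {"id": {"S": "42"}, "name": {"S": "Ada"}}},
)
print(put_item.get_method(), put_item.full_url)
print(put_item.data.decode("utf-8"))
```

In practice an application would keep using its existing DynamoDB SDK and only change the endpoint it points at; the sketch just shows the shape of the traffic that has to be understood on the server side.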
Within the next few months we plan to harden the code, bringing it to production quality by inserting it into our robust quality assurance cycles. We will also completely implement all of the nitty-gritty API differences. In the future, we will offer a Scylla Enterprise release containing the Alternator software and also release a version to run on Scylla Cloud.
Future updates to our managed DBaaS will run on Azure and GCP, but you won’t have to wait. As soon as Alternator is released you will be able to run it on your own Amazon, Azure and Google Cloud instances. We also plan on having a General Availability (GA) version for our Kubernetes operator so you can fully deploy and manage a DynamoDB-compatible database wherever you wish.
We will also address the load balancing differences between Scylla and DynamoDB clients. A Scylla client is topology aware and can access any node. A DynamoDB client, by contrast, receives a load-balanced IP address through DNS resolution, with a time to live (TTL) of 4 seconds.
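As an illustration of the gap described above, one simple way a client could spread requests across several nodes without relying on DNS is plain round-robin over a known node list. This is only a sketch of the general idea, not the approach Scylla has chosen; the node addresses are hypothetical.

```python
import itertools
from typing import Iterator, Sequence


def round_robin(nodes: Sequence[str]) -> Iterator[str]:
    """Yield node addresses in rotation, forever."""
    if not nodes:
        raise ValueError("at least one node address is required")
    return itertools.cycle(nodes)


# Hypothetical addresses of three nodes in the same cluster.
nodes = ["http://10.0.0.1:8000", "http://10.0.0.2:8000", "http://10.0.0.3:8000"]
picker = round_robin(nodes)
for _ in range(4):
    # Each request would be sent to the next node in the rotation.
    print(next(picker))
```

A real solution also has to deal with nodes joining and leaving the cluster, which a static list like this does not handle.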
When Scylla’s workload prioritization feature is enabled, developers can assign more resources to crucial workloads and cap less important ones. This makes it possible, for example, to mix analytics and real-time operational loads on the same cluster.
At our Scylla Summit (November 5-6, 2019) we plan to release Scylla’s first Lightweight Transaction (LWT) feature. Initially based on Cassandra’s Paxos, LWT will allow us to add consistency to Alternator. In parallel we are working on a Raft version that has a leader/follower model and better performance.
We’re also going to announce our own streaming feature, Change Data Capture (CDC), and we may add compatibility with DynamoDB streams later on.
In another upcoming release, Scylla will introduce User Defined Functions (UDFs) and, soon after, map-reduce computation based on them. UDFs are far more efficient than serverless alternatives since the functions are executed on the server.
To make the switch as easy as possible we’re working on online migration tools that will be relatively simple: just start streaming the changes from DynamoDB and run a full scan. The Scylla Spark Migrator project will be enhanced to support the DynamoDB-compatible API.
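The "stream the changes plus run a full scan" idea can be sketched with an in-memory toy. The assumption here is ours, not a description of the planned tools: change capture starts before the scan begins, the scan results are loaded first, and the captured changes are then replayed in order. Because puts and deletes are idempotent, it does not matter whether the scan already observed some of those changes.

```python
from typing import Dict, Iterable, Optional, Tuple

Change = Tuple[str, str, Optional[int]]  # (operation, key, value)


def migrate(scan_items: Iterable[Tuple[str, int]],
            change_log: Iterable[Change]) -> Dict[str, int]:
    """Load a full scan, then replay changes captured since the scan started."""
    target: Dict[str, int] = {}
    for key, value in scan_items:
        target[key] = value
    for op, key, value in change_log:
        if op == "put":
            target[key] = value
        elif op == "delete":
            target.pop(key, None)  # deleting a missing key is a no-op
        else:
            raise ValueError(f"unknown operation: {op}")
    return target


# Toy data: the scan saw a stale value for "b" and still saw "a".
scan = [("a", 1), ("b", 2)]
changes = [("put", "b", 3), ("delete", "a", None), ("put", "c", 4)]
print(migrate(scan, changes))
```

The toy ignores everything that makes real migration hard (throughput limits, ordering across partitions, cutover), but it shows why the two ingredients together are enough to converge on the source's final state.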
Beyond DynamoDB Compatibility
Interested in more protocols? Send us a GitHub pull request or open a GitHub issue. As Scylla looks to apply our expertise beyond traditional NoSQL environments, it makes more and more sense to add new functionality and protocols. For example, we’ve already been asked (nicely) for a Redis API in the issue “[feature] add redis protocol for drop-in replacement redis”. And guess what? Work is already in progress to merge the Pedis project (“Parallel Redis” built on Seastar) into Scylla.
You can find out more and stay current with the status on the new Project Alternator home page. And for those who want more details, you can register for our upcoming webinar, all about Project Alternator.
We have longer-term plans, but we’ll leave those for another day and get back to coding. If you’ve enjoyed reading this, add a GitHub star or take us for a spin using docker-compose with step-by-step instructions.