Overheard at Scylla Summit 2019
“It was five years ago when Dor Laor and Avi Kivity had a dream…” So began the opening remarks at Scylla Summit 2019. The dreams of ScyllaDB’s founders have since manifested and made their impact all across the Big Data industry.
The world’s leading Scylla practitioners gathered in San Francisco earlier this month to hear about the latest Scylla product developments and production deployments. Once again your intrepid reporter tried to keep up with the goings-on, live-tweeting the event’s many sessions along with others under the #ScyllaSummit hashtag.
It’s impossible to pack two days and dozens of speakers into a few thousand words, so I’ll give just the highlights. We’ll get the SlideShare links and videos posted for you over the coming days.
Pre-Summit Programs: Training Day and the Seastar Summit
The day before the official opening of Scylla Summit was set aside for our all-day training program. The training program included novice and advanced administration tracks that were well attended. There were courses on data modeling, debugging, migrating, using Scylla Manager, the Scylla Operator for Kubernetes, and more!
Between sessions students flock around ScyllaDB’s Customer Success Manager Tomer Sandler to learn how to migrate to Scylla.
Training Day was held concurrently with a separate-but-related developer conference: the first-ever Seastar Summit, which featured speakers from Red Hat, Lightbits Labs and Vectorized.io as well as ScyllaDB. It was a unique opportunity for those building solutions based on the Seastar engine to engage directly with their peers.
ScyllaDB CEO Dor Laor
The Tuesday keynotes kicked off with ScyllaDB CEO Dor Laor providing a review of how we got to this point in history in terms of scalability. He recounted the C10k problem — the problem with optimizing network sockets to handle 10,000 concurrent sessions. The C10k barrier has since been shattered by orders of magnitude — it is now the so-called “C10M problem” — with real-world examples of 10 million, 12 million or even 40 million concurrent connections.
In an analogous vein, there is also the so-called $1 million engineering problem, which has at its core the following axiom: “it’s easy to fall into a cycle where the first response to any problem is to spend more money.”
Dor dubbed this issue the “D10M” problem, where customers are spending $10 million or more on their cloud provider bill. He asserted large organizations using Big Data could literally save millions of dollars a year if they just had performant systems that scaled to their needs.
To that point, Dor brought to the stage ScyllaDB’s VP of Field Engineering, Glauber Costa, and Packet’s Solution Architect James Malachowski to reveal a new achievement in scalability. They had created a test scenario simulating 1 million IoT sensors sampling temperature data every minute over the course of a year. That resulted in a data set of 536 billion temperature readings — 1.44 billion data points per day.
Against that data, they created a query to check for average, min and max temperatures every minute across a given timespan.
To give an idea of how large such a dataset is, if you were to analyze it at 1 million rows per second it would take 146 hours — almost a week. Yet organizations don’t want to wait for hours, never mind days, to take action against their data. They want immediacy of insights, and the capability to take action in seconds or minutes.
Packet’s Solution Architect James Malachowski (left) and ScyllaDB VP of Field Engineering, Glauber Costa (right) describe the architecture needed to scale to 1,000,000,000 reads per second
This was why ScyllaDB partnered with Packet to run the bare-metal instances needed to analyze that data as fast as possible. Packet is a global bare metal cloud built for enterprises. Packet ran the Scylla database on a cluster of 83 instances. This cluster was capable of scanning three months of data in less than two minutes at a speed of 1.1 billion rows per second!
Scanning the entire dataset from disk (i.e., without leveraging any caching) took only 9 minutes.
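For a sense of scale, the figures above can be sanity-checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the 365-day year and the 90-day "three months" are assumptions, not numbers from the talk. Under those assumptions the yearly total comes out to roughly 525.6 billion rows, in the same range as the quoted total, and the 146-hour estimate falls out directly.

```python
# Back-of-the-envelope check of the dataset size and scan times quoted above.
SENSORS = 1_000_000
READINGS_PER_SENSOR_PER_DAY = 24 * 60   # one reading per minute
DAYS_PER_YEAR = 365                     # assumption: non-leap year
DAYS_IN_THREE_MONTHS = 90               # assumption: "three months" = 90 days

rows_per_day = SENSORS * READINGS_PER_SENSOR_PER_DAY
rows_per_year = rows_per_day * DAYS_PER_YEAR


def scan_seconds(rows, rows_per_second):
    """Time needed to read `rows` at a steady scan rate."""
    return rows / rows_per_second


# At 1 million rows per second, the full year takes about 146 hours.
hours_at_one_million = scan_seconds(rows_per_year, 1_000_000) / 3600

# At 1.1 billion rows per second, three months fit in under two minutes.
three_month_scan_seconds = scan_seconds(rows_per_day * DAYS_IN_THREE_MONTHS, 1.1e9)

# A 9-minute scan of the whole year implies this average rate (rows/second).
full_scan_rate = rows_per_year / (9 * 60)

print(f"rows per day:       {rows_per_day:,}")
print(f"rows per year:      {rows_per_year:,}")
print(f"hours at 1M rows/s: {hours_at_one_million:.0f}")
print(f"3-month scan (s):   {three_month_scan_seconds:.1f}")
print(f"full-scan rows/s:   {full_scan_rate:,.0f}")
```

The implied average rate for the uncached full scan is a little under a billion rows per second, which is consistent with the headline number.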
As Glauber put it, “Bare metal plus Scylla are like peanut butter and jelly.”
Packet’s bare-metal cluster that achieved a billion reads-per-second comprised 83 servers with a total of 2800 physical cores, 34 TB of RAM and 314 TB of NVMe.
And while your own loads may be nowhere near as large, the main point was that if Scylla can scale to that kind of load, it can performantly handle pretty much anything you throw at it.
Yet even if “all companies are software,” Dor also noted that many companies’ software isn’t at all easy. So a major goal of ScyllaDB is to make it EASY for software companies to adopt, use and build upon Scylla. He then outlined the building blocks for Scylla that help address this issue.
- Scylla Cloud allows users to deploy a highly-performant scalable NoSQL database without having to hire a team of experts to administer the back end.
- Project Alternator, ScyllaDB’s DynamoDB-compatible API, is now available in beta on Scylla Cloud. Dor noted how with provisioned pricing 120,000 operations per second would cost $85 per hour on DynamoDB whereas running Scylla Alternator a user could do the same workload for $7.50 an hour — an order of magnitude cheaper.
- With workload prioritization, you can now automatically balance workloads across a single cluster, providing greater cost savings by minimizing wasteful overprovisioning.
Beyond that, several much-needed workhorse features are coming soon to Scylla, such as
- Database backup with Scylla Manager
- Lightweight Transactions (LWT)
- Change Data Capture (CDC) for database updates and
- User Defined Functions (UDFs) which will support data transformations
While Dor observed the trend toward thinking that “all companies are software,” he also recognized that companies are still, at their heart, driven by people, highlighting the case of a Scylla user. He finished by making a bold assertion of his own: if all companies are software, then “Scylla is a damned good choice for your software.”
ScyllaDB CTO Avi Kivity
It was then time for Avi Kivity, ScyllaDB’s CTO, to take the stage. Avi emphasized how Scylla was doubling down on density. While today’s cloud servers are capable of scaling to 60 terabytes of storage, he pointed out how features like Scylla’s innovative incremental compaction strategy will allow users to get the most out of those large storage systems. Also, to safeguard your data, Scylla now supports encryption at rest.
What Avi gets most excited about are the plans for new features, for instance User Defined Functions (UDFs) and User Defined Aggregates (UDAs). Also, now that Scylla provides Cassandra and DynamoDB APIs, Avi noted that there’s also work afoot on a Redis API (#5132) that allows for disk-backed persistence.
Avi clarified there are also going to be two implementations for Lightweight Transactions (LWT). First, a Paxos-based implementation for stricter guarantees, and then, in due time, a Raft implementation for higher throughput.
Avi also spoke about the unique nature of Scylla’s Change Data Capture (CDC) implementation. Instead of being a separate interface, it will be a standard CQL-readable table for increased integration with other systems.
He finished with a review of Scylla release roadmaps for 2020.
ScyllaDB CTO Avi Kivity showed the Release Schedule for Scylla Enterprise, Scylla Open Source and Scylla Manager for 2020
Philip Zimich, Comcast X1
Next up to speak was Comcast’s Philip Zimich, who presented the scope and scale of use cases behind Comcast’s video and DVR services. When Comcast’s X1 platform team began to consider Scylla they had grown their business to 15 million households and 31 million devices. Their data had grown to 19 terabytes per datacenter spanning 962 nodes of Cassandra. They make 2.4 billion RESTful calls per day, with their business logic persisting both recordings and recording instructions: everything from the DVRs and their recording history to back office data, recording intents, reminders, lookup maps and histories.
Their testing led the Xfinity team to spin up a 200 node cluster, first on Cassandra and then on Scylla, to simulate multiple times the normal peak production load of a single datacenter. Their results were startling. Cassandra is known as a fast-write oriented database. In Comcast’s testing it was able to achieve 22,000 writes per second. Yet Scylla was able to get over 26,500 writes per second — an improvement of 20%. On reads the difference was even more dramatic. Cassandra was able to manage 144,000 reads while Scylla was able to get 553,000 reads — an improvement of over 280%.
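The percentages above are easy to misread, so here is a small sketch of the arithmetic. The helper name is purely illustrative. It shows that an "improvement of over 280%" on reads means Scylla delivered roughly 3.8 times Cassandra's read throughput, since the improvement is measured on top of the baseline.

```python
def percent_improvement(baseline, improved):
    """Relative gain of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100


# Throughput figures quoted from Comcast's test (operations per second).
write_gain = percent_improvement(22_000, 26_500)
read_gain = percent_improvement(144_000, 553_000)

# The same read result expressed as a plain ratio.
read_ratio = 553_000 / 144_000

print(f"writes: +{write_gain:.1f}%")
print(f"reads:  +{read_gain:.1f}%  ({read_ratio:.2f}x the baseline)")
```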
Comcast’s testing showed that Scylla could improve their read transactions by 2.8x and lower their long-tail (p999) latencies by over 22x
Differences in latencies were similarly dramatic. Median reads and writes for Scylla were both sub-millisecond. Scylla’s p999s were in the single-digit millisecond range. Under all situations latencies for Scylla were far better than for Cassandra, anywhere from 2x to 22x faster.
Table: latencies in milliseconds, with columns for Cassandra, Scylla and Improvement.
With performance testing complete, Comcast moved forward with their migration to Scylla. Scylla’s architectural advantages allowed Comcast to scale their servers vertically, minimizing the number of individual nodes they need to administer.
Comcast’s dramatic node count reduction, from 962 nodes of Cassandra to 78 of Scylla
When fully deployed, Comcast will shrink their deployment radically from 962 servers on Cassandra to only 78 nodes on Scylla. This new server topology gives them all the capacity they need to support their user base but without increasing their costs, capping their spending and sustaining their planned growth through 2022.
Martin Strycek, Kiwi.com
Last year, when the global travel booking giant Kiwi.com took the stage at Scylla Summit, they were still in the middle of their migration and described that stage of their implementation as “taking flight with Scylla.” At this year’s Summit, Kiwi.com’s Martin Strycek continued the analogy with “reaching cruising altitude” and updated the crowd on their progress.
Martin described how they were able to use two specific features of Scylla to make all the difference for their production efficiency.
The first feature that improved Kiwi.com’s performance dramatically was enabling BYPASS CACHE for full table scan queries. With 90 million rows to search against, bypassing the cache allowed them to drop the time to do a full table scan from 520 seconds down to 330 seconds – a 35% improvement.
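For readers who have not used it, BYPASS CACHE is a clause appended to a CQL SELECT statement. The sketch below only builds the statement text; the keyspace and table names are hypothetical, and in a real application the resulting string would be handed to whatever CQL driver you already use.

```python
def full_scan_query(keyspace, table, bypass_cache=True):
    """Build a full table scan statement, optionally skipping the row cache.

    The keyspace and table passed in are illustrative placeholders only.
    """
    query = f"SELECT * FROM {keyspace}.{table}"
    if bypass_cache:
        # Scylla-specific clause: do not populate the cache with rows that
        # a one-off analytical scan is unlikely to read again.
        query += " BYPASS CACHE"
    return query


scan_cql = full_scan_query("travel", "bookings")
print(scan_cql)
```

The design idea is that a scan touching every row would otherwise evict the hot data that latency-sensitive queries depend on.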
The second feature was the SSTable 3.0 (“mc”) format. Enabling this allowed Kiwi.com to shrink their storage needs from 32 TiB of total data down to 22 TiB on disk, a 31% reduction.
The amount of disk used per server shrank once Kiwi.com enabled SSTable 3.0 (“mc”) format.
Enabling these features was a smooth, error-free operation for the Kiwi.com team. Martin finished his presentation by thanking ScyllaDB, especially Glauber, for making the upgrade experience entirely uneventful: “Thank you for a boring database.”
We take that as high praise!
After announcing the winners of the Scylla User Awards for 2019 the keynotes continued with Glauber Costa returning to the stage to share tips and tricks for how to be successful with Scylla.
First off, he distinguished for long-time Cassandra admins what to remember from their prior experience (data model and consistency issues) from what they’ll need to forget, such as trying to tune the system the exact same way as before, because many of the operational aspects of Cassandra may work completely differently or may not even exist in Scylla.
In terms of production hardware, Glauber suggested NVMe if minimizing latency is your main goal, and SSD if you need high throughput. But forget about using HDDs or any network interface below 1 Gbps if you care about performance. And, as a general rule, avoid using a larger number of smaller nodes when fewer, larger nodes would provide the same total amount of resources. In practice, however, it is acceptable to use smaller nodes to smooth out expansion.
Another key point Glauber touched upon was rack awareness. It is best to “run as many racks as you have replicas.” So if you have a replication factor of three, use three racks for your deployment. This provides perfect balance and perfect resiliency.
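To see why matching racks to replicas helps, here is a toy model of rack-aware placement. It is a deliberately simplified sketch and not Scylla's actual placement algorithm: nodes sit on a ring, and replicas are chosen by walking the ring and preferring racks that do not hold a copy yet. All names are hypothetical.

```python
from collections import Counter


def build_ring(racks, nodes_per_rack):
    """Interleave nodes so that ring neighbours sit in different racks."""
    return [(f"{rack}-n{i}", rack) for i in range(nodes_per_rack) for rack in racks]


def place_replicas(ring, rf, start):
    """Pick `rf` replicas walking clockwise from `start`, one per rack first."""
    distinct_racks = len({rack for _, rack in ring})
    chosen, used_racks = [], set()
    # Pass 1: at most one replica per rack until we run out of racks.
    for step in range(len(ring)):
        if len(chosen) == min(rf, distinct_racks):
            break
        node = ring[(start + step) % len(ring)]
        if node[1] not in used_racks:
            chosen.append(node)
            used_racks.add(node[1])
    # Pass 2: any remaining replicas must share a rack with an earlier one.
    for step in range(len(ring)):
        if len(chosen) == rf:
            break
        node = ring[(start + step) % len(ring)]
        if node not in chosen:
            chosen.append(node)
    return chosen


def worst_rack_share(racks, nodes_per_rack, rf):
    """Most replicas of any single partition that end up in one rack."""
    ring = build_ring(racks, nodes_per_rack)
    worst = 0
    for start in range(len(ring)):
        per_rack = Counter(rack for _, rack in place_replicas(ring, rf, start))
        worst = max(worst, max(per_rack.values()))
    return worst


three_racks = worst_rack_share(["rack1", "rack2", "rack3"], nodes_per_rack=2, rf=3)
two_racks = worst_rack_share(["rack1", "rack2"], nodes_per_rack=3, rf=3)
print(three_racks, two_racks)
```

In this model, three racks with a replication factor of three never place more than one copy of a partition in a rack, so losing a whole rack still leaves two of three replicas. With only two racks, some partitions keep two copies in the same rack, and losing that rack leaves a single replica.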
These are just two of many topics that Glauber touched upon, and we’ll work to get you the full video of his presentation soon. For now, here are his slides:
Alexys Jacob, Numberly
Known to the developer community as @ultrabug, Alexys is the CTO of Numberly. His presentation compared MongoDB and Scylla based on his production experience with both. Numberly uses both of these NoSQL databases in their environment, and Alexys wished to contrast their purpose, utility, and architecture.
For Alexys, it’s not an either-or situation with NoSQL databases. Each is designed for specific data models and use cases. Alexys’ presentation highlighted some of the commonalities between both systems, then drilled down into the differences.
His operations takeaways for the two systems were unsparing but fair. He hammered MongoDB’s claims about their sharding-based clustering, which in his view had both poor TCO and poor operations. By contrast, he said that replica-sets should be enough, and gave them a moderate TCO rating (“vertical scaling is not a bad thing!”) and a good operations rating (“Almost no tuning needed”).
He gave Scylla a good rating for TCO due to its clean and simple topology, maximized hardware utilization and capability to scale horizontally (as well as vertically). For operations, what Alexys wanted was even more automation.
Alexys observed there are complex and mandatory background maintenance operations for Scylla. For example, while compactions are seamless, repairs are still only “semi-seamless.”
From Alexys’ perspective, MongoDB favors flexibility over performance, while Scylla favors consistent performance over versatility.
In their session SmartDeployAI spoke about democratizing Machine Learning and AI by making deployment of resources easy through Kubernetes
This year Scylla Summit boasted over thirty breakout sessions, which spanned from the afternoon of the first day to the afternoon of the second. In due time we’ll share the video and slides for each, but for now, a few highlights.
- JanusGraph was quite prevalent this year, with back-to-back sessions hosted by our friends at Expero and Enharmonic, and use cases presented by Zeotap for adtech and FireEye for cybersecurity.
- Kubernetes was also top of mind for many, with presentations from Arrikto’s Yannis Zarkadas about the Scylla Operator, as well as SmartDeployAI talking about using Kubernetes to make efficient workflow pipelines for Machine Learning.
- Streaming is now nearly ubiquitous. We were pleased to have Confluent’s Jeff Bean on hand for our die-hard Kafka fans, as well as CapitalOne’s Glen Gomez Zuazo who gave an overview of streaming technologies that included Kafka but also touched on other options like RabbitMQ and NATS.
- Scylla continues to make new inroads across various industries, from IoT (Augury, Mistaway and Nauto) to retail and delivery services (Fanatics, iFood), security (Lookout, ReversingLabs, and FireEye), utilities/energy (SkyElectric) and consumer video (Comcast and Tubi).
- Scylla Cloud user Mistaway showcased how using Scylla’s database-as-a-service allowed them to keep their focus on their application and their customers, not their infrastructure.
SkyElectric spoke about the incredible opportunities of providing renewable energy (solar and wind) to power the world using their solutions built upon Scylla Open Source
There were also many sessions from ScyllaDB’s own engineering team, highlighting our major new features and capabilities, best practices, security, sizing, Scylla Manager, Scylla Monitoring Stack and more. Look for the video and slides from those sessions in the days ahead.
Day Two General Sessions
The second day of the conference reconvened after lunch as a single general session.
Goran Cvijanovic, ReversingLabs — You have twenty billion objects you’ve analyzed. Any of them could be malware (loaded with a virus, a trojan horse, or other malicious payload), or so-called “goodware” (known to be benign). What database will be able to keep up a level of analytics to prevent the next major data breach? That’s the problem facing ReversingLabs. They analyze all those objects for what they call a “file reputation” which determines if it is safe, suspicious, or known to be dangerous. The need to be fast and right every time is why they put Scylla at the heart of their TitaniumCloud offering.
ReversingLabs’ Goran Cvijanovic described the scale and scope of their file reputation system
The ReversingLabs backend data store needed to support protobuf format natively, exhibit extremely low latencies (<2 milliseconds), and allow for highly variable record sizes (ranging anywhere from 1 KB of data up to 500 MB). On top of it all, it needed to be highly available and support replication.
ReversingLabs’ results showed average latencies below 6 milliseconds for writes and below 1 millisecond for reads. Even their p99 latencies were 12 ms for writes and 7 ms for reads.
Goran emphasized that one of the key takeaways is to test your chunk size with data compression. In ReversingLabs’ case, it meant they were able to use 49% less storage. (If you want to learn how to take advantage of Scylla’s data compression feature yourself, make sure to read our blog series on compression: Part One and Part Two.)
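To illustrate why chunk size is worth testing, here is a minimal sketch that compresses the same synthetic data in independent chunks of different sizes. Everything in it is an assumption made for illustration: the record layout and verdict labels are invented, and Python's zlib stands in for whichever compressor your tables actually use.

```python
import random
import zlib

# Hypothetical verdict labels, used only to generate repetitive sample data.
VERDICTS = ["safe", "suspicious", "malicious"]


def make_records(count, seed=7):
    """Generate deterministic, JSON-like sample records as bytes."""
    rng = random.Random(seed)
    lines = [
        f'{{"sample_id": {i}, "verdict": "{rng.choice(VERDICTS)}", '
        f'"score": {rng.randint(0, 100)}}}\n'
        for i in range(count)
    ]
    return "".join(lines).encode()


def compressed_size(data, chunk_bytes):
    """Total size when each chunk is compressed independently."""
    return sum(
        len(zlib.compress(data[offset:offset + chunk_bytes]))
        for offset in range(0, len(data), chunk_bytes)
    )


data = make_records(20_000)
sizes = {kb: compressed_size(data, kb * 1024) for kb in (1, 4, 16, 64)}

for kb, size in sizes.items():
    print(f"{kb:>2} KB chunks: {size / len(data):.1%} of original size")
```

Larger chunks give the compressor more context, so they usually compress better, but a read then has to decompress a whole chunk to return a single small row. The right balance depends on your record sizes and access patterns, which is why testing with your own data matters.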
Richard Ney, Lookout — Richard Ney is the principal architect behind Lookout’s ingestion pipeline and query services for the security of tens of millions of mobile devices. Their Common Device Service is receiving data at different intervals for all sorts of attributes related to a mobile device: its software and hardware, its filesystem, settings, permissions and configuration, a binary manifest of what is installed, and various analyses of risk.
Lookout’s Richard Ney shared how a simple mobile device, multiplied tens of millions of times, becomes a Big Data security challenge! His talk was on how Lookout solved that challenge with Scylla.
Their existing design had Spark streaming ingestion jobs flowing through this Common Device Service to DynamoDB and Elasticsearch. But the long-term issue was that, as the system grew, costs increased significantly, particularly the costs associated with DynamoDB. In addition, DynamoDB has limits on the primary key and sort key; it was not designed for time series data.
What Lookout is now seeking to do is to replace their current architecture with Scylla plus Kafka, with an eye to leveraging Scylla’s recently announced Change Data Capture (CDC) to flow data through Kafka Connect into Elasticsearch and other downstream services. They also seek to employ incremental compaction with Scylla Enterprise.
Richard described Lookout’s test environment, which emulated 38 million devices generating over 100,000 messages per second. He then noted that the default Kafka partitioner (Murmur2 hash) was not very efficient. The Lookout team implemented their own Murmur3 hash (the same as is used within Scylla) with a consistent jump hash using the Google guava library.
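For those unfamiliar with the technique, the sketch below shows the general idea of jump consistent hashing for assigning keys to partitions. It is not Lookout's code: their implementation used Murmur3 and the Guava library on the JVM, whereas this sketch substitutes BLAKE2b from Python's standard library to turn a key into a 64-bit integer, and the device key names are invented.

```python
import hashlib


def key_to_u64(key):
    """Hash a string key to an unsigned 64-bit integer.

    Stand-in for Murmur3, which is not in the Python standard library.
    """
    digest = hashlib.blake2b(key.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def jump_consistent_hash(key, num_buckets):
    """Map a 64-bit key to a bucket in [0, num_buckets)."""
    bucket, candidate = -1, 0
    while candidate < num_buckets:
        bucket = candidate
        # 64-bit linear congruential step, truncated like unsigned overflow.
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        candidate = int((bucket + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return bucket


def partition_for(device_id, num_partitions):
    """Choose a partition for a (hypothetical) device identifier."""
    return jump_consistent_hash(key_to_u64(device_id), num_partitions)


devices = [f"device-{i}" for i in range(1000)]
before = [partition_for(d, 8) for d in devices]
after = [partition_for(d, 9) for d in devices]
moved = sum(1 for b, a in zip(before, after) if b != a)
print(f"{moved} of {len(devices)} keys moved when growing from 8 to 9 partitions")
```

The appeal of this scheme is that adding a partition only moves keys into the new partition; no key is reshuffled between the existing ones.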
The bottom line for Lookout was that “the cost benefits over the current architecture flow increased significantly as our volume increased.” Richard showed the cost analysis for DynamoDB versus Scylla. In an on-demand scenario, supporting 38 million devices would cost Lookout over $304,000 per month compared to $14,500 for Scylla; a 95% savings over DynamoDB! Even moving to provisioned pricing, the cost of DynamoDB (more than $55,600 per month) would still eclipse Scylla. Scylla would still be 74% cheaper.
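The savings percentages for the 38 million device scenario follow directly from the monthly figures quoted above. A quick sketch, treating the "over" and "more than" amounts as lower bounds (the helper name is illustrative):

```python
def savings_percent(current_cost, alternative_cost):
    """Share of the current cost saved by switching, in percent."""
    return (1 - alternative_cost / current_cost) * 100


SCYLLA_MONTHLY = 14_500                 # USD per month, as quoted
DYNAMODB_ON_DEMAND_MONTHLY = 304_000    # lower bound ("over $304,000")
DYNAMODB_PROVISIONED_MONTHLY = 55_600   # lower bound ("more than $55,600")

on_demand_savings = savings_percent(DYNAMODB_ON_DEMAND_MONTHLY, SCYLLA_MONTHLY)
provisioned_savings = savings_percent(DYNAMODB_PROVISIONED_MONTHLY, SCYLLA_MONTHLY)

print(f"vs on-demand:   {on_demand_savings:.0f}% cheaper")
print(f"vs provisioned: {provisioned_savings:.0f}% cheaper")
```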
Keeping in mind the growth of the mobile market as well as Dor’s allusion to the “million dollar engineering challenge,” Richard extrapolated costs out even further, to a 100,000,000 device scenario. In that case costs for provisioned DynamoDB would rise to $146,000 a month, or roughly $1.88 million annually. For Scylla, at around $38,300 a month, the annualized cost would be less than $460,000. That would mean a savings of over $1.4 million annually.
Shlomi Livne — ScyllaDB’s VP of R&D was up next with a session on writing applications for Scylla. His conceptual process model had a number of steps, each of which could lead back to prior steps. In general:
- Think about the queries you are going to run
- Only then create a data model
- Use cassandra-stress or another tool to validate performance while you…
- Scale test, and finally
During this iterative process, Shlomi encouraged the audience to look for opportunities for CQL optimization and to also pay heed to disk access patterns. Look at the number of I/O operations, and the overall number of bytes read. There are two elements read off disk: the data stored in the SSTables, and the index. Everything else is read from memory.
Shlomi also proposed paying careful attention to your memory-to-disk ratio when you choose your instance types. For example the AWS EC2 i3 family has a memory-to-disk ratio of 1:30. Whereas the newer i3en family has a memory-to-disk ratio of 1:78. Thus, on the latter instance type you will get more queries served off disk.
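As a rough illustration of where such ratios come from, the sketch below divides local NVMe capacity by memory for the largest size in each family. The instance specifications are assumptions recalled from AWS's published figures, not numbers from the talk, so check the current AWS documentation before relying on them.

```python
# Assumed specs for the largest instance in each family (verify with AWS).
INSTANCES = {
    "i3.16xlarge": {"memory_gib": 488, "nvme_gb": 8 * 1900},
    "i3en.24xlarge": {"memory_gib": 768, "nvme_gb": 8 * 7500},
}


def disk_per_memory(spec):
    """Units of local disk per unit of memory (mixes GB and GiB, so approximate)."""
    return spec["nvme_gb"] / spec["memory_gib"]


ratios = {name: disk_per_memory(spec) for name, spec in INSTANCES.items()}

for name, ratio in ratios.items():
    print(f"{name}: memory-to-disk ratio of about 1:{ratio:.0f}")
```

The higher the second number, the smaller the fraction of your data set that can sit in memory, so a larger share of reads has to go to disk.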
Shlomi then led the audience on a comparison of doing a single partition scan versus range scans for performance, and gave examples of how current features such as BYPASS CACHE and PER PARTITION LIMIT, and upcoming features like GROUP BY and LIKE will help users get even more efficiency out of their full table and range scans. The bottom line is that optimized full scans can reduce the overall amount of disk access compared to aggregated single-partition scans.
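One common way to structure a full scan is to split the token ring into contiguous ranges and issue one range query per piece, so the pieces can be run in parallel. The sketch below is not taken from Shlomi's slides; it assumes the default Murmur3 token range of -2^63 to 2^63-1, uses hypothetical keyspace, table and column names, and only builds the CQL text.

```python
MIN_TOKEN = -(2 ** 63)
MAX_TOKEN = 2 ** 63 - 1


def split_token_ring(pieces):
    """Split the token range into contiguous, inclusive (low, high) pairs."""
    span = MAX_TOKEN - MIN_TOKEN + 1
    starts = [MIN_TOKEN + (span * i) // pieces for i in range(pieces)]
    ends = [start - 1 for start in starts[1:]] + [MAX_TOKEN]
    return list(zip(starts, ends))


def range_scan_queries(keyspace, table, partition_key, pieces,
                       per_partition_limit=None):
    """Build one range-scan statement per slice of the token ring."""
    queries = []
    for low, high in split_token_ring(pieces):
        cql = (
            f"SELECT * FROM {keyspace}.{table} "
            f"WHERE token({partition_key}) >= {low} "
            f"AND token({partition_key}) <= {high}"
        )
        if per_partition_limit is not None:
            # Return at most this many rows from each partition.
            cql += f" PER PARTITION LIMIT {per_partition_limit}"
        # Keep the scan from pushing hot rows out of the cache.
        cql += " BYPASS CACHE"
        queries.append(cql)
    return queries


ranges = split_token_ring(4)
queries = range_scan_queries("iot", "readings", "sensor_id", 4)
latest_only = range_scan_queries("iot", "readings", "sensor_id", 4,
                                 per_partition_limit=1)
print(queries[2])
```

Each statement can then be executed by a separate worker, and the number of pieces tuned to the size of the cluster.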
Calle Wilund — Change Data Capture is a compelling new feature presented by our engineer Calle Wilund. It is a way to record changes to the tables of a database, which can then be read asynchronously by consumers. Basically a way to track diffs to records in your Scylla database. What is unique about Scylla’s implementation is that CDC is implemented as a standard CQL-readable table. It does not require a separate interface or application to read them.
A table comparing how Change Data Capture (CDC) is implemented in Scylla versus other NoSQL databases.
There are a number of use cases for CDC, including database mirroring and replication, or to enable specific applications like fraud detection or a Kafka pipeline. The way it is implemented is as a per-table log in the form of a CQL-readable table. The CDC log is colocated with the original data and has a default Time To Live (TTL) of 24 hours to ensure it doesn’t bloat your disk use uncontrollably.
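As a mental model of the design described above, here is a small in-memory sketch of a table that keeps a per-table change log with a time-to-live. It is a conceptual illustration only, not Scylla's implementation: the class, field and operation names are all invented, and timestamps are passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field

CDC_TTL_SECONDS = 24 * 60 * 60  # the 24-hour default mentioned in the talk


@dataclass
class ChangeRecord:
    written_at: float
    key: str
    operation: str      # "insert", "update" or "delete" (illustrative labels)
    new_value: object


@dataclass
class TableWithChangeLog:
    rows: dict = field(default_factory=dict)
    change_log: list = field(default_factory=list)
    ttl_seconds: float = CDC_TTL_SECONDS

    def write(self, key, value, now):
        """Apply a write and record it in the log alongside the data."""
        operation = "update" if key in self.rows else "insert"
        self.rows[key] = value
        self.change_log.append(ChangeRecord(now, key, operation, value))

    def delete(self, key, now):
        """Remove a row and record the deletion."""
        self.rows.pop(key, None)
        self.change_log.append(ChangeRecord(now, key, "delete", None))

    def read_changes(self, since, now):
        """Return unexpired changes written after `since`, oldest first."""
        return [
            record for record in self.change_log
            if record.written_at > since
            and now - record.written_at < self.ttl_seconds
        ]


table = TableWithChangeLog()
table.write("sensor-1", 21.5, now=0)
table.write("sensor-1", 22.0, now=3600)
table.delete("sensor-1", now=7200)

recent = table.read_changes(since=-1, now=7200)
after_a_day = table.read_changes(since=-1, now=CDC_TTL_SECONDS + 10)
print([r.operation for r in recent], [r.operation for r in after_a_day])
```

A consumer simply remembers the last timestamp it processed and asks for everything newer, while the TTL keeps the log from growing without bound.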
We’ll provide more information about this feature and its implementation in a future blog, as well as our documentation. For now, if you want to keep an eye on our progress, make sure to check out issue #4985 in Github and peek in on all of the related subtasks.
Konstantin Osipov — Lightweight Transactions (LWT) that enable “Compare-and-Set” are another long-awaited feature in Scylla (see #1359). Our Software Team Lead Konstantin “Kostja” Osipov gave the audience an update on our implementation status. The good news from Kostja is that you can now try it yourself using Scylla’s nightly builds. There is even a quickstart guide on how to use them in Docker. However, LWT is an experimental feature (use --experimental) unsuitable for production; it is not yet available in a tested maintenance release.
Kostja also noted that while ScyllaDB is doing its best to make our implementation match Cassandra, there are some differences. For example, Scylla supports per-core partitioning, so you would need to use a shard-aware driver for optimal performance.
Kostja made clear that LWT has a performance cost. Throughput will be lower and latencies higher. Four round trips are very costly, especially when working across multiple regions. Right now, ScyllaDB is focusing on improving LWT performance using a Paxos consensus algorithm. There are also plans to implement Raft in the future.
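For anyone new to the idea, "Compare-and-Set" means a write is applied only if the current value matches what the writer expects. The sketch below models just those semantics in a single Python process with a hypothetical seat-booking table. It says nothing about how the distributed version reaches agreement across replicas, which is where the extra round trips come from.

```python
def compare_and_set(table, key, expected, new_value):
    """Write `new_value` only if the stored value equals `expected`.

    Returns (applied, current_value). Use expected=None for "must not exist",
    comparable in spirit to a CQL `INSERT ... IF NOT EXISTS`, while a
    non-None value is comparable to `UPDATE ... IF column = value`.
    """
    current = table.get(key)
    if current != expected:
        return False, current
    table[key] = new_value
    return True, new_value


# Hypothetical example: two users race to reserve the same seat.
seats = {}
first = compare_and_set(seats, "12A", expected=None, new_value="alice")
second = compare_and_set(seats, "12A", expected=None, new_value="bob")
handover = compare_and_set(seats, "12A", expected="alice", new_value="bob")

print(first, second, handover)
```

The second attempt is rejected and learns who currently holds the seat, which is the behavior that plain last-write-wins updates cannot provide.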
As with CDC, we’ll have a full blog and documentation of our LWT implementation in the future.
Nadav Har’El — Project Alternator, the free and open source DynamoDB-compatible API for Scylla, was presented to our audience by our Distinguished Engineer Nadav Har’El. What was new, even to those who have been paying close attention to this project, was that we are now offering Alternator on Scylla Cloud in beta.
An intensive test of DynamoDB vs. Alternator at 120,000 ops shows that running Scylla on EC2 VMs could be an order of magnitude cheaper than DynamoDB.
Ask Us Anything!
The summit ended with its traditional Ask Me Anything session with Dor, Avi and Glauber onstage. We’d like to thank everyone who attended, and all our speakers who made the event such a great success.
Even though Scylla Summit is over, we still invite you to ask us anything! Want more details on a feature we announced? Curious to know how to get the same advantages of Scylla’s performance and scalability in your own organization? Feel free to contact us, or drop into our Slack and let us know your thoughts.