Kiwi.com is an online flight booking platform that builds customized travel itineraries by assembling flight combinations from multiple airlines. Using this approach, Kiwi.com saves travellers money on airline tickets by generating itineraries that mix and match global airlines with local carriers, finding the best price for the trip as a whole.
At ScyllaDB Summit 2018, we were joined by two speakers from Kiwi.com covering both the technical and business aspects of their migration from Cassandra to ScyllaDB. The topics they covered include the Cassandra to ScyllaDB migration, benchmarking on bare metal instances from two popular cloud providers, and an analysis of performance results that focuses on full table scans.
In his presentation, Jan Plhak, Head of C++ Development, discussed how the nature of Kiwi.com’s data creates scaling challenges. Kiwi.com stores data on 100,000 flights a day, and 35 million flights a year. That’s not much data. In fact, as Jan pointed out, your phone can store that. What makes it challenging is that Kiwi.com stores flight combinations. This results in 7 billion flight entries, and a replicated dataset of 20 terabytes. A phone can’t store that.
With this background, Jan related Kiwi.com’s journey to ScyllaDB. The team initially used PostgreSQL, but in order to scale, PostgreSQL required custom sharding, with 60 database instances and 60 Redis caches. Jan ironically described this topology as ‘pure joy’.
The team reasoned that a NoSQL database was more appropriate to their use case, so they turned to Apache Cassandra. Ultimately, Cassandra proved unable to scale up, even as the team added more and more nodes. Even worse, the team was required to write custom code to read Cassandra SSTables, creating problems with maintenance and upgrades.
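To make the operational burden of that topology concrete, here is a minimal sketch of application-level shard routing. The talk, as summarized here, does not describe how Kiwi.com actually sharded its data, so the hashing scheme, the key format, the host names, and the one-to-one pairing of each database with a Redis cache are all assumptions made for illustration.

```python
import hashlib

NUM_SHARDS = 60  # from the talk: 60 database instances and 60 Redis caches


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a key to a shard index using a stable hash (hypothetical scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def endpoints_for(key):
    """Return the (database, cache) hosts for a key. Host names are invented."""
    shard = shard_for(key)
    return (f"postgres-{shard:02d}.internal", f"redis-{shard:02d}.internal")


if __name__ == "__main__":
    # "PRG-LHR" is an example key format, not Kiwi.com's real schema.
    print(endpoints_for("PRG-LHR"))
```

Every piece of application code that touches the data has to go through routing of this kind, and changing the number of shards means moving data between instances, which is the sort of work a natively distributed database takes over.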
“If you’re considering moving from Cassandra to ScyllaDB, I don’t know what’s holding you back!”
Martin Strycek, Engineering Manager, Kiwi.com
Massive Full Table Scans
After discussing the journey to ScyllaDB, Jan went into some detail about the requirements for full table scans, and why Cassandra was not up to the task. Cassandra’s limitations forced the team to implement a custom scanning service to read newly created SSTables and stream updates to the cache. ScyllaDB, by contrast, made it easy and safe to do performant full table scans.
Kiwi.com’s precomputation engine requires all of the data, updated every hour. That load, combined with secondary production and testing, put a strain on the production databases. With Cassandra, the team saw CPU overload and massive latency spikes. Jan ascribed this to Cassandra’s underlying Java implementation, as well as the inability to write a query that would read only the most recently updated data.
The Kiwi.com team attempted a Cassandra workaround. Since Cassandra stores immutable data in SSTables during compaction, they could create a service to parse new SSTables, and then stream that data to the cache. The data from the cache could in turn be used to feed the preprocessing engine while sidestepping Cassandra. Jan described this workaround as ‘opening a Pandora’s box’.
Luckily, ScyllaDB made it possible to close this Pandora’s box for good. ScyllaDB enables continuous full table scans that filter for last-update-timestamp. ScyllaDB can also handle token ranges without overloading. This solved many of Kiwi.com’s problems, in particular the need to build workarounds on Cassandra’s internal, undocumented, unsupported format.
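As an illustration of the general technique, the sketch below splits the token ring into contiguous sub-ranges and builds one CQL query per range, each restricted to rows updated after a given timestamp. It is not Kiwi.com's code: the keyspace, table, and column names are hypothetical, the ring bounds assume the default Murmur3 partitioner, and expressing the timestamp filter with ALLOW FILTERING is one possible way to do it rather than the approach the talk confirms.

```python
# Signed 64-bit token bounds, assuming the default Murmur3 partitioner.
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1


def split_token_ring(num_ranges):
    """Split the full token ring into contiguous (start, end] sub-ranges."""
    if num_ranges < 1:
        raise ValueError("num_ranges must be at least 1")
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + (span * i) // num_ranges for i in range(num_ranges + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def build_scan_queries(keyspace, table, partition_key, updated_column, since, num_ranges):
    """Build (query, parameters) pairs, one per token sub-range.

    All identifiers passed in are hypothetical; `since` is the cutoff for the
    last-update filter. The %s placeholders are meant to be bound by a driver.
    """
    query = (
        f"SELECT * FROM {keyspace}.{table} "
        f"WHERE token({partition_key}) > %s AND token({partition_key}) <= %s "
        f"AND {updated_column} > %s ALLOW FILTERING"
    )
    return [(query, (start, end, since)) for start, end in split_token_ring(num_ranges)]


if __name__ == "__main__":
    scans = build_scan_queries("flights", "combinations", "id", "updated_at", 1541030400000, 4)
    for cql, params in scans:
        print(cql, params)
```

Because each sub-range is an independent query, the ranges can be scanned in parallel and the degree of parallelism tuned so that the scan does not starve production traffic.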
The Migration from Cassandra
Martin Strycek, engineering manager at Kiwi.com, spoke to the migration process from Cassandra to ScyllaDB, and provided some context involving TCO. Martin said that Kiwi.com first migrated to Cassandra from a big PostgreSQL cluster to get better performance and scalability, but their demands never stopped growing.
Martin covered the way his team approached testing of ScyllaDB, the migration plan, how the migration impacts the business, and Kiwi.com’s high-level application and infrastructure architecture. In Martin’s view, ScyllaDB has had a significant impact on disaster recovery and availability of the overall system.
According to Martin, Kiwi.com quickly settled on ScyllaDB as a drop-in replacement for Cassandra, but they wanted to prove it out under real-life conditions before making the leap. With a healthy scepticism for vendor benchmarks, Kiwi.com set out to independently evaluate Cassandra versus ScyllaDB. To do so, the team defined equivalent configurations, traffic volumes, and workloads based on the Cassandra benchmark.
The goal was to test ScyllaDB’s raw speed and performance, along with ScyllaDB’s support for Kiwi.com’s specific workloads. They also wanted some insight into running on bare metal or on a cloud platform, testing GCP versus OVH, a popular cloud provider in Europe. The final goal of the POC was to evaluate ScyllaDB’s cost relative to the Cassandra cluster they were running.
Overall, Martin used three approaches to testing:
- synthetic benchmarks
- shadowing production traffic
- internal benchmarking tool for reads
Kiwi.com worked closely with the ScyllaDB team to establish success criteria for the POC. Once the test beds of five nodes each were set up, Kiwi.com ran a set of synthetic benchmarks, shadowed production traffic, and used its internal benchmarking tool for reads.
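Shadowing production traffic generally means serving each request from the existing system while replaying it against the candidate and recording how the candidate behaves. The sketch below shows the idea only; the function and parameter names are hypothetical, and the talk as summarized here does not describe how Kiwi.com implemented its shadowing.

```python
import time


def shadowed_read(query, primary, shadow, timings):
    """Serve `query` from `primary` and replay it on `shadow` for comparison.

    `primary` and `shadow` are any callables that take a query (hypothetical
    stand-ins for database clients). Only the primary result is returned, and
    a failure on the shadow side never reaches the caller.
    """
    start = time.perf_counter()
    result = primary(query)
    timings.append(("primary", time.perf_counter() - start))

    start = time.perf_counter()
    try:
        shadow(query)  # the shadow result is discarded
        label = "shadow"
    except Exception:
        label = "shadow_error"
    timings.append((label, time.perf_counter() - start))
    return result


if __name__ == "__main__":
    recorded = []
    answer = shadowed_read("q1", lambda q: "from-primary", lambda q: "from-shadow", recorded)
    print(answer, [name for name, _ in recorded])
```

In a real deployment the shadow call would typically run asynchronously so that it adds no latency to the user-facing request; it is kept inline here for brevity.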
Their tests demonstrated a stark difference between the two databases. With a replication factor of 4, Cassandra required 100 nodes to achieve 40K reads per second. With only 21 nodes, ScyllaDB was able to achieve 900K reads per second.
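Dividing those totals by the node counts makes the gap easier to see. The short calculation below uses only the figures quoted above and assumes, for the sake of a rough comparison, that reads were spread evenly across the nodes.

```python
def reads_per_node(total_reads_per_sec, nodes):
    """Average reads per second handled by each node."""
    return total_reads_per_sec / nodes


cassandra_per_node = reads_per_node(40_000, 100)  # figures quoted in the talk
scylla_per_node = reads_per_node(900_000, 21)     # figures quoted in the talk
ratio = scylla_per_node / cassandra_per_node

print(f"Cassandra: {cassandra_per_node:,.0f} reads/s per node")
print(f"ScyllaDB:  {scylla_per_node:,.0f} reads/s per node")
print(f"Per-node ratio: about {ratio:.0f}x")
```

That works out to roughly 400 reads per second per Cassandra node against roughly 42,900 per ScyllaDB node, a per-node difference of about two orders of magnitude.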
Best of all, Kiwi.com discovered that the running cost of ScyllaDB would be about 25% of the cost of Cassandra. Martin provided a detailed breakdown of the hardware costs of running Cassandra versus ScyllaDB, on bare metal and Google Cloud Platform:
A comparison of Kiwi.com’s hardware costs between Cassandra and ScyllaDB on cloud platforms
Having made the decision to go with ScyllaDB, the team undertook the migration to GCP and OVH instances running in multiple cities and geographical regions. In fact, Martin’s team installed the final server in the ScyllaDB cluster just before the presentation, displaying shadow traffic from the live system.
Martin pointed out that Kiwi.com is also excited about ScyllaDB’s roadmap. The ability to prioritize production traffic over analytics will be a huge advantage, since the many algorithms that Kiwi.com runs against the ScyllaDB clusters will have no discernible impact on the customer experience.
Martin wrapped up his ScyllaDB Summit talk by encouraging the audience to “never stop innovating”, stating, “This is the bottom line. If you are considering going from Cassandra to ScyllaDB, I don’t know why you didn’t do that last week!”
You can watch Jan’s full presentation (with slides), Kiwi.com Takes Flight with ScyllaDB, and Martin’s Kiwi.com’s Migration to ScyllaDB: The Why, the How, the Fails and the Status, from ScyllaDB Summit 2018 in our Tech Talks section. And if you enjoy these in-depth technical insights from Kiwi.com as well as other NoSQL industry leaders, this is also a good reminder that registration is now open for ScyllaDB Summit 2019.
Register Now for ScyllaDB Summit 2019!
If you enjoyed reading about Kiwi.com’s use case, and want to learn more about how to get the most out of ScyllaDB and your big data infrastructure, sign up for ScyllaDB Summit 2019, coming up this November 5-6, 2019 in San Francisco, California.