Kiwi.com is an online flight booking platform that builds customized travel itineraries by assembling flight combinations from multiple airlines. Kiwi.com saves travellers money on airline tickets by generating itineraries that mix and match global airlines with local carriers, finding the best price for the trip as a whole.
Kiwi.com’s latest service, Nomad, enables users to add multiple cities, with no requirements on ordering or specific travel dates. Nomad leverages the ‘traveling salesman’ algorithm to generate an itinerary that falls within the traveler’s general requirements, and at the best price.
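To make the idea concrete, here is a minimal sketch of a traveling-salesman-style search over city orderings. It is not Kiwi.com's implementation: the `FARES` table, the airport codes chosen, and the `cheapest_round_trip` helper are all hypothetical, and the brute-force enumeration is only practical for a handful of cities.

```python
from itertools import permutations

# Hypothetical one-way fares between cities (illustrative numbers, not real data).
FARES = {
    ("PRG", "MUC"): 60, ("PRG", "BCN"): 90, ("PRG", "LIS"): 120,
    ("MUC", "PRG"): 55, ("MUC", "BCN"): 70, ("MUC", "LIS"): 110,
    ("BCN", "PRG"): 95, ("BCN", "MUC"): 65, ("BCN", "LIS"): 40,
    ("LIS", "PRG"): 130, ("LIS", "MUC"): 100, ("LIS", "BCN"): 50,
}

def cheapest_round_trip(home, cities, fares):
    """Try every visiting order and return the cheapest (route, total price)."""
    best_route, best_price = None, float("inf")
    for order in permutations(cities):
        route = (home, *order, home)
        # Sum the fare of each consecutive leg in this ordering.
        price = sum(fares[(a, b)] for a, b in zip(route, route[1:]))
        if price < best_price:
            best_route, best_price = route, price
    return best_route, best_price

route, price = cheapest_round_trip("PRG", ["MUC", "BCN", "LIS"], FARES)
print(route, price)
```

Because the number of orderings grows factorially with the number of cities, a production system would presumably rely on heuristics and on real, date-dependent prices rather than exhaustive search over a static table.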
Kiwi.com must handle heavy traffic, with 90,000 daily queries and 25,000 seats booked per day. But the underlying challenge is the ever-increasing size of the graph composed of travel segments and routes. While there are about 100,000 flights per day, Kiwi.com is interested in storing flight combinations, such as Lufthansa’s route from Prague to JFK connecting in Munich. The combinatorics of the data set result in 7 billion combinations, a number that is continuously growing. At the same time, the data itself is constantly updating, as airlines modify prices for various flight combinations. About 60% of the data is refreshed every day, 80% within 3 days, and 100% within 10 days.
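The following sketch illustrates why combinations outnumber individual flights. It joins hypothetical flight segments into one-stop itineraries whenever the layover fits a time window. The segment data, the `one_stop_combinations` helper, and the layover limits are assumptions made for illustration, not Kiwi.com's data model.

```python
from collections import defaultdict

# Hypothetical segments: (flight id, origin, destination, departure, arrival),
# with times as minutes since midnight. Illustrative values only.
SEGMENTS = [
    ("F1", "PRG", "MUC", 420, 480),
    ("F2", "PRG", "MUC", 600, 660),
    ("F3", "MUC", "JFK", 540, 1050),
    ("F4", "MUC", "JFK", 720, 1230),
    ("F5", "PRG", "JFK", 630, 1200),
]

def one_stop_combinations(segments, min_layover=45, max_layover=360):
    """Pair segments that connect at the same airport within the layover window."""
    by_origin = defaultdict(list)
    for seg in segments:
        by_origin[seg[1]].append(seg)

    combos = []
    for first in segments:
        # Candidate second legs depart from the first leg's destination.
        for second in by_origin[first[2]]:
            layover = second[3] - first[4]
            returns_to_start = second[2] == first[1]
            if min_layover <= layover <= max_layover and not returns_to_start:
                combos.append((first[0], second[0]))
    return combos

combos = one_stop_combinations(SEGMENTS)
print(combos)
```

Even in this toy data set, four connecting segments yield three valid Prague to JFK itineraries via Munich. Allowing more stops and more carriers multiplies the count further, which is consistent with the growth the text describes.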
“If you’re considering moving from Cassandra to ScyllaDB, I don’t know what’s holding you back!”
Martin Strycek, Engineering Manager, Kiwi.com
Kiwi.com initially built their service on Postgres, attempting to scale through sharding and by adding instances. This created a management nightmare, since the team had to resort to manually distributing data over clusters running individual Postgres instances. When considering NoSQL vs SQL, SQL was clearly not a good choice for Kiwi.com’s use case.
To improve performance and scalability, Kiwi.com first migrated to Cassandra. But they found their demands never stopped growing, and Cassandra was unable to keep up. “It’s not a uniform unit,” said Martin Strycek, Engineering Manager, Kiwi.com. “It’s 70 different systems running all over the place.” Ultimately, Cassandra proved unable to scale up, even as the team added more and more nodes. Even worse, the team was required to write custom code to read Cassandra SSTables, creating problems with maintenance, upgrades, and so on.
As an international company, Kiwi.com wanted to go global, with at least three data centers in three different cities, running on bare metal. They also wanted to eliminate the headaches of running custom code against the data layer. Another component of the migration was a move from AWS to GCP. To accomplish this, Kiwi.com needed to find an alternative to Cassandra.
Kiwi.com quickly settled on ScyllaDB as a drop-in replacement for Cassandra, but they wanted to prove it out under real-life conditions before making the leap. With a healthy skepticism for vendor benchmarks, Kiwi.com set out to independently evaluate Cassandra versus ScyllaDB. To do so, the team defined equivalent configurations, traffic volumes, and workloads based on the Cassandra benchmark.
The goal was to test ScyllaDB’s raw speed and performance, along with its support for Kiwi.com’s specific workloads. They also wanted some insight into running on bare metal versus a cloud platform, testing GCP against OVH. The final goal of the POC was to evaluate ScyllaDB’s cost relative to the Cassandra cluster they were running.
Kiwi.com worked closely with the ScyllaDB team to establish success criteria for the POC. Once a five-node test bed was set up for each database, Kiwi.com ran a set of synthetic benchmarks, shadowed production traffic, and used internal monitoring tools for reads.
Their tests demonstrated a stark difference between the two databases. With a replication factor of 4, Cassandra required 100 nodes to achieve 40K reads per second. With only 21 nodes, ScyllaDB was able to achieve 900K reads per second.
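A quick back-of-the-envelope calculation puts these figures on a per-node basis. It uses only the node counts and read rates quoted above. The variable names are illustrative, and the comparison is naive: it ignores any differences in hardware or configuration between the two clusters.

```python
# Figures quoted in the text above.
cassandra_nodes, cassandra_reads_per_sec = 100, 40_000
scylla_nodes, scylla_reads_per_sec = 21, 900_000

# Average read throughput handled by a single node in each cluster.
cassandra_per_node = cassandra_reads_per_sec / cassandra_nodes
scylla_per_node = scylla_reads_per_sec / scylla_nodes

# How many times more reads each ScyllaDB node served, on average.
per_node_ratio = scylla_per_node / cassandra_per_node

print(f"Cassandra: {cassandra_per_node:.0f} reads/s per node")
print(f"ScyllaDB:  {scylla_per_node:.0f} reads/s per node")
print(f"Per-node ratio: {per_node_ratio:.0f}x")
```

On these numbers, each Cassandra node averaged 400 reads per second, while each ScyllaDB node averaged roughly 42,900, a per-node difference of about 107x.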
Best of all, Kiwi.com discovered that the running cost of ScyllaDB would be about 25% of the cost of Cassandra. “This is the bottom line,” said Martin Strycek, Engineering Manager, Kiwi.com. “If you are on the money side, just sign on the dotted line and go with ScyllaDB.”
Having made the decision to go with ScyllaDB, the team undertook the migration to GCP and OVH instances running in multiple cities and geographical regions.
“I like ScyllaDB from the technical point of view,” said Jan Plhák, Head of C++ Development at Kiwi.com. “I love that it’s written in C++ and runs on the Seastar framework, which is an amazing open-source project. But from the business point of view, it also helps us grow our company.”
Kiwi.com is also excited about ScyllaDB’s roadmap. The ability to prioritize production traffic over analytics will be a huge advantage, since the many algorithms that Kiwi.com runs against the ScyllaDB clusters will have no discernible impact on the customer experience.
By the Numbers
- Dataset: 7 billion flight entries
- Storage: 11TB in multiple replicas
- Writes per second: 700,000
- Reads per second: 500,000