With more than 70 million items for sale on its platform, Allegro is far and away the largest e-commerce site in Poland, and one of the biggest in Central and Eastern Europe. It’s a destination for more than 14 million monthly users, who account for more than 750,000 purchases on the platform per day.
Allegro puts a premium on site performance. As Lukasz Pachciarek, Senior Database Administrator at Allegro, explains, “Our users expect very fast response times, so we need our databases to respond very quickly.”
The company started using Apache Cassandra in 2013 and now runs 20 Cassandra clusters with more than 250 nodes across two datacenters. The team has, however, run into several problems with Cassandra: garbage collector pauses have hurt cluster performance, as have slower-than-expected maintenance tasks such as restarts, repairs, joins, and cleanups.
“We found we were spending a lot of time doing Java tuning, but our efforts would result in performance gains of only a few percent,” explained Pachciarek. “Worse yet, we were seeing lots of latency spikes.”
A Performance Difference
Having heard about ScyllaDB's performance advantages, the Allegro team decided to test it for themselves, using production traffic for a direct comparison with Cassandra. In the test environment, the team set up two clusters for each database, running in two datacenters, with three nodes per cluster and each node loaded with 300 GB of data. They divided the tests into two parts: production traffic and maintenance tasks.
“Whatever we did, ScyllaDB did it faster,” said Allegro system administrator Szymon Szymanski. “And it wasn’t just 5 or 10 percent better. It was improvements of hundreds of percentage points. ScyllaDB had four times the throughput of Cassandra and a 7X lower response time. Every time we measured latency, for Cassandra there were a lot of spikes while ScyllaDB was very stable.”
Maintenance task performance tests yielded similar results. “We started with repairs, where we found ScyllaDB performs twice as fast as Cassandra,” said Pachciarek. “The difference was even bigger for decommissions—half an hour for ScyllaDB versus three hours for Cassandra. We similarly saw a huge difference for cleanup times.”
The tests also showed significant opportunities to reduce costs. “After carefully reviewing the results of our tests it became clear that we could replace three Cassandra nodes with a single ScyllaDB node,” said Szymanski. “That saves a lot of money on hardware while, more importantly, making our response times so much better.”
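As a rough illustration of what a three-to-one replacement could mean at fleet scale, the arithmetic can be sketched as below. This is a back-of-the-envelope estimate, not Allegro's sizing method: the function name `scylla_nodes_needed` is hypothetical, the figure of 250 nodes is the article's "more than 250" treated as a lower bound, and the sketch assumes the 3:1 ratio from the test applies uniformly across all clusters, which the article does not claim. It also ignores per-cluster minimums such as replication factor.

```python
import math


def scylla_nodes_needed(cassandra_nodes: int, consolidation_ratio: int = 3) -> int:
    """Estimate the ScyllaDB node count for a given Cassandra node count.

    consolidation_ratio is the number of Cassandra nodes one ScyllaDB node
    is assumed to replace (3, per the test result quoted in the article).
    The result is rounded up, since a fractional node still needs a machine.
    """
    if cassandra_nodes < 0 or consolidation_ratio < 1:
        raise ValueError("node count must be >= 0 and ratio must be >= 1")
    return math.ceil(cassandra_nodes / consolidation_ratio)


# Assumption: 250 is used as a lower bound for "more than 250 Cassandra nodes".
cassandra_nodes = 250
scylla_nodes = scylla_nodes_needed(cassandra_nodes)

print(f"ScyllaDB nodes (estimate): {scylla_nodes}")               # 84
print(f"Nodes saved (estimate): {cassandra_nodes - scylla_nodes}")  # 166
```

A real migration plan would size each of the 20 clusters separately, since rounding up per cluster and keeping enough nodes for replication would raise the total above this simple estimate.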
Based on these tests, Allegro is now migrating from Cassandra to ScyllaDB.