Case Study: mParticle Improves System Performance by Migrating to ScyllaDB

By Andrew Katz, CTO & Founder, mParticle

About mParticle

mParticle empowers its customers to accelerate growth in a multi-screen world. A customer data platform for managing mobile applications, the company collects web and app data in one place and streams it to the platforms and systems that matter to a business.

mParticle is used by hundreds of leading brands, including NBC, Turner, Airbnb, Venmo, King, and Spotify. The mParticle platform makes it easier and more transparent for these companies to manage and control the data they send to third parties, helps them segment that data and enrich it with identity information, and gives them better insight into customer utilization.

The Challenge

mParticle processes more than 50 billion messages, 100 billion events and 150 terabytes of raw data every month. In the face of these high volumes, the company must also meet very strict SLAs for each customer.
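To put those monthly totals in perspective, the short calculation below converts them into average per-second rates. It is a rough sketch only: it assumes a 30-day month, decimal terabytes, and perfectly even traffic, none of which the case study states, so real peak rates would be higher.

```python
# Back-of-the-envelope conversion of the monthly volumes quoted above
# into average per-second rates.
# Assumptions (not from the case study): 30-day month, 1 TB = 10**12 bytes,
# and load spread evenly over the month.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000 seconds

MESSAGES_PER_MONTH = 50 * 10**9    # "more than 50 billion messages"
EVENTS_PER_MONTH = 100 * 10**9     # "100 billion events"
RAW_BYTES_PER_MONTH = 150 * 10**12 # "150 terabytes of raw data"

messages_per_second = MESSAGES_PER_MONTH / SECONDS_PER_MONTH
events_per_second = EVENTS_PER_MONTH / SECONDS_PER_MONTH
megabytes_per_second = RAW_BYTES_PER_MONTH / SECONDS_PER_MONTH / 10**6

print(f"messages/s: {messages_per_second:,.0f}")
print(f"events/s:   {events_per_second:,.0f}")
print(f"raw MB/s:   {megabytes_per_second:,.1f}")
```

Under these assumptions the averages come to roughly 19,000 messages, 39,000 events, and 58 MB of raw data per second.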

At the heart of the mParticle platform is a client application that processes the messages it receives, reads all the historical event information from the data store, processes the data through the rules defined by the customer and then writes the appended data back to the data store. All of this has to stream through their systems in near real time.
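The read, process, and write-back cycle described above can be sketched as follows. This is an illustrative outline only, not mParticle's code: the `ProfileStore` class is a hypothetical in-memory stand-in for the real data store, and `process_message` and `count_prior_events` are invented names for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types for illustration; the real message and rule formats
# are not described in the case study.
Event = Dict[str, object]
Rule = Callable[[List[Event], Event], Event]


@dataclass
class ProfileStore:
    """In-memory stand-in for the user profile data store."""
    _rows: Dict[str, List[Event]] = field(default_factory=dict)

    def read_history(self, user_id: str) -> List[Event]:
        # Return a copy so callers cannot mutate stored history.
        return list(self._rows.get(user_id, []))

    def append(self, user_id: str, event: Event) -> None:
        self._rows.setdefault(user_id, []).append(event)


def process_message(store: ProfileStore, rules: List[Rule],
                    user_id: str, message: Event) -> Event:
    """Read history, apply the customer's rules in order, write the result back."""
    history = store.read_history(user_id)
    enriched = dict(message)
    for rule in rules:
        enriched = rule(history, enriched)
    store.append(user_id, enriched)
    return enriched


def count_prior_events(history: List[Event], event: Event) -> Event:
    """Example rule: annotate the event with how many events preceded it."""
    return {**event, "prior_events": len(history)}


store = ProfileStore()
first = process_message(store, [count_prior_events], "user-1", {"type": "app_open"})
second = process_message(store, [count_prior_events], "user-1", {"type": "purchase"})
print(first, second)
```

Because every message triggers both a read of the full history and a write, the latency of the data store sits directly on the processing path, which is why slow reads and writes translate into a backlog.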

mParticle initially used Cassandra for its user profile store and rules engine. Eventually, however, they started seeing performance degradation. When any batch processing or server-to-server data loads took place, the Cassandra cluster nodes would accumulate pending compactions, which would significantly increase read and write latency. The client application would scale down, resulting in a backlog in processing the data.

“The data that the previous day Cassandra was processing in 20 hours, ScyllaDB was now processing in real time.”

Nayden Kolev, Systems Architect, mParticle

Cassandra turned out not to be a good fit for mParticle. Their pain points included:

Backlogs of data processing that far exceeded their SLAs
An over-complicated setup that was hard to modify or scale
Too much human labor involved in tuning Cassandra
Lack of affordable support

The Solution & Results

Searching for a better way, mParticle started with a proof of concept (POC) of ScyllaDB that included a direct comparison with Cassandra. The difference was immediately clear.

“In every test it was glaringly obvious that ScyllaDB was outperforming Cassandra,” said Nayden Kolev, Systems Architect at mParticle. “It wasn’t a small percentage difference. It was absolutely clear to us that ScyllaDB was superior to Cassandra in performance for our workload.”

In addition to ScyllaDB’s performance advantages, mParticle was also impressed by:

Ease of configuration
Self-tuning during installation
Highly responsive and knowledgeable support from ScyllaDB engineers

Having decided on ScyllaDB, mParticle began migrating one client at a time to their ScyllaDB system. They found that migrating to ScyllaDB was very straightforward. There were no changes required to the code or the data model.

“The day after we finished the migration, we started watching the metrics,” recalled Kolev. “We waited for a backlog. And we waited. And there was no backlog. The data that the previous day Cassandra was processing in 20 hours, ScyllaDB was now processing in real time. That for us was the biggest validation we could ever want that we’d made the right decision and that ScyllaDB is really much better performing for us than Cassandra.”

“The fact that ScyllaDB isolates compactions and other tasks into background and foreground tasks and automatically tunes its performance to perform at the optimal level is what makes the cluster perform as well as it does,” explained Kolev. “From the moment the application is installed on the hardware until it starts getting production load, it tunes itself without having any human intervention.”

The mParticle team is also very pleased with ScyllaDB support.

“Cliches like giving 110% or going the extra mile don’t do them justice,” said Kolev. “They helped us along the way when we were doing our testing and we’ve gotten nothing but outstanding support from them ever since.”
