mParticle empowers its customers to accelerate growth in a multi-screen world. A customer data platform for managing mobile applications, the company collects web and app data in one place and streams it to the platforms and systems that matter to a business.
mParticle is used by hundreds of leading brands, including NBC, Turner, Airbnb, Venmo, King, and Spotify. The mParticle platform makes it easier and more transparent for these companies to manage and control the data they send to third parties, helps them segment that data and enrich it with identity information, and gives them better insight into customer utilization.
mParticle processes more than 50 billion messages, 100 billion events and 150 terabytes of raw data every month. In the face of these high volumes, the company must also meet very strict SLAs for each customer.
At the heart of the mParticle platform is a client application that processes the messages it receives, reads all the historical event information from the data store, processes the data through the rules defined by the customer and then writes the appended data back to the data store. All of this has to stream through their systems in near real time.
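The read, process, and write-back cycle described above can be sketched roughly as follows. This is a minimal illustration, not mParticle's actual code: the `InMemoryStore` class stands in for the real data store, and every name here (`Profile`, `process_message`, `purchase_count`) is a hypothetical chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Event = Dict[str, object]
# A rule receives the user's full event history and returns
# attributes to append to the profile (hypothetical contract).
Rule = Callable[[List[Event]], Dict[str, object]]


@dataclass
class Profile:
    events: List[Event] = field(default_factory=list)
    attributes: Dict[str, object] = field(default_factory=dict)


class InMemoryStore:
    """Stand-in for the profile data store (assumption: keyed by user ID)."""

    def __init__(self) -> None:
        self._profiles: Dict[str, Profile] = {}

    def read(self, user_id: str) -> Profile:
        # Return an empty profile for users seen for the first time.
        return self._profiles.get(user_id, Profile())

    def write(self, user_id: str, profile: Profile) -> None:
        self._profiles[user_id] = profile


def process_message(
    store: InMemoryStore,
    rules: List[Rule],
    user_id: str,
    new_events: Iterable[Event],
) -> Profile:
    # 1. Read all historical event information for the user.
    profile = store.read(user_id)
    history = profile.events + list(new_events)

    # 2. Run the combined history through the customer-defined rules.
    attributes = dict(profile.attributes)
    for rule in rules:
        attributes.update(rule(history))

    # 3. Write the appended data back to the store.
    updated = Profile(events=history, attributes=attributes)
    store.write(user_id, updated)
    return updated


def purchase_count(history: List[Event]) -> Dict[str, object]:
    """Example rule: count purchase events across the whole history."""
    count = sum(1 for event in history if event.get("type") == "purchase")
    return {"purchase_count": count}
```

Because every incoming message triggers both a full read of the history and a write, a sketch like this makes it easy to see why read and write latency in the data store directly limits how quickly messages can be processed.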
mParticle initially used Cassandra for its user profile store and rules engine. Eventually, however, they started seeing performance degradation. When any batch processing or server-to-server data loads took place, the Cassandra cluster nodes would accumulate pending compactions, which would significantly increase read and write latency. The client application would scale down, resulting in a backlog in processing the data.
Cassandra turned out not to be a good fit for mParticle. Their pain points included:
Searching for a better way, mParticle started by running a proof of concept (POC) of Scylla that included a direct comparison with Cassandra. The difference was immediately clear.
“In every test it was glaringly obvious that Scylla was outperforming Cassandra,” said Nayden Kolev, Systems Architect at mParticle. “It wasn’t a small percentage difference. It was absolutely clear to us that Scylla was superior to Cassandra in performance for our workload.”
In addition to Scylla’s performance advantages, mParticle was also impressed by:
Having decided on Scylla, mParticle began migrating one client at a time to their Scylla system. They found that migrating to Scylla was very straightforward. There were no changes required to the code or the data model.
“The day after we finished the migration, we started watching the metrics,” recalled Kolev. “We waited for a backlog. And we waited. And there was no backlog. The data that the previous day Cassandra was processing in 20 hours, Scylla was now processing in real time. That for us was the biggest validation we could ever want that we’d made the right decision and that Scylla is really much better performing for us than Cassandra.”
“The fact that Scylla isolates compactions and other tasks into background and foreground tasks and automatically tunes its performance to perform at the optimal level is what makes the cluster perform as well as it does,” explained Kolev. “From the moment the application is installed on the hardware until it starts getting production load, it tunes itself without having any human intervention.”
The mParticle team is also very pleased with Scylla support.
“Cliches like giving 110% or going the extra mile don’t do them justice,” said Kolev. “They helped us along the way when we were doing our testing and we’ve gotten nothing but outstanding support from them ever since.”