Case Study: Natura Achieves Beautiful Results with ScyllaDB

By Felipe Moz, Big Data Engineer, Natura

About Natura

Natura is a Brazilian manufacturer and marketer of beauty, household, and personal care products, including skin care, solar filters, cosmetics, perfume, and hair care products. Its products are sold through representatives and in more than 3,200 stores in 70 countries around the world. Founded in 1969, Natura is now the largest Brazilian cosmetics company.

Natura’s R&D scientists study each ingredient to understand how to extract the maximum benefits for skin and hair. They work directly with more than 30 local communities in the Amazon region—including more than 300 families—to help them develop sustainable business models that benefit the forest.

The Challenge

Natura runs a streaming architecture built on Kafka and Spark that supports about 400,000 messages per day, with some clusters running 40 long batches per day. Natura uses a mix of databases to match various use cases, including document and key-value scenarios. For key-value use cases, Natura initially ran Cassandra. However, the team encountered a number of significant performance issues, many of which were caused by Java and its associated garbage collection.

Natura also found Cassandra to be a hardware hog. Running a 5-node Cassandra ring, Natura needed 11 primary disks to support the required throughput. The hardware alone, not including software licenses, needed to support such a topology soon became prohibitively expensive. Given the added performance and administrative overhead, it was clear to the Natura team that they needed an alternative.

The Solution

While researching other options, Natura discovered ScyllaDB. Running a series of side-by-side comparisons, Natura saw much better, more predictable, low-latency performance with ScyllaDB than with Cassandra.

Natura saw an overall 10% reduction in batch processing time; in some cases, batches went from 6 hours to below 10 minutes. Write latencies went from milliseconds to microseconds. At the 95th percentile, latency was 220 milliseconds for Cassandra and 500 microseconds for ScyllaDB. Average processing time shrank from 76 to 6 milliseconds.
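To put these figures on a common scale, the short Python sketch below converts each before-and-after pair into an improvement factor. It uses only the numbers quoted above; the helper name `speedup` is illustrative and not part of any Natura or ScyllaDB tooling. Because the best-case batch time is given as "below 10 minutes," the batch factor should be read as a lower bound for that case.

```python
def speedup(before: float, after: float) -> float:
    """Return how many times smaller `after` is than `before`.

    Both arguments must be expressed in the same unit.
    """
    return before / after


# 95th-percentile latency: 220 ms (Cassandra) vs. 500 µs (ScyllaDB),
# both converted to microseconds.
p95_factor = speedup(220 * 1000, 500)

# Average processing time: 76 ms vs. 6 ms.
avg_factor = speedup(76, 6)

# Best-case batch duration: 6 hours vs. (at most) 10 minutes,
# both converted to minutes.
batch_factor = speedup(6 * 60, 10)

print(f"p95 latency improvement: {p95_factor:.0f}x")
print(f"average processing time: {avg_factor:.1f}x")
print(f"best-case batch time:    at least {batch_factor:.0f}x")
```

On these numbers, the 95th-percentile latency improves by a factor of 440, average processing time by roughly 12.7, and the best-case batch by at least 36.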

Natura also found that ScyllaDB’s underlying architecture made it possible to achieve the same throughput using just 2 primary disks, as opposed to Cassandra’s 11. Deployed on AWS, this configuration reduced hardware costs by more than 50%.

Felipe Moz, Big Data Engineer at Natura, summed up the experience by saying, “With ScyllaDB, we’re seeing better performance, saving a lot of money, getting great support, and there’s no more JVM.”

“ScyllaDB support gives us direct access to developers who are very familiar with our use cases and data models,” Moz added. “But aside from ScyllaDB’s support, which is amazing, ScyllaDB also enables Natura to scale the way we want to, by cores rather than by nodes.”