Natura is a Brazilian manufacturer and marketer of beauty, household, and personal care products, including skin care, sunscreens, cosmetics, perfume, and hair care products. Its products are sold through representatives and in more than 3,200 stores in 70 countries around the world. Founded in 1969, Natura is now the largest Brazilian cosmetics company.
Natura’s R&D scientists study each ingredient to understand how to extract the maximum benefits for skin and hair. They work directly with more than 30 local communities in the Amazon region—including more than 300 families—to help them develop sustainable business models that benefit the forest.
Natura runs a streaming architecture built on Kafka and Spark that supports about 400,000 messages per day, with some clusters running 40 long batches per day. Natura uses a mix of databases to match various use cases, including document and key-value scenarios. For key-value use cases, Natura initially ran Cassandra. However, they encountered a number of significant performance issues, many of which were caused by Java and its associated garbage collection.
Natura also found Cassandra to be a hardware hog. Running a 5-node Cassandra ring, Natura needed 11 primary disks to support the needed throughput. Just the hardware, not including software licenses, needed to support such a topology soon became prohibitively expensive. Given the performance problems and administrative overhead, it was clear to the Natura team that they needed an alternative.
While researching other options, Natura discovered Scylla. Running a series of side-by-side comparisons, Natura saw much better, more predictable, low-latency performance with Scylla than with Cassandra.
Natura saw an overall 10% reduction in batch processing time, and in some cases batch runs dropped from 6 hours to below 10 minutes. Write latencies went from milliseconds to microseconds. At the 95th percentile, latency was 220 milliseconds for Cassandra and 500 microseconds for Scylla. Average processing time shrank from 76 to 6 milliseconds.
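To put those figures on a common scale, the short Python sketch below converts them to matching units and computes the improvement ratios. It uses only the numbers quoted above; the variable and function names are illustrative, and the batch figure treats "below 10 minutes" as exactly 10 minutes, so that ratio is a lower bound.

```python
def speedup(before, after):
    """Return how many times smaller `after` is than `before` (same units)."""
    return before / after


# 95th-percentile write latency, both expressed in microseconds.
p95_cassandra_us = 220 * 1000  # 220 milliseconds
p95_scylla_us = 500            # 500 microseconds

# Average processing time, in milliseconds.
avg_before_ms = 76
avg_after_ms = 6

# Best-case batch run, in minutes. "Below 10 minutes" is taken as 10
# (an assumption), so the resulting ratio is a lower bound.
batch_before_min = 6 * 60
batch_after_min = 10

p95_speedup = speedup(p95_cassandra_us, p95_scylla_us)
avg_speedup = speedup(avg_before_ms, avg_after_ms)
batch_speedup = speedup(batch_before_min, batch_after_min)

print(f"p95 write latency: {p95_speedup:.0f}x lower")
print(f"average processing time: {avg_speedup:.1f}x lower")
print(f"best-case batch run: at least {batch_speedup:.0f}x shorter")
```

Read this way, the 95th-percentile latency gap is about 440x, average processing time improves by roughly 12.7x, and the best-case batch run is at least 36x shorter.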
Natura also found that Scylla’s underlying architecture made it possible to achieve the same throughput with Scylla using just 2 primary disks, as opposed to Cassandra’s 11. Deploying on AWS, this configuration produced a reduction in hardware costs greater than 50%.
Felipe Moz, Big Data Engineer at Natura, summed up the experience by saying, “With Scylla, we’re seeing better performance, saving a lot of money, getting great support, and there’s no more JVM.”
“Scylla support gives us direct access to developers who are very familiar with our use cases and data models,” Moz added. “But aside from Scylla’s support, which is amazing, Scylla also enables Natura to scale the way we want to, by cores rather than by nodes.”