Snapfish Clicks with Scylla, Achieving Global Scale and High Performance

About Snapfish

Snapfish is a web-based photo-sharing and printing service dedicated to helping people transform their photos into beautiful personalized objects, such as photo books, mugs, prints, and gifts, that can be shared and treasured. Snapfish provides its members, amateur and professional photographers alike, with unlimited photo storage and free uploads.

Based in San Francisco, California, Snapfish is owned by District Photo.

The Challenge

As a premier global photo-sharing service, Snapfish manages a staggering quantity of data. Over the last 18 years, Snapfish’s more than 1 million members have uploaded over 40 billion images, and that number keeps growing.

According to Brent Williams, principal engineer at Snapfish, “to work at this scale, we need the ideal balance of performance, stability, and cost. We found that MongoDB wasn’t up to the challenge.”

On a typical day, Snapfish handles over 1 million uploads. During the holiday season, that number can easily quadruple. These real-time numbers are compounded by backups, which have to move 40 billion files a year at a rate of about 80,000 concurrent reads and writes per minute, without impacting production workloads.
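
As a rough sanity check on those backup figures, the short Python sketch below converts 40 billion files a year into a per-minute rate. It assumes the migration runs around the clock and counts one operation per file; both are simplifying assumptions for illustration, not details confirmed by Snapfish, and the function name is hypothetical.

```python
# Back-of-the-envelope check of the backup throughput quoted in the text.
FILES_PER_YEAR = 40_000_000_000     # from the text: 40 billion files a year
MINUTES_PER_YEAR = 365 * 24 * 60    # assumption: the migration runs 24/7


def required_files_per_minute(files_per_year: int,
                              minutes_per_year: int = MINUTES_PER_YEAR) -> float:
    """Average number of files that must move each minute to finish in a year."""
    return files_per_year / minutes_per_year


rate = required_files_per_minute(FILES_PER_YEAR)
print(f"{rate:,.0f} files per minute")
```

Under these assumptions the average works out to roughly 76,000 files per minute, which is in line with the 80,000 reads and writes per minute cited above.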

“Scale is incredibly important to us because the amount of traffic we generate on our own internal systems with this file migration can really impact our production systems if we’re not careful,” added Williams.

The Solution

MongoDB, the NoSQL database platform in use at Snapfish, wasn’t up to the task, so the team evaluated other options.

Initially, Snapfish was excited by the number of concurrent operations that Scylla supports. To prove it out, Snapfish simulated heavy migrations running simultaneously with user-facing workloads. The performance results blew past expectations.

Running on a small cluster that hadn’t even been tuned for production, Scylla hit Snapfish’s production throughput requirement of 3,000 read operations per second. The team saw even better results after moving to a 3-node Scylla cluster running on AWS. In the end, Scylla delivered a 5x performance improvement over MongoDB.

As Williams observes, “It’s clear that Scylla solves constraints and optimizes performance within the operating system and TCP layer, for example, areas where other database architectures punt.”

According to Williams, the Snapfish team had a great experience working with ScyllaDB support. “Having experienced boilerplate advice from database vendors, it was refreshing to see the Scylla team really dig into the question of how to scale properly. Other vendors just tell you to add more RAM. Scylla applies machine learning to your environment and auto-tunes itself for optimal performance.”

Cutting costs while achieving performance targets is always a win. Snapfish was delighted to achieve both goals with Scylla. “With our current vendor, the cost of hardware and licensing combined would have been prohibitive. Our estimates show tremendous cost savings from adopting Scylla.”

According to Williams, “What initially interested us is the number of concurrent operations that Scylla supports. In the end, Scylla turned out to be the fastest NoSQL out there.”