CASE STUDY

Russia’s Largest Email Service Stores User Actions at High-Load with Scylla on Hard Disk Drives

About Mail.Ru

Started in 1998 as an email service, Mail.Ru has grown into a major figure in the Russian-speaking Internet. It now serves 19 million unique daily active users and 47 million unique monthly active users. Mail.Ru’s sites reach approximately 86% of Russian Internet users on a monthly basis, and the company is one of the top 5 largest Internet companies. Mail.Ru also controls and operates the 3 most popular Russian social networking sites: VKontakte, Odnoklassniki, and Moi Mir.

The Challenge

As an email provider, Mail.Ru gives its millions of users the ability to view and manage an ‘action history’: a time series of actions stored per email. Every action carries fields such as the IP address from which the action was performed, the ID of the client that created the action, and the action ID, which indicates whether a user replied to an email, uploaded an image, and so on. The API supports a peak of 65,000 writes per second and a peak of 50 reads per second.
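To make the shape of this data concrete, here is a minimal Python sketch of an action history as a per-email time series. It is an illustration only, not Mail.Ru’s schema: the field names, the `ActionHistory` class, and the in-memory dictionary standing in for the database are all assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Action:
    """One entry in an email's action history (hypothetical field names)."""
    timestamp: datetime
    ip_address: str  # IP address the action was performed from
    client_id: str   # ID of the client that created the action
    action_id: int   # what happened, e.g. reply or image upload (codes invented)


class ActionHistory:
    """In-memory stand-in for the store: one time series per email ID."""

    def __init__(self) -> None:
        self._by_email = defaultdict(list)

    def record(self, email_id: str, action: Action) -> None:
        # Append the action to the series that belongs to this email.
        self._by_email[email_id].append(action)

    def history(self, email_id: str) -> List[Action]:
        # Return the actions for one email, oldest first.
        return sorted(self._by_email.get(email_id, []), key=lambda a: a.timestamp)


if __name__ == "__main__":
    store = ActionHistory()
    now = datetime.now(timezone.utc)
    store.record("email-1", Action(now, "203.0.113.7", "web-client", 1))
    for entry in store.history("email-1"):
        print(entry)
```

Grouping by email and ordering by time mirrors the "time series per email" description; a real deployment would express the same grouping through the database's own keys rather than a Python dictionary.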

Having started out with a homegrown solution, Mail.Ru began looking for alternative storage options because of that solution’s limitations:

  • Poor scalability, requiring too many nodes to handle increases in traffic
  • Poor maintainability, since it was built from scratch internally
  • Lack of essential features, such as secondary indexes, tunable replication, and a query language

“We saved at least $150,000 of capital expenses per petabyte.”

Kirill Alexseev, Software Engineering Technical Lead, Mail.Ru

The Solution

The team decided to go with Scylla as storage for user actions. Today, Mail.Ru runs Scylla on bare metal in two datacenters, with 4 nodes in one and 5 nodes in the second. Mail.Ru’s Scylla cluster handles 240,000 writes per second with a 95th-percentile latency of ~1.5ms and a 99.9th-percentile latency of ~22ms. It supports a peak of 100 reads per second with a 95th-percentile latency of ~400ms and a 99.9th-percentile latency of ~650ms.
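The latency figures above are percentiles: a 95% latency of ~1.5ms means that roughly 95 out of every 100 writes complete within that time. As a rough illustration of how such a figure can be derived from raw measurements, the sketch below uses the nearest-rank method. The `percentile` helper and the sample values are invented for this example and do not reflect how Mail.Ru or Scylla compute their metrics.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # 1-based rank of the sample that covers pct percent of the data.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


if __name__ == "__main__":
    # Invented write latencies in milliseconds, for illustration only.
    latencies_ms = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 6.0, 22.0]
    print("p50:", percentile(latencies_ms, 50))
    print("p95:", percentile(latencies_ms, 95))
```

High percentiles such as 99.9% are dominated by the slowest requests, which is why they can sit far above the 95% figure even when most operations are fast.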

Scylla has enabled Mail.Ru to achieve the following results:

  • A high-load service for storing user actions with Scylla and HDDs
  • 240K writes per second with a 95th-percentile latency of 1.5ms
  • Reads are served via secondary keys with predictable performance

But why HDDs instead of SSDs? Kirill Alexseev, Software Engineering Technical Lead at Mail.Ru, explains: “By using HDDs, we saved at least $150,000 of capital expenses per petabyte compared to an SSD setup.”

In the coming year, Mail.Ru plans to add a new datacenter, optimize Scylla and clients to achieve even better latencies, and integrate Scylla into even more projects.