Started in 1998 as an email service, Mail.Ru has grown into a major figure in the Russian-speaking Internet, serving 19 million unique daily active users and 47 million unique monthly active users. Mail.Ru’s sites reach approximately 86% of Russian Internet users on a monthly basis, and the company is one of the top 5 largest Internet companies. Mail.Ru also controls and operates the 3 most popular Russian social networking sites: VKontakte, Odnoklassniki, and Moi Mir.
As an email provider, Mail.Ru gives its millions of users the ability to view and manage an ‘action history’, a time series of actions stored per email. Every action has user fields, such as the IP address from which the action was performed, the ID of the client that created the action, and the action ID, which indicates whether a user replied to an email, uploaded an image, and so on. The API supports 65,000 peak writes per second and a peak of 50 reads per second.
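The shape of an action history can be sketched in a few lines of Python. This is only an illustrative in-memory model, not Mail.Ru's actual schema: the `Action` and `ActionHistory` names, the field types, and the `REPLIED` / `UPLOADED_IMAGE` codes are all assumptions made for the example.

```python
from bisect import insort
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical action codes; the real action IDs are not given in the text.
REPLIED = 1
UPLOADED_IMAGE = 2


@dataclass(frozen=True, order=True)
class Action:
    """One entry in an email's action history (field names are assumed)."""
    timestamp: datetime   # listed first so actions sort by time
    action_id: int        # what the user did, e.g. REPLIED
    client_id: str        # ID of the client that created the action
    ip_address: str       # IP address the action came from


class ActionHistory:
    """In-memory stand-in for the store: one time series per email."""

    def __init__(self) -> None:
        self._by_email = defaultdict(list)

    def record(self, email_id: str, action: Action) -> None:
        # insort keeps each per-email list ordered by timestamp.
        insort(self._by_email[email_id], action)

    def history(self, email_id: str) -> list:
        # Return a copy so callers cannot mutate the stored series.
        return list(self._by_email.get(email_id, []))


if __name__ == "__main__":
    store = ActionHistory()
    store.record("email-1", Action(
        datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc),
        REPLIED, "client-1", "192.0.2.1"))
    for entry in store.history("email-1"):
        print(entry)
```

In a real deployment the per-email grouping and the time ordering would be handled by the database rather than by application code; the sketch only shows the access pattern of many appends and occasional per-email reads.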
Having started out with a homegrown solution, Mail.Ru began looking for alternative storage options because of a set of limitations:
- Poor scalability, requiring too many nodes to handle increases in traffic
- Poor maintainability, since the system was built from scratch internally
- Lack of essential features, such as secondary indexes, tunable replication, and a query language
“We saved at least $150,000 of capital expenses per petabyte.”
– Kirill Alexseev, Software Engineering Technical Lead, Mail.Ru
The team decided to go with ScyllaDB as storage for user actions. Today, Mail.Ru runs ScyllaDB on bare metal in two datacenters, with 4 nodes in one and 5 nodes in the second. Mail.Ru’s ScyllaDB cluster handles 240,000 writes per second with 95th-percentile latency of ~1.5 ms and 99.9th-percentile latency of ~22 ms. It supports a peak of 100 reads per second with 95th-percentile latency of ~400 ms and 99.9th-percentile latency of ~650 ms.
ScyllaDB has enabled Mail.Ru to achieve the following results:
- A high-load service for storing user actions with ScyllaDB and HDDs
- 240K writes per second with 95th-percentile latency of 1.5 ms
- Reads are served via secondary keys with predictable performance
But why HDDs instead of SSDs? Kirill Alexseev, Software Engineering Technical Lead at Mail.Ru, explains: “By using HDDs, we saved at least $150,000 of capital expenses per petabyte compared to an SSD setup.”
In the coming year, Mail.Ru plans to add a new datacenter, optimize ScyllaDB and clients to achieve even better latencies, and integrate ScyllaDB into even more projects.