Scylla Summit Preview: Rebuilding the Ceph Distributed Storage Solution with Seastar
We’d like to get to know you a little better, Kefu. Outside of technology, what do you enjoy doing? What are your interests and hobbies?
I try to teach myself French in my spare time. And I read sci-fi and play FPS video games sometimes when I am tired of memorizing conjugations. 😃
You have broad experience developing kernel modules, client-side applications, middleware, testing frameworks and HTTP servers. What path led you to getting hands-on with Seastar?
Higher throughput, more IOPS and predictably lower latency have become the “holy grail” of storage solutions. Those words sound familiar to you, right? That's because database and storage systems share almost the same set of problems nowadays. It’s natural for us to look for solutions in database technologies, and the Seastar framework behind Scylla is appealing. That’s why we embarked on our journey of rebuilding Ceph with Seastar.
For those not familiar with Ceph, what are its primary features? How would you compare it to other storage options?
Ceph is an open-source distributed storage platform that offers block, object, and filesystem access to the data stored in the cluster. Unlike numbers, real-world entities are often difficult to compare. Even one apple can be very different from another, depending on the perspective. So instead, I’d like to highlight some things that differentiate Ceph from other software-defined storage solutions:
- It has an active user and developer community.
- It’s a unified storage solution. One doesn’t need to keep multiple different clusters for different use cases.
- It’s designed to be decentralized and to avoid a single point of failure.
What will you cover in your talk?
I will introduce Ceph, the distributed storage system we are working on, explain the problems we are facing, and then talk about how we are rebuilding this system with Seastar.
Though you are sure to get into this deeper in your discussion, what advantages are you seeing in the Seastar framework?
As you might know, I was working on another C++ framework named Mordor a couple of years ago, before C++11 brought future/promise to us. Mordor, C++11 and Seastar all offer coroutines, so that when a fiber makes a blocking call, the library automatically moves another runnable fiber onto the thread and the thread itself is not blocked. But Seastar went further by enforcing the share-nothing model with zero tolerance for locking. I think this alone differentiates Seastar from other coroutine frameworks. It forces developers to rethink their designs.
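To make the share-nothing idea a bit more concrete, here is a minimal sketch in Python. It is not Seastar code and does not use Seastar's API. It assumes `asyncio` tasks on a single thread as a stand-in for Seastar's per-core shards, and all names (`Shard`, `owner`, `submit`, `demo`) are hypothetical. Each shard owns a private dictionary that only its own task touches, and other code reaches that data by sending a message and awaiting a future for the reply, so no locks are needed.

```python
import asyncio
import zlib


class Shard:
    """One shard: owns a private store and drains its own mailbox."""

    def __init__(self, shard_id):
        self.shard_id = shard_id
        self.store = {}                 # touched only by this shard's task
        self.mailbox = asyncio.Queue()  # the only way in from outside

    async def run(self):
        while True:
            op, key, value, reply = await self.mailbox.get()
            if op == "stop":
                reply.set_result(None)
                return
            if op == "put":
                self.store[key] = value
                reply.set_result(True)
            else:  # "get"
                reply.set_result(self.store.get(key))


def owner(key, n_shards):
    """Map a key to the shard that owns it (stable hash, assumed scheme)."""
    return zlib.crc32(key.encode()) % n_shards


async def submit(shards, op, key, value=None):
    """Send a request to the owning shard and await its reply future."""
    reply = asyncio.get_running_loop().create_future()
    shard = shards[owner(key, len(shards))]
    await shard.mailbox.put((op, key, value, reply))
    return await reply


async def demo(n_shards=4):
    shards = [Shard(i) for i in range(n_shards)]
    tasks = [asyncio.create_task(s.run()) for s in shards]

    # Issue the writes concurrently; awaiting a reply never blocks the thread.
    await asyncio.gather(
        *(submit(shards, "put", f"obj{i}", i * i) for i in range(8))
    )
    values = [await submit(shards, "get", f"obj{i}") for i in range(8)]

    # Shut every shard down with a message, not by reaching into its state.
    for s in shards:
        reply = asyncio.get_running_loop().create_future()
        await s.mailbox.put(("stop", None, None, reply))
        await reply
    await asyncio.gather(*tasks)

    # Each key lives on exactly one shard, so the totals add up to 8.
    placements = sum(len(s.store) for s in shards)
    return values, placements


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a real share-nothing runtime each shard would be pinned to its own core with its own memory. The single-threaded event loop here only imitates the programming model: data has one owner, and everything else goes through messages and futures.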
What were some unique challenges in integrating Seastar into Ceph?
Unlike some other projects based on Seastar, Ceph was not designed from scratch with Seastar, so we need to overcome some more interesting difficulties. Also, Ceph is a very dynamic project, so it’s like trying to catch someone who is ten miles ahead of you and still running.
Where are you in the process of integration?
We are rebuilding the infrastructure in Ceph using Seastar. It’s almost done.
What are you looking to do next?
We want to get to the I/O path as soon as possible to understand how Seastar impacts performance.
Is there anything Scylla Summit attendees need to know in order to get the most out of your talk? What technology or tools should they be familiar with?
It would be ideal if attendees had a basic understanding of the typical threading models used by servers. But I will also cover this part briefly.
Thank you for your time, Kefu! Looking forward to seeing you on stage in November!
If you haven’t arranged your own travel plans to attend Scylla Summit 2018, it’s not too late! Don’t delay! Register today!