ScyllaDB Summit Preview: Rebuilding the Ceph Distributed Storage Solution with Seastar

In the run-up to ScyllaDB Summit 2018, we’ll be featuring our speakers and providing sneak peeks at their presentations. This interview in our ongoing series is with Kefu Chai, Software Engineer at Red Hat. Before joining Red Hat, Kefu worked at Morgan Stanley, VMware, and EMC. His presentation at ScyllaDB Summit will be on Rebuilding the Ceph Distributed Storage Solution with Seastar.

We’d like to get to know you a little better, Kefu. Outside of technology, what do you enjoy doing? What are your interests and hobbies?

I try to teach myself French in my spare time. And I read sci-fi and play FPS video games sometimes when I am tired of memorizing conjugations. 😃

You have broad experience developing kernel modules, client-side applications, middleware, testing frameworks and HTTP servers. What path led you to getting hands-on with Seastar?

Higher throughput, more IOPS and predictably lower latency have become the “holy grail” of storage solutions. Those words sound familiar to you, right? That is because database and storage systems share almost the same set of problems nowadays. It’s natural for us to look for solutions in database technologies, and the Seastar framework behind ScyllaDB is appealing. That’s why we embarked on our journey of rebuilding Ceph with Seastar.

For those not familiar with Ceph, what are its primary features? How would you compare it to other storage options?

Ceph is an open-source distributed storage platform that offers block, object, and filesystem access to the data stored in the cluster. Unlike numbers, real-world entities are often difficult to compare. Even one apple can be very different from another, depending on your perspective. Instead, I’d like to highlight some things that differentiate Ceph from other software-defined storage solutions:

  • It has an active user and developer community.
  • It’s a unified storage solution. One doesn’t need to keep multiple different clusters for different use cases.
  • It’s designed to be decentralized and to avoid a single point of failure.

What will you cover in your talk?

I will introduce Ceph, the distributed storage system we are working on, explain the problems we are facing, and then talk about how we are rebuilding this system with Seastar.

Though you are sure to get into this deeper in your discussion, what advantages are you seeing in the Seastar framework?

As you might know, I was working on another C++ framework named mordor a couple of years ago, before C++11 brought future/promise to us. Mordor, C++11 and Seastar all offer coroutines, such that when a fiber makes a blocking call, the library automatically moves another runnable fiber onto the thread, so the thread itself is not blocked. But Seastar went further by enforcing the shared-nothing model with zero tolerance for locking. I think this alone differentiates Seastar from other coroutine frameworks. It forces developers to rethink their designs.

What were some unique challenges in integrating Seastar into Ceph?

Unlike some other projects based on Seastar, Ceph was not designed from scratch with Seastar. So we need to overcome some more interesting difficulties. Also, Ceph is a very dynamic project, so it’s like trying to catch a guy ten miles ahead running away from you.

Where are you in the process of integration?

We are rebuilding the infrastructure in Ceph using Seastar. It’s almost done.

What are you looking to do next? 

We want to get to the I/O path as soon as possible to understand how Seastar impacts performance.

Is there anything ScyllaDB Summit attendees need to know in order to get the most out of your talk? What technology or tools should they be familiar with?

It would be ideal if attendees have a basic understanding of the typical threading models used by servers. But I will also cover this part briefly.

Thank you for your time, Kefu! Looking forward to seeing you on stage in November!

If you haven’t arranged your own travel plans to attend ScyllaDB Summit 2018, it’s not too late! Don’t delay! Register today!

About Peter Corless

Peter Corless is the Director of Technical Advocacy at ScyllaDB. He listens to users’ stories and discovers wisdom in each to share with other practitioners in the industry at large, whether through blogs or in technical presentations. He occasionally hosts live industry events from webinars to online conferences.