Consensus Algorithms

Consensus Algorithms Definition

Consensus is a fundamental problem in distributed systems, in which multiple interacting components must agree on system state. Consensus algorithms are designed to enable a collection of distributed machines to work together as a coherent group, even in the presence of failures and outages. As such, consensus algorithms are fundamental building blocks of large-scale, fault-tolerant systems.

In practice, consensus provides a way for multiple servers to reach agreement on system state. Once they reach consensus, the result is final. To define agreement, algorithms set a threshold, the quorum: the number of members that must agree before system state advances. These members may be thought of as machines, servers, or nodes. For example, with a majority quorum, a cluster of 6 nodes needs 4 members to agree and can continue to operate even if 2 servers are faulty.
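The quorum arithmetic can be sketched in a few lines of Python. This is a minimal illustration that assumes a simple majority quorum; the function names are hypothetical and not taken from any particular system.

```python
def majority_quorum(cluster_size: int) -> int:
    """Smallest number of members that forms a strict majority."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Members that can fail while a majority quorum is still reachable."""
    return cluster_size - majority_quorum(cluster_size)


# Print cluster size, quorum size, and tolerated failures for a few clusters.
for n in (3, 5, 6, 7):
    print(n, majority_quorum(n), tolerated_failures(n))
```

Under this assumption, a 6-node cluster tolerates the same number of failures as a 5-node cluster, which is one reason odd cluster sizes are a common choice.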

Image: an example consensus algorithm, such as the Raft consensus algorithm, showing how multiple servers reach agreement on system state.

Consensus Algorithms FAQs

What Are Consensus Algorithms?

Consensus typically arises in the context of replicated state machines, a general approach to building fault-tolerant systems. 
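The replicated state machine idea can be illustrated with a small Python sketch. It assumes a toy key-value store with hypothetical "set" and "delete" commands; the point is only that replicas applying the same log in the same order end in the same state. Agreeing on that log is the part a consensus algorithm provides, and it is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class KeyValueStateMachine:
    """A deterministic state machine: same commands in, same state out."""
    state: dict = field(default_factory=dict)

    def apply(self, command: tuple) -> None:
        op, key, *rest = command
        if op == "set":
            self.state[key] = rest[0]
        elif op == "delete":
            self.state.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")


# The agreed-upon log of commands (hypothetical example data).
log = [("set", "x", 1), ("set", "y", 2), ("delete", "x"), ("set", "y", 3)]

# Three replicas apply the same log independently, in the same order.
replicas = [KeyValueStateMachine() for _ in range(3)]
for replica in replicas:
    for command in log:
        replica.apply(command)

print([replica.state for replica in replicas])
```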

The best-known consensus algorithms are Paxos and Raft. Raft defines not only how the group makes a decision, but also the protocol for adding and removing members of the group, making it a natural mechanism for managing topology changes in distributed systems.

Unlike the related consensus algorithm Paxos, Raft is a leader-based log replication protocol. According to its inventors, Raft is more comprehensible than Paxos, in that it “reduces the degree of nondeterminism and the ways servers can be inconsistent with each other.”
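One piece of leader-based log replication is deciding when a log entry is safely replicated. The sketch below shows the majority rule only: the leader tracks the highest log index known to be stored on each server and treats an index as committed once a majority holds it. The function name and the list-based bookkeeping are assumptions for illustration. Raft's additional rule that the entry must belong to the leader's current term is omitted.

```python
def commit_index(match_index: list[int]) -> int:
    """Highest log index stored on a majority of servers.

    match_index holds, for every server including the leader, the highest
    log index known to be replicated on that server.
    """
    if not match_index:
        raise ValueError("match_index must not be empty")
    quorum = len(match_index) // 2 + 1
    # Sorted in descending order, the value at position quorum - 1 is
    # present on at least `quorum` servers.
    return sorted(match_index, reverse=True)[quorum - 1]


# Five servers: index 5 is on three of them, index 7 only on two.
print(commit_index([7, 7, 5, 4, 2]))
```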

With the invention of Bitcoin in 2009, a new generation of ‘decentralized’ consensus algorithms has emerged. Such algorithms enable distributed systems to achieve consensus even in adversarial, untrusted environments, and so provide ‘byzantine fault-tolerance’ (BFT). The Bitcoin protocol, for example, leverages BFT to solve the ‘double-spend’ problem, ensuring that Bitcoins are cryptographically secured against digital counterfeiting. The consensus mechanism used by Bitcoin is known as Proof-of-Work (PoW). Other notable BFT consensus algorithms include Proof-of-Stake (PoS) and Proof-of-Authority (PoA).
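The core of Proof-of-Work is a hash puzzle that is costly to solve and cheap to verify. The following Python sketch is a simplified illustration, not Bitcoin's actual scheme: Bitcoin applies double SHA-256 to a block header and compares the result against a numeric target, whereas this example hashes arbitrary bytes once and counts leading zero hex digits. The function names and the sample data are hypothetical.

```python
import hashlib


def meets_difficulty(data: bytes, nonce: int, difficulty: int) -> bool:
    """Check whether the hash of data plus nonce starts with enough zeros."""
    digest = hashlib.sha256(data + str(nonce).encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def mine(data: bytes, difficulty: int) -> int:
    """Brute-force search for the smallest nonce that solves the puzzle."""
    nonce = 0
    while not meets_difficulty(data, nonce, difficulty):
        nonce += 1
    return nonce


# Finding the nonce takes many hash attempts; verifying it takes one.
nonce = mine(b"example block", difficulty=3)
print(nonce, meets_difficulty(b"example block", nonce, 3))
```

Each additional zero digit of difficulty multiplies the expected search work by 16 in this sketch, while verification cost stays constant.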

What Consensus Algorithms are supported by ScyllaDB?

ScyllaDB launched an initiative to add greater capabilities for consistency, performance, scalability, stability, manageability and ease of use. As of December 2020, the core Raft protocol is implemented in ScyllaDB.

ScyllaDB supports Lightweight Transactions (LWT) using Paxos, but these transactions require three roundtrips. Raft is enabling ScyllaDB to execute consistent transactions without a performance penalty. Unlike Paxos, which is only used for LWT, most aspects of ScyllaDB will move to Raft, significantly improving manageability and consistency.

Beyond crucial operational advantages, application developers will be able to leverage Raft to enable strong transaction consistency at the price of a regular operation.

ScyllaDB’s Raft implementation covers the following system components:

  • Transactional Schema Changes — Our first user-visible value eliminates schema conflicts and allows full automation of DDL changes under any condition.
  • Transactional Topology Changes — Our next user-visible change will permit adding or removing any number of nodes simultaneously. Currently, ScyllaDB and Cassandra can only scale one node at a time. This means it can take long hours to double or triple the whole cluster’s capacity. Obviously, this is not the elasticity you’d expect if you have bursty intraday traffic.
  • Tablets — Once range ownership becomes transactional, it will allow many levels of freedom. We plan to improve more aspects of range movements, towards tablets and dynamic range splitting for load balancing.
  • Dynamic Tablets — Static Tablets enable ScyllaDB to scale multiple nodes at the same time. Reusing the tablet concept enables ranges to be split dynamically in order to load balance shards and to support unbalanced data models.

ScyllaDB’s short-term and long-term Raft support roadmap was covered in detail at ScyllaDB Summit 2022.
