Casper FFG: A consensus protocol aiming to achieve Proof of Stake

share
Casper FFG: A consensus protocol aiming to achieve Proof of Stake

In 2017, Vitalik Buterin and Virgil Griffith jointly introduced Casper the Friendly Finality Gadget (Casper FFG). Casper FFG is a consensus protocol inspired by PBFT and improved upon, designed to be simple yet not easy in its security proof. In this article, the author will analyze the principles of Casper FFG, giving readers insight into the problems addressed by the Proof of Stake consensus and its design philosophy. Furthermore, Casper FFG is the consensus mechanism of Ethereum 2.0, understanding its operation can help researchers and developers further comprehend the design of Ethereum 2.0. Special thanks to Ethereum researcher Chih-Cheng Liang for providing essential materials, engaging in extensive discussions, and feedback with the author, without his assistance, this article would not have come to fruition.

Table of Contents

How did Casper FFG start?

Research on Proof-of-Stake (PoS) for Ethereum can be traced back to an article in 2014. Since then, Ethereum researchers have been working towards the goal of implementing a PoS-based consensus protocol. The design of PoS consensus is a complex problem that involves computer science, economics, cryptography, and other aspects. Ethereum has one of the most interdisciplinary teams in the blockchain ecosystem, making their research on PoS quite thorough.

Recently, I translated an important document on the development context of Casper FFG:

The Love Affair of Casper FFG and Casper CBC
A tweet storm explaining the history and state of Ethereum’s Casper research
medium.com

Casper FFG was inspired by PBFT and can be seen as an improved version of PBFT - it inherits the important designs of PBFT while adding new mechanisms and simplifying certain rules. If readers are unfamiliar with PBFT, they can refer to my analysis of PBFT posted earlier:

A Classic Not to Be Overlooked for Understanding Blockchain: PBFT

What are the important features of PBFT? Can PBFT issue cryptocurrencies?
medium.com

In short, PBFT is a consensus protocol with a two-round voting mechanism and has the following characteristics:

  • Permissioned: Only nodes that are "allowed" can participate in the consensus.
  • Leader-based: Only the leading node is responsible for proposing, while other nodes only vote, requiring a view-change mechanism to control dishonest leading nodes.
  • Communication-based: Uses deterministic majority decisions to reach consensus, rather than non-deterministic proof-of-work puzzles.
  • Safety-over-Liveness: Regardless of network latency, the protocol can guarantee the security of the consensus (no forks), giving the protocol the characteristic of instant finality.

The instant finality feature possessed by PBFT is perhaps the main reason it caught Vitalik's attention. After studying PBFT thoroughly, Vitalik also summarized his findings and proposed important ideas that later evolved into Casper FFG.

The Precursor of Casper FFG: 4 Slashing Rules

Although PBFT has instant finality, it lacks the ability to resist collusion, so it requires a punishment mechanism to deter malicious behavior. Any node that violates the rules must suffer economic losses - the design concept of PoS is to regulate node behavior through economic principles. Any node that deposits a stake can join the network to participate in consensus without permission, making consensus based on the PoS model permissionless.

Here, it is important to clarify the concept of "permission". We say it is "permissionless" because any verifying node can join and leave. However, if a list of validating nodes is maintained when joining, it might be considered "permissioned" from this perspective. From the perspective of PBFT, validating nodes for voting must also be selected from a permitted list.

The next question is: which behaviors should be punished? After careful consideration of PBFT, Vitalik discovered that PBFT only needs 4 rules (assertions) to ensure the proper functioning of consensus:

Minimal Slashing Conditions
Last week Yoichi released a blog post detailing the process of formally proving safety and liveness properties of my…
medium.com

Vitalik summarized these 4 rules in this article and referred to them as PBFT's "Minimal Slashing Conditions". Any behavior that violates these 4 rules will result in the loss of stake. The 4 rules are as follows:

1. Commitment (commit_req): Only commit after receiving prepared messages from 2/3 nodes.
2. Preparation (prepare_req): Each prepared message can only point to a height (Epoch) with prepared messages from 2/3 nodes, and these prepared messages must all point to the same height.
3. Prepare-commit consistency (prepare_commit_consistency): Any new prepared message can only point to the last committed or other higher heights.
4. No double prepare (no_double_prepare): Cannot send two prepares at the same height.

These 4 rules can be further simplified into 2:

A validating node v must not issue two different votes:<ν, s1, t1, hs1, ht1> and 
<ν, s2, t2, hs2, ht2>,and make one of the following conditions true:
1. ht1 = ht2The validating node must not issue two different votes at the same height.
2. hs1 < hs2 < ht2 < ht1The validating node must not vote at a height surrounding/being surrounded by another vote height.

These 2 rules are the minimal slashing conditions of Casper FFG.

How does Casper FFG work?

Casper FFG: Checkpoint Tree

Casper FFG is an overlay chain that abstracts the block proposing mechanism and is solely responsible for achieving consensus. The block proposing mechanism is implemented at the underlying chain level, and blocks coming from the underlying chain are called checkpoints. Checkpoints form a checkpoint tree, for example: extracting the hash values of blocks at heights 0, 50, 100, 150 to form a new tree as shown in the image above. The bottommost checkpoint is called the root checkpoint.

Every node must vote on checkpoints, and the content of the vote consists of a link composed of two different height checkpoints, with the lower height being the source and the higher height being the target. Nodes broadcast their votes to the network and collect votes from other nodes. If the total stake of nodes voting on a link L exceeds 2/3 of all stakes, L is considered a supermajority link, denoted as s → t. In the image above, b1/b2/b3 all form supermajority links, represented as b1 → b2, b2 → b3, etc.

Starting from the root checkpoint, if a supermajority link is formed between two checkpoints, the target of that link enters the "justified" state; and at the time the link is established, the source that is already in the "justified" state enters the "finalized" state; the root checkpoint is by default "justified" and "finalized." After two rounds of voting at each checkpoint, they first "justify" and then "finalize," almost equivalent to PBFT's "prepare" and "commit." For example, in the right branch of the image, r/b1/b2 are all in the "finalized" state, with only b3 in the "justified" state.

So, which checkpoints should validating nodes link to? Each node must follow the Fork Choice Rule to select the next checkpoint to link to. Casper FFG's rule is to choose the highest "justified" checkpoint.

Casper FFG: Fork Caused by Largely Change of Validator Set

As Casper FFG allows any staking node to become a validating node, the validator set dynamically changes over time. Nodes must wait for a period from exiting the network to withdrawing their stake, known as the withdrawal delay. Each checkpoint C has its corresponding dynasty number, defined as the number of finalized checkpoints from the root checkpoint to C. For example, in the image above, the dynasty number of b3 is 3. Each checkpoint generation corresponds to two types of committees: a front-end validation node set (including nodes added in this generation) and a rear-end validation node set (including nodes leaving in this generation). Theoretically, the front-end/rear-end sets of each generation should highly overlap, but collusion among nodes can cause significant changes in the front-end/rear-end sets. If this occurs, the protocol may fail to slash the stake of malicious nodes (since they have left), posing a threat to security. For example, in the image above, validating node A can exit, which means for fork C', A has exited, but for fork C, A has never exited. Therefore, A can continue to vote on the old chain C, but the new chain C' cannot slash A's stake (because A has already exited).

To ensure proper accountability for errors at each checkpoint, a stitching mechanism is needed to "stitch" together the front-end/rear-end sets of checkpoints, ensuring that every error can be properly attributed (whether the error lies in the front-end or rear-end set).

Overall, Casper FFG improves on almost all aspects of PBFT:

  • Economic Constraints: PBFT is permissioned and relies on organizations with existing trust bases to operate the protocol together; Casper FFG is permissionless and introduces minimal slashing conditions to regulate node behavior using the risk of economic losses. Nodes can run the protocol together without any trust base, achieving true decentralization.
  • Abstract Block Proposing Mechanism: PBFT relies on honest leading nodes to propose blocks and requires a view-change mechanism to control Byzantine nodes; Casper FFG does not need to concern itself with the underlying block proposing mechanism, only focusing on achieving consensus. The benefit of abstracting the block proposing mechanism is that the block frequency of the underlying chain does not need to match the consensus frequency of the overlay chain, increasing efficiency and reducing network load. For example, only 1 checkpoint is made for every 100 underlying blocks.
  • Streamlined Voting: PBFT has several voting messages such as , , ; Casper FFG only has , and the content of the vote is not a single block/request, but two linked checkpoints, making Casper FFG much more concise. These linked checkpoints, which form a chain structure, go through two rounds of voting at different heights. Since each round of voting finalizes the source and justifies the target, consensus can progress like a pipeline. A similar design concept also appears in Hot-Stuff; interestingly, the author of the paper, Dahlia Malkhi, has written a comparison between Hot-Stuff and