Consensus Compare: Casper vs. Tendermint

This technical deep dive was originally published on the Cosmos Blog. Enjoy.

The Long Road to Proof-of-Stake

For definitions, see (Primer).

The Byzantine General’s Problem was first proposed by SIFT in the 1970s to tolerate real-time faults in commercial aircraft and then given the name by Lamport, Shostak, and Pease in 1982. It described the problem of achieving distributed agreement over a compromised communications network — of making “a reliable system from unreliable parts”, as Ethan Buchman of Cosmos puts it. From 1982 to 1999, no one had invented a system that solved the Byzantine General’s Problem. It was irrelevant to computing for a long time because, at the time, the internet had evolved out of cloud-based central computing; all that was needed was fault-tolerance.

So, crash-fault tolerant algorithms like Paxos, invented in the late 1980s, and Raft, invented in 2013, were widely deployed. Practical Byzantine Fault Tolerance (PBFT) was created in 1999, but was not really adopted by anyone outside of academia. The breakthrough that Satoshi Nakamoto made in 2008 was in bringing internet-scale distributed Byzantine Fault Tolerant (BFT)consensus into a blockchain scheme. Once that demonstrationhappened, a lot of the people from the systems research community started to formulate ideas around applying what was largely an academic curiosity to a real-world application.

In 2011, a thread in the BitcoinTalk forum formed discussions around the idea of something called Proof-of-Stake. First-generation PoS protocols such as Peercoin were naively implemented but were steps in the right direction. The first person who really started talking about applying BFT research in a PoS public blockchain context was Jae Kwon, who, in 2014, invented Tendermint.

At the time, PoS research made a large assumption that the set of peers in the system is static and that it would be stable over long periods of time. It was completely unrealistic in the blockchain environment. Jae Kwon’s major breakthrough that enabled Tendermint was in adapting BFT research to the domain of replicated state machines (blockchains), using blocks, hash-linking, dynamic validator sets, and rotating leader election.

In the post-Tendermint environment, a huge number of different consensus algorithms (Honeybadger, Ouroboros, Tezos, Casper) popped up that all incorporate elements of BFT research along with other modular observations on the blockchain.

The keystone problem that all this research that’s been done in Proof-of-Stake forms around the question: Can we achieve the same level of security as a Proof-of-Work (PoW) system like Bitcoin while not depleting physical scarce resources to do it? Proof-of-Stake (PoS), i.e., when voting power is denominated in a native currency rather than hashpower, has been the leading class of research attempting to answer this question. Widely debated problems in blockchain over scalability, high overhead costs of running PoW miners, and environmental externalities all motivate the pouring of considerable resources into PoS security research.

This article explores the unique approaches to Proof-of-Stake powering three of the major PoS protocols in the cryptocurrency space: Casper the Friendly Ghost, research led by Vlad Zamfir, Casper the Friendly Finality Gadget, research led by Vitalik Buterin, and Tendermint, led by Jae Kwon.

Pitfalls of Proof-of-Stake

Nothing-at-Stake

At first, this problem was explored under a variety of phrases to describe the general pitfalls of Proof-of-Stake. Jae Kwon first referred to this problem in May 2014 under the unfortunate name of “the fallacy of false choices”. Vitalik, in July 2014, popularized the now well-defined problem described by Bitcoin developers as “nothing-at-stake”. The problem presents this scenario:validators can effectively break safety by voting for multiple conflicting blocks at a given block height without incurring cost for doing so.

Naive PoS implementations are vulnerable to these attacks. Catastrophically,since there’s no incentive to ever converge on a unique chain and every incentive to sign duplicitously on multiple conflicting chains at once, the economically optimal strategy becomes voting on as many forks as you can find in order to reap more block rewards. The following diagram demonstrates this.

Expected value for voting on competing chains is greater than expected value for voting on a single chain in naive PoS design.

In Proof-of-Work, the “penalty” for mining on multiple chains is that a miner must split up their physical hashing power (scarce resource) to do this. In modern, non-degenerate PoS design, this cost must be baked into the protocol to mimic physical PoW mining constraints.

This attack vector is mitigated with the broadly conceived notion of a ‘slasher’, or in-protocol penalty, introduced by Vitalik Buterin in January 2014. Jae Kwon extrapolated this approach further the same year, a development which materialized into the first iteration of the Tendermint consensus protocol.Slashing, and the conditions that warrant such punishment, is instrumental to all non-degenerate BFT protocols, so much so that it is employed in all three of the consensus protocols presented in this article.

Long Range Attacks

The long range attack draws from the right that users have to withdraw their security deposits. A fundamental problem arises with this because it means an attacker can build up a fork from an arbitrarily long range without fear of being slashed. Once security deposits are unbonded, the incentive not to vote on a long-range fork from some block height ago is removed. In other words,when more than ⅔ of the validators have unbonded, they could maliciously create a second chain which included the past validator set, which could result in arbitrary alternative transactions.

This could be fatal for Proof-of-Stake protocols because the security model is necessarily ‘subjective’. A security model is said to be ‘subjective’ when participation in a network requires large amounts of social information, where new nodes coming onto the network can arrive at different conclusions about the current network state because their decision is based off of subjective information, i.e., social reputation. In contrast, a Proof-of-Work security model is necessarily ‘objective’ because a current network state will always be the state with the most amount of proof of work, and new nodes will arrive at the same conclusion since their decision is based off of objective information.

Long range attacks in PoS are rectified under the weak subjectivity model,which requires the following of new nodes which come onto the network:

Must be currently bonded. Only trust validating nodes that currently have security deposits.
Unbonded deposits must go through a ‘thawing’ period. After unbonding, tokens need time to ‘thaw’ for weeks to months to account for synchrony assumptions (i.e. delayed messages).
Forbid reverting back before N blocks, where N is the length of the security deposit. This rule invalidates any long range fork.
Optionally store the validator set on a PoW chain.

Illustration: Long range attack. Source: Casper the Friendly Finality Gadget

Casper(s) and Tendermint adopt a simple locking mechanism (colloquially called ‘freezing’ in Tendermint) which “locks” stake for a certain period of time(weeks to months of ‘thawing’) in order to prevent any malicious coalition of validators from violating safety. In Casper the Friendly Finality Gadget (CFFG), a fork choice rule to never revert a finalized block prevents the long range attack.By using timestamps, long range forks in CFFG attempting to revise blocks older than a finalized block are ignored by the protocol.

Cartel Formation

A third, final hurdle facing any economic paradigm that’s worth any value faces the very real problem of oligopolies; decentralized protocols with native cryptocurrencies are no exception.

“Cryptocurrency is incredibly concentrated. So is mining power. Oligopolistic competition is the norm in many ‘real-life’ markets. Coordination between a small number of relatively wealthy validators is much easier than coordination between a large number of relatively poor validators. Cartel formation is completely expected, in our context.”

— Vlad Zamfir, The History of Casper Chapter 4

Tendermint relies on extra-protocol governance measures to combat oligopolistic validators. While there are no in-protocol measures for censorship-resistance, the rationale behind relying on out-of-band social information to tackle cartel formation is that the users would eventually andinevitably notice cartels forming, socially gossip about it, and then either abandon or vote to reorganize the blockchain that’s under attack.

Thus far, Vlad’s construction of the Casper protocol is the only model whichexplicitly combats cartel formation with in-consensus censorship-resistant incentives.

Broad Overview

There are varying ways to implement Proof-of-Stake algorithms, but the two major tenets in Proof-of-Stake design are chain-based PoS and Byzantine Fault Tolerant-based PoS. Tendermint is a BFT-based PoS design, Casper the Friendly Ghost is a chain-based PoS design, and Casper the Friendly Finality Gadget is a hybrid of the two.

The CAP Theorem in computer science returns the impossibility of providing more than 2 out of 3 guarantees in distributed data systems: availability, consistency, and partition tolerance. Chain-based PoS algorithms tend to choose availability over consistency, where having availability means that all transactions will be processed, but at the expense of a consistent state replicating across the network. BFT-based PoS algorithms, on the other hand, strictly choose consistency over availability.

Byzantine Fault Tolerant-based Proof-of-Stake

Byzantine Fault Tolerant (BFT) consensus algorithms stem from a rich body of research spanning more than 30 years. Tendermint (2014) is the first adaptation of Proof-of-Stake consensus derived from the Practical Byzantine Fault Tolerant (PBFT) algorithm introduced by Castro and Liskov in 1999.BFT-based PoS protocols pseudo-randomly assign a validator the right to propose new blocks during a multi-round voting process. However,committing and finalizing blocks depends on a supermajority — a >⅔ quorum — of all validators signing off on the proposed block. This may take several rounds, or polkas, before blocks become finalized. BFT systems can only tolerate up to a ⅓ of failures, where failures can include arbitrary or malicious behaviour.

Tendermint BFT

Tendermint consists of two chief technical components: a blockchain consensus engine and a generic application interface. The consensus engine, called Tendermint Core, ensures that the same transactions are recorded on every machine in the same order. The application interface, called theApplication Blockchain Interface (ABCI), enables the transactions to be processed in any programming language.

At the core, Tendermint works as a round-based voting mechanism which makes the consensus protocol. A round is broken up into a three-step process through which validators propose blocks, signal commitment intent and then sign to commit new blocks. This mechanism yields a secure state replication machine for atomic broadcast with an added layer of accountability — safety faults are perfectly attributable in Tendermint.

Tendermint consensus algorithm begins with a set of validators. Validators maintain a full copy of the blockchain and are identified by their public keys.They take turns proposing blocks at each new block height. There is at mostone proposer per voting round. Each proposal is signed by a validator’s corresponding private key so that the validator responsible for it can be identified if some failure were to occur. The rest of the validators then vote on each proposal, signing their votes with their private keys. This constitutes a single round. But it may take several rounds before a new block is committed due to network asynchrony.

A voting round depicting a circuit of partially synchronous proposals, followed by asynchronous voting. After the proposal step, validators only make progress after hearing from ⅔ or more of the other validators. The dotted arrow extends the consensus into atomic broadcast by moving to the next height.

Validators may fail to commit a block for a number of arbitrary reasons; i.e., the current proposer may be offline, or a network may be experiencing latency. Tendermint allows a validator to be skipped. Validators wait a small amount of time to receive a complete proposal block from the proposer before voting to move to the next round. This reliance on a timeout is what makes Tendermint a weakly synchronous protocol, rather than an asynchronous one.However, the rest of the protocol is asynchronous, and validators only make progress after hearing from more than ⅔ of the validator set. As such,Tendermint requires 100% uptime from a supermajority of its validatorsbecause if ⅓ or more are offline or partitioned, the network may halt.

Assuming less than ⅓ of the validators are Byzantine, Tendermint guarantees that safety will never be violated — that is, validators will never commit conflicting blocks at the same height. Therefore, a Tendermint-based blockchain never forks.

The design decision behind Tendermint so far really prioritizes safety and finality over liveness. There’s a fairly high likelihood that in the real world, the system will actually halt, and the participants will have to organize outside of the protocol in some sort of software update to restart the system.

Tendermint formed the basis for Casper research when very few people in the cryptocurrency community understood what the insight was and why it was valuable. The insight was: If the chain itself is highly fault-tolerant, then you can rely on the chain to make good decisions about who can produce blocks.If the chain itself is less reliable, then you end up with this chicken and the egg problem, which is what doomed every other consensus algorithm that came before.

This design decision is perceived as inferior to availability-prioritizing protocols like Ethereum and Bitcoin. The trade-off in Bitcoin is: If its network goes split brain, Bitcoin loses all of its safety guarantees in various attack scenarios. It’s the overall theme of finality, which is the idea that if your confidence interval is less than 100%, that the chain you’re following is the correct chain, and you’re using that chain to select who is going to produce the next block, then once you’ve diverted to an unsafe chain, there’s no clear path back to the correct chain.

Properties at a Glance

Provable liveness in partially synchronous network.
Safety threshold: ⅓ of validators.
Public/private chain compatible.
Instant finality: 1–3 seconds depending on number of validators.
Consistency-prioritizing.
Consensus safety.

Chain-based Proof-of-Stake

Chain-based Proof-of-Stake simulates Proof-of-Work consensus, in that the protocol assigns the right to commit a single new block to a pseudorandomly-selected validator, where the new block is hash-linked to a parent block of the previously longest chain. Chain-based PoS relies largely on synchronous networks, generally prioritizing availability over consistency. Casper(s) is an adaptation of the core ideas of Tendermint to a setting that prefers liveness over safety.

Casper the Friendly Finality Gadget

While Casper the Friendly Ghost is an explicit PoS construct, Casper the Friendly Finality Gadget, or Casper FFG for short, is a PoS overlay on top of the existing Ethereum PoW proposal mechanism — a hybrid PoW/PoS implementation led by Vitalik Buterin.

Neither Bitcoin nor Ethereum’s PoW consensus protocols make ‘final’ decisions, and blocks can potentially be reorganized to some past block height. A block is said to be ‘final’ when there is no chance of it being revised.Since Proof-of-Work provides no such revision guarantees, it is not actually considered to be consensus safe. Instead, blocks get probabilistically more finalized as we move deeper into the chain. In order to add the desired properties of finality and 51% attack resistance to the Ethereum blockchain, implementing FFG logic would ideally provide this effect.

Casper the Friendly Finality Gadget will be rolled out in multiple steps to conservatively transition Ethereum’s PoW security model to a wholly PoS one over time. The first iteration of Casper will be implemented as the hybrid PoW/PoS protocol discussed here. The final iteration of Casper will likely take the lessons from both FFG and CBC toward a complete PoS protocol.

FFG is a hybrid between chain-based PoS and Byzantine Fault Tolerant-based PoS, as it borrows from both schools of thought. Its modular overlay design makes upgrading the existing PoW chain easier, as it is a more conservative approach to upgrading the system to an entirely different consensus model.

Casper’s application logic lives inside a smart contract. To become a validator in Casper, one must hold ether (ETH) and deposit some ETH to be leveraged as stake into the Casper smart contract. The block proposal mechanism in the first iteration of Casper remains the same: It still uses Nakamoto PoW consensus where miners create the blocks. In order to finalize blocks, however, Casper’s PoS overlay takes control and has its validators vote on blocks trailing the PoW miners.

A key component of Casper’s PoS consensus are checkpoints. Casper assesses finality at 50 block increments called checkpoints and each 50 block segment is called an epoch. This process happens via validators sending a Vote message during each epoch.

It takes two epochs, i.e., two rounds of voting, in order to finalize a checkpoint one epoch ago. For example, when > ⅔ of the validators, aka. “supermajority”, vote on a checkpoint, c, that checkpoint is said to be “justified”. Now in the next epoch, when a supermajority vote on that checkpoint, c’, two things happen: c’ becomes justified and c is finalized. The epoch of c is referred to as the last finalized epoch (LFE).

In review, a block1 is said to finalize, f, after two conditions are met:

A supermajority (>⅔) of validators vote on block1 in epoch1, therebyjustifying block1.
A supermajority (>⅔) of validators vote on block2 in epoch2, the direct child block to block1, thereby finalizing block1 during epoch2.

In an ideal execution, a block is finalized in the following steps:

⅔ Vote block1 → Justify block1 → ⅔ Vote block2 → Finalize block1

Where block2 is direct child of block1

Here’s an illustration of Casper’s finality gadget logic. Each block represents a checkpoint. Lines represent 50 blocks (epoch) between each checkpoint. A block becomes the last finalized epoch (LFE) when its child block is justified.

Validators are paid once a checkpoint is finalized.However, if there are two finalized checkpoints on two distinct forks at the same block height, thus violating safety, then slashing conditions come into play and at least ⅓ of the total security deposits staked will be slashed. Evidence for fault attribution when safety violations occur can be broadcasted as a transaction to the PoW miners. PoW miners then mine the block with the evidence transaction, and the validator(s) who submitted the evidence is rewarded a finder’s fee. When this happens, the guilty validator(s) who signed the conflicting blocks is slashed on both chains.

Now what happens if a miner were to execute a brute force PoW attack?Casper’s now finalized blockchain prevents PoW attackers, even with 51% or more of the hashpower, from rewriting history beyond the latest checkpoint.Thus, the Casper protocol provides safety. Unlike Casper the Friendly Ghost, because Casper the Friendly Finality Gadget is just an overlay on top of a separate proposal mechanism, Casper cannot guarantee liveness, as liveness depends on the proposal mechanism.

Validators are incentivized to converge on a canonical chain because they are penalized more over time as validators continue to vote on two divergent chains. With the formalization of ‘slasher 2.0’, validators are not just punished for double-voting but are penalized for voting on the incorrect fork. This does present a ‘discouragement’ predicament where validators end up opting out of voting from fear of getting slashed if there were a fork and they were unsure which one to choose.

Properties at a Glance

Finality: ~20 minutes to finality. Finalized blocks once every 50 blocks (an epoch) makes the blockchain future-proof against brute force PoW mining attacks.
Consensus safety.
Provable liveness.
Availability-prioritizing.

Casper the Friendly Ghost

Casper the Friendly Ghost is Vlad Zamfir’s correct-by-construction consensus protocol, or Casper CBC for short, tailored toward combatting an oligopolistic real world environment. CTFG is a PoS adaptation of the Greedy Heaviest-Observed Subtree, or GHOST protocol in PoW, for its fork choice rule. The guiding design principle behind Casper the Friendly Ghost is based oncryptoeconomics using formal methods aimed to achieve estimate safety. Casper CBC is a pure Proof-of-Stake concept, unlike Casper FFG’s hybrid protocol detailed in the earlier section.

“Casper was born as simply ‘the friendly ghost’, an adaptation of GHOST to Proof-of-Stake, complete with incentives that would make a cartel ‘friendly’ to non-cartel validators.”

— Vlad Zamfir, The History of Casper — Chapter 5

Similar to Proof-of-Work, Casper CBC trades off consistency for availability. In particular, blocks are not finalized; rather, they grow “safer” the deeper down the chain they get. Casper the Friendly Ghost is similar to FFG in that the head of the chain is always progressing faster than the blocks are getting finalized.

The key differentiator in Casper’s PoS proposal mechanism from Tendermint’s is that instead of pseudo-randomly elected leaders, validators in the former can propose blocks based on the blocks that they’ve seen.

The unique functionality Casper offers is parameterizable safety thresholds.Similar to taking 6 confirmations in Bitcoin to determine economic finality, “estimate safety” in CBC provides that one validator can have a different safety threshold than another validator. Casper’s design goal is to allow validators to pick their fault tolerance thresholds while the network maintains the low overhead of Proof-of-Work.

A core advantage that Casper has over Tendermint is in the number of validators that the network can accommodate at any given time. Since block creation in Tendermint requires finality at time of creation, block times should be short. And in order to achieve short block times, there needs to be a limit to the number of validators Tendermint PoS can accommodate. Since neither CBC nor FFG require safety at time of block creation, the Ethereum network would be able to accommodate much more than, say, 100 validators.

Properties at a Glance

Available. Casper’s nodes can have their blocks fork until they come to consensus.
Asynchronously safe.
Live. Casper’s decisions can be live in partial synchrony, but are not live in asynchrony.
Cartel-resistant. Casper’s entire premise is built upon resisting an oligopolistic attacker so that no colluding set of validators can overtake the protocol.
Safety. Depends on each validator’s estimate safety threshold.

Future Work

Public blockchains running in production are a relatively nascent technology.The only security model in this paradigm that has so far shown to be incorruptible is Proof-of-Work. The design space in Proof-of-Stake is far more vast and the engineering tradeoffs far less understood as Proof-of-Stake is a research frontier and there just isn’t enough data. Needless to say, there is a lot of future work to be done to arrive at an optimal PoS consensus algorithm that adopts best practices once we know what those are.

One improvement in Tendermint could be a new proposal mechanism altogether, or one that compresses Tendermint’s multi-round voting process into a single round.

The second area of future work could be in leveraging more advanced cryptography to make the signatures on the block headers smaller. Because we’re building an ‘Internet of Blockchains’ through Cosmos, moving light client proofs from one blockchain to another is core of what we do. It would be hugely advantageous from that standpoint to use more advanced cryptography to decrease the size of the block headers by a factor of thirty or more. Currently, with 100 validators, Tendermint block headers are little less than 4 KB; they’re full of validator signatures. We can make the 100 signatures go from 3.2 KB to 64 bytes by using more advanced cryptography.

There are also ways of optimizing the peer-to-peer layer so that we can significantly decrease the amount of peer-to-peer traffic that’s needed to finalize a block. In future work, this would not only compress the amount of data that goes into the block header, but also decrease the amount of data sent to each peer. This would allow Tendermint to accommodate a much larger set of validators above the initial 100 validator threshold that the Cosmos network can have.

Acknowledgements: Zaki Manian for the history lesson, Sunny Aggarwal & Vlad Zamfir for Casper the Friendly Ghost insights, Virgil Griffith, Karl Floersch and Jon Choi for Finality Gadget review.

Consensus Compare: Casper vs. Tendermint was originally published in Hacker Noon on Medium, where people are continuing the conversation by highlighting and responding to this story.

Publication date

07/11/2018 - 12:31

Author

Chjango Unchained

Article source

Disclaimer

The views and opinions expressed in this article are solely those of the authors and do not reflect the views of Bitcoin Insider. Every investment and trading move involves risk - this is especially true for cryptocurrencies given their volatility. We strongly advise our readers to conduct their own research when making a decision.