Blockchain is a social system! Vitalik discusses whether Ethereum should be an all-powerful L1 or rely on multi-functional L2s

2023/10/3

Ethereum founder Vitalik Buterin recently published a new article titled "Should Ethereum be okay with enshrining more things in the protocol?" In it, he explores whether many technologies currently built on top of Ethereum should be incorporated into the base layer protocol, and offers a framework for making that judgment.

This translation and summary are provided for reference. For any uncertainties, please refer to the original article.

Should more features be incorporated into the Ethereum base protocol?

From the inception of the Ethereum project, there has been a strong philosophical commitment to keeping core Ethereum as simple as possible and doing as much as possible in protocols built on top.

In the blockchain space, the debate between "do it on L1" and "focus on L2" is often seen as primarily about scaling. In reality, the same question arises for many other needs of Ethereum users: digital asset exchange, privacy, usernames, advanced cryptography, account security, censorship resistance, front-running protection, and more. Recently, however, there has been a cautious willingness to incorporate more of these features into the core Ethereum protocol.

This article explores the philosophical reasoning behind the original minimal-enshrinement approach and some of the latest thinking on these ideas. The goal is to start building a framework for identifying cases where enshrining certain functionality in the protocol might be worth considering.

Incorporating ERC-4337

There are several main reasons for integrating ERC-4337 into Ethereum:

Gas Efficiency:
Anything executed inside the EVM incurs some virtual machine overhead. These inefficiencies currently add up to at least about 20,000 gas, and moving the relevant components into the protocol is the simplest way to remove them.
Code Vulnerability Risk:
If the ERC-4337 entry point contract has a severe bug, every ERC-4337-compatible wallet could have its funds drained. Replacing the contract with in-protocol functionality creates an implied responsibility for Ethereum to fix such bugs with a hard fork, which removes this risk for users.
Support for EVM opcodes such as tx.origin:
ERC-4337 on its own makes tx.origin return the address of the bundler that packaged the user operations into a transaction. Native account abstraction could make tx.origin point to the actual account sending the transaction, so that it works the same way as for an EOA.
Censorship Resistance:
Because ERC-4337 is an extra-protocol standard that packs user operations into a single transaction, those operations are opaque to the Ethereum protocol. Inclusion lists provided by the protocol would therefore offer no censorship resistance to ERC-4337 user operations. Enshrining ERC-4337 and treating user operations as a "proper" transaction type would resolve this.
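The tx.origin point in the list above can be illustrated with a toy model. This is a sketch only: the addresses are placeholders, `CallContext` and `context_at_target` are hypothetical names, and the "native" case assumes a design in which the wallet itself originates the transaction.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CallContext:
    tx_origin: str   # account that originated the outer transaction
    msg_sender: str  # direct caller of the contract being executed


def context_at_target(call_chain: List[str]) -> CallContext:
    """Return what a target contract would observe.

    call_chain lists addresses from the transaction originator down to
    the target's direct caller (toy model, not real EVM semantics).
    """
    return CallContext(tx_origin=call_chain[0], msg_sender=call_chain[-1])


# ERC-4337 today: a bundler's EOA sends the transaction to the entry point
# contract, which calls the user's wallet, which calls the target.
erc4337 = context_at_target(["0xBundler", "0xEntryPoint", "0xWallet"])

# Assumed native account abstraction: the wallet is the originator.
native = context_at_target(["0xWallet"])

print(erc4337.tx_origin)  # 0xBundler
print(native.tx_origin)   # 0xWallet
```

In both cases the target sees the wallet as its direct caller; only the reported origin differs.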

Among these reasons, Vitalik focuses in particular on gas efficiency. In its current form, ERC-4337 is much more expensive than a "basic" Ethereum transaction: a basic transaction costs 21,000 gas, while an ERC-4337 user operation costs around 42,000 gas.
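As a quick check of the figures above, the following sketch computes the extra cost implied by the two numbers quoted in the article. The 42,000 figure is approximate, and the function name is only illustrative.

```python
BASIC_TX_GAS = 21_000    # cost of a basic Ethereum transaction
ERC4337_OP_GAS = 42_000  # approximate cost quoted for an ERC-4337 operation


def overhead(basic_gas: int, aa_gas: int) -> tuple:
    """Return (extra gas, cost ratio) of an ERC-4337 operation vs a basic tx."""
    return aa_gas - basic_gas, aa_gas / basic_gas


extra, ratio = overhead(BASIC_TX_GAS, ERC4337_OP_GAS)
print(extra, ratio)  # 21000 2.0
```

The roughly 21,000 gas difference is consistent with the "at least about 20,000 gas" of overhead mentioned under gas efficiency.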

Incorporating ZK-EVM

Since L2 ZK-EVM essentially uses the same EVM as Ethereum, can "verifying EVM execution in ZK" be made a protocol feature in some way? And for special cases like bugs and upgrades, can Ethereum's social consensus alone handle them?

Regarding this, Vitalik points out some subtle differences:

Ethereum wants to let different clients use different proof systems. For any EVM execution proven with one ZK-SNARK system, the protocol therefore needs a guarantee that the underlying data is available, so that proofs can be generated for other ZK-SNARK systems.
Because the technology is still immature, auditability is desirable. Whenever an execution is proven, the underlying data should be available so that users and developers can inspect it if something goes wrong.
Proof times need to be much faster, so that once one type of proof is produced, other types can be generated quickly enough for other clients to verify them. One workaround is a precompile that responds asynchronously after a delay longer than a slot, but this adds complexity.
Ethereum wants to support not just exact copies of the EVM but also "almost-EVM" systems. Ideally, an L2 could still use the protocol's native ZK-EVM for the parts that match the EVM and rely on its own code only for the parts that differ. This could be achieved with a ZK-EVM precompile that lets the caller specify a bitfield, a list of opcodes, or a list of addresses to be handled by an externally supplied table rather than by the EVM itself. Gas costs could also be made customizable to a limited extent.

Overall, Vitalik finds the case for enshrining a ZK-EVM quite compelling: rollups are already building their own custom versions, and it seems wrong that Ethereum puts the weight of its multiple client implementations and off-chain social consensus behind EVM execution on L1, while L2s doing exactly the same work must instead build complicated mechanisms involving security councils.

Incorporating Private Mempools

Sandwich attacks are possible because blockchain transactions are transparent. Private mempools counter this by keeping user transactions encrypted until they are irreversibly accepted into a block. This kind of encryption can be achieved with different technologies and trade-offs, but current solutions depend on centralized operators.

Vitalik believes these centralized solutions have various weaknesses, and that centralized operators cannot be accepted as part of the protocol. Traditional time-lock encryption, meanwhile, is too expensive to run on thousands of transactions in a public mempool.

Incorporating Liquid Staking

To meet users' staking needs on Ethereum, protocols such as Lido and Rocket Pool offer simple interfaces that let users stake with a single click. However, Vitalik points out that liquid staking has a natural centralizing tendency: people gravitate to the largest staking protocol because it is the best known and has the most liquid token. He then highlights the problems that Lido and Rocket Pool each face.

Rocket Pool lets anyone become a node operator with a deposit of only 8 ETH, which lets an attacker 51% attack the network while staking users bear most of the cost. Lido instead has its DAO whitelist node operators; if a single staking token of this kind came to dominate staking, one potentially attackable governance mechanism would control a very large share of validators. Lido has put safeguards in place, but a single layer of defense may not be sufficient.

In the short term, Vitalik sees encouraging the community to use a diverse set of staking protocols as a step in the right direction for reducing centralization risk. However, this is not a stable arrangement. So does it make sense to enshrine some functionality in the protocol to make liquid staking less centralizing?

On this point, Vitalik says the key question is what functionality to enshrine. He considers Rocket Pool's model a viable basis: each node operator puts up part of the ETH, and liquid stakers provide the rest. With only some tuning of parameters, such as capping the slashing penalty at 2 ETH, Rocket Pool's existing rETH would become risk-free.
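The reasoning behind the "risk-free" claim can be sketched with a simplified loss model. It assumes the operator's bond absorbs any penalty first and liquid stakers cover only the remainder; the function name is hypothetical, the 8 ETH bond is the figure mentioned above for Rocket Pool, and 32 ETH is used as a worst-case penalty equal to a full validator deposit. Real slashing and inactivity penalties are more involved.

```python
from typing import Optional


def liquid_staker_loss(penalty_eth: float,
                       operator_bond_eth: float,
                       penalty_cap_eth: Optional[float] = None) -> float:
    """ETH lost by liquid stakers after the operator's bond absorbs the penalty."""
    if penalty_cap_eth is not None:
        penalty_eth = min(penalty_eth, penalty_cap_eth)
    return max(0.0, penalty_eth - operator_bond_eth)


# No cap: a worst-case 32 ETH penalty exceeds the 8 ETH bond.
print(liquid_staker_loss(32.0, 8.0))                       # 24.0

# Penalty capped at 2 ETH: the bond always covers it.
print(liquid_staker_loss(32.0, 8.0, penalty_cap_eth=2.0))  # 0.0
```

Under these assumptions, any cap at or below the operator bond leaves liquid stakers with no slashing exposure.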

Incorporating More Precompiles

Precompiles were an early compromise in Ethereum's development: because the virtual machine's overhead makes some highly complex and specialized code too expensive to run, a few key operations valuable to applications were implemented in native code to make them faster.

However, Vitalik notes that many previously added precompiles (such as RIPEMD and BLAKE) ended up being used far less than expected. Rather than continuing to add precompiles for specific operations, he suggests focusing on more moderate approaches such as the EVM-MAX and SIMD proposals. Removing rarely used precompiles and implementing the same functionality in EVM code could be better still.

What is Vitalik trying to tell us?

The piece revisits the minimalism inspired by the Unix philosophy and weighs keeping features out of the protocol against enshrining them. Unlike personal computer operating systems, blockchains are social systems, so different factors need to be considered. Key points in the article include:

Enshrining Features: Enshrining a feature can help avoid centralization risks. However, going too far can increase the governance burden, make the protocol more complex, and leave it with features that do not match users' long-term needs.
Middle Ground: A possible solution is "minimal viable enshrinement." Instead of enshrining an entire feature, the protocol adopts only the key pieces that make the feature easier to implement. Examples include modifying penalty rules for liquid staking, adopting EVM-MAX to make a wider range of operations efficient, or enshrining EVM verification rather than the whole concept of rollups.
De-Enshrining: Removing certain features might be beneficial, especially if they are rarely used, as is the case for some precompiles.

Ultimately, deciding what features should be integrated into the protocol and what features should be left to other ecosystem layers is a complex issue. This balance may change over time as user needs and available technologies evolve.