Does DA Store Historical Data? Data Availability Is Not Permanent Availability
Does data availability (DA) mean permanent storage? Is a blockchain obliged to store data forever? This article clears up common misconceptions about the data availability layer in plain language. The role of DA is to strengthen security, not to store historical data.
Is Storing Historical Data Equal to Security?
The debate over whether data availability should be built on external projects like Celestia or kept on Ethereum is not about where historical data should be stored. It is a matter of network security.
Some readers may wonder, "Isn't the storage of Layer 2 historical data the same as security?" In fact, historical data is not the most important consideration for Layer 2 security, and what Vitalik insists on is not the storage of historical data.
If you have this question, it reflects a misunderstanding of the purpose and definition of the data availability layer.
What is Data Availability?
What Data Availability is Not
Data availability does not guarantee that all historical data will remain permanently accessible to users or nodes. Data availability projects like Celestia or EigenDA provide temporary storage, which is fundamentally different from Arweave, a decentralized storage network designed to preserve data permanently, even though both ultimately offer hard drive space.
Because mainstream Rollups currently use Ethereum as their DA layer and compress complete transaction information before putting it on-chain, outsiders can easily assume that data availability means permanent storage. However, since the Dencun (Cancun-Deneb) upgrade introduced EIP-4844, Rollup data posted as blobs is only kept by nodes for a limited window (roughly 18 days) before it is pruned, because the original purpose of putting it on-chain was never permanent storage.
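As a rough illustration of how temporary this on-chain data is, the sketch below estimates how long nodes are expected to keep EIP-4844 blob data. The three constant values are assumptions based on the Deneb consensus specification and should be checked against the current spec; the function names are made up for this example.

```python
# Assumed consensus-layer parameters (verify against the current Deneb spec).
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32
MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS = 4096


def blob_retention_seconds() -> int:
    """Minimum time nodes are expected to serve blob data before pruning it."""
    return MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS * SLOTS_PER_EPOCH * SECONDS_PER_SLOT


def blob_retention_days() -> float:
    """The same window expressed in days."""
    return blob_retention_seconds() / 86_400


if __name__ == "__main__":
    print(f"Blob retention window: {blob_retention_days():.1f} days")
```

Under these assumed values the window comes to a little over 18 days, after which a node may discard the blob.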
DA: Data Availability Only Ensures Complete Data Publication
Data availability only guarantees that data can be retrieved before a block is finalized, giving Ethereum the basis it needs to arbitrate a disputed block. For this reason, some people argue that it should be renamed Data Publication (DP).
For example, if a node on Arbitrum discovers errors in blocks transmitted by other nodes and issues a fraud proof, there needs to be accurate data for Ethereum to compute and arbitrate. Without DA ensuring data availability, the fraud proof mechanism cannot proceed.
Once a block is finalized, it is treated as permanent. On Ethereum, a block is finalized after validators holding more than 2/3 of the staked ETH have attested to it, which takes about two epochs (64 slots, or roughly 13 minutes).
DS: Data Storage and Historical Indexing
Some readers may wonder: if the purpose of data availability is only to ensure that data is fully published to the network, and that data may be deleted or become inaccessible after some time, what happens when someone needs the complete transaction history of a Rollup? This is where data storage comes in.
However, storing blockchain history is not a critical security problem.
The history remains retrievable as long as at least one party voluntarily retains the complete transaction data, whether out of self-interest or for other reasons, such as:
- Blockchain explorers: because they are critical resources in the industry
- Rollup projects: because serving this data is part of serving their users
- Enthusiastic users: because they want to see the industry improve
In addition, most node designs retain block headers, which contain hashes that commit to each block's transactions. When any party supplies the complete historical data, its authenticity can therefore be verified easily.
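The verification step can be sketched as follows. This is a simplified illustration, not Ethereum's actual scheme: it assumes a plain binary Merkle tree over SHA-256, whereas Ethereum uses Keccak-256 and Merkle-Patricia tries. The names `merkle_root` and `history_matches_header` are hypothetical.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Toy binary Merkle root; an odd node at any level is paired with itself."""
    if not leaves:
        return sha256(b"")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def history_matches_header(transactions: List[bytes], header_tx_root: bytes) -> bool:
    """Check data from an untrusted archive against the root kept in the header."""
    return merkle_root(transactions) == header_tx_root
```

A node that kept only the header can accept history from any source, because tampered or incomplete data produces a different root.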
In summary, data storage and historical indexing rest on something close to a 1/N trust assumption: only one party out of N needs to cooperate. As long as the network is large enough and at least one of its N nodes is willing to provide the complete data, anyone can almost certainly obtain correct historical data from somewhere.
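To see why a large N makes this assumption comfortable, consider a simplified model in which each node independently keeps the full history with probability p. Both the independence and the uniform p are assumptions made purely for illustration, and `prob_history_available` is a hypothetical helper.

```python
def prob_history_available(n_nodes: int, p_retain: float) -> float:
    """Probability that at least one of n independent nodes keeps the full history."""
    if n_nodes < 0:
        raise ValueError("n_nodes must be non-negative")
    if not 0.0 <= p_retain <= 1.0:
        raise ValueError("p_retain must be between 0 and 1")
    # Complement of "every node discarded the data".
    return 1.0 - (1.0 - p_retain) ** n_nodes
```

Even if each node is unlikely to keep the data, the chance that nobody does shrinks exponentially as the number of nodes grows.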
Why is Data Availability Important?
Next, let's look at why DA is so crucial for Rollups and Validiums that L2BEAT lists data availability as one of its five major risk categories.
Fraud Proof Mechanism
As mentioned earlier, the fraud proof mechanism relies on complete transaction information to function.
In the extreme case where all other nodes collude and stop sending data to an honest node, the honest node cannot tell whether its network connection is unstable or whether it is under attack, and without the data it has nothing to challenge with. Unless data availability is guaranteed, the 1/N trust assumption behind fraud proofs does not hold.
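The dependence on data can be made concrete with a toy model. The sketch below is not how Arbitrum or any real Rollup works: the transfer format, the JSON-based state root, and the function names are all assumptions chosen for brevity. It only shows that a challenger needs the batch data to re-execute it and compare state roots.

```python
import hashlib
import json
from typing import Dict, List, Optional, Tuple

Transfer = Tuple[str, str, int]  # (sender, recipient, amount)


def state_root(balances: Dict[str, int]) -> str:
    """Toy commitment: hash of the canonically serialized balances."""
    encoded = json.dumps(balances, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def apply_batch(balances: Dict[str, int], batch: List[Transfer]) -> Dict[str, int]:
    """Re-execute a batch on a copy of the state, skipping unaffordable transfers."""
    new_state = dict(balances)
    for sender, recipient, amount in batch:
        if new_state.get(sender, 0) >= amount:
            new_state[sender] = new_state.get(sender, 0) - amount
            new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


def check_claim(
    pre_state: Dict[str, int],
    batch: Optional[List[Transfer]],
    claimed_root: str,
) -> str:
    """Return 'valid', 'fraud', or 'cannot-verify' when the batch was withheld."""
    if batch is None:
        # Data was never published: there is nothing to re-execute or challenge.
        return "cannot-verify"
    honest_root = state_root(apply_batch(pre_state, batch))
    return "valid" if honest_root == claimed_root else "fraud"
```

When the batch is withheld, the honest node ends up in the "cannot-verify" branch, which is exactly the situation that data availability is meant to rule out.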
Validium Escape Hatch Requires Latest State
Most Layer 2 networks have an anti-censorship withdrawal mechanism such as the Escape Hatch. If the sequencer ignores or maliciously rejects a user's withdrawal request for an extended period, and the forced withdrawal function also goes unanswered by nodes for several days, the user can trigger this emergency mechanism.
Once the Escape Hatch is activated, the network is paused and all transactions are halted, but users can still withdraw against the state root, which is what makes the withdrawal censorship-resistant.
However, for a Validium user to obtain the latest state tree, at least one node must be willing to provide it. A reliable DA layer ensures that users can obtain the state tree and withdraw, which strengthens the security of user assets.
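Why the state tree matters for withdrawal can be sketched with a Merkle proof. This is a minimal illustration under assumed conventions (SHA-256, `account:balance` leaves, a binary tree that pairs an odd node with itself); real Validiums use their own tree and proof formats, and every function name here is hypothetical.

```python
import hashlib
from typing import List, Tuple


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(account: str, balance: int) -> bytes:
    return h(f"{account}:{balance}".encode())


def build_tree(leaves: List[bytes]) -> List[List[bytes]]:
    """Return every level of the tree, leaves first. Expects at least one leaf."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2 == 1:
            cur = cur + [cur[-1]]  # pair an odd node with itself
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def make_proof(levels: List[List[bytes]], index: int) -> List[Tuple[bytes, bool]]:
    """Collect (sibling, sibling_is_left) pairs from the leaf up to the root."""
    proof = []
    for level in levels[:-1]:
        sibling_index = index ^ 1
        sibling = level[sibling_index] if sibling_index < len(level) else level[index]
        proof.append((sibling, sibling_index < index))
        index //= 2
    return proof


def verify_withdrawal(
    account: str, balance: int, proof: List[Tuple[bytes, bool]], root: bytes
) -> bool:
    """What an on-chain contract would check: does the leaf hash up to the root?"""
    node = leaf_hash(account, balance)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

The contract only needs the state root, but building the proof with `make_proof` requires the full tree. If no node will hand over that data, the user cannot produce a valid proof even though their balance is committed on-chain.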
Data Availability as a Crucial Pillar of Network Security
From the two scenarios above, it is clear that data availability is a crucial security component of the Layer 2 ecosystem, even though it is not responsible for permanently storing data.
The data availability layer does not exist to provide a complete transaction history. It exists to keep the network running smoothly and user assets safe by making data available before transactions are finalized, which makes it a key part of the Layer 2 security model.
Without DA, in extreme cases, proof mechanisms such as fraud proofs or zero-knowledge proofs are essentially useless no matter how well they are designed. This is why data availability matters so much, and why many Ethereum developers do not endorse external DA.
While projects boast about their zkEVMs, fraud proofs, and ecosystem growth, it is essential to keep asking whether this underlying infrastructure can actually ensure security.