Should I be concerned about Infura's outage? Here is the official explanation, along with the view from industry players.
Yesterday afternoon (the 12th), Ethereum API service provider Infura suffered multiple API service interruptions. As a result, several exchanges temporarily halted deposits and withdrawals of ETH and ERC20 tokens, the Ethereum wallet MetaMask displayed abnormal balances, and users saw data delays and abnormal gas price estimates.
Isn't blockchain supposed to be decentralized? How can a single company like Infura have such a widespread impact? What exactly does Infura do, and how did the problem occur? Let's look at the reasons behind the incident through the comments of Li Xuan, co-founder of blockchain wallet provider Blocto, and the analysis published by Infura.
What is the function of Infura?
Blocto co-founder Li Xuan explained that when decentralized applications (Dapps) on Ethereum interact with the blockchain, for example to check account balances or read smart contract data, they do so through RPC nodes, which are essentially full nodes that do not mine. Any party that needs this data, such as a Dapp or an exchange, can run its own RPC nodes, but doing so carries baseline operating costs and maintenance work.
He mentioned that Infura specifically provides "reliable" RPC node services. Although there are other companies offering this service, Infura is the largest service provider in this field.
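To make the role of an RPC node concrete, here is a minimal sketch of the kind of request a Dapp sends when it checks an account balance. It uses the standard Ethereum JSON-RPC method `eth_getBalance`; the helper names `build_balance_request` and `parse_balance_response` are hypothetical, and the sketch only builds and parses messages rather than contacting any real provider.

```python
import json

WEI_PER_ETHER = 10**18


def build_balance_request(address: str, request_id: int = 1) -> str:
    """Serialize an eth_getBalance JSON-RPC 2.0 request for the latest block."""
    payload = {
        "jsonrpc": "2.0",
        "method": "eth_getBalance",
        "params": [address, "latest"],
        "id": request_id,
    }
    return json.dumps(payload)


def parse_balance_response(body: str) -> int:
    """Return the balance in wei from a JSON-RPC response body."""
    response = json.loads(body)
    if "error" in response:
        # JSON-RPC errors carry a "message" field describing the failure.
        raise RuntimeError(response["error"].get("message", "RPC error"))
    # Quantities are returned as hex strings such as "0xde0b6b3a7640000".
    return int(response["result"], 16)
```

The serialized request would be sent as an HTTP POST body to whichever RPC node the application uses, whether that is a hosted provider or a self-run node. The reply is the same in both cases, which is why the choice of node is invisible to users until the node misbehaves.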
What caused the service interruption?
Infura stated that the root cause of the service interruption was a version mismatch in the client software running across its internal systems, specifically Geth v1.9.9 and Geth v1.9.13, which caused block synchronization delays among multiple subsystems.
Infura said that in the past it upgraded immediately whenever Geth or Parity released an update. It stopped doing so because those updates sometimes caused instability or had adverse effects on users. Since no software is entirely bug-free, Infura has become more cautious about updating its nodes. The software update originally scheduled for early this month was also postponed for stability reasons.
However, this service interruption occurred precisely because Infura did not realize that the differences between Geth v1.9.9 and Geth v1.9.13 would trigger a "consensus bug."
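One way to picture this failure mode is a monitor that compares what each node reports for the same block. The sketch below is an assumption-laden illustration, not Infura's actual tooling: the function names and sample node names are hypothetical, and the inputs are assumed to have been collected beforehand (block heights via `eth_blockNumber`, block hashes via `eth_getBlockByNumber` at one common height).

```python
from collections import defaultdict
from typing import Dict, List


def find_lagging_nodes(heights: Dict[str, int], max_lag: int = 3) -> List[str]:
    """Return nodes whose block height trails the highest one by more than max_lag."""
    tip = max(heights.values())
    return sorted(node for node, h in heights.items() if tip - h > max_lag)


def group_by_block_hash(hashes: Dict[str, str]) -> Dict[str, List[str]]:
    """Group nodes by the block hash each reports at one common height."""
    groups = defaultdict(list)
    for node, block_hash in hashes.items():
        # Hex strings may differ in case between clients, so normalize them.
        groups[block_hash.lower()].append(node)
    return {h: sorted(nodes) for h, nodes in groups.items()}


def has_chain_split(hashes: Dict[str, str]) -> bool:
    """True when nodes disagree about the block at the same height."""
    return len(group_by_block_hash(hashes)) > 1
```

A lagging node and a chain split look similar to users, since both show stale or wrong data, but they call for different responses. A lagging node will eventually catch up, whereas nodes on different sides of a consensus bug will not converge until the affected software is upgraded.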
How to avoid this?
Infura noted that the Geth development team did not mention the "consensus bug" fix when it shipped the updated version, possibly to keep a low profile and avoid attacks while the fix was being rolled out. Developer Nikita Zhavoronkov described this as Ethereum quietly undergoing an unannounced hard fork to fix the bug. Péter Szilágyi's reply to that description is quoted below.
Technically you are correct that it was an "unannounced hard fork" (from a bad chain to the good one). That said, silently fixing a bug dormant for 2+ years has a much lower chance of causing a disruption than raising awareness to it. We strive to minimize potential damage.
— Péter Szilágyi (karalabe.eth) (@peter_szilagyi) November 11, 2020
For its part, Infura said it will work to improve its version update process so that potential consensus bug fixes are balanced against system stability. It will also review the incident and look for ways to shorten recovery time.
Speaking from the side of those who consume the data, Li Xuan said this situation should be avoidable. Although the Infura interruption temporarily affected transfers, Dapp usage, and other functions, his team also runs its own RPC nodes, and switching over to those nodes restored normal functionality.
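The fallback Li Xuan describes can be sketched as a simple ordered failover. This is a minimal illustration under stated assumptions, not Blocto's implementation: `call_with_failover` is a hypothetical helper, the endpoint URLs in the usage note are placeholders, and the network call is passed in as a `send` function so the logic does not depend on any particular HTTP library.

```python
from typing import Callable, List, Sequence, Tuple

# (endpoint_url, request_body) -> response_body; raises on failure.
Transport = Callable[[str, str], str]


def call_with_failover(
    endpoints: Sequence[str], body: str, send: Transport
) -> Tuple[str, str]:
    """Try each RPC endpoint in order and return (endpoint_used, response_body)."""
    errors: List[str] = []
    for url in endpoints:
        try:
            return url, send(url, body)
        except Exception as exc:
            # Remember why this endpoint failed, then move on to the next one.
            errors.append(f"{url}: {exc}")
    raise ConnectionError("all RPC endpoints failed: " + "; ".join(errors))
```

A caller would list the hosted provider first and its own node second, for example `["https://hosted-provider.example/rpc", "https://own-node.example/rpc"]`. Note that this only covers outright failures. A provider that keeps answering with delayed data, as happened in this incident, would also need a freshness check before its response is trusted.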
Should I be concerned about Infura's downtime?
Li Xuan stated, "This incident is indeed a warning."
He explained that the incident showed many key components are still quite centralized. If Infura had malicious intent, it could, for example, block connections and manipulate coin prices to trigger liquidations while many people were unable to reach the Ethereum network, causing significant harm.