Why are transactions on Solana frequently failing and experiencing congestion recently?


Recently, many users have experienced frequent transaction failures and congestion on the Solana network. In a tweet, researcher @nishil attributed the problem to flaws in how Solana's network layer uses the QUIC protocol. Many developers are reportedly working on a fix.

Background: Current Status of Solana Network

Solana Network User Transaction Failure Rate Exceeds 50%

A transaction submitted to the Solana network usually ends in one of three outcomes:

  • The transaction executes successfully
  • The transaction fails during execution
  • The transaction is lost

Since November last year, the transaction failure rate on Solana has consistently remained around 50%, meaning roughly half of all transactions end in the second or third outcome.

Solana Network User Transaction Failure Rate

However, there are distinctions between these two types of failures that need to be clarified.

Execution Failure: Changes in Transaction Conditions, Arbitrage Bots

A transaction typically fails during execution because conditions have changed and no longer meet its requirements. Examples include trying to mint an NFT that has already been minted, or a trade whose slippage exceeds the configured maximum. These causes of failure are common on other blockchain networks as well.

Because fees on Solana are low, the network is filled with arbitrage bots and spam transactions. To capture arbitrage, bots submit large numbers of transactions, most of which fail because execution conditions change or another bot gets there first. According to data, 98% of the transactions initiated by these arbitrage bots fail.

Arbitrage Transaction Failure Rate in Solana Network is 98%

However, the failures caused by these bots are not the main reason for the poor user experience on Solana, so they are not the focus of this article. Failures of user-submitted transactions mostly come from the third outcome mentioned above: transaction loss.

Transaction Loss: Due to Network Layer Design

Transaction loss is the main reason users' transactions frequently fail on the Solana network. A transaction is lost when it is never delivered to the block leader, the node responsible for receiving and executing transactions in a given slot.

The network layer here refers to the communication layer used for transmitting data, built on protocols such as TCP, UDP, and QUIC. Recently, the Solana network changed its network layer communication protocol to QUIC so that block leaders would not crash from receiving too many transaction requests in a short period.

Difference in Transmission Protocols between HTTP and QUIC

QUIC allows block leaders to close certain client connections or limit their transmission rates as needed, which reduces the risk of the network crashing during peak usage. This may reduce network efficiency, but that is still better than crashing.

However, the current logic for choosing which client connections to restrict is poorly designed and contains errors. Solana currently discards transactions at random rather than by a specific criterion, such as discarding all transactions with fees lower than x.
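The contrast between the two discard policies can be illustrated with a small simulation. This is a minimal sketch, not Solana's actual implementation: the function names `drop_randomly` and `drop_below_fee`, the transaction dictionaries, the capacity of 100, and the fee threshold of 901 are all hypothetical values chosen for illustration.

```python
import random

def drop_randomly(txs, capacity, rng):
    """Keep `capacity` transactions chosen uniformly at random; fees are ignored."""
    if len(txs) <= capacity:
        return list(txs)
    return rng.sample(txs, capacity)

def drop_below_fee(txs, min_fee):
    """Keep only transactions whose fee is at least `min_fee`."""
    return [tx for tx in txs if tx["fee"] >= min_fee]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical backlog: 1,000 pending transactions with fees 1..1000.
txs = [{"id": i, "fee": i + 1} for i in range(1000)]

kept_random = drop_randomly(txs, 100, rng)
kept_by_fee = drop_below_fee(txs, 901)

# Random discard keeps 100 transactions regardless of what they paid.
print(len(kept_random), min(tx["fee"] for tx in kept_random))
# Fee-based discard keeps the 100 highest-paying transactions.
print(len(kept_by_fee), min(tx["fee"] for tx in kept_by_fee))
```

In this sketch, a transaction's fee has no influence on whether the random policy keeps it, so paying more does not help a user. Under the fee-based policy the outcome is predictable from the fee alone.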

Random discarding pushes users to send more junk transactions. Because submitting more transactions raises the chance that one gets through, users are incentivized to submit large numbers of them, which in turn increases the failure rate and congests the network, creating a vicious cycle.

To increase the probability of their transactions being selected, bots on the Solana network send large batches of transactions, reducing the likelihood that ordinary users' transactions are executed.
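The incentive behind this cycle can be put in numbers. The sketch below makes a simplifying assumption that is not taken from the source: each submitted copy is kept independently with the same probability `p_keep`. The function name `landing_probability` and the 10% keep rate are hypothetical.

```python
def landing_probability(p_keep, copies):
    """Chance that at least one of `copies` identical submissions survives,
    assuming each copy is kept independently with probability `p_keep`."""
    return 1 - (1 - p_keep) ** copies

# Hypothetical keep rate of 10% per submission.
for copies in (1, 5, 20):
    print(copies, round(landing_probability(0.1, copies), 3))
```

Under these assumptions, one submission lands about 10% of the time, five copies about 41%, and twenty copies about 88%. Each sender therefore benefits from sending more copies, even though every extra copy adds to the load that caused the discarding in the first place.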

Finally, because Solana's network design does not include a mempool, discarded transactions are not held anywhere for later processing. They are simply lost, and the user sees a failed transaction.

Future Outlook

The main client teams in the ecosystem, including Firedancer, Anza, and the Solana official team, have already begun addressing this issue. A fix is expected in the coming weeks, though whether it succeeds will depend on how the network performs after the update. The congestion introduced by the change in transmission mechanism also still needs to be addressed.

Separately, Solana still needs to resolve the problem of junk transactions flooding the network because of its low transaction fees.

Recommended Reading: Institutional Analysis of Solana's Four Technological Developments: SOL Ecosystem Has Indeed Improved
Reason for recommendation: This article explains recent technological developments related to user experience on Solana, including solutions to the problem of junk transactions caused by low transaction fees, and gives a better understanding of where Solana is heading.

Once these issues are resolved, a truly robust Solana network may emerge, but much work still remains.