ETHTaipei | How blockchain verifies the authenticity of AI information

On the second day of ETHTaipei 2024, Cathie, a former Ethereum Foundation researcher, addressed how to verify the authenticity of AI models and their outputs. She introduced the potential of zero-knowledge machine learning (zkML) and optimistic, fraud-proof-based machine learning (opML), and proposed token-economic methods for improving the development environment for open-source models, with the aim of making AI information easier to verify.

The Two Dilemmas Facing the AI Industry

How to Verify Information?

As artificial intelligence (AI) continues to advance, proving the correctness of input data, models, and outputs is becoming increasingly important for future applications. OpenAI, for example, is often criticized for neither open-sourcing its models nor disclosing its input data, which has drawn objections over digital content copyright and information authenticity.

Three challenges have long accompanied the development of machine learning models: validating the accuracy of a private model's outputs, verifying the validity of private data while keeping it private, and verifying a model's performance without disclosing the model itself.

In fact, the above issues can be summarized as:

  • Verifiability
  • Preserving input privacy
  • Preserving model privacy

How to Ensure Profitability of Open-Source Models?

Currently, the conventional answer is to open-source the model, but open-sourcing may hinder the model's commercialization and further development, and it still cannot fully guarantee that a given output is correct.

Therefore, whether the goal is to directly verify the authenticity of AI information or to find a sustainable development strategy for open-sourced models, current Web2 technologies seem to lack effective solutions. Cathie points out that blockchain has the potential to help the AI industry address both issues.

How Blockchain Addresses the AI Verification Issue

Potential Solution: Uploading Proof Data to the Blockchain

Cathie mentions that many projects are already attempting to combine machine learning with blockchain and cryptography to verify the accuracy of the model itself and of its inputs and outputs. Uploading proof data to the blockchain makes it immutable and accessible to everyone, which provides a fair mechanism for verifying information.
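To make the idea of tamper-evident proof data more concrete, here is a minimal Python sketch of a hash commitment that binds a model, an input, and an output into one digest that could be published on-chain. The record format and the function names (`make_commitment`, `matches`) are assumptions made for illustration; they do not come from the talk or from any specific project.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def make_commitment(model_weights: bytes, model_input: str, model_output: str) -> str:
    """Bind model, input, and output into one digest (hypothetical record format)."""
    record = {
        "model": sha256_hex(model_weights),
        "input": sha256_hex(model_input.encode("utf-8")),
        "output": sha256_hex(model_output.encode("utf-8")),
    }
    # Sorted keys give a canonical encoding, so equal records hash identically.
    return sha256_hex(json.dumps(record, sort_keys=True).encode("utf-8"))


def matches(commitment: str, model_weights: bytes, model_input: str, model_output: str) -> bool:
    """Check whether the given triple reproduces a published commitment."""
    return make_commitment(model_weights, model_input, model_output) == commitment


published = make_commitment(b"toy-weights", "What is 2 + 2?", "4")
print(matches(published, b"toy-weights", "What is 2 + 2?", "4"))  # same triple
print(matches(published, b"toy-weights", "What is 2 + 2?", "5"))  # altered output
```

A commitment like this only shows that a claim was not altered after publication. It does not show that the output was actually computed by the model, which is the gap that proof systems are meant to close.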

Current technologies can generally be divided into two categories:

  • Zero-Knowledge Machine Learning (zkML): Generates zero-knowledge proofs about a machine learning model's computation and uploads them to the blockchain, where anyone can verify them.
  • Optimistic Machine Learning (opML): Uploads results to the blockchain for everyone to view and relies on fraud proofs, so anyone who finds the information inaccurate can challenge it.

Each of these technologies has unique advantages and disadvantages.

zkML requires generating zero-knowledge proofs, which is computationally expensive and costly, so it is currently practical only for smaller models such as decision forests, nanoGPT, and GPT-2. However, once a proof is generated and uploaded to the chain, verification reaches finality quickly.

opML, on the other hand, reaches finality only after the challenge period that follows the upload has ended. It also offers little protection for private information and, most importantly, provides weaker verification security than zkML. However, opML can be applied to models of any size, such as Stable Diffusion and LLaMA, which makes it more widely usable.
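The challenge-period trade-off in opML can be illustrated with a small Python simulation. This is only a sketch under simplifying assumptions: the class and field names are hypothetical, the "model" is a toy function, and a challenger simply re-runs the whole model, whereas real systems would likely need a more elaborate dispute procedure.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Claim:
    model_input: int
    claimed_output: int
    submitted_at: int        # block number at which the claim was posted
    challenged: bool = False


class OptimisticVerifier:
    """Toy optimistic scheme: claims are accepted unless successfully challenged."""

    def __init__(self, model: Callable[[int], int], challenge_period: int):
        self.model = model
        self.challenge_period = challenge_period

    def _window_closed(self, claim: Claim, current_block: int) -> bool:
        return current_block >= claim.submitted_at + self.challenge_period

    def challenge(self, claim: Claim, current_block: int) -> bool:
        """Succeeds only inside the window and only if the claimed output is wrong."""
        if self._window_closed(claim, current_block):
            return False
        if self.model(claim.model_input) != claim.claimed_output:
            claim.challenged = True
            return True
        return False

    def is_final(self, claim: Claim, current_block: int) -> bool:
        """A claim is final once the window has closed without a successful challenge."""
        return not claim.challenged and self._window_closed(claim, current_block)


verifier = OptimisticVerifier(model=lambda x: x * 2, challenge_period=10)
honest = Claim(model_input=3, claimed_output=6, submitted_at=100)
wrong = Claim(model_input=3, claimed_output=7, submitted_at=100)

print(verifier.is_final(honest, current_block=105))  # still inside the window
print(verifier.is_final(honest, current_block=110))  # window closed, unchallenged
print(verifier.challenge(wrong, current_block=105))  # wrong output is caught
```

The sketch shows why finality is delayed: even an honest claim cannot be treated as final until the window closes, and security rests on at least one party re-checking results in time.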

In addition to these two methods, Cathie also presented a recent study that combines the strengths of both: Optimistic Privacy-Preserving AI (OPP/AI), which aims to reduce model verification costs while still providing a degree of security and privacy.

Potential Solution: Tokenizing Ownership of Open-Source Models

Beyond verification that preserves model and data privacy, blockchain can also tokenize ownership of open-source models, creating an economic incentive for more models to be open-sourced.

Specifically, Cathie proposes two ERCs that aim to address this issue:

  • ERC-7641: Intrinsic RevShare Token, for revenue sharing. It automatically distributes funds from the corresponding fund pool to token holders.
  • ERC-7007: Lets model owners sell models and share profits, backed by verification. It supports zkML and opML, so it can accrue profits for developers while verifying the correctness of AI-generated content (AIGC).
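The revenue-sharing idea can be illustrated with simple pro-rata arithmetic. The Python sketch below is not the ERC-7641 interface; the `distribute` function and the holder names are hypothetical, and it only assumes that each holder's share of a pool is proportional to their token balance.

```python
from typing import Dict


def distribute(pool: int, balances: Dict[str, int]) -> Dict[str, int]:
    """Split `pool` among holders in proportion to their token balances.

    Integer division mirrors on-chain arithmetic; any rounding remainder
    is simply left undistributed in this sketch.
    """
    total_supply = sum(balances.values())
    if total_supply == 0:
        raise ValueError("no tokens in circulation")
    return {holder: pool * balance // total_supply
            for holder, balance in balances.items()}


# Hypothetical holders of a model's revenue-share token.
balances = {"alice": 60, "bob": 30, "carol": 10}
payouts = distribute(1000, balances)
print(payouts)  # {'alice': 600, 'bob': 300, 'carol': 100}
```

Under this assumption, a holder with 60% of the supply receives 60% of whatever revenue the model's pool has collected.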

Cathie combines the above mechanisms into what she calls the Initial Model Offering (IMO), which is the future focus of ORA Protocol.

(Figure: IMO model)

Blockchain Assisting the Development of the AI Industry

Cathie suggests that the starting point need not be how AI can help blockchain; at present, it is more likely that blockchain will help verify the authenticity of AI information.