Google I/O 2024 recap: AI technology dominates, and Gemini gains GPT-4o-style capabilities
The Google I/O 2024 developer conference was held as scheduled, focusing on the latest artificial intelligence technologies and product updates. During the roughly two-hour keynote, Google unveiled a series of technological breakthroughs and updates aimed at giving developers and consumers more convenient and innovative experiences.
New Generation Development Tool: Firebase Genkit
At this year's conference, Google introduced a new platform called Firebase Genkit, an open-source framework designed to make it easier for developers to build AI applications in JavaScript/TypeScript, with Go support coming soon. Firebase Genkit aims to accelerate the addition of AI features to both new and existing applications, covering use cases such as content generation, summarization, text translation, and image generation.
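To make the idea concrete, here is a minimal sketch of a Genkit "flow" that summarizes text with a Gemini model. The package names ("genkit", "@genkit-ai/googleai") and the exact function shapes are assumptions based on the published SDK, which has evolved since the announcement, so treat the specifics as illustrative rather than authoritative.

```ts
// Minimal sketch: a Genkit flow that summarizes text with a Gemini model.
// Assumes the "genkit" and "@genkit-ai/googleai" npm packages and an API key
// available in the environment; exact API surface may differ by SDK version.
import { genkit, z } from "genkit";
import { googleAI, gemini15Flash } from "@genkit-ai/googleai";

const ai = genkit({
  plugins: [googleAI()],  // uses the Gemini API key from the environment
  model: gemini15Flash,   // default model for generate() calls
});

// A flow is Genkit's unit of deployable, observable AI logic.
export const summarizeFlow = ai.defineFlow(
  {
    name: "summarizeFlow",
    inputSchema: z.string(),
    outputSchema: z.string(),
  },
  async (text) => {
    const { text: summary } = await ai.generate(
      `Summarize the following text in three bullet points:\n\n${text}`
    );
    return summary;
  }
);
```

Flows like this can be inspected in Genkit's local developer tooling and then deployed alongside an existing Firebase or Node.js backend.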
Focus on AI Applications
In this presentation, Google mentioned AI a remarkable 121 times, underscoring how deeply the company is invested in the field; CEO Sundar Pichai even called out the count himself to highlight Google's commitment to AI.
AI Innovation in Education: LearnLM
Google also unveiled LearnLM, a new family of generative AI models tailored for education. Developed in collaboration between Google DeepMind and Google Research, these models aim to support student learning through conversational tutoring. LearnLM has already been tested in Google Classroom and will be further integrated into lesson planning and optimization.
New Feature in YouTube Educational Videos: AI-generated Quizzes
YouTube has introduced AI-generated quiz features, allowing viewers to interact while watching educational videos, such as asking questions, getting explanations, or participating in quizzes. This feature provides a new learning method for users who need to watch long educational videos.
Enhanced AI Capabilities: Update to Gemma 2
To meet developer demand, Google will soon add a new 27-billion-parameter model to Gemma 2, optimized in partnership with Nvidia to run efficiently on next-generation GPUs.
New Discovery Features in Google Play
Google Play has updated its discovery features, making it easier for users and developers to promote and discover applications. This includes new ways for users to acquire apps, updates to Play Points, and enhancements to tools and APIs for developers.
User Safety Protection: Detecting Fraudulent Activities in Calls
Google previewed a new feature that detects signs of fraud in phone calls in real time, which will be built into future versions of Android. By analyzing conversation patterns during a call, the system can warn users when a call appears to be a scam.
Innovative Search and Interaction Methods: Ask Photos
Google Photos is set to launch an experimental feature called "Ask Photos," which uses AI to understand photo content and metadata. Users can search with natural-language questions, making the process more intuitive and requiring less manual digging.
Gemini AI Applications
Gemini's Application in Gmail
In Gmail, users will be able to utilize Gemini AI technology for searching, summarizing, and drafting emails. Furthermore, Gemini AI can perform more complex tasks such as handling e-commerce returns, including searching the inbox, finding receipts, and filling out online forms.
Gemini 1.5 Pro: Doubling the Context Window
The upgrade to Gemini 1.5 Pro lets it analyze longer documents, code repositories, videos, and recordings than before. In the latest private preview, the flagship model's context window has grown to 2 million tokens, double the previous 1 million.
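For developers, long-document analysis with Gemini 1.5 Pro is reached through the Gemini API; below is a minimal sketch using the @google/generative-ai Node SDK. The "gemini-1.5-pro" model string is the assumed identifier, and the 2-million-token tier was in private preview at announcement, so most API keys were still limited to 1 million tokens.

```ts
// Sketch: summarize a long document using Gemini 1.5 Pro's long context window.
// Assumes GEMINI_API_KEY is set; the 2M-token tier was private preview at launch.
import { GoogleGenerativeAI } from "@google/generative-ai";
import { readFile } from "node:fs/promises";

async function summarizeLongDocument(path: string): Promise<string> {
  const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
  const model = genAI.getGenerativeModel({ model: "gemini-1.5-pro" });

  // Read the whole document and pass it alongside the instruction in one request.
  const document = await readFile(path, "utf8");
  const result = await model.generateContent([
    "Summarize the key points of the following document:",
    document,
  ]);
  return result.response.text();
}
```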
Gemini Live: Real-Time Interactive Experience
Google previewed a new feature called Gemini Live, which lets users hold in-depth voice conversations with Gemini on their smartphones. Users can interrupt Gemini mid-conversation at any time, and the system adapts in real time to the user's speech patterns. Gemini can also recognize and respond to the user's surroundings through the smartphone camera.
Gemini Nano: Chrome-integrated Micro AI
As the smallest member of the Google AI model family, Gemini Nano will be built directly into the Chrome desktop client starting with Chrome 126. This lets developers tap the on-device model to power their own AI features, such as text-composition aids along the lines of Gmail's "Smart Compose."
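As a rough illustration, the sketch below shows the shape of Chrome's early experimental "Prompt API" for the built-in Gemini Nano model, roughly as it appeared in early previews behind flags. The window.ai surface shown here is an assumption that has changed repeatedly since, so check the current Chrome built-in AI documentation before relying on any of these names.

```ts
// Hypothetical sketch of Chrome's early experimental Prompt API for Gemini Nano.
// The window.ai shape below is assumed from early previews and may no longer match.
declare global {
  interface Window {
    ai?: {
      canCreateTextSession(): Promise<string>;
      createTextSession(): Promise<{
        prompt(input: string): Promise<string>;
        destroy(): void;
      }>;
    };
  }
}

export async function draftReply(emailBody: string): Promise<string | null> {
  if (!window.ai) return null;                        // built-in model not exposed
  const availability = await window.ai.canCreateTextSession();
  if (availability !== "readily") return null;        // model not downloaded yet

  const session = await window.ai.createTextSession();
  const reply = await session.prompt(
    `Write a short, polite reply to this email:\n\n${emailBody}`
  );
  session.destroy();                                   // free on-device resources
  return reply;
}
```

Because the model runs on-device, a pattern like this keeps user text local and works offline, at the cost of a smaller model than the cloud-hosted Gemini variants.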
Gemini's Application on Android
Google's Gemini AI will replace Google Assistant and deeply integrate into the Android system, allowing users to drag and drop AI-generated images directly into applications like Gmail, Google Messages, and more. YouTube users will also be able to search for specific information in videos through the "Ask about this video" feature.
Gemini's Application on Google Maps
Gemini's capabilities will also come to the Google Maps developer platform, starting with the Places API. Developers can surface Gemini-generated summaries of places and areas in their own apps and websites, rather than writing custom place descriptions themselves.
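A minimal sketch of what consuming such a summary might look like against the Places API (New) REST endpoint is shown below. The generativeSummary field name and its nested shape are assumptions based on the preview announced at I/O, so verify them against the current Places API documentation before use.

```ts
// Sketch: fetch a place's display name and (assumed) Gemini-generated summary
// from the Places API (New). The generativeSummary field shape is an assumption.
export async function getPlaceSummary(placeId: string, apiKey: string) {
  const res = await fetch(`https://places.googleapis.com/v1/places/${placeId}`, {
    headers: {
      "X-Goog-Api-Key": apiKey,
      // Request only the fields we need, including the AI-generated summary.
      "X-Goog-FieldMask": "displayName,generativeSummary",
    },
  });
  if (!res.ok) throw new Error(`Places API error: ${res.status}`);

  const place = await res.json();
  return {
    name: place.displayName?.text,
    summary: place.generativeSummary?.overview?.text, // assumed field shape
  };
}
```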
Expanding AI Capabilities: Performance Improvement of Tensor Processing Units
Google also announced Trillium, the sixth generation of its Tensor Processing Unit (TPU) AI chips, due to be available later this year, with a claimed 4.7x improvement in per-chip compute performance over the previous generation.
Google I/O 2024 showcased Google's latest achievements in AI and technological innovation, from educational tools to developer resources. Each update aims to enhance efficiency, increase interactivity, and ensure user safety.