Google launches Gemini, its most powerful AI, capable of serving as consultant, tutor, and assistant

Gemini, Google's natively multimodal AI model, was built from scratch to be multimodal, much as humans use five senses to take in and perceive the world all at once. It can understand, operate across, summarize, and combine different types of information, including text, code, audio, images, and video. The channel TheAIGRID, which has 120,000 subscribers, recently walked through Gemini's features in a video that left viewers in awe, with some saying, "There's no turning back!"

See, Hear, and Write: Chat Freely About Anything

Google emphasizes that Gemini is a multimodal AI model built from scratch, taking in several kinds of input at once in the way humans perceive the world through five senses. As a result, Gemini can understand, operate across, and combine text, code, audio, images, and video seamlessly.

In practice, you can connect a camera and a microphone and ask Gemini questions using images and voice at the same time. The interaction is continuous, like a casual, wide-ranging conversation between friends, and you can even play games with it.

Starts at 5:20 in the video

Playing a game with Gemini: guessing which hand the coin is in

Ultimate AI Consultant Gemini

Gemini is also the ultimate AI consultant. The video demonstrates the task of "planning a birthday party for a daughter": given conditions such as a love of animals and a preference for an outdoor party, Gemini immediately generates several options to choose from.

Moreover, the output Gemini generates combines text and images. It offers party theme options and suggests decorations, activities, food, and more. You can click on a theme you like to see further details, or ask follow-up questions from the detail view, such as how the party cups and cakes might be designed and how to make them yourself. It's simply the ultimate AI consultant on Earth!

Starts at 13:47 in the video

Ultimate Tutor, Upload Exam Questions for Solutions

Users can also upload exam questions directly. Gemini first grades the test paper, then points out where the mistakes are and works through the solutions step by step. If there is anything you don't understand, you can ask at any time, and you can even request similar practice questions to make sure you fully understand that type of problem.

Starts at 17:15 in the video

Furthermore, you can upload videos of yourself practicing soccer, and Gemini will suggest adjustments to your form to help you score.

Starts at 27:10 in the video

Ultimate AI Model on Earth

Google's AI chatbot Bard has started using a fine-tuned version of Gemini Pro for more advanced reasoning, planning, and understanding. Google has also brought Gemini to Pixel phones, and the model will appear in more products and services in the coming months.

The video makes it clear that Gemini is more than a chatbot: it can also draw, and it can help scientists organize massive amounts of data and present it in whatever format you specify. It's truly the ultimate AI consultant on Earth, which is why Google and Alphabet CEO Sundar Pichai can proudly state:

This is our most powerful and versatile model to date, and I am excited about the future and the opportunities Gemini will bring to people around the world.
