Google launches AI model Lumiere! Convert text and images into videos with a single click, and customize the style of materials

2024/1/26

Google recently launched an AI video generator called "Lumiere," which uses a space-time diffusion model to turn text or images into realistic videos. Users can also customize the material and style of a video to suit their needs. Its "Space-Time U-Net architecture" is designed to produce realistic, diverse, and coherent motion.

Google's "Lumiere" Text-to-Video Generation Tool

According to a research paper released by Google Research, the team has developed a space-time diffusion model named "Lumiere" for AI video generation. The paper says the model accounts for both spatial layout and motion over time when generating a video, which yields consistent, smooth motion.

Google Research's "Lumiere" research paper

Reportedly, Lumiere uses a "Space-Time U-Net architecture." During generation, the model tracks both where objects are located (space) and how, and for how long, they move (time), and it produces the whole clip in a single pass so that the two stay consistent.

According to the paper, the model was trained on a dataset of 30 million videos paired with text captions, processes video at multiple space-time scales, and generates clips of 80 frames at 16 frames per second.
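To make those numbers concrete, here is a minimal Python sketch of the arithmetic. It computes the clip length implied by 80 frames at 16 frames per second, and it illustrates what working at "multiple space-time scales" could look like when both the frame count and the resolution are halved at each level. The function names, the halving factor, the number of levels, and the 128×128 starting resolution are assumptions chosen for illustration; the article does not give these details, and this is not Google's implementation.

```python
def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Length in seconds of a clip with num_frames frames played at fps."""
    return num_frames / fps


def space_time_scales(frames: int, height: int, width: int,
                      levels: int, factor: int = 2):
    """List the (frames, height, width) shape at each scale.

    Assumption for illustration: every level shrinks time AND space
    by the same integer factor. The real model may differ.
    """
    shapes = [(frames, height, width)]
    for _ in range(levels):
        frames = max(1, frames // factor)
        height = max(1, height // factor)
        width = max(1, width // factor)
        shapes.append((frames, height, width))
    return shapes


if __name__ == "__main__":
    # 80 frames at 16 frames per second, as stated in the article.
    print(clip_duration_seconds(80, 16), "seconds")

    # Hypothetical 128x128 starting resolution with two downsampling levels.
    for shape in space_time_scales(80, 128, 128, levels=2):
        print(shape)
```

The point of the sketch is only that shrinking the time axis along with the spatial axes lets a model reason about a whole clip at a coarse scale, rather than frame by frame.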

What Can Lumiere Do?

Specifically, Lumiere boasts three powerful features:

Text and Image-to-Video Conversion

First, users can instruct Lumiere through a text description or by uploading a still image, and it will generate a video from it, much as ChatGPT generates images from text prompts.

7 Style Options for Generation

Furthermore, while AI generators often give users little control over details such as content or style, Lumiere lets them fine-tune both.

Users can choose from 7 different material styles, including "stickers," "lines," "flat cartoons," "watercolor," "fluorescent," "3D fusion gold," and "3D rendering," and adjust them according to their needs.

Video Editing and Post-Processing

Notably, Lumiere can also edit parts of a video. For example, users can animate only the flames of burning torches while keeping the clouds above still, or change the outfits of people walking.

Ability to dynamically edit parts of an image

Users can also change the texture or material of objects in motion, as shown below, to achieve different effects.

Editing the composition elements of a running girl

Users Still Can't Try It

Although instant, high-quality video creation is enticing, Lumiere is currently a research project, and users may have to wait some time before they can try it themselves.

However, it is speculated that, as with earlier research released by tech companies such as Microsoft, Google, and Meta, the underlying technology may eventually be integrated into Google's other products rather than released as a standalone product.

Rowan Cheung: Making Movies Will Become Easier

Rowan Cheung, founder of AI news site The Rundown AI, expressed excitement, calling the product an incredible technological breakthrough.

Google just made an incredible AI video breakthrough with its latest diffusion model, Lumiere.
2024 is going to be a massive year for AI video, mark my words.
Here's what separates Lumiere from other AI video models: pic.twitter.com/PulSjVZaCp
— Rowan Cheung (@rowancheung) January 25, 2024

The pace of artificial intelligence development is astonishing. I believe that in a few years, people may be able to quickly make movies through their phones.

Intellectual Property Issues in AI Training

Notably, Google's paper does not disclose the sources of the text, images, or other data used to train the model. Training data provenance is a sensitive ethics and copyright issue that has been widely debated within the AI industry.

As generative AI models become more widespread, lawsuits over intellectual property infringement have been filed around the world.
