Google launches AI model Lumiere! Convert text and images into videos with a single click, and customize the style of materials
Google recently unveiled an AI video generator called "Lumiere," which uses a space-time diffusion model to turn text or images into realistic videos. Users can also customize a video's content and style. Its "Space-Time U-Net architecture" is designed to produce realistic, diverse, and coherent motion.
Google's "Lumiere" Text-to-Video Generation Tool
According to a research paper released by Google Research, the team has developed a space-time diffusion model named "Lumiere" for AI video generation. The paper says the model accounts for both the spatial and temporal aspects of motion when generating videos, producing consistent and smooth footage.
Reportedly, Lumiere uses a "Space-Time U-Net architecture." During generation, it jointly handles the spatial question of where objects are located and the temporal question of how, and for how long, they move, keeping both consistent in a single pass.
According to the paper, the model was trained on a dataset of 30 million videos paired with text captions, processes video at multiple space-time scales, and generates clips of 80 frames at 16 frames per second, or about 5 seconds of video.
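The frame count and frame rate quoted above imply a clip length, and "multiple spatiotemporal scales" can be pictured as a stack of progressively smaller video volumes. The short Python sketch below works through both. It is only an illustration: the function names, the downsampling factor of 2, the 128x128 starting resolution, and the number of levels are assumptions made for this example, not details given in the article.

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Playback length of a clip of num_frames frames shown at fps frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


def space_time_pyramid(frames: int, height: int, width: int,
                       levels: int, factor: int = 2) -> list[tuple[int, int, int]]:
    """List (frames, height, width) at each scale.

    Hypothetical illustration: every level shrinks the time axis and both
    spatial axes by the same factor. The real model's factors may differ.
    """
    shapes = [(frames, height, width)]
    for _ in range(levels):
        frames = max(1, frames // factor)
        height = max(1, height // factor)
        width = max(1, width // factor)
        shapes.append((frames, height, width))
    return shapes


if __name__ == "__main__":
    # 80 frames at 16 frames per second
    print(clip_duration_seconds(80, 16))  # 5.0

    # Assumed 128x128 starting resolution and two downsampling levels
    for shape in space_time_pyramid(80, 128, 128, levels=2):
        print(shape)
```

Under these assumptions, an 80-frame clip lasts 5 seconds, and each coarser level holds half as many frames at half the height and width of the level above it.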
What Can Lumiere Do?
Specifically, Lumiere boasts three powerful features:
Text and Image-to-Video Conversion
First, users can instruct Lumiere with a text description or an uploaded still image to generate a video, much as ChatGPT generates images from text prompts.
7 Style Options for Generation
Furthermore, while AI generators often give users little control over details such as content or style, Lumiere lets users fine-tune them.
Users can choose from seven styles, including "stickers," "lines," "flat cartoons," "watercolor," "fluorescent," "3D fusion gold," and "3D rendering," and adjust them as needed.
Video Editing and Post-Processing
Notably, Lumiere can also edit parts of a video. For example, users can animate only the burning torches in a scene while keeping the clouds above still, or change the outfit of a person walking.
Users can also change the appearance or material of moving objects to achieve different effects.
Users Still Can't Try It
Although the immediate and high-quality video creation feature is enticing, Lumiere is currently a research project, and users may have to wait some time before being able to try it out themselves.
However, it is speculated that like previous research results released by tech companies such as Microsoft, Google, and Meta, the underlying technology and functions of this product may be integrated into Google's other products in the future rather than being released as a standalone product.
Rowan Cheung: Making Movies Will Become Easier
Rowan Cheung, founder of AI news site The Rundown AI, expressed excitement, calling the product an incredible technological breakthrough.
Google just made an incredible AI video breakthrough with its latest diffusion model, Lumiere.
2024 is going to be a massive year for AI video, mark my words.
Here's what separates Lumiere from other AI video models: pic.twitter.com/PulSjVZaCp
— Rowan Cheung (@rowancheung) January 25, 2024
The pace of artificial intelligence development is astonishing. I believe that in a few years, people may be able to quickly make movies through their phones.
Intellectual Property Issues in AI Training
Notably, Google's paper does not disclose the sources of the text, images, or other data used to train the model. Training data provenance is a sensitive ethics and copyright issue in the AI industry and has been widely discussed.
As AI generation models become more prevalent, lawsuits alleging intellectual property infringement have multiplied worldwide, including The New York Times' copyright suit against OpenAI and Microsoft.