OpenAI launches GPT-4o Mini: a cost-effective small-scale model for faster text and visual responses

2024/7/19

OpenAI announced the launch of GPT-4o mini, claiming it to be the most cost-efficient small model to date. This breakthrough is expected to significantly expand the range of AI-built applications, making intelligent technology more affordable.

Table of Contents

Providing Superior Performance at Affordable Prices

The pricing for GPT-4o mini is 15 cents per million input tokens and 60 cents per million output tokens. This is an order of magnitude cheaper than the previous model and over 60% cheaper than GPT-3.5 Turbo. Despite the low cost, GPT-4o mini scored 82% in the Multi-task Language Understanding (MMLU) benchmark test and outperformed GPT-4.1 in chat preferences on the LMSYS leaderboard.

Faster Chat Responses with GPT-4o Mini

The low cost and low latency of GPT-4o mini make it suitable for a wide range of tasks, including applications that involve linking or parallelizing multiple model calls, such as calling multiple APIs, providing the model with large amounts of context like entire codebases or conversation histories, or interacting with customers through fast real-time text responses (e.g., customer service chatbots).

Support for Text and Visual Features, with Future Enhancements

Currently, GPT-4o mini supports text and visual features in its API.

Future updates will include support for text, images, videos, and audio inputs and outputs. With a context window of 128K tokens and the ability to support up to 16K output tokens per request, GPT-4o mini can handle a variety of tasks. The model also possesses knowledge up to October 2023 and can efficiently process non-English text through a new tokenizer shared with GPT-4o.

Text Intelligence and Multimodal Reasoning Beyond GPT-3.5 Turbo

GPT-4o mini surpasses GPT-3.5 Turbo and other small models in academic benchmark tests, both in text intelligence and multimodal reasoning. It supports the same range of languages as GPT-4o and excels in function invocation, allowing developers to build applications that can retrieve data from external systems or take actions. Additionally, it shows improvements in long-context performance compared to GPT-3.5 Turbo.

Highlights of Key Benchmark Test Performances

Reasoning Tasks: GPT-4o mini scored 82.0% in MMLU, surpassing Gemini Flash (77.9%) and Claude Haiku (73.8%).
Mathematical and Coding Abilities: In mathematical reasoning and coding tasks, GPT-4o mini scored 87.0% and 87.2% in MGSM and HumanEval, respectively, higher than Gemini Flash and Claude Haiku.
Multimodal Reasoning: In the MMMU multimodal reasoning evaluation, GPT-4o mini scored 59.4%, surpassing Gemini Flash (56.1%) and Claude Haiku (50.2%).

Built-in Security Measures: Review Policies, Anti-Cracking

OpenAI states that harmful content such as hate speech and spam is filtered out during the pre-training phase. Post-training, human feedback is used to reinforce the model's behavior consistency with policies through techniques like reinforcement learning with human feedback (RLHF).

GPT-4o mini inherits the security mitigations of GPT-4o and undergoes evaluations through automatic and manual assessments based on OpenAI's readiness framework. Insights from over 70 external experts have helped enhance the security of both GPT-4o and GPT-4o mini.

No Fear of Cracking Instructions

GPT-4o mini is the first model to apply our instruction-level approach in an API, enhancing its resistance to cracking, prompt injection, and system prompt extraction, making the model's responses more reliable and secure in large-scale applications.

Even the Free Version Can Use GPT-4o mini

OpenAI announces that in ChatGPT, free, Plus, and Team users will now have access to GPT-4o mini starting today, replacing GPT-3.5. Enterprise users will gain access next week. The future of powerful AI is becoming more affordable!

Related