Exciting news! Today we introduced GPT-4o mini—our new affordable and intelligent small model that’s significantly smarter, cheaper, and just as fast as GPT-3.5 Turbo—and launched it in the API. Here’s what you need to know:
- Intelligence: GPT-4o mini outperforms GPT-3.5 Turbo in textual intelligence (scoring 82% on MMLU versus 69.8%) and in multimodal reasoning.
- Price: GPT-4o mini is more than 60% cheaper than GPT-3.5 Turbo, priced at $0.15 per 1M input tokens and $0.60 per 1M output tokens (roughly the equivalent of 2500 pages in a standard book).
- Modalities: GPT-4o mini currently supports text and vision capabilities, and we plan to add support for audio and video inputs and outputs in the future.
- Languages: GPT-4o mini has improved multilingual understanding over GPT-3.5 Turbo across a wide range of non-English languages.
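To get a quick sense of what those rates mean in practice, here is a minimal cost estimator using the prices above; the function name, token counts, and the batch flag are illustrative, not part of any SDK:

```python
# Rough cost estimate for GPT-4o mini at the announced rates:
# $0.15 per 1M input tokens, $0.60 per 1M output tokens.

INPUT_PRICE_PER_M = 0.15   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.60  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """Return the estimated USD cost of one request (or batch job)."""
    cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M
    cost += (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
    if batch:
        # Batch API jobs completed asynchronously within 24 hours are discounted 50%
        cost *= 0.5
    return cost

# Example: 1M input tokens and 200k output tokens
print(round(estimate_cost(1_000_000, 200_000), 4))  # 0.27
```

So a job that reads a million tokens and writes 200k costs about 27 cents, or roughly 13.5 cents via the Batch API.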
With its low cost and latency, GPT-4o mini works well for high-volume tasks (e.g., passing a full code base or conversation history to the model), cost-sensitive tasks (e.g., summarizing large documents), and tasks that require fast responses (e.g., customer support chatbots). Like GPT-4o, GPT-4o mini has a 128k context window, supports up to 16k output tokens per request, and has a knowledge cut-off date of October 2023. We plan to launch fine-tuning for GPT-4o mini in the coming days.
We recommend developers using GPT-3.5 Turbo switch to GPT-4o mini to unlock higher intelligence at a lower cost. You can use GPT-4o mini in the Chat Completions API and Assistants API, or in the Batch API where you get a 50% discount on batch jobs completed asynchronously within 24 hours. Happy building!
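Switching is mostly a one-line change: a Chat Completions request boils down to a model name plus a list of messages. The sketch below just assembles that request body as a plain dict (the helper name and example message are illustrative; send it with the official SDK or your preferred HTTP client and your API key):

```python
import json

def build_chat_request(messages, model="gpt-4o-mini", max_tokens=None):
    """Assemble a Chat Completions request body as a plain dict."""
    body = {"model": model, "messages": messages}
    if max_tokens is not None:
        # GPT-4o mini supports up to 16k output tokens per request
        body["max_tokens"] = max_tokens
    return body

payload = build_chat_request(
    [{"role": "user", "content": "Summarize this support ticket in one line."}]
)
print(json.dumps(payload, indent=2))
```

Developers on GPT-3.5 Turbo can reuse their existing message format and simply swap the model name.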
—The OpenAI team