Exciting news! Today we introduced GPT-4o mini—our new affordable and intelligent small model that’s significantly smarter, cheaper, and just as fast as GPT-3.5 Turbo—and launched it in the API. Here’s what you need to know:
- Intelligence: GPT-4o mini outperforms GPT-3.5 Turbo in textual intelligence (scoring 82% on MMLU versus 69.8%) and in multimodal reasoning.
- Price: GPT-4o mini is more than 60% cheaper than GPT-3.5 Turbo, priced at $0.15 per 1M input tokens and $0.60 per 1M output tokens (roughly the equivalent of 2500 pages in a standard book).
- Modalities: GPT-4o mini currently supports text and vision capabilities, and we plan to add support for audio and video inputs and outputs in the future.
- Languages: GPT-4o mini has improved multilingual understanding over GPT-3.5 Turbo across a wide range of non-English languages.
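To get a quick sense of what those rates mean in practice, here is a minimal cost estimator using the prices above; the function name, token counts, and the batch flag are illustrative, not part of any SDK:

```python
# Rough cost estimate for GPT-4o mini at the announced rates:
# $0.15 per 1M input tokens, $0.60 per 1M output tokens.

INPUT_PRICE_PER_M = 0.15   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.60  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """Return the estimated USD cost of one request (or batch job)."""
    cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M
    cost += (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
    if batch:
        # Batch API jobs completed asynchronously within 24 hours are discounted 50%
        cost *= 0.5
    return cost

# Example: 1M input tokens and 200k output tokens
print(round(estimate_cost(1_000_000, 200_000), 4))  # 0.27
```

So a job that reads a million tokens and writes 200k costs about 27 cents, or roughly 13.5 cents via the Batch API.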
With its low cost and latency, GPT-4o mini works well for high-volume tasks (e.g., passing a full code base or conversation history to the model), cost-sensitive tasks (e.g., summarizing large documents), and tasks that require fast responses (e.g., customer support chatbots). Like GPT-4o, GPT-4o mini has a 128k context window, supports up to 16k output tokens per request, and has a knowledge cut-off date of October 2023. We plan to launch fine-tuning for GPT-4o mini in the coming days.
We recommend developers using GPT-3.5 Turbo switch to GPT-4o mini to unlock higher intelligence at a lower cost. You can use GPT-4o mini in the Chat Completions API and Assistants API, or in the Batch API where you get a 50% discount on batch jobs completed asynchronously within 24 hours. Happy building!
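Switching is mostly a one-line change: a Chat Completions request boils down to a model name plus a list of messages. The sketch below just assembles that request body as a plain dict (the helper name and example message are illustrative; send it with the official SDK or your preferred HTTP client and your API key):

```python
import json

def build_chat_request(messages, model="gpt-4o-mini", max_tokens=None):
    """Assemble a Chat Completions request body as a plain dict."""
    body = {"model": model, "messages": messages}
    if max_tokens is not None:
        # GPT-4o mini supports up to 16k output tokens per request
        body["max_tokens"] = max_tokens
    return body

payload = build_chat_request(
    [{"role": "user", "content": "Summarize this support ticket in one line."}]
)
print(json.dumps(payload, indent=2))
```

Developers on GPT-3.5 Turbo can reuse their existing message format and simply swap the model name.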
—The OpenAI team