
OpenAI
OpenAI has launched Flex processing, a new API option that offers lower per-token prices in exchange for slower response times and occasional resource unavailability.
Currently in beta, Flex processing is available for OpenAI’s latest reasoning models, o3 and o4-mini. It is aimed at non-urgent, non-production tasks such as model testing, data enrichment, and asynchronous workloads.
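OpenAI exposes Flex processing as a request-level option selected via the service_tier parameter. A minimal sketch in Python might look like the following; the helper function, prompt, and timeout value are illustrative assumptions, and because Flex requests can be much slower (or temporarily rejected when capacity is tight), callers should raise their client timeout and be prepared to retry:

```python
# Sketch: opting a request into Flex processing.
# The service_tier="flex" parameter follows OpenAI's documented pattern;
# the helper name, prompt, and timeout below are illustrative, not from
# the announcement.

def flex_request_params(model: str, prompt: str) -> dict:
    """Build request kwargs for a non-urgent, Flex-priced API call."""
    return {
        "model": model,          # a Flex-eligible model, e.g. "o3" or "o4-mini"
        "input": prompt,
        "service_tier": "flex",  # opt into Flex pricing
    }

# Usage (requires the `openai` package and an API key; shown as comments
# so the sketch stays runnable offline):
#
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.with_options(timeout=900.0).responses.create(
#       **flex_request_params("o4-mini", "Summarize this dataset...")
#   )
#   print(resp.output_text)
```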
With Flex, users pay half the standard API rates. For o3, the price drops to $5 per million input tokens and $20 per million output tokens, down from the standard $10 and $40. For o4-mini, the price drops to $0.55 per million input tokens and $2.20 per million output tokens, down from $1.10 and $4.40.
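The discount works out to exactly half of standard pricing at both ends. A small sketch of the arithmetic, with the rates quoted above hardcoded in dollars per million tokens and a hypothetical batch job as the example:

```python
# Flex vs. standard pricing, in USD per million tokens,
# using the rates quoted above.
STANDARD = {"o3": (10.00, 40.00), "o4-mini": (1.10, 4.40)}
FLEX     = {"o3": (5.00, 20.00),  "o4-mini": (0.55, 2.20)}

def cost(rates: dict, model: str, input_mtok: float, output_mtok: float) -> float:
    """Total cost for a job measured in millions of input/output tokens."""
    in_rate, out_rate = rates[model]
    return in_rate * input_mtok + out_rate * output_mtok

# Example: a hypothetical batch job with 3M input and 1M output tokens on o3.
standard_cost = cost(STANDARD, "o3", 3, 1)  # 3*10 + 1*40 = 70.0
flex_cost = cost(FLEX, "o3", 3, 1)          # 3*5  + 1*20 = 35.0
savings = 1 - flex_cost / standard_cost     # 0.5, i.e. exactly 50%
```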
This move comes amid rising costs in developing cutting-edge AI and increasing competition from other companies like Google, which recently released Gemini 2.5 Flash — a highly efficient model with strong performance and lower input token costs than its competitor, DeepSeek R1.
In an email announcement, OpenAI also said that developers in usage tiers 1 through 3 (tiers are based on spending) must complete identity verification before they can access o3. Features such as reasoning summaries and streaming API support for o3 and some other models are likewise restricted to verified users.
OpenAI emphasized that this ID verification policy is intended to prevent misuse of its services by bad actors.