In a strategic move to stay competitive in the rapidly evolving AI industry, OpenAI has unveiled a new API option called Flex processing. This offering provides a significantly cheaper alternative for developers willing to trade speed and guaranteed availability for lower prices — a particularly useful option for non-critical tasks.
What Is Flex Processing?
Flex processing is currently available in beta for OpenAI’s newest reasoning models, o3 and o4-mini. It’s designed for lower-priority and asynchronous workloads — such as data enrichment, model evaluation, and other non-production applications — where latency and uptime aren’t mission-critical.
According to OpenAI, Flex processing:
- Cuts API usage costs by 50%
- Offers variable performance (i.e., slower response times)
- May experience intermittent unavailability
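Opting into Flex is expected to be a per-request choice rather than a separate endpoint. The sketch below builds a chat-completion request body with a `service_tier` field set to `"flex"`; the exact field name and accepted values are an assumption here, so confirm them against OpenAI's current API reference before relying on this.

```python
import json


def flex_request_payload(model: str, prompt: str) -> dict:
    """Build a chat-completion request body that opts into Flex processing.

    NOTE: the "service_tier": "flex" field is an assumption about how the
    beta exposes Flex; verify against the current OpenAI API reference.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "service_tier": "flex",  # request the cheaper, lower-priority tier
    }


# Example: a non-urgent data-enrichment prompt sent to o4-mini.
payload = flex_request_payload("o4-mini", "Classify this support ticket: ...")
print(json.dumps(payload, indent=2))
```

Because Flex requests may be slower or intermittently unavailable, callers would typically pair a payload like this with a generous timeout and a retry-later fallback to the standard tier.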
This move helps OpenAI address growing concerns over the cost of advanced AI models, which have increased significantly as compute demands have surged.

Pricing Breakdown
Here’s how Flex pricing compares with standard rates:
| Model | Standard Input | Flex Input | Standard Output | Flex Output |
| --- | --- | --- | --- | --- |
| o3 | $10/M tokens | $5/M tokens | $40/M tokens | $20/M tokens |
| o4-mini | $1.10/M tokens | $0.55/M tokens | $4.40/M tokens | $2.20/M tokens |
To put that in perspective, 1 million tokens equates to approximately 750,000 words — making these savings meaningful for developers running large-scale evaluations or pipelines.
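To make the table concrete, here is a small cost calculator (the `job_cost` helper is hypothetical; the per-million-token rates are taken directly from the table above):

```python
# Per-million-token rates in USD, (input, output), from the pricing table.
RATES = {
    "o3":      {"standard": (10.00, 40.00), "flex": (5.00, 20.00)},
    "o4-mini": {"standard": (1.10, 4.40),   "flex": (0.55, 2.20)},
}


def job_cost(model: str, tier: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a job, given token counts and the rate table."""
    in_rate, out_rate = RATES[model][tier]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Example: an evaluation run on o3 with 10M input and 2M output tokens.
standard = job_cost("o3", "standard", 10_000_000, 2_000_000)
flex = job_cost("o3", "flex", 10_000_000, 2_000_000)
print(standard, flex)  # 180.0 90.0
```

At this scale the run costs $180 on the standard tier versus $90 on Flex, which is exactly the 50% discount OpenAI advertises.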
A Response to Market Pressure
The release of Flex processing coincides with competitive moves from rival tech giants. On the same day, Google launched Gemini 2.5 Flash, a budget-friendly reasoning model said to rival DeepSeek’s R1 in both performance and price.
With AI costs under scrutiny and many organizations looking to scale affordably, OpenAI’s Flex aims to capture use cases that don’t require real-time performance but still demand the sophistication of frontier models.
Verification Now Required
Alongside Flex, OpenAI also announced changes to access control for its most advanced models. In an email to developers, the company stated that:
- Users in tiers 1–3 (based on spend levels) must now complete ID verification to access models like o3.
- Reasoning summaries and streaming API features are also locked behind verification.
The move appears to be part of OpenAI’s broader strategy to reinforce responsible usage. According to the company, these checks are designed to prevent abuse and ensure compliance with its safety protocols.
“As AI becomes more powerful, we want to make sure it’s being used by verified users who follow our safety standards,” OpenAI said in its announcement.
A Balancing Act Between Power and Price
Flex processing is the latest example of OpenAI trying to walk the line between innovation and accessibility. While high-performance models remain expensive, Flex offers a middle ground — delivering advanced capabilities at a lower cost for users who don’t mind a bit of latency.
Whether this will attract more developers or shift industry dynamics remains to be seen. But as AI deployment continues to scale, options like Flex could become a vital tool for startups, researchers, and enterprises alike.