OpenAI Rolls Out gpt-realtime Voice Model

OpenAI has made its Realtime API generally available and launched gpt-realtime, a next-generation speech-to-speech model with new voices, image and phone support, stronger benchmarks, lower pricing, and enterprise-ready safeguards—bringing natural, production-ready voice agents to real-world deployment. (Source: Image by RR)

Zillow, T-Mobile, and StubHub Already Using Realtime API for Customer Experience

OpenAI has moved its Realtime API into general availability and released gpt-realtime, its most advanced speech-to-speech model yet. The upgrades mark a major step toward making production-ready voice agents practical for enterprises, adding support for remote MCP servers, image input, and phone calls over Session Initiation Protocol (SIP). According to an article on openai.com, these features let developers build AI systems that interact more naturally with users while integrating with existing business infrastructure.

The new gpt-realtime model delivers substantial improvements in intelligence, naturalness, and instruction-following. It can handle complex requests, repeat disclaimers word-for-word, read alphanumeric sequences across multiple languages, and even switch tone or accent mid-sentence. Two new voices, Marin and Cedar, debut with the launch, bringing more human-like expressiveness. Benchmarks highlight the leap in performance over the previous model: 82.8% accuracy on the Big Bench Audio reasoning test (up from 65.6% in 2024), 30.5% on MultiChallenge instruction following (up from 20.6%), and 66.5% accuracy on ComplexFuncBench for function calling (up from 49.7%).
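To put those benchmark figures side by side, the short Python sketch below computes the gain for each test in percentage points and as a relative improvement. The scores are the ones quoted above; the `gains` helper and the dictionary layout are illustrative assumptions, not anything published by OpenAI.

```python
# Scores quoted in the article: (previous model, gpt-realtime), in percent.
BENCHMARKS = {
    "Big Bench Audio": (65.6, 82.8),
    "MultiChallenge": (20.6, 30.5),
    "ComplexFuncBench": (49.7, 66.5),
}


def gains(previous: float, current: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent), rounded to 0.1."""
    points = current - previous
    relative = points / previous * 100.0
    return round(points, 1), round(relative, 1)


# Print one summary line per benchmark.
for name, (previous, current) in BENCHMARKS.items():
    points, relative = gains(previous, current)
    print(f"{name}: +{points} points ({relative}% relative improvement)")
```

In absolute terms the gains work out to 17.2, 9.9, and 16.8 percentage points respectively.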

Unlike traditional voice pipelines that chain separate speech-to-text and text-to-speech models, the Realtime API processes and generates audio directly through a single model, reducing latency and preserving nuance. Early enterprise partners, including Zillow, T-Mobile, StubHub, Oscar Health, and Lemonade, are already using the technology to power real-world use cases such as guiding home searches, handling customer support, and managing insurance or healthcare conversations. Zillow’s head of AI called the improvements a way to make digital interactions feel “as natural as a conversation with a friend.”

OpenAI is also lowering costs and expanding safety controls. Pricing for gpt-realtime is 20% lower than for the earlier preview model, at $32 per million audio input tokens and $64 per million audio output tokens. Developers can reuse prompts across sessions and use new tools for managing long conversations cost-effectively. On the safety front, the system integrates active classifiers to halt misuse, preset voices to avoid impersonation, and enterprise-level privacy commitments including EU data residency. Together, these updates make gpt-realtime and the Realtime API a foundation for scaling conversational AI in production environments.
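As a rough illustration of what those rates mean in practice, the Python sketch below estimates the audio cost of a single session. Only the $32 and $64 rates and the 20% cut come from the article; the token counts are invented for the example, the function names are hypothetical, and the "implied previous price" is simply derived from the stated cut rather than taken from a price list.

```python
# Rates quoted in the article, in USD per one million audio tokens.
INPUT_PRICE_PER_M = 32.0
OUTPUT_PRICE_PER_M = 64.0
PRICE_CUT = 0.20  # the 20% reduction quoted in the article


def audio_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the audio cost in USD for the given token counts."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def implied_previous_price(current_price: float, cut: float = PRICE_CUT) -> float:
    """Back out the pre-cut price implied by the current price and the cut."""
    return current_price / (1.0 - cut)


# Hypothetical session: these token counts are made up for illustration.
session_cost = audio_cost(input_tokens=50_000, output_tokens=20_000)
print(f"Estimated session cost: ${session_cost:.2f}")
print(f"Implied earlier input rate: ${implied_previous_price(INPUT_PRICE_PER_M):.2f} per million tokens")
print(f"Implied earlier output rate: ${implied_previous_price(OUTPUT_PRICE_PER_M):.2f} per million tokens")
```

The estimate covers audio tokens only; any other charges that apply to a session would be added on top.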

About the Author: Roque Ramirez

Our Company Mission

Seeflection.AI / Seeflection.com focuses on two areas that reinforce each other. First, Seeflection.com provides AI news, information, e-learning, and associated development resources. Second, we provide AI-based development and support services to companies focused on AI, quantum-AI, and AI-enabled blockchain development. We have a rapidly growing set of affiliations with a range of corporate and non-profit artificial intelligence laboratories and research centers, as well as individuals in various AI specialties. We are active in both primary and applied AI research and development programs, as well as AI applied to medicine, robotics, media, and related markets.

Our Philosophy

Create synergy through applying technology to address long-term problems and create lasting opportunities for people.
