OpenAI launched three audio models for its developer platform on Thursday to help software agents listen, translate and act during live conversations.
The ChatGPT maker said the models are available for testing in its developer playground.
The new models are GPT-Realtime-2, GPT-Realtime-Translate and GPT-Realtime-Whisper.
GPT-Realtime-2 is designed to handle more complex voice requests, call tools, manage interruptions and maintain context across longer sessions.
GPT-Realtime-Translate supports translation from more than 70 languages into 13 output languages.
OpenAI said the translation model targets customer support, education and other live conversation settings.
GPT-Realtime-Whisper provides live speech-to-text for captions, meeting notes and workflow updates as a speaker talks.
Customers testing the models include Zillow, Priceline and Deutsche Telekom.
Pricing for GPT-Realtime-2 starts at $32 per million audio input tokens.
GPT-Realtime-Translate costs $0.034 per minute, while GPT-Realtime-Whisper costs $0.017 per minute.
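For a rough sense of scale, the quoted rates can be turned into a back-of-the-envelope estimate. The sketch below uses only the three prices reported above; the function names, the 60-minute session length and the 250,000-token figure are illustrative assumptions, not numbers from OpenAI, and output-token charges for GPT-Realtime-2 are left out because they are not given here.

```python
# Rates as quoted in the article (USD).
REALTIME_2_INPUT_PER_MILLION_TOKENS = 32.0  # starting price, audio input tokens
TRANSLATE_PER_MINUTE = 0.034
WHISPER_PER_MINUTE = 0.017


def per_minute_cost(rate_per_minute: float, minutes: float) -> float:
    """Cost of a per-minute-billed model for a session of the given length."""
    return rate_per_minute * minutes


def token_cost(rate_per_million: float, tokens: int) -> float:
    """Cost of a token-billed model for the given number of tokens."""
    return rate_per_million * tokens / 1_000_000


if __name__ == "__main__":
    minutes = 60            # hypothetical session length
    input_tokens = 250_000  # hypothetical audio input volume

    translate = per_minute_cost(TRANSLATE_PER_MINUTE, minutes)
    whisper = per_minute_cost(WHISPER_PER_MINUTE, minutes)
    realtime_2 = token_cost(REALTIME_2_INPUT_PER_MILLION_TOKENS, input_tokens)

    print(f"GPT-Realtime-Translate, {minutes} min: ${translate:.2f}")
    print(f"GPT-Realtime-Whisper, {minutes} min: ${whisper:.2f}")
    print(f"GPT-Realtime-2, {input_tokens:,} input tokens: ${realtime_2:.2f}")
```

At these rates, an hour of live translation would come to about $2.04 and an hour of transcription to about $1.02, with translation priced at exactly twice the per-minute rate of transcription.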