Playing Around with OpenAI's GPT Realtime Voice API

May 08, 2026

I've been fascinated by voice assistants for a while, so when OpenAI released their new GPT Realtime API I decided to try it out.

Voice Assistant

Yesterday OpenAI added three new models to their Realtime API. The one that caught my attention was GPT-Realtime-2, which they claim is "our first voice model with GPT-5-class reasoning that can handle harder requests and carry the conversation forward naturally." In the demo, the presenter asked the assistant to look up calendar information and even told it to wait until he said a special phrase before it resumed the conversation. So I decided to play around with it and build my own voice assistant. I used SolveIt, a Jupyter-like environment with AI integration, together with Codex to build a working prototype that can do web searches. The final artifact is a Python script you can run with uv run gpt_realtime.py. You'll need an OpenAI API key for the Realtime API and a Gemini API key for the web-search tool.
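The core of wiring up a tool is just the session configuration you send over the Realtime websocket after connecting. Here's a minimal sketch of how my script registers the web-search tool; the event shape follows OpenAI's Realtime API docs at the time of writing, but the tool name, voice, and instructions are my own choices, so double-check the current docs before relying on this.

```python
import json

def build_session_update(instructions: str) -> dict:
    """Build a session.update event that registers a client-side web_search tool."""
    return {
        "type": "session.update",
        "session": {
            "instructions": instructions,
            "voice": "marin",  # assumption: one of the available Realtime voices
            "modalities": ["audio", "text"],
            "tools": [
                {
                    "type": "function",
                    "name": "web_search",  # my tool name; I handle calls client-side
                    "description": "Search the web and return a short summary.",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "query": {"type": "string", "description": "Search query"},
                        },
                        "required": ["query"],
                    },
                }
            ],
        },
    }

event = build_session_update("You are a helpful voice assistant.")
print(json.dumps(event, indent=2))
```

The model never runs the search itself; it just emits a function call with the arguments, and the client is responsible for executing it and sending the result back.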

The API really earns its Realtime moniker with its speed. The ability to have it call tools while also narrating what it's doing makes for an interesting UX, and I'm excited to explore it with more sophisticated tools.
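For the curious, the tool-call round trip on the client side looks roughly like this. This is a sketch, assuming the event names from OpenAI's Realtime docs ("response.function_call_arguments.done", "conversation.item.create", "response.create"); the search function is stubbed out here, whereas in my script it calls Gemini.

```python
import json

def handle_event(event: dict, search_fn) -> list[dict]:
    """Return the events to send back to the server when the model requests a tool call."""
    if event.get("type") != "response.function_call_arguments.done":
        return []  # not a completed tool call; nothing to send
    args = json.loads(event["arguments"])
    result = search_fn(args["query"])  # e.g. a Gemini-backed web search
    return [
        {   # hand the tool output back into the conversation
            "type": "conversation.item.create",
            "item": {
                "type": "function_call_output",
                "call_id": event["call_id"],
                "output": result,
            },
        },
        # then ask the model to continue speaking with the new context
        {"type": "response.create"},
    ]

# Quick check with a fake event and a stub search function:
fake = {"type": "response.function_call_arguments.done",
        "call_id": "call_123", "arguments": '{"query": "weather in Paris"}'}
replies = handle_event(fake, lambda q: f"Results for {q}")
print(replies[0]["item"]["output"])  # → Results for weather in Paris
```

Because the model keeps talking while the client fetches results, this loop is what produces the narrate-while-working effect from the demo.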