The Gist
- AI-driven assistant. GPT4o introduces a faster AI-driven voice assistant with human-like responses.
- Real-time translation. GPT4o supports real-time translation in 50 languages, enhancing communication.
- Enhanced safety. Rigorous testing and safety interventions ensure secure interactions with GPT4o.
After much industry speculation last week about what many thought would be an announcement from OpenAI about its initial venture into search engines, OpenAI CEO Sam Altman posted that its spring announcement was not going to be about an OpenAI search engine, nor was it going to be about ChatGPT5, but rather, it was going to be about something that “feels like magic.”
This led some to speculate that OpenAI would be announcing an AI-driven voice assistant. Others intimated that it would be about the integration of real-time data In ChatGPT, “across English, French, and Spanish.” So what was Altman referring to?
Related Article: What Is ChatGPT? Everything You Need to Know
The Announcement: GPT4o, Standalone App, and Voice Translation
The OpenAI announcement began with an overview that included the introduction of a faster, easier-to-use version of ChatGPT called GPT4o that features the ability to reason across text, vision and voice. GPT4o is said to be as fast as GPT Turbo on text in English and code and can respond to audio inputs in as few as 232 milliseconds, “with an average of 320 milliseconds, which is similar to human response time in a conversation.” This decrease in response time is because, unlike other ChatGPT releases, all inputs and outputs are processed by the same neural network.
This release includes support for 50 languages, and is free for all users, though Plus users have five times the capacity limits of free users. In addition, GPT4o is "natively multimodal," and has the ability to interpret and generate content across voice, text or images. Unlike previous versions, generated images that include text are actually, at least in this demonstration graphic from OpenAI, free from typographical errors:
GPT-4o has been rigorously tested by more than 70 external experts across various fields, including social psychology, bias and fairness, and misinformation, to uncover and assess risks introduced or heightened by its new features. Insights gained from this comprehensive red teaming have guided the development of OpenAI’s safety interventions, enhancing the security of interactions with GPT-4o.
GPT4o: An AI-Driven Voice Assistant
The announcement, which was live-streamed on YouTube, showcased the GPT4o mobile app, which now enables users to interact with ChatGPT via voice commands. GPT4o is able to deliver responses in a natural, human-like voice and perform various vocal characterizations upon request, effectively transforming GPT4o into a digital personal assistant capable of engaging in real-time, spoken conversations. Users can even prompt GPT4o to adopt more expressive tones or mimic a robotic voice. Initially, the selection of voices will be limited to predefined options, but OpenAI has announced plans to introduce an enhanced version of Voice Mode in GPT-4o, currently in alpha, to ChatGPT Plus subscribers in the upcoming weeks.
One of the most impressive demonstrations was GPT4o’s ability to translate spoken words from Italian to English in real-time — and it includes support for 50 languages, turning it into a very effective universal translator. Additionally, it can “see” and interpret written text, as well as facial expressions in images, though the demonstration was limited to the interpretation of “happy.”
Related Article: ChatGPT Is All the Rage but Don't Stop Learning Just Yet
GPT4o Availability
GPT-4o is now available in the OpenAI API as a text and vision model, and is said to be two times faster, at half the cost, and features 5 times the rate limits of GPT-4 Turbo. OpenAI announced that in the coming weeks, support for GPT-4o's new audio and video capabilities will be provided to a small group of trusted partners in the API. Finally, OpenAI announced that in the coming months, it will release a ChatGPT desktop app with GPT-4o capabilities, adding to the current web and mobile versions of the AI technology.