OpenAI on Monday announced GPT-4o, a new AI model that the company says is one step closer to “much more natural human-computer interaction.” The new model accepts any combination of text, audio and images as input and can generate output in all three formats. It can also recognize emotion, handle being interrupted mid-speech, and respond nearly as fast as a human during conversation.
“The special thing about GPT-4o is it brings GPT-4 level intelligence to everyone, including our free users,” said OpenAI CTO Mira Murati during a live-streamed presentation. “This is the first time we’re making a huge step forward when it comes to ease of use.”
During the presentation, OpenAI showed off GPT-4o translating live between English and Italian, helping a researcher solve a linear equation in real time on paper, and providing guidance on deep breathing to another OpenAI executive simply by listening to his breaths.
Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: https://t.co/MYHZB79UqN
Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks. pic.twitter.com/uuthKZyzYx
— OpenAI (@OpenAI) May 13, 2024
The “o” in GPT-4o stands for “omni,” a reference to the model’s multimodal capabilities. OpenAI said that GPT-4o was trained across text, vision and audio, which means all inputs and outputs are processed by the same neural network. That’s a departure from the company’s previous models, GPT-3.5 and GPT-4, which let users ask questions by speaking but first transcribed the speech into text before processing it. That extra step stripped out tone and emotion and made interactions slower.
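For developers, the practical upshot is that text and images go to a single chat endpoint rather than a separate vision pipeline. Below is a minimal sketch of such a request using the official OpenAI Python SDK; the “gpt-4o” model name comes from the announcement, while the prompt and image URL are hypothetical placeholders.

```python
# Minimal sketch: sending mixed text + image input to GPT-4o through the
# OpenAI Python SDK. The prompt and image URL below are made-up examples.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                # Text and image parts travel in the same message,
                # so no separate transcription or vision call is needed.
                {"type": "text", "text": "What equation is written on this page?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/worksheet.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```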
OpenAI is making the new model available to everyone, including free ChatGPT users, over the next few weeks. It’s also releasing a desktop version of ChatGPT, initially for the Mac, which paid users can access starting today.
OpenAI’s announcement comes a day before Google I/O, the company’s annual developer conference. Shortly after OpenAI revealed GPT-4o, Google teased a version of Gemini, its own AI chatbot, with similar capabilities.