With the new GPT-4o model, OpenAI takes ChatGPT to the next level

ChatGPT 4o

Pioneering AI firm OpenAI has launched the latest version of its LLM, GPT-4o. The flagship model is being made available to all ChatGPT users free of charge, although paying users will get faster access to it.

There is a lot to this update, but OpenAI highlights improvements to capabilities across text, voice and vision, as well as faster performance. Oh, and if you were curious, the "o" in GPT-4o stands for "omni".


The updates to the model open up a new world of possible uses, and OpenAI says that "GPT-4o is much better than any existing model at understanding and discussing the images you share". It is described as being a "step towards much more natural human-computer interaction -- it accepts as input any combination of text, audio, and image, and generates any combination of text, audio, and image outputs".
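For developers rather than ChatGPT users, the same multimodal capability is exposed through OpenAI's API. As a rough illustration (not taken from OpenAI's announcement), the Python sketch below sends a combined text-and-image request to GPT-4o via the Chat Completions endpoint; the prompt and image URL are placeholders.

# Minimal sketch: a mixed text + image request to GPT-4o through
# OpenAI's Chat Completions API. Prompt and image URL are placeholders.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "What dish is shown in this photo, and what is its history?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/menu-photo.jpg"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)  # the model's text reply

Audio input and output are not shown here; as the quotes below make clear, OpenAI is rolling those modalities out more gradually.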

So far, so vague. But what does it mean in practice? The company offers up some potential usage scenarios:

You can now take a picture of a menu in a different language and talk to GPT-4o to translate it, learn about the food's history and significance, and get recommendations. In the future, improvements will allow for more natural, real-time voice conversation and the ability to converse with ChatGPT via real-time video. For example, you could show ChatGPT a live sports game and ask it to explain the rules to you. We plan to launch a new Voice Mode with these new capabilities in an alpha in the coming weeks, with early access for Plus users as we roll out more broadly.

OpenAI says GPT-4o can respond at speeds comparable to human conversational response times, and it also draws attention to major improvements in translation and in performance across non-English languages.

There have been safety concerns about artificial intelligence from the very beginning, and these are only growing as the technology becomes more powerful. Acknowledging this, OpenAI says:

GPT-4o has also undergone extensive external red teaming with 70+ external experts in domains such as social psychology, bias and fairness, and misinformation to identify risks that are introduced or amplified by the newly added modalities. We used these learnings to build out our safety interventions in order to improve the safety of interacting with GPT-4o. We will continue to mitigate new risks as they're discovered.

We recognize that GPT-4o's audio modalities present a variety of novel risks. Today we are publicly releasing text and image inputs and text outputs. Over the upcoming weeks and months, we'll be working on the technical infrastructure, usability via post-training, and safety necessary to release the other modalities. For example, at launch, audio outputs will be limited to a selection of preset voices and will abide by our existing safety policies. We will share further details addressing the full range of GPT-4o's modalities in the forthcoming system card.

There is a wealth of additional information about GPT-4o available on OpenAI's website.
