OpenAI's launch of GPT-4o appears to have upped the large language model (LLM) ante with a real-time conversational chat interface that recognizes audio and video and detect emotions. Here's a look at the implications to the enterprise, the short-term impact and the long run.

First, the details. OpenAI's GPT4o enables you to do a lot more than traditional models. In fact, GPT-4o is more in line with what you'd see in a science fiction movie. It's cool, odd and scary at the same time. For what it's worth, the "o" stand for omni and GPT-4o can respond to audio on an average of 320 milliseconds to match a human. It's also more efficient and cheaper.

Here's a look at the benchmarks.

And in a blog post, OpenAI CEO Sam Altman said: "the new voice (and video) mode is the best computer interface I’ve ever used. It feels like AI from the movies; and it’s still a bit surprising to me that it’s real. Getting to human-level response times and expressiveness turns out to be a big change."

Meanwhile, OpenAI aims to make its new model affordable with a free tier and then access via ChatGPT Plus and Teams with a rollout to enterprises on deck.

Here are a few thoughts about what this all means.

  • Short-term: This launch is almost comical in how it falls ahead of Apple's WWDC. Apple has been widely reported to be in negotiations with OpenAI about embedding models into iOS. And in case you haven't noticed (and who hasn't?) Siri has needed a new brain for years. OpenAI's GPT-4o basically seals the Apple deal and a massive revenue stream.
  • Mid-term: OpenAI has a close relationship with Microsoft and that's been mutually beneficial. The problem for OpenAI is that Microsoft is more likely to own that customer relationship than OpenAI. It's obvious GPT-4o can help build more direct enterprise relationships. Moderna and OpenAI may just be a start as enterprises will want more direct access to GPT-4o.
  • Mid-term: Enterprise use cases with GPT-4o are going to surge. Assuming OpenAI's latest and greatest model can replicate a human closely customer experiences are likely to go even more human.
  • Short-term: GPT-4o is interesting because OpenAI is so good at big bang AI. You can expect the competition to heat up even more from LLM rivals, who may not be so far behind.
  • Short-term: The enterprise pendulum has swung to more model choices, but key platforms like Amazon Bedrock feature a bevy of choices that usually don't include OpenAI. OpenAI with GPT-4o may be able to carve out relationships with hyperscalers not named Microsoft Azure. Enterprises are going to want to swap out models as needed, but OpenAI's latest LLM may be too good to ignore. Despite what enterprises say about being vendor neutral, they usually gravitate to one provider and lock-in (and complain about it later).
  • Long-term: GPT-4o appears to be a big leap and enough to trigger an even larger white-collar recession in the future. There are also a bevy of other cultural issues to ponder. As the LLM race accelerates, these issues are only going to become larger.
  • Long-term: OpenAI has already made ChatGPT a verb to some degree. Broad access at a reasonable price may seal the deal. The company said:

"GPT-4o's text and image capabilities are starting to roll out today in ChatGPT. We are making GPT-4o available in the free tier, and to Plus users with up to 5x higher message limits. We'll roll out a new version of Voice Mode with GPT-4o in alpha within ChatGPT Plus in the coming weeks.

Developers can also now access GPT-4o in the API as a text and vision model. GPT-4o is 2x faster, half the price, and has 5x higher rate limits compared to GPT-4 Turbo. We plan to launch support for GPT-4o's new audio and video capabilities to a small group of trusted partners in the API in the coming weeks."

See more: