“It’s something that, you know, we can’t comment on at this point. It’s very competitive out there,” OpenAI chief scientist Ilya Sutskever said in a video call with the GPT-4 team an hour after the announcement.
GPT-4 will be available on a limited, text-only basis to waitlisted users and paying subscribers to ChatGPT Plus.
GPT-4 is a multimodal large language model, meaning it can respond to both text and images. Give it a picture of the contents of your fridge and ask what you can make, and GPT-4 will try to provide recipes that use the ingredients in the picture.
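In practice, an image prompt like the fridge example travels alongside the text question in a single request. A minimal sketch of what such a request looks like, assuming the `image_url` content-part format of OpenAI's chat-completions API (the model name and the example URL are placeholders, and image input was not generally available at launch; no request is actually sent here):

```python
# Sketch of a multimodal request payload in the style of OpenAI's
# chat-completions API. Model name and image-input availability are
# assumptions; this only builds the payload, it does not send it.

def build_fridge_prompt(image_url: str) -> dict:
    """Pair a text question with an image in one chat message."""
    return {
        "model": "gpt-4",  # placeholder; image input was waitlisted at launch
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "What can I cook with the ingredients in this picture?"},
                    {"type": "image_url",
                     "image_url": {"url": image_url}},
                ],
            }
        ],
    }

payload = build_fridge_prompt("https://example.com/fridge.jpg")
# An authorized client would then POST this payload to the
# chat-completions endpoint and read the recipes from the reply.
```

The point of the structure is that text and image are peer content parts within one user turn, which is what lets the model answer a question *about* the picture rather than treating the two inputs separately.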
“The continuous improvement in many aspects is impressive,” says Oren Etzioni of the Allen Institute for AI. “GPT-4 is now the standard against which all base models are evaluated.”
“A good multimodal model has been the holy grail of many large tech labs for the past couple of years,” said Thomas Wolf, cofounder of Hugging Face, the AI startup behind the open-source language model BLOOM. But it has remained elusive.
In theory, combining text and images could allow multimodal models to understand the world better. “It may be able to tackle traditional weak points of language models, like spatial reasoning,” says Wolf.
It is not yet clear whether that is true for GPT-4. OpenAI’s new model does appear to be better than ChatGPT at some basic reasoning, solving simple puzzles such as summarizing a block of text in words that start with the same letter. In my demo, GPT-4 summarized a blurb from the OpenAI website in words beginning with g: “GPT-4, groundbreaking generational growth, gains greater grades. Guardrails, guidance, and gains garnered. Gigantic, groundbreaking, and globally gifted.” In another demo, GPT-4 took in a tax document, answered questions about it, and cited the reasons for its answers.
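The constraint in that first demo is mechanical enough to check in a few lines of code. A minimal sketch (the function name and the punctuation handling are my own, not part of the demo):

```python
import string

def all_words_start_with(text: str, letter: str) -> bool:
    """Check that every word in `text` begins with `letter`, case-insensitively.

    Surrounding punctuation (commas, periods, quotes) is stripped before
    the check so that "Gigantic," still counts as a g-word.
    """
    words = (w.strip(string.punctuation) for w in text.split())
    return all(w.lower().startswith(letter.lower()) for w in words if w)

print(all_words_start_with("Great green gadgets gleam gently.", "g"))  # True
print(all_words_start_with("Great red gadgets", "g"))                  # False
```

Checks like this are one easy way to score a model on constrained-generation puzzles without a human in the loop.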