OpenAI has introduced the world to its latest powerful AI model, GPT-4, and, encouragingly, the first application associated with its new capabilities is helping people with visual impairments. Be My Eyes, which lets blind and low-vision people ask others to describe what their phone sees, is getting a ‘virtual volunteer’ that provides AI-powered assistance at any time.
We’ve written about Be My Eyes many times since its launch in 2015, and of course the rise of computer vision and other tools has featured prominently in its story of helping the visually impaired navigate everyday life more easily. But the app itself can only do so much, and its core feature has always been the ability to get a helping hand from a volunteer, who can look through your phone’s camera view and give detailed descriptions or instructions.
The new version of the app is the first to integrate GPT-4’s multimodal capabilities, meaning it can not only hold an intelligent conversation but also examine and understand the images it is given.
Users can send images through the app to an AI-powered Virtual Volunteer, which will answer any questions about that image and provide quick visual assistance for a variety of tasks.
For example, if a user sends a picture of the inside of their fridge, the Virtual Volunteer will not only identify what’s inside, but also work out what can be made with those ingredients. The tool can then offer a number of recipes for those ingredients and send step-by-step instructions on how to make them.
But the video accompanying the explanation is even more enlightening. In it, Be My Eyes user Lucy shows the app helping her with a range of tasks live. If you aren’t familiar with the rapid-fire patois of a screen reader, you might miss some of the dialogue, but she has it describe the look of a dress, identify a plant, read a map, translate a label, direct her to a particular machine at the gym, and tell her which buttons to push on a vending machine. (You can watch the video below.)
Be My Eyes Virtual Volunteer
It’s a very succinct demonstration of how unfriendly much of our urban and commercial infrastructure is to people with visual impairments. And it also shows how useful GPT-4’s multimodal chat can be in the right context.
There is no doubt that human volunteers will remain instrumental for users of the Be My Eyes app. The AI does not replace them; it only raises the bar for when they are needed (and they can be summoned immediately if the AI’s response isn’t good enough).
For example, the AI helpfully suggests that in the gym, “the machines are where the people aren’t.” Thanks! As OpenAI co-founder Sam Altman said today, the capabilities are more impressive at first blush than once you’ve used them for a while, but we should be careful not to look this gift horse too closely in the mouth.
The Be My Eyes team is working closely with OpenAI and the community to define and guide its capabilities as development continues.
Right now, the feature is in a closed beta among a “small subset” of Be My Eyes users, which will expand in the coming weeks. “We hope to make the Virtual Volunteer more widely available in the coming months,” the team wrote. “Like our existing volunteer service, this tool is free for all blind and low-vision community members who use the Be My Eyes app.”
Considering how quickly ChatGPT was put to work in corporate SaaS platforms and other prosaic applications, it’s heartening to see this new model immediately put to work helping people. You can read more about GPT-4 here.