Summary
Before the redesign I led, Images mode required users to upload or share an existing photo, from which they could extract text or get a basic description.
Now, Images mode opens to a photo-capturing experience with assistive audio framing guidance. Blind users can line up their shot, snap a photo, have AI describe it, and ask follow-up questions.
Problem
Images typically lack a description or alt text, which excludes blind and low-vision people from one of the most widely used media formats today.
Design
We wanted blind and low-vision users to know what's in the frame and to get help lining up a selfie (a common desire we heard in our research). We reused technology from other parts of Lookout: the selfie camera tells you how to adjust your shot ("face cropped on bottom").
Once the user presses the shutter button, the app immediately sends the AI prompt to describe the photo. The description is read aloud with word highlighting, followed by other metadata such as any recognized text. Users can then ask follow-up questions about the photo, share the image and/or description, download it, and more.
I wanted the buttons to be as large as reasonably possible given our user base's disabilities, knowing that many users maximize their system-level display and font size.
I led this design process, which included:
The overall UI/UX, including the capture experience, description page, chat page, and prompt menu
A tutorial video that the app opens to
Marketing materials for Google I/O
A workshop with the DeepMind team