Mobile AI for Vision Loss: Apps, Features, and Accessibility

Explore how AI-powered mobile technology empowers people with vision loss. Learn about top accessibility apps, screen readers, and real-time AI visual assistance.

Mobile Technology and AI for Vision Loss

Modern smartphones and tablets come with a host of programs and settings designed to enable people with disabilities to use them. For these individuals, phones have become not mere conveniences but valuable tools for an independent life. Among other things, they serve as navigational aids, describe images or scenes, scan printed text, hold libraries of resources, reduce the need to obtain paperwork in braille, and quickly summon help when needed. This post covers the basics of using a phone with vision loss and dives into how artificial intelligence (AI) and other developments continue to make phones more useful.

How People with Vision Loss Use Mobile Devices

A user with vision loss can make their phone accessible in two ways. Those with milder vision loss may opt to use screen magnification, which enlarges items on the screen to make them easier to read. For more severe visual impairment, a screen reader translates the graphical interface into a linear text format that can be spoken out loud using a synthesized voice (text-to-speech) or rendered into braille on a braille display. Some people choose to use a combination of both methods.

Native Screen Readers: VoiceOver and TalkBack

Both Apple’s iOS and Google’s Android, the operating systems used by the most popular smartphones, come with built-in screen reading technology. Apple’s screen reader is called VoiceOver, while the Android screen reader is called TalkBack. Both work by converting the graphical screen layout to text and making that text available to the user via speech or braille output. Each allows for different levels of customization to improve the reading experience. Users can choose from a variety of voices and adjust the speaking rate and pitch of the voice. They can use aids like VoiceOver’s Rotor, which allows for navigation by a particular type of element, such as by character or by heading. App developers can also interface specifically with the screen reader, creating custom behaviors meant to boost accessibility within their own third-party apps.
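
To make the developer side concrete, here is a minimal sketch of how a third-party iOS app might describe a control to VoiceOver using Apple's UIAccessibility APIs; the button and its strings are illustrative, and Android's TalkBack offers analogous hooks, such as content descriptions.

```swift
import UIKit

// A minimal sketch of describing an interface element to VoiceOver.
// The button and its strings are illustrative examples.
func configureAccessibility(for donateButton: UIButton) {
    donateButton.accessibilityLabel = "Donate"                    // spoken when the element is focused
    donateButton.accessibilityHint = "Opens the donation form."   // extra context, read after a pause
    donateButton.accessibilityTraits = .button                    // tells VoiceOver how the element behaves
}

// Apps can also notify the screen reader when content changes dynamically:
func announceUpdate() {
    UIAccessibility.post(notification: .announcement,
                         argument: "Feed updated with three new articles.")
}
```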

Magnification and High-Contrast Display Settings

Android and iOS also give users features designed specifically for low vision. Screen magnification, which iOS calls Zoom, enlarges items on the screen; the user controls whether all or only part of the screen is magnified. Both operating systems offer a gesture to toggle magnification and ways to adjust the zoom level on the fly. Beyond magnification, each system includes settings for text size, bold text, increased contrast, color filtering, and more, so users can tailor the display to the needs dictated by their vision loss.
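
Apps can honor these system-wide settings too. Below is a minimal sketch, assuming a simple UIKit label, of how an iOS app might adopt Dynamic Type and respond to two of the standard accessibility flags; the function and styling choices are illustrative.

```swift
import UIKit

// A minimal sketch of honoring system-wide vision settings in a UIKit app.
func styleForAccessibility(_ label: UILabel) {
    // Adopt Dynamic Type so the text follows the user's preferred size.
    label.font = UIFont.preferredFont(forTextStyle: .body)
    label.adjustsFontForContentSizeCategory = true

    // Respect the system-wide Bold Text setting with a heavier text style.
    if UIAccessibility.isBoldTextEnabled {
        label.font = UIFont.preferredFont(forTextStyle: .headline)
    }

    // Respect "Increase Contrast" by preferring high-contrast system colors.
    if UIAccessibility.isDarkerSystemColorsEnabled {
        label.textColor = .label
        label.backgroundColor = .systemBackground
    }
}
```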

Gesture-Based Navigation and Haptic Feedback

A key way in which screen readers help those with vision loss is that each one modifies the basic finger gestures, such as tapping and swiping, that control the phone. For example, with the screen reader off, you activate an item (click a link, press a button, etc.) by tapping the screen once. This would be problematic for a person with vision loss, as the phone would constantly activate unwanted items at a mere touch. Therefore, both screen readers were designed so that tapping once selects an item on the screen, while tapping twice quickly, called a “double tap,” activates it. Many additional gestures allow for scrolling the screen, answering calls, playing and pausing music, and pausing or muting speech. More complex gestures differ between the screen readers: VoiceOver generally uses single-movement gestures with one to four fingers, while TalkBack uses many double-movement gestures, such as swiping up and then right, but with fewer fingers.

The Role of Artificial Intelligence in Visual Accessibility

Computer vision is a form of artificial intelligence in which a computer recognizes, decodes, reconstructs and reports back the content of a visual input. It can be used to identify colors, objects, and environments in the vicinity. For example, a person with vision loss could use computer vision to find the location of nearby doors or find out what a printed photograph shows. It can even be trained to recognize faces.
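
As an illustration of one such building block, the sketch below uses Apple's on-device Vision framework to detect faces in a photo; the function name and simplified error handling are ours.

```swift
import UIKit
import Vision

// A minimal sketch of one computer vision task: counting the faces in a
// photo with Apple's on-device Vision framework. Error handling is trimmed.
func countFaces(in image: UIImage, completion: @escaping (Int) -> Void) {
    guard let cgImage = image.cgImage else { return completion(0) }
    let request = VNDetectFaceRectanglesRequest { request, _ in
        let faces = request.results as? [VNFaceObservation] ?? []
        completion(faces.count)
    }
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    // Vision performs the request synchronously on the calling thread.
    try? handler.perform([request])
}
```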

With the recent explosion of AI technology, people with vision loss can now use many types of AI at once to find information or accomplish tasks. More apps designed specifically for those with vision loss are also taking advantage of the variety of available technology to enhance their services. Mainstream AI agents like ChatGPT, Gemini, and Claude, as well as players in the vision loss space like Be My AI, offer the ability not just to recognize something but to answer questions about it. Users no longer need to wait passively for information to be announced; with the advent of multimodal AI and large language models, they can prompt the AI for the exact information they need and get a response.
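
The interaction pattern looks roughly like the sketch below: the app sends a photo plus a natural-language question and reads back a text answer. The endpoint URL, request fields, and response handling are hypothetical placeholders, not any particular vendor's API.

```swift
import Foundation

// A sketch of the ask-a-question-about-an-image pattern. The endpoint and
// field names are hypothetical placeholders, not a specific vendor's API.
struct VisualQuestion: Codable {
    let imageBase64: String
    let question: String
}

func ask(imageData: Data, question: String) async throws -> String {
    var request = URLRequest(url: URL(string: "https://example.com/v1/visual-qa")!)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONEncoder().encode(
        VisualQuestion(imageBase64: imageData.base64EncodedString(),
                       question: question))
    let (data, _) = try await URLSession.shared.data(for: request)
    // A real client would decode structured JSON; here we return the raw text.
    return String(decoding: data, as: UTF8.self)
}

// Example: try await ask(imageData: photo, question: "What is the expiration date on this label?")
```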

Several apps made specifically for those with vision loss offer the ability to convert a scanned image of text into a format accessible with a screen reader. This is usually done via optical character recognition (OCR). Some OCR apps can quickly scan and read paper documents, often making it unnecessary for people with vision loss to ask for sighted assistance. Other apps can convert electronic files such as scanned PDF documents into readable text and save them on the phone. Advances in OCR technology have revolutionized the ways in which people with vision loss interface with printed and scanned material.
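
As a sketch of the underlying technique, Apple's Vision framework exposes an on-device text recognizer that apps of this kind can build on; the helper function below is illustrative.

```swift
import UIKit
import Vision

// A minimal sketch of on-device OCR with Vision's text recognizer,
// the kind of building block document-scanning apps rely on.
func recognizeText(in image: UIImage, completion: @escaping (String) -> Void) {
    guard let cgImage = image.cgImage else { return completion("") }
    let request = VNRecognizeTextRequest { request, _ in
        let observations = request.results as? [VNRecognizedTextObservation] ?? []
        // Keep the top candidate string for each detected line of text.
        let lines = observations.compactMap { $0.topCandidates(1).first?.string }
        completion(lines.joined(separator: "\n"))
    }
    request.recognitionLevel = .accurate  // favor accuracy over speed
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```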

Leading AI Apps for the Visually Impaired

Be My Eyes is an app that connects people with vision loss to sighted volunteers for tasks requiring visual help, such as finding a particular paper document in a pile, reading a food label, or even navigating. Over the years, Be My Eyes has also made it possible to connect with customer service representatives from a host of well-known companies, enabling those with vision loss to ask targeted questions. Recent developments include Be My AI, a tool that performs image recognition and responds to specific prompts and questions from the user.

An early player in visual intelligence for iOS, now also available on Android, is Microsoft’s Seeing AI. This minimalist app can scan and read text, including multi-page documents; describe the environment in a photo; and answer prompts and questions. It can identify products based on codes on their packaging and provide label information. It can be trained to recognize faces and objects and then used to find them later, and it also recognizes currency, color, and light intensity.

Google’s Lookout, a similar app for Android, offers recognition of text, environments, food labels, documents, currency, and objects, each available as a mode the user can select. A notable feature of Lookout is the ability to choose a kind of object to find from a list: the “Find” mode can search for seating and tables, doors, windows, kitchen appliances, and even bathrooms. Depending on the model of the phone, Lookout can also provide distance and directional feedback when finding objects.

Envision AI is an independently developed recognition app. It can recognize short texts, scan documents, and read PDFs and other images with OCR. It can explore the user’s surroundings and describe scenes, including answering specific questions posed by the user. Envision can also find a specific type of object from a list dozens of entries long; popular options include doors and door handles, light switches, benches, chairs and tables, laptops, couches, and more. Envision is also known for its wearable technology, including the Envision Glasses and Ally Solos Glasses; the latter include Ally, Envision’s accessible AI assistant.

Specialized Navigation and Wayfinding Technology

Smartphones come with built-in map apps, which are valuable navigation aids for those with vision loss. However, other apps offer custom functionality that can give more specific and targeted information. GoodMaps Outdoors, Soundscape, VoiceVista, Lazarillo, and BlindSquare are GPS apps designed for those living with vision loss. They all use different technology and present their information in different ways, so users will need to experiment to find out which apps they prefer and in which contexts. All are based on the concept of points of interest (POIs), which can include home, nearby shops, restaurants, businesses, and transit stops.

Soundscape and VoiceVista, both descended from a discontinued Microsoft project, let the user explore their environment virtually with audio cues and plan routes in advance using their street preview tools. Perhaps their most helpful features are the ability to set custom markers along a route and the “audio beacon,” a sound that plays in the background from a certain distance to the destination and changes based on direction, allowing the person to always know exactly where they are in relation to the point of interest. Various categories of locations, such as transit stops, crosswalks, and other landmarks, get individual sound effects.

GoodMaps Outdoors, Lazarillo, BlindSquare, and others are more typical GPS apps. If the user wears a headset, these apps will announce directions in either the left or right ear, depending on where the destination lies. Notably, GoodMaps Outdoors, Soundscape, and VoiceVista can give directions via the clock face method, which treats straight ahead as 12 o’clock.
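
A minimal sketch of the math behind clock-face directions, using the standard great-circle bearing formula (the function names are ours, not any app's actual implementation):

```swift
import CoreLocation

// Compute the compass bearing from the user to a point of interest, then
// express it relative to the user's heading as an hour on a clock face.
func bearing(from a: CLLocationCoordinate2D, to b: CLLocationCoordinate2D) -> Double {
    let lat1 = a.latitude * .pi / 180, lat2 = b.latitude * .pi / 180
    let dLon = (b.longitude - a.longitude) * .pi / 180
    // Standard initial (great-circle) bearing formula.
    let y = sin(dLon) * cos(lat2)
    let x = cos(lat1) * sin(lat2) - sin(lat1) * cos(lat2) * cos(dLon)
    let degrees = atan2(y, x) * 180 / .pi
    return (degrees + 360).truncatingRemainder(dividingBy: 360)  // 0 = north
}

func clockDirection(userHeading: Double, bearingToPOI: Double) -> Int {
    // Angle of the POI relative to where the user is facing.
    let relative = (bearingToPOI - userHeading + 360).truncatingRemainder(dividingBy: 360)
    // 30 degrees per "hour"; straight ahead is 12 o'clock.
    let hour = Int((relative / 30).rounded()) % 12
    return hour == 0 ? 12 : hour
}

// Example: a destination 90 degrees to the user's right is at "3 o'clock".
```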

In some ways, indoor navigation is still in its infancy due to the unreliability of smartphone location data inside buildings. The key to using any of these apps is that the venue must have partnered with the app and provided maps for navigation, so it is important to check the apps’ regularly updated location directories before taking a trip. Two apps made specifically for people with vision loss are GoodMaps Explore and Waymap. Both provide turn-by-turn navigation with tailored directions, based on the maps they receive from venues and on the phone’s location and motion data. Both also have directories of supported locations, GoodMaps organized alphabetically and Waymap by city. They allow the user to explore locations virtually and test out routes before a trip, similar to features in some of the outdoor navigation apps.

Certain apps like Oko use AI to determine when it is safe for a user with vision loss to cross the street. The user approaches a crosswalk and raises their phone so that the camera can see the pedestrian signal. Oko’s AI analyzes the camera feed and notifies the user when it detects the walk signal, helping them cross in the absence of audible crosswalk signals or other aids.
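
Conceptually, this kind of feature runs each camera frame through an image classifier and alerts the user only at high confidence. The sketch below illustrates the pattern with Apple's Vision and Core ML APIs; PedestrianSignalClassifier is a hypothetical model, not Oko's actual implementation.

```swift
import CoreML
import Vision

// A sketch of a camera-based crossing aid: classify each frame and alert
// the user only when confidence is high. PedestrianSignalClassifier is a
// hypothetical Core ML model; a real app would load it once, not per frame.
func checkFrame(_ pixelBuffer: CVPixelBuffer, onWalkSignal: @escaping () -> Void) {
    guard
        let classifier = try? PedestrianSignalClassifier(configuration: MLModelConfiguration()),
        let model = try? VNCoreMLModel(for: classifier.model)
    else { return }
    let request = VNCoreMLRequest(model: model) { request, _ in
        guard let best = (request.results as? [VNClassificationObservation])?.first else { return }
        // Only announce when the model is confident it sees a walk signal.
        if best.identifier == "walk" && best.confidence > 0.9 {
            onWalkSignal()
        }
    }
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try? handler.perform([request])
}
```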

Voice Assistants and Hands-Free Control

Many specialized AI assistants are now designed to function via voice input. They combine natural-sounding text-to-speech, large language models, and connections to other productivity tools. They can benefit people with vision loss by automating certain activities, reducing reliance on inaccessible apps or websites to accomplish necessary tasks, and communicating in a conversational fashion.

Various built-in and third-party apps, with and without AI, allow people with vision loss to control the phone using only their voice. On Apple’s iOS, for example, the built-in Voice Control can open apps, adjust settings, and more. Additionally, iOS’s Shortcuts app enables automation of complex tasks and can be used in conjunction with Voice Control. Voice Access, the Android equivalent, functions similarly. These programs are good fits for people with vision loss who also have motor or learning disabilities, or who simply find it more convenient to limit interaction with the screen and other visual or tactile elements.
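
On iOS, developers can plug their apps into this ecosystem through the App Intents framework, which exposes app tasks to Siri and the Shortcuts app so they can be run hands-free. A minimal sketch, with a hypothetical intent that reads out today's appointments:

```swift
import AppIntents

// A sketch of exposing an app task to Siri and Shortcuts via App Intents.
// The intent shown here is a hypothetical example.
struct ReadAppointmentsIntent: AppIntent {
    static var title: LocalizedStringResource = "Read Today's Appointments"

    func perform() async throws -> some IntentResult & ProvidesDialog {
        // A real app would query its data store here.
        return .result(dialog: "You have two appointments today.")
    }
}
```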

The Future of Mobile Accessibility: Wearables and Beyond

When integrated with smartphones, AI smart glasses, such as the Ray-Ban Meta glasses and those sold by Envision, provide easy ways to scan text, find objects, and find out what’s around you simply by turning your head. They can also search the web, answer calls, play music, and more. They are a hands-free independent living solution for people with vision loss, representing a future where wearables take center stage in the assistive technology landscape.

Google’s Project Astra powers Aira’s prototype AI visual interpreter. This AI interpreter is designed to find and recognize objects, read texts, perform comparisons, and more, all while interacting with the user in a conversational way. At any time, the user can opt to have one of Aira’s human interpreters take over. Aira currently has a waitlist to join its Trusted Tester program for this new technology.