
AI-powered smartphone tool helps visually impaired users ‘feel’ where objects are in real time

  • MM24 News Desk
  • 1 hour ago
  • 4 min read

Credit: Caleb Craig/Penn State.


Penn State researchers have developed NaviSense, an AI-powered smartphone app that guides visually impaired users to objects using real-time voice commands and haptic feedback. The award-winning tool uses large language models and vision-language models to identify objects without preloading data, significantly outperforming existing commercial options in testing.


In a breakthrough for assistive technology, researchers at Penn State have created an innovative AI tool that enables visually impaired users to physically "feel" where objects are located around them through their smartphones.


The application, called NaviSense, represents a significant advancement in accessibility technology by combining real-time object recognition with precise haptic and audio guidance, effectively creating a new sensory experience for users who cannot rely on vision to navigate their environments.


The development team, led by Vijaykrishnan Narayanan, Evan Pugh University Professor and A. Robert Noll Chair Professor of Electrical Engineering, took a novel approach by conducting extensive interviews with visually impaired individuals before beginning technical development.




This user-centered design process ensured that the final product addressed real-world challenges rather than theoretical problems. "These interviews gave us a good sense of the actual challenges visually impaired people face," explained Ajay Narayanan Sridhar, a computer engineering doctoral student and lead student investigator on the project.



What sets NaviSense apart from existing visual aid applications is its sophisticated use of artificial intelligence. Unlike previous systems, which required preloaded object models, a significant limitation on their functionality, NaviSense uses large language models (LLMs) and vision-language models (VLMs) hosted on external servers. This architecture allows the app to recognize objects in real time based on voice commands alone, without needing prior knowledge of the environment or the objects within it.


"The technology is quite close to commercial release, and we're working to make it even more accessible," said Professor Narayanan. "Using VLMs and LLMs, NaviSense can recognize objects in its environment in real-time based on voice commands, without needing to preload models of objects. This is a major milestone for this technology."


The application works through a sophisticated multi-step process. Users simply speak what they're looking for, and NaviSense activates the smartphone's camera to scan the environment. The AI then filters out objects that don't match the verbal request and begins guiding the user through a combination of audio instructions and vibrational feedback. If the system doesn't fully understand the request, it engages in conversational follow-up questions to narrow the search—a feature that test users found particularly valuable.
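The app itself is not open source, so the decision logic below is only a rough Python illustration of that flow: given the user's request and a set of candidate detections, either give a directional cue toward the best match or ask a clarifying question. The detection format, confidence threshold, and follow-up trigger are all assumptions made for the sketch.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """One candidate object from the vision model (assumed format)."""
    label: str
    confidence: float
    direction: str  # e.g. "left", "right", "up", "down", "ahead"

CONFIDENCE_THRESHOLD = 0.5  # assumed cutoff for a usable match

def next_prompt(query: str, detections: list[Detection]) -> str:
    """Decide what the app should say next: a directional cue toward
    the best match, or a follow-up question to narrow the search."""
    matches = [d for d in detections if d.confidence >= CONFIDENCE_THRESHOLD]
    if not matches:
        return f"I couldn't find {query}. Can you describe it another way?"
    best = max(matches, key=lambda d: d.confidence)
    return f"The {best.label} is to your {best.direction}."

if __name__ == "__main__":
    scene = [Detection("coffee mug", 0.82, "left"),
             Detection("water bottle", 0.40, "right")]
    print(next_prompt("coffee mug", scene))   # directional cue
    print(next_prompt("house keys", []))      # clarifying question
```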



Perhaps the most innovative aspect of NaviSense is its hand guidance capability. The app tracks the user's hand movements in real time by monitoring how the phone moves, providing continuous feedback about where the desired object is located relative to the hand.


"This hand guidance was really the most important aspect of this tool," Sridhar emphasized. "There was really no off-the-shelf solution that actively guided users' hands to objects, but this feature was continually requested in our survey."


The research team put their creation to the test in a controlled environment where 12 participants compared NaviSense against two commercial visual aid options. The results were striking: NaviSense significantly reduced the time users spent searching for objects while also achieving higher identification accuracy than the existing commercial programs.



Perhaps more importantly, participants reported a substantially better user experience, with one test user enthusiastically noting in a post-experiment survey, "I like the fact that it is giving you cues to the location of where the object is, whether it is left or right, up or down, and then bullseye, boom, you got it."


The tool earned the team the Best Audience Choice Poster Award at the Association for Computing Machinery's SIGACCESS ASSETS '25 conference in Denver this past October, where they presented their findings. The work was also published in the conference proceedings, marking an important academic contribution to the field of accessible computing.


Despite its impressive capabilities, the current iteration of NaviSense still has room for improvement before commercial release. The research team is currently working to optimize the application's power consumption to reduce smartphone battery drain and further refine the efficiency of the LLM and VLM components. These improvements will be crucial for making the technology practical for everyday use.



The development team included several other key contributors from Penn State, including Mehrdad Mahdavi, Hartz Family Associate Professor of Computer Science and Engineering, and Fuli Qiao, a computer science doctoral student.


They collaborated with researchers from the University of Southern California, including Laurent Itti, professor of computer science and psychology, and Yanpei Shi, a computer science doctoral candidate, along with independent researcher Nelson Daniel Troncoso Aldas.


As assistive technology continues to evolve, NaviSense represents a significant step forward in creating more intuitive, responsive tools for visually impaired individuals. By combining cutting-edge AI with thoughtful user-centered design, the Penn State team has demonstrated how technology can bridge accessibility gaps in ways that genuinely improve quality of life. With further refinement and eventual commercial release, NaviSense could become an essential tool for millions of visually impaired people worldwide, transforming how they interact with their environments independently and confidently.

