Visually impaired individuals lack spatial awareness of obstacles in their environment. Traditional aids such as white canes and seeing-eye dogs allow users to navigate while sensing only a small spatial area around themselves. The aim of this project was to increase the spatial awareness of visually impaired individuals by allowing them to hear their environment. This was accomplished by creating a light-to-audio conversion device built around two single-point LiDAR (light detection and ranging) sensors; the distance values they output are converted into audio feedback that provides a sense of depth perception and warns the user of impending obstacles. Testing showed that the device could successfully increase a user’s ability to detect obstacles at a distance and take evasive action to avoid them. It was also determined that the double-coded audio feedback system is intuitive for first-time users and creates an artificial depth perception. The device was produced on a limited budget and tested in a controlled environment. It was intended to serve as a proof of concept and sufficiently meets the outlined goals. Further testing and higher-quality components are necessary for this device to reach a commercial market.
A 2015 study estimated that 36 million people worldwide suffer from blindness [1]. Those who are visually impaired face many challenges to mobility and independence during everyday activities. The public has become accustomed to aids such as white canes and guide dogs that give the visually impaired some independence. Traditional aids provide accurate feedback for ground-level obstacles but are less helpful for elevated objects (e.g., low-hanging branches, low doorways) or obstacles farther than a meter or two away. When creating a device for the visually impaired, it is important to consider the way in which they learn and navigate their environment. Devices that take advantage of multiple sensory modalities can provide the user with more information; however, they also introduce the potential for information overload. Manufacturers must account for this possibility by investigating the device’s performance under high-load, multisensory conditions.
For visually impaired individuals, having lost one of the five senses, the remaining senses tend to become heightened - in particular, hearing. This heightened hearing makes relying on audio cues to navigate the world a way of life. Yet even with existing navigational aids, the visually impaired still lack the spatial awareness and depth perception that sighted individuals use to detect obstacles in their vicinity. With the goal of providing an artificial sense of depth perception for the visually impaired, our device will use audio cues, processed by the auditory cortex and temporal lobe, in an effort to improve the user’s independence.
The device to be developed should align with standards specified by organizations such as the FDA (Food and Drug Administration) and the ISO (International Organization for Standardization). Additionally, it is important for our device to be as environmentally friendly as possible and to avoid creating unnecessary waste. For testing purposes, the environment will need to be tightly controlled, since the device is only a prototype intended as a proof of concept.
A device that provides an artificial depth perception to aid in detecting potential hazards can give the visually impaired more independence in their day-to-day tasks. Four specific groups can be served by this device: 1) researchers, 2) hospitals, 3) ophthalmologists and optometrists, and 4) visually impaired patients. Researchers looking to create object-identification or sound-feedback algorithms that aid in object detection can greatly benefit from this device. Hospitals could offer this device as an option for patients to use in their everyday lives. Similarly, ophthalmologists and optometrists can recommend this device to patients and use it as a stepping stone toward a better product. The visually impaired can use this device to improve their independence and prevent injury from unseen obstacles.
To further quantify the need for this device, an estimated 36 million people worldwide are blind [1].
Problem Statement: Even while using traditional aids, visually impaired individuals lack spatial awareness and depth perception because they are limited by the reach of their white cane. The ability to detect objects beyond the range of traditional aids could greatly enhance a visually impaired individual’s independence and give them the ability to perceive distant objects much as sighted individuals do.
Everyday mobility for the visually impaired poses many challenges, and existing aids fail to solve all of them. Canes are used to detect obstacles that are primarily close to the ground (e.g., steps, walls, trip hazards, other people) but often fail to detect obstacles above the head (e.g., low-hanging branches). Guide dogs also provide a great amount of mobility and are specially trained to help the visually impaired navigate through crowds, habitual routes, and various obstacles [2]. However, they struggle in unfamiliar environments, so they are often paired with navigational mobile applications such as Google Maps. Simultaneously managing a guide dog, listening to a navigational guide, and remaining aware of possible obstacles is stressful.
Before constructing a device to aid the visually impaired, it is important to consider the optimal, user-friendly design for this audience. Most sighted individuals use visual experience to form spatial representations of their surroundings, but this is not possible for the visually impaired. Many visually impaired individuals tend to use a verbal rehearsal strategy rather than a mental imagery strategy when navigating an environment, but studies have shown that this method leads to lower mobility and poorer spatial representations. Blind individuals who rely on mental imagery strategies tend to be more mobile, so visual aid devices should cater to this strategy [3]. Additionally, wayfinding experiments show that blind individuals use significantly more information and make more decisions than sighted individuals when navigating environments [4]. As such, attention to these details is vital when constructing a visual aid device.
Not only must a device cater to how the blind navigate, but it must also be easy for the blind to learn how to use. Sighted individuals learn new actions primarily through the mirror system by mimicking actions that they see. Studies have shown that blind individuals can still engage the mirror system but rely on the audio cues associated with actions in the absence of visual cues [5]. Instead of mimicking the actions they see, they learn to recreate the sounds of the actions they hear. To create a user-friendly device for the blind, it is therefore important to use audio cues to guide the user through the device’s functions. Additionally, if the device uses a touchscreen, such as a smartphone, it will need to use gestures that cater to the blind. Studies have shown that blind individuals prefer edge-based gestures and tend to perform gestures with different shapes, sizes, and speeds than sighted individuals [6]. For a device to be user friendly, it must incorporate audio cues, easy-to-use gestures, and a simple button design to control its functions.
Spatial auditory input has an advantage over tactile devices (e.g., braille) in that a user can take in multiple pieces of information at once, and hearing can be perceived at various levels of intensity [7]. One method uses earcons to acoustically relay information about an object and its position [7]. An earcon is a specific sound used to represent an object, movement, or direction. One challenge is that those born blind (i.e., with congenital blindness) are limited in their ability to form spatial representations in their mind; those who become visually impaired later in life have a greater ability to visualize their surroundings [7].
3D spatial guidance through audio mapping can provide real-time audio feedback to help the user locate a desired object and to provide information on what the user is touching [8]. Another method is an audio-based 3D targeting strategy that uses monaural sound to indicate the location of a target relative to a stylus tip, helping the user locate an object [8]. Tempo changes indicate the distance to the “target,” and a low or high pitch indicates whether the user should move the stylus down or up, respectively, to hit the “target” [8]. With this 3D audio mapping method, users reported being unsure of how to move the stylus to increase the tempo (i.e., move closer to the target) and found the learning process slow [8]. Since our device uses a similar audio communication method, we need to develop an intuitive audio protocol for conveying distance information to the user, as sketched below.
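The sketch below shows one possible form such a double-coded protocol could take for a two-sensor configuration: the measured distance controls the beep tempo, and the sensor that detected the obstacle selects the pitch. The function names, the upper/lower sensor assignment, and all numeric ranges are illustrative assumptions rather than final design values.

```python
# Minimal sketch (hypothetical parameters): double-coding a single-point
# LiDAR distance reading as audio. Closer obstacles -> faster beeps;
# the sensor that fired (assumed upper vs. lower unit) -> higher vs. lower pitch.

def distance_to_beep_interval(distance_m: float,
                              min_interval_s: float = 0.1,
                              max_interval_s: float = 1.0,
                              max_range_m: float = 4.0) -> float:
    """Map distance to the pause between beeps (shorter pause = closer)."""
    d = min(max(distance_m, 0.0), max_range_m)
    return min_interval_s + (max_interval_s - min_interval_s) * (d / max_range_m)

def sensor_to_pitch(sensor_id: str) -> float:
    """Assign a pitch (Hz) per sensor so the user can tell which unit fired."""
    return {"upper": 880.0, "lower": 440.0}[sensor_id]

# Example: an obstacle 1.2 m away seen by the (assumed) head-level sensor.
interval = distance_to_beep_interval(1.2)
pitch = sensor_to_pitch("upper")
print(f"Beep every {interval:.2f} s at {pitch:.0f} Hz")
```

Coding distance as tempo and sensor identity as pitch keeps the two channels of information separable, which is the property the cited targeting study found difficult to learn when the mapping was not intuitive.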
Haptic feedback devices pair well with audio feedback because the different modalities cause less interference; the user can track more information when two different modalities are used to relay information about the surroundings.
A device known as Phantom Omni is commonly used for research and educational purposes for tactile rendering of 3D models; it has 6 degrees of freedom [9].
Haptic feedback devices have been used to direct the user through a virtual corridor [10]. Haptic feedback is given when the user is not headed in the correct direction and an adjustment is required; when the user is headed straight down the virtual corridor, the vibrations cease. The virtual corridor thus adds a navigational component that can direct the user to a desired destination.
Tactile belts make use of haptic feedback to warn users of obstacles in front of them [11]. One such belt consists of a 3x3 array of tactors that indicates the locations of incoming obstacles: the top row represents hanging obstacles, the middle row grounded obstacles, and the bottom row gaps. An obstacle’s distance from the user is conveyed by varying the rate of tactile pulses, with closer objects producing higher-frequency vibration, as illustrated in the sketch below.
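As a rough illustration of this coding scheme, the sketch below maps an obstacle’s category to a tactor row and its distance to a pulse rate. The thresholds, rates, and row indexing are hypothetical and are not taken from the cited design.

```python
# Illustrative sketch of the belt's row/rate coding described above
# (hypothetical values, not the parameters of the cited belt).

def select_row(obstacle_type: str) -> int:
    """Top row (0): hanging obstacles, middle (1): grounded obstacles, bottom (2): gaps."""
    return {"hanging": 0, "grounded": 1, "gap": 2}[obstacle_type]

def pulse_rate_hz(distance_m: float, near_m: float = 0.5, far_m: float = 3.0,
                  min_hz: float = 1.0, max_hz: float = 8.0) -> float:
    """Closer obstacles produce a higher tactile pulse rate."""
    d = min(max(distance_m, near_m), far_m)
    frac = (far_m - d) / (far_m - near_m)   # 1.0 when near, 0.0 when far
    return min_hz + frac * (max_hz - min_hz)

# Example: a hanging obstacle 0.8 m away drives the top row at ~7 Hz.
print(select_row("hanging"), round(pulse_rate_hz(0.8), 1))
```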
However, while using two separate modalities makes a greater amount of information accessible to the user, it also introduces the potential for information overload [11]. Electronic travel aids (ETAs) may provide useful information to visually impaired individuals beyond that provided by the white cane, but manufacturers must take into account how these tools perform under high-load, multitasking conditions. Before adding haptic feedback to our device, we must determine whether it would distract the user from our audio feedback.
Advances in computer vision and perception algorithms have enabled complex applications such as self-driving cars that use LiDAR sensors. LiDAR sensors determine range by emitting a laser and measuring the time it takes for the light to reflect back to the receiver. To reduce noise, lane detection uses edge- and color-based thresholding to filter images and region-of-interest (ROI) masking to restrict the search to areas where lanes are most likely to appear [12]. One way to find these maximum-probability regions is the sliding-window search algorithm, which identifies the regions of the frame with the highest density of nonzero pixels (see the sketch below). Such minimalist lane detection algorithms are sufficient for straight roads without obstructions; however, more robust algorithms are required to adapt to conditions such as curves.
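The following sketch illustrates the sliding-window idea on a binary (thresholded) bird’s-eye image using NumPy. The window count, margin, and pixel threshold are illustrative values rather than parameters from the cited paper.

```python
# Sliding-window lane search over a binary image: start from the column
# with the most nonzero pixels in the lower half, then step upward,
# re-centering each window on the mean x of the pixels it contains.
import numpy as np

def sliding_window_lane(binary: np.ndarray, n_windows: int = 9, margin: int = 50):
    h, _ = binary.shape
    # Histogram of the lower half locates the likely base of the lane line.
    histogram = binary[h // 2:, :].sum(axis=0)
    x_current = int(np.argmax(histogram))
    window_h = h // n_windows
    ys, xs = binary.nonzero()
    lane_idx = []
    for i in range(n_windows):
        y_low, y_high = h - (i + 1) * window_h, h - i * window_h
        x_low, x_high = x_current - margin, x_current + margin
        in_win = np.where((ys >= y_low) & (ys < y_high) &
                          (xs >= x_low) & (xs < x_high))[0]
        lane_idx.append(in_win)
        if len(in_win) > 50:            # re-center on the dense region
            x_current = int(xs[in_win].mean())
    lane_idx = np.concatenate(lane_idx)
    if lane_idx.size < 3:
        return None
    # Fit a second-order polynomial x = f(y) through the collected pixels.
    return np.polyfit(ys[lane_idx], xs[lane_idx], 2)

# Tiny demo: a synthetic vertical "lane line" three pixels wide.
demo = np.zeros((90, 120), dtype=np.uint8)
demo[:, 60:63] = 1
print(sliding_window_lane(demo))
```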
Spat is a software suite for real-time spatialization of sound signals, generally intended for music creation, postproduction, and live performance [13]. It allows users to control the localization of sound sources in a 3D audio space and to add spatialization in real time, outputting the result to an electroacoustic system (loudspeakers or headphones).
SLAM (simultaneous localization and mapping) is a method used for autonomous vehicles that builds a map of an environment while simultaneously localizing the vehicle within that map [14]. SLAM algorithms also allow vehicles to map unknown environments, which enables path planning and obstacle avoidance. SLAM relies on two technologies to work properly: sensor signal processing, which is highly dependent on the sensors used, and pose-graph optimization, which is sensor agnostic. The sensor signal processing, or front end, is primarily used for motion estimation and obstacle location estimation, whereas the pose-graph optimization, or back end, is independent of the sensor and registers and optimizes the pose graph for location estimation [14].
There are two main methods of SLAM: visual SLAM (vSLAM) and LiDAR SLAM. vSLAM uses images acquired from cameras or other image sensors, ranging from simple cameras to stereo and RGB-D cameras [14]. This approach has the advantage of low cost, depending on which cameras are used. LiDAR SLAM primarily uses a laser distance sensor, which is significantly more precise than cameras and is why LiDAR is favored for self-driving vehicles and drones. The LiDAR sensor outputs 2D or 3D point cloud data (a collection of data points in 3D space), allowing highly accurate map construction and vehicle localization. However, because point clouds are not as detailed as camera images and require high processing power, LiDAR SLAM is generally reserved for applications where accuracy and safety are most critical, such as UAVs and autonomous vehicles [14].
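To make the front-end/back-end split concrete, the toy sketch below treats the front end as a source of relative motion estimates between consecutive poses and the back end as the step that chains (and, in a real system, optimizes) them into a trajectory. It is a conceptual illustration only, with fabricated odometry values, not a working SLAM implementation.

```python
# Toy illustration of the SLAM pipeline split: front-end relative motions
# are composed into a pose trajectory that a real back end would refine
# via pose-graph optimization.
import math

def compose(pose, delta):
    """Compose a 2D pose (x, y, theta) with a relative motion (dx, dy, dtheta)."""
    x, y, th = pose
    dx, dy, dth = delta
    return (x + dx * math.cos(th) - dy * math.sin(th),
            y + dx * math.sin(th) + dy * math.cos(th),
            th + dth)

# "Front-end" output: relative motions between scans (illustrative values).
odometry = [(1.0, 0.0, 0.0), (1.0, 0.0, math.pi / 2), (1.0, 0.0, 0.0)]

# "Back end" (here just dead reckoning): chain the constraints into poses.
trajectory = [(0.0, 0.0, 0.0)]
for delta in odometry:
    trajectory.append(compose(trajectory[-1], delta))

for p in trajectory:
    print(tuple(round(v, 2) for v in p))
```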
There are several novel solutions currently on the market designed to aid the visually impaired. Some are designed for those who are legally blind but not totally blind. Others incorporate haptic feedback to guide the user down a predetermined path. Narration is another prominent feature, using machine learning algorithms to identify, read, and describe objects or text in the user’s view.
eSight pairs a high-quality camera with two high-resolution screens placed close to the user’s eyes by a “halo comfort band.” This solution is designed for the legally blind and can bring their eyesight to 20/20. The halo band holds the screens as well as a shell on the back containing rechargeable batteries. These batteries last approximately three hours and can easily be replaced during longer sessions [15].
The OrCam MyEye uses machine learning to identify and narrate objects and text to the user. It narrates full sentences to describe scenes, much like an assistant. Its compact design clips magnetically to the side of a pair of glasses. Features of the MyEye include object and product recognition, text scanning and reading, facial recognition, and money note recognition [16].
Microsoft Seeing AI is a similar experimental application that takes photos and attempts to recognize the objects in them. It then forms a natural sentence that is read aloud to the user. Like the MyEye, the app also features product recognition, document scanning and reading, and facial recognition [17].
Wayband pairs with a navigation app to deliver haptic feedback through bracelets that keep the user within a “virtual corridor.” When the user strays outside the safe area, one of the bracelets vibrates to indicate danger and push the user back onto the predetermined path [10].
While these recent devices are undoubtedly groundbreaking, each still has issues that prevent it from replacing the traditional white cane or guide dog. Microsoft Seeing AI and the OrCam MyEye act as effective assistants for the visually impaired, but they lack the ability to detect imminent obstacles. They provide independence by guiding users to their destination and pointing out key locations along the way; however, they are meant to be used in conjunction with a white cane or guide dog. These devices do not provide a constant stream of feedback versatile enough for true navigation in any situation.
eSight Eyewear is a device made for the legally blind; it brings two high-definition screens closer to the eyes so that users can see faraway objects clearly. However, this device assumes the user still has some vision and would not work for completely blind individuals or individuals with severe vision loss.
The WearWorks Wayband is useful for navigating along a set path but does not warn of obstacles that would not appear on a navigational map. A common issue among most current solutions is the lack of versatile, live feedback that is functionally useful for navigating unknown environments.
Comparing and evaluating the final design against established standards ensures that the product is reliable, safe, and usable. The proposed device is expected to fall into the category of a medical device with a software component. The relevant standards against which our device will need to be assessed are IEC 62366-1, IEC 62304, and ISO 14971. These standards come from the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). IEC 62366-1 addresses the human factors of the device and assesses its usability over a long period of time. IEC 62304 deals with the software life cycle process. ISO 14971 details risk management for medical devices.
Current gold-standard aids such as white canes and guide dogs cannot help the user with elevated objects. The visually impaired may greatly benefit from devices that increase their awareness of objects in all directions and at all elevations. Many of the existing technologies discussed above require significant computing power and add a cumbersome piece of equipment to the user’s everyday life. Given the disadvantages of existing visual mobility aids, it is important to create a device that either fully replaces and improves upon their functionality or covers their weaknesses. Visually impaired individuals already use a great deal of information, primarily audio and physical cues, simply to walk through an environment, so a device must provide the necessary information without overloading their senses. Acoustic layers and audio mapping are useful for creating mental spatial representations in those who developed visual impairments later in life, whereas haptic feedback can be an effective guide for anyone who is blind, whether the blindness is congenital or acquired. As these technologies mature, devices that provide full mobility to visually impaired individuals at any level of vision loss become increasingly feasible.
References
[1] R.A. Bourne, et al. Magnitude, temporal trends, and projections of the global prevalence of blindness and distance and near vision impairment: a systematic review and meta-analysis. The Lancet Global Health. 2017. https://www.thelancet.com/journals/langlo/article/PIIS2214-109X(17)30293-0/fulltext
[2] D. Howarth. Headset creates “3D soundscape” to help blind people navigate cities. Dezeen. 2014. https://www.dezeen.com/2014/11/06/future-cities-catapult-microsoft-guide-dogs-3d-headset-soundscape-to-help-blind-people/
[3] S. Schmidt, et al. Spatial Representations in Blind People: The Role of Strategies and Mobility Skills. Acta Psychologica. 2012. https://www.sciencedirect.com/science/article/pii/S0001691812001928
[4] R. Passini, G. Proulx. Wayfinding without Vision: An Experiment with Congenitally Totally Blind People. SAGE Journals. 1988. https://journals.sagepub.com/doi/abs/10.1177/0013916588202006
[5] E. Ricciardi, D. Bonino, L. Sani, T. Vecchi, M. Guazzelli, J.V. Haxby, L. Fadiga, P. Pietrini. Do We Really Need Vision? How Blind People “See” the Actions of Others. Journal of Neuroscience. 2009. https://www.jneurosci.org/content/29/31/9719.short
[6] S. Kane, J. Wobbrock, R. Ladner. Usable Gestures for Blind People: Understanding Preference and Performance. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 2011. https://dl.acm.org/doi/abs/10.1145/1978942.1979001
[7] C. Frauenberger, M. Noisternig. 3D Audio Interfaces for the Blind. International Conference on Auditory Display. 2003. https://www.researchgate.net/publication/215639387_3D_Audio_Interfaces_for_the_Blind
[8] J.M. Coughlan, B. Biggs, M. Riviere, H. Shen. An Audio-Based 3D Spatial Guidance AR System for Blind Users. Computers Helping People with Special Needs. 2020; 12376:475-484. doi: 10.1007/978-3-030-58796-3_55
[9] P. Konstantinos, K. Panagiotis, K. Eleni, M. Marina, V. Asimis, E. Valari. Audio-Haptic Map: An Orientation and Mobility Aid for Individuals with Blindness. Procedia Computer Science. 2015; 67:223-230. doi: 10.1016/j.procs.2015.09.266
[10] WearWorks. Wayband. https://www.wear.works/
[11] J.B.F. van Erp, L.C.M. Kroon, T. Mioch, K.I. Paul. Obstacle Detection Display for Visually Impaired: Coding of Direction, Distance, and Height on a Vibrotactile Waist Band. Frontiers in ICT, Volume 4, 2017. doi: 10.3389/fict.2017.00023
[12] R. Muthalagu, A. Bolimera, V. Kalaichelvi. Lane detection technique based on perspective transformation and histogram analysis for self-driving cars. Computers & Electrical Engineering, Volume 85, 2020. doi: 10.1016/j.compeleceng.2020.106653
[13] IRCAM. Spat. Ircam Forum. https://forum.ircam.fr/projects/detail/spat/
[14] MathWorks. What Is SLAM? https://www.mathworks.com/discovery/slam.html
[15] eSight. eSight 4. https://esighteyewear.com/low-vision-device-for-visually-impaired/
[16] OrCam. OrCam MyEye. https://www.orcam.com/en/myeye2/
[17] Microsoft. Seeing AI. https://www.microsoft.com/en-us/ai/seeing-ai