Cornell University researchers have developed a silent-speech recognition interface, named EchoSpeech, that uses acoustic sensing and artificial intelligence to continuously recognize up to 31 unvocalized commands based on lip and mouth movements. The wearable, low-power interface needs only a few minutes of user training data before it recognizes commands, and it can run on a smartphone. The technology, described in the paper “EchoSpeech: Continuous Silent Speech Recognition on Minimally-Obstructive Eyewear Powered by Acoustic Sensing” by lead author Ruidong Zhang, will be presented at the upcoming Association for Computing Machinery’s Conference on Human Factors in Computing Systems (CHI) in Hamburg, Germany.
According to Zhang, with further development the silent-speech technology could be an excellent input for voice synthesizers for people who cannot vocalize sound. In its current form, EchoSpeech can be used to communicate with others via smartphone in places where speech is inconvenient or inappropriate, such as a noisy restaurant or a quiet library. It can also be paired with a stylus and used with design software, reducing the need for a keyboard and mouse.
Fitted with a microphone and a pair of speakers smaller than a pencil eraser, the EchoSpeech glasses become a wearable AI-powered sonar system that sends and receives sound waves across the face and senses mouth movements. A deep learning algorithm analyzes the resulting echo profiles in real time and recognizes commands with approximately 95% accuracy.
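As a rough illustration of what an "echo profile" can look like, the sketch below emits a short probe signal, simulates a few reflections, and cross-correlates the received audio with the probe so that each peak marks the delay of one reflection path. This is a generic sonar-style sketch and not the EchoSpeech pipeline: the sample rate, chirp frequencies, delays, gains, and all function names are assumptions chosen for illustration, and the deep learning stage is omitted.

```python
import math

SAMPLE_RATE = 48_000  # assumed audio sample rate in Hz (illustrative)


def make_chirp(n_samples, f_start, f_end, sample_rate):
    """Build a linear frequency sweep to use as the transmitted probe."""
    duration = n_samples / sample_rate
    chirp = []
    for i in range(n_samples):
        t = i / sample_rate
        # Phase of a linear chirp sweeping from f_start to f_end.
        phase = 2 * math.pi * (
            f_start * t + (f_end - f_start) * t * t / (2 * duration)
        )
        chirp.append(math.sin(phase))
    return chirp


def simulate_received(transmitted, echoes, total_len):
    """Sum delayed, attenuated copies of the probe.

    `echoes` is a list of (delay_in_samples, gain) pairs standing in for
    hypothetical reflection paths across the face.
    """
    received = [0.0] * total_len
    for delay, gain in echoes:
        for k, sample in enumerate(transmitted):
            received[delay + k] += gain * sample
    return received


def echo_profile(transmitted, received):
    """Cross-correlate received audio with the probe.

    The index of each output value is a candidate delay in samples.
    """
    n_lags = len(received) - len(transmitted) + 1
    profile = []
    for lag in range(n_lags):
        acc = 0.0
        for k, sample in enumerate(transmitted):
            acc += sample * received[lag + k]
        profile.append(acc)
    return profile


# Assumed probe: 256 samples sweeping 16 kHz to 20 kHz.
probe = make_chirp(256, 16_000, 20_000, SAMPLE_RATE)
# Two made-up reflection paths: a strong one at 40 samples, a weak one at 90.
received = simulate_received(probe, [(40, 1.0), (90, 0.4)], 512)
profile = echo_profile(probe, received)

# The strongest correlation peak should sit at the delay of the stronger path.
strongest_delay = max(range(len(profile)), key=lambda i: abs(profile[i]))
print("strongest echo delay (samples):", strongest_delay)
```

In a system of this kind, a sequence of such profiles captured over time would change as the mouth moves, and that changing pattern is what a recognition model would be trained on.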
Cheng Zhang, assistant professor of informatics and director of Cornell’s Smart Computer Interfaces for Future Interactions (SciFi) Lab, explained that the technology brings sonar onto the body, and that the team is very excited about the system because it advances the field on both performance and privacy. EchoSpeech is small, low-power, and privacy-sensitive, all of which are important characteristics for deploying new wearable technologies in the real world.