It’s an experience we’ve all had: whether catching up with a friend over dinner at a restaurant, meeting an interesting person at a cocktail party, or conducting a meeting amid office commotion, we find ourselves having to shout over background chatter and general noise. The human ear and brain are not especially good at picking out individual sound sources in a noisy environment so as to focus on a particular conversation. This ability deteriorates further with general hearing loss, which is becoming more prevalent as people live longer, and can lead to social isolation.
However, a team of researchers from the University of Washington, Microsoft, and AssemblyAI has just shown that AI can outdo humans at isolating sound sources to create a zone of silence. This sound bubble allows people within a radius of up to 2 meters to converse with greatly reduced interference from other speakers or noise outside the zone.
The group, led by University of Washington professor Shyam Gollakota, aims to combine AI with hardware to augment human capabilities. This is different, Gollakota says, from working with enormous computational resources such as those ChatGPT employs; rather, the challenge is to create useful AI applications within the limits of hardware constraints, particularly for mobile or wearable use. Gollakota has long thought that what has been called the “cocktail party problem” is a widespread issue where this approach could be feasible and beneficial.
Currently, commercially available noise-cancelling headsets suppress background noise but do not account for the distances to sound sources or for effects such as reverberation in enclosed spaces. Previous studies, however, have shown that neural networks separate sound sources better than conventional signal processing does. Building on this finding, Gollakota’s group designed an integrated hardware-AI “hearable” system that analyzes audio data to identify which sound sources lie inside and which outside a bubble of designated size. The system then suppresses extraneous sounds in real time, so there is no perceptible lag between what users hear and what they see while watching a person speak.
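The team’s actual pipeline isn’t reproduced here, but the core idea of the final step is simple to sketch: given separated sources and a per-source distance estimate, keep the sources inside the bubble and silence the rest. Everything below, including the function name and the precomputed distance values, is a hypothetical illustration, not the researchers’ code; in the real system the separation and the distance estimates come from neural networks.

```python
import numpy as np

BUBBLE_RADIUS_M = 1.5  # programmable: 1 m, 1.5 m, or 2 m

def apply_sound_bubble(source_signals, source_distances, radius=BUBBLE_RADIUS_M):
    """Hypothetical post-separation step: mix only the sources whose
    estimated distance lies inside the bubble; attenuate the rest to zero.

    source_signals   : list of 1-D numpy arrays, one per separated source
    source_distances : per-source distance estimates in meters
    """
    inside = [s for s, d in zip(source_signals, source_distances) if d <= radius]
    if not inside:                        # nothing inside the bubble -> silence
        return np.zeros_like(source_signals[0])
    return np.sum(inside, axis=0)         # pass-through mix of in-bubble sound

# Toy usage: two "speakers", one at 1.2 m (inside) and one at 3.0 m (outside)
fs = 16_000
t = np.arange(fs) / fs
near = 0.5 * np.sin(2 * np.pi * 220 * t)   # stand-in for in-bubble speech
far  = 0.5 * np.sin(2 * np.pi * 440 * t)   # stand-in for out-of-bubble noise
out = apply_sound_bubble([near, far], [1.2, 3.0])
assert np.allclose(out, near)              # only the near source survives
```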
The audio part of the system is a commercial noise-cancelling headset with up to six microphones that pick up both nearby and more distant sounds, providing data for neural-network analysis. Custom-built networks estimate the distances to sound sources and determine which of them lie inside a programmable bubble radius of 1 m, 1.5 m, or 2 m. These networks were trained on both simulated and real-world data, recorded in 22 rooms of varied sizes and sound-absorbing qualities with different combinations of human subjects. The algorithm runs on a small embedded computer, either an Orange Pi or a Raspberry Pi, and sends processed audio back to the headphones within milliseconds, fast enough to keep hearing and vision in sync.
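Keeping hearing and vision in sync means processing audio frame by frame within a tight latency budget: each multichannel frame must be filtered before the next one arrives. The skeleton below illustrates that constraint only; the frame size, sample rate, and the stand-in model are assumptions for the sketch, not the authors’ implementation.

```python
import time
import numpy as np

FS = 16_000          # assumed sample rate (Hz); the device's actual rate may differ
HOP = 128            # 8 ms of audio per frame at 16 kHz
N_MICS = 6           # the headset carries up to six microphones
BUDGET_MS = 8.0      # per-frame compute budget: must finish before the next frame

def process_frame(model, mic_frame):
    """One step of the streaming loop: a (hypothetical) model maps a
    multichannel frame of shape (N_MICS, HOP) to a single-channel frame
    containing only in-bubble sound, and we check it met the deadline."""
    t0 = time.perf_counter()
    out = model(mic_frame)                       # neural bubble filter (assumed API)
    elapsed_ms = (time.perf_counter() - t0) * 1e3
    if elapsed_ms > BUDGET_MS:
        # A real embedded system would log or degrade gracefully here;
        # overrunning the hop duration lets audio fall behind the video.
        print(f"frame overran budget: {elapsed_ms:.1f} ms")
    return out

# Toy model and driver loop standing in for the Orange Pi / Raspberry Pi code
toy_model = lambda frame: frame.mean(axis=0)     # placeholder: average the mics
for _ in range(10):
    mics = np.random.randn(N_MICS, HOP).astype(np.float32)  # fake mic capture
    _ = process_frame(toy_model, mics)
```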
The post “Noise-cancelling headsets use AI to make zones of silence” by Sidney Perkowitz was published on 12/03/2024 by spectrum.ieee.org