Machine learning improves human speech recognition – ScienceDaily


Hearing loss is a rapidly growing area of scientific research as the number of baby boomers dealing with hearing loss continues to increase as they age.

To understand how hearing loss affects people, researchers study people’s ability to recognize speech. It is more difficult for humans to recognize human speech when there is reverberation, some hearing impairment, or significant background noise such as traffic noise or multiple speakers.

As a result, hearing aid algorithms are widely used to improve human speech recognition. To evaluate such algorithms, researchers conduct experiments that determine the signal-to-noise ratio at which a certain proportion of words (usually 50%) is recognized. However, these tests are time-consuming and costly.
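To make the 50% criterion concrete, here is a minimal sketch of how such a threshold could be read off a set of measurements. It assumes word scores measured at a handful of fixed signal-to-noise ratios and uses simple linear interpolation; the function name `estimate_srt` and the data are made up for illustration and are not taken from the study, which may determine thresholds differently.

```python
def estimate_srt(snrs_db, word_scores, target=0.5):
    """Return the SNR (in dB) at which the word score crosses `target`.

    Hypothetical helper: interpolates linearly between the two neighbouring
    measurement points that bracket the target score. Returns None if the
    measured scores never cross the target.
    """
    points = sorted(zip(snrs_db, word_scores))
    for (snr_lo, score_lo), (snr_hi, score_hi) in zip(points, points[1:]):
        if score_lo <= target <= score_hi and score_hi != score_lo:
            fraction = (target - score_lo) / (score_hi - score_lo)
            return snr_lo + fraction * (snr_hi - snr_lo)
    return None


# Made-up illustrative data: fraction of words recognized at each SNR.
snrs = [-12, -9, -6, -3, 0]
scores = [0.05, 0.20, 0.60, 0.85, 0.95]
print(estimate_srt(snrs, scores))
```

With these invented numbers the 50% point falls between -9 dB and -6 dB. Many such measurements are needed per listener and per noise type, which is part of why the tests are so time-consuming.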

In The Journal of the Acoustical Society of America, published by the Acoustical Society of America via AIP Publishing, researchers from Germany explore a human speech recognition model based on machine learning and deep neural networks.

“What’s new about our model is that it provides good predictions for noise types with very different complexity for the hearing-impaired and shows both low errors and high correlations with the measurement data,” says author Jana Roßbach from the Carl von Ossietzky University.

The researchers used automatic speech recognition (ASR) to calculate how many words a listener understood per sentence. Most people know ASR from voice recognition tools like Alexa and Siri.
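As a rough illustration of what counting words per sentence can look like, the sketch below compares a recognizer's transcript with the sentence that was actually played. The function `words_recognized` and the example sentences are hypothetical, and the order-insensitive matching is an assumption; the article does not describe the scoring procedure used in the study.

```python
from collections import Counter


def words_recognized(reference, transcript):
    """Count how many words of `reference` also appear in `transcript`.

    Hypothetical scoring rule: case-insensitive, ignores word order, and
    counts each reference word at most as often as it was recognized.
    """
    ref_counts = Counter(reference.lower().split())
    hyp_counts = Counter(transcript.lower().split())
    # Counter intersection keeps the minimum count of each shared word.
    return sum((ref_counts & hyp_counts).values())


# Made-up example: one of five words was misrecognized.
played = "Peter buys three red flowers"
heard = "peter buys two red flowers"
print(words_recognized(played, heard), "of", len(played.split()), "words")
```

Dividing the count by the sentence length gives a per-sentence word score, which can then be averaged across sentences at a given noise level.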

The study involved eight normal-hearing and 20 hearing-impaired listeners who were exposed to a variety of complex noises that mask speech. The hearing-impaired listeners were divided into three groups with different levels of age-related hearing loss.

The model allowed the researchers to predict the speech recognition performance of hearing-impaired listeners with varying degrees of hearing loss for a variety of noise maskers with increasing complexity in temporal modulation and increasing similarity to real speech. Each listener's individual hearing loss could be taken into account.

“We were very surprised that the predictions worked well for all noise types. We expected the model to struggle when using a single competing speaker. However, that was not the case,” said Roßbach.

The model made predictions for hearing in one ear. In the future, the researchers will develop a binaural model, since understanding speech is influenced by hearing with two ears.

In addition to predicting speech intelligibility, the model could potentially also be used to predict listening effort or speech quality, as these topics are very closely related.

Story Source:

Materials provided by American Institute of Physics. Note: Content may be edited for style and length.

