Did you hear that?
Brain Bites offers alumni and friends a window into how our brains perceive sound
 
Department Head Michale Fee introduces Professor Josh McDermott (front row, far left) to talk about his cognitive science research. Photo: Devan Munroe
In October, Brain and Cognitive Sciences hosted a series of Brain Bites receptions in Los Angeles and Palo Alto, California, drawing a diverse crowd of more than 100 alumni and friends.
Department Head Michale Fee, the Glen V. and Phyllis F. Dorflinger Professor of Neuroscience and an investigator in the McGovern Institute for Brain Research, joined Professor Josh McDermott PhD ’06 for a fireside chat about McDermott’s research on how people and machines perceive sound, work that sits at the intersection of psychology, neuroscience, and engineering.
McDermott and his colleagues in the Laboratory for Computational Audition have developed a computer model for human sound perception that he hopes will contribute to engineering solutions for hearing impairment or loss.
“We now have a model that can actually localize sounds in the real world,” McDermott said of his research, recently published in Nature Human Behaviour. “And when we treated the model like a human experimental participant and simulated this large set of experiments that people had tested humans on in the past, what we found over and over again is the model recapitulates the results that you see in humans.”
Scientists have long sought to build computer models that can perform the same kind of calculations that the brain uses to localize sounds. These models sometimes work well in idealized settings with no background noise, but never in real-world environments, with their noises and echoes.
To develop a more sophisticated model of localization, the MIT team turned to convolutional neural networks. This kind of computer modeling has been used extensively to model the human visual system, and more recently, McDermott and other scientists have begun applying it to audition as well.
Convolutional neural networks can be designed with many different architectures, so to help them find the ones that would work best for localization, the MIT team used a supercomputer that allowed them to train and test about 1,500 different models. That search identified 10 that seemed the best suited for localization, which the researchers further trained and used for all of their subsequent studies.
In addition to analyzing the difference in arrival time at the right and left ears, the human brain also bases its location judgments on differences in the intensity of sound that reaches each ear. Previous studies have shown that the success of both of these strategies varies depending on the frequency of the incoming sound. In the new study, the MIT team found that the models showed this same pattern of sensitivity to frequency.
“The model seems to use timing and level differences between the two ears in the same way that people do, in a way that’s frequency-dependent,” McDermott said.
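For readers who want to see the two cues in concrete terms, the short Python sketch below estimates a timing difference and a level difference from a toy two-ear signal. It is only an illustration under simplifying assumptions, not the MIT team's neural network model: the sample rate, the noise-burst stimulus, and every function name are hypothetical, and the sketch ignores the frequency dependence described above.

```python
import math
import random

SAMPLE_RATE = 44_100  # samples per second (assumed for this illustration)


def make_binaural_noise(delay_samples, right_gain, n=2000, seed=0):
    """Toy two-ear signal: the right ear hears a delayed, quieter copy of the left."""
    rng = random.Random(seed)
    left = [rng.gauss(0.0, 1.0) for _ in range(n)]
    right = [0.0] * delay_samples + [right_gain * x for x in left[: n - delay_samples]]
    return left, right


def estimate_itd(left, right, max_lag=40):
    """Timing difference in seconds, found by brute-force cross-correlation.

    A positive value means the sound reached the left ear first.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Only sum over indices where both left[i] and right[i + lag] exist.
        start, stop = max(0, -lag), min(n, n - lag)
        score = sum(left[i] * right[i + lag] for i in range(start, stop))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / SAMPLE_RATE


def rms(signal):
    """Root-mean-square amplitude of a signal."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def estimate_ild(left, right):
    """Level difference in decibels; positive means the left ear is louder."""
    return 20 * math.log10(rms(left) / rms(right))


if __name__ == "__main__":
    left, right = make_binaural_noise(delay_samples=20, right_gain=0.5)
    print(f"timing difference: {estimate_itd(left, right) * 1000:.3f} ms")
    print(f"level difference:  {estimate_ild(left, right):.2f} dB")
```

In this toy setup, the right ear's copy is delayed by 20 samples (about 0.45 milliseconds at the assumed sample rate) and halved in amplitude (roughly 6 decibels quieter), so both cues point to a source on the listener's left.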
The researchers also showed that when they made localization tasks more difficult, by adding multiple sound sources played at the same time, the computer models’ performance declined in a way that closely mimicked human failure patterns under the same circumstances.
The research was funded by the National Science Foundation and the National Institute on Deafness and Other Communication Disorders.
Devan Monroe | Department of Brain and Cognitive Sciences
Additional reporting on the published research provided by Anne Trafton, MIT News.