Deep learning for decomposing sound into vector audio

This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and is available to be worked on. It will be supervised by Trevor Agus and Anil Madhavapeddy as part of my Interspatial OS project.

Summary

All that we hear is mediated through cues transmitted to the brain from the cochlea, which acts like a bank of auditory filters centred at a wide range of centre frequencies. Much of our knowledge of hearing comes from psychoacoustical experiments that use simple sounds, such as sine waves, whose synthesis parameters are closely related to the cues available beyond the cochlea. For recorded sounds, however, many more types of cue are present, and our ability to study them is limited by the extent to which they can be manipulated in a controlled fashion. [1] [2]
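To make the filterbank analogy concrete, the sketch below computes centre frequencies spaced evenly on the ERB-number scale (Glasberg & Moore), a common engineering stand-in for cochlear filter spacing. The frequency range and number of channels here are illustrative assumptions, not project requirements.

```python
import numpy as np

def erb_space(low_hz=50.0, high_hz=8000.0, n_channels=32):
    """Centre frequencies evenly spaced on the ERB-number scale
    (Glasberg & Moore), a rough model of cochlear filter spacing."""
    hz_to_erb = lambda f: 21.4 * np.log10(4.37e-3 * f + 1.0)
    erb_to_hz = lambda e: (10.0 ** (e / 21.4) - 1.0) / 4.37e-3
    erbs = np.linspace(hz_to_erb(low_hz), hz_to_erb(high_hz), n_channels)
    return erb_to_hz(erbs)

print(erb_space()[:4])  # a few of the lowest centre frequencies, in Hz
```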

The goal of this project is to apply deep learning tools to explore the extent to which recorded sounds, such as speech, music and noise, can be decomposed into components, such as modulated sine waves, that dominate independent regions of activity on the cochlea. The training data would come from combinations of basic sounds with known synthesis parameters and the corresponding output of a differentiable auditory filterbank, which has recently become available (Famularo et al. [3]). The ability to control perceptually relevant parameters of arbitrarily complex sounds would be a powerful tool in hearing research, and may have further applications in data compression and artificially generated sound.
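As a rough illustration of how such training data might be assembled, the sketch below mixes a few amplitude-modulated sine waves with known, randomly drawn synthesis parameters and passes the mixture through a conventional gammatone filterbank, standing in for the differentiable frontend of Famularo et al. [3]. The sample rate, parameter ranges, channel spacing and filter order are assumptions for illustration only, not details from the proposal.

```python
import numpy as np
from scipy.signal import fftconvolve

FS = 16_000  # sample rate in Hz; an assumption for this sketch

def gammatone_ir(fc, fs=FS, dur=0.05, order=4, b=1.019):
    """Impulse response of a 4th-order gammatone filter at centre
    frequency fc -- a standard model of a single cochlear channel."""
    t = np.arange(int(dur * fs)) / fs
    erb = 24.7 * (4.37e-3 * fc + 1.0)  # equivalent rectangular bandwidth (Hz)
    ir = t ** (order - 1) * np.exp(-2 * np.pi * b * erb * t) * np.cos(2 * np.pi * fc * t)
    return ir / np.max(np.abs(ir))

def make_training_example(n_components=3, dur=1.0, seed=0):
    """Return (mixture, parameters, cochleagram) for one synthetic example.
    Each component is an AM sine wave with known carrier frequency,
    modulation rate and modulation depth."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(dur * FS)) / FS
    mixture = np.zeros_like(t)
    params = []
    for _ in range(n_components):
        fc = rng.uniform(200.0, 4000.0)   # carrier frequency (Hz)
        fm = rng.uniform(2.0, 20.0)       # modulation rate (Hz)
        depth = rng.uniform(0.2, 1.0)     # modulation depth
        mixture += (1.0 + depth * np.sin(2 * np.pi * fm * t)) * np.sin(2 * np.pi * fc * t)
        params.append((fc, fm, depth))
    # Log-spaced centre frequencies as a simple stand-in for cochlear spacing.
    centres = np.geomspace(100.0, 6000.0, 32)
    cochleagram = np.stack([fftconvolve(mixture, gammatone_ir(fc), mode="same")
                            for fc in centres])
    return mixture, np.array(params), cochleagram
```

A decomposition network would then be trained to map the filterbank output (in the project itself, the output of the differentiable frontend) back to the known component parameters.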

  1. McDermott, J.H. and Simoncelli, E.P., Sound texture perception via statistics of the auditory periphery: evidence from sound synthesis. Neuron, 2011. 71(5): p. 926-940.

  2. Agus, T.R., et al., Fast recognition of musical sounds based on timbre. J. Acoust. Soc. Am., 2012. 131(5): p. 4124-4133.

  3. Famularo, R.L., et al., Biomimetic frontend for differentiable audio processing. Pre-print, 2024.

Related Ideas