Cross-modal Sound Mapping Using Deep Learning
Ohad Fried and Rebecca Fiebrink
Proceedings of the International Conference on New Interfaces for Musical Expression
- Year: 2013
- Location: Daejeon, Republic of Korea
- Pages: 531–534
- Keywords: Deep learning, feature learning, mapping, gestural control
- DOI: 10.5281/zenodo.1178528
- PDF: http://www.nime.org/proceedings/2013/nime2013_111.pdf
Abstract:
We present a method for automatic feature extraction and cross-modal mapping using deep learning. Our system uses stacked autoencoders to learn a layered feature representation of the data. Feature vectors from two (or more) different domains are mapped to each other, effectively creating a cross-modal mapping. Our system can either run fully unsupervised, or it can use high-level labeling to fine-tune the mapping according to a user's needs. We show several applications for our method, mapping sound to or from images or gestures. We evaluate system performance both in standalone inference tasks and in cross-modal mappings.
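Illustrative sketch (not from the paper): the abstract describes pre-training one stacked autoencoder per modality and then mapping between the learned feature spaces. The minimal PyTorch sketch below shows that general pattern; the layer sizes, toy data, linear latent-space mapper, and the use of PyTorch are all assumptions for illustration, not the authors' implementation.

```python
# Sketch of cross-modal mapping via stacked autoencoders (illustrative only;
# dimensions, data, and training details are assumptions, not the paper's setup).
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Two-layer (stacked) autoencoder for one modality."""
    def __init__(self, in_dim, hid_dim, code_dim):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hid_dim), nn.Sigmoid(),
            nn.Linear(hid_dim, code_dim), nn.Sigmoid(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, hid_dim), nn.Sigmoid(),
            nn.Linear(hid_dim, in_dim),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def train_reconstruction(model, data, epochs=50, lr=1e-3):
    """Unsupervised pre-training: minimize reconstruction error."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(data), data)
        loss.backward()
        opt.step()

# Toy random tensors standing in for paired observations from two modalities
# (e.g. gesture features and audio features); real feature data is assumed.
gestures = torch.rand(256, 30)   # hypothetical 30-D gesture features
sounds = torch.rand(256, 40)     # hypothetical 40-D audio features

ae_gesture = Autoencoder(30, 20, 8)
ae_sound = Autoencoder(40, 20, 8)
train_reconstruction(ae_gesture, gestures)
train_reconstruction(ae_sound, sounds)

# Learn a mapping between the two learned feature (code) spaces.
mapper = nn.Linear(8, 8)
opt = torch.optim.Adam(mapper.parameters(), lr=1e-3)
for _ in range(50):
    opt.zero_grad()
    code_g = ae_gesture.encoder(gestures).detach()
    code_s = ae_sound.encoder(sounds).detach()
    loss = nn.functional.mse_loss(mapper(code_g), code_s)
    loss.backward()
    opt.step()

# Inference: map a gesture through its encoder, across the latent mapping,
# and out through the sound decoder to obtain predicted audio features.
with torch.no_grad():
    predicted_sound = ae_sound.decoder(mapper(ae_gesture.encoder(gestures[:1])))
```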
Citation:
Ohad Fried and Rebecca Fiebrink. 2013. Cross-modal Sound Mapping Using Deep Learning. Proceedings of the International Conference on New Interfaces for Musical Expression. DOI: 10.5281/zenodo.1178528
BibTeX Entry:
@inproceedings{Fried2013,
  abstract = {We present a method for automatic feature extraction and cross-modal mapping using deep learning. Our system uses stacked autoencoders to learn a layered feature representation of the data. Feature vectors from two (or more) different domains are mapped to each other, effectively creating a cross-modal mapping. Our system can either run fully unsupervised, or it can use high-level labeling to fine-tune the mapping according to a user's needs. We show several applications for our method, mapping sound to or from images or gestures. We evaluate system performance both in standalone inference tasks and in cross-modal mappings.},
  address = {Daejeon, Republic of Korea},
  author = {Ohad Fried and Rebecca Fiebrink},
  booktitle = {Proceedings of the International Conference on New Interfaces for Musical Expression},
  doi = {10.5281/zenodo.1178528},
  issn = {2220-4806},
  keywords = {Deep learning, feature learning, mapping, gestural control},
  month = {May},
  pages = {531--534},
  publisher = {Graduate School of Culture Technology, KAIST},
  title = {Cross-modal Sound Mapping Using Deep Learning},
  url = {http://www.nime.org/proceedings/2013/nime2013_111.pdf},
  year = {2013}
}