Stacco: Exploring the Embodied Perception of Latent Representations in Neural Synthesis

Nicola Privato, Victor Shepardson, Giacomo Lepri, and Thor Magnusson

Proceedings of the International Conference on New Interfaces for Musical Expression

Abstract:

The application of neural audio synthesis methods for sound generation has grown significantly in recent years. Among such systems, streaming autoencoders such as RAVE are particularly suitable for instrument design, as they map audio to and from control signals in an abstract latent space with acceptable latency. Despite the uptake of autoencoders in NIME design, little research has been done to characterize the latent spaces of audio models, and to investigate their affordances in practical musical scenarios. In this paper we present Stacco, an instrument specifically designed for the intuitive control of neural audio synthesis latent parameters through the displacement of magnetic objects on a wooden board with four magnetic attractors. We then examine models trained on the same data with different seeds, explore strategies for more consistent mappings from audio to latent space, and propose a method for stitching the latent space of one model to another. Finally, in a user study, we investigate whether and how these techniques are perceived through embodied practice with Stacco.

Citation:

Nicola Privato, Victor Shepardson, Giacomo Lepri, and Thor Magnusson. 2024. Stacco: Exploring the Embodied Perception of Latent Representations in Neural Synthesis. Proceedings of the International Conference on New Interfaces for Musical Expression. DOI: 10.5281/zenodo.13904899

BibTeX Entry:

@inproceedings{nime2024_62,
 abstract = {The application of neural audio synthesis methods for sound generation has grown significantly in recent years. Among such systems, streaming autoencoders such as RAVE are particularly suitable for instrument design, as they map audio to and from control signals in an abstract latent space with acceptable latency. Despite the uptake of autoencoders in NIME design, little research has been done to characterize the latent spaces of audio models, and to investigate their affordances in practical musical scenarios. In this paper we present Stacco, an instrument specifically designed for the intuitive control of neural audio synthesis latent parameters through the displacement of magnetic objects on a wooden board with four magnetic attractors. We then examine models trained on the same data with different seeds, explore strategies for more consistent mappings from audio to latent space, and propose a method for stitching the latent space of one model to another. Finally, in a user study, we investigate whether and how these techniques are perceived through embodied practice with Stacco.},
 address = {Utrecht, Netherlands},
 articleno = {62},
 author = {Nicola Privato and Victor Shepardson and Giacomo Lepri and Thor Magnusson},
 booktitle = {Proceedings of the International Conference on New Interfaces for Musical Expression},
 doi = {10.5281/zenodo.13904899},
 editor = {S M Astrid Bin and Courtney N. Reed},
 issn = {2220-4806},
 month = {September},
 numpages = {8},
 pages = {424--431},
 presentation-video = {https://youtu.be/AJpuVTn_tPM?si=0qIAkXQZjIdl6r_o},
 title = {Stacco: Exploring the Embodied Perception of Latent Representations in Neural Synthesis},
 track = {Papers},
 url = {http://nime.org/proceedings/2024/nime2024_62.pdf},
 year = {2024}
}