I have been experimenting with AI and so-called timbre transfer and have written a research article about it that will be published in early 2024. Until then, a small excerpt, sounds and materials from the forthcoming article are available here.
The Roughness of Neural Networks – Jimi Hendrix, Holly Herndon, GPT-3, Timbre Transfer and the Promising Failure Aesthetics of Musical AIs
(Playlist on the right)
Audio 1a: My voice as a trumpet (using Magenta Tone Transfer):
Audio 1b: the original voice recording
Audio 2: My voice through Holly+. My voice was saying: “This method is especially interesting for music and art because it promises to be able to produce something new and surprising, in other words, to show a kind of machine creativity.”
Different Sounds with the Timbre/Tone of Hendrix-Style-Guitar
The original guitar was played by myself (training data, Audio 8) and used as the timbre for the following inputs
Audio 3: using the original voice above (Audio 1b)
Audio 4: prerecorded acoustic piano as input
Audio 5: field recording/soundscape of birds in the woods as input
Guitar as Input (flute vs Hendrix guitar timbre)
Audio 6: guitar as input
Audio 7: same guitar as in Audio 6 as input, existing preset of a flute as output
Audio 8: Training data for the Hendrix-like guitar timbre: a short selection of about 13 minutes that was used for training the timbre of the VST instrument
Promptism and Holly+
With Holly+, the artist has made a tool available that enables users to sing with her voice. As is the case with other AI tools, new sounds can be created directly in the prompt. This so-called “promptism” (Hayward 2022), as artists have already named this phenomenon, is easily accessible but comes with other challenges: “[S]ome of my AI images took 30+ tries to match closely to my end goal. I had a clear picture of what I wanted in my head, but it’s a matter of articulating that in a way that’s friendly to the machine.” (Hayward 2022) In the case of Holly+, all that needs to be done is to upload an audio file, which can then be downloaded again in the sound colour of Herndon’s voice. The results of this process reveal the roughness described by Herndon (Audio 2). Even if the process is not yet live for every user, it is already possible (Herndon 2022). If it actually becomes more common to sing and produce music through other voices, this naturally raises questions about identity, authorship, exploitation or possible forms of cultural appropriation, even if these questions are less the focus of my contribution. Deciding how to deal with the diverse identities, artistic freedoms and ethical and legal issues in each individual case will be at least as challenging as it is in the cultural practice of sampling. The musical spectrum has expanded.
Jimi Hendrix and Tone Transfer
After experimenting with my voice or guitar as input for Holly+ or Magenta’s Tone Transfer, I wondered if, instead of mimicking guitar riffs to learn how Hendrix played, it would be possible to preserve his timbre. Of course, ChatGPT could provide a convincing answer to the question of what the guitar of Jimi Hendrix sounds like, and the specialized literature (Clague 2014; Trampert 1998; Waksman 2010: 166–206) as well as the original recordings provide reliable, but also more varied and complex answers, which are, however, beyond the scope of this experimental study. Therefore, I first decided to use equipment similar to what Hendrix used in most of his recordings. Then I created training data for machine learning by recording my own guitar playing on a Fender Stratocaster using a Marshall amp and various effects devices such as wah-wah and fuzz distortion. In doing so, I tried to imitate some significant playing techniques of Hendrix and to replay song sequences. The training data generated could then be used within the environment of a so-called “Colab notebook” provided for the Tone Transfer project, which contains the Python code for the machine learning model including step-by-step instructions (Carney et al. 2021).
After these technical hurdles, about 12 minutes of audio were used to train a neural network for 30,000 steps, which took about 3 hours. The resulting files could finally be played in the music software environment of Ableton Live with the help of the Tone Transfer VST plug-in. At this point, it is necessary to experiment again, this time with the input, which produces different and fascinating results in interaction with the timbre (Audio 3–7). Even after several attempts with different inputs of one’s own voice (Audio 3), a piano (Audio 4), a field recording of birds (Audio 5) or a guitar (Audio 6), the result does not seem to correspond particularly clearly to a Hendrix-like electric guitar, but unique references to the recorded training data (Audio 8) remain. The results clearly differ from the existing presets of other timbres (Audio 7) and produce a lot of aesthetic failures or “rasping of the neurons”. They also have an emotional effect that comes with having designed one’s own timbre and making it usable in other contexts, similar to how musicians emphasise the value of specially recorded or discovered sounds and music sequences when sampling or DJing. Otherwise, “spawning”, as Herndon calls the methods of timbre transfer, clearly differs from digital sampling due to the specific media-technical and cultural collaboration: “So, for example, with sampling, usually you copy and remix a recording by someone else to create something new. But with spawning, you can perform as someone else based on trained information about them.” (Herndon 2022, min. 3:21)
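To give a sense of the scale of the training run, the three figures reported here (about 12 minutes of audio, 30,000 steps, about 3 hours) can be turned into rough per-step numbers. The following is only a back-of-the-envelope sketch in Python: the function name `training_stats` and its parameters are made up for illustration and are not part of the Tone Transfer notebook, and the inputs are the approximate values from the text, not measurements.

```python
def training_stats(audio_minutes: float, steps: int, hours: float) -> dict:
    """Derive rough per-step figures from approximate training totals.

    Hypothetical helper for illustration only; not part of any library.
    """
    wall_seconds = hours * 3600        # total training time in seconds
    audio_seconds = audio_minutes * 60  # total length of the training audio
    return {
        # average wall-clock time spent on one training step
        "seconds_per_step": wall_seconds / steps,
        # how many training steps were run per second of recorded audio
        "steps_per_audio_second": steps / audio_seconds,
        # how much longer the training took than the audio itself lasts
        "training_to_audio_ratio": wall_seconds / audio_seconds,
    }


# Approximate values as reported in the text above.
stats = training_stats(audio_minutes=12, steps=30_000, hours=3)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions, one step took roughly a third of a second on average, and the training ran about fifteen times longer than the recorded guitar material itself lasts.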
Human Learning through Machine Learning …
In the age of timbre transfer and after the death of the author (Barthes 1967), two things can be said. Thanks to spawning as a form of machine learning, it will be possible in the future to speak with the voices of dead authors without sampling them directly and thus reproducing something they have already said. And as this is at least unrecognisable to the amateur, it bears numerous dangers and uncertainties. But it also bears many creative potentials that can be found especially at the edges of various AI music productions. As the previous experiment showed, the result is not the trained Hendrix sound, but an in-between that is tempting and more than the reproduced imitation of the original. Such approaches are appealing not least because we never listen exclusively with our ears but also with our other senses, our memories and depending on the most diverse contexts (Schulze 2018; Sterne 2003). In this way, the reference to Hendrix becomes culturally significant without being recognized as a sound event.
… to be continued …
Video Playlist with Resources mentioned in the Article
All Audio Examples in one Playlist
“A robot playing the guitar in the style of a spectrogram” (DALL·E 2022-12-19)
Barthes, R. (1967): “Der Tod des Autors,” Texte zur Theorie der Autorschaft: 185–197.
Carney, M./ Li, C./ Toh, E./ Zada, N./ Yu, P./ Engel, J. (2021): “Tone Transfer: In-Browser Interactive Neural Audio Synthesis,” Joint Proceedings of the ACM IUI 2021 Workshops, April 13-17, 2021, College Station USA.
Clague, M. (2014): “‘This Is America’: Jimi Hendrix’s Star Spangled Banner Journey as Psychedelic Citizenship,” Journal of the Society for American Music, 8(04), 435–478. https://doi.org/10.1017/S1752196314000364
Hayward, J. (2022): “The Growing Art Movement of ‘Promptism’,” Counter Arts. https://medium.com/counterarts/the-growing-art-movement-of-promptism-9ec956d82a61
Herndon, H. (2022): “Holly Herndon: What if you could sing in your favorite musician’s voice?,” TED Talk. https://www.ted.com/talks/holly_herndon_what_if_you_could_sing_in_your_favorite_musician_s_voice
Schulze, H. (2018): The Sonic Persona: An Anthropology of Sound, New York: Bloomsbury Academic.
Sterne, J. (2003): The Audible Past: Cultural Origins of Sound Reproduction, Durham: Duke University Press.
Stuart, C. (2003): “Damaged sound: Glitching and skipping compact discs in the audio of Yasunao Tone, Nicolas Collins and Oval,” Leonardo Music Journal, 13, 47–52.
Trampert, L. (1998): Elektrisch!: Jimi Hendrix—Der Musiker hinter dem Mythos, Augsburg: Sonnentanz.