Making Virtual Instruments using Granular Synthesis and Machine Learning, Feedback Wanted!

DSP, Plugin and Host development discussion.

Hey everyone,
I'm finishing up a master's thesis in computer science. My project produces new virtual instruments by splitting audio into 20 ms chunks called grains, analyzing the grains with signal analysis, clustering them, and recombining them into new samples. I am looking for people to listen to some of the audio I've created and provide some honest feedback. Please note that your responses will be used anonymously as part of my thesis report.

If you are interested, please visit the following link. Your help would be greatly appreciated!
https://docs.google.com/forms/d/1Y_ypHUAhifQVfCth7ltudFBGO_kgqNAopeQ0ENQE5JU/viewform

If you're interested in the code, the Python scripts used to produce the audio clips can be found here: https://github.com/neobonzi/ConcatenativeSynthesisThesis

As for the method:

The framework takes a mono MP3 file as input. It divides the file into grains, whose length is set by an input parameter (usually 20 to 100 ms in practice), and creates an entry for each grain in a MongoDB database. Signal analysis is then performed on each grain, using several features I hypothesized would be good indicators of timbre: MFCCs, RMS energy, zero-crossing rate (ZCR), and spectral features such as centroid and kurtosis. I also compared the energy in the fundamental to the energy in the harmonics. All of the analysis results are saved with the respective grains in the database.
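To make the grain and feature step more concrete, here is a minimal sketch in plain Python. It is not the code from the repository: the function names (`split_into_grains`, `rms_energy`, `zero_crossing_rate`, `analyze_grains`) are hypothetical, it assumes the audio has already been decoded to a list of float samples, it only covers two of the features (RMS energy and ZCR), and it returns plain dicts instead of writing to a database.

```python
import math


def split_into_grains(samples, sample_rate, grain_ms=20):
    """Cut a mono signal into consecutive, non-overlapping grains.

    A trailing chunk shorter than one full grain is dropped
    (an assumption of this sketch).
    """
    grain_len = int(sample_rate * grain_ms / 1000)
    if grain_len <= 0:
        raise ValueError("grain length must be at least one sample")
    last_start = len(samples) - grain_len
    return [samples[i:i + grain_len] for i in range(0, last_start + 1, grain_len)]


def rms_energy(grain):
    """Root-mean-square amplitude of one grain."""
    return math.sqrt(sum(x * x for x in grain) / len(grain))


def zero_crossing_rate(grain):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(grain) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(grain, grain[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(grain) - 1)


def analyze_grains(samples, sample_rate, grain_ms=20):
    """Return one feature record per grain (a stand-in for a database entry)."""
    grains = split_into_grains(samples, sample_rate, grain_ms)
    return [
        {"index": i, "rms": rms_energy(g), "zcr": zero_crossing_rate(g)}
        for i, g in enumerate(grains)
    ]
```

Each record could then be stored alongside its grain, and the numeric fields form the feature vector for that grain.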

From there, K-means clustering is performed on the feature vectors that represent the grains to form m groups, where m is a user-supplied parameter. The grains in each group are concatenated with a 50% crossfade to produce audio patches.
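The clustering and concatenation step could look roughly like the sketch below. Again, this is an illustration under assumptions rather than the thesis code: `kmeans` is a bare-bones Lloyd's algorithm with random initial centroids and a fixed iteration count, `concat_with_crossfade` uses a linear fade, and both names are hypothetical. I am also assuming "50% crossfade" means each grain overlaps the previous one by half its length.

```python
import random


def kmeans(vectors, m, iterations=20, seed=0):
    """Assign each feature vector to one of m clusters; returns a label per vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, m)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(m),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(m):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def concat_with_crossfade(grains, overlap=0.5):
    """Join grains end to end, linearly crossfading over the overlapping part."""
    if not grains:
        return []
    out = list(grains[0])
    for grain in grains[1:]:
        n = int(min(len(out), len(grain)) * overlap)
        start = len(out) - n
        for i in range(n):
            fade_in = (i + 1) / (n + 1)
            # Previous audio fades out while the new grain fades in.
            out[start + i] = out[start + i] * (1 - fade_in) + grain[i] * fade_in
        out.extend(grain[n:])
    return out
```

To build one patch, you would collect the grains whose label equals a given cluster index and pass them to `concat_with_crossfade`. A linear fade is the simplest choice; an equal-power fade may sound smoother on uncorrelated material.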
