Thoughts? It seems more focused on the control of DSP processes than on novel approaches to raw DSP. Google-funded hype, an interesting avenue for exploration, or both?
Today, we’re pleased to introduce the Differentiable Digital Signal Processing (DDSP) library. DDSP lets you combine the interpretable structure of classical DSP elements (such as filters, oscillators, reverberation, etc.) with the expressivity of deep learning.
Neural networks (such as WaveNet or GANSynth) are often black boxes. They can adapt to different datasets but often overfit details of the dataset and are difficult to interpret. Interpretable models (such as musical grammars) use known structure, so they are easier to understand, but have trouble adapting to diverse datasets.
DSP (Digital Signal Processing, without the extra “differentiable” D) is one of the backbones of modern society, integral to telecommunications, transportation, audio, and many medical technologies. You could fill many books with DSP knowledge, but here are some fun introductions to audio signals, oscillators, and filters, if this is new to you.
The key idea is to use simple interpretable DSP elements to create complex realistic signals by precisely controlling their many parameters. For example, a collection of linear filters and sinusoidal oscillators (DSP elements) can create the sound of a realistic violin if the frequencies and responses are tuned in just the right way. However, it is difficult to dynamically control all of these parameters by hand, which is why synthesizers with simple controls often sound unnatural and “synthetic”.
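To make the idea concrete, here is a minimal sketch of classical additive synthesis: a bank of sinusoidal oscillators, each with its own frequency and amplitude, summed into one signal. The function name and the 1/n "string-like" harmonic recipe are illustrative assumptions, not part of the DDSP library:

```python
import numpy as np

SAMPLE_RATE = 16000

def sinusoidal_bank(frequencies, amplitudes, duration_s):
    """Sum a bank of sinusoidal oscillators (classic DSP elements) into one signal."""
    t = np.arange(int(SAMPLE_RATE * duration_s)) / SAMPLE_RATE
    signal = np.zeros_like(t)
    for f, a in zip(frequencies, amplitudes):
        signal += a * np.sin(2 * np.pi * f * t)
    return signal

# A crude string-like tone: harmonics of 220 Hz with 1/n amplitude decay.
harmonics = [220.0 * n for n in range(1, 9)]
amps = [1.0 / n for n in range(1, 9)]
tone = sinusoidal_bank(harmonics, amps, duration_s=0.5)
```

Even this toy example has sixteen parameters to tune; a realistic, time-varying instrument model has thousands, which is exactly why hand-tuning breaks down.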
With DDSP, we use a neural network to convert a user’s input into complex DSP controls that can produce more realistic signals. This input could be any form of control signal, including features extracted from audio itself. Since the DDSP units are differentiable (thus the extra D), we can then train the neural network to adapt to a dataset through standard backpropagation.
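As a toy illustration of why differentiability matters: because a sinusoidal oscillator is a differentiable function of its parameters, gradients of an audio loss can flow back through it to those parameters. The sketch below uses JAX autodiff and plain gradient descent on an oscillator's amplitude as a stand-in; the actual DDSP library is built on TensorFlow and trains a neural network to emit the controls rather than optimizing them directly:

```python
import jax
import jax.numpy as jnp

SAMPLE_RATE = 16000
t = jnp.arange(800) / SAMPLE_RATE  # 50 ms of samples

def oscillator(freq, amp):
    # A differentiable DSP element: a single sinusoidal oscillator.
    return amp * jnp.sin(2 * jnp.pi * freq * t)

# Pretend this is recorded audio we want the oscillator to match.
target = oscillator(440.0, 0.8)

def loss(amp):
    # Mean squared error between the oscillator's output and the target.
    pred = oscillator(440.0, amp)
    return jnp.mean((pred - target) ** 2)

grad_fn = jax.grad(loss)  # gradient flows through the DSP element

amp = 0.5  # initial guess at the amplitude control
for _ in range(50):
    amp = amp - 0.5 * grad_fn(amp)  # standard gradient descent step
```

After a few dozen steps the estimated amplitude converges to the target value of 0.8. In DDSP proper, the same gradient signal is used to train the network that generates these controls, and perceptually motivated spectral losses replace the raw waveform error used here.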