Max Frenzel, PhD
2 min read · Apr 27, 2019

Hi Benedikt!

Thanks a lot for reading, and leaving the detailed question. Happy to hear that you’ve been playing around with this yourself!

One quick first comment: it looks like you didn’t train your network for very long. From what you said above, you trained your model for 2920 steps. I don’t know the details of your training data and other parameters, but you might want to try training it a bit longer.

Also, if your training data was biased towards mostly drum sounds (like mine was), I’m not surprised that your network got particularly good at drums and struggled with everything else. In general, though, I’d guess drums are easier to model than other sounds (like bass), since a sustained sound requires the network to capture structure over a much longer time range.
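To get a rough feel for what "time range" means here, you can estimate how much audio a stack of dilated causal convolutions can see at once. The sketch below is only an illustration: the dilation pattern, kernel size, and sample rate are assumed values, not the settings from my experiments or from your setup.

```python
def receptive_field(dilations, kernel_size=2):
    """Number of input samples visible to one output sample of a
    stack of causal dilated convolutions (one layer per dilation)."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Assumed configuration: dilations 1, 2, 4, ..., 512, repeated 3 times.
dilations = [2 ** i for i in range(10)] * 3
sample_rate = 16000  # assumed sample rate in Hz

rf_samples = receptive_field(dilations)
rf_ms = 1000 * rf_samples / sample_rate

print(f"receptive field: {rf_samples} samples, {rf_ms:.1f} ms")
```

With these assumed numbers the network sees roughly 0.19 seconds of audio at a time, which is comfortable for a short drum hit but short compared to a long sustained note.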

The global conditioning could be interesting to use for this purpose. The original Wavenet implementation you linked to allows for “categorical” conditioning. You basically give it an additional one-hot vector which tells it what category of sound it should produce. In the original example they used it to specify which speaker the network should model, but for your purpose you could use it to specify which instrument to model. E.g. let’s assume you want to cover three instruments: drum, bass, piano. Every time you want a drum sound you give it the vector [1, 0, 0], for bass [0, 1, 0], and for piano [0, 0, 1]. That way it should have an easier time learning the distinction during training, and during generation you have more control over exactly what sound it should generate.
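As a minimal sketch of the three-instrument example, this is how the conditioning vectors could be built. The `INSTRUMENTS` list and the `one_hot` helper are made-up names for illustration and are not part of the Wavenet implementation you linked to.

```python
import numpy as np

# Hypothetical category list; the order fixes which index means what.
INSTRUMENTS = ["drum", "bass", "piano"]


def one_hot(instrument):
    """Return a one-hot vector marking the given instrument category."""
    vec = np.zeros(len(INSTRUMENTS), dtype=np.float32)
    vec[INSTRUMENTS.index(instrument)] = 1.0
    return vec


for name in INSTRUMENTS:
    print(name, one_hot(name))
```

During training you would pair each audio clip with the vector for its instrument, and during generation you pass the vector for whichever instrument you want to hear.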

I slightly modified this so that I could give it not only one-hot category vectors, but also more complex feature/embedding vectors generated by another network. But that might go a bit beyond what you are trying to do at the moment.
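To illustrate why swapping a one-hot vector for a dense vector is a small change, here is a NumPy sketch of a gated unit with global conditioning in the style of the WaveNet paper, where the conditioning vector is linearly projected and added at every timestep. This is not my actual code: the weights are random instead of learned, and all names, shapes, and the example embedding values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)


def gated_unit(filter_in, gate_in, h, V_f, V_g):
    """Gated activation with a global conditioning vector.

    filter_in, gate_in: (time, channels) outputs of the dilated convolutions.
    h: (cond_dim,) conditioning vector, shared by all timesteps.
    V_f, V_g: (cond_dim, channels) projections (learned in a real model).
    """
    # h @ V_* has shape (channels,) and broadcasts across the time axis.
    f = np.tanh(filter_in + h @ V_f)
    g = 1.0 / (1.0 + np.exp(-(gate_in + h @ V_g)))
    return f * g


time_steps, channels, cond_dim = 8, 4, 3
filter_in = rng.standard_normal((time_steps, channels))
gate_in = rng.standard_normal((time_steps, channels))
V_f = rng.standard_normal((cond_dim, channels))
V_g = rng.standard_normal((cond_dim, channels))

one_hot_bass = np.array([0.0, 1.0, 0.0])   # categorical conditioning
embedding = np.array([0.2, 0.7, 0.1])      # hypothetical feature vector

out_category = gated_unit(filter_in, gate_in, one_hot_bass, V_f, V_g)
out_embedding = gated_unit(filter_in, gate_in, embedding, V_f, V_g)
```

With a one-hot input the projection simply picks out one row of the matrix, while a dense vector mixes several rows, so the rest of the network does not need to change.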

Hope this helps! Good luck! :)

