> Doable with a single 1080ti and a couple of hundred midi files?
A 1080ti would probably take something like several days to a week, depending on how big the model is; probably not a big deal. However, a few hundred MIDI files would be pushing it in terms of sample size. If you look at experiments like my GPT-2-small finetuning to make it generate poetry instead of random English text ( https://www.gwern.net/GPT-2 ), it really works best once you get into at least the megabyte range of text. Similarly with StyleGAN: if you want to retrain my anime face StyleGAN on a specific character ( https://www.gwern.net/Faces#transfer-learning ), you want at least a few hundred faces. Below that, you need architectures designed specifically for transfer learning/few-shot learning, which are built to work in the low _n_ regime. (They exist, but StyleGAN and GPT-2 are not them.)