While this may not be perfect composition, it is a surreal (and almost sad) moment in my life to hear passable music created by a computer of its own volition. I work at a company that does a lot of machine learning, so I generally understand its limitations and have never been an alarmist. That being said, I had always thought of it as being applied to automate work. For some reason I had considered that which we normally attribute to human creativity to be off-limits. Sure, it's not great now, but in 10 years will it be able to compose better music than Chopin? In 10 years, will music created by a computer surprise and delight me more than music composed by humans?
Fear not, my friend, because the execution is only a part of why music and art as a whole carries meaning for us.
Consider two tracks that are identical (forget copyright for a minute). Between the one an AI generated and the one a human composed, I would personally grant the human-composed version more credit and enjoy it more. The story of how art is created and the stories of the artist are as substantial to appreciating art as a stroke of a brush or a note on a page. Computers will never replicate this until the singularity.
First of all, I disagree with this premise, as the overwhelming majority of music either (1) doesn't have a particular story behind it at all or (2) becomes popular, and then people learn the story behind it.
Even accepting the premise, what happens when the next artist with a great story is simply using MuseNet to write their emotional pieces and passing it off as human? They'll be functionally the same, yet it still feels like something was lost.
What makes you think that a computer that can generate music cannot also generate a background for the "creator" of that music? It can give you stories that touch our hearts, even more so than we can imagine.
Why generate a fake story? Why not communicate the real and moving journey of how a single note in the training data travelled through hundreds of neurons and thousands of matrices, and eventually made it past the final activation function to become a feature in the output tensor?
Given two stories, both identical, where one story is real - the real story will always be more meaningful because it has actually happened within the constraints of our reality, granting it validity and us the ability to relate to it.
Now, consider two stories, both identical, where one story is "real" and the other story is from a simulated universe. Now I'd say that both stories are of possibly equivalent value, since both have happened.
I kinda thank you for your comment, it gives me more...hope? :) (I'm an artist)
It's the same as when we see another human do something extraordinary that we thought humans couldn't do (based mostly on ourselves). For example, an artist who can draw something so life-like, or a sculptor who shapes hard marble into soft, flowing clothes with just a hammer and chisel.
Human potential always intrigues us, in a "What? A human can do THAT?" kind of way.
Yes, machines and AI can do the same thing in a fraction of the time, and from a practicality standpoint that matters, but it's not and never will be the same: it's empty. It's just a lifeless product, and we never feel related to it.
I think that when placed against each other in this hypothetical, yes, one would naturally side with the human (even if just out of solidarity). But what about when humans (who already claim the fame for songs written by other humans) claim the fame for songs written by computers, which have no legal recourse? Would we just assume that music claimed by a human originated with a human?
It doesn't sound passable to me - it sounds boring, it's a hack around a text-parsing architecture ffs, trying to make it understand multi-dimensional and multi-timbral data... There are models that do better, and can "surprise and delight" you in different ways than a human would. Think about DeepDream and the whole experience of trying to spot all the weird doggie parts that the computer manages to sneak in those pictures - I don't think that a human could paint a DeepDream-dog picture nearly as effortlessly and perfectly as a computer can! I would definitely describe that as successful art, as far as it goes. But that doesn't mean that DeepDream "solves" visual art as a task!
Music created solely by machines will probably remain derivative and simplistic for a long time. I expect the biggest result of this research in the near term is that we'll be able to create tools that lower the skill and time required to create good music, kind of like an audio version of templates/autocomplete/spellcheck.
There are a lot of futurists and singularity types that take it personally when people disagree with their assessment. It's no big deal, but the open minded, progressive thing to do would be to have a debate and save the downvotes for trolls.
It's probably only a matter of time before we have a GauGAN-like interface for synthetic music creation, so you could say 'I want a sad song with a soft intro and a buildup of tension here, with lyrics covering these emotions and things, which lasts 7 minutes'.
ML/DL is coming for a lot of the grunt work. It's coming for us as programmers as well. It's probably a few years away, though.
Given how easy it is to train a Transformer on any sequence data, and given how plentiful open source code is, I'd say "CodeNet" is probably less than a year away. OpenAI will probably do it first given they already have the setup.
I've been training on Stack Overflow and the model has already learned the syntaxes and common coding conventions of a bunch of different languages all on its own. Excited to see what else it's able to do as I keep experimenting.
Some sample outputs (you'll probably want to browse to some of the "Random" questions because by default it's showing "answers" right now and I haven't trained that model as long as some of the older question-generation ones): https://stackroboflow.com
I've tried it as well and got good syntactic results. For more sensical programs, I think we will need more layers & attn heads. Perhaps someone will fork gpt-2 and add the sparse transformer to it.
That CodeNet would be the SkyNet, essentially. What's shown here looks impressive, but it's the same good old text generator that can produce something that looks very similar to the dataset used to train it. It can't go beyond the dataset and generate something new. From the mathematical point of view, that generator interpolates samples from the dataset and generates a new sample.
To give an idea of how big the gap between MuseNet and CodeNet is, consider the simple problem of reversing a sequence: [1,2,3,4,5] should become [5,4,3,2,1], and so on. How many samples do you need to look at to understand how to reverse an arbitrary sequence of numbers? Do you need to retrain your brain to reverse a sequence of pictures? No, because instead of memorizing the given samples, you looked at a few and built a mental model of "reversing a sequence of things". Now, state-of-the-art ML models can reverse sequences as long as they use the same numbers as the dataset, i.e. we can train them to reverse any sequence of 1..5 or 1..50 numbers, but once we add 6 to the input, the model instantly fails, no matter how complex and fancy it is. I don't even dare to add a letter to the input. Reason? 6 isn't in the samples it's learnt to interpolate. And CodeNet is supposed to generate a C++ program that would reverse any sequence, btw.
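The failure mode described above can be sketched with a toy "model" that does nothing but memorize its training pairs. This is a deliberately simplified stand-in for a trained seq2seq network (not an actual one), just to contrast memorization against the symbolic rule a human forms:

```python
# Toy contrast: a "model" that only memorizes (interpolates) its training
# pairs, versus a symbolic rule that generalizes to unseen tokens.
# This is an illustrative stand-in, not a real neural network.

def train_memorizer(pairs):
    """'Train' by storing every (input, output) pair verbatim."""
    table = {tuple(x): list(y) for x, y in pairs}
    def model(seq):
        return table.get(tuple(seq))  # None for anything it hasn't seen
    return model

# Training data uses only the tokens 1..5.
training_pairs = [([1, 2, 3], [3, 2, 1]),
                  ([4, 5], [5, 4]),
                  ([2, 4, 1, 5], [5, 1, 4, 2])]

memorizer = train_memorizer(training_pairs)

print(memorizer([1, 2, 3]))   # [3, 2, 1] -- seen in training
print(memorizer([1, 2, 6]))   # None -- token 6 never appeared

# The human "mental model" of reversal handles any tokens at all:
def reverse(seq):
    return list(reversed(seq))

print(reverse([1, 2, 6]))     # [6, 2, 1]
print(reverse(["a", "b"]))    # ['b', 'a']
```

Real models fail less abruptly than a lookup table, of course, but the point stands: they degrade sharply outside the training distribution, while the rule-based version doesn't care what the tokens are.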
At the moment, ML is kinda stuck at this picture-interpolation stage. For AI, we don't need to interpolate samples; we need to build a "mental model" of what these samples are, and as far as I know, we have no clue how to even approach this problem.
Yeah, I know what you are saying... But let's just let somebody try this experiment (and somebody eventually will), and we can judge what can or cannot be learned by the results.
We will definitely get a great code autocompleter at the very least.
Can you explain? I'm not an expert on ML by any stretch of the imagination, but you'd think with the sort of stringent logical coherence required to construct useful programs, it'd be a pretty subpar use case. Or do you mean smaller-scope tools to aid programming, like linters and autocompleters?
I wonder if you could find a representation for computer programs that eliminated all of the degrees of freedom that were syntax errors, leaving only valid programs. In a sense that's what an AST is, but you can still have invalid ASTs. I bet it would be a lot easier to generate interesting programs in a representation like that.
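One cheap way to get that property, assuming "valid" means syntactically well-formed, is to generate programs as trees from a small grammar, so every sample is legal by construction. A minimal sketch with arithmetic expressions:

```python
# Sketch: generate programs from a tiny grammar so that every output is
# well-formed by construction -- there is no way to emit a syntax error.
# Arithmetic expressions stand in for "programs" here.
import random

def gen_expr(depth=0, max_depth=3):
    """Randomly generate an arithmetic expression that always parses."""
    if depth >= max_depth or random.random() < 0.3:
        return str(random.randint(0, 9))        # leaf: an integer literal
    op = random.choice(["+", "-", "*"])         # internal node: an operator
    left = gen_expr(depth + 1, max_depth)
    right = gen_expr(depth + 1, max_depth)
    return f"({left} {op} {right})"

random.seed(0)
for _ in range(3):
    src = gen_expr()
    # Every generated string is a valid expression, so eval never hits a
    # SyntaxError (fine here because the grammar only emits digits and ops).
    print(src, "=", eval(src))
```

The same idea scales up: restrict the generator (or the model's output space) to the productions of the language's grammar, and the "syntax error" degree of freedom disappears entirely, leaving only the hard part of semantic validity.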
There is Cartesian genetic programming, and there are some Lisp-like models that encode a program as a tree where all combinations are valid.
Combined with recent work on convolutional graph DNNs, this might be a good approach.
It's not passable...it's pretty obviously algorithmic.
The program does not have volition.
Why would you think that using statistics to generate a model of a piece of art (which is just data in the case of MIDI and pixels) would be "off-limits"? People have been doing this for decades.
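For a concrete example of the decades-old statistical approach: treat a melody as data (here, MIDI pitch numbers) and fit a first-order Markov chain, then sample new notes from the learned transition statistics. The melody below is made up for illustration:

```python
# Minimal sketch: a melody is just data (MIDI pitch numbers), and a
# first-order Markov chain fit to it generates "new" music statistically.
# The training melody here is an invented toy example.
import random
from collections import defaultdict

melody = [60, 62, 64, 62, 60, 64, 65, 64, 62, 60]  # toy training "corpus"

# Count pitch-to-pitch transitions observed in the melody.
transitions = defaultdict(list)
for a, b in zip(melody, melody[1:]):
    transitions[a].append(b)

def generate(start, length, rng):
    """Sample a new melody by walking the transition table."""
    out = [start]
    for _ in range(length - 1):
        choices = transitions.get(out[-1])
        if not choices:              # dead end: restart from the first note
            choices = [melody[0]]
        out.append(rng.choice(choices))
    return out

rng = random.Random(42)
print(generate(60, 8, rng))
```

MuseNet is vastly more sophisticated, but it sits on the same conceptual foundation: learn statistics from existing pieces, then sample.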
No one knows the answer to your last two questions, but there is no indication that this program is leading there.