I do not disagree with anything you wrote. That said, it is an entirely different line of reasoning from your first argument.
That said, LLMs today cannot do this in a meaningful way. If an author cannot write a better book than ChatGPT, then that author would not be able to live off their writing anyway. And the authors who use ChatGPT to write a book, but still put in the effort to refine it, will not be able to do this at scale. You also need someone to lay out the plot and its twists and turns if it is to be a full-length book.
Let's assume that in 10 years, LLMs are at the point where you cannot distinguish between a well-written book by an author and one generated entirely by an AI. Suppose we have two authors: one long dead, whose works are in the public domain, and a young one who is just starting out. An LLM trained on the dead author's works can generate books just like the originals. But what if the young author writes in a similar style: is it now legal or illegal to generate the same content? It's impossible to know whether the young author's work has been used for training.
My take is that LLMs are pretty stupid. They cannot come up with new and novel things. So if a writer writes something that is genuinely different (i.e. new and novel), how do we protect that? We cannot prevent it from being used for training, so the next logical step is to protect it the same way we protect technology with patents. But that comes with its own class of problems: if two people independently write in the same style, only one of them can hold the right to it. That is not the solution either.
I do not have the answer, but I am certain that trying to ban LLMs, or dictating what they may do and how, is not the answer. Perhaps the authors who can write in a new and novel way, and who know how to use AI, will proliferate because they embrace it.
> Let's assume that in 10 years, LLMs are at the point where you cannot distinguish between a well-written book by an author and one generated entirely by an AI. Suppose we have two authors: one long dead, whose works are in the public domain, and a young one who is just starting out. An LLM trained on the dead author's works can generate books just like the originals. But what if the young author writes in a similar style: is it now legal or illegal to generate the same content? It's impossible to know whether the young author's work has been used for training.
This is central to my reasoning. You could go the same route as with software patents, but that is not preferable in any way.
---
>> I am certain that trying to ban LLMs, or dictating what they may do and how, is not the answer.
> I wouldn't ban LLMs because of copyright issues, though I would let authors choose whether their IP can be used for training or not.
> Why not? Just say that using a work for training is considered creating a derivative work, and that's it. Now copyright owners just have to update their license to allow training if they want to, and that's solved. Of course, Big Tech makes less money in that scenario.
Big Tech can train on everything that is "legal", while malicious actors can fine-tune on a specific author's works and then generate books. You will not be able to detect that, and the malicious actor can claim to have written the books themselves. Then we are back to the starting point.
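To illustrate how low the barrier is: with open-weights models and off-the-shelf tooling, the whole fine-tuning pipeline fits in a few lines. This is only a rough sketch; "gpt2" stands in for any open-weights base model, and `author_corpus.txt` is a hypothetical plain-text dump of one author's books, not a real dataset.

```python
# Rough sketch of style fine-tuning with Hugging Face tooling.
# "gpt2" is a stand-in base model; "author_corpus.txt" is a
# hypothetical plain-text dump of one author's books.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Load the author's texts and tokenize them into fixed-length chunks.
dataset = load_dataset("text", data_files={"train": "author_corpus.txt"})["train"]
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

# Standard causal-LM objective: predict the next token of the author's prose.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="style_clone", num_train_epochs=3),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The point is not this particular snippet, but that nothing in the resulting model or its output flags which texts went into the training set.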
> We cannot prevent it from being used for training, so the next logical step is to protect it the same way we protect technology with patents.
Why not? Just say that using a work for training is considered creating a derivative work, and that's it. Now copyright owners just have to update their license to allow training if they want to, and that's solved. Of course, Big Tech makes less money in that scenario.
> I am certain that trying to ban LLMs, or dictating what they may do and how, is not the answer.
I wouldn't ban LLMs because of copyright issues, though I would let authors choose whether their IP can be used for training or not.
However, copyright is only one issue with LLMs. All the black-hat use cases are a whole other category of issues. And I am of the opinion that technology is not neutral: IMO, it is perfectly fine for a society to ban a technology if it believes that, on balance, it does more harm than good.