
I don't see how the authors are dishonest about it, or that the rebuttal refutes any of the actual claims made. They make it clear in the paper that they're specifically evaluating people who aren't especially interested in poetry, and talk at length about how and why this is different from other approaches. I suppose the clickbait title gives a bad first impression.

To summarize the facts: when people were asked to tell whether a given poem was written by a human or an AI, they judged the AI poems to be human more often than the human poems. People also tended to rate poems higher when they thought they were human-made than when they thought they were created by AI. It's speculated in the paper that this is because the AI poems tended to be more accessible and direct than the human poems selected, and that non-experts' preference for this style, combined with the perception that AI poetry is poor, led to the results.




The selection of human poets is cooked to give the result they wanted. I will grant the authors may have lied to themselves. But I don't think honest scientists would have ever constructed a study like this. It is comparing human avant-garde jazz to AI dance music and concluding that "AI music" is more danceable than "human music", without including human dance music! It's just infuriating.


They expressly state the result is likely because the AI poetry was more simple and direct than the poetry selected, which is more accessible for the average person not interested in poetry. They compare and contrast this with other studies where this was not the case.

Yes, it's comparing apples and oranges; that's the whole point. It doesn't make the experiment itself flawed.


It seems to me that the whole study was intended to manufacture a result to grab headlines. Scientific clickbait. It doesn't matter how transparent they are, because that is mostly there to cover their asses.


Hmm, but shouldn't it have compared against human poems that go for a similar style? Otherwise it doesn't tell us much, except that the AI was not able to make more complex poems, and maybe that people who don't like poetry prefer simpler poems when asked.


> I don't see how the authors are dishonest about it, or that the rebuttal refutes any of the actual claims made.

Suppose I prepared a glass of lemonade and purchased a bottle of vintage wine—one held in high regard by connoisseurs. Then I found a set of random bystanders and surveyed them on which drink they preferred, and they generally preferred the lemonade. Do you see how it would be dishonest of me to treat this like a serious evaluation of my skill, either as a lemonade maker or a wine maker?


Sure - but it could still be pretty relevant if we want to ask about the future of beverage making and consumption, especially if new technology enables everybody to mass-produce lemonade (and similar sugary beverages) at home at minimal cost.

I'm quite sympathetic to poetry - I actually wrote a blog post about this article last week https://gallant.dev/posts/whither-poetry/

But much like the "debate" between linguistic prescriptivism ("'beg the question' doesn't mean 'raise the question'") and descriptivism ("language is how it is used"), both perspectives have relevance, and neither is really a response to the other.

I certainly hope people keep writing great, human, poetry. But generative ML is a systemic change to creative output in general. Poetry just happens to be in some ways simplest for the LLMs, but other art is tokens and patterns as well.


Personally, I think this would be a sin. To call something art which has no depth. We have too many things that are shallow. I think this has been detrimental to us as a society. That we're so caught up with the next thing that our leisure is anything but. What is the point of this all if not to make life more enjoyable? How can we enjoy life if we cannot have a moment to appreciate it? If we treat time off as if it is a chore that we try to get done as fast as possible? If we cannot have time to contemplate it? A world without friction is dull. It's as if we envy the machines. Perhaps we should make the world less tiring, so we have the energy to be human.


Human art most definitely can be reduced to tokens, since that's also essentially how we compress and transmit it.

Now, whether a statistical token generator makes "real art" is subjective (as human art already is). And again, I'm actually quite sympathetic to the "humans are special" perspective.

But the point of my comment is that this philosophical stance is not a practical reply to what will actually happen in terms of social dynamics and content creation/consumption. Whether we call it "real art" or not, generative tools exist and will be used. So, it makes sense to understand them, even if your goal for doing so is to mitigate their incursions into "real art."

In other words, art must adapt. Which, it always does.


> Suppose I prepared a glass of lemonade and purchased a bottle of vintage wine—one held in high regard by connoisseurs.

I would tell you that there have been results about the blind testing of wines held in high regard by connoisseurs that might make you not want to choose that for a comparison.


The blind tasting studies prove that connoisseurs can't discern the price of wine by taste. They can tell whether or not they like it perfectly well. A good bottle, not an expensive one.


Does this not undermine the premise of the original metaphor (i.e. that a glass of lemonade is somehow inferior to a fine wine)? Seems like a lot of goalpost shifting.


A glass of lemonade is not inferior. It's a different thing entirely. You can compare good lemonade to bad lemonade, or good wine to bad wine, but asking a group of people who prefer lemonade to compare a glass of lemonade with a glass of wine tells you nothing about the quality of the lemonade or wine in question.

The human poets in the test wrote high-brow poetry, the AI generated low-brow poetry, and the audience of laypeople who were surveyed preferred the low-brow poetry. There's nothing wrong with a straightforward rhyme scheme or anything—it's not bad to be low-brow—but it's not a useful comparison.


I believe you missed the OP's point. Poetry is to be processed. That's a feature, not a bug. Now that we're in an analytical conversation you need to process both papers and OP's words. Like poetry, there is context, things between the lines. Because to write everything explicitly would require a book's worth of text. LLMs are amazing compression machines, but they pale in comparison to what we do. If you take a step back, you'll even see this comment is littered with that compression itself.



