
Ernest Davis has an outstanding and definitive rebuttal of these claims: https://cs.nyu.edu/~davise/papers/GPT-Poetry.pdf

The basic problem is that GPT generates easy poetry, and the authors were comparing to difficult human poets like Walt Whitman and Emily Dickinson, and rating using a bunch of people who don't particularly like poetry. If you actually do like poetry, comparing PlathGPT to Sylvia Plath is so offensive as to be misanthropic: the human Sylvia Plath is objectively better by any possible honest measure. I don't think the authors have any interest in being honest: ignorance is no excuse for something like this.

Given the audience, it would be much more honest if the authors compared with popular poets, like Robert Frost, whose work is sophisticated but approachable.




Huh, that sounds a little like claiming that AI can draw pictures just as well as humans because they look realistic at a first glance. But not if you check whether the text, repetitive elements, and partially-occluded objects in the background look correct.


The more basic problem is that their methodology would conclude Harry Potter is better than Ulysses, AC/DC is better than Carla Frey, etc etc. It is completely fine to enjoy "dumb" art - I like Marvel comics and a lot of the Disney-era Star Wars novels have been pretty fun. But using easiness and fun as a metric of quality is simply celebrating ignorance and laziness.


It's another case of the impedance mismatch between numbers / KPIs etc and what might be called "lived experience".


Why is AC/DC “dumber” than, I assume, a fantastic classic musician? I used to think that these artists are riff-riff, but they all turned out to be masters of their art and, ignoring the pop/rock/etc flavor and type, they may actually surpass the genius of a violin virtuoso. Quite a claim you’re making here, assuming that AC/DC popularity was due to “dumbness” overall and that it’s “fine”. It’s not fine, it’s the same step on the ladder. I find this facet of a deep-shallow distinction completely synthetic and (imo) coming from a limited technical-ish view on music. Vivaldi, AC/DC, Blackpink — it’s all an art of a genius level, just differently flavored.


I'm a bit confused about this AC/DC vs "Carla Frey" comparison. For starters, I can't find any musicians by the name of Carla Frey. There is however a free jazz pianist named Carla Bley. It feels like OP has maybe selected a niche artist that they're personally fond of. I had a listen and it's the sort of stuff you'd get in hotel lobbies. It's nice, and it probably means a lot to people who are really into jazz, but I suspect that few people will ever hear of this artist. Conversely, Angus Young of AC/DC is extremely musically gifted and I suspect that songs like Thunderstruck will have a large social impact for years to come. There's nothing dumb about their songs. Though I don't put much stock in Spotify listens, it has 1.6 billion vs Bley's most popular "Lawns" having around 2 million.

Sure, there are cases where great artists are not well known in their own time - from Mozart to Nick Drake - but recognition generally follows in the subsequent decades. If Carla Bley is who they intended to refer to, they've had over 50 years to become recognised.


Why is Ulysses better than Harry Potter?


Are you aware of studies of Ulysses and James Joyce in general? That's a good start for this discussion, assuming it's not just a kneejerk reaction.


Why is Harry Potter better than a lesser-known book like The Sugar Barons? Both are exciting, but one is based on real events and letters.

I think parent is basing it on high art principles which are: older better, uniqueness, what others agree is valuable and content/quality.

Why aren't we talking about the first English book published, The Recuyell of the Histories of Troye?


Apples are better than oranges. "An orange a day keeps the doctor away?" I don't think so. Newton's orange? Nope. Orange may be the surprisingly successful remains of a former telecom monopoly, but Apple is the most valuable company on the planet.


You are assuming that it is objectively true that "Harry Potter is better than Ulysses, AC/DC is better than Carla Frey". What is your basis for that? And what is your definition of "better"?


If anything, they are assuming the opposite.


My point stands


I get the point you're trying to make, and in some philosophical sense you're right. In reality though, people into a field are able to rank quality extremely well and it doesn't end up being the puzzle you're implying it is


It sounds different to me.

Poetry is for the enjoyment and enlightenment of the reader. Who reads poems? The question is a little vague. Who reads poems on a birthday card? Everyone. In a children's book? Parents and children.

But this study focuses on literary poems specifically, using works by literary heavyweights like Chaucer, Shakespeare, Dickinson, Whitman, Byron, Ginsberg, and so on. So the question is who reads literary poems, to which the answer is: academics, literary writers, and a vanishingly small population of readers who enjoy such pursuits.

So the test of whether ersatz AI poetry is as good as "real" poetry should be whether the target audience (i.e. academics) finds the poems to be enjoyable or enlightening. But this study does not test that hypothesis.

This study tests a different hypothesis: can lay people who are not generally interested in literary poetry distinguish between real and AI literary poetry?

The hypothesis and paper feel kind of like a potshot at literary poetry to me, or at least I don't understand why this particular question is interesting or worthy of scientific inquiry.


  > Poetry is for the enjoyment and enlightenment of the reader.
Is it?

  If I read a book and it makes my whole body so cold no fire can warm me, I know that is poetry. If I feel physically as if the top of my head were taken off, I know that is poetry. These are the only ways I know it. Is there any other way?
  - Emily Dickinson
Clearly Dickinson believes it is far more than what’s at the surface. It is more than the rhyme and rhythm. They add to the effect, but they are not necessary conditions. It’s generally agreed upon that art is defined by making you feel.

Lay people are often uninterested in poetry because they didn’t study it. There’s nothing wrong with that. Many are put off by elitists, garbage, and the difficulty to parse. But this is true for any subject matter, we see similarities in programming and in science. You see even many from there point to the surface and dismiss despite the importance being underneath. But it also doesn’t have to be for everyone. It’s not about being smart or dumb either, as most people happily get into the depths of the subjects they enjoy. Be it a child’s obsession with dinosaurs, a teen’s obsession in videogames, or an elite academic. We all have that capacity but there are always hurdles to entry and sometimes the point is to stumble. When it is, removing the hurdles harms the domain, even though it’s almost always done with the best of intentions. Eventually you gotta touch the stove top to learn it’s hot


To me, it sounds more like claiming that AI writes better code than Linus Torvalds because a group of random non-programmers preferred reading simple AI-generated Python over Linux kernel C code.


Not sure that I agree with this metaphor, as the utility of code is both subjective (readability) and objective (performance). There are objective ways in which C code simply can’t be matched by a Python implementation. If they were equivalent in terms of performance, I think use of C would crater.

With poetry, the “utility” is entirely the subjective experience of the reader. There is no objective component.


Or if you check that any pictures of people have the correct number of fingers, toes, arms, legs, or heads.


The vast majority of people seem to prefer Avengers 17 over any cinematic masterpiece, the latest drake song would be better rated than a Tchaikovsky... We should let them play and worship chat gpt if that's what they want to waste their time on


I don't understand the logic of calling superhero movies lesser/unserious like this, it's very snobby. Movies and music are made to be entertaining, the avengers is more entertaining than your "arthouse cinematic masterpiece that nobody likes but it's just because they aren't smart enough to understand it". It's also lazy and ignorant to ignore the sheer manpower that goes into making a movie like that.


I don’t fully agree with putting down “fun” movies like the Avengers, but at the same time “serious” art is not primarily for plain entertainment.

People might find “serious” art meaningful and it might spark feelings in them, but that’s not the same as getting an adrenaline rush from exploding cars in an action scene.

Of course there are also cases where the boundary between “fun” and “serious art” is not so clear, there are always exceptions to any attempt to define what makes something “serious art”. Art can also be subversive and run counter to traditional expectations of what art “should” be. But I don’t think the Avengers is an example of that.


Movies, music, writing, all human arts, are made to make their audience feel something. "Entertaining" is only a small and honestly ill-defined subset of this, no more valid than any other approach.


I don't think analyzing the black square or Ulysses, or Arnold Schönberg's works (just random examples, I could go on and on) in terms of if and what they make you feel is an ill fated course of action. It's also not what people actually do.

On the other hand a lot of other stuff can be broadly analyzed in terms of making people feel something. Painkillers, excuses, hugs, titles.

So it seems your generalization is neither necessary nor sufficient for "human arts".


> your "arthouse cinematic masterpiece that nobody likes

You're reading way too much into my comment. Any block buster from the 80/90s absolutely shits on 90% of block busters released today. I'm not talking about obscure 1950s czechoslovak cinema here...

> ignore the sheer manpower that goes into making a movie like that.

A lot of work doesn't make something good, especially when cgi quality actually gets worse year after year. FYI the entire LOTR trilogy had 30% less budget and 4x the runtime of the last avenger movie... And they actually filmed things outside of a Hollywood studio

The only lazy thing here are the scenarists and the directors shitting out the blandest movies ever. But then again if all we care about is raw entertainment then sure, it's perfect, very easy to digest, lots of colors and not too much to think about, the cinematic equivalent of fast food. You can even buy avengers branded toilet paper and bottle water, that really shows how much they care about movies!


Well said. There's tons of blockbusters and other popular movies from the 80s/90s that were absolutely made for the "masses", but were genuinely great films, and far better than almost any blockbuster from the last 5-10 years, especially all the comic-book stuff. Alien(s), Back to the Future trilogy, Terminator 1/2, Ghostbusters, Beetlejuice, I could go on and on. And of course the LotR trilogy if you look at the early 2000s. Movies just aren't as innovative or risky these days; something as quirky as Ghostbusters wouldn't be made now (but Hollywood is happy to make remakes and sequels of that franchise now, 40 years later).


Film is such a nascent art form. The 90s as “peak blockbuster action” is a valid stance on taste but hard to defend as superior to all that came after. Christopher Nolan’s Dark Knight is leagues away from the 90s Batman, as an auteur-friendly and obvious comparison. Pixar is another on the animation front.

There have been great films made in every era, but the trend towards tighter writing, more legible and compelling action, and emotionally impactful story telling is strongly trending upwards overall.

And nothing will ever top the merchandising mania of the 80s!


I hope you're referring to Joel Schumacher's kitschy drivel, and not to Tim Burton's masterpieces (both of which are IMO vastly superior to Nolan's take on the subject).


It would be nice if people actually stated why is x better than y rather than expecting everyone to hold the same opinion as them. Makes for better conversations.

I don't get why people have this narrow-minded view of literature/movies; you don't see it that much in culinary conversations.


In a culinary conversation nobody is trying to make the case that the chicken mcnugget is objectively superior to fresh pasta in a handmade pesto sauce. So you don't need to tell people "please just go away with this mcnugget nonsense". You wouldn't be expected to explain why one is better than the other. Most people that taste food understand immediately what you mean.


Ah yes, "I like things that are more entertaining" has provided so much value.


Random thought outburst, feel free to downvote:

This reminded me so much of Spaceballs! And the yogurt merchandise towards the end! Such a great movie that has so many obvious "flaws" like the mirror under the speeder on the desert planet when they comb the desert. And yet I've actually watched that movie more often than even the actual real Star Wars movies (meaning the first three made - all of which are timeless awesomeness)


For perspective, your comments could be released direct to VHS.


> Any block buster from the 80/90s absolutely shits on 90% of block busters released today

You sure it's not survivorship bias? As in, you are only thinking of and remembering the good ones over a two-decade period and comparing them against the movies that came out this year, when in reality there might be tons of blockbusters from those eras that were just as bad as your average one today.


what a ridiculous comparison. Of course a superhero movie is more entertaining than a film that is explicitly designed to avoid mass appeal.

The nuance is that movies today are not where the most creative talent is directed anymore. The shift started with prestige TV taking off in the 2000s, and episodic content on streaming services surpassing film as a mass-market art form in the 2010s, with the pandemic driving the final nail into the coffin.

I loved the late 2000s / early 2010s superhero movies. Spiderman, The Dark Knight, Iron Man, etc. These were great films. Today, the MCU is just eating its own tail with the most bland, repetitive crap. It's all designed to incentivize the same die-hard fans to keep forking over their hard-earned cash with all the cross-film teasers and the need to watch every film to understand all the references and moving parts. I understand the business model (it's actually the same as comic books now): people don't casually go in to see random movies anymore, they do that at home on Netflix, so the studios have to target the repeat viewers. It's visually impressive, and the acting is good enough to keep a relatively large subset of the population coming back, but for someone like me who wants at least a little bit of novelty or creativity in the plot or characters, it's just become so mind-bogglingly boring.


Sheer manpower doesn't make it good. You should have just made a point about entertainment, which it definitely does provide. A case can be made that this was the only thing they were going for. You would have had a good point. Implying that just because a lot of people did a thing together that means it has merit is kind of a strange thing to say. It's definitely not self-evident and you make no attempt to elaborate on it.


"Movies and music are made to be entertaining"

In your opinion, perhaps. Other films are made to be provocative-- to make you think or reflect. Certainly, a lot of the "arthouse cinematic masterpieces" aim for that as a goal rather than purely entertainment.

You're arguing against a strawman here... nobody is saying making an avengers movie is low effort. Certain aspects of an avengers movie though require less effort.


There is more to art than entertainment. For example Oedipus Rex [1] - distinctly not entertaining; but art, and powerful in an incomparable way, anyway.

_____________

[1] Don't look it up if you don't know what that is.


A greek play that you are for some reason naming in latin instead of just using english?

If you're referring to the italian film, the original title is in italian :D


I'm Greek. English. Italian. It's all Latin to me.


I still don't know if you refer to the original play or the italian film about it.


Yes you do:

Oedipus Rex, also known by its Greek title, Oedipus Tyrannus (Ancient Greek: Οἰδίπους Τύραννος, pronounced [oidípuːs týrannos]), or Oedipus the King, is an Athenian tragedy by Sophocles.

https://en.wikipedia.org/wiki/Oedipus_Rex

And I'm perfectly aware that you're trying, hard, to make some sarcastic point, but that's what everybody in the English-speaking world calls it. Or did you want to talk about it in Greek? I'm fine with that. After all, it was in Greek that I've watched it, as a child, here:

https://en.wikipedia.org/wiki/Ancient_Theatre_of_Epidaurus

Did you think that I'm just name-dropping a Greek tragedy to appear erudite and culturrred? Again: I'm Greek. I grew up with that stuff. They even teach us some of it in school (Antigone, for one).


This is like watching a pair of frogs not named Euripides nor Aeschylus farting bubbles in a pond.

Βάτραχοι ?

( Also not cultured )


No, that's like watching you and the other guy trying to troll me and that's not making you look as cool as you think.


That's clearly your opinion; I can't speak to the motivations of whom you refer to as "the other guy" but I have zero interest in either attempting to troll you or in looking cool.


Uhm. Sorry to bother you with something totally off-topic.

I have an almost lifelong itch, which I couldn't successfully scratch so far.

It's about the meaning of this surname: Κούβελας

Usually it is transcribed in English as Kouvelas, in German as Kouwelas, and in French it can be Couvelas, and AFAIK it is pronounced something like Koo-well-as(s) in Greek.

Does it have any 'speaking'/describing meaning, like Miller, Carpenter, Fisher, Baker and so on, or is it something like 'from a place called this', maybe distorted over generations?

I only get nothing from sites like this https://forebears.io/surnames/kouvelas , and the few people with Greek heritage I knew couldn't tell me either, so far.

Can/would you, If it doesn't bother you?


Probably the Stravinsky/Cocteau Opera/libretto combo to properly bastardize the mix by throwing in some gratuitous Russian and French flavour.

Unless they're thinking of some Korean or Japanese New Wave productions such as Oldboy or Funeral Parade of Roses.


Movies and music are usually made to be entertaining, but sometimes they're made as an artistic outlet for the creator.

I was listening to Schoenberg's "Suite for Piano" the other day. Did he make it to be entertaining? I don't know, interesting maybe. I wouldn't put it on at a party.

It's true that snobbery is off-putting, but if you're looking for artistic merit, then some works last longer than others. If you're looking for something to enjoy with your popcorn, then there's that too.


Nobody likes art that requires them to think.


Well, the art can't judge itself.

Maybe critics are art, too. Like Lipton's "Inside the Actor's Studio" (Detroit). That's art.

"It's not art it's ari. You want to make an art film? You take it to Sundance, you take it to Telluride, you take it to Cannes."


  > The basic problem is that GPT generates easy poetry
I was going to come in here and say this. I'll even make the claim that GPT and LLMs __cannot write poetry__.

Of course, this depends on what we mean by "poetry." Art is so hard to define, we might as well call it ineffable. Maybe Justice Potter Stewart said it best, "I know it when I see it." And I think most artists would agree with this, because the point is to evoke emotion. It is why you might take a nice picture and put it up on a wall in your house but no one would ever put it in a museum. But good art should make you stop, take some time to think, figure out what's important to you.

The art that is notable is not something you simply hang on a wall and get good feelings from when you glance at it. They are deep. They require processing. This is purposeful. A feature, not a bug. They are filled with cultural rhetoric and commentary. Did you ever ask why you are no Dorothea Lange? Why your photos aren't as meaningful as Alfred Eisenstaedt's? Clearly There's something happening here, but what it is ain't exactly clear.

Let me give a very recent example. Here[0] is an amicus brief that The Onion (yes, that Onion, the one who bought InfoWars) wrote to the Supreme Court. It is full of satire while arguing that satire cannot be outlawed. It is __not__ intended to be read at a glance. In fact, they even specifically say so:

  > (“[T]he very nature of parody . . . is to catch the reader off guard at first glance, after which the ‘victim’ recognizes that the joke is on him to the extent that it caught him unaware.”).
That parody only works if one is able to be fooled. You can find the author explaining it more here[1].

But we're coders, not lawyers. So maybe a better analogy is what makes "beautiful code." It sure as fuck is not aesthetically pleasing. Tell me what about this code is aesthetically pleasing and easy to understand?

    float InvSqrt(float x){
        float xhalf = 0.5f * x;
        int i = *(int*)&x;
        i = 0x5f3759df - (i >> 1);
        x = *(float*)&i;
        x = x*(1.5f - xhalf*x*x);
        return x;
    }
It requires people writing explanations![2] Yet, I'd call this code BEAUTIFUL. A work of art. I'd call you a liar or a wizard if you truly could understand this code at a glance.

I specifically bring this up because there's a lot of sentiment around here that "you don't need to write pretty code, just working code." When in fact, the reality is that the two are one and the same. The code is pretty __because__ it works. The code is a masterpiece because it solves issues you probably didn't even know existed! There's this talk as if there's this bifurcation between "those who __like__ to write code and those who use it to get things done." Or those who think "code should be pretty vs those who think code should just work." I promise you, everyone in the former group is deeply concerned with making things work. And I'll tell you now, you've been sold a lie. Code is not supposed to be a Lovecraftian creature made of spaghetti and duct tape. You should kill it. It doesn't want to live. You are the Frankenstein of the story.

To see the beauty in the code, you have to sit and stare at it. Parse it. Contemplate it. Ask yourself why each decision is being made. There is so much depth to this, and its writing is a literal demonstration of how well Carmack understands every part of the computer: the language, how the memory is handled, how the CPU operations function at a low level, etc.
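As a rough sketch of what that staring and parsing turns up, here is the same routine with each decision annotated. This is not the original source: the names `inv_sqrt_approx` and `inv_sqrt_residual` are made up for illustration, `memcpy` is swapped in for the pointer casts (which run afoul of C's strict aliasing rules), and the whole thing assumes a 32-bit IEEE 754 `float` and a positive, finite input.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t */
#include <string.h>   /* memcpy */

/* Assumption: float is a 32-bit IEEE 754 value; at least check the size. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical name. Approximates 1/sqrt(x) for positive, finite x. */
float inv_sqrt_approx(float x) {
    float xhalf = 0.5f * x;
    uint32_t i;
    memcpy(&i, &x, sizeof i);        /* copy out the float's raw bits */
    i = 0x5f3759dfu - (i >> 1);      /* magic constant minus half the bits: first guess */
    memcpy(&x, &i, sizeof x);        /* treat the adjusted bits as a float again */
    x = x * (1.5f - xhalf * x * x);  /* one Newton-Raphson step refines the guess */
    return x;
}

/* Hypothetical helper. If y were exactly 1/sqrt(x), then x*y*y would be 1,
   so the distance of x*y*y from 1 shows how far off the approximation is. */
float inv_sqrt_residual(float x) {
    float y = inv_sqrt_approx(x);
    float r = x * y * y - 1.0f;
    return r < 0.0f ? -r : r;
}
```

Calling `inv_sqrt_residual` on a few positive inputs should give values well under one percent, which is presumably good enough for lighting math in a game and is a lot to get out of a subtraction, a shift, and a handful of multiplies.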

I truly feel that we are under attack. I don't know about you, but I do not want to go gentle into that good night. Slow down, you move too fast, you got to make the morning last. It's easy to say not today, I got a lot to do, but then you'll grow up to be just like your dad.

[0] https://www.supremecourt.gov/DocketPDF/22/22-293/242292/2022...

[1] https://www.law.berkeley.edu/article/peeling-layers-onion-he...

[2] https://betterexplained.com/articles/understanding-quakes-fa...


I don't agree that it's possible to describe a piece of poetry as "objectively" better or worse.


You can if the assignment is to write something in iambic pentameter.


I would say Davis's definition of "objectively better" here is "nobody who reads these poems carefully could possibly conclude that this AI crap is better than Walt Whitman, the only explanation is Walt Whitman is so difficult that the raters didn't read it carefully."

The Nature paper is making a bold and anti-humanist claim right in the headline, laundering bullshit with bad data, without considering how poorly-defined the problem is. This data really is awful because the subjects aren't interested in reading difficult poetry. It is entirely appropriate for Davis, as someone who actually is interested in good poetry, to make a qualitative stand as to what is or isn't good poetry and try to define the problem accordingly.


>This data really is awful because the subjects aren't interested in reading difficult poetry.

If the results were the opposite, would the data still be awful?


The data would still be awful, and people would pay less attention to the study because it’s not a priori surprising that ChatGPT would write worse poetry than the most celebrated poets in history.

If I use bad data to conclude that “Java is faster than C++ in most cases” you can be sure it will receive a lot more attention than if I reached the opposite conclusion based on similarly bad data.


> The data would still be awful, and people would pay less attention to the study because it’s not a priori surprising that ChatGPT would write worse poetry than the most celebrated poets in history.

[emphasis mine] You've inadvertently made the poster's point for them. You have written Spivak's "I knew it" reaction, just phrased more glibly.


That's my litmus test for whether I'm confirmation biasing myself.

Obviously this study is flawed and the results are garbage! But if the study had concluded the opposite then I knew it!


I don't see how the authors are dishonest about it, or that the rebuttal refutes any of the actual claims made. They make it clear in the paper that they're specifically evaluating people who aren't especially interested in poetry, and talk at length about how and why this is different from other approaches. I suppose the clickbait title gives a bad first impression.

To summarize the facts: when people were asked to tell if a given poem was written by human or AI, people thought the AI poems were human more often than human poems. People also tended to rate them higher when they thought they were created by AI than when they thought they were human-made. It's speculated in the paper that this is because the AI poems tended to be more accessible and direct than the human poems selected, and the preference for this style from non-experts combined with the perception that AI poetry is poor led to the results.


The selection of human poets is cooked to give the result they wanted. I will grant the authors may have lied to themselves. But I don't think honest scientists would have ever constructed a study like this. It is comparing human avant garde jazz to AI dance music and concluding that "AI music" is more danceable than "human music", without including human dance music! It's just infuriating.


They expressly state the result is likely because the AI poetry was more simple and direct than the poetry selected, which is more accessible for the average person not interested in poetry. They compare and contrast this with other studies where this was not the case.

Yes, it's comparing apples and oranges; that's the whole point. It doesn't make the experiment itself flawed.


It seems to me that the whole study was intended to manufacture a result to grab headlines. Scientific clickbait. It doesn't matter how transparent they are, because that is mostly there to cover their asses.


Hum, but it should have compared against human poems that go for a similar style no? Otherwise, it doesn't tell us much, except that AI was not able to make more complex poems? And maybe that people who don't like poetry when asked prefer simpler poems?


> I don't see how the authors are dishonest about it, or that the rebuttal refutes any of the actual claims made.

Suppose I prepared a glass of lemonade and purchased a bottle of vintage wine—one held in high regard by connoisseurs. Then I found a set of random bystanders and surveyed them on which drink they preferred, and they generally preferred the lemonade. Do you see how it would be dishonest of me to treat this like a serious evaluation of my skill, either as a lemonade maker or a wine maker?


Sure - but it could still be pretty relevant if we want to ask about the future of beverage making and consumption, especially if new technology enables everybody to mass-produce lemonade (and similar sugary beverages) at home at minimal cost.

I'm quite sympathetic to poetry - I actually wrote a blog post about this article last week https://gallant.dev/posts/whither-poetry/

But much like the "debate" between linguistic prescriptivism ("'beg the question' doesn't mean 'raise the question'") and descriptivism ("language is how it is used"), both perspectives have relevance, and neither are really responses to the other.

I certainly hope people keep writing great, human, poetry. But generative ML is a systemic change to creative output in general. Poetry just happens to be in some ways simplest for the LLMs, but other art is tokens and patterns as well.


Personally, I think this would be a sin. To call something art which has no depth. We have too many things that are shallow. I think this has been detrimental to us as a society. That we're so caught up with the next thing that our leisure is anything but. What is the point of this all if not to make life more enjoyable? How can we enjoy life if we cannot have a moment to appreciate it? If we treat time off as if it is a chore that we try to get done as fast as possible? If we cannot have time to contemplate it? A world without friction is dull. It's as if we envy the machines. Perhaps we should make the world less tiring, so we have the energy to be human.


Human art most definitely can be reduced to tokens, since that's also essentially how we compress and transmit it.

Now, whether a statistical token generator makes "real art" is subjective (as human art already is). And again, I'm actually quite sympathetic to the "humans are special" perspective.

But the point of my comment is that this philosophical stance is not a practical reply to what will actually happen in terms of social dynamics and content creation/consumption. Whether we call it "real art" or not, generative tools exist and will be used. So, it makes sense to understand them, even if your goal for doing so is to mitigate their incursions into "real art."

In other words, art must adapt. Which, it always does.


> Suppose I prepared a glass of lemonade and purchased a bottle of vintage wine—one held in high regard by connoisseurs.

I would tell you that there have been results about the blind testing of wines held in high regard by connoisseurs that might make you not want to choose that for a comparison.


The blind tasting studies prove that connoisseurs can't discern the price of wine by taste. They can tell whether or not they like it perfectly well. A good bottle, not an expensive one.


Does this not undermine the premise of the original metaphor (i.e. a glass of lemonade is somehow inferior to a fine wine)? Seems like a lot of goal post shifting.


A glass of lemonade is not inferior. It's a different thing entirely. You can compare good lemonade to bad lemonade, or good wine to bad wine, but asking a group of people who prefer lemonade to compare a glass of lemonade with a glass of wine tells you nothing about the quality of the lemonade or wine in question.

The human poets in the test wrote high-brow poetry, the AI generated low-brow poetry, and the audience of laypeople who were surveyed preferred the low-brow poetry. There's nothing wrong with a straightforward rhyme scheme or anything—it's not bad to be low-brow—but it's not a useful comparison.


I believe you missed the OP's point. Poetry is to be processed. That's a feature, not a bug. Now that we're in an analytical conversation you need to process both papers and OP's words. Like poetry, there is context, things between the lines. Because to write everything explicitly would require a book's worth of text. LLMs are amazing compression machines, but they pale in comparison to what we do. If you take a step back, you'll even see this comment is littered with that compression itself.


You're making a load of generous claims for yourself without giving your thought process:

> The basic problem is that GPT generates easy poetry

> were comparing to difficult human poets

What's your qualitative process for measuring "easy" vs. "difficult" poetry?

> rating using a bunch of people who don't particularly like poetry

How do you know these people don't like poetry? Maybe they don't seek it out, but certainly poetry is not just for poetry lovers. Good poetry speaks to anyone.

> the human Sylvia Plath is objectively better by any possible honest measure

Really? What's your objective measure?


>is so offensive ...

Agree. Poetry is the compact written reflection of an expanded or conflicted soul ... it requires lived experience, self awareness, and ability to compress out superfluous details in language.

The question of whether the human soul can be tricked by an AI illusionist and to what degree is a non sequitur.

Humans eat the meal. At best, and charitably considered, AI eats the menu. Not the same.

In other news I asked AI to convince me it understands American capitalism, and to explain it to me as if I lived in Los Angeles - you know so I could really see/feel it.

It did decently well concluding it will, for example, balance the demand for "ample parking" with supply. Now, I leave you to assess whether that's an awesome and intelligent example, or an AI love song that like Marshall Tucker reminds "can't be wrong."


> the human Sylvia Plath is objectively better by any possible honest measure.

Except for arguably the most important one, creating something that people enjoy. Just because you don't like it doesn't make it worthless. I guess the actual question is: do the raters actually get any enjoyment out of the AI poem, or do they just intensely dislike both?



