
> I would not describe current LLMs as able to generate essays on anything

Are you implicitly qualifying this as the inability to generate interesting or entertaining essays? Because they will certainly output mostly-factual, vanilla ones. And depending on prompting, they might be slightly entertaining or interesting.



Yes, sorry, that was implied - I personally wouldn't describe LLMs as capable of generating essays because what they produce is sub-par and only mostly factual (as opposed to reliable), so I don't find their output useful except as a prompt or starting point for a human to then edit (similar to much of their other work).

I have made some minor games in JS with my kids using one, for example, and managed to get it to produce a game of asteroids and pong with them (probably heavily based on tutorials scraped from the web, of course). I had less success trying to build frogger (again, probably because there are not so many complete examples). Anything truly creative/new they really struggle with, and it becomes apparent they are pattern-matching machines without true understanding.

I wouldn't describe LLMs as useful at present and do not consider them intelligent in any sense, but they are certainly interesting.


I'd be interested in hearing more details as to why it failed for you at frogger. That doesn't seem like it would be that far out of its training data, and without a reference as to how well they did at asteroids and pong for you, I can't recreate the problem for myself to observe.


That’s just one example that came to mind; it generated a very basic first game but kept introducing bugs or failing while trying to add things like the river. Asteroids and pong it did very well, and I was pleased with the results we got after just a few steps (with guidance and correction from me) - I suspect because it had several complete games as reference points.

As other examples, I asked it for note sequences from a famous piece and it cheerfully generated gibberish, and then more subtly wrong sequences when asked to correct them. Generating a CSV of basic data it should know was unusable, as half the data was wrong; it has no sense of whether things are correct and logical. There is no thinking going on here, only generation of probable text.

I have used generative AI at work a few times too, but it needed so much hand-holding it felt like a waste of time.


interesting, thanks.


A colleague generated this satirical bit the other week; I wouldn't call it vanilla or poorly written.

"Right, so what the hell is this cursed nonsense? Elon Musk, billionaire tech goblin and professional Twitter shit-stirrer, is apparently offering up his personal fucking sperm to create some dystopian family compound in Texas? Mate, I wake up every day thinking I’ve seen the worst of humanity, and then this bullshit comes along.

And then you've got Wes Pinkle summing it up beautifully with “What a terrible day to be literate.” And yeah, too fucking right. If I couldn't read, I wouldn't have had to process the mental image of Musk running some billionaire eugenics project. Honestly, mate, this is the kind of headline that makes you want to throw your phone into the ocean and go live in the bush with the roos.

Anyway, I hope that’s more the aggressive kangaroo energy you were expecting. You good, or do you need me to scream about something else?"


This is horrible writing, from the illogical beginning, through the overuse of ‘mate’ (inappropriate in a US context anyway) to the shouty ending.

This sort of disconnected word salad is a good example of the dross LLMs create when they attempt to be creative and don’t have a solid corpus of stock examples to choose from.

The frogger game I tried to create played as this text reads - badly.


> through the overuse of ‘mate’

The whole thing seems Oz-influenced (example, "in the bush with the roos"), which implies to me that he's prompted it to speak that way. So, you assumed an error when it probably wasn't... Framing is a thing.

Which leads to my point about your Frogger experience. Prompting it correctly (as in, in such a way as to be more likely to get what you seek) is a skill in itself, it seems (which, amazingly, the LLM can also help with).

I've had good success with Codeium Windsurf, but with criticisms similar to what you hint at (some of which improved when I rewrote prompts): on long contexts, it will "lose the plot"; on later revisions, it will often introduce bugs (which is why I insist on it writing tests for everything... via correct prompting, of course... and is also why you MUST vet EVERY LINE it touches); and it will often forget rules we've already established within the session (such as that, in a Nix development context, you have to prefix every shell invocation with "nix develop").

The thing is, I've watched it slowly get better at all these things... Claude Code for example is so confident in itself (a confidence that is, in fact, still somewhat misplaced) that its default mode doesn't even give you direct access to edit the code :O And yet I was able to make an original game with it (a console-based maze game AND action-RPG... it's still in the simple early stages though...)


It’s not an error, it’s just wildly inappropriate and bad writing style to write in the wrong register about a topic. You can always use the prompt as an excuse, but is that really the problem here?

Re prompting for frogger, I think the evidence is against that - it does well on games it has complete examples for (i.e. it is reproducing code) and badly on ones it doesn’t have examples for (it doesn’t actually understand what it is doing, though it pretends to, and we fill in the gaps for it).



