Hacker News | a_bonobo's comments

>* For years, despite functional evidence and scientific hints accumulating, certain AI researchers continued to claim LLMs were stochastic parrots: probabilistic machines that would: 1. NOT have any representation about the meaning of the prompt. 2. NOT have any representation about what they were going to say. In 2025 finally almost everybody stopped saying so.

Man, Antirez and I walk in very different circles! I still feel like LLMs fall over backwards once you give them an 'unusual' or 'rare' task that isn't likely to be presented in the training data.


LLMs certainly struggle with tasks that require knowledge that is not provided to them (at significant enough volume/variance to retain it). But this is to be expected of any intelligent agent, it is certainly true of humans. It is not a good argument to support the claim that they are Chinese Rooms (unthinking imitators). Indeed, the whole point of the Chinese Room thought experiment was to consider if that distinction even mattered.

When it comes to being able to do novel tasks on known knowledge, they seem to be quite good. One also needs to consider that problem-solving patterns are also a kind of (meta-)knowledge that needs to be taught, either through imitation/memorisation (Supervised Learning) or through practice (Reinforcement Learning). They can be logically derived from other techniques to an extent, just as new knowledge can be derived from known knowledge in general, and again LLMs seem to be pretty decent at this, but only to a point. Regardless, all of this is definitely true of humans too.


In most cases, LLMs have the knowledge (data). They just can't generalize it like humans do. They can only reflect explicit things that are already there.

I don't think that's true. Consider that the "reasoning" behaviour trained with Reinforcement Learning in the last generation of "thinking" LLMs is trained on quite narrow datasets of olympiad math / programming problems and various science exams, since exact unambiguous answers are needed to have a good reward signal, and you want to exercise it on problems that require non-trivial logical derivation or calculation. Then this reasoning behaviour gets generalised very effectively to a myriad of contexts the user asks about that have nothing to do with that training data. That's just one recent example.

Generally, I use LLMs routinely on queries definitely no-one has written about. Are there similar texts out there that the LLM can put together and get the answer by analogy? Sure, to a degree, but at what point are we gonna start calling that intelligent? If that's not generalisation I'm not sure what is.

To what degree can you claim as a human that you are not just imitating knowledge patterns or problem-solving patterns, abstract or concrete, that you (or your ancestors) have seen before? Either via general observation or through intentional trial-and-error. It may be a conscious or unconscious process; many such patterns get baked into what we call intuition.

Are LLMs as good as humans at this? No, of course not, though sometimes they get close. But that's a question of degree; it's no argument to claim that they are somehow qualitatively lesser.


Late to this, but my interpretation of the parent's point was, e.g.: LLMs still often produce bad code, despite "reading" every book about programming ever written. Simplistically, they aren't taking the knowledge from those books and applying it to the knowledge of the code they've scraped; they are just using the scraped output. You can then separately ask them about knowledge from those books, but if you go back and get them to code again, they still won't follow the advice they just gave you.

"In 2025 finally almost everybody stopped saying so."

I haven't.


Some people are slower to understand things.

That is why they need artificial intelligence

Well exactly ;)

I don’t think this is quite true.

I’ve seen them do fine on tasks that are clearly not in the training data, and it seems to me that they struggle when some particular type of task or solution or approach might be something they haven’t been exposed to, rather than the exact task.

In the context of the paragraph you quoted, that’s an important distinction.

It seems quite clear to me that they are getting at the meaning of the prompt and are able, at least somewhat, to generalise and connect aspects of their training to “plan” and output a meaningful response.

This certainly doesn’t seem all that deep (at times frustratingly shallow) and I can see how at first glance it might look like everything was just regurgitated training data, but my repeated experience (especially over the last ~6-9 months) is that there’s something more than that happening, which feels like what Antirez was getting at.


Give me an example of one of those rare or unusual tasks.

I work on a few HPC systems with unusual, kinda custom-rolled architectures. A whole bunch of Python and R packages fail to compile on these systems. There's no publicly accessible documentation for these HPC systems, nor for these custom architectures. ChatGPT and Claude so far have given me only wrong advice on how to get around these compilation errors and there's not much on Google for these errors, but HPC staff usually knew what to do.

Set the font size of a simple field in OpenXML. Doesn't even seem that rare. It said to add a run inside and set the font there. That didn't do anything. I ended up reverse-engineering the output out of MS Word. This happened yesterday.
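For anyone hitting the same wall: what Word itself emits for run formatting is a `w:rPr` carrying a `w:sz` whose value is in half-points (28 = 14pt), usually paired with `w:szCs` for complex scripts. A minimal sketch of building such a run with Python's standard library (the element names come from the WordprocessingML spec; whether Word honours it inside a given field is, as the comment above notes, another matter):

```python
# Build the WordprocessingML for a 14pt run with xml.etree.ElementTree.
# Note w:sz is in *half-points* (28 = 14pt), a common OpenXML gotcha.
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)

run = ET.Element(f"{{{W}}}r")
rpr = ET.SubElement(run, f"{{{W}}}rPr")
ET.SubElement(rpr, f"{{{W}}}sz", {f"{{{W}}}val": "28"})    # 14pt
ET.SubElement(rpr, f"{{{W}}}szCs", {f"{{{W}}}val": "28"})  # complex-script size
text = ET.SubElement(run, f"{{{W}}}t")
text.text = "Hello"

xml = ET.tostring(run, encoding="unicode")
```

Reverse-engineering Word's own output, as the commenter did, remains the most reliable way to see which ancestor element the formatting actually has to live on.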

> to accurately prepopulate tax returns for around 45% of Americans. (Those other countries have much simpler tax codes than we do.)

One should note that the cited study quotes the 45% from a 1992 study. These days, with the gig economy and quasi-self-employment, that number is probably higher, since you don't have an employer who reports your income for you.

Still, here in Australia, where we have the return-free tax system, adding what you earned from your various gig jobs isn't too hard: you add that as items to the web form: 'I made 15,123 from Uber Eats'. That just gets added to your overall return. I don't see how that's so hard compared to the US?


Income reporting is not the problem: Anyone paying you any significant amount of money is required to file with the IRS, including if you’re paying yourself.

The issue is the broad range of deductions and credits that depend on things like the composition of your household and your primary residence. Contra some expectations, the IRS does not keep a database of who’s shacking up with whom, where, or if kids are in the picture.


In the states if you are a contractor there are tons of things that you can deduct from your taxable income. So “figuring out how much you should be taxed” is after those deductions.

If Uber paid you $15,123 but you:

Just bought a new bike because your old one was stolen

Paid $1,200 for insurance

Bought a helmet and cold-weather clothes, etc.

Those things reduce your taxable income.


I think that's common in most places. What's different in the US is that the IRS forces you to proactively provide a lot more information about it, though. I have a rental property and need to enter the same information about the same income and expenses on three different forms, breaking it down in different ways. It's tedious and error-prone, and I guess the philosophy is that it's easier to spot fraud if the numbers on all the different forms don't add up to a coherent story.

Other countries presumably rely on other fraud signals. They might have more visibility into your day-to-day financial transactions, or there might be more of a culture of leaving an anonymous tip if you suspect your neighbor isn't paying a fair share.


What three forms are you talking about?


4562, 8825, 1065


Yes, same in Australia. Keep receipts and add the cost to the web form.

They have simplified it nicely, though: if you work from home you can claim a per-hour deduction so you don't have to do the math of wear-and-tear, electricity, internet etc. I think it was $0.6 per hour?


Finland did it even simpler: if you work remotely more than 50% of work days, you get a €750 deduction. Of course, the hard part is calculating 50% of your internet bill. And then there's any technology you buy for remote work. Not chairs, desks or lamps though; those fall under the room part...

Thankfully (/s), they are simplifying it even further next year by removing the whole thing. Now you only get to deduct money if you actually rent an office...


If you can, read Robert Caro's The Path To Power (Caro's The Power Broker has been a HN favorite ever since Aaron Swartz recommended it). It's the story of the first ~30 years of Lyndon B Johnson's life.

I forget which chapter it is, but Caro takes a detour where he describes the life of women during Johnson's childhood in the dirt-poor valley he was from: no electricity, no waterpower, everything in the house done by women's hands, 24/7. There's a passage that stuck with me about how women in their 30s in that area looked like women in their 70s elsewhere; just a brutal life.


Chapter 4 - The Father and Mother

> Transplanted, moreover, to a world in which women had to work, and work hard. On washdays, clothes had to be lifted out of the big soaking vats of boiling water on the ends of long poles, the clothes dripping and heavy; the farm filth had to be scrubbed out in hours of kneeling over rough rub-boards, hours in which the lye in homemade soap burned the skin off women’s hands; the heavy flatirons had to be continually carried back and forth to the stove for reheating, and the stove had to be continually fed with new supplies of wood—decades later, even strong, sturdy farm wives would remember how their backs had ached on washday.


And what he left out of this book (and included in the memoir or in some interview) was that there was a scientific study of women in the area at the time, which discovered that a very high percentage of women had birthing complications serious enough for hospitalization that went untreated, as they had to go back to their chores the next day and there was no hospital anywhere close.


Exactly what I thought of reading this; that chapter is genuinely one of the most affecting things I've ever read. The horror of it keeps growing as he describes one awful manual task after another.


Related, I think people have stopped... reacting on the internet? I've been part of the X/Twitter to Bluesky migration and people often mention how 'quiet' Bluesky is.

I think that's not due to algorithmic intervention or product design etc.; I think people are just tired. The novelty of shouting at strangers on the internet has worn off - how many internet fights have we gotten into that did nothing in the end except waste time? It's only worse with a coin flip's chance of the other person being an LLM. We're all tired.


This is relatable. I often find myself starting a reply on here, really thinking it through as I type it out, and then hitting delete on what I just wrote. Sometimes I even hit submit, and then delete a few moments later.

It's just hard to justify engaging. Worst case, I get a fight on my hands with someone who's as dogmatic as they are wrong, which is both frequent and also a complete waste of my time. (A tech readership is always going to veer hard into the well, akshually...) Most likely case, I get fictitious internet points. Which - I won't lie - tickle my lizard brain, just as they do everyone else's. But they don't actually achieve anything meaningful.

Best case is that I learn something. Realistically, this happens vanishingly infrequently, and the signal-noise ratio is much, much worse than if I just pulled a book off my shelf.

I suppose this is all an artifact of time and experience. Maybe I've just picked all the low-hanging fruit, and so I no longer have the patience to watch people endlessly repost the same xkcd strips from fifteen years ago, navel-gaze about tabs or spaces, share thrilling new facts that I have in fact known for many decades, etc. And while I'm very excited for them to discover all these things anew (and anew... and anew...), it's just not a good use of my time and patience to participate.


> It's just hard to justify engaging. Worst case, I get a fight on my hands with someone who's as dogmatic as they are wrong, which is both frequent and also a complete waste of my time.

The three mindset changes I found that really help with this are understanding that:

* You don't have to try and get the last word in.

* Other people are not entitled to your time, especially if they're engaging in bad faith.

* Outside of small and curated communities, there's pretty good odds that you're not interacting with a real and honest person.

So whenever I click into the comment box, I always ask myself "Can I really be bothered with this? Is this really what I want to be spending my free time doing?"

And then I often close the comment box and get on with my life.


    It's just hard to justify engaging.
Well, if you try to force yourself to engage with multiple people, the site won't let you post that many comments in such a short time period. Which, overall, is a good thing, I believe.


I wish we got karma points (or maybe "zen points") for every time we refrained from commenting on someone who is wrong on the internet.


I wonder if it's just creeping apathy, post-covid, current-AI boom. That we're just tired in life. There's a psych study, Dimensional Apathy Scale (DAS)[0] and one of the questions is basically "How much do I contact my friends?" I think it argues that the more apathy we feel, the less likely we are to reach out to others, and I imagine, the less likely we are to react or reply to comments (or even post).

I'm curious if the decline in reacting is matched by a decline in replying and posting in general.

Anyways, I worry that apathy is on the rise as we get overwhelmed with the rate of change and uncertainty in the 2020s and I'm working pretty hard to fight that apathy and bring more empathy, so if you're interested, please reach out to me the contact info in my bio.

[0]: https://das.psy.ed.ac.uk/wp-content/uploads/2018/04/SelfDAS....


I feel this, but also, I am... anxious about reactions? I rarely / never go back on comments I've written on HN. I know it's actually a really bad thing to do because it means I won't allow my views to be challenged, don't engage in debate, just want to get my side out without actively defending it.

Years ago I had a blog and one time I wrote a post in response to another blog post about education vs experience, arguing in favor of formal education. And that one got a link back from the original article, leading people back to my blog. I got engagement, comments, feedback, etc... and it was very uh. Overwhelming? Like suddenly I had to defend my arguments. It made me very uncomfortable, even though it was probably a good thing, all in all.

I don't know how to break that trend. I think I'd rather have realtime communications / chat, but that's another thing that seems to have died, at least in the space I've been at for a long time now.


The simple solution is that whenever you start to write a comment, ask yourself: do I want to have a discussion about this?

If the answer is "yes", then make your comment, check back and interact with the responses (assuming they seem to be in good faith). If it's "no" then just close the comment box and get on with your life.

But then I realise that it's fairly pointless writing this in the first place...


Spot on. Ten or fifteen years ago, participating in the internet was something I got excited about, now I just get excited about getting away from it.


I think the aggressive bots/AI, and bad moderation policy, have poisoned online discourse in popular channels.

You can still find real people in niche communities (like here), where good moderators can maintain a grip on quality. Though perhaps HN has some secret moderator sauce, I’m not aware of.

Humans are just migrating off the old, big platforms that no longer feel real.


Probably more related to progressive culture, people worried about saying the wrong thing. From the outside, it looks exhausting to try and keep up with the latest dogma of the left.


Participating? Or reacting? The internet I look at seems plenty full of reactions despite the migrations you mention.

Maybe to YT or Threads instead.

I like Bsky but I don't think the userbase supports much large-scale communication (not a bad thing, frankly)


In my niche, bioinformatics, LinkedIn has become somewhat of a force ever since many people left Twitter/X during the 'rebranding'. It's quite weird.

They're mostly posts announcing new packages etc. but there seems to be more bioinformatics-y activity than, say, mastodon or bluesky. The posts definitely have a different tone than what OP decries.


Yes there are a bunch of weird niches that got a lot of Twitter traffic but found a home on LinkedIn when there's an overlap with professions. Another niche example that I see is applications for AI powered architectural visualization, many folks posting actually useful stuff there on a regular basis.


Honestly I wish people stuck with good old forums. There's forums for everything out there, in every niche, gaming, modding, hardware, cars, boats.

Every single community you can think of has likely a great forum out there, easily readable and searchable, where discussions on single topics last _years_ and go in extreme informative depth, the kind of depth that no platform like HN/Lobsters/LinkedIn can ever dream of.

The closest surrogate we have is issue trackers (like GitHub) or mailing lists, but even those offer such a poor UX that I can't help but wonder...


Bioinformatics has biostars :) https://www.biostars.org/

The difference to linkedin is that biostars has 'in-domain experts' only; the postdocs, the staff bioinformaticians, etc. those are not the people who will hire you. The people who will hire you are on linkedin.


Yeah it's definitely a combination of posting for peers but also creating material that is helpful for finding the next gig


>I find there is usually also some file juggling, parsing, [...]

I'd say I'm 50/50 Python/R for exactly this reason: I write Python code on HPC or a server to parse many, many files, then I get some kind of MB-scale summary data I analyse locally in R.

R is not good at looping over hundreds of files in the gigabytes, Python is not good at making pretty insights from the summary. A tool for every task.
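That hand-off can be sketched roughly like this (hypothetical file names and stats, just to show the shape of the workflow): Python crunches the many large inputs down to a small CSV summary, which R then reads locally for plotting.

```python
# Hypothetical sketch: reduce many big files to a small per-file summary
# that R can load as a data frame. Each input is an iterable of numeric
# lines; the output is a tidy CSV with one row per file.
import csv
import io

def summarize(files):
    """files: dict of name -> iterable of numeric lines. Returns row dicts."""
    rows = []
    for name, lines in files.items():
        values = [float(x) for x in lines]
        rows.append({
            "file": name,
            "n": len(values),
            "mean": sum(values) / len(values) if values else 0.0,
        })
    return rows

def to_csv(rows):
    """Serialize the summary rows to CSV text (the MB-scale hand-off file)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["file", "n", "mean"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

On the R side this is then just a `read.csv()` away from ggplot2, which matches the division of labour described above.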


I think that's also because Claude Code (and LLMs) is built by engineers who think of their target audience as engineers; they can only think of the world through their own lenses.

Kind of how for the longest time, Google used to be best at finding solutions to programming problems and programming documentation: say, a Google built by librarians would have a totally different slant.

Perhaps that's why designers don't see it yet, no designers have built Claude's 'world-view'.


I've implemented the per-day time limit and it's just broken. It seems very hard for it to measure how much time has been spent in a day; its measure is often above the limit I set, and sometimes a low limit will trigger immediately.

It seems like the limit and time measurement is based on the US time zone alone, not the local time zone. We're in Australia and that's the only explanation I can think of.
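The timezone hypothesis is easy to illustrate: if the daily counter resets at a US-timezone midnight, an Australian morning already belongs to the previous US calendar day, so usage gets bucketed wrongly. A hypothetical sketch with Python's `zoneinfo` (dates chosen arbitrarily):

```python
# Illustration: the same instant falls on different calendar days depending
# on which timezone a "per-day" counter uses for its reset boundary.
from datetime import datetime
from zoneinfo import ZoneInfo

# 3am on June 1st for a user in Brisbane (UTC+10, no DST)
instant = datetime(2025, 6, 1, 3, 0, tzinfo=ZoneInfo("Australia/Brisbane"))

day_local = instant.date()                                           # user's day
day_us = instant.astimezone(ZoneInfo("America/Los_Angeles")).date()  # server's day

# 3am June 1 in Brisbane is still May 31 in Los Angeles, so a limit that
# resets on US midnight charges this session against the "wrong" day.
```

If the limit really is tracked against a fixed US day boundary, this mismatch would explain both symptoms: totals above the local-day limit, and a fresh low limit tripping immediately.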


If you're interested in a technical solution, Amazon Kids+ on Kindle seems to enforce time limits reasonably well, except for the camera.


Thanks :) I've gone back to just being vigilant, regular parenting; often joining the games myself. I can recommend 99 Nights In the Forest, nice simple survival-style game with very little in-your-face-monetization.


and gives public presentations at gatherings of the far-far-right party https://www.abc.net.au/news/2025-01-26/elon-musk-supports-af...


I love Stanislaw Lem. Solaris is his most famous and it has many of his core themes; I'd also try Fiasco.

I'm also a huge fan of R.A. Lafferty, but his stuff is harder to find, mostly out of print.

Peter Watts' Blindsight is amazing recent-ish hard SF. (the follow-up, I did not like at all).

Anything from the Strugatsky brothers you can get your hands on!


I also love Solaris. I remember reading it the first time as a teenager. It is so matter-of-fact in its telling, but the facts are so bizarre, that I found it to really induce terror. It has always struck me as more of a horror novel than as a sci-fi novel although it is clearly the latter.

There are now multiple English translations of Solaris available. I know that there’s been a lot of praise for the newer translation, and I read it, but I do not like it. Something about the earlier translation feels more ominous.

On that note—I’ve always found it hard to believe that The Cyberiad was written by the same author! I love the Cyberiad as well but almost for the opposite reasons I love Solaris. The entire universe is charming and funny, whereas Solaris is engrossing but dreadful. I went through a phase in college, reading every Lem book I could find, and eventually discovering that my library’s stacks also included Lem in Polish. Sadly I know no Polish, and was not motivated enough to learn it, so those novels remained off-limits to me.


I grew up in Germany, and there are more German translations of Lem than there are English translations - some 'first' English translations are very recent (within the last 10 years? Like Summa Technologiae?). I've always had a hypothesis that German geeks are more constrained and worried about the impacts of tech because we grew up with Lem, while American geeks grew up with Heinlein. VERY different views of the world.


Solaris is such a unique concept. From Polish authors I've really enjoyed Limes Inferior by J. Zajdel. The concept of means of payment spoke to me, when I had my own wondering about workless future and digital currencies.

