
No, letting misinformation persist is counterproductive because of the illusory truth effect: the more people hear it, the more they think (consciously or not) "there must be something to this if it keeps popping up."

Elon Musk's takeover of X is already a good example of what happens with unlimited free speech and unlimited reach.

Neo-Nazis and white nationalists went from forums with 3-4 replies per thread, 4chan posts, and Telegram channels to regularly reaching millions of people and getting tens of thousands of likes.

As a Danish person I remember how American media in the 2010s and early 2020s used to shame Denmark for being very right-wing on immigration. The average US immigration politics thread on X is worse than anything I have ever seen in Danish political discussions.


I think they mean serious in scientific terms, not in policy making.

> We're clearly seeing what AI will eventually be able to do

Are we though? Aside from a narrow set of tasks like translation, grammar, and tone-shifting, LLMs are a dead end. Code generation sucks. Agents suck. They still hallucinate. If you wouldn't trust its medical advice without review from an actual doctor, why would you trust its advice on anything else?

Also, the companies trying to "fix" issues with LLMs with more training data will just rediscover the "long-tail" problem... there is an infinite number of new things that need to be put into the dataset, and that's just going to reduce the quality of responses.

For example: the "there are three 'b's in blueberry" problem was caused by all the training data responding to "there are two r's in strawberry". It's a systemic issue; no amount of data will solve it because LLMs will -never- be sentient.
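To underline how trivial the underlying task is for ordinary code, here's a throwaway Python sketch (nothing to do with how any particular model works internally):

    # Counting letters is a one-liner for a deterministic program.
    for word, letter in [("strawberry", "r"), ("blueberry", "b")]:
        print(word, "has", word.count(letter), f"'{letter}'s")
    # strawberry has 3 'r's
    # blueberry has 2 'b's

The point isn't that the question is hard; it's that getting it wrong at all points to something systemic rather than a gap in knowledge.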

Finally, I'm convinced that any AI company promising they are on the path to General AI should be sued for fraud. LLMs are not it.


I have a feeling that you believe "translation, grammar, and tone-shifting" works but "code generation sucks" for LLMs because you're good at coding and hence you see its flaws, and you're not in the business of doing translation etc.

Pretty sure if you're going to use LLMs for translating anything non-trivial, you'd have to carefully review the outputs, just like if you're using LLMs to write code.


You know, you're right. It -also- sucks at those tasks because on top of the issue you mention, unedited LLM text is identifiable if you get used to its patterns.


By definition, transformers can never exceed the average quality of their training data.

That is the thing, and what companies pushing LLMs don't seem to realize yet.


Can you expand on this? For tasks with verifiable rewards you can improve with rejection sampling and search (i.e. test time compute). For things like creative writing it’s harder.


For creative writing, you can do the same, you just use human verifiers rather than automatic ones.

LLMs have encountered the entire spectrum of quality in their training data, from extremely poor writing and sloppy code to absolute masterpieces. Part of what reinforcement learning techniques do is reinforce the "produce things like the masterpieces" behavior while suppressing the "produce low-quality slop" one.

Because there are humans in the loop, this is hard to scale. I suspect that the propensity of LLMs for certain kinds of writing (bullet points, bolded text, conclusion) is a direct result of this. If you have to judge 200 LLM outputs per day, you prize different qualities than when you ask for just 3. "Does this look correct at a glance" is then a much more important quality.
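Here's a minimal sketch of the rejection-sampling / best-of-N idea mentioned above; generate and score are hypothetical placeholders (an LLM API call, and either an automatic verifier or a human rating), not any real library's API:

    import random

    def generate(prompt):
        # Hypothetical stand-in for sampling one LLM completion.
        return f"draft {random.randint(0, 9)} for {prompt!r}"

    def score(candidate):
        # Stand-in verifier: an automatic check for code or math,
        # or a human rating for creative writing.
        return random.random()

    def best_of_n(prompt, n=8):
        # Rejection sampling / best-of-N: sample n candidates and
        # keep only the highest-scoring one.
        candidates = [generate(prompt) for _ in range(n)]
        return max(candidates, key=score)

    print(best_of_n("write a short poem about rain"))

The scaling problem above is exactly that score() is the expensive part when it has to be a human.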


Exactly. Books are still being translated by human translators.

I have a text on my computer, the first couple of paragraphs from the Dutch novel "De aanslag", and every few years I feed it to the leading machine translation sites, and invariably, the results are atrocious. Don't get me wrong, the translation is quite understandable, but the text is wooden, and the translation contains 3 or 4 translation blunders.

GPT-5 output for example:

Far, far away in the Second World War, a certain Anton Steenwijk lived with his parents and his brother on the edge of Haarlem. Along a quay, which ran for a hundred meters beside the water and then, with a gentle curve, turned back into an ordinary street, stood four houses not far apart. Each surrounded by a garden, with their small balconies, bay windows, and steep roofs, they had the appearance of villas, although they were more small than large; in the upstairs rooms, all the walls slanted. They stood there with peeling paint and somewhat dilapidated, for even in the thirties little had been done to them. Each bore a respectable, bourgeois name from more carefree days: Welgelegen Buitenrust Nooitgedacht Rustenburg Anton lived in the second house from the left: the one with the thatched roof. It already had that name when his parents rented it shortly before the war; his father had first called it Eleutheria or something like that, but then written in Greek letters. Even before the catastrophe occurred, Anton had not understood the name Buitenrust as the calm of being outside, but rather as something that was outside rest—just as extraordinary does not refer to the ordinary nature of the outside (and still less to living outside in general), but to something that is precisely not ordinary.


Can you provide a reference translation, or at least call out the issues you see with this passage? I see "far, far away in the [time period]", which I imagine should be "a long time ago". What are the other issues?


- "they were more small than large" (what?)

- "even in the thirties little had been done to them" (done to them?)

- "Welgelegen Buitenrust Nooitgedacht Rustenburg" (Untranslated!)

- "his father had first called it Eleutheria" (his father'd rather called it)

- "just as extraordinary does not refer to the ordinary nature of the outside" (complete non-sequitur)


What are you talking about? "Welgelegen Buitenrust Nooitgedacht Rustenburg" is perfectly cromulent English.

For what it's worth, I do use AI for language learning, though I'm not sure it's the best idea. Primarily for helping translate German news articles into English and making vocabulary flashcards; it's usually clear when the AI has lost the plot and I can correct the translation by hand. Of course, if issues were more subtle then I probably wouldn't catch them ...


Thanks yeah, you’re right these are bad.


Not the original poster, but you can read these paragraphs translated by a human in Amazon's sneak peek: https://lesen.amazon.de/sample/B0D74T75KH?f=1&l=de_DE&r=2801...

The difference is gigantic.


> Aside from a narrow set of tasks like translation, grammar, and tone-shifting, LLMs are a dead end.

I consider myself an LLM skeptic, but gee, saying they are a "dead end" seems harsh.

Before LLMs came along, computers understanding human language was a graveyard academics went to end their careers in. Now computers are better at it, and far faster, than most humans.

LLMs also have an extraordinary ability to distill and compress knowledge, so much so that you can download a model whose size is measured in GB and it seems to have a pretty good general knowledge of everything on the internet. Again, far better than any human could do. Yes, the compression is lossy, and yes, they consequently spout authoritative-sounding bullshit on occasion. But I use them regardless as a sounding board, and I can ask them questions in plain English rather than go on a magical keyword hunt.

Merely being able to understand language or having a good memory is not sufficient to code or do a lot else on its own. But they are necessary ingredients for many tasks, and consequently it's hard to imagine an AI that can competently code that doesn't have an LLM as a component.


> it's hard to imagine a AI that can competently code that doesn't have an LLM as a component.

That's just it. LLMs are a component: they generate text or images from a higher-level description, but they are not themselves "intelligent". If you imagine the language center of your brain being replaced with a tiny LLM-powered chip, you would not say the chip is sentient. It translates your thoughts into words, which you then choose to speak or not. That's all modulated by consciousness.


> If you wouldn't trust its medical advice without review from an actual doctor, why would you trust its advice on anything else?

When an LLM gives you medical advice, it's right x% of the time. When a doctor gives you medical advice, it's right y% of the time. During the last few years, x has gone from 0 to wherever it is now, while y has mostly stayed constant. It is not unimaginable to me that x might (and notice I said might, not will) cross y at some point in the future.

The real problem with LLM advice is that it is harder to find a "scapegoat" (particularly for legal purposes) when something goes wrong.


Microsoft claims that they have an AI setup that outperforms human doctors on diagnosis tasks: https://microsoft.ai/new/the-path-to-medical-superintelligen...

"MAI-DxO boosted the diagnostic performance of every model we tested. The best performing setup was MAI-DxO paired with OpenAI’s o3, which correctly solved 85.5% of the NEJM benchmark cases. For comparison, we also evaluated 21 practicing physicians from the US and UK, each with 5-20 years of clinical experience. On the same tasks, these experts achieved a mean accuracy of 20% across completed cases."

Of course, AI "doctors" can't do physical examinations and the best performing models cost thousands to run per case. This is also a test of diagnosis, not of treatment.


If you consider how little time doctors have to look at you (at least in Germany's half-broken public health sector) and how little they actually care ...

I think x is already higher than y for me.


That's fair. Reliable access to a 70% expert is better than no access to a 99% expert.


I tried using agents in Cursor and when it runs into issues it will just rip out the offending code :)


I've had similar cases where the fix to the test was... delete the test. Ah, if only I'd realized that little hack earlier in my career!


Are there legitimate organizations to donate to that are effectively evacuating people?


I might have submitted it before already :)


Human speech has a bit rate of around 39 bits per second, no matter how quickly you speak. Assuming reading is similar, I guess more "dense" tokens would just take longer for humans to read.

https://www.science.org/content/article/human-speech-may-hav...


Sure, but that link has Japanese at 5 bits per syllable and Vietnamese at 8 bits per syllable, so if billing were based on syllables per prompt you'd want Vietnamese prompts.

Granted, English is probably going to have better-quality output, given the training data size.
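Back-of-the-envelope with those numbers and the ~39 bits/s figure above (rough arithmetic, not an actual billing model):

    # A fixed information rate implies denser syllables are spoken
    # (or read) more slowly.
    BITS_PER_SECOND = 39
    bits_per_syllable = {"Japanese": 5, "Vietnamese": 8}
    for lang, bits in bits_per_syllable.items():
        print(f"{lang}: ~{BITS_PER_SECOND / bits:.1f} syllables/second")
    # Japanese: ~7.8 syllables/second
    # Vietnamese: ~4.9 syllables/second

Same information rate either way, which is the original point: denser units just get consumed more slowly.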


I guess they're asking where to actually buy them at these prices, not doubting that the price is dropping.


Ah, those are wholesale prices. And the form factor will increasingly be prismatic LFP packs, not cylindrical cells.

For buying LFP cells, I would start here: https://diysolarforum.com/


One explanation I've seen for the Where's Waldo analogy: imagine the single page of the Where's Waldo puzzle, and another giant piece of paper with the shape of Waldo cut out of it.

By placing the cut-out over the page so that only Waldo shows through, you can prove you know where he is without revealing his location: a zero-knowledge proof.
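If you want the "prove you know something without revealing it" intuition in code form, here's a toy commit/reveal sketch in Python. To be clear, this is just a commitment scheme, not an actual zero-knowledge proof: unlike the cut-out, it does reveal the coordinates eventually, it only proves you knew them before the reveal. The coordinates below are made up for illustration.

    import hashlib
    import secrets

    def commit(x, y):
        # Publish the digest now; keep (x, y, nonce) secret.
        nonce = secrets.token_hex(16)
        digest = hashlib.sha256(f"{x},{y},{nonce}".encode()).hexdigest()
        return digest, nonce

    def verify(digest, x, y, nonce):
        # Later, anyone can check the revealed values match the commitment.
        return hashlib.sha256(f"{x},{y},{nonce}".encode()).hexdigest() == digest

    digest, nonce = commit(137, 42)        # made-up coordinates
    print(verify(digest, 137, 42, nonce))  # True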


Everyone in this thread needs to read this paper: https://dl.acm.org/doi/abs/10.1145/3411497.3420225

Where's Waldo as presented isn't even a proof of knowledge.


I think the Where's Waldo example, while not technically zero knowledge, gives a pretty good intuition of the idea behind it.

It certainly gives a "layperson" example of being able to prove you know something without revealing it, which isn't the whole definition of ZK but is the idea driving it.


Is that "Draw a Waldo with this outline"?


Imagine it isn't Waldo, but an unknown figure and you are only given the silhouette to find. If you can draw what's within the silhouette or something, you've proven you've located it to high certainty without saying where.

Say the whole image looked like noise and was generated from quantum measurements, the coordinates to hash for the problem were also generated from quantum measurements, and you were given the silhouette and the hash of the noise within it to look for. I could see it working for proof of work: you could slide a hashing window along and prove you actually did the work of examining, on average, half the image.
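A toy sketch of that sliding-window idea, with all sizes invented for illustration: hash fixed-size windows of a noise "image" until you hit the published target hash, so on average you end up examining half the windows, which is the work.

    import hashlib
    import os

    WINDOW = 32  # bytes per window, arbitrary for the sketch

    def find_window(image: bytes, target_hash: str):
        # Slide a fixed-size window over the image, hashing each chunk
        # until one matches the published target hash.
        for offset in range(len(image) - WINDOW + 1):
            if hashlib.sha256(image[offset:offset + WINDOW]).hexdigest() == target_hash:
                return offset
        return None

    image = os.urandom(10_000)         # stand-in noise image
    secret = 6_500                     # the puzzle-setter's hidden spot
    target = hashlib.sha256(image[secret:secret + WINDOW]).hexdigest()
    print(find_window(image, target))  # 6500, barring a freak collision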


Thanks. So is it really different from "what's (the hash of) word x on page y of the manual?"?


I think my example isn't great and would need to be modified, maybe by giving the hash of a neighboring area to prove you found it, so your answer couldn't be used by others to find the location much more cheaply.


Plot twist: in addition to the cut-out paper, the prover also brings their OWN picture of Waldo, which they always place behind the cut-out.


Possible, yes. Did they? That's the question.


Yes they did, and published it all.

Sometimes I can't believe how low discussions on HN can fall. Did nobody in this thread really bother to check this? Are we fine disparaging research solely because they used a method that gives bad results with bad inputs (which method doesn't?) and because their incentives could be misaligned (whose aren't?)?

If there are well-justified concerns about the method or data then by all means let's talk about them, but please let's all try to keep low-effort, anti-intellectual conspiracy theories away from here.


I read the paper before I made my original comment. They fit a clustering algorithm and then hand waved at interpreting the clusters. 'Omics papers get away with a lot of hand waving. Yeah, they did some peak detection and found peaks, but you are going to find peaks in a random walk.

They didn't test the theory that rapid aging occurs at those two specific time points in an independent hold out set.

Most importantly, even if these peaks exist, this paper does not prove they are biological. They could correspond to common socially driven changes in behavior.


> I read the paper before I made my original comment.

That's good, now I'm wondering about the others in the thread.

> 'Omics papers get away with a lot of hand waving

Making assumptions and interpreting results is part of any type of analysis, especially for unsupervised learning approaches like clustering. Or maybe I am missing something: how do you not-handwave the results of a clustering analysis if you don't have any supervision signal?

In any case, I agree that omics in particular take many more liberties than usual with their interpretations. And yet, sometimes they come up with useful and important findings. Yes, a broken clock is right twice a day, but maybe after working in the same field for many years one can gain some insights and intuitions.

> but you are going to find peaks in a random walk

I would hope so since a random walk has pretty obvious peaks, and it's not hard to test if the peak is significantly beyond the level expected due to chance.

Do you have actual concerns about the data and the peaks they found, or are we back at wondering about all the fallacies that they may or may have not committed?

> They didn't test the theory that rapid aging occurs at those two specific time points in an independent hold out set.

This is a glaring omission, I agree.

> this paper does not prove they are biological. They could correspond to common socially driven changes in behavior.

True, but it doesn't make this paper worth any less. If anything, it's a great question for follow-up work.

