The fact that it (is suggested / we are led to believe / was recently implied) that the neurons can be explained as doing something like this at the underlying layer still says little about the process of forming the ontological context needed for any kind of syllogism.
Observations are anecdotal. Since many LLMs are non-deterministic due to their sampling step, you could give the same survey to the same LLM many times and receive different results (a quick sketch below illustrates this).
And we don’t have a good measure for emergent intelligence, so I would take any “study” with a large grain of salt.
I’ve read one or two arxiv papers suggesting reasoning capabilities, but they were not reproduced and I personally couldn’t reproduce their results.
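To make the non-determinism point above concrete, here is a minimal sketch, assuming the Hugging Face transformers library and GPT-2 as a stand-in model (both are my choice of example, not something from the thread). With sampling enabled, repeated runs on the identical prompt can come back with different continuations.

    # Minimal sketch: sampling makes repeated runs non-deterministic.
    # Assumes the Hugging Face transformers library and GPT-2 as a stand-in model.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "The capital of France is"
    inputs = tokenizer(prompt, return_tensors="pt")

    for _ in range(3):
        out = model.generate(
            **inputs,
            do_sample=True,                       # sample from the token distribution instead of greedy argmax
            temperature=0.8,
            max_new_tokens=20,
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token; reuse EOS to silence the warning
        )
        print(tokenizer.decode(out[0], skip_special_tokens=True))

With do_sample=False (greedy decoding) the three printouts would be identical; that difference is exactly what the "same survey, different results" observation comes down to.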
Go back to the ReAct paper (Reasoning and Acting). This is the basis of most of the modern stuff. Read the paper carefully and reproduce it; I have done so, and it is doable. The paper and the papers it refers to directly address many things you have said in these threads. For example, the stochastic nature of LLMs is discussed at length in the CoT-SC paper (chain-of-thought self-consistency). When you're done with that, take a look at the Reflexion paper.
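For anyone who wants the CoT-SC idea in one screen, here is a hedged sketch (the helpers sample_chain_of_thought and extract_answer are hypothetical stand-ins for your own LLM client and answer parser, not code from the paper): sample several reasoning chains for the same question and take a majority vote over the final answers, which is how the paper deals with the stochasticity of any single sampled chain.

    # Sketch of chain-of-thought self-consistency (CoT-SC): sample several
    # reasoning chains, then majority-vote over the extracted final answers.
    from collections import Counter

    def sample_chain_of_thought(question: str) -> str:
        """Hypothetical stand-in: one sampled reasoning chain ending in 'Answer: <x>'."""
        raise NotImplementedError("plug in your LLM client here")

    def extract_answer(chain: str) -> str:
        # Assumes each chain ends with a line like "Answer: 42".
        return chain.rsplit("Answer:", 1)[-1].strip()

    def self_consistent_answer(question: str, n_samples: int = 10) -> str:
        answers = [extract_answer(sample_chain_of_thought(question)) for _ in range(n_samples)]
        # The vote smooths over the randomness of any individual sample.
        return Counter(answers).most_common(1)[0][0]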
To me it feels that whatever 'proof' you give that LLMs have a model behind them, beyond 'next token prediction', it would not make a difference to people who don't 'believe' it. I see this happening over and over on HN.
We don't know how reasoning emerges in humans. I'm pretty sure multimodality helps, but it is not needed for reasoning: other modalities just mean other forms of input, hence more (if somewhat different) input. A blind person can still form an 'image'.
In the same sense, we don't know how reasoning emerges in LLMs. For me the evidence lies in the results rather than in how it works, and the results are evidence enough.
The argument isn't that there is something more than next token prediction happening.
The argument is that next token prediction does not imply an upper bound on intelligence, because an improved next token prediction will pull increasingly more of the world that is described in the training data into itself.
> The argument isn't that there is something more than next token prediction happening.
> The argument is that next token prediction does not imply an upper bound on intelligence, because an improved next token prediction will pull increasingly more of the world that is described in the training data into itself.
Well said! There's a philosophical rift appearing in the tech community over this very issue, semi-neatly dividing people into naysayers, "disbelievers", and believers.
I fully agree. Some people fully disagree, though, on the 'pull of the world' part, let alone the 'intelligence' part, both of which are in fact impossible to define precisely.
The reasoning emerges from the long-distance relations between words picked up by the parallel nature of transformers. That's why they were so much more performant than the earlier RNNs and LSTMs, which used similar tokenization.
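To illustrate the "parallel nature" point, here is a toy sketch of scaled dot-product attention (my own example in PyTorch, with made-up dimensions): every position scores against every other position in a single matrix multiply, so a long-distance relation does not have to survive a step-by-step recurrent state the way it does in an RNN or LSTM.

    # Toy single-head scaled dot-product attention over one sequence.
    import torch
    import torch.nn.functional as F

    seq_len, d_model = 8, 16
    x = torch.randn(seq_len, d_model)      # token embeddings

    Wq = torch.randn(d_model, d_model)
    Wk = torch.randn(d_model, d_model)
    Wv = torch.randn(d_model, d_model)

    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / d_model ** 0.5      # (seq_len, seq_len): every position vs. every other, computed in parallel
    weights = F.softmax(scores, dim=-1)
    out = weights @ v                      # each output mixes information from all positions at once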
As you have said, neither you nor the other guy is qualified to make a technical assessment of this, and we just have to wait for the investigations to play out.
However, what we can assess is the ex-employee's actions and multitude of claims, which, while each seems benign in isolation, together make her look more like a vengeful ex-employee than a credible victim.
I do not think it is reasonable to conclude that, because a report has some mundane findings, the whole report should be dismissed.
I think large chain-of-thought LLMs (e.g. o3) might be able to manage.