
Adding the position vector is basic, sure, but it's naive to think the model doesn't develop its own positional system bootstrapping on top of the barebones one.

For some reason people are still adding position encodings into embeddings.

As if they are not relying on the model's ability to develop its own "positional system bootstrapping on top of the barebones one."
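
For reference, the "barebones" scheme being discussed is just summing a position vector into each token embedding. A minimal sketch in Python/numpy, following the sinusoidal recipe from the original Transformer paper; the shapes and values here are illustrative:

    import numpy as np

    def sinusoidal_positions(seq_len, d_model):
        # One encoding row per position: alternating sin/cos over
        # geometrically spaced frequencies.
        pos = np.arange(seq_len)[:, None]
        i = np.arange(0, d_model, 2)[None, :]
        angles = pos / np.power(10000.0, i / d_model)
        pe = np.zeros((seq_len, d_model))
        pe[:, 0::2] = np.sin(angles)
        pe[:, 1::2] = np.cos(angles)
        return pe

    # The "adding into embeddings" step: an elementwise sum, nothing more.
    seq_len, d_model = 128, 512
    token_embeddings = np.random.randn(seq_len, d_model)  # stand-in
    x = token_embeddings + sinusoidal_positions(seq_len, d_model)

Everything positional beyond that sum is whatever the model learns on top.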


fMRIs are correlational nonsense (see Brainwashed, for example), and so are any "model introspection" tools.


Peer review would encourage less hand-wavy language and more precise claims. Reviewers would penalize the authors for bringing up bizarre analogies to physics concepts for seemingly no reason. They would criticize the fact that the whole post talks about features without a concrete definition of a feature.

The sloppiness of the circuits thread blog posts has been very damaging to the health of the field, in my opinion. People first learn about mech interp from these blog posts, and then they adopt a similarly sloppy style in discussion.

Frankly, the whole field currently is just a big circle jerk, and it's hard not to think these blog posts are responsible for that.

I mean, do you actually think this kind of slop would be publishable at NeurIPS if they submitted the blog post as is?


"peer review would encourage less hand wavy language and more precise claims"

In theory, yes. In practice, let's not pretend actual peer review would do this.


So you think that this blog post would make it into any of the mainstream conferences? I doubt it.


IME: most of the reviewers at the big ML conferences are second-year PhD students sent into the breach against the overwhelming tide of 10k submissions... Their review comments are often somewhere between useless and actively promoting scientific dishonesty.

Sometimes we get good reviewers, who ask questions and make comments which improve the quality of a paper, but I don't really expect it in the conference track. It's much more common to get good reviewers in smaller journals, in domains where the reviewers are experts and care about the subject matter. OTOH, the turnaround for publication in these journals can take a long time.

Meanwhile, some of the best and most important observations in machine learning never went through the conference circuit, simply because the scientific paper often isn't the best venue for broad observation... The OG paper on linear probes comes to mind. https://arxiv.org/pdf/1610.01644
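
For anyone who hasn't read it, the linear-probe idea is small enough to sketch: freeze the network, take the hidden activations at some layer, and fit a linear classifier on them; probe accuracy tells you how linearly decodable a concept is at that layer. A toy sketch with synthetic activations standing in for real ones:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    # Stand-in data; in practice these are hidden states pulled from one
    # layer of a frozen model, shape (n_samples, hidden_dim).
    rng = np.random.default_rng(0)
    acts = rng.normal(size=(1000, 64))
    labels = (acts[:, :8].sum(axis=1) > 0).astype(int)  # toy "concept"

    probe = LogisticRegression(max_iter=1000)
    probe.fit(acts[:800], labels[:800])
    print("probe acc:", accuracy_score(labels[800:], probe.predict(acts[800:])))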


For papers submitted to a conference, it may well be that reviewers don't offer suggestions that would significantly improve the quality of the work; indeed, the quality of reviews has gone down significantly in recent years. But if Anthropic were to submit this work to peer review, they would be forced to tighten it up significantly.

The linear probe paper is still written in a format where it could reasonably be submitted, and indeed it was submitted to an ICLR workshop.


didn't hackers use to be for piracy?


Generic piracy is, for example, distributing an unmodified commercial video with full attribution. You don't normally create a derivative work.

Plagiarism is training on generously licensed open source software and creating a derivative work without attribution.

Not all hackers were in favor of piracy; the majority of open source hackers have always been pretty protective of their licenses, which were written before the existence of the laundromats.

Taking all IP and using it against its creators is entirely new and does not match the piracy issues.


Yeah the corporate bootlicking around all of this is weird. We’ve hated patent trolls and Disney’s abuse/corruption of the copyright system and record label lawsuits, but suddenly intellectual property is great?


It is difficult to get a man to stick to his principles when it is his salary that those principles will impinge upon.


This suggests people should pre-register benchmarks, because currently it feels like there is little incentive to publish benchmarks that models saturate.


But there are lots of models available now that render much faster and produce better quality than Sora.


I completely agree, this shit is so depressing. When I saw the AlphaProof paper I basically spent 3 days in mourning, because their approach was so simple.


I think the whole paper is a satire lol.


Does it really? If you want an LLM to edit code, you need to feed it every single line of code in the prompt. Is it really that surprising that, having just learned it has been timed out, and then seeing code with an explicit timeout in it, it edits that timeout? This is just a claim about the underlying foundational LLM, since the whole science thing is just a wrapper.

I think this bit of it is just a gimmick put in for hype purposes.


Isn't RL basically the algorithm we want?


Want for what?

RL is one way to implement goal-directed behavior (making decisions now that will hopefully lead towards a later reward), but I doubt this is the actual mechanism at play when we exhibit goal-directed behavior ourselves. Something more RL-like may potentially be used in our cerebellum (not cortex) to learn fine motor skills.
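
(To pin down the "decisions now for a later reward" part: the canonical form is a discounted-return update, e.g. tabular Q-learning. A minimal sketch, with all sizes illustrative:)

    import numpy as np

    # Tabular Q-learning: the value of an action now is its immediate
    # reward plus a discounted estimate of the best achievable future
    # value -- i.e. decisions now in the service of a later reward.
    n_states, n_actions = 16, 4
    Q = np.zeros((n_states, n_actions))
    alpha, gamma = 0.1, 0.99  # learning rate, discount factor

    def q_update(s, a, r, s_next):
        target = r + gamma * Q[s_next].max()
        Q[s, a] += alpha * (target - Q[s, a])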

Some of the things clearly needed for human-like AGI: the ability to learn incrementally and continuously (the main ways we learn are by trial and error, and by copying) as opposed to pre-training with SGD, working memory, the ability to think to arbitrary depth before acting, innate qualities like curiosity and boredom to drive learning and exploration, etc.

The Transformer architecture underlying all of today's LLMs has none of the above, which is not surprising since it was never intended as a cognitive architecture - it was designed for seq2seq uses such as language modeling.

So, no, I don't think RL is the answer to AGI, and note that DeepMind who had previously believed that have since largely switched to LLMs in the pursuit of AGI, and are mostly using RL as part of more specialized machine learning applications such as AlphaGo and AlphaFold.


But RL algorithms do implement things like curiosity to drive exploration: https://arxiv.org/pdf/1810.12894
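
The mechanism in that paper (Random Network Distillation) is simple enough to sketch: the intrinsic "curiosity" reward is a predictor network's error at matching a fixed, randomly initialized target network, so novel observations score high. A toy PyTorch sketch, with all sizes illustrative:

    import torch
    import torch.nn as nn

    obs_dim, feat_dim = 32, 16
    target = nn.Linear(obs_dim, feat_dim)     # fixed random network
    predictor = nn.Linear(obs_dim, feat_dim)  # trained to imitate it
    for p in target.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(predictor.parameters(), lr=1e-4)

    def intrinsic_reward(obs):  # obs: (batch, obs_dim) float tensor
        err = ((predictor(obs) - target(obs)) ** 2).mean(dim=1)
        opt.zero_grad(); err.mean().backward(); opt.step()
        # Familiar observations get predicted well (low bonus); novel
        # ones don't (high bonus) -- that asymmetry is the "curiosity".
        return err.detach()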

Thinking to arbitrary depth sounds like Monte Carlo tree search, which is often implemented in conjunction with RL. And working memory, I think, is a matter of the architecture you use in conjunction with RL; agreed that transformers aren't very helpful for this.

I think what you call 'trial and error' is what I intuitively think of RL as doing.

AlphaProof runs an RL algorithm during training AND at inference time. When given an olympiad problem, it generates many variations on that problem, tries to solve them, and then uses RL to effectively finetune itself on the particular problem currently being solved. Note again that this process happens at inference time, not just during training.
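
Roughly, the loop described above looks like this. To be clear, this is a hypothetical sketch of the published description, not DeepMind's actual code; every function name here is a stand-in:

    def test_time_rl(problem, model, steps=100):
        # Hypothetical sketch: all of these functions are stand-ins.
        for _ in range(steps):
            variant = generate_variant(problem)   # related/easier statements
            proof = model.sample_proof(variant)
            reward = 1.0 if lean_verifies(variant, proof) else 0.0
            model.rl_update(variant, proof, reward)  # finetune on the fly
        return model.sample_proof(problem)  # attempt the actual target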

And AlphaProof uses an LLM to generate the Lean proofs, and uses RL to train this LLM. So it kinda strikes me as a type error to say that DeepMind has somehow abandoned RL in favour of LLMs. Note this Demis tweet https://x.com/demishassabis/status/1816596568398545149 where he seems to be saying that they are going to combine some of this RL stuff with the main Gemini models.


> But RL algorithms do implement things like curiosity to drive exploration??

I hadn't read that paper, but yes, using prediction failure as a learning signal (and attention mechanism), the same as we do, is what I had in mind. It seems that to be useful it needs to be combined with an online learning ability, so that, having explored, one's predictions are better the next time.

It's easy to imagine LLMs being extended in all sorts of ad-hoc ways, including external prompting/scaffolding such as "think step by step" and tree search, which help mitigate some of the architectural shortcomings, but I think online learning is going to be tough to add in this way, and it also seems that using the model's own output as a substitute for working memory isn't sufficient to support long-term focus and reasoning. You can try to script intelligence by putting the long-term focus and tree search into an agent, but I think that will only get you so far. At the end of the day a pre-trained transformer really is just a fancy sentence completion engine, and while it's informative how much "reactive intelligence" emerges from this type of frozen prediction, it seems the architecture has been stretched about as far as it will go.

I wasn't saying that DeepMind have abandoned RL in favor of LLMs, just that they are using RL in applications narrower than AGI. David Silver, at least, still seemed to think that "Reward is enough" [for AGI] as of a few years ago, although I think most people disagree.


Hmm, well, the reason a pre-trained transformer is a fancy sentence completion engine is that that's what it is trained on: cross-entropy loss on next-token prediction. As I say, if you train an LLM to do math proofs, it learns to solve 4 out of the 6 IMO problems. I feel like you're not appreciating how impressive that is, and it is only possible because of the RL aspect of the system.
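
(That objective, concretely: the logits at position t are scored against the token at position t+1, and the cross-entropy over that shift is the entire pre-training signal. A sketch in PyTorch, with stand-in tensors:)

    import torch
    import torch.nn.functional as F

    vocab, seq = 1000, 12
    logits = torch.randn(seq, vocab)          # stand-in model output
    tokens = torch.randint(0, vocab, (seq,))  # stand-in token ids
    # Shift by one: predict token t+1 from everything up to t.
    loss = F.cross_entropy(logits[:-1], tokens[1:])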

To be clear, I'm not claiming that you can take an LLM, do some RL on it, and suddenly it can do particular tasks. I'm saying that if you train it from scratch using RL, it will be able to do certain well-defined formal tasks.

Idk what you mean about the online learning ability, tbh. The paper uses it in the exact way you specify: it uses RL to play Montezuma's Revenge and gets better on the fly.

Similar to my point about the inference-time RL ability of the AlphaProof LLM. That's why I emphasized that RL is done at inference time: each proof it does is used to make itself better for the next one.

I think you are taking LLM to mean GPT-style models, while I am taking LLM to mean transformers that output text, which can be trained to do any variety of things.


A transformer, regardless of what it is trained to do, is just a pass-through architecture consisting of a fixed number of layers, with no feedback paths and no memory from one input to the next. Most of its limitations (wrt AGI) stem from the architecture. How you train it, and on what, can't change that.
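
(The "pass-through" point, as a sketch: a forward pass is a fixed pipeline of layers, with no loop back and no state surviving between calls. Illustrative Python:)

    def transformer_forward(x, layers):
        # Fixed depth, strictly feed-forward: no recurrence within a
        # pass, and no memory carried over to the next input.
        for layer in layers:
            x = layer(x)
        return x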

Narrow skills like playing chess (Deep Blue), Go, or doing math proofs are impressive in some sense, but not the same as the generality and/or intelligence that are the hallmarks of AGI. Note that AlphaProof, as the name suggests, has more in common with AlphaGo and AlphaFold than with a plain transformer. It's a hybrid neuro-symbolic approach where the real power comes from the search/verification component. Sure, RL can do some impressive things when the right problem presents itself, but it's not a silver bullet for all machine learning problems, and few outside of David Silver think it's going to be the/a way to achieve AGI.


I agree with you that transformers are probably not the architecture of choice. Not sure what that has to do with the viability of RL though.

