This is not only average. This is actual magic.
So let's be real: the SQL is average. The joins are average. The chart is average. And it took us less than five minutes; that is amazing, and that is the entire point.
You did not need a data engineer to model your HubSpot data, or a meeting to agree on whether it should be last-click or first-click or linear or time-decay or whatever.
You needed a query, written fast, on data you already own. Your LLM wrote it. You confirmed it made sense. Your manager got a link.
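For what it's worth, the kind of query being described fits in a few lines. A minimal last-click attribution sketch, assuming a made-up schema (`touches`, `deals`) as a stand-in for exported CRM data; this is not HubSpot's actual export format:

```python
import sqlite3

# Hypothetical stand-in tables for exported CRM data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE touches (contact_id TEXT, channel TEXT, touched_at TEXT);
CREATE TABLE deals   (contact_id TEXT, amount REAL, closed_at TEXT);

INSERT INTO touches VALUES
  ('c1', 'organic', '2024-01-01'),
  ('c1', 'paid',    '2024-01-10'),
  ('c2', 'email',   '2024-02-01');
INSERT INTO deals VALUES
  ('c1', 5000, '2024-01-15'),
  ('c2', 1200, '2024-02-05');
""")

# Last-click: credit each deal's revenue to the most recent
# touch on that contact before the deal closed.
rows = conn.execute("""
SELECT t.channel, SUM(d.amount) AS revenue
FROM deals d
JOIN touches t ON t.contact_id = d.contact_id
WHERE t.touched_at = (
  SELECT MAX(t2.touched_at)
  FROM touches t2
  WHERE t2.contact_id = d.contact_id
    AND t2.touched_at <= d.closed_at
)
GROUP BY t.channel
ORDER BY revenue DESC
""").fetchall()
print(rows)  # [('paid', 5000.0), ('email', 1200.0)]
```

Swapping the correlated `MAX` for a `MIN` gives first-click; linear and time-decay need a weight per touch instead of a single row, but the shape is the same. That is the whole "meeting we didn't need."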
Honestly, average is clearly magic; prove me wrong.
I'll give it a go. This is generated slop, and the poor, factory-made quality of the writing undercuts every aspect of the argument.
Author here; I suppose the... side-eye awkward monkey meme was a bit lost on you; it was written that way on purpose. Funnily enough, everything is slop if you want it to be slop. This, however, was written by my own little hands. Now, I might be a bad writer, but that is indeed another subject.
At that point, asking the model to e.g. note any ambiguities about the task at hand is exactly equivalent to asking it to evaluate any input.
This point is load-bearing for your position, and it is completely wrong.
Prompt P at state S leads to a new state SP'. The "common jumping off point" you describe is effectively useless, because we instantly diverge from it by using different prompts.
And even if it weren't useless for that reason, LLMs don't "query" their "state" in the way that humans reflect on their state of mind.
The idea that hallucinations are somehow less likely because you're asking meta-questions about LLM output is completely without basis.
"Not in the slightest" is an overreach; the paper two levels down from that link doesn't really support the conclusion in the blog post. The paper is much more nuanced.
Are they going to fib to you sometimes? Yes of course, but that doesn't mean there's no value in behavioural metaqueries.
Like most new tech, the discussion tends to polarise into "Best thing evah!" and "Utter shite!" The truth is somewhere in between.
It's nothing like "most new tech".
Most new tech tends to be adopted early by young people and experienced techies. In this case it is mostly the opposite: the teens absolutely hate it, probably because shitty AI content does not inspire the young mind, and the experienced techies see it for what it is. I've never seen "new tech" like this: cheered on by the proverbial average "boomers" (i.e. old people doing "office jobs", not the literal age bracket) and despised by the young folks and by experienced experts of all ages.
Judging from Claude Code and the sheer number of “Make Your Favorite Anime Crush Into An AI” SaaSes on the market, I’d posit that both the young and experienced are quite enthusiastic about the new tech.
No mate, this tech is marketed as superintelligence. A nation of PhDs in a datacenter. Yadda, yadda, yadda. No in-betweens, please. Why is it not delivering after so many years and hundreds of billions in investment?
Name me a new bit of tech that hasn't been hyped beyond reasonable bounds. And yes, this is one of the worst examples. But saying it doesn't have its uses isn't reasonable either.
None was ever hyped like this before. What are you talking about? The Mac was about "it just works" (and it f*ing did). The iPhone was "a phone, an iPod and an Internet access device". Need more? Microsoft Excel: actually more powerful, if you know the tool, than the bullshit machine. C#, the programming language: "Java done right". And it bloody was! What they have in common: none of these techs was hyped beyond reasonable bounds. They were hyped a bit, but not to the level of the bullshit around LLMs. And none of them claimed to do incredible stuff only to underdeliver. After so much money burnt, yes, I want to see that nation of PhDs. I want to see AI "writing all the code" in six months (Anthropic claimed this in January this year). Enough of the bullshit, and of people being told they are stupid for not knowing how to win the lottery, or for comparing lottery systems. Show me the superintelligence or shut the f. up.
Reveal = show something that was hidden previously.
Seems like the appropriate word to use about a source code leak.
The words you proposed are suitable for describing the consequences of a revelation, while no longer containing any hint about their original cause, so using them would have led to a more verbose sentence delivering the same information.
It wasn't hidden previously. It was fairly well-understood.
The CC source doesn't "reveal" a single thing about anything other than Anthropic internals. It says nothing about the industry at large, certainly nothing new.
And this:
The words proposed by you are suitable for describing the consequences of a revelation, while no longer containing any hint about their original cause
doesn't make any sense. There is no "revelation" here. And the word "reveal" contains no connotations whatsoever about the "cause" of a "revelation".
This does what the best speculative fiction does, attempts to stretch and expand your understanding of the real world by presenting a provocative fictional reality.
The author is trying to get you to speculate on the kind of intelligence that would say this about humans.
GPT-2, o1, Opus... we've been here so many times. The reason they do this is that they know it works (and they seem to specifically employ credulous people who are prone to believe AGI is right around the corner). There haven't been significant innovations, and the generated code is still not good, but the hype cycle has to retrigger.
I remember when OpenAI created the first thinking model with o1 and there were all these breathless posts on here hyperventilating about how the model had to be kept secret, how dangerous it was, etc.
Fell-for-it-again award. All thinking does is burn output tokens for accuracy; it's the AI getting high on its own supply. This isn't innovation, but it was supposed to be super AGI. Not serious.
> All thinking does is burn output tokens for accuracy
“All that phenomenon X does is make a tradeoff of Y for Z”
It sounds like you’re indignant about it being called thinking, that’s fine, but surely you can realize that the mechanism you’re criticizing actually works really well?
>I remember when OpenAI created the first thinking model with o1 and there were all these breathless posts on here hyperventilating about how the model had to be kept secret, how dangerous it was, etc.
I've read the same about Llama and Stable Diffusion. AI doomers are, and always have been, dead wrong.
Genuine question - if you don't think the models are improved or that the code is any good, why do you still have a subscription?
You must see some value, or are you in a situation where you're required to test / use it, eg to report on it or required by employer?
(I would disagree about the code, the benefits seem obvious to me. But I'm still curious why others would disagree, especially after actively using them for years.)
The assumption the other person made was that I would only use it for coding. If you look through my other comments today, I suggest they are useful for performing repetitive tasks, e.g. checking lint on PRs. They can also be used for throwaway code, which is very useful.
I don't think the issue is with the model; it is with the implication that AGI is just around the corner and that that is what is required for AI to be useful, which is not accurate. The greyer area is agentic coding, but my opinion (one I didn't always hold) is that these workflows are a complete waste of time. The problem is: if all this is true, then how does the CTO justify spending $1m/month on Anthropic? (I work somewhere this has happened: OpenAI got the earlier contract, then Cursor Teams was added, and now they are adding Anthropic... within 72 hours of the rollout, it was pulled back from non-engineering teams.) I think companies will ask why they need to pay Anthropic to do a job they were doing without Anthropic six months ago.
Also, the code is bad. This is non-obvious to 95% of people who talk about AI online, because they don't work in a team environment or manage legacy applications. If I interview somewhere and they are using an agentic workflow, the codebase will be shit and the company will be unable to deliver. At most companies, the average developer is an idiot, and giving them AI is like giving a monkey an AK-47 (I say this as someone of middling competence; I have been the monkey with the AK many times). You increase the ability to produce output without improving the ability to produce good output. That is the reality of coding in most jobs.
AI isn't good enough to replace a competent human, it is fast enough to make an incompetent human dangerous.
Uhh, the model found actual vulnerabilities in software that people use. Either you believe the vulnerabilities were not found, or that they were not serious enough to warrant a more thoughtful release.
Like think carefully about this. Did they discover AGI? Or did a bunch of investors make a leveraged bet on them "discovering AGI" so they're doing absolutely anything they can to make it seem like this time it's brand new and different.
If we're to believe Anthropic on these claims, we also have to just take it on faith, with absolutely no evidence, that they've made something so incredibly capable and so incredibly powerful that it cannot possibly be given to mere mortals. Conveniently, that's exactly the story that they are selling to investors.
Like do you see the unreliable narrator dynamic here?
I don't see the problem here. How would you have handled it differently? If you released this model as such without any safety concern, the vulnerabilities might be found by bad actors and used for wrong things.
Vulnerabilities were found, probably a few by bad actors, when GPT-4 was released. Every vulnerability found now is probably found with AI assistance at the very least. Should they never have released GPT-4? Should we have believed claims that GPT-4 was too dangerous for mere mortals to access? I believe OpenAI was making similar claims about how GPT-4 was a step function and going to change white-collar work forever when that model was released.
The point is that this whole "the model is too powerful" schtick is a bunch of smoke and mirrors. It serves the valuation.
It's far simpler to believe that they are releasing it step by step: release to trusted third parties first, get the easy vulnerabilities fixed, work on the alignment, and then release to the public.
Or do you not believe that the vulnerabilities found by these agents are serious enough to warrant a staggered release?
On the other hand I've gotten to use opus-4.6 and claude code and the quality is off the charts compared to 2023 when coding agents first hit the scene. And what you're saying is essentially "If they haven't created God, I'm not impressed". You don't think there's some middleground between those two?
Also they just hit a $30B run-rate, I don't think they're that needy for new hype cycles.
Didn't OpenAI say something similar about GPT-3? Too dangerous to open source, and then a few years later they were open-sourcing gpt-oss because a bunch of OSS labs were competing with their top models.
If there's limited hardware but ample cash, it doesn't make sense to sell compute-intensive services to the public while you're still trying to push the frontier of capability.
that's more or less what I'm saying. "Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available", translated from bullshit, means "It would've cost four digits per 1M tokens to run this model without severe quantization, and we think we'll make more money off our hardware with lighter models. Cool benchmarks though, right?"
Think of all the things that took hundreds/thousands/millions of years to develop and mature, which humans have managed to destroy in relatively short order.
Every 50 years we cycle out an entirely new batch of thinking humans. What cognitive legacy is it exactly that you think is going to be self-preserving?
You're talking about the system altering the environment. GP was talking about the system altering itself. The system is a massive self-stabilizing collection of feedback loops. Unlike the static environment[0], it's incredibly hard to intentionally move such a system to a different equilibrium. If it weren't, we'd have solved all the thorny world problems long ago.
--
[0] - Any self-stabilizing system that operates much slower than us - such as ecosystems or climate - is, from our perspective, static.
> The system is a massive self-stabilizing collection of feedback loops.
Source? lol
Actual, measurable literacy is in the toilet. The average person reads at the 6th grade level. What sort of equilibrium are you trying to claim we are in right now?
> Unlike the static environment, it's incredibly hard to intentionally move such system to a different equilibrium.
It's the strongest possible memetic weapon humans would have - I think it's entirely consistent with the meta-nature of the book, especially the self-conscious part.
If the take is that religion is itself the weapon and the depiction given is mere evidence of that, OK; that at least avoids the ending being totally awful. HOWEVER:
The book spends much of its time saying the transcendent cannot even be represented, to people, to us the reader, and then just represents it, and in a tawdry Christian way.
I think the violation of that norm, as well as the ending being played straight, with literally a long paragraph explaining what ideaspace is... that's a fourth-wall break into Christianity imv.
Which makes the whole book read as, "the issue with humans is our physical bodies in a fallen world which are limited. just die, go to heaven, then you can know/represent/understand everything. Yay! Death!"
OK. Just kinda naff.
It reads as if a religious person accidentally wrote a good sci-fi book and then hurriedly, at the end, reminds us all that it's really a parable with a Noble Message: that in Death all things are transcended.
I read the book and at no time did I think "Christianity". It seems like motivated reasoning on your part. At no time did the book ever preach, or was even moralistic.
I'm referring to the ending of the published version, which is quite different from v1 (which ends abruptly), in particular the sections before and after:
> “She steps back from him. She flexes what could be wings.”
> “In ideatic space everything is possible and everything is real and every metaphor is apt. She sees a galaxy of shining points: people, all the people who have ever existed, packed almost densely enough to form a continuum, living and dead, real and fictional and borderline. Similar people, who think in similar ways and who stand for similar things, are closer together. Significant people, the famous and iconic, are brighter. There are stars for inanimate entities, too, and events and abstracts: countries, homes, works of art, births and first steps and words, shocks and dramas, archetypes, numbers and equations, long arcs of stories, grand mythologies, philosophies, politics, tropes. Every truth and lie is here. Ideatic space itself—the human conception of it, at least—is here too, a fixed point embedded inside itself. The idea of the Unknown Organization is here. The idea of Adam Quinn is here. Marie, rising, waking, is here. And occupying the same space as the first brilliant spiral is a second, its counterpart, a galaxy whose points are relationships between the points of the first: what each person means to each other person. Loves, mutual and unrequited; admirations, aspirations, intimidations, fears, and revulsions. Conceptions and misconceptions. There is Adam’s shining link with Marie, and Marie’s link back to Adam. And Marie’s link to the Organization. And at the core of the whole dazzling ecosystem is an ultimate singular point, to which every other point is connected: humanity.
> And the whole thing, the entirety of human ideatic space, is being torn apart. U-3125 hangs above it, a monumental, blinding new presence, a singular entity more massive and luminous than both spirals combined. Its malevolent gravity drags humanity and all human ideas into its orbit, warping them beyond recognition. Beneath it, within its context, everything becomes corrupted into the worst version of itself. It takes joy and turns it into vindictive glee; it takes self-reliance and turns it into solipsistic psychosis; it turns love into smothering assault, pride into humiliation, families into traps, safety into paranoia, peace into discontent. It turns people into people who do not see people as people. And civilizations, ultimately, into abominations.
> U-3125 is titanic in its structure, brain-breaking in its topology. It comes from another part of ideatic space, a place where ideas exist on a scale entirely beyond those of humans. Its wrongness and[…]”
> “She sets a course. Outbound, to the deepest limit of ideatic space.”
Etc. The references to U-3125 incarnating, and it being The Adversary. And the explicit ascension narrative, with Marie getting wings, flying through clouds of Ideas, which are actually animate and incarnated in this world, i.e., they are souls. I mean, it's a terribly misjudged ending.
It is like nails on a chalkboard.