
+1

But there's a fundamental difference between Markov chains and transformers that should be noted. Markov chains only learn how likely it is for one token to follow another. Transformers learn how likely it is for a set of tokens to be seen together. Transformers add a wider context to Markov chains. That quantitative change leads to a qualitative improvement: transformers generate text that is semantically plausible.


Yes, but k-token lookup was already a thing with Markov chains. Transformers are indeed better, but only because they model language distributions better than mostly-empty arrays of (token-count)^(context).
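
To illustrate (a toy sketch, not any particular implementation): the sparse dict below stands in for that mostly-empty (token-count)^(context) array.

    import random
    from collections import defaultdict, Counter

    def train(tokens, k=2):
        # Map each k-token context to counts of whichever token followed it.
        model = defaultdict(Counter)
        for i in range(len(tokens) - k):
            model[tuple(tokens[i:i + k])][tokens[i + k]] += 1
        return model

    def generate(model, seed, n=20, k=2):
        out = list(seed)
        for _ in range(n):
            counts = model.get(tuple(out[-k:]))
            if not counts:
                break  # unseen context: the sparsity problem in action
            nxt, weights = zip(*counts.items())
            out.append(random.choices(nxt, weights=weights)[0])
        return " ".join(out)

    corpus = "the cat sat on the mat and the cat ran".split()
    print(generate(train(corpus), ("the", "cat")))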


> of course we can _in theory_ do error correction

Oh yeah? This is begging the question.


You should back those claims with something more than handwaving.


I am not trying to make him look bad. I simply stated my reasons for hesitancy when it comes to contributing to the language. At any rate, his messages are available on Discord (unless deleted) and the pull requests are out there too, on GitHub.

I do not intend to have a collection of all the times he lost his cool.


Well, with your unsubstantiated claims, you are making him look bad, regardless of your intentions.


Yeah, but it would be even worse if I collected everything about him; that would lean towards obsession.


Could you elaborate on the "hidden URL"?


Just a hard-to-guess one. Like using a UUID or similar.


I'd be way more comfortable with a VPN or some kind of access-gating for a personal calendar compared to just exposing it on the web and relying on obscurity.


Web calendars don't offer authentication. You have to build it into the URL anyway. If a service I use -- let's say my bank's chequing account -- wants to offer a calendar I can subscribe to, I'll be given a URL that looks like https://somebank.com/api/cal?token=abc12345. Anyone who knows that URL can see the calendar as well. No different from my own web app where the URL is https://mysite.com/dev/cal_abc12345.ics.

For a personal calendar, I see no reason to make it any more secure than an obscure URL.
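
For example (a sketch, with hypothetical paths): the entire capability lives in the randomness of the token, so make it long.

    import secrets, uuid

    # ~128 bits of randomness; infeasible to guess or enumerate.
    token = secrets.token_urlsafe(16)
    print(f"https://mysite.com/dev/cal_{token}.ics")

    # A UUIDv4 (~122 random bits) works the same way.
    print(f"https://mysite.com/dev/cal_{uuid.uuid4()}.ics")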


> how do we test for reasoning? if A -> B and B -> C, then something that can reason could conclude A -> C. If I give A -> B and B -> C to an LLM, and ask it about the relationship between A and C, it'll tell me about the transitive property of implication, graph theory, transitivity.

Not true.

An LLM might give you that answer x% of the time, x being a number less than 100. However, any thinking person answering your question will give you the same answer, no matter how many times you ask. That's the fundamental difference between thinking and statistically mapping and reproducing the structure of human language.
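
And the answer here is forced by logic itself; a quick brute-force truth-table check in Python (illustrative only):

    from itertools import product

    def implies(p, q):
        return (not p) or q

    # ((A -> B) and (B -> C)) -> (A -> C) holds in all 8 assignments,
    # so a reasoner has exactly one correct answer to give, every time.
    assert all(
        implies(implies(a, b) and implies(b, c), implies(a, c))
        for a, b, c in product([False, True], repeat=3)
    )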


I'm pretty sure if you set the temperature to 0 it will produce the exact same output every time. It's the sampling that produces the output variation.
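
A toy sketch of that sampling step (simplified; real deployments can still be slightly nondeterministic due to batching and floating-point effects):

    import math, random

    def sample(logits, temperature):
        if temperature == 0:
            # Greedy decoding: always the argmax, hence identical output.
            return max(range(len(logits)), key=logits.__getitem__)
        scaled = [l / temperature for l in logits]
        m = max(scaled)
        exps = [math.exp(s - m) for s in scaled]
        probs = [e / sum(exps) for e in exps]
        return random.choices(range(len(logits)), weights=probs)[0]

    logits = [2.0, 1.5, 0.1]
    print([sample(logits, 0) for _ in range(5)])    # [0, 0, 0, 0, 0]
    print([sample(logits, 1.0) for _ in range(5)])  # varies between runs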


> any thinking person answering your question, will give you the same answer, no matter how many times you ask it

Oh, will they? Will they really?!


Yes, 2 + 2 is always 4 if you're not a language model and know basic arithmetic.


Or if the language model answers the question by writing and running a Python script. Which is exactly what it can do.
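
Something like this, say (a sketch; the numbers are the ones discussed below). The interpreter does the arithmetic, not next-token prediction.

    import math

    a, b = 2342, 33222
    total = a + b             # 35564, computed exactly
    print(total)
    print(math.sqrt(total))   # ~188.584, an exact float sqrt, no estimation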

Never mind that the tendency to give the exact same answer to the same question over time is not the exhibition of reasoning power you seem to think it is. Have you actually asked some people to multiply 10-digit numbers in their heads? Did they always get the same result? No? Well, there goes that argument.

We don't do anything that the LLMs don't do at this point, except adjust our weights (poorly) to move short-term context into long-term memory. Once that capability is added to the models -- which will happen soon enough, because why wouldn't it? -- where will the goalposts go next?


It's not about giving the same answer to the same question. It's about getting the right answer 100% of the time, in some very specific domains. If you know, understand and are able to use the basic rules of arithmetic, 2 + 2 only has one answer. If you know, understand and are able to use the basic rules of formal logic, the same premises will lead you to the same conclusion. Two trivial cases for any reasoning person. Two cases that illustrate how fundamentally different LLM text generation is from reasoning. Two cases that illustrate some of the challenges that need to be solved to bring AI models closer to the fiction so many on this site are desperately taking them to be.

Of course those who don't care about improving those systems also don't care about understanding their limits, which is unsurprisingly the case for a lot of people on this website.


You've failed to explain -- or to understand -- how the models get the right answer at all. The fact is, when you ask what 2+2 is, or what 2342+33222 is, the current ChatGPT model will give you the correct answer, even if you don't tell it to write code to get it. The first answer can simply be regurgitated. The second one, not so much.

Heck, let's throw in a square root for the fun of it: https://i.imgur.com/Q9eHAaI.png

How'd it do that, if it can't reason? That problem wasn't in its training corpus. Similar ones were, with different numbers, and that was enough.

Ask it 100 times, and it will probably get it wrong a certain percentage of the time... just like you would if I asked you to perform the calculation in your head.

Notice that the model actually got the least significant digits slightly wrong in this example. 188.58 would be a better estimate. It even screws up the way we do. That, to me, is almost as interesting as the fact that it can deal with the problem at all.

> Of course those who don't care about improving those systems also don't care about understanding their limits

The people who do care about improving these systems seem to be doing a pretty awesome job.

As for the limits, they frankly don't seem to exist. They certainly aren't where you and your predecessors over the past few years have assured us they are.


So you've never had an encounter at a bar, moderately intoxicated, where you regrettably put the server on the spot because you didn't understand why you were supposed to pay the amount they were telling you to pay?

Cause I have, and I doubt it involved math significantly more complicated than basic integer addition and subtraction. I also use a calculator even for basic, low-value integer math, because, what do you know, in my perfectness I often had issues with numbers not ending up as what they were supposed to be.

There's also the quite accessible history of human calculators and the extensive error-correction strategies they had to employ because they'd keep cocking up calculations.

Come on...


> C++ has been dead and buried for years

LOL, lmao even.


In the Linux kernel


> It's surprising to me that the people most knowledgeable about the models often appear to be the biggest believers - perhaps they're self-interestedly pumping a valuation or are simply obsessed with the idea of building something straight from the science fiction stories they grew up with.

"Believer" really is the most appropriate label here. Altman or Musk lying and pretending they "AGI" right around the corner to pump their stocks is to be expected. The actual knowledgeable making completely irrational claims is simply incomprehensible beyond narcissism and obscurantism.

Interestingly, those who argue against the fiction that current models are reasoning are using reason to make their points. A non-reasoning system generating plausible text is not at all a mystery and can be explained; therefore, it's not sufficient for a system to generate plausible text to qualify as reasoning.

Those who are hyping the emergence of intelligence out of statistical models of written language, on the other hand, rely strictly on the basest empiricism, e.g. "I have an interaction with ChatGPT that proves it's intelligent" or "I put your argument into ChatGPT and here's what it said, isn't that interestingly insightful". But I don't see anyone coming out with any reasoning on how the ability to reason could emerge out of a system predicting text.

There's also a tacit connection made between those language models being large and complex and their supposed intelligence. The human brain is large and complex, and it's the material basis of human intelligence, "therefore expensive large language models with internal behavior completely unexplainable to us must be intelligent".

I don't think it will, but if the release of the DeepSeek models effectively shifts the main focus towards efficiency as opposed to "throwing more GPUs at it", that will also force the field to produce models with the current behavior using only the bare minimum, both in terms of architecture and resources. That would help against some aspects of the mysticism.

The biggest believers are not the best placed to drive the research forward. They are not looking at it critically and trying to understand it. They are using every generated sentence as a confirmation of their preconceptions. If the most knowledgeable are indeed the biggest believers, we are in for a long dark (mystic) AI winter.


> But of course the output contains errors sometimes. So do search engine results.

That's not true.

Search engine results are links and (non-AI-generated) summaries of existing resources on the web. No search engine returns links to resources it generated as the result of the search query. Those resources can have inaccurate information, yes, but the search engine itself does not return errors.

LLM output does not contain errors "sometimes". The output of an LLM is never truthful nor false in itself, in the same way that the next word your keyboard suggests on a mobile device is never truthful nor false. It's simply the next suggestion based on the context.

These two methods of accessing information very clearly do not have the same limitations. A search engine provides links to specific resources. An LLM generates some approximation of some average of some information.

It's up to intelligent, thinking people to decide whether an LLM or a search engine is currently the best way for them to parse through information in search of truth.


Obviously I meant that the content of the results can be inaccurate, and I assume you weren't actually confused about that.


The reality is nobody is hired to decide why things are done. Under capitalism, the why is to make profits, to increase shareholder value, if you will. That's it.

You don't plan what you don't control, you don't control what you don't own. For the vast majority (yes, including engineers), the only thing we own that is a part of the whole process of capitalist production is our ability to do some kind of work. When that ability is sold (for a wage/salary) to whoever possesses the resources to decide the why, we forego the right to decide how it's used. Just like when you sell someone a chair for example, you don't get to tell them where and how they can use that chair.

You can refuse to get or do this or that job, but you need to have a job. At the end of the day, we all need food and shelter to stay alive and the bills aren't gonna pay themselves. So ultimately, someone—among the vast majority of us who don't have a choice—will also have to do the job you refuse. So we shouldn't delude ourselves into thinking we have some sort of "superpower". We don't. That is, not unless we can collectively withhold our ability to work and force the hand of the minority that actually decides why things are done.

A society where those who do the work decide why and how it's done is a society where the working people own the resources. That is communism. That world is possible, but it won't build itself; you have to tirelessly fight for it. Look over at marxist dot com if you want to help out [1].

That doesn't mean that in the meantime we shouldn't strive to find meaningful work that fits with our values. We should, and whoever manages to find that, good for you! You've been lucky (for now), but you won't be forever, and even if you are, the vast majority of people with pretty much the same skills and conditions as you actually won't. We have to transform society if we want to secure that for ourselves and for everyone else.

[1] https://marxist.com/about-us.htm


Don't get distracted by the word communism!


Characterizing "capitalism" as the increase of value to "shareholders" is a mischaracterization of capitalism. It's the only system that has worked, for one singular reason:

People want more value for more work. If I work 12 hours picking potatoes and have to give away 3/4 of them "according to my need", I will refuse to work. If you create fake jobs like "bolt counter" to ensure 100% employment so the means can justify the ends (as communism believes), you end up with a collapsing economic system.

The current iteration of late-stage capitalism is not capitalism _at large_. It's feudalism cleverly disguised as an egalitarian economic system. The solution to this is not communism, it's undoing 100 years of stock-market-oriented business design. We don't know what "late-stage" communism looks like simply because a revolution occurs several hundred years before it reaches that point. Not even the premiere implementers of communism, the Russians, could keep their system.

It doesn't work. In any country, in any system, with any group of people larger than 4.


That's basically the definition of capitalism: that shareholders (capital) spend their money to generate profit for themselves.

Like you note though, we can have a free market and democracy without that dynamic.


In a free market, you aim to increase your market share and push out your competitors. That's how monopolies are formed. Not by some perversion of the "pure" principles of the free market, but as a logical outcome of it. Turn back the clock to whenever you want in the course of capitalism; the same conditions with the same logic will drive you back to the same results.


I agree, but there are historical examples of markets in non-capitalist systems. Graeber's Debt makes a fun point that the two are even at odds because, like you say, capital really wants monopolies, which is the opposite of a free market. So by extension, if you support a free market you must on some level reject capitalism.


I find it interesting and somewhat relieving how many normal people are 'pro capitalism but anti shareholder'. Which means that it's mostly an outreach and organization problem. If you are anti-shareholder you're by definition anti-capitalist, you just don't know it yet.

Which is why the powers that be have to pass things like this: https://www.govtrack.us/congress/bills/118/hr5349/text/eh to keep people confused, and to conflate being opposed to shareholders (capitalism) with support for authoritarianism etc.


> The current iteration of late-stage capitalism

Don't fall for the trap of accepting "late-stage capitalism" as a valid concept. It's a made-up term that has no concrete or meaningful definition and is used by Marxists to constantly move goalposts and impute bad things to our functional (if suboptimal) economic system.




> Every sentence in this reply is a strawman.

Zero evidence or explanation provided, meaning that it's far more likely that GP is making good points that you cannot respond to.

> here's an FAQ [1] that can help clarify what terms like capitalism, communism, feudalism, profit, etc. actually mean

This is a lie. Marxists intentionally repurpose and distort language, especially language around markets and governance structures, to deceive others into falling for their murderous and completely infeasible ideology. These are not what those terms "actually mean" - those are how Marxists use those terms. Claiming that those are what those terms "actually mean" or that that's how other people use them is false and deceptive.

