Absolutely, the technique of "you won't debate me so I must be right" has somehow risen from the playground to mainstream politics, but it's arrant nonsense. Not every idea is worthy of rational and moral consideration, and sometimes it is not weakness to refuse even to entertain a proposition, but simply humanity and a recognition of the underlying motive, which is not always to seek enlightenment, but sometimes to undermine the very idea of enlightenment.

> Not every idea is worthy of rational and moral consideration

Sure they are. If you can't justify your beliefs (or rejection of other beliefs) rationally from some set of principles or axioms, you can't claim that they're valid or that anyone else should abide by (or even respect) them.

Maybe the premises are something that you or others disagree with (e.g. to take one that we probably both disagree with: human value is predicated on intelligence), but if you can't argue from some set of premises, your beliefs are meaningless and invalid (and, almost always, inconsistent and hypocritical).

> sometimes it is not weakness to refuse even to entertain a proposition, but simply humanity and a recognition of the underlying motive

Can you read minds to discern motives? If not, then this is a false assertion.


TIL the word "arrant", thank you!

Indeed, and "what Mosley believed" was pretty well known at the time given his fascist activity over the preceding thirty years. Mosley was not likely to change his mind, and while there may well sometimes be joy and enlightenment in the practice of debate and rhetoric, you don't have to do it with a fascist. Bertrand Russell had nothing to prove and was perfectly reasonable in saying, effectively, that they were never going to agree and there's no point in wasting more paper in proving that.

Way, way off-topic now, but if you ever get a chance to see https://en.wikipedia.org/wiki/A_Disappearing_Number, don't miss it. It's rare to see a play weave mathematics and history into such a form, threading them through our modern world and showing the humanity of those who lived and breathed the equations on the page.

Particularly when, as in the case of UK water privatisation, there's a fairly convenient revolving door between the supposed regulator and the privatised water companies. Poacher turned entirely ineffective and rather friendly gamekeeper...

Nitpick - English water privatisation, I don't think the rest of the UK has private water companies - we certainly don't here in Scotland. Scottish Water is controlled by the Scottish Government.

You're absolutely right, English - my family north of the border are no doubt cursing me as we speak, I'll go and replace the cone as penance next time I'm there.

*England and Wales

It's useful as a term of understanding. It's not useful to OpenAI and their investors, so they'd like that term to mean something else. It's very generous to say that whether an LLM "knows" is irrelevant. They would like us to believe that it can be avoided, and perhaps it can, but they haven't shown they know how to do so yet. We can avoid it, but LLMs cannot, yet.

Yes, we can know whether something is true or false, but this is a system being sold as something useful. If it relies on us knowing whether the output is true or false, there is little point in us asking it a question we clearly already know the answer to.


I mean no disrespect, as I'm no more fond of OpenAI than anyone else (they are still the villains in this space), but I strongly disagree.

> It's useful as a term of understanding.

No it isn't. I dare you to try publishing in this field with that definition. Claiming all outputs are hallucinations because it's a probabilistic model tells us nothing of value about what the model is actually doing. By this definition, literally everything a human says is a hallucination as well. It is only valuable to those who wish to believe that LLMs can never do anything useful, which, as Hinton says, is really starting to sound like an ego-driven religion at this point. Those who follow it no longer publish in top relevant outlets and should not be regarded as experts on the subject.

> they haven't shown they know how to do so yet. We can avoid it, but LLMs cannot, yet.

This is exactly what they argue in the paper. They discuss the logical means by which humans are able to bypass making false statements by saying "I don't know". A model that responds only with a lookup table and an "I don't know" can never give false statements, but is probably not so useful either. There is a sweet spot here, and humans are likely close to it.

> If it relies on us knowing whether the output is true or false

I never said the system relies on it. I said that our definition of hallucination, and therefore our metrics by which to measure it, depend only on our knowing whether the output is true. This is no different from any other benchmark. They are claiming that it might be useful to make a new benchmark for this concept.


But an LLM is not answering "what is truth?". It's "answering" "what does an answer to the question "what is truth?" look like?".

It doesn't need a conceptual understanding of truth - yes, there are far more wrong responses than right ones, but the right ones appear more often in the training data and so the probabilities assigned to the tokens which would make up a "right" one are higher, and thus returned more often.
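A toy sketch of what I mean, just to make the mechanism concrete (the tokens and numbers below are invented, and real decoding samples over a vocabulary of tens of thousands of tokens, but the principle is the same):

```rust
fn main() {
    // Hypothetical next-token distribution after the prompt "The capital of France is".
    // "Truth" never enters into it; the model only ranks continuations by probability,
    // and the "right" answer tends to win because it dominated the training data.
    let next_token_probs = [
        ("Paris", 0.86_f64),
        ("Lyon", 0.09),
        ("Berlin", 0.05), // wrong, but still assigned some probability mass
    ];

    // Greedy decoding: take the most probable continuation.
    let pick = next_token_probs
        .iter()
        .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
        .unwrap();

    println!("next token: {}", pick.0); // usually "right", occasionally not
}
```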

You're anthropomorphizing in using terms like "lying to us" or "know the truth". Yes, it's theoretically possible I suppose that they've secretly obtained some form of emergent consciousness and also decided to hide that fact, but there's no evidence that makes that seem probable - to start from that premise would be very questionable scientifically.

A lot of people seem to be saying we don't understand what it's doing, but I haven't seen any credible proof that we don't. It looks miraculous to the relatively untrained eye - many things do, but just because I might not understand how something works, it doesn't mean nobody does.


Nice to read some common sense in a friendly way. I follow your RSS feed, please keep posting on your blog. Unless you're an AI that has secretly obtained some form of emergent consciousness - in which case, don't.


>But an LLM is not answering "what is truth?". It's "answering" "what does an answer to the question "what is truth?" look like?".

You don't actually know this, right? You said what I'm saying is theoretically possible, so you're contradicting yourself.

>You're anthropomorphizing in using terms like "lying to us" or "know the truth". Yes, it's theoretically possible I suppose that they've secretly obtained some form of emergent consciousness and also decided to hide that fact, but there's no evidence that makes that seem probable - to start from that premise would be very questionable scientifically.

Where did I say it's conscious? You hallucinated here thinking I said something I didn't.

Just because you can lie doesn't mean you're conscious. For example, a sign can lie to you. If the speed limit is 60 but there's a sign that says the speed limit is 100 then the sign is lying. Is the sign conscious? No.

Knowing is a different story, though. But think about this carefully: how would we determine whether a "human" knows anything? We can only tell whether a "human" "knows" things based on what it tells us. Just like an LLM. So, based on what the LLM tells us, it's MORE probable that the LLM "knows", because that's the SAME reasoning by which we can tell a human "knows". There's no other way we can determine whether or not an LLM or a human "knows" anything.

So really I'm not anthropomorphizing anything. You're the one that's falling for that trap. Knowing and lying are not concepts unique to consciousness or humanity. These are neutral concepts that exist beyond what it means to be human. When I say something "knows" or something "lies", I'm saying it from a highly unbiased and neutral perspective. It is your bias that causes you to anthropomorphize these concepts with the hallucination that they are human-centric concepts.

>A lot of people seem to be saying we don't understand what it's doing, but I haven't seen any credible proof that we don't.

Bro. You're out of touch.

https://www.youtube.com/watch?v=qrvK_KuIeJk&t=284s

Hinton, the godfather of modern AI, says we don't understand. It's not just people saying we don't understand; the general understanding within academia is that we don't understand LLMs. So you're wrong. You don't know what you're talking about and you're highly misinformed.


I think your assessment of the academic take on AI is wrong. We have a rather thorough understanding of the how/why of the mechanisms of LLMs, even if after training their results sometimes surprise us.

Additionally, there is a very large body of academic research that digs into how LLMs seem to understand concepts and truths, and, sure enough, examples of us making point edits to models to change the “facts” that they “know”. My favorite of that corpus, though far from the only or most current/advanced research, is the Bau Lab’s work: https://rome.baulab.info/


It’s not about what you think; it’s about who’s factually right or wrong.

You referenced a work on model interpretability, which is essentially the equivalent of putting an MRI or electrodes on the human brain and saying we understand the brain because some portion of it lights up when we show the brain a picture of a cow. There’s lots of work on model interpretability, just like there’s lots of science involving brain scans of the human brain… the problem is that none of this gives insight into how the brain or an LLM works.

In terms of understanding LLMs we overall don’t understand what’s going on. It’s not like I didn’t know about attempts to decode what’s going on in these neural networks… I know all about it, but none of it changes the overall sentiment of: we don’t know how LLMs work.

This is fundamentally different from computers. We know how computers work, such that we can emulate a computer. But with an LLM, we can’t fully control it, we don’t fully understand why it hallucinates, we don’t understand how to fix the hallucinations, and we definitely cannot emulate an LLM in the same way we do a computer. It isn’t just that we don’t understand LLMs. It’s that there isn’t anything else in the history of human invention that we lack such a fundamental understanding of.

By that logic, the facts are unequivocally clear: we don’t understand LLMs, and your statement is wrong.

But it goes beyond this. I’m not just saying this. This is the accepted general sentiment in academia and you can watch that video of Hinton, the godfather of AI in academia basically saying the exact opposite of your claim here. He literally says we don’t understand LLMs.


Here’s where you're clearly wrong. The correct favorite in that corpus is Golden Gate Claude: https://www.anthropic.com/news/golden-gate-claude


Both are very good! I usually default to sharing the Bau Lab's work on this subject rather than Anthropic's because a) it's a little less fraught when sharing with folks who are skeptical of commercial AI companies, and b) Bau's linked research/notebooks/demos/graphics are a lot more accessible to different points on the spectrum between "machine learning academic researcher" and "casual reader"; "Scaling/Towards Monosemanticity" are both massive and, depending on the section, written for pretty extreme ends of the layperson/researcher spectrum.

The Anthropic papers also cover a lot more subjects (e.g. feature splitting, discussion on use in model moderation, activation penalties) than Bau Lab's, which is great, but maybe not when shared as a targeted intro to interpretability/model editing.


Oh, ouch, yeah. We already know that misinformation tends to get amplified, the last thing we need is a starting point full of harmful misinformation. There are lots of "causal beliefs" on the internet that should have no place in any kind of general dataset.


It's even worse than that, because the way they extract the causal link is just a regex, so

"vaccines > autism"

because

"Even though the article was fraudulent and was retracted, 1 in 4 parents still believe vaccines can cause autism."

I think this could be solved much better by using even a modestly powerful LLM to do the causal extraction... The website claims "an estimated extraction precision of 83%", but I doubt this is an even remotely sensible estimate.
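For illustration only - I don't know what their actual extractor looks like, and the pattern below is invented - this is roughly how a naive "X causes Y" regex (here via Rust's regex crate) ends up recording the spurious link, because it can't see the negation, attribution or retraction in the surrounding sentence:

```rust
use regex::Regex; // regex = "1" in Cargo.toml

fn main() {
    let text = "Even though the article was fraudulent and was retracted, \
                1 in 4 parents still believe vaccines can cause autism.";

    // A made-up, naive causal pattern.
    let re = Regex::new(r"(\w+) (?:can |could )?causes? (\w+)").unwrap();

    if let Some(caps) = re.captures(text) {
        // Prints: extracted causal link: vaccines > autism
        println!("extracted causal link: {} > {}", &caps[1], &caps[2]);
    }
}
```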


Plus, regardless of what you might think of how valid that connection is, what they're actually collecting, absent any kind of mechanism, is a set of all apparent correlations...


And, as most of us know, correlation doesn't necessarily equal causation.


You don't have to assume people are going to be bad, but it's reasonable and prudent to expect it from people who have already shown themselves to be so (in this context).

I trust people until they give me cause to do otherwise.


Training on personal data people thought was going to remain private vs. training on stuff out in public view (copyright or not) are two different magnitudes of ethics breach. Opt OUT instead of Opt IN for this is CRAZY in my opinion. I hope that the reddit post is WRONG on that detail, but I seriously doubt it.

I asked Claude: "If a company has a privacy policy and says they will not train on your data and then decides to change the policy in order "to make the models better for everyone." What should the terms be?"

The model suggests in the first paragraph or so EXPLICIT OPT IN. Not Opt OUT


No, nbulka is correct. People should not shrug off and accept things that are wrong just because it's to be expected. It's one of the worst things you can do because as already pointed out, it just normalizes wrong.


It has always surprised me somewhat that there isn't a set of traits covering some kind of `fs`-like surface. It's not a trivial surface, but it's not huge either, and I've also found myself in a position of wanting to have multiple implementations of a filesystem-like structure (not even for the same reasons).

Tricky to make that kind of change to the std lib now, I appreciate, but it seems like an odd gap.
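Something along these lines is roughly the kind of surface I mean - just a sketch, with hypothetical names, not a real std or crate API - where one implementation delegates to `std::fs` and another could sit on an in-memory map of paths to buffers for tests:

```rust
use std::io;
use std::path::Path;

// Hypothetical trait covering a small slice of an fs-like surface.
pub trait Fs {
    type File: io::Read + io::Write;

    fn open(&self, path: &Path) -> io::Result<Self::File>;
    fn create(&self, path: &Path) -> io::Result<Self::File>;
    fn read_to_string(&self, path: &Path) -> io::Result<String>;
    fn create_dir_all(&self, path: &Path) -> io::Result<()>;
    fn remove_file(&self, path: &Path) -> io::Result<()>;
}

// The "real" implementation just delegates to std::fs.
pub struct StdFs;

impl Fs for StdFs {
    type File = std::fs::File;

    fn open(&self, path: &Path) -> io::Result<Self::File> {
        std::fs::File::open(path)
    }
    fn create(&self, path: &Path) -> io::Result<Self::File> {
        std::fs::File::create(path)
    }
    fn read_to_string(&self, path: &Path) -> io::Result<String> {
        std::fs::read_to_string(path)
    }
    fn create_dir_all(&self, path: &Path) -> io::Result<()> {
        std::fs::create_dir_all(path)
    }
    fn remove_file(&self, path: &Path) -> io::Result<()> {
        std::fs::remove_file(path)
    }
}
```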


See David R. Hanson's "A Portable File Directory System" [0][1], for example: a 700 lines long implementation of early UNIX's filesystem API that piggy-backs on some sort of pre-existing (block-oriented) I/O primitives, which means you can do it entirely in-memory, with about another 300 lines of code or so.

I suspect that with OSes becoming much more UNIX-like the demand for such abstraction layers shrank almost to nothing.

[0] https://drh.github.io/documents/pds-spe.pdf

[1] https://drh.github.io/documents/pds.pdf


Having traits in the stdlib would be nice, but there's also the type parameter pollution that results from mocking, which I think is a turn-off too, as TFA says about rsfs.

I have a Rust library to implement the UAPI config spec (a spec that describes which files and directories a service should look for config files in), and initially wanted to test it with filesystem mocks. After making some effort to implement the mock types and traits, plus wrappers around the `<F = StdFs>` types to hide the `<F>` parameter because I didn't want to expose it in the public API, I realized it was much easier to not bother and just create all the directory trees I needed for the tests.
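To make the pollution concrete, a minimal sketch of the shape I ended up with (names hypothetical, much simplified):

```rust
use std::{io, path::Path};

// Stand-in trait just for this sketch.
pub trait Fs {
    fn read_to_string(&self, path: &Path) -> io::Result<String>;
}

pub struct StdFs;

impl Fs for StdFs {
    fn read_to_string(&self, path: &Path) -> io::Result<String> {
        std::fs::read_to_string(path)
    }
}

// Once the loader is generic over the filesystem, <F> leaks into every public
// signature, doc and error message unless it's wrapped away again.
pub struct ConfigLoader<F: Fs = StdFs> {
    fs: F,
}

impl<F: Fs> ConfigLoader<F> {
    pub fn new(fs: F) -> Self {
        ConfigLoader { fs }
    }

    pub fn load(&self, path: &Path) -> io::Result<String> {
        self.fs.read_to_string(path)
    }
}

// Hiding <F> means a newtype wrapper per public type - the extra work that
// turned out not to be worth it compared to real directory trees in tests.
pub struct PublicLoader(ConfigLoader<StdFs>);
```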


Yeah having traits for this in the stdlib would be nice.

You might find Lunchbox [1] interesting. I needed an async virtual filesystem interface for a project a few years ago (and didn't find an existing library that fit my needs) so I built one:

> Lunchbox provides a common interface that can be used to interact with any filesystem (e.g. a local FS, in-memory FS, zip filesystem, etc). This interface closely matches `tokio::fs::` ...

It includes a few traits (`ReadableFileSystem`, `WritableFileSystem`) along with an implementation for local filesystems. I also used those traits to build libraries that enable things like read-only filesystems backed by zip files [2] and remote filesystems over a transport (e.g. TCP, UDS, etc) [3].

[1] https://crates.io/crates/lunchbox

[2] https://crates.io/crates/zipfs

[3] https://github.com/VivekPanyam/carton/tree/main/source/anywh...


Go has a basic FS abstraction in the standard library: https://dev.to/rezmoss/gos-fs-package-modern-file-system-abs...

But the stdlib one is a bit barebones. So people created: https://github.com/spf13/afero


This is one nice thing about Go.

I think I was trying to test something in Rust, and I was surprised by how many people were OK with using real files for unit testing.

It seems like a massive oversight for being able to use rust in a corporate environment.


> It seems like a massive oversight for being able to use rust in a corporate environment.

Why does being in a corporate environment matter?


Permissions are usually stricter than in a hobby project. So you can't just be writing and leaving things all over the file system without the possibility of failure.

Or maybe you're using drives over a network and your tests will now fail randomly because of things outside your control. Things like that.

That's why, when writing tests, you always want to avoid them actually doing IO like that.


Seems like you could just put your test files in a tmpfs, if leaving files lying around is an issue.


You need to mount and unmount filesystems to do that.

Take a look at the library mentioned, afero, and you'll see how nicely it handles working with the file system in tests.

You can have everything in memory, and a whole new fs in each test


The problem is it only tests trivial properties of the API. That might be enough for some things, but it’s not a panacea.


This is something I'm hopeful will fall out accidentally from Zig's new IO interface. If everyone doing IO has the implementation injected, mocks (fault injection, etc) become trivial, along with any other sort of feature you might want like checking S3 if you don't have a sufficiently recent local copy.


Mocking the file system or network seems counterproductive.

Complicated logic can be in pure functions and not be intertwined with IO if it needs to be tested.

Mocking IO seems like it won’t really capture the problems you might encounter in reality anyway.
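For instance, a trivial sketch (not from any particular project) of keeping the logic pure and the IO shell thin, so the interesting part is testable with plain in-memory data:

```rust
use std::{fs, io, path::Path};

/// Pure logic: unit-testable with in-memory strings, no filesystem involved.
fn count_non_empty_lines(contents: &str) -> usize {
    contents.lines().filter(|l| !l.trim().is_empty()).count()
}

/// Thin IO shell: kept so small there's little left worth mocking.
fn count_non_empty_lines_in_file(path: &Path) -> io::Result<usize> {
    Ok(count_non_empty_lines(&fs::read_to_string(path)?))
}

#[test]
fn counts_lines_without_touching_the_disk() {
    assert_eq!(count_non_empty_lines("a\n\n  b\n"), 2);
}
```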


It's not always about mocking (in my cases it hasn't been). Sometimes it is about multiple "real" implementations - a filesystem is itself an abstraction, and a very common one, it seems like it would at least sometimes be useful to be able to leverage that more flexibly.


I have a little system that takes a .git and mounts it as a fuse filesystem. Every commit becomes a directory with a snapshot of the project at that point in time.

You could read the whole .git in at once, and then you'd have an in-memory file-system, if you wanted to.

In any case, I agree with you: it's not about mocking.


Some examples where it would be useful: Exposing a zip file or exe-embedded data as a filesystem, or making an FS backed by 9P, WebDAV or SFTP.


Fault injection is much easier if you can mock IO. And you aren't really testing your software if you're not injecting faults.


My point is, you can give gibberish data to a processing function if it just takes in-memory data instead of being parameterized by “IO”.

If you parameterize everything by IO then you have to mock the IO


It's not just about gibberish data. It's read failures, write failures, open failures, for a variety of reasons including permission errors, disk errors, space errors, etc.

You basically want to test that your code does something sensible with either every possible combination of errors, or with some random subset of combinations of errors. This can be automated and can verify the behaviour of your program in the edge cases. It's actually not usually that helpful to only test "we read corrupt garbage from the file", which is the thing you're describing here.
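A minimal sketch of the kind of injection I mean, using only std (the names here are made up for illustration):

```rust
use std::io::{self, Read};

/// A reader that always fails, standing in for a permission/disk/network error.
struct FailingReader;

impl Read for FailingReader {
    fn read(&mut self, _buf: &mut [u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::PermissionDenied, "injected fault"))
    }
}

/// Example function under test: it should propagate the error, not panic or
/// silently return something half-read.
fn load_config(mut src: impl Read) -> io::Result<String> {
    let mut s = String::new();
    src.read_to_string(&mut s)?;
    Ok(s)
}

#[test]
fn read_failure_is_propagated() {
    assert!(load_config(FailingReader).is_err());
}
```

Swap in readers/writers that fail on the Nth call, or only with certain error kinds, and you can walk through the combinations mechanically.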


Same as mocking databases: yes, they make your tests run faster and more "pure", but you're suddenly not really testing reality anymore.


Sure, but they're not mutually exclusive approaches. Having tests that run in a couple of seconds thanks to low I/O and cheap scaffolding setup/teardown can be a valuable addition to slower, comprehensive system and integration test suites.


And you can spot differences between your understanding of reality (test mockups) and how it behaves on the real filesystem. (In this case.)

