
> The biggest time sink for me is validating answers so not sure I agree on that take.

But you're assuming that it'll always be validated by humans. I'd imagine that most validation (and subsequent processing, especially going forward) will be done by machines.




If that is the way to get quality, sure.

Otherwise I feel that power consumption is a bigger issue than speed, though in this case they are interlinked.


Humans consume a lot of power and resources.


The basic efficiency is pretty high.


How does the next machine/LLM know what’s valid or not? I don’t really understand the idea behind layers of hallucinating LLMs.


By comparison with reality. For the initial LLMs, "reality" was "a training set of text"; when ChatGPT came out, everyone rapidly expanded into RLHF (reinforcement learning from human feedback), and now that there are combined vision and text models, the training and feedback are grounded in a much broader slice of reality than just text.
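
For anyone unfamiliar with the acronym, here's a minimal sketch of the reward-model half of RLHF, assuming PyTorch: a tiny model is fit to pairwise human preferences with a Bradley-Terry style loss, and that learned reward is what the language model later gets tuned against. The vocabulary, the featurize() helper and the preference pairs are made-up toys for illustration, not anything from a real pipeline.

    import torch
    import torch.nn as nn

    vocab = {"good": 0, "bad": 1, "answer": 2, "wrong": 3}

    def featurize(text):
        # crude bag-of-words vector, purely illustrative (a real reward model
        # sits on top of an LLM backbone, not a 4-word vocabulary)
        v = torch.zeros(len(vocab))
        for tok in text.split():
            if tok in vocab:
                v[vocab[tok]] += 1.0
        return v

    reward_model = nn.Linear(len(vocab), 1)
    opt = torch.optim.Adam(reward_model.parameters(), lr=0.1)

    # each pair: (output the human preferred, output the human rejected)
    preferences = [("good answer", "wrong answer"), ("good answer", "bad answer")]

    for _ in range(100):
        for chosen, rejected in preferences:
            r_chosen = reward_model(featurize(chosen))
            r_rejected = reward_model(featurize(rejected))
            # pairwise (Bradley-Terry) loss: push the preferred output's reward higher
            loss = -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()

    # the trained reward model now ranks outputs the way the human feedback did
    print((reward_model(featurize("good answer"))
           > reward_model(featurize("wrong answer"))).item())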


Given that there are more and more AI-generated texts and pictures, that grounding will be pretty unreliable.


Perhaps. But CCTV cameras and smartphones are huge sources of raw content of the real world.

Unless you want to take the argument of Morpheus in The Matrix and ask "what is real?"


So let’s crank up total surveillance for better automatic descriptions of pictures.

We aren’t exchanging freedom for security anymore, which could be reasonable under certain conditions; we just get convenience. Bad deal.


That's one way to do it, but overkill for this specific thing: self-driving cars or robotics, or natural use of smart-[phone|watch|glass|doorbell|fridge], are likely sufficient.

Total surveillance may be necessary for other reasons, like making sure organised crime can't blackmail anyone because the state already knows it all, but it's overkill for AI.


Could you link to a paper or working POC that shows how this “turtles all the way down” solution works?


I don't understand your question.

This isn't turtles all the way down, it's grounded in real world data, and increasingly large varieties of it.


How does the AI know it’s reality and not a fake image or text fed to the system?


I refer you to Wachowski & Wachowski (1999)*, building on previous work including Descartes and A. J. Ayer.

To wit: humans can't either, so that's an unreasonable question.

More formally, the tripartite definition of knowledge is flawed, and everything you think you know runs into the Münchhausen trilemma.

* Genuinely part of my A-level in philosophy


So we get the same flaws as before with higher power consumption.

And because it’s fast and easy, we now get more fakes, scams and disinformation.

That makes AI a lose-lose, not to mention the further negative consequences.


Not if you source your training data from reality.

Are you treating "the internet" as "reality" with this line of questions?

The internet is the map; don't mistake the map for the territory. It's fine as a bootstrap but not as the final result, just as it's OK for a human to start researching a topic on Wikipedia but not to use it as the only source.


Sooner or later someone is going to figure out how to do active training on AI models. It's the holy grail of AI before AGI. This would allow you to do base training on a small set of very high-quality data, and then let the model actively decide what it wants to train on going forward, or let it "forget" what it wants to unlearn.
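
Nobody has cracked that yet, but classic active learning (uncertainty sampling) gives a rough flavour of "the model decides what it trains on next". A hedged toy sketch, assuming scikit-learn and a synthetic data pool; this is the existing idea such a method would generalise, not the hypothetical future technique itself:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X_pool = rng.normal(size=(1000, 5))                     # "unlabeled" pool
    y_pool = (X_pool[:, 0] + X_pool[:, 1] > 0).astype(int)  # oracle we can query

    # small, high-quality seed set, guaranteed to contain both classes
    labeled = list(np.where(y_pool == 0)[0][:5]) + list(np.where(y_pool == 1)[0][:5])
    model = LogisticRegression()

    for _ in range(20):
        model.fit(X_pool[labeled], y_pool[labeled])
        probs = model.predict_proba(X_pool)[:, 1]
        uncertainty = np.abs(probs - 0.5)   # near 0.5 = model is least sure
        # the model "decides" what to learn next: the example it is least certain about
        ranked = [i for i in np.argsort(uncertainty) if i not in labeled]
        labeled.append(ranked[0])

    print("pool accuracy:", model.score(X_pool, y_pool))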


I wasn’t expecting your response to be “the truth is unknowable”, but was hoping for something of more substance to discuss.


Then you need a more precisely framed question.

1. AI can do what we can do, in much the same way we can do it, because it's biologically inspired. Not a perfect copy, but close enough for the general case of this argument.

2. AI can't ever be perfect for the same reasons we can't ever be perfect: it's impossible to become certain of anything in finite time and with finite examples.

3. AI can still reach higher performance than us in specific things (not everything, not yet), because the information-processing speedup going from synapses to transistors is of the same order of magnitude as walking is compared to continental drift. So when sufficient training data exists to overcome the inefficiency of the model, we can make models absorb approximately all of that information.


Does the AI need to know, or the curator of the dataset? If the curator took a camera and walked outside (or let a drone wander around for a while), do you believe this problem would still arise?


And who validates the validation?


The compiler/interpreter is assumed to work in this scenario.
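
As a concrete toy of that assumption: machine-side validation of an LLM's answer where only the interpreter is trusted. Here llm_generated_code is a stand-in string rather than a real model call, and a real setup would sandbox the exec instead of running it directly.

    # accept a generated function only if it passes the test cases
    llm_generated_code = """
    def add(a, b):
        return a + b
    """

    def machine_validate(candidate_source, tests):
        namespace = {}
        exec(candidate_source, namespace)   # the interpreter itself is the trusted base
        fn = namespace["add"]
        return all(fn(*args) == expected for args, expected in tests)

    tests = [((1, 2), 3), ((-1, 1), 0), ((0, 0), 0)]
    print(machine_validate(llm_generated_code, tests))   # True -> accepted, no human needed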



