Wouldn't you say the same thing about most people? Most people suck at verifying truth and reasoning. Even "intelligent" people make mistakes based on their biases.
I think LLMs are at least more receptive to the idea that they may be wrong, and given that, we could have N diverse LLMs argue more peacefully and build a more reliable consensus than N "intelligent" people would.
The difference between a person and a bot is that a person has a stake in the outcome. A bot is like a person who's already put in their two weeks notice and doesn't have to be there to see the outcome of their work.
Even if it was a consensus opinion among all HN users, which hardly seems to be the case, it would have little impact on the other billion plus potential customers…
The issue is that most people, especially when prompted, can provide their level of confidence in the answer or even refuse to provide an answer if they are not sure. LLMs, by default, seem to be extremely confident in their answers, and it's quite hard to get the "confidence" level out of them (if that metric is even applicable to LLMs). That's why they are so good at duping people into believing them after all.
> The issue is that most people, especially when prompted, can provide their level of confidence in the answer or even refuse to provide an answer if they are not sure.
People also pull this figure out of their ass, over- or under-trust themselves, and lie. I'm not sure self-reported confidence is that interesting compared to "showing your work".
How is this a counter argument that LLMs are marketed as having intelligence when it’s more accurate to think of them as predictive models? The fact that humans are also flawed isn’t super relevant to a $200/month LLM purchasing decision.
> Wouldn't you say the same thing about most people? Most people suck at verifying truth and reasoning. Even "intelligent" people make mistakes based on their biases.
I think there's a huge difference because individuals can be reasoned with, convinced they're wrong, and have the ability to verify they're wrong and change their position. If I can convince one person they're wrong about something, they convince others. It has an exponential effect and it's a good way of eliminating common errors.
I don't understand how LLMs will do that. If everyone stops learning and starts relying on LLMs to tell them how to do everything, who will discover the mistakes?
Here's a specific example. I'll pick on LinuxServer since they're big [1], but almost every 'docker-compose.yml' stack you see online will have a database service defined like this:
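(A representative sketch; the image, credentials, and paths are illustrative, not LinuxServer's exact example.)

    services:
      db:
        image: mariadb:latest
        environment:
          - MYSQL_ROOT_PASSWORD=changeme
          - MYSQL_DATABASE=app
        volumes:
          - ./db:/var/lib/mysql
        ports:
          - 3306:3306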
Assuming the database is dedicated to that app, and it typically is, publishing port 3306 for the database isn't necessary and is a bad practice because it unnecessarily exposes it to your entire local network. You don't need to publish it because it's already accessible to other containers in the same stack.
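A sketch of the fix (service and image names are illustrative): drop the 'ports:' entry and let the app reach the database by service name over the stack's default network.

    services:
      db:
        image: mariadb:latest
        environment:
          - MYSQL_ROOT_PASSWORD=changeme
          - MYSQL_DATABASE=app
        volumes:
          - ./db:/var/lib/mysql
        # no 'ports:' entry, so nothing on the LAN can reach 3306
      app:
        image: example/app:latest    # illustrative
        environment:
          - DB_HOST=db               # resolved by Compose's internal DNS
          - DB_PORT=3306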
Another Docker-related example is a Dockerfile that runs 'apt[-get] update' without the '--error-on=any' switch. Pay attention to Docker build files and you'll notice almost no one uses it. Without it, the 'update' command can fail silently, so a transient error that hits 'update' but not the subsequent 'install' can leave you building containers from a stale package index.
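A minimal sketch of the difference (base image and package are illustrative):

    FROM debian:bookworm-slim

    # without --error-on=any, a transient repository failure during 'update'
    # only prints a warning, and 'install' then runs against a stale index;
    # with it, the build fails loudly instead
    RUN apt-get update --error-on=any \
        && apt-get install -y --no-install-recommends curl \
        && rm -rf /var/lib/apt/lists/*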
There are tons of misunderstandings like that which end up being so common that no one realizes they're doing things wrong. For people, I can do something as simple as posting on HN and others can see my suggestion, verify it's correct, and repeat the solution. Eventually, the misconception is corrected and those paying attention know to ignore the mistakes in all of the old internet posts that will never be updated.
How do you convince ChatGPT the above is correct and that it's a million posts on the internet that are wrong?
Wow. I can honestly say I'm surprised it makes that suggestion. That's great!
I don't understand how it gets there though. How does it "know" that's the right thing to suggest when the majority of the online documentation all gets it wrong?
I know how I do it. I read the Docker docs, conclude that publishing that port shouldn't be needed, spin up a test, and verify my theory. AFAIK, ChatGPT isn't running tests to verify assumptions like that, so I wonder how it determines correct from incorrect.
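That test is quick to run (a sketch; it assumes the stack above and that 'ss' and 'nc' are available on the hosts involved):

    # bring the stack up
    docker compose up -d

    # on the Docker host: no listener on 3306 when the port isn't published
    ss -tln | grep 3306

    # from another machine on the LAN: the connection should be refused or time out
    nc -zv <docker-host-ip> 3306

    # meanwhile the app container still reaches the database at db:3306
    # over the Compose network, with no published port required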
I suspect there is a solid corpus of advice online that mentions the exposed-port risk, alongside the flawed examples you mentioned. A narrow enough request will trigger the right response. That's why LLMs still require a basic understanding of what exactly you plan to achieve.
Yeah, most people suck at verifying truth and reasoning. But most information technology employees, above intern level, are highly capable of reasoning and making decisions in their area of expertise.
Try asking an LLM complex questions in your area of expertise. Interview it as if you needed to be confident that it could do your job. You'll quickly find out that it can't do your job, and isn't actually capable of reasoning.