Well, what does this say about the LLM engineers at literally any AI company in existence, if they are delivering AI that is unreliable? Surely they must take responsibility for the quality of their work and not blame it on something else.
I feel like what "unreliable" means depends on how well you understand LLMs. I use them in my professional work, and they're reliable in the sense that I always get tokens back from them; I don't think my local models have failed even once at doing just that. And that is the product being sold.
Some people take that to mean that responses from LLMs are (by human standards) "always correct" and "based on knowledge", but this is a misunderstanding of how LLMs work. They don't know "correct", nor do they have "knowledge"; they have tokens that come after tokens, and that's about it.
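To make the "tokens that come after tokens" point concrete, here's a minimal, hypothetical sketch of the autoregressive loop (the toy_model scoring function is made up for illustration; a real LLM computes those scores with a neural network conditioned on the context):

```python
import numpy as np

# Tiny vocabulary standing in for an LLM's ~100k-token vocabulary.
vocab = ["the", "cat", "sat", "on", "mat", "."]
rng = np.random.default_rng(0)

def toy_model(context):
    # Hypothetical stand-in for the neural network: it just returns a
    # random score (logit) per vocabulary token. A real model conditions
    # these scores on the context, but the output shape is the same.
    return rng.normal(size=len(vocab))

tokens = ["the"]
for _ in range(5):
    logits = toy_model(tokens)
    probs = np.exp(logits) / np.exp(logits).sum()   # softmax over the vocab
    tokens.append(str(rng.choice(vocab, p=probs)))  # sample the next token

# Fluent-looking output, but nothing in the loop checks truth or knowledge.
print(" ".join(tokens))
```

The loop always produces tokens, which is the sense in which the output is "reliable"; whether those tokens are correct is simply not a quantity the loop ever computes.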
> they're reliable in terms of I'm always getting tokens back from them
This is not what you are being sold, though. They are not selling you "tokens". Check their marketing articles and you will not see the word "token" or any synonym in their headings or subheadings. You are being sold these abilities:
- “Generate reports, draft emails, summarize meetings, and complete projects.”
- “Automate repetitive tasks, like converting screenshots or dashboards into presentations … rearranging meetings … updating spreadsheets with new financial data while retaining the same formatting.”
- "Support-type automation: e.g. customer support agents that can summarize incoming messages, detect sentiment, route tickets to the right team."
- "For enterprise workflows: via Gemini Enterprise — allowing firms to connect internal data sources (e.g. CRM, BI, SharePoint, Salesforce, SAP) and build custom AI agents that can: answer complex questions, carry out tasks, iterate deliverables — effectively automating internal processes."
These are taken straight from their websites. The idea that you are JUST being sold tokens is as hilariously fictional as claiming that a company selling you an app is actually just selling you patterns of pixels on your screen.
It’s not “some people”; it’s practically everyone who doesn’t understand how these tools work, and even some people who do.
Lawyers are ruining their careers by citing hallucinated cases. Researchers are writing papers with hallucinated references. Programmers are taking down production systems by not verifying AI-generated code.
Humans were made to do things, not to verify things. Verifying something is 10x harder than doing it right. AI in the hands of humans is a foot rocket launcher.
> It’s not “some people”; it’s practically everyone who doesn’t understand how these tools work, and even some people who do.
Again, that's true for most things. A lot of people are terrible drivers, terrible judges of their own character, and terrible recreational drug users. Does that mean we need to remove all those things that can be misused?
I'd much rather push back on shoddy work no matter the source. I don't care if the citations are from a robot or a human: if they suck, then you suck, because you're presenting this as your work. I don't care if your paralegal actually wrote the document; be responsible for the work you supposedly do.
> Humans were made to do things, not to verify things.
I'm glad you seemingly have some grand idea of what humans were meant to do; I certainly wouldn't claim to, but then I'm also not religious. For me, humans do what humans do, and while we didn't use to mostly sit down and consume so much food and other things, now we do.
>A lot of people are terrible drivers, terrible judge of their own character, and terrible recreational drug users. Does that mean we need to remove all those things that can be misused?
Uhh, yes??? We have completely reshaped our cities so that cars can thrive in them at the expense of people. We have laws and exams and enforcement all to prevent cars from being driven by irresponsible people.
And most drugs are literally illegal! The ones that aren't are highly regulated!
If your argument is that AI is like heroin then I agree, let’s ban it and arrest anyone making it.
People need to be responsible for things they put their name on. End of story. No AI company claims their models are perfect and don’t hallucinate. But paper authors should at least verify every single character they submit.
>No AI company claims their models are perfect and don’t hallucinate
You can't have it both ways. Either AIs are worth billions BECAUSE they can run mostly unsupervised, or they are not. This is exactly like the Autopilot driving system: sold as autonomous, but reality doesn't live up to it.