I still think the main issue behind hallucination is bad AI wrapper tools. The AI should have every available public API, with documentation, preloaded in the context, plus explicit instructions to avoid using any API not mentioned there.
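For illustration, a minimal sketch of that idea, assuming the docs live as local markdown files; the function name and file layout here are hypothetical, not from any particular tool:

```python
from pathlib import Path

def build_system_prompt(doc_dir: str) -> str:
    """Concatenate the public API docs and prepend an instruction
    to stay strictly within the documented surface."""
    docs = "\n\n".join(p.read_text() for p in sorted(Path(doc_dir).glob("*.md")))
    return (
        "You may only call the APIs documented below. "
        "If a task requires anything not documented here, say so "
        "instead of inventing a function.\n\n" + docs
    )
```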
An LLM is like a developer with no internet or docs access who has to write code on paper. Every developer would hallucinate in that environment. It's a miracle that an LLM does so much in such a limited one.
It’s not a miracle, it’s statistics. Once you understand it’s a clever lossy text compression technique, you can see why it appears to do well with boilerplate (CRUD) and common interview coding questions. Any code request requiring any kind of nuance returns the equivalent of the first answer to a Stack Overflow question: kinda, maybe, in the ballpark, but incorrect.
I was using an LLM to help me with a PoC. I wanted to access an API that required an OTP via email. I asked Claude, I believe, to provide me with an initial implementation of the Gmail interfacing, and it worked the first time. That showcases how you can use LLMs in day-to-day activities, in prototyping and in synthesizing first versions of small components.
That's way more advanced than coding interview questions whose solutions could simply have been added to the training data.
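For context, a minimal sketch of what such Gmail interfacing can look like, assuming an IMAP approach with a Gmail app password; the sender address and the 6-digit code format are illustrative assumptions, not details from my actual PoC:

```python
import email
import imaplib
import re

def fetch_latest_otp(user: str, app_password: str, sender: str):
    """Return the most recent 6-digit code emailed by `sender`, or None."""
    with imaplib.IMAP4_SSL("imap.gmail.com") as imap:
        imap.login(user, app_password)  # requires a Gmail app password
        imap.select("INBOX")
        _, data = imap.search(None, f'(FROM "{sender}")')
        ids = data[0].split()
        if not ids:
            return None
        # Fetch the full raw message for the newest match.
        _, msg_data = imap.fetch(ids[-1].decode(), "(RFC822)")
        msg = email.message_from_bytes(msg_data[0][1])
        if msg.is_multipart():
            part = next((p for p in msg.walk()
                         if p.get_content_type() == "text/plain"), None)
            body = part.get_payload(decode=True) if part else b""
        else:
            body = msg.get_payload(decode=True) or b""
        match = re.search(r"\b\d{6}\b", body.decode(errors="ignore"))
        return match.group(0) if match else None
```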
You first need to believe there is value in adding AI to your workflow. Then you need to search for ways to have it add value for you. But you are ultimately the one who understands what that value really is, and who has to put in the effort to make AI valuable.
Vim won't make you a better developer just as much as LLMs won't code for you. But they can both be invaluable if you know how to wield them.
Interfacing with Gmail seems pretty well covered with example code in the docs[0], so I don't see the AI adding much value. And the fiddly bit seems to be configuring the tokens, permissions and various things in the administration console, so how does the AI help with that? Did you give it administrative access to your Google account?
“You need to believe” pretty much says it all. Your example isn’t convincing, because there will be only one correct answer with little variation (the API in question).
I’m sure you’re finding some use for it.
I can’t wait for when the LLM providers start including ads in the answers to help pay back all that VC money currently being burned.
Both Facebook and Google won by being patient before introducing ads. MySpace and Yahoo were both riddled with ads early and lost. It will be interesting to see who blinks first. My money is on Microsoft, who added ads to Solitaire of all things.
If you don't believe computers have value, you will default to writing on paper. That's what I meant by it: you need to believe first that there is something of value to be had before exploring, otherwise you are just shooting aimlessly and seeing what sticks. Maybe that gives you a better understanding of what I meant.