Ultimately it depends on what the model is trained on, what you're using it for, and what error-rate/severity is acceptable.
My main beef here is with the most popular stuff (e.g. ChatGPT): models trained on much of the internet, marketed as good for just about everything, and used by consumers who aren't checking the accuracy except when an answer suggests eating rocks or using glue to keep cheese on pizza.
That leads to a philosophical question: How widespread does dangerous misuse of a tool have to be before we can attribute the "fault" to the behavior/presentation of the tool itself, rather than to the user?
Casting around for a simple example... Perhaps any program with a "delete everything permanently" workflow. I think most of us would agree that a lack of confirmation steps would be a flaw in the tool itself, rather than in how it's being used, even though, yes, ideally the user would have been more careful.
Or perhaps the "tool" of US Social Security numbers, which as integers have a truly small surface-area for interaction. People were told not to piggyback on them for identifying customers--let alone authenticating them--but the resulting mess suggests that maybe "just educate people better" isn't enough to overcome the appeal of misuse.
This is like saying a gun that appears safe but can easily backfire unless used by experts is completely fine: it's not an issue with the gun, the user should just be competent.
Yes, it's technically true, but practically it's extremely disingenuous. LLMs are being marketed as the next-generation research and search tool, and they are superbly powerful in the hands of an expert. An expert who doesn't blindly trust the output.
However, the public is not being educated about this at all, and it might not even be possible to educate them, because people are fundamentally lazy and want to be spoonfed. But GPT is not a tool that can spoonfeed results, because it ends up spoonfeeding you a whole bunch of shit. The shit is coated with enough good-looking and good-smelling stuff that most of the public won't be able to detect it.
It does not appear safe. It clearly says at the bottom that you should check up on important facts.
I have in my kitchen several knives which are sharp and dangerous. They must be sharp and dangerous to be useful - if you demand that I replace them with dull plastic because users might inadvertently hurt themselves, then you are not making the world a safer place, you are making my kitchen significantly more useless.
If you don't want to do this to my physical tools, don't do this to my info tools.
I attempted to respond by extending the knife analogy, but it stops being useful for LLMs pretty quickly, since (A) the danger is pretty obvious to users and (B) the damage is immediate and detectable.
Instead it's more like lead poisoning. Nobody's saying that you need a permit to purchase and own lead, nor that you must surrender the family pewter or old fishing sinkers. However, we should be doing something when it's being marketed as a Miracle Ingredient via colorful paints and cosmetics and the dust and fumes of cheap gasoline.
Ah, because some text saying "cigarettes cause cancer" is all that's needed to educate people about the dangers of smoking and it's not a problem at all if you enjoy it responsibly, right?
I'm talking about the industry and a surrounding crowd of breathless sycophants who hail them as the second coming of Christ. I'm talking about malign comments like "Our AI is so good we can't release the weights because they are too dangerous in the wrong hands".
Let's not pretend that there's a strong and concerted effort to educate the public about the dangers and shortcomings of LLMs. There's too much money to be made.
> it’s been a massive boost to my productivity, creativity and ability to learn
What are concrete examples of the boosts to your productivity, creativity, and ability to learn? It seems to me that when you outsource your thinking to ChatGPT you'll be doing less of all three.
I used to use GPT for asking really specific questions that I can't quite search on Google, but I stopped using it when I realized it presented some of the information in a really misleading way. So now I have nothing.
Not OP, but it helped me generate a story for a D&D character, since I'm new to the game, not creative enough, and generally don't really care about back story. But regardless, I think AI causes far more harm than good.
Exactly this for me as well - think people really underestimate how fast it allows you to iterate through prototyping. It's not outsourcing your thinking, it's more that it can generate a lot of the basics for you so you can infer the missing parts and tweak to suit your needs.
I mainly use it for taking text input and doing stuff that's easy to describe but hard to script for. Feed it some articles and ask for a very specific and obscure bibliography format? Great! Change up the style or the wording? Fine. Don't ask it for data or facts.
It's not so much the understanding of it; it's putting together a decent summary of the issues involved so that I can make a reasonable judgement and do further research on what to do next.
Don't get me wrong, it's not replacing expertise on important legal matters, but it really helps in getting started on solutions, or providing direction towards them.
On the simpler stuff, it's still useful. Drafting first templates, etc.
Doing the same in Google would take 30 minutes instead of 1 minute with AI.
AI first, Google for focused search, Meat expertise third
How do you know what your lawyer is saying isn’t incorrect? It’s not like people are infallible. You question, get a second opinion, verify things yourself etc.
People aren't infallible, but in my experience they're much less likely to give me incorrect factual information than LLMs. Sometimes lawyers are wrong, of course, but they are wrong less frequently and less randomly. I've typically been able to get away with not verifying every single thing someone else tells me, but I don't think I'd be that lucky relying on ChatGPT for everything.
Edit: and it's a good thing, too, because I'd never be able to afford getting second legal opinions and I don't have time to verify everything my lawyer tells me.
Ideas and keywords to begin learning about a brand new topic. Primers on those topics.
Product reviews and comparisons
Picking the right tool for a job. Sometimes I don’t even know if something exists for the job till chatgpt tells me.
Identifying really specific buttons on obscure machines
Identifying plants, insects, caterpillars etc.
Honestly the list is endless. Those were just a handful of queries over the last 3 days. It is pretty much the only thing that can answer hyper specific questions and provide backing sources. If you don’t like the sources you can ask for more reliable ones.
Won't it generate tests that prove the correctness of the code instead of the correctness of the application?
As in: if my code is doing something wrong and I ask it to write tests for it, it will supply tests that pass on the wrong code instead of finding the problem in my code?
I use it for the same and usually have to ask it to infer the functionality from the interfaces and class/function descriptions. I then usually have to review the tests for correctness. It's not perfect but it's great for building a 60% outline.
At our company I have to switch between 6 or 7 different languages pretty regularly, and I'm always forgetting the specifics of how the test frameworks work; having a tool that can translate "intent to test" into the framework methods really has been a boon.
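To make that concrete, here's a rough sketch of the kind of thing I mean, in Python/pytest (the slugify function and its docstring are invented for illustration): the tests are derived from the described intent, not from whatever the implementation happens to do, which is also what I look for when reviewing generated tests.

    # Hypothetical example: a function plus the kind of tests an LLM might
    # produce from its signature and docstring alone. The names are made up.
    def slugify(title: str) -> str:
        """Lowercase the title and join whitespace-separated words with dashes."""
        return "-".join(title.lower().split())

    # The tests check the *described* behavior, so they would still catch a
    # version that forgot to lowercase or left extra whitespace in place.
    def test_slugify_lowercases_and_dashes():
        assert slugify("Hello World") == "hello-world"

    def test_slugify_collapses_extra_whitespace():
        assert slugify("  Hello   World ") == "hello-world"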
Any time someone says LLMs have been a massive boost to their productivity, I have to assume that they are terrible at their job, and are using it to produce a higher volume of even more terrible work.
Those replies are a dime a dozen. Unless they’re poignant, well thought out discussions on specific failures, they’re usually from folks that have an axe to grind against LLMs or are fearful that they will be replaced.
Aye. Every time I've tried to use ChatGPT for some moderate to advanced Python scripting, it has failed at something.
For the most part the code is alright... but then it references libraries that are deprecated, or wrong, or weren't included for some reason. Example:
One time I was pulling some sample financial data from Quandl and asked it why the code wasn't working right -- it mentioned that I was referencing a FED dataset that was gone. And that was true; it was old code that I had pulled out of a previous project. So I asked it for a new target dataset... and it gave me another old one.
Okay, fine, this time find me a new one -- again, it was wrong. It didn't take long to figure that out, so I decided to find my own.
I found one and sent it back to the AI... and it mangled the API key variable. An easy fix, but again, it still didn't work.
The goal was to get this done quickly -- just some sample data to test a pipeline -- but in practice it required help at every step, and I probably could have written it from scratch on my own in roughly the same time.
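For reference, the script was roughly this shape (a minimal sketch using the quandl package; the dataset code below is just a placeholder, and part of the problem was that the codes ChatGPT suggested no longer existed):

    import os
    import quandl

    # Keep the key in an environment variable so nothing gets mangled when
    # pasting code back and forth with the AI.
    quandl.ApiConfig.api_key = os.environ["QUANDL_API_KEY"]

    # Pull a small sample series to feed the pipeline. "FRED/GDP" is only a
    # placeholder dataset code and may itself be retired by now.
    df = quandl.get("FRED/GDP", start_date="2015-01-01")
    print(df.head())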
I normally ask for pointers to sources and documentation. ChatGPT does a decent job; Claude is much better in my experience.
Often when starting down a new path we don't know what questions we should be asking, so asking a search engine is near impossible and asking colleagues is frustrating for both parties. Once I've got a summarised overview it's much easier to find the right books to read and people to ask to fill in the gaps.
If you think coding is slinging strings that make the compiler do what you want, I pity the fool that has to work alongside or after you on code projects.
You always check multiple sources, like I've been doing with all my Google searches previously. Anecdotally, having checked my sources, it's been right the vast majority of the time.
I used it for learning biology, e.g. going down from the human body's outer layers to lower layers (e.g. from organs to cells) to understand the inner workings. It's possible to verify everything elsewhere on the Internet. The problem is finding initial material that presents things in this specific order for you.
Yeah, but ChatGPT is much more dynamic. I learn better when I follow my interests. E.g. I'm shown a piece of info, questions pop up in my mind that I want answered before I can move on, and it can go down a rabbit hole with me.
That actually was a problem for me in school: even for subjects I was interested in, I had trouble following the exact order, so I'd start thinking about something else and get no answers.
It has made studying or learning about new things so much more fun.
This argument is specious and boring: everything an LLM outputs is "hallucinated" - just like with us. I'm not about to throw you out or even think less of you for making this mistake, though; it's just a mistake.
QAnon folks, for example, are biological models that are trained on propaganda and misinformation.
Trauma victims are models trained on maladaptive environments that therapists take YEARS to fine-tune.
Physicians are models trained on a corpus of some of the best training sets we have available, and they still manage to hallucinate misdiagnoses at a staggering rate.
I don't know why everyone here seems to think human brains are some collection of Magical Jesus Boxes that don't REGULARLY and CATASTROPHICALLY hallucinate outputs.