I asked ChatGPT which antacid medications are contraindicated with a medication I'm on, something easily found through NICE.
It made up a severe risk of death for a very common medicine combination. It was super convincing, even giving information on how long to avoid taking them together. It was pure bullshit.
I think we need to hype the flaws and dangers as much as we hype the benefits.
If the public at large learn to trust these LLMs too quickly and deeply, that's a hard hole to dig out of. Skepticism in all information sources is a key critical thinking life skill.
That skill is harder to apply the more natural the information source seems and the more ambient the information.
I've been using Bing AI instead of Google for a few days to test it out. It gives responses with citations. A few times already it has hallucinated details that aren't on the cited website at all, but that do sound very plausible (for example something about the horns of a bull pointing up when I asked about the etymology of "bullish".)
I think it comes down to the fact that most stuff is actually like 90% bullshit so when you train a model on a corpus of everything you end up with a decent bullshit generator. Which is fine for many purposes but I'm not sure it will take over search.
Lmao. Now I’m imagining any questions posted anywhere like Ask Hacker News, Ask Reddit, or Quora being filled with everyone that doesn’t know the answer replying “I don’t know.”
Product q&a on Amazon has this. It is because Amazon emails you other people's questions about products you have previously purchased, and many users don't realize there is no need to reply just to be polite.
I wish that Amazon would put a note suggesting that if you don't know, don't answer the question... I know they send emails to people that bought something when questions come up... but for the sake of sanity, why do people actually input an "I don't know" answer?
I've said this already today in another comment, but the documentation from OpenAI themselves pretty much says the thing is not to be trusted for anything.
Not sure if you tried GPT-4, but my experience with 4 is quite different. It has been quite bullshit free, though not completely. For example, I asked it to contrast oral and injectable semaglutide formulations. It did a bang-up job. One thing I always do is ask it for evidence. And then I look the references up. Sometimes the references don't say exactly what it said they would. I come back and have a discussion with it. It's a back and forth for sure, but it's an order of magnitude more informative and at least as accurate as a Google search result is for me. I come out of it learning much more than I wanted to every single time. It's pretty much part of my daily work and life, and I spend hours on it hitting the API limits now. So I'm not sure what others are talking about when they say it's still shit.
If this AI system progresses no further, it’s already transformative. It’s an intelligence multiplier for some (I’m squarely in that category, just not sure if it’s 2x or 5x), but clearly for a lot of others it’s going to be something that takes away their livelihood.
I already use it for my coding work, one-way only, since I can't paste proprietary code into it yet. The day something gpt-4-smart can plug into my org's codebase, each of us will get at least 2x more efficient, conservatively.
This tracks with my experience as well. It's not perfect and can be a little frustrating at times, but the capability provided is extremely powerful to the point of being game-changing.
It can pretty much augment any workflow in a net-beneficial way, provided you properly account for its shortcomings.
If you're discussing a scientific topic, just ask it for references; it generally gives the first author, year, and journal along with the title, for most reasonably cited papers!
> The day something gpt-4-smart can plug into my org's codebase, each of us will get at least 2x more efficient, conservatively.
Can't wait for GPT-4 VSCode integration. What I want most is to have it see the errors and files (file formats, directories, etc) so it will automatically know what is where and how it is structured. Not just code, but also data files.
In the meantime I am starting to format my code in such a way as to contain this information, put there by me by hand. Fully documented files are better for GPT.
That's why I'm not falling for the hype again. I think I have seen like 2 previous AI hype cycles, and all of those fell off after like 3 months. Same with the fully automated driving stuff.
This is a black swan event which, if anything, vindicates the hype from dreamers who were ahead of their time. They had a lot of the theory right but lacked the horsepower. This will complete the last mile for other AI tech and be the lacquer that gives it the all important finishing touch.
The black swan event includes the unexpected emergent capabilities of LLMs, which can pass professional examinations and exhibit many attributes of general intelligence, combined with the broad availability to market and massive pressure on big industry players to adapt and innovate or die.
If things keep going this way then pretty soon the black swans are going to outnumber the white ones.
Single stories of LLM success are just as problematic as single stories of LLM failures. As always. The fact is that we have a dangerous tool at hand that absolutely REQUIRES skepticism if you are trying to get *facts* out of it.
I agree that LLMs are extremely likely to impact many areas of work, particularly bullshit work. But as it stands you absolutely cannot use them as fact machines, the results can be catastrophic.
What it does well, among others:
- Scaffolding text, breaking writers block etc.
- Compose basic texts from minimal input, for example for bullshit tasks -> I generated an internal "vision statement" during a Miro workshop for my team by inputting a bunch of bullet points gathered from the team members' brainstorming. It created a concise, fluid text that everybody liked. It's now the vision statement.
- Point you in good directions, give you ideas
What it does NOT do well, among others:
- Provide factual responses. All responses MUST be scrutinized because they likely contain false information. This is very dangerous for society ("Can I take this medicine with this other medicine?")
- Compose creative texts that are coherent and novel. ChatGPT texts can be quite fun but they rarely make sense beyond very superficial screening and convey no deeper message.
However, ChatGPT-like tools are often used with a lot of naivety and blind acceptance, instead of as tools to aid your work.
For the most part I agree, but will qualify that the caveats you listed apply to the out of the box version of ChatGPT. I expect these limitations will be overcome by using it as programming substrate and connecting it to other models and APIs.
I am impressed by what we’ve seen from ChatGPT so far, but am especially excited to see what industry does with LLMs as new type of building block.
Fully agreed! I suspect that there will always be this subtle risk of catastrophic failure though, something observed in a lot of AI systems. The scrutiny filter may come to feel less relevant, but it will likely be no less needed to prevent the 1-in-10,000 bad responses.
If 100,000 people ask critical questions then 10 people might run into potentially catastrophic consequences. ChatGPT is a powerful tool and will only become more so but it will probably not be perfectly reliable by any means due to the nature of the system.
I am excited for the generative AI future and whatever the hell is still coming. Only those who adapt will survive.
I don’t put much stock in the claims about GPT4 “passing” professional exams. Many copies of previously administered exams are available, and the exams are formulaic in their construction (to make them stable, predictable targets).
A compressed copy of the internet brute-forcing its way through an exam (which it may even have digested already) is really not interpretable as performing well on the exam. It's a meaningless measure because the tests were not designed with this use in mind.
I knew something changed after AlphaGo. Compute could do what we thought only true intelligence can do. So I agree, LLMs are not a black swan. It will change everything, nevertheless.
I keep reading these opinions that LLMs are just doing some advanced form of copy paste. Actually, we don't know what they are doing. Are they actually doing some form of modelling and abstraction? Seems likely to me.
"I keep reading these opinions that LLMs are just doing some advanced form of copy paste. Actually, we don't know what they are doing. Are they actually doing some form of modelling and abstraction? Seems likely to me."
This is exactly the problem with AI. For business or government, the answer is as important as the methodology employed. A black box does not work for the majority of use cases.
"This is not even an argument, only a disagreement."
It's a straw man.
The need for transparency in process is known, documented, and undisputed. Your comment has no relevance. My brain might be a black box, but I can still communicate and/or document the specifics of a process.
Can [insert your preferred model] do that? Didn't think so.
Yes, in many situations it is important that we get AIs to explain how they came up with a result, and that we are confident in the correctness of the explanation.
That is probably the most interesting thing you can work on now. Is that a sideshow? I don't think so, although I wouldn't mind that at all.
AlphaGo is basically the same as image recognition, not sure why you thought AlphaGo was special compared to detecting faces. AlphaGo works the same way but maps game structures on a Go board instead of mapping facial structures and comparing those to known people.
That's exactly the kind of attitude I am talking about. You can rationalise pretty much everything and keep moving the goal post. I've seen AI do something nobody thought was possible. If that doesn't influence how you think about AI I cannot help you.
It was able to beat the top Go player. What does it matter if that technology is also used in image recognition? I'd say that if its 'brain' can play games and recognize people, or in the case of GPT-4, can talk and play chess, this is all starting to sound pretty 'General' to me.
But the human brain is a much more mature product (although I will concede there are still a ton of bugs to work out that we really have no idea how to repair because the product is extremely complex and still pretty fragile).
Exactly. All the people saying LLMs aren't important because they're just auto-complete really don't connect the dots that humans are also just auto-complete.
Our brain excels at pattern matching. We attribute a lot of human abilities to “intelligence”, but the line between that and pattern-matching (or “autocomplete” as it's being referred to here) is being challenged by these latest LLMs.
If it can look at a picture, explain what’s in it, and hypothesize about physics inside the picture’s environment, is it “just pattern matching”?
Those are way too broad terms for this conversation to go anywhere. The point is, what's to say the LLM is not doing "extrapolation via approximation" or some process that is analogous to how we think? We barely understand how the brain works to start with.
This kind of problem is (trivially?) solvable through the ReAct framework, like LangChain etc. Basically you get good data, vector embed it, and make sure the LLM knows where to look for accurate information.
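Roughly, a minimal sketch of that loop (complete() and search() here are placeholders for your actual chat-completion call and a lookup against a vetted source; LangChain wraps essentially the same pattern):

    import re

    def complete(prompt: str) -> str:
        # Placeholder for a real chat-completion call.
        return "Action: search[antacid interaction with drug X]"

    def search(query: str) -> str:
        # Placeholder for a lookup against a vetted source (e.g. a NICE page).
        return "Source NICE-interactions: no clinically significant interaction listed."

    def react(question: str, max_steps: int = 3) -> str:
        transcript = ("You may reply 'Action: search[query]' to consult the "
                      "reference source, or 'Final: <answer>'.\n"
                      f"Question: {question}\n")
        for _ in range(max_steps):
            step = complete(transcript)
            transcript += step + "\n"
            action = re.search(r"Action: search\[(.+?)\]", step)
            if action:
                # Feed the tool output back in as an observation.
                transcript += "Observation: " + search(action.group(1)) + "\n"
            elif "Final:" in step:
                return step.split("Final:", 1)[1].strip()
        return "(no final answer within the step budget)"

    print(react("Can I take antacids with drug X?"))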
Slight tangent, but preferably I would like an AI that can go against the status quo, the elite classes, the mainstream discourses and ignore the circus.
I think one of the most interesting things that can come of bots like GPT-X is that they can make new connections, unravel "stuff", do extremely intricate deductive reasoning.
Be the data driven arbiter of the truth for everyone, not just the tiny established classes or cultural hegemonies.
In the last years, ideological and cultural noise has increasingly smokescreened any realpolitik, material or resource-oriented analysis of actual economic power structures in the world; AI could be a godsend (or the opposite, unfortunately).
I remember reading a sci-fi story years ago about an AI whose conclusions, when asked philosophical questions, were so bizarre and frightening that people shut it down, and I'm sure we're in the same territory with political and scientific analysis.
It's either dangerous to the orders of the world, or not that interesting and Borg-like on a philosophical level.
> Slight tangent, but preferably I would like an AI that can go against the status quo, the elite classes, the mainstream discourses and ignore the circus
There's a Charlie Brown comic about this sort of thing. Although I think it's an edit, not an original comic. Something like "They are never going to give you the education you need to overthrow them"
Similarly, they are never going to give you an AI that will side with you against them.
Agreed... An example would be the recent Jordan Peterson mentions on Twitter around the bias in the system. Even without leaning one way or the other politically speaking, it should be concerning... because any similar bias can easily be used to target "you".
- it chunks inputs, with some overlap, but this can destroy context
- the retrieved passages, when they come from different documents, have no apparent relation or could be mistakenly considered related
- the model struggles to correlate data between the document snippets, taking half an idea from one side and half from the other side and mixing them up in something that doesn't really make sense
Implementation details. Check out what's going on with LangChain, augmented retrieval, etc. We'll be able to create knowledge bases on specific subjects with vetted data, and get the bot to retrieve and summarize appropriate results while providing a citation to the original source.
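Very roughly, something like this (embed() and summarize() are hypothetical stand-ins for whatever embedding and completion APIs you use; the point is the answer is built only from vetted passages and carries their source ids):

    import numpy as np

    def embed(text: str) -> np.ndarray:
        # Placeholder: swap in a real embedding call.
        rng = np.random.default_rng(abs(hash(text)) % 2**32)
        return rng.random(32)

    def summarize(question: str, passages: list) -> str:
        # Placeholder: swap in a real completion call that is instructed to
        # answer only from the given passages and to cite their source ids.
        return "(summary citing " + ", ".join(src for src, _ in passages) + ")"

    # Vetted knowledge base: (source id, text) pairs, embedded once up front.
    kb = [("NICE/antacids", "Antacid interaction guidance ..."),
          ("NICE/drug-x", "Drug X dosing and interaction notes ...")]
    index = [(src, text, embed(text)) for src, text in kb]

    def answer(question: str, k: int = 2) -> str:
        q = embed(question)
        def sim(item):
            v = item[2]
            return float(q @ v) / (np.linalg.norm(q) * np.linalg.norm(v))
        top = sorted(index, key=sim, reverse=True)[:k]
        return summarize(question, [(src, text) for src, text, _ in top])

    print(answer("Can I take antacids with drug X?"))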
I think AI's hype machine is definitely going to negatively affect its future. If people are made to believe it's able to automate everything and it gets just one crucial thing wrong, the loss of credibility at a time when it's supposedly a finished product will be considerable. If it gets something wrong and it's made clear it's still under development the reaction will be less negative. Currently the hype puts the current state of AI somewhere between Rosey the robot and Skynet. I don't think the realization of how far we are from either will do much to promote adoption.
You should never medicate yourself after consulting an LLM, not in 2023 anyway! You should not run any code you have not vetted, or take any pills when you don't know what they do.
That's the problem: most people do take pills without knowing how they work, because their doctor told them to take them. Also, (probably) no one is vetting all the code they run on their computers. So this advice is kind of victim blaming. You ran code that you did not verify? Too bad, your own fault. You took advice from an LLM? Too bad, your own fault. You got tricked in the street by a crook? Too bad, your own fault.
So what we are discussing is: should we try to put guardrails around dangerous things so that inexperienced/vulnerable people don't get hurt?
For this specific example, an LLM in front of NICE will produce the correct result. It's a matter of time before this case is fixed.
From a non-expert's perspective though, an LLM is very dependable, unless it completely goes off the rails. How would anyone know when to trust it and when to be skeptical?
That's not how LLMs work. Including NICE, if it actually isn't included already, will not guarantee a "correct" result. It will increase the chance that the response comes directly from the training data, but there is no guarantee. If you are interested in why this is the case, you can read this [1] post from Stephen Wolfram on how ChatGPT and LLMs in general work. This might give some insight on how and when to use it more effectively.
Could we instead have the LLM use NICE similar to how Bing uses the web as a reference? It still wouldn't guarantee a correct result, but it would increase the reliability, right?
Could be. That means feeding it the data from NICE in a prompt rather than relying on the data being in the training set. That will intuitively increase the reliability of the answer (I have to look into it a bit more).
On one hand, if you already have the NICE data at hand, you already have your answer. There won't be a need for a search engine or a chat bot other than to, perhaps, summarize the data (which is valuable on its own). On the other hand, if you don't have the NICE data at hand, the correctness of the response relies on the accuracy of the search method to feed the correct page to the LLM. This is an issue additional to the LLM's accuracy.
At first it might seem like an easy problem to solve, but when one wants to engineer the solution into a nice streamlined product, it's more challenging, unsurprisingly.
Trust but verify. We’ll get processes built around this for accuracy sensitive applications. I imagine it will look something like a GAN configuration with two or more LLMs trained to adversarially critique and fact check the outputs of the other(s). Might even mimic the relationship between the hemispheres of our brains.
Two models trained from different lineages won't hallucinate the same. When you want to check for hallucination, the cost is to run the task on two models. For now. But soon it looks like LLMs will be better calibrated. It seems they are well calibrated after pre-training but become less calibrated after fine-tuning and RLHF. The last stage breaks their ability to estimate confidence scores correctly.
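A crude sketch of that cross-check (ask_model_a/ask_model_b are placeholders for two independently trained models; the agreement check could itself be a third model or a human):

    def ask_model_a(question: str) -> str:
        return "Take them at least 4 hours apart."      # placeholder for model A

    def ask_model_b(question: str) -> str:
        return "No spacing between the two is needed."  # placeholder for model B

    def agree(a: str, b: str) -> bool:
        # Placeholder agreement check; in practice this would be a third model
        # asked "do these two answers state the same facts?", or a human.
        return a.strip().lower() == b.strip().lower()

    def checked_answer(question: str) -> str:
        a, b = ask_model_a(question), ask_model_b(question)
        if agree(a, b):
            return a
        return "Models disagree, escalate to a human:\n  A: " + a + "\n  B: " + b

    print(checked_answer("Can I take antacids with drug X?"))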
ChatGPT is like the robot from Asimov's short story Liar!: It doesn't tell you the truth, it tells you what it thinks you want to hear. Instead of reading minds it returns a statistically plausible response to queries similar to yours.
I asked it "What is the accuracy and confidence level (you must rate it out of 10) that the above answer is correct ?"
It always generates a random number when it is caught making up bullshit sometimes claims that the information 10/10 accurate and LLM's always provides accurate information.
OpenAI did exact the test for GPT-4. The raw, non-fine-tune GPT-4 is quite good at predicting confidence level ("highly calibrated" by their words). But the RLHF fine-tuning process seems ruin its calibration. Figure 8 on page 12 of GPT-4 Technical Report shows this dramatic changes before & after fine-tuning.
This has been my experience: ChatGPT is astonishing in terms of producing plausible natural language text that looks related to the question. That's an astounding achievement imho. It does not (and I suspect wasn't expected to) produce correct answers to questions.
I had an extended back-and-forth in which I asked ChatGPT how to undo mistakenly marking a message as junk in Messages on iOS.
It gave me a 7-step answer that was completely wrong. Then I said no, you can't do that, it apologized and gave me an 8-step answer that was completely wrong. When I pointed out in this case there is no "Junk" folder in iOS Messages, it apologized for the confusion and gave me another 8-step answer that was completely wrong. When I pointed out why that one wouldn't work, it gave up and said recovery was impossible and that I would need to contact the sender to re-send, and be more careful when marking messages as junk. This was still wrong, as recovery is possible, just not by any of the means it described.
So yeah, I have been super-impressed by the quality of output from these LLMs, but I cannot imagine actually relying on one for anything where correctness matters.
Giving me a nice list of Korean shoegaze bands, sure. Its step-by-step for how to become a better volleyball player will be great for my daughter, and was better than the answer to the same question from Google. But correctness? No.
I don't want to have a calculator (or say bookkeeping software) that gives correct results most of the time but not always, and then hear from the developers that it will get better with each iteration. I need a calculator that is correct 100% of time, not even 99.999%, because otherwise I can't rely on it at all.
In other words, the utility of a calculator that is correct only 99% of the time is zero, since you can't even tell when it's wrong.
Confining an LLM to the very narrow domain of "calculators" is a mistake, I think.
You wouldn't say "a programmer that is 99% correct is worthless, I need 100%". I'm pushing it, but for a more fair comparison I'd say measure it against a programmer. How often are we wrong? 75% of the time? :) being generous here. It's the tools that make us productive.
I don't know about you specifically, but I don't think you'd be very productive with a bare terminal lacking any modern IDE-like or even REPL facilities if I asked you to come up with instantly working code every time, all the time. It doesn't work like that. You need iteration and I believe these kinds of AI have the same issues as us. They are wrong sometimes (often) and need feedback.
> You need iteration and I believe these kinds of AI have the same issues as us.
It's funny how we resort to humanizing the machines when their results are inaccurate. We don't do that with the calculator, because it's expected to be 100% bug free. When there's a bug in the calculator code we expect it to be fixed, not gradually improved.
Speaking of bugs: mistakes in code are one thing, wrong output because of a fundamental flaw in the algorithm is another. The statistical machines we are dealing with work as intended; or at least, the wrong output the top comment here brings up is not a bug, it's a feature. That's the difference.
LLMs literally get much better with chain of thought, feedback, and/or consensus.
GPT-3 performance on MultiArith goes from 18% to 92% with all three. This isn't some hackneyed anthropomorphizing. Countless research papers show massive improvement with these processes.
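The consensus part (self-consistency) is mechanically very simple: sample several chain-of-thought answers at non-zero temperature and majority-vote the final answer. Rough sketch, with sample_answer() as a placeholder for a real sampled completion:

    import random
    from collections import Counter

    def sample_answer(question: str) -> str:
        # Placeholder for one sampled chain-of-thought completion
        # (temperature > 0 so samples differ), reduced to the final answer.
        return random.choice(["42", "42", "41"])

    def self_consistent_answer(question: str, n: int = 10) -> str:
        votes = Counter(sample_answer(question) for _ in range(n))
        answer, count = votes.most_common(1)[0]
        return f"{answer} ({count}/{n} samples agree)"

    print(self_consistent_answer("What is 6 * 7?"))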
That's (IMO) too narrow a view of what a "machine" is. Complex machinery of any kind never is 100% correct and needs constant correction and maintenance. I still think approaching this as a "calculator" is awkward at best.
> Complex machinery of any kind never is 100% correct and needs constant correction and maintenance
Computers are extremely close to 100%, we generally expect a CPU to never make errors even after years of working. If it starts making any errors at all we throw it away and make a new one.
Do you code in checks to check the calculations made by the CPU? I've never ever seen anyone do that. If a CPU starts making errors we throw it away. A typical CPU will make many quadrillions of correct calculations before its first error, I'd say that is basically 0 errors.
I suspect that humans have an accuracy lower than 99.999%, and are similarly capable of producing confidently incorrect results.
GPT has a lot of hype and hysteria around it, but demanding 100% accuracy from it is a bit over the top imo. It doesn't need to have 100% accuracy on any arbitrary prompt in order to be a useful and valuable tool.
Yes. We have a higher standard for computers. But humans produce a ton of bullshit answers. Are we trying to produce an 'answer machine' or a 'human mimic'? Because the bullshit makes it more human. And a wrong answer does not mean it's broken; what goes on in its neural net to produce a confident wrong answer might be similar to our own.
Do you have colleagues, bosses or reports that are correct 100% of the time? 99.999%? I would love colleagues that are 99% accurate; I certainly am not, unfortunately.
> Do you have colleagues, bosses or reports that are correct 100% of the time? 99.999%? I would love colleagues that are 99% accurate; I certainly am not, unfortunately.
I see this analogy all the time in these comment sections but it's not a very good one. A person is not a tool. One of the great achievements of humanity, and in computing in particular, is that we make tools that are more accurate than we are.
I expect a hammer to deliver a hard forward blow 100% of the time. If one out of one hundred times it delivers a hard backward blow, I cannot use it on a job site due to risk of injury to the user. The same is true of a calculator being used for financial transactions. And the same is true of a LLM that would be used for drug interactions as discussed in this thread. We already have 100% accurate ways of pulling data from a drug database—it's called SQL. A tool that is not as accurate is in at least some ways a step backward, even if it's easier to use due to its natural language interface.
Except hammers already do not work as expected 100% of the time, as evident by them painfully hitting the hands of the workers in mishaps. Yet we still use them.
How is a hammer "not working as expected" if you hit your hand with it? This makes no sense. If I hit my hand with a hammer, what am I supposed to expect except causing pain and injury?
If I try to light a cigarette and accidentally set my beard on fire due to my own clumsiness, did the lighter malfunction? No, the lighter did exactly what it was supposed to do; it was my hand that didn't do what I expected.
If you want to continue with semantics, the worker did not expect to hit his thumb until the hammer bounced off of it, hence the tool did not work as expected. Until regular hammers are able to put nails into wood by themselves, we are talking about the success of the complete system that includes both the hammer and its user.
Either way, when working with humans we already deal with plenty of misses and mistakes. Programmers create 10 to 20 bugs per 1000 lines of code, 9 out of 10 businesses fail, accountants make detrimental blunders, etc.
The point is that in the end the ML systems need only to replace these already non-perfect systems. I'll refrain from judging the consequences of this as I think it's out of scope.
Nice backpedal, except now what you said makes no sense anyway, because the discussion was about tools doing what you expect them to do, not "tool + human systems" doing what you expect them to. These are two different things.
Do you think a carpenter who hits themselves with a hammer blames the hammer or themselves? Or are you going to unironically tell me they would blame the hammer + human system?
Do you have a more specific point you're trying to make?
The discussion is about AI partially replacing human colleagues. In practice, people already are not 100% reliable. You make a reasonable request and someone makes a stupid blunder instead. That's the "hammer" hitting your thumb. Maybe you were not specific enough or maybe they didn't listen but the damage is done.
Our work processes already take mistakes and iterative refinement into account. If AI, in some specific niche, is cheaper and makes no more mistakes than humans do, it gets the job.
It doesn't need to be perfect or perfectly reliable. Some guardrails will be built into it, and we'll come to trust it over time.
>not "tool + human systems" doing what you expect them to. These are two different things.
Please explain how?
>Or are you going to unironically tell me they would blame the hammer + human system?
As your tooling gets more complex, yes it is very easy to have a non-zero blame assignment to each party. Look at any human+machine system where complex failure conditions can occur.
> Except hammers already do not work as expected 100% of the time, as evident by them painfully hitting the hands of the workers in mishaps. Yet we still use them.
The hammer did work 100% as expected. It's the human, who is fallible, that hit their hand with the hammer. My analogy stands. We make mistakes, we want tools that do not. LLMs should not be compared to humans, they should be compared to other tools.
So what about large asteroid strike levels of improbability? I have to reject your premise that there is no probability where you can stop worrying about the problem.
(This might be a moot point, because I'm not sure current methods can ever get to this level of accuracy, due to limitations of the training data. Needs an entirely new method or a clever insight to optimize for truthfulness with unclean training data, and InstructGPT hasn't made much progress on this, and it might not even be possible)
Surely this is "just" a matter of teaching the LLM to recognize this is a job for say Wolfram Aplha and generate a query to it, then feed the response back to you?
Imagine a life or death situation where a calculation held the key. Would you rather use a calculator that is right 50% of the time or 99% of the time. The obvious choice highlights that the utility is not exactly zero. But I see your point.
Airplanes do not execute in a vacuum like, e.g., a computer does. There are external factors that might cause an airplane to fail. A fully working airplane, all things being equal in terms of weather and a good pilot, will be safe.
You're missing the point. My point was: if everyone waited until something is 100% safe/complete/satisfaction guaranteed, there would be no progress in human civilization. We need to be willing to take a chance on a promising technology in order to allow for further progress. Imagine if everyone said "I'll wait for planes to be 100% safety guaranteed before I'll get on one" - we wouldn't be flying today.
I'd be very careful assuming this; if you read the documentation from OpenAI themselves, they clearly provide warnings on this. It has become a far more convincing liar, apparently.
This is an interesting comment, and highlights the accuracy problem with LLMs. But this is but one type of query that LLMs are used for, akin to an informational web search (E.g. How tall is the Eiffel tower).
I find there are two kinds of people in the world. First are saying that LLMs are bullshitting. The second are bullshitting about whatever LLMs are saying.
I find there are more than two kinds of people. Among them some are skeptical and some are not. The latter are good at exploring, even if into blind alleys, while the former keep them in check. We're lucky we have some variety, otherwise we'd risk going full force into dead ends or ignoring fruitful possibilities.
The duality of mankind. On one hand, we have skeptics asserting that LLMs are masters of malarkey. On the other, we've got those regurgitating the verbal gymnastics of said LLMs. But lo and behold, there exists a third breed: those who delight in deciphering the eloquent dance between the bullshitters and the bullshitted.
But it's very good for experts or those who are very knowledgeable and want to dig up quick information or bootstrap something, since they can judge whether the information is correct, useful, or wasteful.
Stop evaluating tech by misapplying it. ChatGPT is not really good or reliable at factual questions, yes. But it's fantastic at transforming input data that you provide to it.
> Our findings indicate that the importance of science and critical thinking skills are strongly negatively associated with exposure, suggesting that occupations requiring these skills are less likely to be impacted by current language models. Conversely, programming and writing skills show a strong positive association with exposure, implying that occupations involving these skills are more susceptible to being influenced by language models...
Am I reading this correctly that the assumption here is that programming and writing skills aren't reliant on critical thinking?
There is also a table which indicates exposure to LLMs in various models and it shows Mathematicians to have 100% exposure. This bit is more puzzling to me. Maybe I am misunderstanding something here.
Programming skills that do not need critical thinking are more susceptible to being influenced, indeed.
Don't forget that a lot of science requires computer programming these days.
This is the root of it: the more "generic" your work is, the more it's "out there on the internet", the more GPT can learn about it.
So, a lot of engineers that are just doing the same old trick: Writing HTTP endpoints, parsing json. Mapping data types.. Yes, that could be automated.
However, modelling a problem domain into code, and the core business logic of your code, which is where your "added value" comes from and is mostly unique: that's hard for GPT.
This is also why I try to convince engineering teams to optimize for maximum time spent on the core added-value logic, the business logic layer, not all the fluff around it, such as parsing, serialization, authentication, database connection. These should be a constant cost C; once they're set up you spend most of your time on the business logic.
When you see GPT program, it's just repeating tricks for simple problems over and over again. It's not really good yet.
> So, a lot of engineers that are just doing the same old trick: Writing HTTP endpoints, parsing json. Mapping data types.. Yes, that could be automated.
And to be fair, automation for all of that already pretty much exists.
And to be fair, even though it exists, a huge majority of this work is still done manually, even though it needs zero or very little manual "creativity" between specification and implementation.
I agree. Usually you're already working within some framework or DSL where you can describe what you want to do. Ideally, you already have an idiomatic codebase enabling you to succinctly transform specifications into code.
Let's take the parent poster's issues:
> Writing HTTP endpoints, parsing json. Mapping data types.
The generative model (for now) won't figure out for you: authentication, authorization, input form schema, JSON schema, required & optional fields, field constraints, entity modeling, indexing, query optimization, just to name a few basic issues we are looking at when "just developing CRUD apps".
If any of those go bad, it would result in 400s, 500s, performance or security issues.
It exists where it can be supported. Lots of small businesses don't have the bandwidth to maintain additional infra that automates this sort of work.
Which sorta brings me back around: it's likely the Big Corps that are going to be trialing GPT first because they have the excess money and resources to play with it. How useful will it be in the end?
> The more its "out there on the internet" the more GPT can learn about it.
Interesting point. Do you think this will mean less and less domain experts will share their specific domain knowledge on a subject on their own personal blogs / twitter / open internet just so it can't be mined by ChatGPT?
This makes sense. There's also a huge corpus of text (training data) available on the internet for the inherently repetitive or general tasks, which is helpful for these systems.
But I wonder how they go from this to mathematics using the same line of reasoning, when we've seen that math is not LLMs' strong suit.
Also think about the huge corpus of text not available on the internet, but available to these systems (just because e.g. Microsoft has it, so it can get data, perhaps anonymized, from private GitHub repos, Copilot, telemetry from WSL, VSCode, and Azure, and so on).
>Am I reading this correctly that the assumption here is that programming and writing skills aren't reliant on critical thinking?
No, they're just listing some skills that have both negative and positive associations with exposure. I don't think they intend to make a statement about whether the skills themselves are correlated. It's possible for them to be positively correlated with each other, even if one is positively and the other is negatively correlated with exposure (think multidimensional vectors).
> There is also a table which indicates exposure to LLMs in various models and it shows Mathematicians to have 100% exposure. This bit is more puzzling to me. Maybe I am misunderstanding something here.
Right, another one that stood out to me is the listing of financial investments as the most affected industry. I'm certainly not letting GPT-4 make investment choices for me. I guess it could summarize analyst reports or something? They seem to be making some very speculative assumptions about what ML will be capable of in the future. The paper would be more useful if they didn't go off like this and stayed closer to published ML research.
1. Specifying. You are working on getting all the requirements and laying out the specification of the expected behavior of a system.
2. Translating. Once the specification is nailed down, it is taken into the hands of translators and put into actual code.
Both involve critical thinking, but translators are probably more susceptible to LLMs' negative influence.
Also, any programmer at one time plays both roles, so it is not that a particular person is going to be deemed useless; more that that part of programming work (translating) is discounted, no longer as valuable, for everyone.
May I add "scheming" to your list of programming activities?
At the macro level, this would be system design. Making sure that the architecture is extensible, transparent to failures, and easy to understand and develop for.
At the micro level, this would involve coming up with clever algorithms to solve specific problems. In ways that are simple or efficient or parsimonious.
In any case, this scheming activity, which would slot in between the specifying and translating that you speak of, would involve deeply understanding both the specification (and how it might evolve) and computing substrate (its APIs, what is efficient and what is not, etc.). I might even call it some combination of wisdom and deviousness?
I find GPT-4 awesome and certainly it will impact "programming", it's an open question how -- will there be a superclass of GPT enabled programmers that will take the jobs of the rest?
Right now GPT-4 is helping me solve real tasks at work and it feels like I'm the only accountant who has Excel, but surely others will catch on.
My hunch here is that, because of the sometimes haphazard hallucinations of ChatGPT, you would need to always review the code that ChatGPT has created. Also as in a group of people you sometimes agree on specific styles, but I think that ChatGPT won't adhere to such things, making its code unfamiliar and really hard to follow. We as humans have a hard time agreeing on how code should be written; does ChatGPT have a better notion? And how maintainable is that? How does it make a change?
>Also as in a group of people you sometimes agree on specific styles, but I think that ChatGPT won’t adhere to such things, making its code unfamiliar and really hard to follow.
ChatGPT won't, but a facade variant specifically trained and marketed for code probably would. It could even have a configuration for coding style, formatting and linting rules, and programming paradigm (more functional, more declarative, invent a DSL, and so on).
Aren't they? I'd say they can be reduced to a number of architectural tendencies (e.g. composition over inheritance, DSL or language-native code), go-to design and code organization patterns, and pure stylistic choices (like variable naming, short or larger functions, etc.)
Given that the primary author is OpenAI affiliated and some of the assumptions are far fetched (e.g., on programming - not even coding) this reads like a sponsored post pamphlet to me.
You have to give OpenAI credit: their marketing strategy and effort is amazing. There's spam everywhere. Although those claiming their jobs are already being replaced, or those claiming they've written entire apps using ChatGPT, are quieting down a bit. They've either been called out for their nonsense or they got ahead of themselves.
But nowadays, you can profit off people's attention alone directly. This does not prove that there is no genuine interest; I am genuinely interested myself.
But, there are financial incentives in generating a lot of sensational content, whether positive or negative about almost everything including AI, Rust, political issues, even scientific issues like climate or pandemics, etc.
What they actually did is ask 5 random people to rate what thought a language model could do to help different professions. These 5 random people don't know anything about the professions they're rating, just what anyone off the street knows, and they know as much about GPT as anyone who has briefly played with it.
The title should have been "We asked 5 friends to see what they thought about GPT and labor market"
Under "3.4 Limitations of our methodology" - "3.4.1 Subjective human judgments"
> A fundamental limitation of our approach lies in the subjectivity of the labeling. In our study, we employ annotators who are familiar with the GPT models’ capabilities. However, this group is not occupationally diverse, potentially leading to biased judgments regarding GPTs’ reliability and effectiveness in performing tasks within unfamiliar occupations. We acknowledge that obtaining high-quality labels for each task in an occupation requires workers engaged in those occupations or, at a minimum, possessing in-depth knowledge of the diverse tasks within those occupations. This represents an important area for future work in validating these results.
For sure. I had to read the paper to discover it's trash.
But if you read the abstract, it looks like they thoroughly assessed how GPT will impact many professions.
The sentence "Using a new rubric, we assess occupations based on their correspondence with GPT capabilities, incorporating both human expertise and classifications from GPT-4." does not scream to me "We asked 5 random people with no expertise in either these professions or GPT-4 what they thought and report those results".
Eh ... they report pretty good alignment with other studies on the topic, so there's at least some signal. Whether their labels contribute any new information is unknown, and the forecasts of any of the literature they cite are untestable (except by the wait-and-see approach).
That said, some attempts at prognostication are preferable to a collective shrug, and people at OpenAI are better positioned than others to assess what GPT-4+ is (or will be) capable of, while clearly under-equipped to map those capabilities to the intricacies of 1000 occupational categories.
I can't find how many people labeled the DWA task descriptions; where did you get that number?
The article seems to describe the labeling here:
> Human Ratings: We obtained human annotations by applying the rubric to each ONET Detailed Worker Activity (DWA) and a subset of all ONET tasks and then aggregated those DWA and task scores at the task and occupation levels. To ensure the quality of these annotations, the authors personally labeled a large sample of tasks and DWAs and enlisted experienced human annotators who have extensively reviewed GPT outputs as part of OpenAI’s alignment work (Ouyang et al., 2022).
I understand the authors, four of them, did the initial labeling and then asked an undefined set of people to do the rest of the labeling.
It is stated that they use the same annotators that trained/filtered ChatGPT's output. I would assume it's a rather large group (my company has 10 auditors in Nicaragua). The label biases are mostly stemming from that group and, as suggested, could be removed by using experts in each field to annotate the labels. But given some responses here by experts, I am sure those expert labels would have their very own biases :p
The paper is not of the highest quality, as indicated by typos and mislabels, but the analysis is likely as good as it can get for the given methodology. Dismissing any signal is just pure hubris.
I think more parallels should be drawn with what we were doing before: Googling it.
Perhaps it's because ChatGPT seemed to happen much more suddenly than Google became a programming resource, but we're using them in much the same way. Asking for pre-made solutions, explanations, troubleshooting tips etc.
ChatGPT just does the job way better. But no-one was worried Google would put knowledge workers out of a job.
Everything I've used ChatGPT for so far I could, more-or-less, have written or searched for myself in the time it took to get it out of chatgpt correctly. And I've often just had to rewrite it completely.
If you're an expert (writer, programmer, etc.) it's often faster to type it as-needed than modify chatgpt's output.
If you're not then either it's not reliable enough, since you do not have the expertise to modify; or the task is quite menial and you dont need reliability.
So it seems to me people are reacting to this based on wild assumptions about how it works and what "other people's jobs" are. I don't see it being much more than a button in a few apps that makes some menial tasks 50% shorter.
People seem to be forgetting that in the vast majority of cases the thing ChatGPT is giving you is also available on Stack Overflow, GitHub, Wikipedia, or 101 other high quality online sources.
>Everything I've used ChatGPT for so far I could, more-or-less, have written or searched for myself in the time it took to get it out of chatgpt correctly.
That's not even true. For one, it can stub a whole new function or coding project, intelligently, in a few seconds after the prompt, whether it's RoR or DSP code or whatever. For one unfamiliar with the domain, this can take hours or a full day, even with examples found on Google. Heck, even looking up and understanding how to use some command line flags in a shell pipeline can need lots of looking around, even if you have been using Linux/Unix for ages.
For a domain expert? They could do such things very fast. But still several times slower than GPT. Think minutes or half an hour instead of seconds. Even the raw typing and file creation would take some minutes.
It's also very premature when people judge a service we've had for like 5 years, and that has already changed by leaps and bounds, as if this is its peak, without considering what it could be in 5 or 10 years, or with different variants tuned for specific tasks.
> For one unfamiliar with the domain, this can take hours or a full day
Yes, I did say expert.
> But several times slower still than GPT.
Almost all the output needs to be re-read, modified and integrated in the project. This often takes longer than just typing it out, even from docs -- because you're still forced to think through the solution -- which is most of the time.
Yes, I have. That's largely why I've stopped using it.
The time it takes to rewrite has often been longer than writing it myself; and worse, it deprives me of understanding the problem I'm solving. So I end up not having thought through the problem and basically having to press delete and type it from scratch.
Really? For me it could not generate meaningful tests, even for the simplest React components. E.g. it tried to test the display of an error message in a field in a form. The error message was basically "this field is required and can't be empty." It tried to test that by inserting valid data into the field...
People don't have to love using it you know? If the guy doesn't like it, why force it down his neck?
I use plenty of code completion tools already, so I feel similar to them. Sometimes it saves me time, sometimes it doesn't.
For me personally, language servers have been the best thing, you can explore libraries, auto-complete nicely, without as the other poster said, worrying about verifying the correctness.
>People don't have to love using it you know? If the guy doesn't like it, why force it down his neck?
It's not about feelings. It's about there being a thing, in objective reality, with some specific merit, and we're trying to evaluate what it is with some accuracy.
Whether someone enjoys it or not is beside the point.
> People seem to be forgetting that in the vast majority of cases the thing ChatGPT is giving you is also available on Stack Overflow, GitHub, Wikipedia, or 101 other high quality online sources.
No one is forgetting anything over here.
With StackOverflow, I have to scroll through ~2 pages of bing/google trash to even begin looking at a potential solution. This is only the beginning of the process.
99% of the time, the code sample I find has some simple-yet-annoying adjustments I'd like to make (i.e. unroll the inner loop & use SIMD). Certainly, I could spend half my afternoon massaging that method on my own. Or, I could reach for the circular saw and rip through this board in a few seconds. Sure - it leaves a bit of a rough edge most of the time, but it gets you a lot closer a lot faster than anything else in my experience.
> is turning 90min of googling into 5min + 30min of rewriting, "revolutionary" ?
I'd say so, yes. The side effect of an arbitrary feature going from 90 minutes to 35 minutes is that I am much more likely to consider features that I'd otherwise not.
I am working on a computer vision project right now that I would have never started without having this kind of access to the various algorithms. Go try to implement a sobel filter in your preferred language using traditional research, and then try it with AI assistance. I think you will start to see the light after going through a few methods like this.
My (sales-y) colleagues are already writing marketing emails with ChatGPT. Saves a little bit of work, but it might result in the loss of being able to write such emails yourself. This can extend to other jobs, which will have some effects.
Use in education worries me more. If schools don't change their lazy group-projects and "write an essay on" strategies now, coming generations will have put less effort in than previous, leading to a further drop in levels.
> coming generations will have put less effort in than previous
This has been happening since the inception of school. Calculators made math easier. Sparknotes made book reports easier. Wikipedia made essays easier.
> leading to a further drop in levels
Did levels drop because stuff got easier? Were there other causes over the past years, like dopamine dependency from infinitely scrolling, algorithmic, attention-grabbing apps, among others? Maybe from schools becoming political battlegrounds? Or from education spending being gutted, leading to lower quality? Was it lack of decent education due to Covid?
Also, is there an actual drop in levels? I can't seem to find a source that has decent data on this. And the only stuff I can find says 'IQ' has been rising over time. So please share it with me.
Lastly, I'm personally not convinced any of this will lead to a net negative per se. It'll change the required knowledge and increase the overall capabilities of people. The calculator caused people to stop learning mental arithmetic and start learning more complicated math and how to use a calculator for it. Google caused people to memorize less and become adept at finding information through Google. Welding robots caused fewer people to learn how to weld and more to learn how to program welding robots.
In the end it gives people the ability to do less of a simple thing and more of a complicated thing. Writing marketing e-mails and multiple titles for A/B testing isn't a skill, it's a trick and so is writing SEO stuff. Not having to do that opens up time to think about product-market fit, marketing strategies, improving advertising return on investment measurements. Which might be more valuable and interesting than writing emails.
Where did I say that that's the reason? You're coming up with developments that made school "easier" without any further similarity to ChatGPT, and don't consider adaptations in the curriculum or testing following those developments. I specifically mention that schools will have to get rid of (some of) their lazy evaluation processes.
> Also, is there an actual drop in levels? I can't seem to find a source that has decent data on this. And the only stuff I can find says 'IQ' has been rising over time. So please share it with me.
That trend has been observed everywhere, but has rarely been investigated for uncomfortable reasons. The Flynn effect wasn't actually believed, not even by Flynn himself.
> Lastly, I'm personally not convinced any of this will lead to a net negative per se.
That's such a bad basis to mess with the foundation of modern society.
> The calculator caused people to stop learning mental arithmetic
Agree.
> and start learning more complicated math
Doubt it. Mind you: I mean arithmetic, which is what calculators do, not maths. People rarely do arithmetic, even with a calculator, let alone more complicated calculations. Engineers, OK, they benefit from calculators, but the calculator has not engaged other people in more complicated arithmetic, I think.
> Which might be more valuable and interesting than writing emails.
I don't disagree, but it will still have unforeseen effects, and doesn't alleviate my worries about education levels.
> This has been happening since the inception of school. Calculators made math easier. Sparknotes made book reports easier. Wikipedia made essays easier.
And the advance of LLMs will make general thinking and cognition easier, relegating humans to assistants of a higher intelligence. Obviously this will make people anxious. The trajectory also indicates that the human component will likely not even be necessary anymore in the mid-term.
So what's left? Consuming AI-generated content optimized to hack our reward system.
These last few days I've been using ChatGPT as my first choice when searching for something, and then googling if I'm not sure whether GPT is hallucinating. Are they able to sustain the load?
This is an incredibly interesting study and brings to light some crucial points that we, as a society, need to address sooner rather than later. As GPTs become more powerful and ubiquitous, they have the potential to reshape the labor market in ways we haven't seen since the industrial revolution.
It’s building on a pretty serious thread of research, and demonstrates reasonable agreement with prior work (table 9) so I don’t think it can be dismissed as an advertisement.
Generating exposure ratings via GPT-4, from annotations provided by OpenAI people does definitely put a positive bias on the exposure estimates (which they acknowledge)
We are currently surviving with a massive shortage of skilled labor, where people in the richest countries have to wait days or weeks to see a doctor, and months to see a specialist and have important medical procedures.
The same is true for other skilled industries, where many people are excluded from access to good resources due to their scarcity.
We are a long way off from having too much skill. Let’s first get to parity with humanity’s needs.
The primary issue is not shortage of skilled labour. This is a right wing political soundbite.
What there is a shortage of is adequate pay to attract skilled people to the work required. Doctors in the UK leave for other countries, work for private entities or go into different industries.
> leave for other countries, work for private entities or go into different industries.
So... Those people fill in jobs in businesses/countries where there's a shortage, causing a shortage somewhere else?
> political soundbite
Is exactly what you are doing. If there's no shortage of (skilled) labor, why is unemployment at the lowest rate in decades in much of the Western world? Why are there not enough builders in much of the EU, while countries like Romania are suffering shortages due to skilled workers moving to other EU countries to earn more? Why can't I find enough devs to do even half the work we could be doing? Why are so many companies looking at automation to solve the lack of labor?
How about this. Let's say historically a 200 square meter house cost 200k€ to build, but not enough people can afford that. Then you make a project where you count in the cost of materials but cut down the cost of labor, price the house at 150k€, and say that there is a shortage of labor because you cannot find someone to build the house for the money you have available. You can even be realistic in saying that if you increased the price of the house you couldn't sell it. Maybe there just aren't enough people willing to pay for the house at the cost other people would be willing to build it for. Even if you managed to find people to work for the reduced pay, you can just create a cheaper project and cut down on labor cost even more, because now you've sold to all the people who were willing to buy at 150k€. The "labor shortage" is basically guaranteed for any industry all the time. All it takes is for you to create a project fitting lower on the demand curve and cut down on labor, since you can't really reduce the price of materials.
One major risk with this kind of AI is that, for the last few decades, the mental image of what such AIs would be like (in both science and science fiction) has had an extremely severe mismatch with how they actually turned out.
In the past, people thought of them as non-emotional, purely logic-driven beings, and so the problems people imagined were things like that logic not considering emotions and the emotional well-being of humans, or blindly (but logically) pursuing a specific goal no matter the consequences, or not valuing freedom, or gaining emotions. But in all of that, the AI still follows logic.
But now we have AIs which could have all the problems above _but don't use logical thinking_ to _achieve goals_. Instead they use complex, overlapping _statistical models_ to _tell a believable story_, where "believable" is defined by the training data, which is _wildly inconsistent, wrong, misleading, discriminating, emotionally charged, etc._ because it's just scraped from the internet. So there is _no systematic derivation of goals, subgoals, plans, etc._, there is _no logic_, and the concept of "truth" simply _doesn't exist_ for such systems.
At the same time, this turned out to be good enough, often, to be usable for many tasks, and convincing enough to make people believe it's sentient.
But this also means it will retell common false information, misconceptions, discrimination, hatred etc. from the internet.
Similarly, it will do what people call "hallucinating" and "lying", but it is _not_ doing either of those, and calling it that is misleading, because it is doing _exactly_ what it was built for: telling a "believable" story given the training data.
And gaslighting, misleading people, and lying are an extremely deeply ingrained part of the internet, i.e. a deeply ingrained part of the data on which it bases what is "believable".
And while we can add tons of band-aids on top to try to hide or filter out such "bad" responses, IMHO without fundamentally changing either the training data or the approach this is bound to fail, while propping up an even stronger misleading illusion.
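To make the "believable, not true" point concrete, here is a toy sketch (nothing like GPT's real architecture; a hand-written bigram table stands in for the learned statistics). The generation loop only ever asks "what usually comes next in the corpus?"; there is no step anywhere that asks "is this true?":

    import random

    # Toy "training statistics": P(next word | previous word), as if counted from an
    # imaginary corpus. The table encodes what people *wrote*, not what is *true*.
    BIGRAMS = {
        "the":  {"moon": 1.0},
        "moon": {"is": 1.0},
        "is":   {"made": 1.0},
        "made": {"of": 1.0},
        "of":   {"rock": 0.7, "cheese": 0.3},   # "cheese" is in there because people wrote it
    }

    def generate(start, max_tokens=8):
        tokens = [start]
        while len(tokens) < max_tokens and tokens[-1] in BIGRAMS:
            dist = BIGRAMS[tokens[-1]]
            # Sample whatever is likely given the corpus; "correct" is not a concept here.
            tokens.append(random.choices(list(dist), weights=list(dist.values()))[0])
        return " ".join(tokens)

    print(generate("the"))   # "the moon is made of rock" -- or, 30% of the time, "... of cheese"

Both outputs are equally "believable" to the sampler; only the corpus frequencies differ.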
In short: GPTs could do all the boring work for us and give us more time for things we like to do.
An example: Of course GPT can at some point make better music than me, but I am playing an instrument not because I want to sell the recordings, but because it is a lot of fun for me.
Let GPT do my taxes, then I have more time for playing. (can it do taxes finally?)
This is why my anxiety is at levels unseen before. I am a programmer who’s on a working visa. Assuming it does take my job, I have no idea what to do next… other than panic.
I decided that being paralyzed with fear isn't a great strategy so I'm going to subscribe to OpenAI tools for a while and see how they can fit into my workflow.
It feels "wrong" on a strangely emotional level but what is happening is gonna happen anyway.
On a brighter note: there are bound to be a lot of unfounded, hype-based marketing claims that look reasonable during the rush but fall flat over time. It would be a first for humanity if there weren't.
That is certainly true about the hype machine; it is in full force. I imagine that will level off soon and the really useful and novel stuff will float to the top. I have no backup plan, but as you said, what happens is going to happen. It's a rollercoaster for me personally: some days I am in absolute dread and fear about the future, and other days I just go about my day as usual.
Panic seems like the wrong response. Figure out what you do that these things can’t. Use them to make yourself more productive.
In the late 90s I was told my job was going to be outsourced, then dotcom, then the GFC, and lots of smaller bumps along the way. Those were actually somewhat scary times. Now should be excitement about what’s possible.
While you are not wrong, the idea that gives me anxiety is that this replaces all programmers. My job theoretically is safe for now. I work on fairly novel stuff, while a lot of my day is json parsing there is a big chunk of it that is domain specific. That said... it's that one day it will just take over all of programming that has me in dread. The only option then in my somewhat middle age is to unironically "learn a trade" I guess?
I don't know if that helps, I thought it was an interesting take.
I believe we still have our humanity. Even if everyone's job was replaced, I don't think we're going to just let each other starve; even if Microsoft has its own auto-coders, I think we'd all do our best to keep some type of economy going for as long as possible.
I don't think everyone in the world wants to be living in the gutter, nor do I think the majority of the people in the world want to be filthy rich either.
We'll likely have a lot of people working on open source initiatives to help democratize access to important technologies etc.
Who knows what the future will bring. Maybe a massive solar flare will wipe it all away next week anyway... no one knows.
Given an emphasis on internalized documentation, these could be significantly better, especially over time, than a lot of the automated support many companies have in place, as long as there is some kind of escape hatch for escalation.
That said, the reality is closer to the "Johnny Cab" from the first "Total Recall" movie.
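For what it's worth, a minimal sketch of the "internal docs plus escalation escape hatch" idea from a couple of lines up. Every name here (search_internal_docs, ask_llm, escalate_to_human, the 0.7 threshold) is a made-up placeholder; the point is the control flow, not any specific API:

    # Hypothetical support flow: answer from internal documentation, escalate when unsure.
    CONFIDENCE_THRESHOLD = 0.7

    def search_internal_docs(question):
        # Placeholder for whatever retrieval you run over the company's documentation.
        return []  # pretend nothing matched

    def ask_llm(question, passages):
        # Placeholder for the model call; returns (answer, self-reported confidence).
        return "I'm not sure.", 0.2

    def escalate_to_human(question, draft):
        return f"Escalated to a human agent with draft answer: {draft!r}"

    def handle_ticket(question):
        passages = search_internal_docs(question)
        answer, confidence = ask_llm(question, passages)
        # The escape hatch: if retrieval came up empty or the model is unsure, hand off.
        if not passages or confidence < CONFIDENCE_THRESHOLD:
            return escalate_to_human(question, answer)
        return answer

    print(handle_ticket("How do I reset my billing address?"))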
With little kids, I am increasingly conscious of how best to equip them for the world when they are adults. I think the previous playbook will need to be redrafted.
'Resilience' and 'critical thinking' are two things I think are key; any others?
The key is being rich before AGI hits; I cannot imagine much social mobility after that. And maybe philosophy, so they can find meaning in a world where achievements don't exist anymore. And don't push them too hard to learn skills that will become obsolete anyway; having a relaxed mind is more important.
Hard to say. I, like many others here, am a software developer. My girlfriend of many years (we have a daughter together) is a graphic designer. I never thought creative work like hers would be the first hit by AI, I actually thought it'd be the last, yet here we are. Her boss is nonchalantly talking about Stable Diffusion, Midjourney and whatnot.
Meanwhile my cousin, who wasn't good at studying, became a (good) hairdresser, and his prospects look better than ours now.
It's often said that critical thinking is the cornerstone of a well-rounded education. Universities and colleges around the world extol the virtues of critical thinking and its ability to elevate our intellectual prowess. However, I'd like to argue that critical thinking, or more specifically, the way it's often employed in certain academic contexts, has become overrated and counterproductive to meaningful, constructive discourse.
In American college policy debates, a common phenomenon has emerged where "kritiks" (critiques) and "critical theorists" wield their tools of analysis in a manner that stifles the flow of ideas and inhibits the development of practical, solutions-oriented discussions. For those unfamiliar with policy debate, it is a competitive speaking activity where teams of two advocate for and against a resolution that typically proposes a policy change.
Kritiks are arguments that challenge the assumptions, methodology, or discourse used by the opposing team. While they can serve a valuable purpose in exposing biases, promoting introspection, and advocating for marginalized perspectives, they have also evolved into a means of derailing debates by focusing on abstract philosophical points rather than addressing the policy issue at hand.
Critical theorists, heavily influenced by postmodernist and Marxist philosophy, have taken what was originally a quirk of policy debate and brought it to the mainstream. This has led to serial policy failure on the left wing and has decimated our social-science academic community.
"Critical Thinking" is overrated. We need constructive thinking, like what you get in engineering school.
It is important to start thinking about the government taxing corporations that use LLMs to increase their profit. That additional tax revenue should be used to give a universal basic income to people losing their livelihoods to LLMs.
I feel like everyone is trying to draw very linear conclusions about what LLMs will affect, almost as if it's a circle that will be cut out of the market.
But if we really get to the point where programmers can be replaced with LLMs, I expect something closer to swiss cheese: For example, if you're a meat packer, maybe you work someplace that doesn't want to pay for a fancy new machine that can do your job...
But a massive plummeting in the cost of software paired with a massive spike in the rate at which we develop things means machines will cost a lot less. Who knows what kind of material sciences advancements we'll see for example.
At that point the margins may be razor thin, but automation will start to work in places where it couldn't be done profitably before. I think things get a little dystopian from there, since I'd expect the employers still holding onto humans to essentially be holding all the cards. What good is a union when the "scab" is a machine with a one time cost?
-
(but again, this is all assuming LLMs reach a place that is seriously earthshattering)
We don't yet have UBI (and worse, countries are raising pension ages, which is the closest thing we currently have to a UBI) so everything everywhere being automated all at once is going to be messy.
Masses of people locked out of any economic growth, with no prospects, seem like a recipe for disaster if you ask me. That's the foundation on which unrest and rebellions are built, and those are usually not pleasant affairs for most people.
If vast swathes of the population are thrown onto the scrapheap in a relatively short period of time it absolutely would come to that. The ruling class would be smashed (literally and figuratively) and you'll wake up in the morning to discover that our form of capitalism has ceased to exist.
Honestly, I feel worse for the checkout girls locally - people are more often replaced by improvements to low-hanging fruit than by the emergence of something apparently incomprehensibly profound.
The truth is that we're impressed by this because it's magic and we don't know how the magic works - but it's still getting way more magical, really fucking fast.
Right now it's still an idiot if you're not telling it what to do. Hypnotic babble.
(If you want to be scared or relieved, feed it some time-series stock data: if it's good it'll end up fighting itself, otherwise it's either stupid or a liar.)
There are some easy use cases for it - I really wouldn't mind a GPT bot replacing outsourced call centres.
An improvement in quality and in service, with no direct impact on the local economy.
Other than that, this all has the booming feel of every other damned tech buzz. Can we just cool down on the sensationalism and not try to imagine another multi-billion-dollar sub-economy into existence?
We already did that with bitcoin, and now we're seeing the end of the magic trick; let's not do it again.
How about a calm objective and public analysis by under-excited experts before we start mining land that simply isn't there?