
Until the day when GPT hallucinates some nice (and confident) rm -rf equivalent for you ;)


For me bash is a weird exception to the rule "code is easier to write than it is to read". I'm never doing anything crazy in a shell script, but it's easy to just ask ChatGPT to whip one up for me, and I will read it to verify. Perhaps anybody who can't validate the output is just as likely to write an accidentally destructive script the old-fashioned way.


Which is extremely easy to do; just a misplaced `mv` is enough. I can't believe people are relying on GPT for ad hoc destructive I/O.
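
A contrived sketch of how little it takes - mv happily overwrites its destination, so even transposing two lines quietly destroys the file you meant to preserve (filenames made up):

  # Intended: keep a backup of the old config, then install the new one
  mv app.conf app.conf.bak
  mv app.conf.new app.conf

  # Transposed: the first mv clobbers the old config immediately,
  # and the "backup" that follows is a copy of the new file
  mv app.conf.new app.conf
  mv app.conf app.conf.bak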


Have you actually tried it? Because this comment seems pretty disconnected from the reality of what ChatGPT can accomplish. I also use it all the time for bash one-liners, and so far my experience has been "80% work correctly on the first try, 15% require minor adjustment to run, 5% require major adjustment to run" - and so far none have ever done anything flagrantly wrong or destructive. It also doesn't just print a command with no context: it explains what each flag/step of the command is purportedly doing, which you can (and should!) use to double-check that what you're about to run makes sense. Which is more than I can say for a lot of SO answers!


Traditionally, the alternative wasn't "SO answers", which are indeed dangerous in their own way -- it was to develop, and then maintain, a comprehensive and fluent understanding of the tools suitable to your profession.

GPT and SO can help you make a deadline today, and we all may use them now and then, but consistently relying on them steals essential opportunities for professional growth.

A journeyman woodworker who just asked somebody else to perform all his tricky jigsaw cuts is going to have a hard time developing the muscle memory and intuitions that mark mastery of the craft.


A master carpenter who only uses hand tools is a master of their craft, and should be respected as such, but refusing to ever use power tools out of a moral objection to electricity would be seen as quite the eccentricity. Which is to say: the Amish do sell furniture, and it's really quite good, if niche, furniture.


I'd rather get to AI-assisted mastery of many more different crafts than limit myself to achieving mastery of just one single thing.


I've used ChatGPT (GPT-4, via ChatGPT Plus) for creating .bat files. Really easy ones it can churn out quite well. Anything a little more complicated and it doesn't consider edge cases. Several cases required me to ask it to generate a Go program that gets called by the .bat file.

Several times, I had to spend hours to get a .bat script working. Hopefully, it will get better in the future.


I used ChatGPT to explain what a complex line in a bash script was doing. Though to be fair, if I'd spent a day in man pages and Google I'd probably have learned a lot more than just being given the answer. Which makes me think that in the long run GPT may result in worse programmers.


In the same way as Stack Overflow has resulted in worse programmers, at least in some senses.

I don’t mean to bash on SO; it’s really valuable sometimes. But corporate culture (faster is always better) and a new breed of programmers who don’t really care have left us with a big part of the profession that is unable to produce anything new, or solve any unique problems.


And a generation before that, it was VB and (early) PHP programmers that earned that critique.

It does feel like the proportions might be changing and the quality of software is trending downward, but I think you have it right that the risk of programmers being too "GPT-reliant" is the continuation of a long, intergenerational pattern.


this is one of my BIGGEST frustrations learning to program. i do...not...want...to go online to work out how to do something. I want a "manual" - written well - that i can study to find the right tools to use (which, for example, Python's docs are not...at least for beginners). i then want to use the tools in said manual...to create. With the syntax, and the words, you've defined oh so well in said manual. But...as the language grows...we'll just have to live with this never happening :)

this is something i get jealous of older programmers about - they grew up on old computers, with printed manuals...when languages were smaller, and the scope of programs was smaller too...and didn't require external libraries.


Books are still written, and there are quite a few online courses, and some of them aren't bad. I've learned that to be productive in a new language or framework I really need to dive into some of those resources. The manuals you find online are rarely good enough to learn from, and all the tutorials and "getting started" guides just scratch the surface enough to leave me frustrated.


Programs were smaller, with fewer expectations on them. Users were few and forcibly clustered near the computer system. Attackers were limited, and computer security was physical access control.

Now, it is kinda expected that programs support "text": non-LGC (Latin/Greek/Cyrillic) alphabets, context-sensitive collations, ligatures and cursive text, mixed bidirectional text, combining characters, input method support, locale-dependent string interpolation logic.
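
Just to make the collation point concrete - a quick sketch, assuming a glibc system with the en_US.UTF-8 locale installed:

  $ printf 'b\nA\na\nB\n' | LC_ALL=C sort
  A
  B
  a
  b
  $ printf 'b\nA\na\nB\n' | LC_ALL=en_US.UTF-8 sort
  a
  A
  b
  B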

And one doesn't simply handle all of that - not without bolting onto your program something else that is probably larger than your program.

And that doesn't even get into instants in time; those are much harder to handle correctly all the time.


I’d say that we are programming at a higher level of abstraction today, so the programs do, in some senses, do more, but the size of the code we wrote then and now, and the complexities, are of the same order of magnitude.


What's better: spending a day figuring out a single line of a complex Bash script, or spending that same day learning literally 100x more because you didn't have to waste that much effort on each individual thing?

For me, the alternative to using ChatGPT to figure out some weird piece of Bash trivia isn't doing the work myself, meticulously and at great length. It's losing interest and not figuring that thing out at all.


Losing interest and not figuring it out at all has been my approach to nontrivial bash. I'm not a UNIX philosophy fan in general, but trying to be a programming language and also a UI, all in one, is a bit much. It's ok-ish as a UI for cases where a GUI doesn't work or you want a machine to be able to use the UI, but as a language it's missing a lot of stuff.


Other option: asking the grey beard down the hall how to do it. Fifteen minutes, and all parties are happy.


https://explainshell.com/ can help with that but isn't perfect.


Bash doesn’t deserve the brainspace that truly knowing it would demand.


There's a big difference between "build this program for me" and "explain how this line of bash code works".

I use LLMs for the latter, and they often do a great job and save a lot of time looking up individual flags, concepts, etc.


Both could be equally wrong and hallucinated.


> Both could be equally wrong and hallucinated.

As can I. I recently broke my years-long streak of not deleting something by accident, with an errant bash command.

Just because a tool has the potential for a negative outcome doesn't mean it shouldn't be used. It just means appropriate caution should be used as well.


This statement implies that LLM hallucinations are completely random, which is objectively false.

LLMs fill in the blanks when left to synthesize a response from a prompt, as opposed to translating a response from a prompt. These synthesized responses, aka hallucinations, are predictable in nature: quotes, titles of books, web page links, etc.

Conversely, providing an LLM with all of the facts necessary to complete a response will result in few to no hallucinations.

For example:

Select name and row_id from table1 joined on table2 on table1_id.

This will never return "DROP table1;". It will basically only ever return something very close to what you want.


Why the downvotes? The comment is not rude and it is factually correct.


Not my downvotes but this is factually incorrect.

An LLM will give you the most likely suggestion. If that happens to be a DROP, it will not stop.

Now that is of course going to be extremely unlikely in your example. What is more likely, though, is that your SELECT may include a SQL injection vulnerability, even more so once your prompts get more complex. The chance of that happening is completely random from a user’s point of view. Are we going to blame the user for not providing the requirement “without vulnerabilities”? Even if they did, it’s not guaranteed to be fulfilled.

In the parent’s case, the scenario was inverted: given a SQL query, will GPT explain whether it has vulnerabilities or not? Will it even explain the gist of it correctly? Who knows whether it will hallucinate or not?

The same goes for answers from Stack Overflow: always read the comments, always review it yourself.

Use GPT all you want - I do it myself; it’s great for suggestions. Just remember that using GPT to explain things you don’t understand and can’t easily verify can be risky. Even more so in bash, where the difference that makes a command destructive can be a lot more subtle than SELECT vs DROP.
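
To make "subtle" concrete, this is essentially the infamous Steam-on-Linux cleanup bug (variable name as in the real incident, context simplified):

  # Looks harmless, but if STEAMROOT is unset or empty this expands
  # to rm -rf "/"* and deletes everything the user can write to
  rm -rf "$STEAMROOT/"*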


> Now that is of course going to be extremely unlikely in your example.

The OpenAI API now has support for deterministic responses.
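
Concretely, that's the seed request parameter. A minimal sketch against the chat completions endpoint (model name illustrative; OpenAI documents this as best-effort determinism, keyed to the returned system_fingerprint):

  curl -s https://api.openai.com/v1/chat/completions \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "gpt-4-1106-preview",
      "seed": 42,
      "temperature": 0,
      "messages": [{"role": "user",
        "content": "Select name and row_id from table1 joined on table2 on table1_id."}]
    }'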

There you go, the burden of proof is on the accuser.

If I were to state “you can never ride your bicycle to the moon”, and you said there is a remote possibility and then forced me to prove that there actually is no remote possibility - well, you would clearly see the problem.

I’ll state it again: you will never ride your bicycle to the moon, and ChatGPT will never return “DROP table1;” in response to the aforementioned request. It might not be correct, but it won’t be wildly off target, as is flippantly suggested in these forums for populist appeal.

My entire point was that hallucinations are not random. If you craft a query that reduces the task to mere translation then you will not get some wildly incorrect response like you would if you asked for quotes from War and Peace.

I’m pretty much convinced that most of the shade thrown at LLMs by developers is motivated more by emotion than by reason, because this stuff is easily verifiable. To not have realized this means approaching the tools willingly blindfolded!


That’s like saying a roll of dice is deterministic. In theory and under controlled circumstances, yes. In the real world, and in how people actually use it, no. The OpenAI docs even mention this; it’s only about consistency.

If I encounter a new, unknown command and ask ChatGPT to explain it, then for me it is entirely unpredictable if the answer will be 100% correct, 95% correct, or complete mansplaining bullshit.

Even if it may be close to the truth, with bash the difference between a 95% answer and a 100% answer can be very subtle, with seemingly correct code and a seemingly correct explanation giving a very wrong end result.


Again, you've missed my point entirely. The reason for mentioning determinism was that the burden of proof for "DROP table1;" must be on someone who makes a claim such as yours, not on me, and such proof had better come with some evidence, hence:

https://cookbook.openai.com/examples/deterministic_outputs_w...

Now go find some instances where someone is presented with "rm -rf /" or "DROP table1;" when otherwise expecting a response to help with non-destructive commands!

> For me it is entirely unpredictable if the answer will be 100% correct, 95% correct, or complete mansplaining bullshit.

Please show me some evidence of this variance, because it is either a bold or an ignorant claim that the outputs are wildly unpredictable - 100% true vs "complete mansplaining bullshit". Run the numbers! Do 10,000 responses and analyze the results! Show me! Based on my direct experience with reality, I am completely unconvinced by your arguments. You can easily change my mind by presenting reproducible evidence to the contrary of my beliefs and experiences.

> Even if it may be close to the truth

This is just a classic motte-and-bailey fallacy... let me explain! The bailey is the claim that the outputs are "complete mansplaining bullshit", which is very hard to defend. The motte you retreat to, "close to the truth", is exactly what I'm saying holds for prompts that are more of a translation from one language to another: English to bash, English to SQL, etc.

I have never claimed it would be 100% correct, just that the hallucinations are very predictable in nature (not in exactness, in nature). Here's an example of the kind of error:

Select name and row_id from table1 joined on table2 on table1_id.

  SELECT table1.name, table2.row_id
  FROM table1
  JOIN table2 ON table1.table1_id = table2.table1_id;
Well, that should obviously be table1.row_id in the SELECT, right? And granted, it's not super clear from the instructions, but it's standard that the JOIN should be on table1.id. Oopsie! Is it valid SQL? Yes! Is it "complete mansplaining bullshit"? Not. Even. Remotely.


So what? Review and test it before running it in production. Humans also screw things up on first attempts. Test it, check the output, fix it, try again. In my experience with GPT-4, if you describe a bug it will correct it.


I haven't tried GPT for this purpose, but I don't think we need to assume that everyone tinkering with it is blindly copy-pasting potentially destructive commands.

Anyone who does that now would already have done it from random Google results anyway.


It’s really useful for learning. “Describe this complex bash command I found on Stack Overflow line-by-line.” “Write a quiz question to make sure I understand it.” “Rate my response to this quiz question.”


> Write a quiz question to make sure I understand it. Rate my response to this quiz question.

I like this idea. Definitely gonna steal it.

I love that "extensions" can be implemented just by writing a couple extra words.


This is what I come to HN for. I'll see if ChatGPT could maybe generate some relevant Anki cards as well.


>Which is extremely easy to do; just a misplaced `mv` is enough.

Or `chown -r`... or was it `chown -R`?


It shouldn't be too difficult to pipe ChatGPT output to Bard or something equivalent and ask for validation.
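
A sketch of one way to wire that up, assuming Simon Willison's `llm` CLI (the claude alias requires the corresponding plugin; prompts illustrative):

  # Generate with one model, then have a second model critique the result
  llm -m gpt-4 "write a bash one-liner that deletes all .log files under the current directory" \
    | tee candidate.txt \
    | llm -m claude "Review this shell command for destructive side effects. Do not rewrite it."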


You can also (surprisingly) ask ChatGPT itself for validation. The part of the LLM that activates when solving problems is going to be different from the part that activates when reviewing.


Not used GPT much?



