I consider shellcheck absolutely essential if you're writing even a single line of Bash. I also start all my scripts with this "unofficial bash strict mode" and DIR= shortcut:
#!/usr/bin/env bash
### Bash Environment Setup
# http://redsymbol.net/articles/unofficial-bash-strict-mode/
# https://www.gnu.org/software/bash/manual/html_node/The-Set-Builtin.html
# set -o xtrace
set -o errexit
set -o errtrace
set -o nounset
set -o pipefail
IFS=$'\n'
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
I've had a thought percolating (for many months?) but I haven't tried to phrase it and I'm leery it may sound condescending (this is also more of a public address than direct response) ...
Shellcheck tends to steer us away from "weird" parts of Bash/shell and will happily suggest changing code that was correct as written. There are good reasons for this (and I still think most scripts/projects should use Shellcheck), but Bash is a weird language, and there's a lot of power/potential in some of the weird stuff Shellcheck walls off.
If I had to nail down some heuristics for working with it...:
- do lean heavily on Shellcheck for writing new code if you're unfamiliar with the language or are only writing it under duress
- don't implement its suggestions in bulk (the time you save on iterating can easily be blown later trying to debug subtle behavior changes)
- don't apply them to code you don't understand (if you have the time and interest to understand it but find yourself stuck on some un-searchable syntax, explainshell may help)
- don't adopt them without careful testing
- do take Shellcheck warnings about code that is correct-as-written as a hint to leave a comment explaining what's going on
- do leave Shellcheck off (or only use it intermittently) if you're trying to explore, play, or otherwise learn the language
For me this is way too much to invest in Bash. Any non-trivial utilities should be written in a real programming language when possible, especially when they are used in a context where they are likely to accrete complexity over time. I’m sure there are good use cases for complex Bash programs, but we should be reaching for something with fewer footguns by default.
(Shellcheck is an amazing utility and I use it for all Bash that I write.)
What is? Did I suggest anyone write complex Bash programs?
Shellcheck is amazing. As I suggested in the previous post, I use it for most of my own shell scripts/projects, especially stuff I intend to release. But, because shell and Bash are weird and full of pitfalls, I think it's worth making sure people know you can't just Shellcheck-and-ship.
I think his nitpick was you promoting weird bash tricks. They're neat to use but usually a pain for someone else to maintain. As a general rule, don't use them; don't try to be clever unless you're 100% sure that it's code you'll be the only one to see (which can rarely be guaranteed).
> and there's a lot of power/potential in some of the weird stuff Shellcheck walls off.
Can you give an example? Something Shellcheck warns you away from but which, if you used it, would be better (more expressive / more powerful / more whatever) than a shellcheck-approved solution?
This gets the information without another external utility (grep, cut, sed, awk...)
It's well-past bedtime, so for now I'm just leaving the first one I can think of (it's top-of-mind since I Shellchecked it in the last couple weeks... :])
It's a little contrived in this case, but another common example is a related suggestion for `read -r`:
get_nix_version_parts(){
  local major minor patch
  # shellcheck disable=SC2034,SC2162
  IFS="." read major minor patch < <(get_nix_version)
  local -p
}
$ get_nix_version_parts
major=2
minor=3
patch=4
Using -r there works fine for me, though your SC2034 complaint is valid, in that it essentially prevents you from using a very specific built-in formatter provided by "local -p".
This feels like a rare case, and I'd have no problem doing the "disable=SC2034" there, but I'd add a note explaining why.
Even then, one could make an argument (I think I would, actually) that an explicit printf is clearer and more robust. For example, it gives you more control over the formatting, and it doesn't introduce implicit coupling between the output of "local -p" and our function, which in theory could change in the future or be different on different bash versions, etc. I suspect this probability is very low in the particular case of local -p, but still it's not a great practice. Not something I want to give the stamp of approval inside my code base.
get_nix_version_parts(){
  local major minor patch
  IFS="." read -r major minor patch < <(nixv)
  printf '%s=%s\n' major "$major" minor "$minor" patch "$patch"
}
Good catch on -r not mattering here. I think `read` is one I've had trouble with before, and I assumed that was it since I bothered disabling it. But after grepping around a bit more, the only case I can find where shellcheck would object and the behavior actually differs isn't common enough that I'd gripe about it.
I don't actually have a gripe with 2034, here. It's a good example of a statement that the user obviously has to triage.
Likewise, I'm just using `local -p` as a cheap way to show that the variables populate; it wasn't part of the original code I adapted. Yes--the case for coupling to the output of local -p is when the output is going to be used to set variables again later.
Going back to intentionally look through a few things for examples is helping me better clarify why I've had this impression slowly building up...
1. Some Shellcheck code titles/short messages communicate uncertainty well, but some others can read as more confident/absolute than the situation merits.
2. Broadly, Shellcheck tends to steer people away from word-splitting. Word-splitting can be a surprising PITA, so I get it. But, it's also a critical concept for understanding shell.
If you are reluctantly writing Bash/shell, it's good to have Shellcheck help you avoid it. If you want or need to understand the language, you're going to have to grapple with word-splitting to get there.
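A tiny sketch of the behavior I mean (the message text is arbitrary):
msg="two words"
printf '<%s>\n' $msg      # unquoted: word-split into two arguments: <two> <words>
printf '<%s>\n' "$msg"    # quoted: a single argument: <two words>
Shellcheck will (rightly, most of the time) flag the unquoted form, but deliberate splitting like this is also a real tool once you know what it does.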
Fair :) but I think there's a degree difference here (and this is more about the language than Shellcheck itself).
Also, maybe my original phrasing left room for misinterpretation? Shellcheck will happily suggest changes that break working code. I think I've had it happen ~4 times this year?
Obviously it varies a little by tool type and ecosystem--it's not much of a surprise if a dead-code analysis tool suggests removing things you can't actually cut, and likewise linters in most languages can point out unused variables/parameters that you can't cut. But I can't recall the last time another tool gave me specific do-x-not-y suggestions that just weren't anywhere close to fungible.
> Shellcheck will happily suggest changes that break working code.
Which is OK once it is known. Your example shows that nicely. The flagged behavior is often unexpected, so where it is expected, having a shellcheck line that turns off the warning greatly improves readability of the whole script, essentially telling the reader "here is the dependence on non obvious behavior."
Where such dependence exists, I prefer to see the "shellcheck" directive on the line right before, pointing at it by turning that specific warning off for exactly that line. It's then a "vetted" and "explained" line.
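Something like this, with the reason spelled out (SC2086 and the variable names are just an illustrative pick):
# shellcheck disable=SC2086  # EXTRA_FLAGS is intentionally word-split into separate options
tar czf "$archive" $EXTRA_FLAGS "$src_dir"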
I have created a similar library for bash. However, there are more pitfalls related to errexit, see [1], and even shellcheck cannot help there. I have tried to solve the pitfalls in my library, but it turned out to be ugly and unreliable. That's why I'm trying to use rust [2] for shell-scripting-like tasks nowadays.
You've a lot of cool stuff in that repo! I hope you don't mind me 'stealing' some stuff for a kind of booklet I'm working on with several shell tips and tricks.
imho all the behaviors that post outlines as "confusing" are my desired behaviors most of the time. For me it's pretty clear-cut, error out by default and only continue when explicitly allowed with something like `|| true`.
unless the command that fails is part of an until or while loop, part of an if statement, part of a && or || list, or if the command's return status is being inverted using !.
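A small sketch of those exceptions, for anyone who hasn't run into them:
set -o errexit

if false; then             # the failing command is the `if` test: errexit does not trigger
  echo "never reached"
fi
false || echo "handled"    # part of a || list: exempt
! true                     # status inverted with !: exempt even though it "fails"

echo "still running"       # all of the above fall through to here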
Bash is one of my favorite programming languages. Not for production. Writing significant production Bash that others have to use and maintain is more of a struggle than it is worth.
But for personal use? Hitting APIs, organizing data, writing to files... Bash has been enjoyable because it requires much more creativity than almost any other language. At some point it's just you and the coreutils against the world.
These types of things are why I always use a real programming language rather than shell scripting. I can't wrap my head around all the different pitfalls of treating random string outputs as input for other commands without first validating and parsing them into an actual data structure.
Shell scripting has its place. In general it works fairly well and it’s an important skill to have.
I used to have a report who had a similar train of thought, and ended up writing shell scripts in Python, which was way more fragile, unreadable and complicated.
The edge cases are there, but you don’t run into them all that often. Or if you do, you fix your scripts. Take the cp -- example (the 2nd pitfall). You generally don’t start file names with a -. It just makes Unix life annoying so you get pushed away from doing it.
Keep scripts sane and straight forward, you can do really nice and powerful work knowing Unix tools. You can also shoot yourself in the foot some day when you create a file called ‘-rf’.
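And for completeness, the usual escape hatches once that file exists:
touch -- '-rf'   # create the awkward file in the first place
rm -- '-rf'      # -- ends option parsing, so -rf is treated as a filename
# or equivalently: rm ./-rf (a path that doesn't start with a dash)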
Bash is like a spec ops soldier's knife. Sure it could be used to kill someone (Bash on Balls, anyone?), but it's mostly for doing boring stuff like opening MREs (getting around a filesystem) or removing a deep thorn from the skin (removing a stupid "$n/a" string in a one-use TSV).
I wouldn't send a navy seal into battle without one, but it isn't an absolutely critical piece of gear.
Many people have responded to your comment saying that this is bad and that using other languages makes things more complicated when you are doing 'bash-like' things. I'm guessing they haven't actually tried it, because I have used python with the [sh](https://github.com/amoffat/sh) package many times to replace bash scripts, and I have never regretted it.
I definitely relate with this. This is why I do my “scripting” these days in Go. Programs aren’t editable like Python, but compiling a Go program results in a single executable that I can just drop onto a host somewhere, without worrying about interpreter versions and pip and all that crap.
I haven’t found Bash to be much better in practice. You can’t really do much with Bash alone, it’s only as useful as the tools you use it with, the ones you install with your OS package manager. And Bash versions themselves can vary dramatically. Try writing non-trivial Bash scripts that are portable between macOS and Linux...it’s a nightmare.
(Yes I know there are libs for python that can do this, but in general Go is just simpler: it has everything I need for writing utilities out of the box, including way simpler package management, and I can come back 2 years later to some code I wrote and jump back into it with no effort.)
>Bash alone, it’s only as useful as the tools you use it with
This is like half the point for me. It's that I have lots of cli tools I like to use, and many of them are very unix-philosophy like. I want to cobble them together to accomplish a task, and since they already work with shell and pipes etc, it's trivial.
Now, in the alternative, I have to go find some library that does the equivalent, or write my own version of the functions, etc., and lots of times those libraries have their own sets of pitfalls. I like bash because it encourages sticking to the unix philosophy of do one thing and do it well (because you are just using it as glue, not as a material itself). (Also, sometimes I just want something that's easy to read and understand.)
For all those who bash bash as so horrible, unstable, non-prod.... bash scripts aren't perfect, and nobody who uses bash would say so, but bash runs on millions of prod systems across the world and things are fine... so maybe what's needed is a bit less holier-than-thou attitude from those people and a bit more "you do you".
The point is that you still have to make sure those tools are installed on the system, and compatible with the code you wrote. Standard CLI programs we rely on in our Bash scripts can vary dramatically between Linux distributions and OSs. If you want portable Bash code you need to bundle it in a container w/ all of its dependencies, or be very judicious about what programs and program features you use.
I'm not bashing Bash, I use Bash all the time and it's enormously useful. But it has limitations and tradeoffs that make me reach for other things more often.
One of the beauties of bash is that it's _not_ compiled.
Can you read the Go source code from a Go binary? Is the algorithm transparent to the user?
How do you jump back to work on a binary? You have to keep the source code somewhere else (probably in a repo somewhere); now you have an application instead of a script.
> You have to keep the source code somewhere else (probably in a repo somewhere)
You should be doing that for scripts as well! Far too often I've seen problems arise when a team depends on a process but it's just a shell script run by cron from the home directory of the guy who's on vacation.
You could also do `go run main.go` if you want to be close to the source. The Gitlab Runner repo has a "script" run like that[1].
Yeah that's one of the first things I mentioned. Go executables aren't editable. In practice it barely makes a difference, if you can SSH onto a machine then you can SCP a binary onto it too. At that point you edit the source code and press Ctrl+P, Return to rerun "go build -o fix-the-thing main.go && scp fix-the-thing $HOST:/tmp && ssh /tmp/fix-the-thing". And then you've got your code ready to check in to source.
Yup, and something rarely mentioned is the lack of good testing frameworks, plus all the pitfalls when you try to make something portable.
I will admit Python doesn't have great, simple primitives for shell-like tasks but you can create a few functions to make a simple sh("") wrapper.
With Ruby's backtick execution syntax you can blend in shell/system utilities pretty easily. Unfortunately, Ruby isn't installed on as many systems by default, afaik
Here's a Ruby equivalent of looping through mp3 files in the current directory
I find a lot of times someone extends or duplicates a bash script and soon gets over their head in complexity, or doesn’t realize how much easier a Ruby/Python/whatever script would be to change, mostly because it’s so much more readable.
Bash has all kinds of pitfalls, like the difference between checking that a file exists (-e) and checking that it’s larger than 0 bytes (-s).
Then you get people who instead use php to do scripts. Ugh.
This is where perl really shines for me. Regular expressions are front and center, making it so frictionless to slice and dice text. My bash knowledge is so stunted because I bail to perl as soon as the going gets tough. :)
Mine remains: that shell affords capabilities via utilities. Awk (again, with numerous implementations) also offers regex capabilities, some not offered by sed, and missing a few others.
When writing cross-platform scripting, adhering to common standards rises in importance. This is possible, if occasionally limiting.
Perl is influenced by shell scripting and there are several easy ways to run commands, build pipes, redirect and other things shells are good at, like testing and listing files.
And as ugly as Perl looks, it has far fewer pitfalls than shell scripts.
With proper development practices, you can write clean Perl, even if you don't know much about the language. It has the same constructs as most other procedural languages and you can apply the same principles. Object orientation is a bit tricky though.
The thing is, Perl won't help you with discipline. If you want write-only code, Perl will compile it, no problem.
For that reason, it is the language of choice for one-liners and throwaway scripts (and that's how I use it most of the time). But writing clean Perl is perfectly doable, though I usually prefer static languages if maintainable code is a priority (no Perl, no shell, not even Python).
While bash itself has some warts, many of the issues are inherent to spawning child processes for accomplishing tasks. Thus, when possible, I try to use proper libraries rather than invoking commands from any language at all.
The cost you pay for this is verbose and unergonomic interop when your problem domain is executing a bunch of commands. For all their warts, shells shine at being extremely expressive.
It’s usually not worth the effort to shove everything into a real programming language; better to write a few self-contained commands designed to be called from a shell and then use the shell as the glue.
Years ago I spent a lot of time (arguably too much time) writing (ba)sh scripts and this document was a great resource! But my main lesson from that period was that bash pitfalls are readability/maintenance nightmares, compounded by the fact that most people don't know about most of the pitfalls. To make matters worse, you can't know which pitfalls are or aren't known to the original author / future maintainer of the script in front of you.
But I'm not sure to what extent the language warts are inevitable consequences of what makes it so expressive and powerful. For example: unix pipes are amazing and the bash syntax for them is quite expressive. I can reproduce the same behavior in any of the more "sane" languages but I find that the equivalent code is a less readable procedural soup of string munging, control flow, and error control.
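For example, something like this stays a readable one-liner in shell (the log file name and field position are made up), whereas the equivalent in most "sane" languages tends to become exactly that soup:
# top 10 most frequent values of the 3rd whitespace-separated field
awk '{print $3}' access.log | sort | uniq -c | sort -rn | head -10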
Is there a greater principle behind all of these pitfalls, a mental model that I can learn and use to avoid these kinds of mistakes? Or is it sort of like CSS where years and years of ad-hoc development make it an exercise in memorizing edge-cases?
* On Linux, file and directory names can be anything except NUL and the / separator. Because of how much software already assumes they’re UTF-8 you can probably safely assume it too and not run into much trouble. But since bash basically just does some text substitutions and then calls eval, you have to be defensive.
Pretty much this, and as a subset of it: quoting, word splitting, the IFS variable and friends. Reading the BashGuide [1] cover to cover is an excellent base, I find.
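The defensive idiom that eventually becomes muscle memory, roughly (the `process` command here is a stand-in):
# handles filenames with spaces, newlines, leading dashes, etc.
find . -type f -name '*.log' -print0 |
while IFS= read -r -d '' file; do
  process -- "$file"   # note the quotes and the --
done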
Do you know if the `--` trick mentioned in the article to stop option scanning works with all bash programs? I thought parsing was up to the program itself. Is it just a widely used convention?
Talking of mental models: These are not "bash programs". Most of the ones where you need this mechanism have nothing to do with the Bourne Again shell, and are just generally-usable utility programs, usable even by things which aren't shells at all.
They are, generally, programs that happen to use library mechanisms such as (but not limited to) getopt() or the popt library to parse their arguments.
The most important thing for understanding shell (without loads of special cases) is understanding the Unix kernel, as well as the kernel interface that the shell uses (i.e. which syscalls).
If you strace the shell, that goes a long way. If you haven't used strace or don't know what it is, I recommend learning.
These comics give more details, and so do Julia Evans' other comics (I think the first one was about strace):
For example this will help you understand that filenames and argv arrays are byte strings in Unix, which may be different than the notion of strings in your programming language.
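A quick way to see it for yourself (output trimmed by hand; the .txt filenames are made up, and exact formatting varies by strace version):
$ strace -f -e trace=execve bash -c 'grep -i foo *.txt'
execve("/bin/bash", ["bash", "-c", "grep -i foo *.txt"], ...) = 0
execve("/usr/bin/grep", ["grep", "-i", "foo", "a.txt", "b.txt"], ...) = 0
The glob never reaches grep; the shell expands it into separate argv entries, each of which is just a byte string.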
-----
That will go a long way. However there is still a ton of accidental and needless complexity in bash, particularly related to string processing, and which is unrelated to the kernel. Hence the Oil project:
I think it is different than CSS here. With CSS I could never really understand how it worked, and each new browser release introduced new concepts that added more and more corner cases to the mix. On the other hand, with shell scripting, after a while I got a decent mental model of how things worked. It is just that there are some bad design choices that make it very easy to shoot yourself in the foot.
For example, word splitting "feature" of POSIX shell is the source of many of the pitfalls in the list and the solution in those cases is always the same -- always quote your variables.
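The classic demonstration (filename chosen to make the breakage obvious):
file="my report.txt"
touch "$file"
cat $file      # word-split into two args, "my" and "report.txt" -> two errors
cat "$file"    # one arg, works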
By the way, check out Shellcheck if you haven't already. It can detect many of these common pitfalls and the error messages have a link to the explanation so it also works as a learning tool.
Seeing this wiki hugged to death by HN makes me realize that it would be perfect for migrating to some simple cloud hosting and using static markdown in git.
All their maintainers are already technical enough to handle doing updates in git. And with certain static doc generators like mkdocs you can have an Edit link in each page that brings the user to a Gitlab repo with an editor and Preview.
Throwing everything at another company so you can blame your outages on them does not a better engineer make.
This site is just doing something silly dynamically for something that should be static (e.g. reading from a database). A single modern computer can easily handle the HN hug of death serving static files.
Shellcheck is, in general, one of the most undervalued dev tools hiding in the bushes. Especially as everyone has to shell at least a little bit, in particular linux newbies.
With sed, in some cases, you can. One of the differences between Linux-provided sed and BSD-provided sed is the -a option.
FreeBSD manpage:
"-a The files listed as parameters for the "w" functions are created (or truncated) before any processing begins, by default. The -a option causes sed to delay opening each file until a command containing the related "w" function is applied to a line of input."
So let's say you have a file:
cat > file
sandfoo
^D
With the BSD-provided sed:
cat file|sed -na 's/foo/bar/;H;$!d;g;w'file
The replacement is made without any temp file.
Note this example will add a blank line at the top.
Why use sed? sed is part of the BSD base system, the toolchain^1 and install images, thus it can easily be utilised early in the boot process. I am not aware that "sponge" is part of any base system.
1. NetBSD's toolchain can be reliably cross-compiled on Linux.
sed -i creates a temp file. Needs additional storage space the size of the file.
This can pose a problem with large files on smaller computers with limited storage and memory.
In the past, some BSD seds (other than FreeBSD) did not have the -i option.
Also worth noting that "sponge", unlike sed, apparently reads the entire file into memory. For a large file, this would require enough RAM to fit the entire file.
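For reference, the moreutils sponge pattern being compared against looks like this (app.log is a made-up example); it soaks up all of its input before opening the output file, which is why it can rewrite a file in place at all:
grep -v 'debug' app.log | sponge app.log   # no explicit temp file on your part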
Most of the pitfalls described in this doc are /bin/sh pitfalls, not bash per se.
On that note, let me rant: stop writing bash scripts people. If POSIX shell is not good enough, it's a good sign that you should move to a more expressive language with better error handling. Bash is a crappy compromise: it's not as portable as POSIX but it's also a very crappy scripting language.
You might say I'm a pedant because "bash is everywhere nowadays anyway, your /bin/sh might even be bash in disguise" but I'll be the one having the last laugh the day you have to script something in some busybox or bash-less BSD environment. Knowing how to write portable shell scripts is a skill I value a lot personally, but that might be because I work a lot with embedded systems.
Also if you assume that your target system will always have bash, there's a very good chance that it'll also have perl and/or python too. So no excuses.
This is basic nonsense. Design your scripts and provision your environments. _You_ can stop writing bash scripts. I find them eminently useful and I control/provision the environments they run in. If you are not in control then do what you are told. Most people would puke if I told them that a lot of csh is alive and well in sci-prod environments, but that is the way it is. You _do_ tailor your environment bc that is the way *nix was designed. Controlling people and people's habits is faang land.
The biggest transformative learning experience for me in Unix command gluing was to understand argument vectors: that a command execution is not just one big string, but will, at the point of calling the command, be transformed by Bash into a vector of strings to be passed to the execve system call.
The point is, all those quotes etc. are syntax we use to tell the Bash interpreter what the strings passed on should be. Think in terms of the string vector and not in terms of the one big string in front of your eyes that is a line of Bash code. The latter is a recipe for memorizing magic incantations and cargo-cult sprinkling of quotes here and there.
The horrible thing with Bash is that while normal programming languages make the difference crystal clear between the name of a function being called and its mandatorily quoted string args neatly separated with commas, Bash will come up with this split on the spot using arcane rules.
Letting a string automatically fall apart into multiple arguments on the spot may have seemed convenient in the old days for forwarding invocations, but it seems clear to me that this was a design mistake.
In the end if you want to have robust Bash code you have to keep track of string splitting and will almost never rely on the default behavior of letting Bash "helpfully" pre-digest and split things up for you.
Much of Bash expertise is then about how to fight vigorously against the default behavior of silently chunking up your args. But fighting really hard means using e.g. arrays with ugly syntax, and other magic-looking constructs. It shouldn't take so much alertness just to keep to sane behavior. The problem is, the interpreter is actively hostile to your endeavor and will explicitly try to screw with you.
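A small way to make that vector visible (the function name is arbitrary):
show_argv() {
  local i=0 arg
  for arg in "$@"; do
    i=$((i + 1))
    printf 'argv[%d]=<%s>\n' "$i" "$arg"
  done
}
x="one two"
show_argv $x      # unquoted: the function receives two separate strings
show_argv "$x"    # quoted: one string, spaces preserved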
I was writing a script recently and I was going to use `cp -- "$src" "$dst"` but neither `man cp` nor `cp --help` mentioned `--` as an option. I'm not sure I'm using info right, but it doesn't seem to be mentioned in `info cp`, either.
Is that feature described somewhere else, or is it just undocumented?
The -- mechanism in general, not specific to the cp command albeit mostly limited to programs that use getopt/popt for their argument parsing, is described all over the place. I first encountered it in a book, written by Eric Foxley, entitled UNIX for Super-Users and published in 1985.
* For the cp command that is part of Rob Landley's toybox, this is covered in the introduction to the toybox(1) manual page, and there isn't really a distinct cp(1) manual page. (http://landley.net/toybox/help.html)
* For an operating system where this is actually in the cp(1) manual itself, look to Solaris, and its successor Illumos, where you'll find it in the NOTES section. (https://illumos.org/man/1/cp)
That is a far more comprehensive answer than I was expecting!
You're absolutely right about my system. There is a link to the coreutils common options in the cp info page. I missed that when I was searching through it.
Oh neat. I found a problem with a script I wrote today because of this.
I wrote a script recently that takes several screenshots from the same folder, merges them using Image Magick and outputs the merged image into a PDF. Found out through the first tip that I shouldn't be using ls: https://mywiki.wooledge.org/BashPitfalls#for_f_in_.24.28ls_....
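The shape of the fix, roughly (the paths and the ImageMagick invocation here are illustrative):
shopt -s nullglob                 # an empty match becomes an empty array, not a literal *
files=(./screenshots/*.png)
(( ${#files[@]} )) && convert "${files[@]}" merged.pdf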
One new pitfall I learned yesterday is that sourcing a file searches $PATH. I ran a tool that did ". config" on OpenBSD and it tried to source config(8) instead of its configuration in the local directory. Oops.
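The fix, for anyone bitten by the same thing, is to be explicit about the path:
. ./config   # explicitly the file in the current directory
# rather than
. config     # may resolve via $PATH first (which is how config(8) got picked up)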
Whenever sh is not good enough and I have to switch to bash, I'd rather rewrite it in proper Perl. Bash really looks like a hack, and not all environments have bash. Everybody has a conforming sh.
I was half-expecting a page with the bash grammar. It's super easy to make mistakes in that language. I'm glad there are things like xonsh around trying to act as sane alternatives.
It's meh because like many things, you actually have to read the fine manual if you want stuff to actually work.
It's interesting because it's always nice to have a refresher and/or reminder of various bash things.
That being said... I've said this many times, I'll say it again: bash scripts are software, and thus must not be exempted from various programming best practices. The first thing that comes to my mind is input validation and error checking.
Bash is not magical. It's one of those things that have to be mastered through practice... Just like many other programming languages.
Yea. Or test the heck out of it. Once it works, it tends to work well.
It took me ~8 years (so far, I'm sure there is a bug somewhere) and considerable #bash advice to make a simple script that locks commands+args https://github.com/jakeogh/commandlock
I also wrote an MDA wrapper for gnupg, and again, it took years to feel good about having my mail filter through it https://github.com/jakeogh/gpgmda
Now that py has pathlib, if it's more than a few lines, py with @click is just way better. All that is really missing is cleaner copy/move abstractions.
There are also some features that are missing from all shells... like the ability of a script to know if expansion happened before it got $@... that internal state just isn't exposed.