
I tend to agree.

> Ranges also make exceptions clearer, e.g., if you omitted the letter O it would be obvious in a range but not obvious in a long list.

That feels true for 0-9A-Za-z, specifically. But that's only because of the ubiquity of ASCII. In EBCDIC A-Z and a-z aren't contiguous, and POSIX only requires 0-9 to be contiguous.
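A quick way to see the EBCDIC point, sketched in Python. This assumes Python's built-in cp037 codec as a stand-in for "EBCDIC" (it is only one of several EBCDIC code pages), and is_contiguous is a helper name made up for this illustration:

```python
import string


def is_contiguous(chars: str, encoding: str) -> bool:
    """Return True if the characters map to consecutive byte values."""
    # Each of these characters encodes to a single byte in both codecs used here.
    codes = [ch.encode(encoding)[0] for ch in chars]
    return all(b - a == 1 for a, b in zip(codes, codes[1:]))


# cp037 is an assumption: one common EBCDIC code page that ships with Python.
for encoding in ("ascii", "cp037"):
    for name, chars in (
        ("A-Z", string.ascii_uppercase),
        ("a-z", string.ascii_lowercase),
        ("0-9", string.digits),
    ):
        print(f"{encoding:6} {name}: contiguous={is_contiguous(chars, encoding)}")
```

Under cp037 the letters come in three separate runs (A-I, J-R, S-Z) with gaps between them, so a byte-order range like A-Z would also cover the non-letter code points in those gaps. The digits stay contiguous in both encodings.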

In practice those are esoteric caveats we can all ignore, but for everything else (every other potential range) they can't be ignored at all. The range short-hand is actually quite useless except for very terse scripts (e.g. sed) and only then for A-Za-z (or A-Fa-f).[1]

Similarly, defining letter using the long form may be more error-prone than the idiomatic short form, but that's the only such case. In practice letter is often predefined, and in any event it's rather trivial to verify oneself: step through the alphabet, then double check by counting letters to 26. I don't think I've ever had an error where I forgot a letter in a long-form A-Za-z set, but I have forgotten a letter in a long-form A-Fa-f set and even done stupid stuff like a-e in short-form precisely because it's easier to be sloppy when you think it's difficult to get wrong. In terse code big errors are inconspicuous.
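For what it's worth, both kinds of slip are cheap to check mechanically. A rough Python sketch; missing_letters and the sample sets are invented for illustration and aren't taken from any real code:

```python
import re
import string


def missing_letters(charset: str, expected: str) -> list[str]:
    """Return the expected characters that the hand-written set left out."""
    return sorted(set(expected) - set(charset))


# Long form: the "step through the alphabet, then count to 26" check.
long_form_upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
assert len(long_form_upper) == 26
assert missing_letters(long_form_upper, string.ascii_uppercase) == []

# Long-form hex letters with a deliberately forgotten 'e'.
long_form_hex = "ABCDEFabcdf"
print(missing_letters(long_form_hex, "ABCDEFabcdef"))  # ['e']

# Short-form slip: a-e instead of a-f still compiles and mostly works.
sloppy_hex = re.compile(r"[0-9A-Fa-e]+")
print(bool(sloppy_hex.fullmatch("dead")))  # True
print(bool(sloppy_hex.fullmatch("beef")))  # False
```

The a-e range raises no error and accepts plenty of valid input, which is exactly why that kind of mistake stays inconspicuous until an input containing a lowercase f shows up.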

[1] In writing portable sed and tr code I typically list the letters individually as IME neither [[:alpha:]] nor [A-Za-z] is universally supported by the default implementations. /bin/sed and /bin/tr may not be the system's POSIX-compliant implementations or may require special environment presets, and it's usually easier to live within de facto limitations than to debug and maintain a complex preamble that attempts to locate the POSIX-compliant utilities.


