Your top 100 unix commands, for science (docs.google.com)
156 points by hackerpants on May 19, 2013 | 79 comments


It's ironic that the standard one-liner for this uses a pipeline but only counts the first word of the first command of the pipeline. Given itself as input, "sort" ought to be the most used command, but "history" is the only one counted. (I bring this up not for the sake of standard HN nitpicking but to point out that you probably do want to cover at least `sort` in your workshop, and it will be underrepresented in these results, along with `grep`, `wc`, etc.)

Unfortunately, shell grammar is complex enough that a correct one-liner is probably infeasible. For example, one of my top "commands" if you count by words is an environment variable setting prepended to an actual command.

EDIT: you can get good enough results by just splitting on "|", as others have suggested here -- any parts of regular expressions, etc. that aren't really commands will probably be infrequent enough to get lost in the noise, and treating || as containing an empty command won't hurt. If you're going to catch ||, though, might as well get && too... and now you're going down the rabbit hole :-)
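
A rough sketch of that "good enough" splitting, assuming history lines arrive on stdin in bash's numbered format. The function name `count_commands` is made up for illustration, and quoting is ignored on purpose, so a "|" inside a string still splits:

```shell
# Count the first word of every pipeline stage, reading history lines on stdin.
# Sketch only: quotes are not parsed, so separators inside strings still split.
count_commands() {
    sed 's/^ *[0-9][0-9]* *//' |      # drop the leading history number
    awk '{
        # Treat ||, &&, | and ; as one separator, then walk the pieces.
        gsub(/\|\||&&|[|;]/, "\n")
        n = split($0, parts, "\n")
        for (i = 1; i <= n; i++) {
            m = split(parts[i], words, " ")
            # Skip leading VAR=value assignments such as FOO=bar cmd.
            j = 1
            while (j <= m && words[j] ~ /^[A-Za-z_][A-Za-z0-9_]*=/) j++
            if (j <= m) print words[j]
        }
    }' |
    sort | uniq -c | sort -rn
}
```

In an interactive bash session this could be tried as `history | count_commands | head -n 100`.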


Yup, this oversight is officially embarrassing.


Quick hack - you can include the commands you pipe to with this:

    history                  \
        | sed "s/^[0-9 ]*//" \
        | sed "s/ *| */\n/g" \
        | awk '{print $1}'   \
        | sort               \
        | uniq -c            \
        | sort -rn           \
        | head -n 100        \
        > commands.txt
I haven't tried to account for pipe symbols inside strings - it didn't seem worth it.

In case there are commands you want to exclude (which I do), you might want to use "head -200", remove the commands you don't want to provide, and then trim to 100.

Added in edit - having done this a good half of my top 100 are actually scripts, so this is pretty pointless for me unless I rummage through them to find the common commands.

Added in edit again:

OK, here's a version that only includes actual system commands, and hence filters out all my personal scripts and commands:

    history                      \
        | sed "s/^[0-9 ]*//"     \
        | sed "s/ *| */\n/g"     \
        | awk '{print $1}'       \
        | xargs which            \
        | sed "s.^/usr.."        \
        | grep ^.bin             \
        | sed "s/^.*\///"        \
        | sort                   \
        | uniq -c                \
        | sort -rn               \
        > commands.txt


I think you want s///g on your second sed... Also that doesn't work in Mac OS X for some reason (their sed doesn't appear to interpret \n in the replacement text). I replaced it with perl to make that part work:

    perl -pe 's/ *\| */\n/g'
I still haven't gotten the whole thing to work because my history contains the above history pipeline, so it's splitting the "|" that is inside the sed command onto multiple lines, which causes "xargs which" to balk because the quotes no longer match or something:

    xargs: unterminated quote
Shells are amazing until spaces or quotes are involved! :-)


Yes, I've inserted the "g" in the appropriate sed commands. Thanks - good catch.

Some systems seem to require \r instead of \n - I know vim's behavior differs from sed's in this, so that might be an issue.

With xargs balking, you can throw away the error stream at that point, or convert it to a loop over the alleged commands:

    for c in $( long thing before the xargs )
    do
        which $c
    done \
    | long thing after the xargs.
Specifically:

    for f in $( \
        history                      \
            | sed "s/^[0-9 ]*//"     \
            | sed "s/ *| */\n/g"     \
            | awk '{print $1}'       \
        )
    do
        which $f 2> /dev/null
    done                         \
        | sed "s.^/usr.."        \
        | grep ^.bin             \
        | sed "s/^.*\///"        \
        | sort                   \
        | uniq -c                \
        | sort -rn               \
        > commands.txt
And yes, spaces are a pig, and can lead to all sorts of ambiguities that have no reasonable resolution, especially in filenames.


"\r" doesn't work either. It's just not interpreting those kind of escapes. I get lines like this:

    historyrsed "s/^[0-9 ]*//"rsed 's/ *r*/\r/g'rless
The for loop is a good idea, though I prefer the "while read" idiom since it fits in with the pipeline better:

    ...
    | awk '{print $1}'                      \
    | while read line; do which $line; done \
    ...
That finally works for me. Here's the whole thing:

    history                                     \
        | sed "s/^[0-9 ]*//"                    \
        | perl -pe 's/ *\| */\n/g'              \
        | awk '{print $1}'                      \
        | while read line; do which $line; done \
        | sed "s.^/usr.."                       \
        | grep ^.bin                            \
        | sed "s/^.*\///"                       \
        | sort                                  \
        | uniq -c                               \
        | sort -rn
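A slightly more defensive sketch of the `while read` stage, offered as one possible variant rather than what anyone in the thread ran. It swaps `which` for the POSIX `command -v`, quotes the name so odd tokens do not break the loop, and takes the "system" directories as arguments (defaulting to /bin and /usr/bin, which is what the sed/grep filter above keeps). The function name `only_system_commands` is hypothetical:

```shell
# Keep only names that resolve to an executable directly inside one of the
# given directories. Reads one candidate command name per line on stdin.
only_system_commands() {
    # Default directories mirror the /bin and /usr/bin filter used above.
    [ "$#" -gt 0 ] || set -- /bin /usr/bin
    while IFS= read -r name; do
        # Skip empty names and anything that looks like an option.
        case $name in ''|-*) continue ;; esac
        # command -v prints a path for executables and a bare name for builtins.
        path=$(command -v "$name" 2>/dev/null) || continue
        for dir in "$@"; do
            if [ "${path%/*}" = "$dir" ]; then
                printf '%s\n' "${path##*/}"
                break
            fi
        done
    done
}
```

It would slot into the pipeline in place of the `while read line; do which $line; done` stage and the three filters after it, followed by `sort | uniq -c | sort -rn` as before.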


This works for me in zsh on OSX

    history                        \
        | sed "s/^[0-9 ]*//"       \
        | perl -pe 's/ *\| */\n/g' \
        | gawk '{counts[$1] += 1} END { for (x in counts) { print counts[x],x}}' OFS="\t" \
        | column -t                \
        | sort -k 1,1nr            \
        | head -100

Ok not sure how to format this on HN https://gist.github.com/nyxwulf/5608955#file-gistfile1-sh


From what I recall, to represent a newline in sed on OS X you need to insert an actual line break.

Take the poster's script above and copy it to a text file. Now replace \n with a new line, save and it should run.
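
A minimal sketch of what that looks like: a backslash followed by a real line break inside the sed replacement, which both GNU and BSD sed accept. The function name `split_pipes` is only for illustration:

```shell
# Put each pipeline stage on its own line without relying on "\n" in sed.
split_pipes() {
    # The backslash-newline pair below is the newline in the replacement,
    # so the "/g" has to start at the very beginning of the next line.
    sed 's/ *| */\
/g'
}
```

For example, `history | sed "s/^[0-9 ]*//" | split_pipes | awk '{print $1}'` feeds one stage per line to awk. `tr '|' '\n'` is another portable option if the surrounding spaces do not matter.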


Thanks very much! As of 2:14pm, it's using your new version for greater pipe accuracy =)


Not too difficult to guess what I'm doing. I also use a lot of sed, grep, xargs and such, but they're typically piped and don't show up here. I removed some local scripts from here that won't make sense to others.

  1800 git
   701 cd
   541 rsence
   539 ruby
   420 ssh
   354 ls
   305 host
   260 make
   242 subl
   202 mosh
   196 brew
   193 sudo
   189 rvm
   184 gem
   131 cat
   130 irb
   117 say
   116 make
    98 rm
    94 svn
    83 curl
    66 whois
    66 find
    65 ping
    60 mv
    57 mongo
    55 scp
    53 python
    47 open
    43 man
    40 du
    40 cp
    39 mkdir
    39 launchctl
    38 which
    33 lm4flash
    31 while
    30 bluecloth
    22 time
    22 ps
    21 npm
    21 miniterm
    21 killall
    20 locate
    19 ln
    18 tail
    18 #
    17 pwd
    17 ./configure
    15 dd
    14 nslookup
    13 kill
    12 sleep
    12 lsof
    12 df
    12 chdiff
    11 echo
    11 chmod
    10 touch
    10 mount
    10 ioreg
    10 ifconfig
     9 traceroute
     9 telnet
     9 rsync
     9 pip
     8 ljfuse
     8 fdisk
     8 dig
     7 ulimit
     6 tar
     6 mongod
     6 java
     6 base64
     6 autoconf
     6 alias
     6 /usr/bin/ruby
     5 reset
     5 node
     5 diff
     5 automake


In case anyone is interested in the relevant incantation:

  history | awk '{print $2}' | sort | uniq -c | sort -rn | head -n 100
OP, maybe add a curl command or something to post the resulting answer to your website form?


Excellent idea! I am investigating how to best do that.


This should submit the commands but not any of the optional questions:

  hist=`history | awk '{print $2}' | sort | uniq -c | sort -rn | head -n 100`; curl -i -X POST -H 'Content-Type: application/x-www-form-urlencoded; charset=utf-8' -d "draftResponse=%5B%5D%0D%0A&pageHistory=0&entry.194207258=&entry.1414618252=" --data-urlencode "entry.1080345712=$hist" https://docs.google.com/forms/d/1XNMoSdfYFe_WkPfU--M88oL00PDLIOAo1HxjhZvZYJ4/formResponse


To anyone copying/pasting: the whole command extends far to the right of what's visible.


Not very difficult to tell how I use my computer :D

    139 vim
    137 coverage
    135 ls
    105 cd
     98 git
     98 fg
     52 clear
     25 sudo
     19 ./manage.py
     17 grep
     16 mysql
     14 source
     14 find
     13 python
     11 rm
     10 ssh
      8 mv
      8 go
      7 ps
      6 rsync
      6 pip
      5 xargs
      5 mkdir
      5 cp
      4 xclip
      4 sed
      4 pianobar
      4 man
      3 vi
      3 touch
      3 pkill
      3 deactivate
      3 alsamixer
      2 time
      2 tar
      2 startx
      2 ssh-add
      2 sprunge
      2 kill
      2 java
      2 ./build.sh (script to build a Go program)
      2 ./bin/api_server
      1 xbacklight
      1 which
      1 s
      1 rmdir
      1 redshift
      1 lesss
      1 less
      1 jobs
      1 history
      1 hg
      1 export
      1 dmesg
      1 date
      1 cat
      1 ./all.bash
      1 acpi


Well.

   1053 ,
   1002 %
    679 make
    659 vi
    630 s
    293 cm
    278 l
    257 ack
    246 grep
    218 a
    183 co
    147 w
    143 sudo
    133 rm
    129 cat
    114 ..
    110 cd
    109 h
    107 go
     93 ssh
     86 mv
     84 v
     71 git
     70 find
     67 irb
     63 ./manage.py
     62 echo
     61 apt-cache
     55 ~
     46 ruby
     46 jekyll
     46 ada
     43 mkdir
     43 erl
     42 man
     40 ./rebar
     39 gpg
     38 dc
     37 nc
     37 d
     35 rh
     35 cx
     34 cp
     32 ps
     31 curl
     31 bundle
     30 file
     30 dig
     29 ad
     28 chmod
     27 tmux
     27 bin/koosk6la
     26 ping
     26 h1
     26 gdb
     25 r
     25 ./gundrey
     25 fab
     25 ./deploy
     24 tail
     24 lt
     24 dialyzer
     24 bd
     22 y
     22 mm
     22 less
     21 wtf
     21 web
     21 st
     21 g++
     20 stp
     20 rsync
     20 cb
     19 rbm
     18 valgrind
     17 which
     17 ../rebar
     16 wget
     15 wo
     15 unzip
     14 whois
     14 ./rewrite.rb
     14 kill
     14 cabal
     14 -
     13 wt
     13 mushroom
     13 for
     12 Work/draftable
     12 mnf
     12 ls
     12 
     11 vmware-gksu
     11 pcp
     11 nanoc
     11 ./live.py
     11 k6
     11 firefox
     11 Code/kivikakk.ee
     10 tar


"s"? "a"? Mind explaining some of these?


"git status -sb" and "git add -p", respectively. Most of the 1-3 letter ones are git aliases (and notice how often I use them!), corresponding to the aliases here: https://github.com/kivikakk/dotdirs/blob/master/gitconfig#L3... (some rubbish in my .zshrc makes it such that "a -> git a")


what do your , & % commands do?


ls and fg, respectively. (the latter is standard)


Be careful if you are in the habit of using environment variables to specify API keys or database passwords. One of my top commands is `FACEBOOK_SECRET=...`.


Is it a good idea to keep passwords in environment variables?

Isn't it safer to create a credentials file and give it the appropriate chmod?


From a deployment point of view, environment variables are a pretty good choice.

http://www.12factor.net/config


vacri@devbox:~$ ps aux | grep elasticsearch

112 6725 0.1 36.7 1965924 1411164 ? SLl May03 37:11 /usr/lib/jvm/java-7-openjdk-amd64//bin/java -Xms1g -Xmx1g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -Delasticsearch -Des.pidfile=/var/run/elasticsearch.pid -Des.path.home=/usr/share/elasticsearch -cp :/usr/share/elasticsearch/lib/elasticsearch-0.90.0.jar:/usr/share/elasticsearch/lib/:/usr/share/elasticsearch/lib/sigar/ -Des.default.config=/etc/elasticsearch/elasticsearch.yml -Des.default.path.home=/usr/share/elasticsearch -Des.default.path.logs=/var/log/elasticsearch -Des.default.path.data=/var/lib/elasticsearch -Des.default.path.work=/tmp/elasticsearch -Des.default.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.ElasticSearch


I'm not sure what you're trying to say, but environment variables and arguments are different things. Environment variables avoid exactly that problem


Yes! I don't want to know anybody's secrets. I won't be publishing the raw data for this reason, in case anything like this gets through by accident.


I did history | grep -v "=" | .... to ensure that commands where I specified environment variables are ignored.
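
A narrower variant of that filter, as a sketch: `grep -v "="` also throws away lines like `make CC=clang`, whereas the pattern below only drops history lines whose command starts with a `VAR=value` assignment. It assumes bash's numbered `history` format, and the name `drop_env_prefixed` is made up here. A secret passed any other way (as an argument, say) would still get through, so the output is still worth a look before submitting:

```shell
# Drop only history lines whose command begins with a VAR=value assignment.
drop_env_prefixed() {
    grep -Ev '^ *[0-9]+ +[A-Za-z_][A-Za-z0-9_]*='
}
```

Usage would be `history | drop_env_prefixed | ...` ahead of the rest of the pipeline.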


It might just be my OCD showing, but 'clear' is my number one by an order of magnitude.

next 20:

  177 ls
  167 cd
  122 brew
  120 vim
  119 cabal
  117 cindy (an alias for ssh'ing to my local dev box)
  115 sudo
  114 tmux
  111 man
  110 less
  110 openssl
  109 rm
  108 whois
  107 ghci
  107 lein
  106 ssh
   86 strings
   75 smbclient
   75 racket
   65 nc


I use ctrl+L a lot in my term. It clears.


command + k on a mac, I don't even want to know how frequently I use that.


Control L works on Mac too.


My history file is de-duplicated (oh-my-zsh enables hist_ignore_dups), so this sort of approach is never accurate.

Git comes out on top, but presumably because most of my invocations are unique ("git commit -m ..."), while things like "ls" are farther down the list.

Relevant: https://github.com/paulmars/huffshell


I work in bioinformatics and git is my most frequently used command. I wish this was more often the case in science.


do you know about the organization Software Carpentry? http://software-carpentry.org/

They work on teaching scientists to be better at software, and are always looking for people to do workshops


This probably won’t be useful, since I use a lot of aliases. Here are my more common commands:

    g (git)
    l (ls)
    c (cd)
    v (vim)
    make
    fg
    rm
    m (mv)
    s (sudo)


Unfortunately, my history settings mess with my results. Commands that are three letters or less just get ignored. Identical commands repeated in succession are also only counted once.

  #History settings
  HISTCONTROL=ignoreboth
  shopt -s histappend
  HISTSIZE=2000
  HISTFILESIZE=2000
  HISTIGNORE=?:??:???:clear:tmux
Still, arguments count towards that length and my top commands seem to be git, vim, cd, rake, and open. I suspect fg and ls would be higher up otherwise.


This skips all the commands I pipe through, which excludes over half of the commands I use!


Try:

  history | sed 's/|/\n1 /' | awk '{print $2}' | sort | uniq -c | sort -rn | head -n 100
Of course this doesn't differentiate between quoted pipes and ones actually being used.


The ‘awk’ there effectively still drops everything after the second delimiter, i.e. you won’t get a different result.

Edit:

    history | sed -e 's/^ *[0-9]*  //' -e 's/| */\n/'  | awk '{print $1}' | sort | uniq -c | sort -rn | head -n 100
works for me.


Yeah I just realized that. Fixed.


Excellent point. I'm definitely going to put that caveat at the beginning of any analysis I post. (and thanks to ctrl_freak for posting an alternative!)


I'm either really bad at using git, or love the crap out of it. Not sure which.


    149 sudo
    123 cd
     79 ls
     57 nano
     47 get (alias for sudo apt-get install)
     40 acpi
     36 clear
     18 sshlg (custom command to access my ssh server)


    738 ll (alias for a more reformed `ls` output)
    549 python
    457 cd
    397 git
    173 exit
    157 vim
    150 less
    142 ssh
    122 ./synchVeiled.sh ( script I'd been using a lot for a project )
    121 sudo
    109 scp
    109 fab
    101 gca ( alias for "git commit -am" )
     97 curl
     88 bash
     74 grep
     68 startx
     56 rm
     56 lguf ( alias to show all files in a git repo that are not being tracked )
     55 history
     54 whois
     49 htop
     48 source
     40 echo
     39 foreman


This would be fantastic if all it had was the instructions to see your own top commands. What a great way to discover opportunities for automation and time saving!

I can't wait for the full results.


You can leave your email address in the form if you want to get an update when I post them.

I'll also post the results to HN, of course :)


For those using the bash shell, the hash builtin:

    $ hash | sort -nr
may be informative, producing counts and command paths, although only for the current shell instance.


I should start wiping my history. A lot of passwords I typed in while focused on the wrong terminal came up in my top 100.


Here are my top ten: d l m o ll e pwd rm dict open.

d <foo> is short for {cd <foo>;ls}. l is ls. m is mv. o is Apple's open command. e is emacsclient.

The only reason grep is not in that list is that I do my grepping through an emacs command (M-x grep).

pwd is frequent because I do not put the working directory in my shell prompt.
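
For anyone wanting to try the `d` shortcut described above, one way it might be defined is as a shell function. This is a guess at the shape, not the commenter's actual definition; it uses `&&` so that nothing is listed if the `cd` fails:

```shell
# d <dir>: change into the directory, then list it.
d() {
    cd "$1" && ls
}
```

Dropped into ~/.bashrc or ~/.zshrc, `d src` would then behave like `cd src; ls`.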


Just in case someone is worried: last time I checked, the HTML source of what's copied seemed clean and copy-pastable. Referring to this: http://thejh.net/misc/website-terminal-copy-paste


For those using zsh who prefer history with timestamps (setopt EXTENDED_HISTORY), the awk command should print the 4th field instead:

    history | sed -e 's/^ *[0-9]*  //' -e 's/| */\n/'  | awk '{print $4}' | sort | uniq -c | sort -rn | head -n 100


Many of my top commands are 1- or 2-letter aliases. g for git, v for vim, and so forth.

I had to change the command slightly to work with zsh -- instead of 'history' I did 'history 1' to get the full (10k entries for me) history instead of the most recent 16.


     55 git
     41 ls
     39 cd
     18 mvim
     10 rm
      9 dart2js
      7 dart
      6 which
      6 view
      5 brew
      4 historynsed
      3 sudo
      2 setngrep
      2 ruby
      2 pub
      2 nsort
      2 clear
      1 set
      1 scp
      1 pwd
      1 nuniq
      1 node
      1 nhead
      1 nawk
      1 mkdir
      1 fd ## alias "ls -l | grep '^d'"
      1 cp
      1 cls
      1 chmod


I've posted a list of commands I often use, here: https://github.com/logotype/useful-unix-stuff/blob/master/us...


I run wildly different commands on different hosts so I actually took three hosts I'm often logged into and combined their most run commands in history into one 100 line file.
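
One way such a merge could be done, sketched under the assumption that each host's result was saved as a `commands.txt`-style file of "count command" lines. The function name `merge_counts` and the file names in the usage line are hypothetical:

```shell
# Sum the counts for each command across several listings, keep the top 100.
merge_counts() {
    awk 'NF >= 2 { total[$2] += $1 }
         END { for (c in total) print total[c], c }' "$@" |
    sort -k1,1nr -k2,2 |
    head -n 100
}
```

For example: `merge_counts laptop.txt devbox.txt server.txt > commands.txt`.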

Fun project, good luck!


My history seems to have timestamps on FreeBSD-CURRENT running bash.

I had to use

    history | sed "s/^[0-9 ]*//" | sed "s/ *| */\n/g" | awk '{print $3}' | sort | uniq -c | sort -rn | head -n 100 > commands.txt



Heh, my top five are javac, java, git, cd, and mrt.


The list from my laptop is very different from my development machine of choice. Not sure if adding it twice would skew the results.


I really want to see the output of this survey. Any chance we can sign up to get an email notification when it's published?


My top 5: vim, mysql, grep, php, nginx

I suppose you could guess that my Linux box is an LNMP server pretty easily.


I regularly wipe my .history. Sorry.


Hilariously, about a quarter of my top 100 are typos. Time to make some aliases.


Why not update the spreadsheet to include a link showing the current top 100?


'egrep' is the first on my list.

I guess the future is all about regular expressions :)


You probably want

`history -100`

on OS X, anyway, the default only gives 15 or so commands in history


On 10.8, I'm getting the full 500 lines of history. Not sure that the behaviour has ever been different.


I believe I have set HISTFILESIZE to 2000 or something. I still need to compact it, and I have commands like ls pruned by HISTIGNORE.

I use the awk script below to compact it:

     # histsort.awk --- compact a shell history file
     # Thanks to Byron Rakitzis for the general idea

     {
         if (data[$0]++ == 0)
             lines[++count] = $0
     }
     
     END {
         for (i = 1; i <= count; i++)
             print lines[i]
     }
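The script above keeps the first occurrence of each line and preserves the original order. A shorter sketch of the same idea as a shell function, using the common `!seen[$0]++` awk idiom; the name `dedupe_history` is made up here:

```shell
# Print each line only the first time it appears, keeping the original order.
dedupe_history() {
    awk '!seen[$0]++' "$@"
}
```

Something like `dedupe_history ~/.bash_history > ~/.bash_history.new` would write a compacted copy, assuming the history file lives at that path; checking the result before replacing the original seems wise.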


Huh, I didn't know that. What command gives the entire history on OS X, then?


history shows the full .bash_history, but by default HISTFILESIZE is 500 so it only stores 500 entries.


I ran this in Mac OS without a hitch, gave 100. I use zsh, maybe the default shell doesn't?


Super-facile gpm plopping grabs across six nox tty consoles onto emacs and as apps arguments is the most pleasant, productive, and relaxed user interface I know.


A godsend for the lazy ones


887 git

702 ls

623 cd

618 gs

349 mate

340 ga

313 bundle

251 rake

221 ssh

197 gd

164 cat

159 grep

153 rm

148 rails

131 cap

127 gem

125 exit

122 powder

109 vagrant

98 vim

67 rbenv

66 make

65 zeus

65 ping

63 cp

61 gco

58 mv

55 sudo

51 ps

47 mkdir

43 kill

37 grm

36 echo

33 ruby

30 foreman

29 gdb

26 brew


I only know one... startx


276 fg

188 jobs

113 grep

74 git

72 ls

72 cd

53 vi

clearly no one else uses vim like i do.


I do, but only for the past 4-5 months. I send vim to the background/fg it very often these days.


curl ssh cd ls wget


bash only :-(



