
You're still giving up entropy because even-numbered file sizes are quite a bit more common:

    $ find / -ls | awk '{print $7}' >/tmp/filesizes
    $ rev /tmp/filesizes | cut -b1 | sort -n | uniq -c | sort -n
    5718 5
    5816 3
    5823 1
    5901 9
    6463 7
    7549 4
    7958 8
    8240 0
    8991 6
    14507 2

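For the curious: plugging those counts into the Shannon entropy formula shows how much is actually being given up. A rough sketch (the digit counts are copied from the tally above):

```shell
# Shannon entropy of the last-digit distribution tallied above,
# compared with the log2(10) bits a uniform digit would carry.
echo '5718 5816 5823 5901 6463 7549 7958 8240 8991 14507' |
awk '{ for (i = 1; i <= NF; i++) total += $i
       for (i = 1; i <= NF; i++) { p = $i / total; H -= p * log(p) / log(2) }
       printf "%.3f of %.3f bits\n", H, log(10) / log(2) }'
```

On these counts that comes out a little over 3.25 bits against a uniform 3.32 — a real loss, but a small one.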


Very cool. I just tried it on my own machine (Arch Linux, with lots and lots of Python libraries/data junk).

I changed the command a bit, since the original was also counting zero-byte files and directories:

  sudo find / -size +0 -type f -ls | awk '{print $7,$NF}' > /tmp/filesizes
  110920 7
  111306 5
  111312 1
  111345 9
  111362 3
  130138 0
  130511 4
  131528 8
  139000 2
  152213 6
Some other interesting things (dirname is too slow...):

  grep "6 " /tmp/filesizes | awk '{print $NF}' | sed 's/\/[^/]*$//' | sort | uniq -c | sort -nr | head
   1607 /usr/share/man/man3
    906 /home/matt/.cache/mozilla/firefox/8jc2n3qa.default/cache2/entries
    825 /usr/bin
    808 /home/matt/.tmux/resurrect
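(The sed expression there is doing dirname's job without forking a process per line, which is why it's so much faster. A quick sanity check on a couple of made-up paths:)

```shell
# sed 's/\/[^/]*$//' strips the last path component, like dirname(1),
# but in one pass over the whole stream instead of one fork per line.
printf '/usr/bin/ls\n/usr/share/man/man3/foo.3.gz\n' | sed 's/\/[^/]*$//'
```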
Then again a lot of this is flawed because it's looking at /sys. Too lazy to fix it now :)
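If anyone does want to fix it, pruning the pseudo-filesystems in find is one way. Sketched here on a throwaway tree so it doesn't need root; on the real run the prune paths would be /sys and /proc themselves:

```shell
# Prune pseudo-filesystems so their synthetic sizes don't skew the tally.
# Demonstrated on a temporary tree (no root needed) instead of /.
tmp=$(mktemp -d)
mkdir -p "$tmp/sys" "$tmp/data"
echo 12345 > "$tmp/sys/pseudo"   # stand-in for a /sys entry
echo 678 > "$tmp/data/real"      # a regular file we do want counted
find "$tmp" \( -path "$tmp/sys" -o -path "$tmp/proc" \) -prune -o -type f -print
rm -rf "$tmp"
```

Only the file under data/ is printed; everything below the pruned directories is skipped entirely.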


Any idea why 2 is so common? This same trend shows up on my system and seems to be caused by 102-byte files.


Good question. On OS X it seems the size of a directory is typically a multiple of 102 bytes (often 102 or 306). I ran my script again with "find / -type f" to select only regular files, and it showed file sizes ending in 2 being just slightly more common than 0. Odd sizes are still less common than even ones, though, so the overall conclusion is unchanged regardless of whether you consider checksumming of directory entries to be a valid application.
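For what it's worth, the even-over-odd skew is easy to check against the Arch Linux counts upthread:

```shell
# Sum the per-digit tallies from the archlinux run above,
# split by whether the last digit is even or odd.
printf '130138 0\n111312 1\n139000 2\n111362 3\n130511 4\n111306 5\n152213 6\n110920 7\n131528 8\n111345 9\n' |
awk '{ if ($2 % 2) odd += $1; else even += $1 }
     END { printf "even: %d  odd: %d\n", even, odd }'
```

Even last digits account for about 55% of the files there.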


thx for the nice answer!



