
Having written a Rust wc implementation a few years ago (https://github.com/Freaky/cw), I had a look at theirs.

It's pretty naive - a simple linewise read_until loop, a conditional to avoid word splitting and such if it's not needed, and for some reason it collects results into an array and prints when it's done rather than printing as it goes.

It doesn't support --files0-from like GNU wc, so isn't a drop-in replacement from that perspective. It also has the sadly common Rust trope of only supporting filenames that are valid UTF-8.
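For what it's worth, avoiding the UTF-8 restriction doesn't take much. A minimal sketch, assuming the tool only needs to open the operands rather than parse them; `operands` is a hypothetical helper name, not something from their code:

```rust
use std::ffi::OsString;
use std::path::PathBuf;

/// Turn raw command-line arguments into paths without UTF-8 validation.
/// The first argument (the program name) is skipped.
fn operands<I>(args: I) -> Vec<PathBuf>
where
    I: IntoIterator<Item = OsString>,
{
    // PathBuf wraps an OsString, so arbitrary OS bytes pass through untouched.
    args.into_iter().skip(1).map(PathBuf::from).collect()
}
```

You'd call it as `operands(std::env::args_os())`. The usual culprit is `std::env::args()`, which panics on arguments that aren't valid Unicode, whereas `args_os()` hands back `OsString`s as-is.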

It doesn't seem overly slow considering its simplicity - usually trading blows with GNU and BSD wc. Perhaps the most glaring omission is the lack of a fast path for -c, which should reduce to a stat() call. It's also unfortunate that it doesn't use the excellent bytecount crate to provide a very fast -l/-m path.

The read_until loop also makes its memory use unpredictable compared with other wc's. If you run it on /dev/zero it will try to eat your computer.
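To illustrate the alternative, here is a rough sketch of counting lines through a fixed-size buffer, so memory stays constant no matter how long a "line" is. The function name and the 64 KiB buffer size are my own choices, not anything from their code or mine:

```rust
use std::io::{self, Read};

/// Count newline bytes using a fixed-size buffer, so memory use stays
/// constant even on input with no newlines at all.
fn count_lines<R: Read>(mut reader: R) -> io::Result<u64> {
    let mut buf = [0u8; 64 * 1024];
    let mut lines = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => return Ok(lines), // end of input
            Ok(n) => n,
            // Retry reads interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // Naive per-byte count over the filled part of the buffer.
        lines += buf[..n].iter().filter(|&&b| b == b'\n').count() as u64;
    }
}
```

On /dev/zero this still never finishes, but it sits at 64 KiB rather than growing without bound. The per-byte filter is the part you'd swap for `bytecount::count(&buf[..n], b'\n')` if you pull in that crate.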



GNU wc doesn't seem to have that -c fast path. I can tell because files in /proc report 0 for st_size even though they're not empty:

    $ wc -c /proc/self/cmdline
    25 /proc/self/cmdline
    $ stat -c '%s' /proc/self/cmdline
    0


It certainly has a fast path, it just mixes it up with lseek() for filesystems like /proc: https://github.com/coreutils/coreutils/blob/9de1d153f82243ae...

FreeBSD just uses fstat: https://github.com/freebsd/freebsd-src/blob/e4b8deb222278b2a...
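As a rough sketch of the general idea in Rust (not a port of either implementation linked above): trust the reported size only when it looks usable, and fall back to reading otherwise. The "regular file with nonzero size" test is a simplifying assumption of mine, and `count_bytes` is a hypothetical name. It also assumes a freshly opened handle positioned at the start of the file.

```rust
use std::fs::File;
use std::io;

/// Count the bytes in a file, using the size from metadata when it looks
/// trustworthy and reading the contents otherwise.
fn count_bytes(mut file: File) -> io::Result<u64> {
    let meta = file.metadata()?;
    // Assumption: a regular file reporting a nonzero size is accurate.
    // Files under /proc report 0, so they take the slow path below.
    if meta.is_file() && meta.len() > 0 {
        return Ok(meta.len());
    }
    // Slow path: read everything and discard it; io::copy returns the
    // number of bytes transferred.
    io::copy(&mut file, &mut io::sink())
}
```

With this, something like /proc/self/cmdline gets counted by reading, while an ordinary file costs a single metadata call.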



