
For languages like C, C++, and Rust, the bottleneck is mainly going to be system calls. With a big buffer, on an old machine, I get about 1.5 GiB/s with C++. Writing 1 char per syscall, I get less than 1 MiB/s.

    $ ./a.out 1000000 2000 | cat >/dev/null
    buffer size: 1000000, num syscalls: 2000, perf:1578.779593 MiB/s
    $ ./a.out 1 2000000 | cat >/dev/null
    buffer size: 1, num syscalls: 2000000, perf:0.832587 MiB/s
Code is:

    #include <cassert>
    #include <chrono>
    #include <cstdio>
    #include <cstdlib>
    #include <cstring>
    #include <unistd.h>

    int main(int argc, char **argv) {
        // Usage: ./a.out <buffer size in bytes> <number of write() calls>
        if (argc != 3) {
            std::fprintf(stderr, "usage: %s <buffer size> <num syscalls>\n", argv[0]);
            return 1;
        }
        const unsigned int n = std::atoi(argv[1]);
        const unsigned int k = std::atoi(argv[2]);

        char *buf = new char[n];
        std::memset(buf, '1', n);

        // Time k write() calls of n bytes each to stdout.
        auto start = std::chrono::high_resolution_clock::now();
        for (unsigned int i = 0; i < k; i++) {
            // write() returns ssize_t, not int.
            ssize_t rv = write(1, buf, n);
            // A short write or an error would skew the numbers, so bail out.
            assert(rv == ssize_t(n));
            (void)rv;  // silence the unused-variable warning under NDEBUG
        }
        auto stop = std::chrono::high_resolution_clock::now();

        std::chrono::duration<double> secs = stop - start;

        // n and k are unsigned, so print them with %u.
        std::fprintf(stderr, "buffer size: %u, num syscalls: %u, perf:%f MiB/s\n",
                     n, k, (double(n) * k) / (1024 * 1024) / secs.count());
        delete[] buf;
    }
EDIT: Also note that a big write to a pipe (bigger than the pipe's capacity, which is 64 KiB by default on Linux; PIPE_BUF only bounds atomic writes) may require multiple syscalls on the read side.
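
To see the read-side effect in isolation, here is a minimal sketch, assuming a POSIX system. The helper name count_reads_for_pipe_write is made up for illustration. A forked child pushes `total` bytes into a pipe, and the parent counts how many read() calls it needs to drain them, even though its buffer is big enough to take everything at once.

```cpp
#include <cstddef>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not from the benchmark above): a child process writes
// `total` bytes into a pipe and the parent counts the read() calls needed to
// drain them. Returns the number of read() calls that returned data, or -1
// on error.
long count_reads_for_pipe_write(std::size_t total) {
    int fds[2];
    if (pipe(fds) != 0) return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                       // child: the writer
        close(fds[0]);
        std::vector<char> out(total, '1');
        std::size_t off = 0;
        while (off < total) {
            ssize_t w = write(fds[1], out.data() + off, total - off);
            if (w <= 0) _exit(1);
            off += static_cast<std::size_t>(w);
        }
        _exit(0);                         // exiting closes the write end
    }

    close(fds[1]);                        // parent: the reader
    std::vector<char> in(total);          // room for everything in one read()
    long reads = 0;
    std::size_t got = 0;
    bool failed = false;
    for (;;) {
        ssize_t r = read(fds[0], in.data(), in.size());
        if (r < 0) { failed = true; break; }
        if (r == 0) break;                // EOF: the writer closed its end
        got += static_cast<std::size_t>(r);
        ++reads;
    }
    close(fds[0]);

    int status = 0;
    waitpid(pid, &status, 0);             // reap the child
    return (!failed && got == total) ? reads : -1;
}
```

Each read() can return at most what the pipe currently holds, so with the default 64 KiB capacity on Linux, count_reads_for_pipe_write(1000000) should come back as at least 16. The exact count depends on scheduling and on the pipe size of the system you run it on.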

EDIT 2: Also, it appears that the kernel is smart enough to not copy anything when it's clear that there is no need. When I don't go through cat, I get rates that are well above memory bandwidth, implying that it's not doing any actual work:

    $ ./a.out 1000000 1000 >/dev/null
    buffer size: 1000000, num syscalls: 1000, perf: 1827368.373827 MiB/s


I suspect (but am not sure) that the shell may be doing something clever for a stream redirection (>) and giving your program a STDOUT file descriptor directly to /dev/null.

I may be wrong, though. Check with lsof or similar.
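
As an alternative to lsof, the program can check this itself. A minimal sketch, assuming a POSIX system; the helper name fd_is_dev_null is made up for illustration. It compares the device number behind a file descriptor with that of /dev/null.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: returns true if `fd` refers to the same character
// device as /dev/null, false otherwise (including on any stat failure).
bool fd_is_dev_null(int fd) {
    struct stat target{};
    struct stat null_dev{};
    if (fstat(fd, &target) != 0) return false;
    if (stat("/dev/null", &null_dev) != 0) return false;
    // Both must be character devices with the same device number.
    return S_ISCHR(target.st_mode) && S_ISCHR(null_dev.st_mode) &&
           target.st_rdev == null_dev.st_rdev;
}
```

Calling fd_is_dev_null(STDOUT_FILENO) at startup and printing the result to stderr would distinguish `./a.out ... >/dev/null` (stdout is the null device) from `./a.out ... | cat >/dev/null` (stdout is a pipe).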


There's no special "no work" detection needed. a.out is calling the write function for the null device, which just returns without doing anything. No pipes are involved.



