This threading model is actually similar to the approach Node.js takes with libuv's thread pool. There's a main thread listening for incoming requests that immediately offloads their handling to the next available thread in the pool. Of course, the handler thread can also offload some of its internal work to other pool threads.
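A toy version of that listener + pool arrangement in C (my own sketch of the pattern, not Node's actual code; error handling mostly omitted):

```c
/* Listener + worker-pool sketch: the main thread only accepts
 * connections and queues the fds; pool threads pop and handle them. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define POOL_SIZE 4
#define QUEUE_CAP 64

static int queue[QUEUE_CAP];
static int qhead, qtail, qlen;
static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t qready = PTHREAD_COND_INITIALIZER;

static void queue_push(int fd) {
    pthread_mutex_lock(&qlock);
    while (qlen == QUEUE_CAP)           /* block if the pool is saturated */
        pthread_cond_wait(&qready, &qlock);
    queue[qtail] = fd;
    qtail = (qtail + 1) % QUEUE_CAP;
    qlen++;
    pthread_cond_broadcast(&qready);
    pthread_mutex_unlock(&qlock);
}

static int queue_pop(void) {
    pthread_mutex_lock(&qlock);
    while (qlen == 0)                   /* block until work arrives */
        pthread_cond_wait(&qready, &qlock);
    int fd = queue[qhead];
    qhead = (qhead + 1) % QUEUE_CAP;
    qlen--;
    pthread_cond_broadcast(&qready);
    pthread_mutex_unlock(&qlock);
    return fd;
}

static void *worker(void *arg) {
    (void)arg;
    for (;;) {
        int fd = queue_pop();
        const char *resp = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok";
        write(fd, resp, strlen(resp));  /* real handler work goes here */
        close(fd);
    }
    return NULL;
}

int main(void) {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);
    bind(lfd, (struct sockaddr *)&addr, sizeof addr);
    listen(lfd, 128);

    pthread_t tids[POOL_SIZE];
    for (int i = 0; i < POOL_SIZE; i++)
        pthread_create(&tids[i], NULL, worker, NULL);

    for (;;) {                          /* main thread only accepts */
        int cfd = accept(lfd, NULL, NULL);
        if (cfd >= 0)
            queue_push(cfd);
    }
}
```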
What other server thread models exist? I know php-fpm gives each request its own process, but I can't think of any other feasible strategies off the top of my head.
There's also the process worker pool model, in which a master process forks a number of worker processes to handle all requests. Unicorn (for Ruby) does things this way...in part because when it was released, Rails apps couldn't handle multithreading all that well. I think Ruby also had some trouble with it, so this was just the easiest/most predictable way of getting good performance out of your Rails app.
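For anyone curious, the pre-fork model fits in a few lines of C. This is a hedged sketch of the pattern, not Unicorn's actual implementation: the master forks N workers up front, and every worker blocks in accept() on the same inherited listening socket.

```c
/* Pre-fork worker pool: fork workers once at startup; each inherits
 * the listening socket and serves requests forever. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

#define NUM_WORKERS 4

static void worker_loop(int lfd) {
    for (;;) {
        int cfd = accept(lfd, NULL, NULL);  /* kernel picks one waiting worker */
        if (cfd < 0) continue;
        const char *resp = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok";
        write(cfd, resp, strlen(resp));
        close(cfd);
    }
}

int main(void) {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET,
                                .sin_port = htons(8080) };
    bind(lfd, (struct sockaddr *)&addr, sizeof addr);
    listen(lfd, 128);

    for (int i = 0; i < NUM_WORKERS; i++) {
        if (fork() == 0) {          /* child: inherits lfd, serves forever */
            worker_loop(lfd);
            _exit(0);
        }
    }
    for (;;) wait(NULL);            /* master just supervises the children */
}
```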
Originally, the way `php-fpm` works is pretty much how the entire web worked. Every request into a server would go through `inetd`, which would fork a handler process, and the response was done when that process exited. This became very difficult to handle at scale, mostly because of the overhead of forking a process, so the next logical step was to have a pool of worker processes ready to go, able to receive requests immediately. This is great for applications that don't need to be redeployed very often, but as we moved to more of a continuous-deployment workflow, it began to break down, since worker processes would need to be restarted, going back to that whole "overhead" thing. Unicorn suffers from this problem a bit: if you `SIGHUP` to reload application code while workers are still processing requests, Unicorn will wait until those workers have finished responding before restarting them and loading the new version. If a client is taking too long and hanging onto a worker, it's very possible that the worker will just never reload the code and you'll get weird errors every so often.
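The inetd-era version, fork-at-accept-time, looks roughly like this (a toy sketch of the pattern, not inetd itself); the `fork()` inside the accept loop is exactly the per-request overhead being described:

```c
/* Fork-per-request: one brand-new process per connection. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in a = { .sin_family = AF_INET, .sin_port = htons(8080) };
    bind(lfd, (struct sockaddr *)&a, sizeof a);
    listen(lfd, 128);

    for (;;) {
        int cfd = accept(lfd, NULL, NULL);
        if (cfd < 0) continue;
        if (fork() == 0) {          /* child handles exactly one request */
            const char *resp = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok";
            write(cfd, resp, strlen(resp));
            close(cfd);
            _exit(0);               /* the response is done when the process exits */
        }
        close(cfd);                                 /* parent drops its copy */
        while (waitpid(-1, NULL, WNOHANG) > 0) {}   /* reap finished children */
    }
}
```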
A solution to this problem is to move that pool of worker processes into a pool of threads, which gives the server a bit more control over how application code is reloaded and avoids the overhead of forking processes. I believe that's how NGINX, Node.js, Puma, et al. work under the hood... there's a pool of worker threads and another thread that listens for requests. Everything is event-driven, so when a new event comes in, the listener thread just sends it off to the pool of worker threads. Basically the same idea, but using an event loop as a model for better concurrency support. (Puma is a bit different because it also allows for worker processes in addition to threads, but that isn't necessary; it just gives better performance on larger machines.)
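A minimal epoll loop shows the event-driven half of that (my own sketch; a real server of this shape would hand ready events to pool threads rather than handling them inline as this does):

```c
/* Single-threaded epoll event loop: register fds, block in epoll_wait,
 * react to whatever became ready. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in a = { .sin_family = AF_INET, .sin_port = htons(8080) };
    bind(lfd, (struct sockaddr *)&a, sizeof a);
    listen(lfd, 128);

    int ep = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = lfd };
    epoll_ctl(ep, EPOLL_CTL_ADD, lfd, &ev);

    struct epoll_event events[64];
    for (;;) {
        int n = epoll_wait(ep, events, 64, -1);  /* block until something is ready */
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == lfd) {      /* new connection */
                int cfd = accept(lfd, NULL, NULL);
                ev.events = EPOLLIN;
                ev.data.fd = cfd;
                epoll_ctl(ep, EPOLL_CTL_ADD, cfd, &ev);
            } else {                             /* a connection is readable */
                char buf[4096];
                ssize_t r = read(events[i].data.fd, buf, sizeof buf);
                if (r <= 0) { close(events[i].data.fd); continue; }
                /* a real server would parse the request here and either
                 * respond inline or dispatch to a worker thread */
                write(events[i].data.fd, buf, r);
            }
        }
    }
}
```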
I believe you can also have multiple threads accepting connections on the same fd. This lets the kernel do the scheduling, which removes the need for a coordinator thread. You can then choose to handle the connection on the same thread, pinned to a specific CPU (per-CPU), or put it on a cross-thread task queue that whatever CPU is idle can pull from (work stealing).
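Right, here's a sketch of that with plain pthreads: every thread blocks in accept() on one shared fd, and the kernel decides which one wakes up. (The SO_REUSEPORT one-listening-socket-per-thread variant is the usual per-CPU flavor.)

```c
/* Multiple threads accepting on the same listening fd; no coordinator. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static void *acceptor(void *arg) {
    int lfd = *(int *)arg;
    for (;;) {
        int cfd = accept(lfd, NULL, NULL);  /* kernel load-balances wakeups */
        if (cfd < 0) continue;
        const char *resp = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok";
        write(cfd, resp, strlen(resp));     /* handled on this same thread */
        close(cfd);
    }
    return NULL;
}

int main(void) {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in a = { .sin_family = AF_INET, .sin_port = htons(8080) };
    bind(lfd, (struct sockaddr *)&a, sizeof a);
    listen(lfd, 128);

    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, acceptor, &lfd);
    pthread_join(t[0], NULL);   /* never returns; keeps main alive */
}
```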
Varnish (HTTP caching only) also uses one thread per client. I believe worker threads are used to handle requests while a dedicated thread handles all the idle connections between requests using epoll(). Also, the per-thread stack size is lowered so thousands of threads don't occupy a massive amount of memory.
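The stack-size trick looks like this with pthreads (the 64 KB figure here is illustrative, not Varnish's actual default):

```c
/* Shrink each thread's stack so the per-thread memory cost stays small. */
#include <pthread.h>

static void *client_thread(void *arg) {
    /* per-client work would go here */
    return arg;
}

int main(void) {
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    /* default stacks are often 8 MB of address space; at 64 KB (must be
     * at least PTHREAD_STACK_MIN), 10,000 threads cost ~640 MB instead
     * of ~80 GB of address space */
    pthread_attr_setstacksize(&attr, 64 * 1024);

    pthread_t t;
    pthread_create(&t, &attr, client_thread, NULL);
    pthread_join(t, NULL);
    pthread_attr_destroy(&attr);
    return 0;
}
```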
Single-threaded HTTP servers have their own issues. If the bottleneck is storage, then the lack of async open()/stat() and some other calls is problematic. We feel that when serving hundreds of millions of files (long-tail content) from slow storage using nginx. For that reason you can configure nginx to spawn multiple processes.
I thought nginx epolls file I/O too, alongside socket I/O. Or did you find that the first call to open() or stat() stalls, while reads/writes after that continue normally?
File IO isn’t epoll-able. The operations are always blocking.
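This is easy to demonstrate: on Linux, epoll rejects regular files at EPOLL_CTL_ADD with EPERM, because a file fd is always "ready" even though the read itself can still stall on the disk.

```c
/* Show that a regular file can't be registered with epoll. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

int main(void) {
    int ep = epoll_create1(0);
    int fd = open("/etc/hostname", O_RDONLY);   /* any regular file */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0)
        printf("epoll_ctl: %s\n", strerror(errno));  /* "Operation not permitted" */
    close(fd);
    close(ep);
}
```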
Those could, however, be offloaded onto a thread pool, so the blocking doesn't affect other requests being processed by the same nginx worker. Nginx only partially does that: whole-file read and write operations are offloaded, but a whole bunch of other IO (stat, open, close) is executed on the main thread. I guess that's due to implementation challenges: one can't just make a single operation async, one also needs to make every operation that uses those methods async.
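The offload pattern itself is just "run the blocking call on another thread and deliver a completion event". A minimal sketch with a single helper thread (this mirrors the idea behind nginx's `aio threads` directive, not its actual implementation):

```c
/* Run a blocking stat() on a helper thread so the event loop never
 * blocks on disk. */
#include <pthread.h>
#include <stdio.h>
#include <sys/stat.h>

struct stat_req {
    const char *path;
    struct stat st;
    int err;
};

static void *do_stat(void *arg) {
    struct stat_req *req = arg;
    req->err = stat(req->path, &req->st);  /* may block on slow storage */
    return NULL;
}

int main(void) {
    struct stat_req req = { .path = "/etc/hostname" };
    pthread_t t;
    pthread_create(&t, NULL, do_stat, &req);  /* event loop would keep running here */
    pthread_join(t, NULL);                    /* real server: completion event instead */
    if (req.err == 0)
        printf("%s is %lld bytes\n", req.path, (long long)req.st.st_size);
}
```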
Also, in microservices land you could have a service doing its business logic but also needing to run a server for metrics and/or debugging. Just piling on; this is good to study :)
What kind of latency are you talking about? Async latency for completing a task isn’t deterministic, and there’s no guarantee data will be processed as soon as it becomes available. Async runtimes rely heavily on hints from their tasks as to when to poll next.
Generally, low latency means producing a result as soon as possible. Threads are ideal for that case, spending CPU time spinning so data gets processed the moment it's available.
The best description I’ve heard of async is concurrent waiting, vs concurrent processing for threads (from the excellent zero2prod book).