
> The only difference between Go and the pthread model is that Go has a userland scheduler.
That's a _huge_ difference in the context of a network service, and also obscures some other details.

If I have several thousand long-polling clients, one kernel thread per client is simply not realistic. It uses up a lot of memory, and the context switching can be costly. Throwing more hardware at the problem is just throwing money down the drain. And if throwing money down the drain is fine, one might as well use one process per connection with blocking I/O.

In order to avoid that in Rust, Node, or most other languages, one needs to use a callback or a future. But both callbacks and futures require a dynamic allocation and deallocation per invocation of each function that might block. Especially under heavy load (lots of broadcasts), that adds up quickly when you consider chained invocations, as you must when composing non-blocking I/O interfaces. Chaining futures hinders compiler optimizations, and in general it hinders function composition. (I commonly see claims that the reactor pattern is better for composition, yet I never see such people writing _all_ their code that way, e.g. for things like string manipulation. In reality most concurrency, even in network services, is effectively sequential within a wider context beyond the immediate operation. Have you ever tried to implement, e.g., a non-blocking MySQL protocol parser with callbacks/futures? It's ugly.)
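To make that concrete, here is a minimal sketch in Rust of where those allocations come from. The event loop, read_line, and the fake results are all made up for illustration (this is not any real I/O library): every step that might block has to box its continuation, whereas on a goroutine-style stack the same logic would just be two sequential calls.

    use std::collections::VecDeque;

    // Hypothetical callback-style "event loop": results arrive later, so each
    // continuation must be boxed (one heap allocation per potentially-blocking step).
    type Callback = Box<dyn FnOnce(&mut EventLoop, String)>;

    struct EventLoop {
        ready: VecDeque<(String, Callback)>,
    }

    impl EventLoop {
        fn new() -> Self {
            EventLoop { ready: VecDeque::new() }
        }

        // "Non-blocking read": park the boxed continuation until the result is ready.
        fn read_line(&mut self, fake_result: &str, cb: Callback) {
            self.ready.push_back((fake_result.to_string(), cb));
        }

        fn run(&mut self) {
            while let Some((data, cb)) = self.ready.pop_front() {
                cb(self, data);
            }
        }
    }

    fn main() {
        let mut el = EventLoop::new();

        // "Read the header, then read the body" becomes nested closures,
        // each boxed separately; chain a few more steps and it adds up fast.
        el.read_line("Content-Length: 5", Box::new(|el: &mut EventLoop, header: String| {
            println!("got header: {}", header);
            el.read_line("hello", Box::new(|_el: &mut EventLoop, body: String| {
                println!("got body: {}", body);
            }));
        }));

        el.run();
    }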

People underestimate how efficient and useful storing function invocation state on a stack is. Goroutines (or contiguous, dynamically-sizeable stacks like in Lua) give you the best of all worlds--efficiency of a contiguous stack without the high fixed costs. That means performance _and_ scalability for massively concurrent network services, but also for other patterns--like being able to convert a callback-based push interface into a pull interface, without having to refactor the intermediate code, which can be amazingly powerful for things like parsers.

But for a language like Rust it's understandably difficult to implement stackful coroutines, let alone goroutines. I just wish people wouldn't whitewash the issue.



> If I have several thousand long-polling clients, one kernel thread per client is simply not realistic. It uses up a lot of memory, and the context switching can be costly.

The memory cost of pthreads that people generally refer to is the stack. Userland scheduling doesn't have anything to do with the stack size. You can get the stack size down very low in a 1:1 thread model: 10kB (or, in future Linux kernels, 6kB). That's comparable to the stack size of a goroutine.
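For what it's worth, requesting a small stack in a 1:1 model is just a spawn-time parameter. A sketch with Rust's std threads (the 16 kB figure is illustrative; the size is a request, and the real floor is platform-dependent, e.g. PTHREAD_STACK_MIN plus a guard page on Linux):

    use std::thread;

    fn main() {
        // Ask for a small per-thread stack instead of the multi-megabyte default.
        // The platform may round the request up to its minimum.
        let worker = thread::Builder::new()
            .name("conn-worker".to_string())
            .stack_size(16 * 1024) // 16 kB
            .spawn(|| {
                // Per-connection work goes here; deep recursion or big stack
                // buffers would overflow a stack this small.
                println!("running with a small stack");
            })
            .expect("spawn failed");

        worker.join().unwrap();
    }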

Context switching cost depends on your use case. For I/O, keep in mind that you have a round trip through the kernel either way.

> being able to convert a callback-based push interface into a pull interface, without having to refactor the intermediate code, which can be amazingly powerful for things like parsers.

You can do that efficiently with threads too. In fact, that is exactly how HTML parsing works in every modern browser: the parser runs on a separate thread and forwards DOM-creation events to the main thread. HTML parsing is about the most performance-critical parsing setting you can think of.
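Roughly that shape, sketched in Rust with an OS thread and a channel (the "parser" here is a stand-in that fakes three tokens, not a real HTML tokenizer): the producer pushes events as it finds them, and the channel turns that into a pull interface on the consumer side.

    use std::sync::mpsc;
    use std::thread;

    // Events a push-style parser would normally hand to a callback.
    #[derive(Debug)]
    enum ParseEvent {
        StartTag(String),
        Text(String),
        EndTag(String),
    }

    fn main() {
        let (tx, rx) = mpsc::channel();

        // The "parser" runs on its own thread and pushes events as it produces them.
        let parser = thread::spawn(move || {
            for token in ["<p>", "hello", "</p>"] {
                let event = match token {
                    t if t.starts_with("</") => ParseEvent::EndTag(t[2..t.len() - 1].to_string()),
                    t if t.starts_with('<') => ParseEvent::StartTag(t[1..t.len() - 1].to_string()),
                    t => ParseEvent::Text(t.to_string()),
                };
                tx.send(event).expect("consumer hung up");
            }
            // tx is dropped here, which closes the channel and ends the pull loop below.
        });

        // Consumer side: a plain pull interface, no callbacks to thread state through.
        for event in rx {
            println!("pulled: {:?}", event);
        }

        parser.join().unwrap();
    }

The consumer never has to scatter its state across callbacks; it just blocks on the next event, which is the point being made about the browser's parser thread.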


This is a big advantage of Go, you're totally correct, but it does come at a cost: the GC and a runtime. For some that cost is acceptable, for others it is not. It isn't the main reason I personally decided to bet on Rust and not Go, though it was part of it.

The three main reasons Go doesn't fit the work I want to do on the network all come down to compile-time guarantees:

1) Strong type checking on null
2) Required handling of errors
3) Generics

These all help me write stable software that I have higher confidence in than what I've written in Go; the sketch below shows the kind of thing I mean.
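A small illustrative sketch of what those three buy you at compile time (the functions here are made up for the example):

    use std::num::ParseIntError;

    // 1) No null: an absent value is an Option, and you can't touch the inner
    //    value without handling the None case.
    fn find_user(id: u32) -> Option<&'static str> {
        if id == 1 { Some("alice") } else { None }
    }

    // 2) Errors are values: a Result is #[must_use], and you can't get at the
    //    Ok value without dealing with the Err side.
    fn parse_port(s: &str) -> Result<u16, ParseIntError> {
        s.parse::<u16>()
    }

    // 3) Generics: one definition, type-checked at compile time for every use.
    fn largest<T: PartialOrd + Copy>(items: &[T]) -> T {
        let mut max = items[0];
        for &item in &items[1..] {
            if item > max {
                max = item;
            }
        }
        max
    }

    fn main() {
        match find_user(2) {
            Some(name) => println!("found {}", name),
            None => println!("no such user"), // the compiler makes you write this arm
        }

        match parse_port("80x") {
            Ok(port) => println!("port {}", port),
            Err(e) => println!("bad port: {}", e), // and this one
        }

        println!("{} {}", largest(&[3, 7, 2]), largest(&[1.5, 0.2]));
    }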

In terms of the debate on futures, etc., you're right that it's simpler to write pure stack-based functions. The overhead of tokio and futures in Rust is not as high as that of a goroutine, but it does take more thought; it's a trade-off between productivity and stability. In your opinion it's ugly and hurts productivity, but to me it is pure elegance with zero overhead at runtime (and if you preallocate an arena/slab it can even have known memory costs). After working with tokio over the last half year or so, I find I'm as productive as I am when writing single-threaded blocking I/O code (granted, there was an initial ramp-up time before I became as comfortable as I am now).
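For anyone curious what that shape looks like, here is a sketch assuming tokio 1.x with the full feature set (an echo server, nothing real): the whole read-then-write sequence compiles into a single state machine, and roughly the only heap allocation is the spawned task itself, not one per .await.

    // Cargo.toml (assumed): tokio = { version = "1", features = ["full"] }
    use tokio::io::{AsyncReadExt, AsyncWriteExt};
    use tokio::net::{TcpListener, TcpStream};

    // Sequential-looking code; the compiler turns it into a single state machine,
    // so the awaits themselves don't heap-allocate.
    async fn echo(mut socket: TcpStream) -> std::io::Result<()> {
        let mut buf = [0u8; 1024];
        loop {
            let n = socket.read(&mut buf).await?;
            if n == 0 {
                return Ok(()); // peer closed the connection
            }
            socket.write_all(&buf[..n]).await?;
        }
    }

    #[tokio::main]
    async fn main() -> std::io::Result<()> {
        let listener = TcpListener::bind("127.0.0.1:8080").await?;
        loop {
            let (socket, _addr) = listener.accept().await?;
            // One lightweight task per connection on tokio's scheduler.
            tokio::spawn(async move {
                if let Err(e) = echo(socket).await {
                    eprintln!("connection error: {}", e);
                }
            });
        }
    }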



