
This unfortunately breaks down with (bad acting) NIFs. Thankfully you can mark 'em as the dirty evil little things that they are (with negligible overhead): ERL_NIF_DIRTY_JOB_CPU_BOUND. [1]

I implore anyone interested in Erlang or its surrounding languages to read its source code. [2] More specifically, the BEAM. I'll warn you that it's very 80's hackeresque, but in a good way. Incredibly pragmatic. The way they achieve their world-class scheduling? Tip: not via a perfect theoretical design cooked up in some comp-sci lab. [3] If you've been following game engine design within the last 9 years or so, you'll probably have a good idea of how they do it. [4][5]

[1]: https://medium.com/@jlouis666/erlang-dirty-scheduler-overhea...

[2]: https://github.com/erlang/otp/tree/maint/erts/emulator/beam

[3]: https://hamidreza-s.github.io/erlang/scheduling/real-time/pr...

[4]: http://blog.molecular-matters.com/2012/04/05/building-a-load...

[5]: There's a hell of a lot of depth to this problem domain. Charles Bloom's guiding principles for Oodle are spot on, and well worth a look.



> I'll warn you that it's very 80's hackeresque, but in a good way. Incredibly pragmatic.

I consider the BEAM VM as one of the marvels of software engineering.

You know it is good when you explain to other programmers that you can have something like an isolated memory process, just like an OS process, with preemption, only a few KBs of memory, a low-latency GC, and distribution across machines built in -- and they don't believe you.

Even experienced senior developers are skeptical, saying stuff like "well yeah, but those are green threads then and they have to yield explicitly" -- nope, they don't; "so you have callbacks then somehow" -- no, not callbacks; "what do you mean separate memory spaces, it is a single OS process, right?" and so on. It sounds like magic -- this stuff shouldn't exist, right? But the awesome thing is, it does.

Moreover, it is not a hacked-up version from a lab some place or a PhD dissertation proof of concept -- this is what powers banking, databases, messaging, and probably more than 50% of smartphone access to the internet today.


I'll just put this here in case anyone's interested.

https://github.com/mbrock/HBEAM

It's a Haskell executor of .beam programs, in very early prototype stage, and I abandoned working on it 5 years ago (apparently).

Of course it's not meant to be competitive in any way, it's basically just for fun, and because I wanted to learn more about how Erlang works.

The function `interpret1` in the middle of this file has the main opcode switch.

https://github.com/mbrock/HBEAM/blob/master/src/Language/Erl...

I think it's about the smallest subset needed to run a factorial program, but also to implement the very basics of mailboxes with send/receive/timeout.

It uses GHC's software transactional memory (STM) for the mailboxes:

https://github.com/mbrock/HBEAM/blob/master/src/Language/Erl...
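The send/receive/timeout trio maps loosely onto a thread-safe queue. A toy Python analogy of the semantics (nothing to do with the actual HBEAM code, and Erlang's selective receive is more involved than this):

```python
import queue

class Mailbox:
    """A toy process mailbox: send never blocks, receive blocks
    until a message arrives or the timeout expires (like Erlang's
    `after` clause in a receive expression)."""
    def __init__(self):
        self._q = queue.Queue()

    def send(self, msg):
        self._q.put(msg)

    def receive(self, timeout=None):
        try:
            return self._q.get(timeout=timeout)
        except queue.Empty:
            return 'timeout'   # Erlang would run the `after` body

mb = Mailbox()
mb.send({'hello': 'world'})
msg = mb.receive(timeout=0.1)      # the message
nothing = mb.receive(timeout=0.1)  # 'timeout' after 100 ms
```

GHC's STM gives you the same blocking-receive behavior compositionally (`retry` inside `atomically`), which is why it's such a natural fit for mailboxes.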

Someone, fork it and finish it! :-)


I'm interested. How does the preemptive scheduling work with blocking system calls? If an Erlang process tries to read from standard input using the read syscall (suppose we haven't set non-blocking mode on the fd), why does it not block the scheduler it runs on? Or does Erlang implement its own set of syscall wrappers that use epoll under the hood?


There is an IO thread pool for some blocking operations like file IO, and there is also epoll for sockets, for example if I see this in the prompt on my laptop:

   $ erl
   Erlang/OTP 18  ... [smp:4:4] [async-threads:10] ...
The async-threads indicates it has started 10 IO threads. So if a process needs to read from a file on a slow disk, it will dispatch that request to one of those threads and then it will be descheduled (put to sleep).
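The dispatch-then-desched idea can be sketched with an ordinary thread pool (a toy Python stand-in, not the actual erts mechanism):

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor

# A pool of "async threads" handles blocking IO so the scheduler
# thread itself never blocks (analogous to the async-threads pool).
io_pool = ThreadPoolExecutor(max_workers=10)

def slow_read(path):
    with open(path) as f:
        return f.read()          # this may block on a slow disk

# Stand-in for a file some process wants to read.
with tempfile.NamedTemporaryFile('w', delete=False, suffix='.txt') as f:
    f.write('hello')
    path = f.name

# The scheduler hands the blocking call to the pool and keeps running
# other processes; the requester is parked until the result is ready.
future = io_pool.submit(slow_read, path)
data = future.result()           # process "wakes up" with the result
```

The key point is that the scheduler thread only ever touches the non-blocking `submit`/`result` interface; the potentially blocking syscall happens on a pool thread.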

The smp:4:4 says there are 4 schedulers configured and enabled. Usually there is a scheduler (as an OS thread) which runs on each core (also highly configurable, with custom topologies, affinities etc). Those schedulers will pick and run Erlang processes.

Typically, to ensure fairness, each process is allowed to run a set number of reductions (think of them as roughly equivalent to bytecode instructions; even an internal C driver, like a regex parser, is expected to periodically yield and report how many reductions it consumed, so it can be descheduled).
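The reduction-budget idea can be sketched with generators standing in for processes (a toy Python model, not BEAM's actual run queue; the 2000 figure matches the timeslice mentioned elsewhere in this thread):

```python
from collections import deque

BUDGET = 2000  # reductions per timeslice

def scheduler(processes):
    """Round-robin over generator-based 'processes'. Each next()
    executes one 'reduction'; after BUDGET reductions the process
    is preempted and requeued, so no process can hog the scheduler."""
    run_queue = deque(processes)
    while run_queue:
        proc = run_queue.popleft()
        for _ in range(BUDGET):
            try:
                next(proc)              # execute one reduction
            except StopIteration:
                break                   # process finished, drop it
        else:
            run_queue.append(proc)      # budget exhausted: preempt

def count_to(n, log, name):
    for _ in range(n):
        log.append(name)
        yield                           # one reduction

log = []
scheduler([count_to(3000, log, 'a'), count_to(3000, log, 'b')])
# 'b' gets scheduled after 'a' burns its first 2000 reductions
```

This is why preemption doesn't need OS signals or timers: the counting happens in the interpreter loop itself, so a process is descheduled at a safe point rather than mid-instruction.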

Funny enough all that sounds a bit like an operating system, and that's a good way to conceptualize it. Erlang/OTP is like an OS for your code. A modern OS is expected to be resilient against bad processes messing with other processes' memory, multiple applications should run and be descheduled preemptively etc.


You've got it: processes are not allowed to run blocking syscalls under the hood. Transparently that work is handled by backing IO threads.


I'm interested as well! How does it work then?

(1) How does it preempt threads (my guess: it doesn't actually preempt threads, the interpreter yields after a certain number of instructions, or (if compiled) the compiler inserts conditional yields in each loop/function call/return)?

(2) How are memory spaces isolated (my guess: they aren't really, it's just that the memory allocator doesn't mix memory allocated by different threads)?


(1) In the normal case it just counts a certain number of VM instructions each process runs before it gets rescheduled. But it gets interesting with internal modules or C modules (for example the regex matcher): in that case the C module, as it works through the data, periodically reports that it consumed some number of reductions and is possibly told to yield now. (On that note: in 19.0 we'll have dirty schedulers by default, so blocking long-running C code will be handled much better.)

(2) An Erlang VM instance (it is called a node) is an OS process (plus some helper processes, but they are not important in this case), so from the kernel's point of view it is of course one heap. But the internal allocator keeps the spaces for BEAM's data separated. It is not always that basic: in some cases, for binary blocks and sub-blocks, it can actually share and ref-count them.
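A rough illustration of that layout (Python objects standing in for per-process heaps; the real allocator is far more involved):

```python
class Process:
    """Each simulated process owns a private heap (here a dict);
    terms in one heap never reference terms in another heap, so a
    process can be GC'd or killed without touching its neighbors."""
    def __init__(self):
        self.heap = {}

# Large binaries are the notable exception: stored once off-heap
# and reference-counted, so "sending" one to another process only
# copies a reference, not a megabyte of data.
shared_binary = b'x' * 1_000_000

p1, p2 = Process(), Process()
p1.heap['blob'] = shared_binary   # refcount bumped, no copy
p2.heap['blob'] = shared_binary   # same underlying object, shared
```

The separation is enforced by the allocator and the copying message-send path, not by the MMU; that's how you get OS-process-like isolation inside a single address space.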

In the new release I like that the mailbox can live outside the main heap of each Erlang process. That could be an interesting parameter to play with.


> (1) How does it preempt threads (my guess: it doesn't actually preempt threads, the interpreter yields after a certain number of instructions, or (if compiled) the compiler inserts conditional yields in each loop/function call/return)?

Yep. The interpreter lets each process run 2000 reductions (roughly == function calls), or until it waits for new messages if that's sooner.


> This unfortunately breaks down with (bad acting) NIFs.

Where possible, I think the Erlangy thing to do would be to just have a separate executable and communicate with that. Also ensures that things keep running if there's a segfault or something.
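A minimal sketch of the external-program side of such a setup, assuming Erlang's common length-prefixed port framing (the `{packet, 4}` option to `open_port`); this is an illustration, not a complete port driver:

```python
import struct

def read_packet(stream):
    """Read one {packet, 4}-framed message: a 4-byte big-endian
    length header followed by that many payload bytes."""
    header = stream.read(4)
    if len(header) < 4:
        return None               # port closed by the Erlang side
    (length,) = struct.unpack('>I', header)
    return stream.read(length)

def write_packet(stream, payload):
    """Write one framed message back to the Erlang side."""
    stream.write(struct.pack('>I', len(payload)) + payload)
    stream.flush()
```

In a real program you'd loop over `read_packet(sys.stdin.buffer)` and reply on `sys.stdout.buffer`; if the external program segfaults, the Erlang side just sees the port close and can restart it, which is exactly the isolation benefit being described.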


Exactly! Well, kinda.

You can always throw the NIFs onto other nodes. You can also write ports that are glorified external processes.

Other times, you just need low-latency and high throughput. In that case, you expect failure and design accordingly.


> (bad acting) NIFs. Thankfully you can mark 'em as the dirty evil little things that they are (with negligible overhead):
As a side issue, is there an easy method to determine if a NIF is problematic in this regard? I've used jiffy[0] in several codebases, but I keep reading these warnings and wondering whether I should be doing so.

[0] https://github.com/davisp/jiffy


The warnings are mostly about NIFs you write yourself, which you typically avoid if possible. And jiffy itself goes quite a long way to cooperate with Erlang VM's internals (at least so I've heard).



