
There is a reason: the BEAM is hardly prone to huge GC pauses at all. A bigger load results in every actor responding very slightly slower. Nothing else.

Many other systems don't have this property. They fall down under pressure.

Gosh, a huge chunk of HN is always so dismissive. At least read up a bit beforehand, man. The criticisms should be informed and benefit the readers, not only express a generic skepticism.



The beam just garbage collects each actor separately, and it so happens that much of the time your actor has finished before a gc happened so you never see the cleanup.

The beam also has a spectacular failure mode: OOM whenever messages come in at a higher rate than they are processed. The lack of backpressure mechanisms means a huge number of beam language developers spend way too much time recreating their own way of dealing with this, or pretend it is not a problem at all. This means too many libraries in the ecosystem behave totally differently under load.


I can see how you would think that, yes. In practice I haven't noticed it except in super rare cases where processes (actors) hold on to huge binaries / strings -- which is one of the weak points of BEAM's GC.


I've been bit by this. In reality you need to know your shit when it comes to tuning the Beam and GC to achieve decent performance under load without triggering OOM.


Not denying it, yeah, it's a blind spot and it's one of the more advanced topics when you start having actually loaded nodes in production.


The big offender is parsing huge JSON payloads and chucking small binary slices from them into an ets table.

Also concatenating binaries. Don't do that, use an iolist.


Yep. Has bitten me in the bottom before.


My experience of this is not theoretical.


Maybe you should write a blog post about it, many in the Elixir ecosystem are very serious devs and are always looking at ways to improve it.


There are entire github repos dealing with it.

Frankly the Elixir ecosystem, which is not without merit, is more interested in perpetuating the myth of magic scalability by virtue of the beam.


Frankly I am seeing that as a myth, you seem to have made up your mind some time ago or judged by 1-2 occasions.

I am on ElixirForum every day and worked with Elixir for 7 years and have never seen anyone "perpetuate myths". I've seen some people willing to "increase adoption" which was always met with resistance by the wider community -- we believe growth should be organic.

Pretty sad stance from you though, I have no idea why people get so ticked off when another programmer wants to tell them about a secret weapon.

If you are not willing to try it, that's fair. Say that. Claiming you know stuff about the ecosystem while a guy who is there every day is not seeing that at all comes across as... strange. Biased. And not arguing in good faith. :(


> If you are not willing to try it, that's fair. Say that. Claiming you know stuff about the ecosystem while a guy who is there every day is not seeing that at all comes across as... strange. Biased. And not arguing in good faith. :(

I am not willing to drag others, such as those that wrote the repos, into a technical discussion with people out to act as you are.


I really don't get your replies and why their tone has to be the way it is but, have it your way. I tried.


The guy you are responding to was completely calm and reasonable. Didn't say anything attacking or otherwise. I'm not sure why you are seemingly trying to cast him (and the Beam) in such a bad light, with seemingly no reason to back it up.


He is adopting the missionary tactic of feigning a desire for discussion when it is just veiled evangelism.


He seemed to be asking you to back up your claims and you seem to be just claiming that you don't need to and he is the problem...


Both of you are afflicted by that logical fallacy of failing to understand that you not encountering a phenomenon does not mean it does not happen or is rare, it just means you didn't encounter it.

If you try telling people that did encounter that phenomenon that in practice they wouldn't/didn't then you shouldn't be surprised if they question why they started talking to you in the first place.


Yeah but when you don't have any evidence and just accuse him baselessly as you have been... well...


Do you genuinely fail to see how your behaviour proves my point?

You cannot just hound people with demands because you don’t like what they are saying.


> Do you genuinely fail to see how your behaviour proves my point?

I genuinely see only one thing: I asked you to elaborate but you are convinced that I am pretending to discuss while I, again genuinely, actually did want to discuss.

You asserting something about me, a person whose mind you cannot read, is confusing and quite aggressive, in a very uncalled-for manner too. But as I said already -- have it your way, I disengaged because it became apparent you are not interested in discussing. OK. It's your right.

What's not OK is you claiming that I am not interested in discussing however, and I maintain that I was interested in discussing.

> You cannot just hound people with demands because you don’t like what they are saying.

1. I am not "hounding" you for anything, I asked a question.

2. You are again assuming my motivation and I assert that you have gotten it wrong. Your claim that I "disliked what you said" is a borderline personal attack and off-topic. I was confused why you claimed what you did and wanted you to elaborate, to find out what made you think like that and if I can change your mind with a few anecdotes and some facts (that are hard to look up because they require scanning a forum; yet they are there and are visible to everyone who engages with the platform).

BTW, if you really knew anything at all about the Elixir ecosystem you would know that its creator, to this day, engages with users on ElixirForum and asks for their feedback on what they find lacking. That is exactly the sort of engagement and genuine discussion spirit that you claim I (as a part of the Elixir community) don't have.

That alone invalidates your point entirely.

I am disengaging for a second and final time, let future readers decide for themselves.


Calling someone “biased” and “acting in bad faith” is a personal attack and violates this site’s rules. People get rate limited for far less on this site.


I'd argue taking things out of context and deliberately painting the commenter in a bad light is not nice forum discourse.

I said, very plainly and visibly, that my parent commenter's unwillingness to back up negative claims COMES ACROSS as biased and ARGUING (NOT "acting") in bad faith.

Come on now, this stuff is not hard, the message is literally up there. Not sure why you had to editorialize it and thus misconstrue it?


He didn't call him that at all. He said him being unwilling to explain his points and instead just making claims comes across that way.

He at no time called him biased or said he was directly acting in bad faith.


> Biased. And not arguing in good faith. :(


You are leaving the prior part of his sentence out. Regardless, before he said that, you weren't willing to answer either.


You are both gaslighting.

As I plainly explained, right at the top, it is not theoretical. There is no point engaging people with evidence if they are so dismissive of basic facts.

But then that is also true here. Your claims about him not saying what he plainly did are just bizarre.


I think that's developers using GenServer.cast when they should be using call. Call gives you a backpressure mechanism.
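To make the call-versus-cast point concrete, here is a minimal sketch of the same idea in Go, the other language this thread compares against. It is an analogy only, not BEAM code: `request`, `server`, `call`, `cast` and the two `peakDepth...` helpers are hypothetical names, and a large channel buffer stands in for an unbounded mailbox. A fire-and-forget sender can pile up as many messages as it likes while the receiver is stalled, whereas a sender that waits for each reply never has more than one request outstanding.

```go
package main

import "fmt"

// request is a message to the server. A nil reply channel marks a
// fire-and-forget ("cast"-style) request.
type request struct {
	payload int
	reply   chan int
}

// server handles requests one at a time and replies only when asked to.
func server(inbox <-chan request, done chan<- struct{}) {
	for req := range inbox {
		result := req.payload * 2
		if req.reply != nil {
			req.reply <- result
		}
	}
	close(done)
}

// call sends a request and waits for the answer, so a single caller can
// never have more than one request in flight.
func call(inbox chan<- request, payload int) int {
	reply := make(chan int, 1)
	inbox <- request{payload: payload, reply: reply}
	return <-reply
}

// cast sends a request and returns immediately without waiting.
func cast(inbox chan<- request, payload int) {
	inbox <- request{payload: payload}
}

// peakDepthWithCast queues n casts while the server is not yet running
// (a stand-in for a stalled consumer) and returns the queue depth reached.
func peakDepthWithCast(n int) int {
	inbox := make(chan request, n) // assumption: large buffer mimics an unbounded mailbox
	for i := 0; i < n; i++ {
		cast(inbox, i)
	}
	peak := len(inbox) // nothing has been consumed yet
	done := make(chan struct{})
	go server(inbox, done)
	close(inbox)
	<-done
	return peak
}

// peakDepthWithCall issues n calls from one caller and returns the deepest
// queue the caller observed between calls.
func peakDepthWithCall(n int) int {
	inbox := make(chan request, n)
	done := make(chan struct{})
	go server(inbox, done)
	peak := 0
	for i := 0; i < n; i++ {
		call(inbox, i)
		if d := len(inbox); d > peak {
			peak = d
		}
	}
	close(inbox)
	<-done
	return peak
}

func main() {
	fmt.Println("peak queue depth with cast:", peakDepthWithCast(100))
	fmt.Println("peak queue depth with call:", peakDepthWithCall(100))
}
```

The trade-off is that the waiting sender is only as fast as the receiver, which is exactly the backpressure being asked for.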


TBH at the load we had we got substantial savings by eventually replacing usage of gen_server as well, though that probably isn’t a good idea much of the time.

The OOMs were largely being caused by calls to and from other services (i.e. kafka) so the answer proved to be in controlling the rate at which things come in and out at the very edge.

From what I saw I got the impression the Beam devs assumed memory and CPU usage go together so a system that is under load memory wise would also be CPU wise, but this isn’t the case if your fan out and gather involves holding large* values on which the response is based, even if for tiny amounts of time.

EDIT: *large meaning "surprisingly small" if you're coming from other universes.


This sounds like ets tables holding on to binaries that were extracted from JSON. This is something that iirc pager duty ran into.


The underlying problem we had* was that the rate at which work was being completed was lower than the rate at which requests were coming in, which causes the mailboxes on the actors to grow indefinitely.

In golang the approximate equivalent is a buffered channel that would start blocking because it has run out of capacity, but the beam will just keep those messages piling on to the queue as fast as possible until OOM. This is obviously a philosophical tradeoff.

* I should qualify that each request here had a life in low double digit ms, but there were millions of them per second, and these were very big machines.
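The Go side of that comparison can be sketched in a few lines. This is an illustration under assumptions, not the system described above: `runBounded` is a hypothetical helper, and the one-millisecond sleep is just a stand-in for a consumer that is slower than its producer. The point it shows is that a blocking send caps the queue at the channel's capacity and slows the producer down instead of letting memory grow.

```go
package main

import (
	"fmt"
	"time"
)

// runBounded pushes n messages through a channel with the given capacity
// while a deliberately slow consumer drains it. It returns the deepest
// queue the producer observed and the number of messages consumed.
func runBounded(n, capacity int) (peak, consumed int) {
	ch := make(chan int, capacity)
	done := make(chan int)

	go func() {
		count := 0
		for range ch {
			time.Sleep(time.Millisecond) // consumer lags behind the producer
			count++
		}
		done <- count
	}()

	for i := 0; i < n; i++ {
		ch <- i // blocks whenever the buffer is full
		if d := len(ch); d > peak {
			peak = d
		}
	}
	close(ch)
	consumed = <-done
	return peak, consumed
}

func main() {
	peak, consumed := runBounded(50, 8)
	fmt.Printf("consumed %d messages, queue never held more than %d\n", consumed, peak)
}
```

Nothing is lost here; the cost is that the producer is forced down to the consumer's pace, which is the other side of the tradeoff mentioned above.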


Why weren't your processes dropping messages? Also I think you can tell the VM to not allow the process to exceed a certain message size and trigger some sort of rate limiting or scaling out.

Edit: huh. I could swear the VM had memory limit options. Guess not. Time to rewrite it in zig!
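For readers wondering what "dropping messages" looks like in practice, here is a minimal load-shedding sketch, again in Go and again only an analogy. `sheddingMailbox`, `newSheddingMailbox` and `offer` are hypothetical names, not part of any BEAM or Go library, and the drop counter assumes a single producer (it is not synchronized).

```go
package main

import "fmt"

// sheddingMailbox accepts messages up to a fixed limit and drops the rest.
type sheddingMailbox struct {
	queue   chan string
	dropped int // assumption: only one goroutine calls offer
}

func newSheddingMailbox(limit int) *sheddingMailbox {
	return &sheddingMailbox{queue: make(chan string, limit)}
}

// offer enqueues msg if there is room and reports whether it was accepted.
// It never blocks: when the queue is full the message is counted and dropped.
func (m *sheddingMailbox) offer(msg string) bool {
	select {
	case m.queue <- msg:
		return true
	default:
		m.dropped++
		return false
	}
}

func main() {
	mb := newSheddingMailbox(3)
	for i := 0; i < 5; i++ {
		mb.offer(fmt.Sprintf("req-%d", i))
	}
	fmt.Println("queued:", len(mb.queue), "dropped:", mb.dropped)
}
```

Unlike a blocking send, this keeps the producer fast and memory bounded at the price of losing work, so it only fits where a dropped request can be retried or ignored.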


Yeah, I think that's the assumption people had been operating under.

That team would have thoroughly endorsed a zig rewrite! It was a very odd situation where most of us liked erlang the language but found the beam to be an annoying beast, whereas most of the world seems to be the opposite.


That does not in any way explain a drop of 95%, which IMO is ridiculous and points to other issues.

The system they created now is totally different from the one they had. It’s more efficient by an insane margin. Choice of language seems like it would have trouble breaking the top 5 major reasons.


Why is it ridiculous?

I migrated Rails apps to Elixir before, we reduced from 15 servers to 3, and 1 was basically "if crap hits the fan", we could have gotten away with 2 easily.

It's worrying that a supposedly high-quality forum like HN receives comments with no substance. If you have an actual counter-argument, let's discuss. If not, well, not an interesting exchange.


You keep insisting HN is not providing comments up to your standards. Let’s not go there OK? I just disagree with your analysis. I think it is too simplistic to state GCs cause this and Elixir somehow magically causes 95% efficiency boosts.

All I’m saying is that if you can drop 1330 servers just like that, there might be something more going on than Python’s slowness.

This is from experience. I have seen people create slow and fast systems with just about any tech. I can make Elixir crawl, I can assure you of that.

I have seen Python apps that used 10 servers be reduced to one as well. Same tech, just a more efficient mindset. It’s IMO a bit too simplistic to say systems with GCs fall over when under load.


Sure, if you want to expand the discussion to "everyone can make every tech stack act badly" then you might have had an argument. I don't find that argument compelling however -- it's borderline meaningless.

Also nobody used the word "magically" before you did. Note that.

What's your argument exactly? That Elixir is overrated? Or something else?

Furthermore, I am not insisting on my standards of the quality of comments. I am under the impression that's the expected quality of comments on HN at large.


Good sir, I am not looking for trouble. I concur, my argument is nebulous at best.

To be clear, I think Elixir is marvelous. This post spiked my interest in it actually. Sorry if I come across ignorant. That’s because I am.


Thank you for recognizing that we were going nowhere. Apologies if my tone was sharp.

I am not evangelizing tech -- I am a polyglot and I use what I find is best suited for a job, and Elixir happens to cover quite a lot of ground. That's all really. I also use Rust and Golang quite a bit.

I simply get ticked off when people start demeaning something without seriously working with it or even reading a bit beforehand. Sorry if I mistakenly put you in that group.


You’re comparing Elixir to Python and Rails. Many of us have seen Python replaced with other languages for an astronomical improvement. Python and Ruby are the slowest category of languages; they’re easily beaten and you need to offer some evidence as to why the improvement was derived from migrating to Elixir specifically rather than moving away from Python/Rails.


Sure, not wrong at the outset, but you do come across as dismissive of Elixir while stating this.


I'm not dismissive of Elixir, I'm dismissive that Elixir magically solved this problem in a way other languages couldn't. If you have some supporting evidence or rationale as to why Elixir is uniquely able to solve this problem, I'm happy to hear it, but so far you've offered up "low latency GC" which isn't unique to Elixir and itself doesn't adequately explain the degree of improvement over Python (GC latency alone doesn't reduce from 200 servers to 4). Again, I'm happy to entertain arguments about why Elixir is uniquely able to improve performance, but I'm not going to take it on faith (which you interpret as 'dismissive of Elixir').


Okay, how about "a runtime that has been extremely carefully crafted for the lowest possible latency all the way to the point of the hardware falling over"?

It's very hard to provide evidence unless we make a screen-share call where I show you real time dashboards of services being bombarded with thousands of requests per second and for you to see for yourself how the median latency numbers climb from 25ms to 45ms and then fall back to 20-30ms after the burst load subsides.

I find it difficult to just describe this because as much as I've seen it many times in practice, it's also practically impossible (NDAs and compliance nightmares) to demonstrate it to a programmer outside the companies I've worked with without violating all sorts of laws. :(

But yes, basically: a super latency optimized runtime, a GC that's not very sophisticated but it elegantly dodges most GC problems by simply releasing all memory linked to an actor as soon as it quits (and Erlang/Elixir encourage you to spawn many of those in the right conditions; not for every single thing though), and one of the very fastest dynamic languages in the world, probably second only to JS's V8.

All of that is combined with me working with several other programming languages and their hosting solutions which were tripping over themselves when 1000 req/s started coming in (looking at you, Ruby 3.X and Puma and a few other servers; or PHP, or Python).

TL;DR: reliability is much better, latency is predictable.

Weirdest thing is: people don't believe it. If you only knew the CTOs I worked with: they were extremely data-driven and they would not allow me or anyone to just pull all of that out of their bottom. All had to be proven with numbers, and me and my teams did that, many times.

I understand the skepticism somewhat, but you and a few others seem to look at Elixir through the lens of "too good to be true", and IMO you should try relaxing that skepticism to some extent. And try to be a little more sympathetic because again, I literally cannot give you the cold, hard data without violating at least three laws.


If you're seeing GC pauses of >1ms in Go, please report a bug


Golang is actually my #2 after Elixir (simply because Elixir code is much more terse and often more readable).

So yeah, Golang's GC is world-class, no argument from me.


Go also doesn’t have huge GC pauses (and moreover idiomatic Go generates very little garbage) and I have a hard time seeing how GC pauses would contribute so significantly. Java also allegedly has a very low latency collector.


Apologies, I was only responding to a single point which was meant to counter another. I am well aware that Golang's GC is world-class and it's my second most loved language after Elixir.


If, and that's a very, very, very big if, the current open-source GCs are really leading to too many pauses, then you can go to Azul and buy a better VM and GC, further improving the performance compared to BEAM.


Yep, agreed, I am just listing possibilities. More often than not a performance loss in Erlang/Elixir is caused by GC pauses but you can do a lot to reduce or outright eliminate those.



