
Hard to respond to this, it's too high level and generic to disagree with.

But in practice, if you're on the BEAM and use OTP as intended, your code is fully parallel, split into fault-tolerant pieces, and one crash does not affect the entire system.

I have been responsible for two Elixir monoliths in the last 6 years, and I have not seen them crash even once. Not even during prototyping, not even during load testing, never once. And that's without having to adopt defensive coding strategies. I write about 1/4 of the error-checking code I would write in Rust, and about 1/10 of what I would write in Python or Go.

How many platforms can realistically promise this?

Note, I haven't said my code is bug free. It's buggy because I'm only human and make plenty of mistakes. But the BEAM saves me from myself, saves me from third-party JSON endpoints that for a split second return corrupted data, saves me when the DB connection drops for a second, and saves me from the myriad of transient heisenbugs that happen in production.

The bugs are still there. But the system is fault-tolerant because it has been designed to be.

--

If Saša Jurić can't prove it to you, then no one can: https://youtu.be/JvBT4XBdoUE

(I'm pretty sure every time someone posts this link, the number of Elixir converts increases a notch)




> How many platforms can realistically promise this?

I love the OTP framework. So much that I suffer from "erlang envy" in every other language I'm using.

Which is why I made this a couple years ago: https://linkdd.github.io/triotp/ (not benchmarked, not tested in production, only an experiment)


> If Saša Jurić can't prove it to you, then no one can

I think it would have been better if he had more time. The first part was nothing special as such, it could have been easily replicated in even crusty old things like Delphi, never mind modern C# or similar.

It only became interesting when he showed the introspection and when he scaled out to another instance. These are things you cannot do so easily in other systems, so it would have been more interesting to spend more time on them.

I'm sure there are other videos doing just that, just my thoughts on this one as a "PR video".


Did you use LiveView for those projects? If so, what sort of scale are they? I've been seriously considering learning Elixir+Phoenix, but don't know whether LiveView is suitable for serious projects (where serious blog project ∉ serious projects).


After wondering this myself, I finally just went all in on LiveView and was extremely surprised at how efficient it was. I have had over 4,000 concurrent users on a single LiveView with real-time messaging, streaming transcriptions, viewer counter updates (don't use presence for this), and the typical interactivity (signing on, navigating, etc.). On a single 8-core machine it maxed at 20% CPU during the few minutes everyone was signing on (via OAuth) and idled at less than 5%. RAM never exceeded 900MB. I've since distributed the same application into EMEA and APAC to provide better latency for those users (which takes libcluster and about 20 lines of config to set up), and zero problems with that either.


You'd have to show what your view looks like, because a large part of the memory consumption comes from the diffs that the server keeps. Each session will have its own, similar diffs.


Of course, each case will be unique. I heavily optimized with temporary assigns so the message lists would not be persisted in server memory. This is even better now with the new streams functionality. The knobs to tweak are there.


Check out this recent talk about cars.com moving to LiveView: https://www.youtube.com/watch?v=XzAupUHiryg

Seems like most of their issues would not apply to a greenfield project and now that things are stable, they're quite happy with the choice.

Traffic estimate: https://www.similarweb.com/website/cars.com/#overview


Not a direct answer, but the BEAM, Elixir and LiveView are ridiculously easy to scale vertically (i.e. add more CPU and RAM, the scheduler takes care of everything else) and horizontally (cluster forming and distribution of processes among multiple nodes is a first-class feature).

If people build multi-billion dollar companies on top of PHP and Node.js, I would not even be concerned about scaling Phoenix.


I'm sure your software is great, and I don't mean this personally, but I just find this philosophy really strange. You don't write defensive code, and you don't often check for errors. You just let pieces of your application fail, and feel that this is OK because the overall application continues on. IMHO, Elixir isn't "saving" you from "transient heisenbugs", it's encouraging you not to worry about them, which means that they will proliferate.


The "let it crash" philosophy has a misleading name. It's not "don't handle errors", it's "delegate the error handling to a component dedicated to this".

With OTP, you have what we call a "supervision tree". Your software is divided into components (Erlang processes) and organized in a tree where the leaves are your components and the inner nodes are the "supervisors", which catch the error and restart/retry the operation.

What's the result? Imagine you have a component that is polling data from the database and then sending a message to another component based on that data. For a split second:

  - there is a network error
  - you lose the connection to the database
  - the component crashes
  - its supervisor notices the crash and restarts the process

If the network error was temporary, everything goes back to normal, and your system did not stop for a temporary network error. And your component does not need to have "retry code" (it is delegated to the parent supervisor).

If the network error was not temporary, the parent supervisor will notice that the component crashes far too often and will decide to crash as a result, letting the supervisor's own parent deal with the problem.

In Erlang/Elixir, processes can "monitor" other processes. They will receive a message when the target process dies (normally or abnormally). This allows you to delegate the error handling, so yes, you write less defensive code, because that code is located in a dedicated component that will be notified automatically.
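
For readers who think better in code, here is a toy sketch of the restart-then-escalate behaviour described above. It is plain Python, not Elixir, and it is not how OTP is implemented: `supervise`, `RestartLimitExceeded` and `make_flaky_poller` are made-up names for illustration, and the restart limits are arbitrary. Real supervisors run children as isolated processes and restart them with fresh state; the fake worker below keeps a counter across restarts only to simulate an outage that eventually clears.

```python
import time
from collections import deque


class RestartLimitExceeded(Exception):
    """Raised when a child crashes too often; a parent supervisor would handle it."""


def supervise(child, max_restarts=3, window_seconds=5.0, clock=time.monotonic):
    """Run `child()` and restart it whenever it raises.

    If more than `max_restarts` crashes happen within `window_seconds`,
    give up and raise, which escalates the failure to our own caller.
    """
    crashes = deque()
    while True:
        try:
            return child()
        except Exception as exc:
            now = clock()
            crashes.append(now)
            # Forget crashes that have fallen out of the sliding window.
            while crashes and now - crashes[0] > window_seconds:
                crashes.popleft()
            if len(crashes) > max_restarts:
                raise RestartLimitExceeded(
                    f"{len(crashes)} crashes within {window_seconds}s"
                ) from exc


def make_flaky_poller(failures):
    """Hypothetical worker: crashes `failures` times, then succeeds."""
    state = {"remaining": failures}

    def poll():
        # Note: no retry logic here; the worker just fails.
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise ConnectionError("database connection dropped")
        return "rows fetched"

    return poll
```

With these defaults, `supervise(make_flaky_poller(2))` rides out the two transient failures and returns normally, while `supervise(make_flaky_poller(10))` raises `RestartLimitExceeded` on the fourth crash, which is the point where the next supervisor up would take over.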


> You don't write defensive code, and you don't often check for errors.

There is a limit to how defensive you can make your code. As soon as you call out to another service for data, the universe of error situations they can create goes to infinity. For example, they could happily send you a 200 but return mistyped JSON. What do you do in that case? Are you genuinely going to write that as a test condition to make sure the transactional chain in your system handles this and rolls back the db inserts that preceded it? I can tell you that in 99% of cases idiomatically written Elixir will "do the right thing" if this happens, and that is a consequence of the "let it fail" philosophy. Moreover, if you try to write error handling code in the traditional sense, you will probably get it wrong, it will be hard to understand, and it will be hard to debug if your implementation is not quite right.
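
To make the mistyped-JSON case concrete, here is a rough analogue in Python with the standard `sqlite3` and `json` modules. It is not Elixir or Ecto, and the `orders`/`payments` tables and `import_order` function are invented for the example. The point it tries to show is the same one: the function has no error-handling branches at all, a bad payload simply raises, and the surrounding transaction undoes the insert that preceded it.

```python
import json
import sqlite3


def import_order(conn, raw_body):
    """Insert an order, then parse the remote payload; any exception undoes the insert."""
    # `with conn` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("INSERT INTO orders (status) VALUES ('pending')")
        payload = json.loads(raw_body)    # raises on corrupted JSON
        amount = int(payload["amount"])   # raises on a missing or mistyped field
        conn.execute("INSERT INTO payments (amount) VALUES (?)", (amount,))


# In-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (status TEXT)")
conn.execute("CREATE TABLE payments (amount INTEGER)")
```

Calling `import_order(conn, '{"amount": "12"}')` stores one order and one payment. Calling it with a truncated body such as `'{"amount": '` raises a `ValueError` and leaves both tables as they were, so the caller (or a supervisor, on the BEAM) only has to decide whether to try again.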


How can I ever defend my position as lead engineer responsible for these systems when someone on the Internet, who has never seen my code running, is adamant that my systems are unstable and lousy with bugs?

Trying to continue this conversation is a waste of our time. Feel free not to use Elixir.


Again, I'm sure your code is great, and I'm not asking you to defend anything. I'm simply responding to the stated philosophy in the linked article.


"Let it crash" is a sound and proven design philosophy, but it would be easier to grok if you were to spend time learning the platform, rather than comparing it in a vacuum with the things you already know.

It is quite frustrating when someone has already decided that something they don't know is dumb or makes no sense. It is more plausible that the unknown unknowns are blinding your judgement than that Erlang and Elixir developers worldwide are suffering a collective hallucination, is it not?


Yes. munchler, see:

Empty your cup.

https://wiki.c2.com/?EmptyYourCup


I'm an Elixirist, love the idea of "let it crash", and I still code defensively.

To me these phrases are more about the guarantees—if it does crash it recovers—and less about dogmatic advice that everyone follows.

But there's a time and place for different levels of defensive programming. I get to pick and choose based on where I see risk, but I know the whole app isn't going to blow up if I missed something.



