
> Or, if you prefer, you can still fallback to Redis, which is something I would consider doing if the geoip lookups are expensive (either in terms of time or money).

What if the geoip lookup took 1.5 seconds to look up from a remote API? Is ETS still the right choice?

Based on your statement, it sounds like you wouldn't since that's a long time (relative to a 25ms response). But if ETS is meant to be used as a cache, wouldn't that defeat the purpose of what it's meant to be used for?

Like, if I wanted to cache a PostgreSQL query that took 1 second to finish. Isn't ETS the primary place for such a thing? But 1 second is a long execution time. I know Cachex (the Elixir lib) uses ETS to cache things, so now I'm wondering if I've been using it for the wrong thing (caching database calls, API call results, etc.).

Normally in Python or Ruby I would have cached those things in Redis, and lookups are on the order of microseconds when Redis is running on the same $5 / month VPS as my web server. It's also quite speedy over a local network connection for a multi-server deploy. Even with a distributed data store in Elixir, you'd hit the same network overhead, right?

> if I need distributed state, I just use to the database too.

This part throws me off because I remember hearing various things in Phoenix work in a distributed fashion without needing Redis.




> What if the geoip lookup took 1.5 seconds to look up from a remote API? Is ETS still the right choice?

I would use ETS to cache local lookups (for all cores in the same node), then fall back to Redis to populate the ETS cache. But again, feel free to skip either ETS or Redis. The point is that ETS adds a different tool you may (or may not) use.
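
To make the layering concrete, here is a rough sketch of that lookup order in Python rather than Elixir. Everything in it is an assumption for illustration: plain dicts stand in for the per-node ETS table and for the shared Redis instance, and `make_two_tier_lookup` and `fetch` are hypothetical names, not part of any library.

```python
from typing import Callable, Dict


def make_two_tier_lookup(
    local: Dict[str, str],
    shared: Dict[str, str],
    fetch: Callable[[str], str],
) -> Callable[[str], str]:
    """Build a lookup that checks the node-local cache, then the shared cache.

    `local` plays the role of ETS (one per node), `shared` plays the role of
    Redis (one per cluster), and `fetch` is the slow origin call, such as a
    remote geoip API. All three are stand-ins, not real clients.
    """

    def lookup(key: str) -> str:
        # 1. Node-local hit: no network round trip at all.
        if key in local:
            return local[key]
        # 2. Shared hit: one round trip, but the slow origin is skipped.
        if key in shared:
            value = shared[key]
        else:
            # 3. Miss everywhere: pay for the slow call once, then share it.
            value = fetch(key)
            shared[key] = value
        # Populate the local cache so later lookups on this node stay local.
        local[key] = value
        return value

    return lookup
```

With two nodes sharing one `shared` dict, the slow `fetch` runs once per key for the whole cluster, and each node pays one shared lookup before serving the key locally. Dropping either tier just means deleting the matching branch.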

> Like, if I wanted to cache a PostgreSQL query that took 1 second to finish. Isn't ETS the primary place for such a thing?

Here is the math you need to consider. Let's say you have M machines with N cores each. Then remember that:

1. ETS is local lookup

2. Redis is distributed lookup

If you cache the data in memory in Ruby/Python, you will have to request this data from PostgreSQL M * N times to fill in all of the caches, one per core per node. Given that number of queries, I would most likely resort to Redis.

In Elixir, if you store the data in ETS, which is shared across all cores, you will have to do only M lookups. If I am running two or three nodes in production, then I am not going to bother to run Redis because having two or three different machines populating their own cache is not an issue I would worry about.
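
The arithmetic above fits in a few lines. This is only a sketch: the function name and the scope labels are made up for illustration, and the "shared" case assumes a single Redis-style cache that the whole cluster fills once.

```python
def origin_queries_to_warm(machines: int, cores_per_machine: int, scope: str) -> int:
    """Count how many times one cached value must be fetched from the origin.

    scope is a hypothetical label:
      "per_core" - one in-memory cache per OS process (typical Ruby/Python)
      "per_node" - one cache shared by all cores on a machine (ETS)
      "shared"   - one cache for the whole cluster (assumed Redis-style)
    """
    if machines < 1 or cores_per_machine < 1:
        raise ValueError("machines and cores_per_machine must be positive")
    if scope == "per_core":
        return machines * cores_per_machine  # M * N
    if scope == "per_node":
        return machines  # M
    if scope == "shared":
        return 1
    raise ValueError(f"unknown scope: {scope!r}")
```

For a hypothetical deploy of 3 machines with 8 cores each, that is 24 origin queries with per-core caches, 3 with a per-node cache, and 1 with a shared cache.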

> > if I need distributed state, I just use to the database too.

Apologies, I meant to say "persistent distributed state" as not all distributed state is equal. For ephemeral distributed state, like Phoenix Presence and Phoenix PubSub, there is no need for storage, as they are about what is happening on the cluster right now.


My opinion is that this depends entirely on the cost relative to the overall task, and how likely cache hits are to occur. If cache hits are very likely and the task occurs frequently, I'd strongly consider storing it in ETS. If cache hits are unlikely, then it depends purely on how expensive the task is, but generally there isn't a lot of benefit to caching things that are infrequently accessed.
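
One rough way to put numbers on that trade-off is the expected time per request. This is a simplified model of my own, not a benchmark: it assumes every request pays for the cache lookup and that only misses pay for the task. The function name and parameters are hypothetical.

```python
def expected_latency_ms(hit_rate: float, cache_ms: float, task_ms: float) -> float:
    """Average time per request under a simple hit/miss model.

    Every request pays cache_ms for the lookup; a miss also pays task_ms
    for the underlying work (the query, API call, and so on).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return cache_ms + (1.0 - hit_rate) * task_ms
```

If hits are rare, the result stays close to `task_ms` and the cache buys very little, which is why measuring the real hit rate first matters.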

I wouldn't cache database queries unless the query is expensive, or the results rarely change but are frequently accessed.

Generally though, whether to store something in ETS or not is situational - your best bet is actually measuring things and moving stuff into ETS later when you've identified the areas where it will actually make a meaningful difference.

> This part throws me off because I remember hearing various things in Phoenix work in a distributed fashion without needing Redis.

This is true, but it depends on what kind of consistency model you need for that distributed state. The data you are referring to (I believe) is for Phoenix Presence, and is perfectly fine with an eventually consistent model. If you need stronger guarantees than that, you'll need a different solution than the one used by Phoenix - and for most things that require strong consistency, it's better to rely on the database to provide that for you, rather than reinvent the wheel yourself. There are exceptions to that rule, but for most situations, it just doesn't make sense to avoid hitting the database if you already have one. For use cases that would normally use ETS, but can't due to distribution, Mnesia is an option, but it has its own set of caveats (as does any distributed data store), so it's important to evaluate them against the requirements your system has.



