What I am saying here is that it's not true in all cases, so candidates should avoid blurting out rehearsed sentences like that.
Caches usually reduce load on the thing behind the cache. They can sometimes reduce latency in ways that matter, but often don't. For example, if the p99 requests all miss the cache, then the cache won't help tail latency.
Assuming your scenario of 99% cache misses, that means you need a cache whose read latency is less than 1% of the cost of a cache miss. There are plenty of ways to design such a cache so that you still get a net performance benefit, even in your greatly exaggerated scenario.
One very simple example of an incredibly cheap cache that is almost certainly going to cost less than 1% of the cache miss latency is a bloom filter. Bloom filters can be tuned to be incredibly space efficient as well.
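To make the Bloom-filter point concrete, here is a minimal sketch (names and sizes are illustrative, not from any specific system). A membership check is a handful of hashes and bit tests, far cheaper than a backend lookup, and Bloom filters never report a false negative, so they can safely short-circuit lookups for keys that were never stored:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: m bits, k hash positions derived from SHA-256."""

    def __init__(self, m_bits: int, k_hashes: int):
        self.m = m_bits
        self.k = k_hashes
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive k independent bit positions by salting the hash input.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: str) -> bool:
        # False positives possible (tunable via m and k); false negatives never.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))
```

The space/accuracy trade-off is exactly the tuning knob mentioned above: more bits and more hashes drive the false-positive rate down, and the filter itself stays tiny relative to the data it fronts.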
Please don't be so dismissive of people who answer questions by giving generally good and standard advice, just because you have some silly corner case that you want to play "gotcha" on.
You are talking about minimizing the expected or average latency, which is very rarely important. The actual goal will depend on the broader system but typically you want to minimize the latency of the worst 95%, 99%, 99.9% etc of requests. Whether or not the cache reduces latency depends on the distribution of the cache hits. In practice, items that miss a simple LRU cache often have the longest latency in other parts of the system, in which case the cache would not reduce p99 latency.
EDIT: I recommend reading the Google SRE book or the famous "The Tail at Scale" article.
Well folks, you heard it here. Average latency is rarely important and if someone presents a very standard and common way to reduce latency of a system, you should dismiss them because they should be aware that your specific system that they have no knowledge of would result in 99% of cache misses.
I used to work at Google long ago in the platforms division; in fact I worked on the BigTable cache (among other things, mostly related to performance). It would be very sad indeed if today's SRE book dismissed caching as a vital and standard optimization strategy and instead played all kinds of gotchas with potential candidates, as I believe you're doing.
With that said, the article you linked does not in any way support your claim about caching; on the contrary, it hardly discusses caching at all. It's as if you just dumped a document you thought I wouldn't read as a way to be dismissive.
What's your ldap? I can look you up and see what you were working on, and whether it required making design decisions around reduced average system latency.
And yeah this is exactly my point. If you present a "standard and common" solution that isn't applicable to the question I actually asked, and if you blindly apply solutions without thinking about what problem is actually being solved, then that's bad in an interview.
Average latency is usually not the thing you want to reduce because it's not representative of what any actual user is experiencing.
Why would you do that? Now you're being weird and kind of creepy.
With that said, to the best of my knowledge it is the same as my Hacker News handle, so you are welcome to find whatever info you'd like with that, but please understand your request is very creepy and awkward.
If you need to reduce average latency and the cache would actually do that, sure. Cache doesn't magically make all your network latencies drop, though.
yeah, not sure what is too hard to understand about that. Perhaps it depends on the context, but I highly doubt you can go wrong by saying a cache should be added to reduce latency.
The way it typically falls apart for my candidates is I ask about optimizations, they say X could be cached, then I ask about cache invalidation and, lo and behold, it didn't even occur to them that the cache needs to be invalidated at some point, so they blurt something along the lines of redoing the entire expensive operation and checking it against the cache, or something similarly nonsensical.
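For contrast, a minimal sensible answer to the invalidation question might look like a read-through cache with a TTL plus explicit invalidation on writes (this is a generic sketch, not any particular system's design; `loader` stands in for whatever expensive fetch is being cached):

```python
import time

class TTLCache:
    """Read-through cache: entries expire after a TTL, writers can also
    invalidate explicitly so readers don't serve stale data until expiry."""

    def __init__(self, ttl_seconds: float, loader):
        self.ttl = ttl_seconds
        self.loader = loader   # the expensive operation, e.g. a DB query
        self.store = {}        # key -> (value, expiry_timestamp)

    def get(self, key):
        entry = self.store.get(key)
        now = time.monotonic()
        if entry is not None and entry[1] > now:
            return entry[0]                    # fresh hit
        value = self.loader(key)               # miss or stale: reload once
        self.store[key] = (value, now + self.ttl)
        return value

    def invalidate(self, key) -> None:
        # Call this from the write path after updating the source of truth.
        self.store.pop(key, None)
```

The point isn't that TTL + write invalidation is always right; it's that the candidate should have *some* coherent story for when cached entries stop being served.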
The problem isn't demonstrating understanding of what a cache does conceptually, in fact any SWE worth their salt ought to be able to explain caching. The problem is if the candidate's decision to add a cache causes the system behavior to become obviously incorrect upon an ounce of further inspection, because this demonstrates a lack of foresight about local maxima vs global maxima, and systems are all about trade-offs at the macro scale.
Caching is an easy thing to blurt out in an interview setting, but not all problem spaces benefit from caching and caching often isn't the only available solution, or the most ideal one.
I honestly can't decide which would be worse: a developer who literally doesn't know what a cache is or one who installs caches with bad invalidation policies.
In practice, "latency" is too vague to be what matters. In many systems it's the worst 1% or 0.1% of latencies that matter, and adding a cache improves those tail latencies only in some situations.
Usually adding caches helps by reducing load on the service behind the cache.
We should add a cache to reduce latency tho…