What I am saying here is that it's not true in all cases, so candidates should avoid blurting out rehearsed sentences like that.
Caches usually reduce load on the thing behind the cache. They can sometimes reduce latency in ways that matter, but often don't. For example, if the p99 requests all miss the cache, then the cache won't help tail latency.
Assuming your scenario of 99% cache misses, that means you need a cache whose read latency is less than 1% of the cost of a cache miss. There are plenty of ways to design such a cache so that you still get a net performance benefit, even in your greatly exaggerated scenario.
One very simple example of an incredibly cheap cache that is almost certainly going to cost less than 1% of the cache miss latency is a bloom filter. Bloom filters can be tuned to be incredibly space efficient as well.
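To make the Bloom-filter point concrete, here is a minimal sketch (names and sizes are illustrative, not from any specific system). A membership check is a handful of hashes and bit tests, far cheaper than a backend lookup, and Bloom filters never report a false negative, so they can safely short-circuit lookups for keys that were never stored:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: m bits, k hash positions derived from SHA-256."""

    def __init__(self, m_bits: int, k_hashes: int):
        self.m = m_bits
        self.k = k_hashes
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive k independent bit positions by salting the hash input.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: str) -> bool:
        # False positives possible (tunable via m and k); false negatives never.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))
```

The space/accuracy trade-off is exactly the tuning knob mentioned above: more bits and more hashes drive the false-positive rate down, and the filter itself stays tiny relative to the data it fronts.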
Please don't be so dismissive of people who answer questions by giving generally good and standard advice, just because you have some silly corner case that you want to play "gotcha" on.
You are talking about minimizing the expected or average latency, which is very rarely important. The actual goal will depend on the broader system but typically you want to minimize the latency of the worst 95%, 99%, 99.9% etc of requests. Whether or not the cache reduces latency depends on the distribution of the cache hits. In practice, items that miss a simple LRU cache often have the longest latency in other parts of the system, in which case the cache would not reduce p99 latency.
EDIT: I recommend reading the Google SRE book or the famous "The Tail at Scale" article.
Well folks, you heard it here. Average latency is rarely important and if someone presents a very standard and common way to reduce latency of a system, you should dismiss them because they should be aware that your specific system that they have no knowledge of would result in 99% of cache misses.
I used to work at Google long ago in the platforms division; in fact I worked on the BigTable cache (among other things, mostly related to performance). It would be very sad indeed if today's SRE book dismissed caching as a vital and standard optimization strategy and instead played all kinds of gotchas with potential candidates, as I believe you're doing.
With that said, the article you linked does not in any way support your claim about caching; on the contrary, it hardly discusses caching at all. It's as if you just dumped a document you thought I wouldn't read as a way to be dismissive.
What's your ldap? I can look you up and see what you were working on, and whether it required making design decisions around reduced average system latency.
And yeah this is exactly my point. If you present a "standard and common" solution that isn't applicable to the question I actually asked, and if you blindly apply solutions without thinking about what problem is actually being solved, then that's bad in an interview.
Average latency is usually not the thing you want to reduce because it's not representative of what any actual user is experiencing.
Why would you do that? Now you're being weird and kind of creepy.
With that said, to the best of my knowledge it is the same as my Hacker News handle, so you are welcome to find whatever info you'd like with that, but please understand your request is very creepy and awkward.
If you need to reduce average latency and the cache would actually do that, sure. Cache doesn't magically make all your network latencies drop, though.
yeah, not sure what is too hard to understand about that. Perhaps it depends on the context, but I highly doubt you can go wrong by saying a cache should be added to reduce latency.
The way it typically falls apart for my candidates is I ask about optimizations, they say X could be cached, then I ask about cache invalidation and, lo and behold, it didn't even occur to them that the cache needs to be invalidated at some point, so they blurt something along the lines of redoing the entire expensive operation and checking it against the cache, or something similarly nonsensical.
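For contrast, a minimal sensible answer to the invalidation question might look like a read-through cache with a TTL plus explicit invalidation on writes (this is a generic sketch, not any particular system's design; `loader` stands in for whatever expensive fetch is being cached):

```python
import time

class TTLCache:
    """Read-through cache: entries expire after a TTL, writers can also
    invalidate explicitly so readers don't serve stale data until expiry."""

    def __init__(self, ttl_seconds: float, loader):
        self.ttl = ttl_seconds
        self.loader = loader   # the expensive operation, e.g. a DB query
        self.store = {}        # key -> (value, expiry_timestamp)

    def get(self, key):
        entry = self.store.get(key)
        now = time.monotonic()
        if entry is not None and entry[1] > now:
            return entry[0]                    # fresh hit
        value = self.loader(key)               # miss or stale: reload once
        self.store[key] = (value, now + self.ttl)
        return value

    def invalidate(self, key) -> None:
        # Call this from the write path after updating the source of truth.
        self.store.pop(key, None)
```

The point isn't that TTL + write invalidation is always right; it's that the candidate should have *some* coherent story for when cached entries stop being served.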
The problem isn't demonstrating understanding of what a cache does conceptually, in fact any SWE worth their salt ought to be able to explain caching. The problem is if the candidate's decision to add a cache causes the system behavior to become obviously incorrect upon an ounce of further inspection, because this demonstrates a lack of foresight about local maxima vs global maxima, and systems are all about trade-offs at the macro scale.
Caching is an easy thing to blurt out in an interview setting, but not all problem spaces benefit from caching and caching often isn't the only available solution, or the most ideal one.
I honestly can't decide which would be worse: a developer who literally doesn't know what a cache is or one who installs caches with bad invalidation policies.
In practice, "latency" is too vague to be what matters. In many systems it's the worst 1% or 0.1% of latencies that matter, and adding a cache improves those tail latencies only in some situations.
Usually adding caches helps by reducing load on the service behind the cache.
We should add a cache to reduce latency tho…