You are talking about minimizing the expected (average) latency, which is very rarely important. The actual goal depends on the broader system, but typically you want to minimize tail latency: the 95th, 99th, or 99.9th percentile of request latency. Whether or not the cache reduces that depends on the distribution of the cache hits. In practice, items that miss a simple LRU cache often have the longest latency in other parts of the system too, in which case the cache would not reduce p99 latency.
EDIT: I recommend reading the Google SRE book or the famous "The Tail at Scale" article: https://www.barroso.org/publications/TheTailAtScale.pdf
Well folks, you heard it here. Average latency is rarely important and if someone presents a very standard and common way to reduce latency of a system, you should dismiss them because they should be aware that your specific system that they have no knowledge of would result in 99% of cache misses.
I used to work at Google long ago in the platforms division, in fact I worked on the BigTable cache (among other things, mostly related to performance). It would be very sad indeed if today's SRE book dismisses caching as a vital and standard optimization strategy and instead plays all kinds of gotchas with potential candidates as I believe you're doing.
With that said, the article you linked does not in any way support your claim about caching. On the contrary, it hardly discusses caching at all. It's as if you just wanted to dump a document you thought I wouldn't read as a way to be dismissive.
What's your ldap? I can look you up and see what you were working on, and whether it required making design decisions around reduced average system latency.
And yeah this is exactly my point. If you present a "standard and common" solution that isn't applicable to the question I actually asked, and if you blindly apply solutions without thinking about what problem is actually being solved, then that's bad in an interview.
Average latency is usually not the thing you want to reduce because it's not representative of what any actual user is experiencing.
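For anyone following along, here is a minimal sketch of the mean-versus-p99 argument. Everything in it is an assumption made up for illustration: the workload split (95% "hot" requests, 5% "cold"), the latency constants, the `simulate` and `percentile` helpers, and above all the premise that the cold, slow requests are exactly the ones that miss the cache. It is not a model of any real system.

```python
import math
from statistics import mean

# Hypothetical latencies in milliseconds (illustrative values only).
HOT_BACKEND_MS = 20    # cheap request served by the backend
COLD_BACKEND_MS = 500  # expensive request served by the backend
CACHE_HIT_MS = 1       # request served from the cache


def percentile(latencies, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    ordered = sorted(latencies)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def simulate(use_cache):
    """Return latencies for 1000 requests, 5% of which are cold."""
    latencies = []
    for i in range(1000):
        is_cold = i % 20 == 0  # every 20th request is cold
        if is_cold:
            # Assumption: cold requests always miss the cache.
            latencies.append(COLD_BACKEND_MS)
        elif use_cache:
            latencies.append(CACHE_HIT_MS)
        else:
            latencies.append(HOT_BACKEND_MS)
    return latencies


without_cache = simulate(use_cache=False)
with_cache = simulate(use_cache=True)

for name, lat in (("no cache", without_cache), ("cache", with_cache)):
    print(f"{name}: mean={mean(lat)} ms, p50={percentile(lat, 50)} ms, "
          f"p99={percentile(lat, 99)} ms")
```

Under these assumptions the mean drops from 44 ms to 25.95 ms while p99 stays at 500 ms in both runs. If the slow requests were cacheable, p99 would improve as well, which is why the distribution of hits matters.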
Why would you do that? Now you're being weird and kind of creepy.
With that said, to the best of my knowledge it is the same as my Hacker News handle, so you are welcome to find whatever info you'd like on that, but please understand your request is very creepy and awkward.