I really wonder how an HTTP application avoids performance hits when it's built on 2,000 microservices. Let's say even just 30 of those get used by a call to monzo.com. How does this not cause at least, say, ~300ms of delay? I guess most of the calls are actually served from a local in-memory cache and a real HTTP call to these microservices is almost never made. Otherwise I have no idea how microservices are ever viable.
Some of the calls can fan out in parallel, but in my experience it's not good for performance, even with fewer services. The remote call overhead certainly adds up, but another issue is that each service does redundant work. E.g. each service might have to fetch settings for the user in order to respond. You can refactor to eliminate this (e.g. fetch settings once and pass as a parameter to each call), but it's a lot more work to make this change across many services.
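A minimal sketch of the refactor described above, with entirely invented service names: instead of each downstream service re-fetching the user's settings, the edge handler fetches them once and passes them as a parameter.

```python
# Hypothetical illustration: two toy "services" that each need user
# settings. All names and data here are made up.

fetch_count = 0  # counts how many times settings are actually loaded

def fetch_settings(user_id):
    """Stand-in for a remote settings lookup (the redundant work)."""
    global fetch_count
    fetch_count += 1
    return {"user_id": user_id, "locale": "en-GB"}

# Before: each service fetches settings itself.
def balance_service(user_id):
    settings = fetch_settings(user_id)
    return {"balance": 100, "locale": settings["locale"]}

def feed_service(user_id):
    settings = fetch_settings(user_id)
    return {"items": [], "locale": settings["locale"]}

def handle_request_before(user_id):
    return balance_service(user_id), feed_service(user_id)

# After: settings fetched once at the edge and passed down.
def balance_service_v2(user_id, settings):
    return {"balance": 100, "locale": settings["locale"]}

def feed_service_v2(user_id, settings):
    return {"items": [], "locale": settings["locale"]}

def handle_request_after(user_id):
    settings = fetch_settings(user_id)
    return (balance_service_v2(user_id, settings),
            feed_service_v2(user_id, settings))

handle_request_before("u1")
n_before = fetch_count  # two fetches for one request
fetch_count = 0
handle_request_after("u1")
n_after = fetch_count   # one fetch for one request
```

Trivial in one process, as here; the point in the comment is that making this signature change across many independently deployed services is the expensive part.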
If you look at something like K8s, it'll try to route to an instance of the other service running on the same node as itself, avoiding cross-node networking delays.
It's entirely feasible that all incoming requests will hit a single node and stay there.
Still, hitting multiple microservices in a single request adds overhead from marshalling all the HTTP requests and whatnot.
I worked at Monzo a while back. The 'Platform' (~= SRE) team were brilliant, and did what you describe along with much, much more, but regardless, the performance impact of thousands of microservices could be characterised as approximately "what you would expect": hundreds of thousands a month on AWS to serve a few million customers, a whole team spending [I can't recall how many] months writing bodge-y fixes just to bring app load under 10 seconds, lots of pathological request paths efflorescing in the service graph, etc.
That being said, there was genuinely a need for microservices, to an extent. A bank's architecture is very different from a CRUD web app's. Most of the running code wasn't servicing synchronous HTTP requests; it was doing batch or asynchronous work related (usually at two or three degrees of separation) to card payments, transfers, various kinds of fraud prevention, onboarding (which was an absolutely colossal edifice, very different from ordinary SaaS onboarding), etc.
So we'd have had lots of daemons and crons in any case. And, to be fair, we started on Kubernetes before it was super-trendy and easy to deploy - it very much wasn't the 'default' choice, and we had to do a lot of very fundamental work ourselves.
But yeah, in my view we took it too far, out of ideological fervour. Dozens of - or at most a hundred-ish - microservices would have been the sweet spot. Your architectural structure doesn't need to be isomorphic to your code or team structure.
This is a potential downside of this architecture. As others have already said, there are mitigations, but fundamentally each edge request is going to be more costly (time or money) to serve than having it served by a single machine/DB.
One of our engineering principles is to avoid premature optimisation, which is possibly one of the reasons our architecture has grown this way. So far, whenever we've needed to fix a performance issue, we've been able to solve it locally rather than change the architecture.
At the business level, we've been optimising for growth rather than costs, but this could change in the future, at which point we may need to reconsider our architecture. But for now it's working for us.