If anyone has heard Joe Armstrong's talk about how communication is limited by latency and data can only travel so fast. I think having smaller a partitions locally is an optimal point.
If You want global consistency then you'll have to either spend some time at runtime to achieve it, Have complicated protocols, fast networking, synchronized clocks.
Does this look like actor model (from Erlang) if you squint a bit?
You might also appreciate this talk on building a loosely related architecture using Erlang, though it doesn't implement an actor-per-database pattern – https://www.youtube.com/watch?v=huGVdGLBJEo
I was thinking something very similar. Once you've accepted any need at all for global state the next move is to reorient to minimizing it with horizontally scalable point local state and a small targeting dataset and tiered caching system.
If You want global consistency then you'll have to either spend some time at runtime to achieve it, Have complicated protocols, fast networking, synchronized clocks.
Does this look like actor model (from Erlang) if you squint a bit?