While we are talking about quotas, can you maybe add an easy way of checking how much you've used and how much you've got left?
Apparently now you need to use google-cloud-quotas to get the limit and google-cloud-monitoring to get the usage.
VS Code Copilot (running gemini-2.5-pro) managed to implement the first part, getting the limit, but when I asked Gemini to implement the second part it said that integrating cloud-monitoring is too complex and it can't do it!
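For anyone else trying this, here is a rough sketch of how I think the two halves fit together. It is not a tested, official recipe. The assumptions: the usage shows up under the `serviceruntime.googleapis.com/quota/rate/net_usage` metric on the `consumer_quota` resource (allocation-style quotas would need a different metric), and you already know the quota ID and quota metric name for your model, which are placeholders here. The helper names (`usage_filter`, `fetch_limits`, `fetch_peak_usage`, `remaining`) are made up for illustration.

```python
import time

# Assumption: per-minute rate quotas report usage under this Service Runtime metric.
USAGE_METRIC = "serviceruntime.googleapis.com/quota/rate/net_usage"


def usage_filter(service, quota_metric):
    """Build a Cloud Monitoring filter for one quota metric of one service."""
    return (
        f'metric.type="{USAGE_METRIC}" '
        f'AND resource.type="consumer_quota" '
        f'AND resource.labels.service="{service}" '
        f'AND metric.labels.quota_metric="{quota_metric}"'
    )


def fetch_limits(project, service, quota_id):
    """Return (dimensions, limit) pairs for one quota via google-cloud-quotas."""
    from google.cloud import cloudquotas_v1  # pip install google-cloud-quotas

    client = cloudquotas_v1.CloudQuotasClient()
    name = (
        f"projects/{project}/locations/global/"
        f"services/{service}/quotaInfos/{quota_id}"
    )
    info = client.get_quota_info(name=name)
    # One entry per dimension combination (for example, per region).
    return [(dict(d.dimensions), d.details.value) for d in info.dimensions_infos]


def fetch_peak_usage(project, service, quota_metric, minutes=10):
    """Return the largest usage point in the last `minutes` via google-cloud-monitoring."""
    from google.cloud import monitoring_v3  # pip install google-cloud-monitoring

    client = monitoring_v3.MetricServiceClient()
    now = int(time.time())
    interval = monitoring_v3.TimeInterval(
        {"end_time": {"seconds": now}, "start_time": {"seconds": now - minutes * 60}}
    )
    results = client.list_time_series(
        request={
            "name": f"projects/{project}",
            "filter": usage_filter(service, quota_metric),
            "interval": interval,
            "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        }
    )
    # Take the peak across all returned series and points; 0 if nothing was reported.
    return max((p.value.int64_value for ts in results for p in ts.points), default=0)


def remaining(limit, used):
    """Headroom left under the limit, never negative."""
    return max(limit - used, 0)
```

With real values you would call something like `remaining(limit, fetch_peak_usage(project, "aiplatform.googleapis.com", quota_metric))` for each limit returned by `fetch_limits`. Taking the peak over a short window is just my choice for a "how close am I to the ceiling" number; summing or averaging might suit other cases better.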
The thing is that the entry level of provisioned throughput is so high! I just want a reliable model experience for my small dev team using models through Vertex, but I don't think there's anything I can buy there to ensure it.