
We have moved our quota system to Dynamic Shared Quota (https://cloud.google.com/vertex-ai/generative-ai/docs/quotas) for 2.0+ models. There are no quotas in DSQ. If you need guaranteed throughput, there is an option to purchase Provisioned Throughput (https://cloud.google.com/vertex-ai/generative-ai/docs/provis...).



While we are talking about quotas, can you maybe add an easy way of checking how much you've used/got left?

Apparently now you need to use google-cloud-quotas to get the limit and google-cloud-monitoring to get the usage.

VS Code Copilot, using gemini-2.5-pro, managed to implement the first part (getting the limit), but when I asked Gemini to implement the second part, it said that integrating cloud-monitoring is too complex and it can't do it!
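For reference, a rough sketch of how the two halves might fit together: google-cloud-quotas for the limit, google-cloud-monitoring for the usage. Several things here are assumptions rather than anything confirmed in this thread: that the relevant quotas are listed under the `aiplatform.googleapis.com` service, that usage shows up under the `serviceruntime.googleapis.com/quota/rate/net_usage` metric with a `quota_metric` label, and that summing the points over a window is a fair usage figure. The helper names (`quota_parent`, `usage_filter`, `get_limits`, `get_usage`, `remaining`) are made up for illustration, and models under DSQ may not report a meaningful limit at all.

```python
import time


def quota_parent(project_id: str, service: str = "aiplatform.googleapis.com") -> str:
    """Resource name under which Cloud Quotas lists QuotaInfo entries.

    The default service name is an assumption for Vertex AI.
    """
    return f"projects/{project_id}/locations/global/services/{service}"


def usage_filter(
    quota_metric: str,
    metric_type: str = "serviceruntime.googleapis.com/quota/rate/net_usage",
) -> str:
    """Cloud Monitoring filter for one quota metric (metric type is an assumption)."""
    return (
        f'metric.type = "{metric_type}" '
        f'AND metric.labels.quota_metric = "{quota_metric}"'
    )


def get_limits(project_id: str) -> dict:
    """Map each quota metric to its highest configured limit.

    Taking the max across dimensions (regions, models) is a simplification.
    """
    from google.cloud import cloudquotas_v1  # pip install google-cloud-quotas

    client = cloudquotas_v1.CloudQuotasClient()
    limits = {}
    for info in client.list_quota_infos(parent=quota_parent(project_id)):
        values = [d.details.value for d in info.dimensions_infos]
        if values:
            limits[info.metric] = max(values)
    return limits


def get_usage(project_id: str, quota_metric: str, window_s: int = 3600) -> int:
    """Sum the reported usage points for one quota metric over the last window_s seconds."""
    from google.cloud import monitoring_v3  # pip install google-cloud-monitoring

    client = monitoring_v3.MetricServiceClient()
    now = int(time.time())
    interval = monitoring_v3.TimeInterval(
        {"end_time": {"seconds": now}, "start_time": {"seconds": now - window_s}}
    )
    series = client.list_time_series(
        request={
            "name": f"projects/{project_id}",
            "filter": usage_filter(quota_metric),
            "interval": interval,
            "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        }
    )
    return sum(p.value.int64_value for ts in series for p in ts.points)


def remaining(limit: int, used: int) -> int:
    """Headroom left under a limit, clamped at zero."""
    return max(limit - used, 0)
```

Something like `remaining(get_limits(project)[metric], get_usage(project, metric))` would then give the headroom for one metric, provided the metric names returned by the two APIs line up, which is worth checking against your own project before relying on it.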


The thing is that the entry level of Provisioned Throughput is so high! I just want a reliable model experience for my small dev team using models through Vertex, but I don't think there's anything I can buy there to ensure it.



