BTW, I have noticed that when tested from outside GCP, the OpenAI compat endpoint has significantly lower latency for most requests than the genai library does. Vertex AI has lower latency than both.
Any idea why, or whether that will change?