The gateway is integrated directly into our chat (https://glama.ai/chat). So you can use most of the things that you are used to having with Claude. And if anything is missing, just let me know and I will prioritize it. If you check our Discord, I have a decent track record of being receptive to feedback and quickly turning around features.
Long term, Glama's focus is predominantly on MCPs, but chat, the gateway, and LLM routing are integral to the greater vision.
I would love feedback if you're going to give it a try: frank@glama.ai
The issue isn't API limits, but web UI limits. We can always get around the web interface's limits by using the Claude API directly, but then you need to have some other interface...
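For context, "using the Claude API directly" looks roughly like the minimal sketch below, using Anthropic's Python SDK (the model slug matches the one linked further down in this thread). The point stands: you still have to bring your own interface around it.

```python
import anthropic

# The client reads ANTHROPIC_API_KEY from the environment by default.
client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Review this module for me..."}],
)
print(message.content[0].text)
```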
The API still has limits. Even if you are on the highest tier, you will quickly run into those limits when using coding assistants.
The value proposition of Glama is that it combines UI and API.
While everyone else focuses on one or the other, I've been splitting my time equally between the two.
Glama's UI would not win against Anthropic's if we compared them by feature count. However, the components I developed were created with craft and love.
You have access to:
* Model switching between OpenAI, Anthropic, and other providers
* Side-by-side conversations
* Full-text search of all your conversations
* LaTeX, Mermaid, and rich-text editing integration
Ok, but that's not the issue the parent was raising. I've never hit API limits, but, like the original comment said, I too constantly hit the web interface limits, particularly when discussing relatively large modules.
Your chat idea is a little similar to Abacus AI. I wish you had a similarly affordable monthly plan for chat only, but your UI seems much better. I may give it a try!
Who is glama.ai, though? I could not find company info on the site, and the Frank name on the blog posts seems to be an alias for Popeye the sailor. Am I missing something there? How can a user vet the company?
As another commenter in this thread said, we are just a 'frontend wrapper' around other people's services. Therefore, it is not particularly difficult to add models that are already supported by other providers.
The benefit of using our wrapper is that you get a single API key and one bill for all your AI usage, and you don't need to hack together your own logic for routing requests between different providers, handling failovers, keeping track of costs, or worrying about what happens if a provider goes down.
The market at the moment is hugely fragmented, with many unstable providers, constantly shifting prices, and so on. The benefit of a router is that you don't need to worry about any of that.
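To make the "hack together your own logic" point concrete, here is a minimal sketch of the failover routing you would otherwise maintain yourself. The provider endpoints and model names are made-up placeholders, not real services:

```python
from openai import OpenAI

# Illustrative placeholders; any OpenAI-compatible providers would do.
PROVIDERS = [
    {"base_url": "https://provider-a.example/v1", "api_key": "KEY_A", "model": "model-a"},
    {"base_url": "https://provider-b.example/v1", "api_key": "KEY_B", "model": "model-b"},
]

def complete_with_failover(prompt: str) -> str:
    """Try each provider in order; fall through to the next on any error."""
    last_error = None
    for provider in PROVIDERS:
        client = OpenAI(base_url=provider["base_url"], api_key=provider["api_key"])
        try:
            response = client.chat.completions.create(
                model=provider["model"],
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content
        except Exception as error:  # real code would also track costs, retries, outages
            last_error = error
    raise RuntimeError("all providers failed") from last_error
```

A router collapses all of this into one endpoint, one key, and one bill.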
Scaling infrastructure to handle billions of tokens is no joke.
I believe they are approaching 1 trillion tokens per week.
Glama is way smaller. We only recently crossed 10bn tokens per day.
However, I have invested a lot more into the UX/UI of the chat itself. While OpenRouter is entirely focused on the API gateway (which is working for them), I am going for a hybrid approach.
The market is big enough for both projects to co-exist.
We currently serve ~10bn tokens per day (across all models). OpenAI-compatible API. No rate limits. Built-in logging and tracing.
I work with LLMs every day, so I am always on top of adding models. Claude 3.7 Sonnet is also already available:
https://glama.ai/models/claude-3-7-sonnet-20250219
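If you want to try the gateway with the standard OpenAI SDK, a minimal sketch looks like the following. The base URL below is an illustrative assumption, not a documented endpoint; check glama.ai for the current one:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://glama.ai/api/gateway/openai/v1",  # assumed endpoint; verify in the docs
    api_key="YOUR_GLAMA_API_KEY",
)

response = client.chat.completions.create(
    model="claude-3-7-sonnet-20250219",  # the model linked above
    messages=[{"role": "user", "content": "Hello from the gateway"}],
)
print(response.choices[0].message.content)
```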