> I think something like 96GB RTX PRO 6000 Blackwells would be the minimum to run a model of this size with performance in the range of subscription models.
GLM 5.1 has 754B parameters tho. And you still need RAM for context on top of the weights. You'll want much more than 96GB of RAM.
H-1B is tied to employment, not to the employer. You can change employers on the same H-1B.

It's not great. But this is similar to how health insurance is tied to employment, not to a specific employer. Both citizens and H-1B employees experience the same abuse here.
> Every MCP server injects its full tool schemas into context on every turn
I consider this a bug. I'm sure the chat clients will fix this soon enough.
Something like: on each turn, a subagent searches the available MCP tools for anything relevant. Usually nothing helpful is found, and the regular chat continues without any MCP context added. Something like the sketch below.
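A minimal sketch of that routing idea, under loud assumptions: `Tool` and `route_tools` are made-up names, not part of any real MCP client, and the word-overlap scoring is a stand-in for what would realistically be an embedding search or a small router-model call:

```python
# Hypothetical per-turn tool routing: instead of injecting every MCP tool
# schema, score tools against the user's message and only surface the
# relevant ones. Nothing here is a real MCP client API.
from dataclasses import dataclass

@dataclass
class Tool:
    name: str
    description: str
    schema: str  # full JSON schema, injected only if the tool is selected

def route_tools(user_message: str, tools: list[Tool],
                threshold: float = 0.2, top_k: int = 3) -> list[Tool]:
    """Crude relevance scoring by word overlap; a real client would more
    likely use embeddings or a cheap 'router' LLM call."""
    query = set(user_message.lower().split())
    scored = []
    for tool in tools:
        words = set(f"{tool.name} {tool.description}".lower().split())
        overlap = len(query & words) / max(len(query), 1)
        if overlap >= threshold:
            scored.append((overlap, tool))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:top_k]]

# Most turns route_tools(...) returns [], so the chat continues with
# no MCP schemas in context at all.
```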
I'll add to your comment that it isn't a bug of MCP itself. MCP doesn't specify what the LLM sees. It's a bug of the MCP client.
In my toy chatbot, I implement MCP as pseudo-Python for the LLM, dropping the typing info and giving the tool info as tersely as possible, just one line per tool: function_name(mandatory arg1 name, mandatory arg2 name): Description
(I don't recommend doing that; it's largely obsolete. My point is simply that you feed the LLM whatever you want, MCP doesn't mandate anything. Tbh it doesn't even mandate that it feeds into an LLM, hence the MCP CLIs.) The flattening looks roughly like the sketch below.
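A minimal sketch, assuming the usual shape MCP servers return from tools/list (a name, a description, and a JSON Schema under inputSchema); the `terse_signature` helper and the example tool are made up for illustration:

```python
# Flatten an MCP tool definition into the one-line format described above.
# The formatting choice (keeping only mandatory args) is mine, not
# anything MCP mandates.
def terse_signature(tool: dict) -> str:
    schema = tool.get("inputSchema", {})
    required = schema.get("required", [])  # keep only mandatory args
    args = ", ".join(required)
    return f'{tool["name"]}({args}): {tool.get("description", "")}'

# Hypothetical example tool:
tool = {
    "name": "search_files",
    "description": "Search the workspace for files matching a pattern",
    "inputSchema": {
        "type": "object",
        "properties": {"pattern": {"type": "string"},
                       "path": {"type": "string"}},
        "required": ["pattern"],
    },
}
print(terse_signature(tool))
# -> search_files(pattern): Search the workspace for files matching a pattern
```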
Yup, routing is key. Just like how we've had RAG so we don't have to add every biz doc to the context.
I agree with the general idea that models are better trained to use popular CLI tools like directory navigation etc., but outside of ls, ps, and the like, the difference isn't really there: new CLIs are just as confusing to the model as new MCPs.
I don’t think so. Without a list of tools in context, the AI can’t even know what options it has, so a RAG-like search doesn’t feel like it would be anywhere near as accurate.
Interesting pricing differential. It seems that in your country the IdeaPad is significantly cheaper than the US price, but for your MacBook Neo it's the other way around.
No idea. Maybe Lenovo factors purchasing power into its pricing for some reason, e.g. making more money in the U.S. while gaining market share here in Czechia, where purchasing power is lower. Apple may be able to afford not to do that.
But Qwen3.5 35B is worse than even Claude Haiku 4.5. You could switch your Claude Code to use Haiku and never hit rate limits. It also gets a similar ~50 tps.
I haven't tried Haiku 4.5 much, but I was not impressed with previous Haiku versions.

My go-to proprietary model in Copilot for general tasks is Gemini 3 Flash, which is priced the same as Haiku.

The Qwen model is, in my experience, close to Gemini 3 Flash, but Gemini Flash is still better.

Maybe it's somewhat related to what we're using them for. In my case I'm mostly using LLMs to code Lua. One case is a typed LuaJIT language and the other is a 3D framework written entirely in LuaJIT.

I forget exactly how many tps I get with Qwen, but GLM 4.7 Flash, which is really good (for a local model), gets me 120 tps and a 120k context.

Don't get me wrong, proprietary models are superior, but local models are getting really good AND useful for a lot of real work.