The idea was, move fast and break things - but then pick them up and fix them. Companies realised they didn't really have to fix them properly as the users still stuck around.
How good are local LLMs at coding these days? Does anyone have any recommendations for how to get this set up? What would the minimum spend be for usable hardware?
I am getting bored of having to plan my weekends around quota limit reset times...
Some claim that some of the recent smaller local models are as good as last year's Sonnet 4.5, and that the bigger high-end models can be almost as good as Claude, Gemini and Codex today, but others say they're benchmaxed and not representative.
To try things out you can use llama.cpp with Vulkan (or even CPU-only) and a small model like Gemma 4 26B-A4B, Gemma 4 31B, Qwen 3.5 35-A3B or Qwen 3.5 27B. Some of the smaller quants fit within 16GB of GPU memory. The default most people go with now is Q4_K_XL, a 4-bit quant that's a decent trade-off between quality and size.
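For a concrete idea of what "trying things out" looks like, here's a minimal sketch using the llama-cpp-python bindings; the GGUF filename and settings are placeholders rather than recommendations, so substitute whichever quant you actually downloaded:

    # Minimal local inference sketch with llama-cpp-python.
    # The model path is a placeholder for whichever Q4_K_XL GGUF you grabbed.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/small-coder-Q4_K_XL.gguf",  # placeholder filename
        n_gpu_layers=-1,  # offload every layer to the GPU; set to 0 for CPU-only
        n_ctx=8192,       # context window; raise it if you have the memory
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Write a Python function that reverses a string."}]
    )
    print(out["choices"][0]["message"]["content"])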
Get a second-hand 3090/4090 or buy a new Intel Arc Pro B70. Use MoE models and offload to RAM for the best bang for your buck. For speed, try to find a model that fits entirely within VRAM. If you want to use multiple GPUs you might want to switch to vLLM or something else.
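To illustrate the "offload to RAM" part, here's a rough sketch (again assuming llama-cpp-python; the path and layer count are made up and need tuning for your card):

    # Partial offload: keep as many layers as fit in VRAM on the GPU,
    # the rest run from system RAM. Tune n_gpu_layers until you stop OOMing.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/big-moe-Q4_K_XL.gguf",  # placeholder
        n_gpu_layers=30,   # layers kept on the GPU; the remainder stays in RAM
        n_ctx=16384,
    )
    resp = llm("Explain mixture-of-experts routing in two sentences.", max_tokens=128)
    print(resp["choices"][0]["text"])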
The very best open models are maybe 3-12 months behind the frontier and are large enough that you need $10k+ of hardware to run them, and a lot more to run them performantly. ROI here is going to be deeply negative vs just using the same models via API or subscription.
You can run smaller models on much more modest hardware but they aren't yet useful for anything more than trivial coding tasks. Performance also really falls off a cliff the deeper you get into the context window, which is extra painful with thinking models in agentic use cases (lots of tokens generated).
You can also run these models in the cloud with Ollama. You might ask what the difference is, but these are models whose performance will stay consistent over time, whether run locally or in the cloud. For $200 a year I'm getting some pretty fantastic results running GLM 5.1, and even Minimax 2.7, Kimi 2.5 and Gemma 4, on Ollama's cloud instances. And if you don't like Ollama's cloud instances, you can run the models on your own cloud instance from the very same providers Ollama is using. They use NVIDIA cloud providers (NCPs), though I'm not sure which ones specifically, and they claim that the "cloud does not retain your data to ensure privacy and security." [https://ollama.com/blog/cloud-models]
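For what it's worth, the client side looks the same either way; here's a minimal sketch with the ollama Python client (the model tag is a placeholder; use whatever `ollama list` shows for your local or cloud-hosted models):

    # Minimal sketch with the ollama Python client; the same call works whether
    # the model is served locally or from Ollama's cloud, per their docs.
    import ollama

    resp = ollama.chat(
        model="glm-5.1",  # placeholder tag; substitute a model you actually have
        messages=[{"role": "user", "content": "Refactor this loop into a list comprehension: ..."}],
    )
    print(resp["message"]["content"])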
I don't think it's so clear cut. The problem is that his personality defects have allowed him to be influenced by people who are truly malevolent. Those people lurk more in the shadows and so avoid the condemnation that they deserve. Trump is their obvious useful idiot with the target painted on his head.
I've worked at companies before that balked at spending $300 to buy me a second-hand ThinkPad because I really wanted to work on a Linux machine rather than a Mac. I don't see them throwing $unlimited at tokens to find vulnerabilities, at least not until after it's too late.
I think you’re right that they’re going to skimp as much as regulators & the market let them, but that Thinkpad would cost a lot more than $300: a new platform is an ongoing cost for maintenance, security, and interoperability – not crushing, but those factors quickly outweigh the hardware.