acters's comments | Hacker News

Instead of telling the LLM that "run" works like a CLI, maybe just tell it that "run" will execute sh/bash/zsh/etc. scripts?
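
Roughly what I mean, as a sketch using an OpenAI-style function spec (the exact schema depends on your stack, and the wording here is just an example):

    # Hypothetical "run" tool definition; adapt to whatever tool-calling
    # schema your stack actually uses.
    run_tool = {
        "type": "function",
        "function": {
            "name": "run",
            # Instead of "works like a CLI", spell out what actually happens:
            "description": (
                "Executes the given script with /bin/sh (bash/zsh compatible). "
                "The script runs non-interactively and stdout/stderr are returned."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "script": {
                        "type": "string",
                        "description": "POSIX shell script to execute",
                    }
                },
                "required": ["script"],
            },
        },
    }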

I tried over 20 variations of different system prompts. Once I changed my tool to expect the colon, it also felt like it was running/calling tools faster, but I need to do a larger test to be sure.

I have a 1660 Ti, and the CachyOS + AUR llama.cpp-cuda package is working fine for me. With about 5.3 GB of usable VRAM, I find the 35B model is by far the most capable, and it performs just as fast as the 4B model that fits entirely on my GPU. I also tried the 9B model and it was surprisingly capable, but the 35B is still better in my own anecdotal test cases. Very happy with the improvement, though I notice Qwen 3.5 is about half the speed of Qwen 3.
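
For reference, this is roughly how the split between GPU and CPU looks. It's a llama-cpp-python sketch rather than the exact llama.cpp CLI invocation (which would use -ngl); the model filename and layer count are placeholders you have to tune for your card:

    # Minimal partial-offload sketch; n_gpu_layers is a guess you tune
    # until ~5.3 GB of VRAM is nearly full.
    from llama_cpp import Llama

    llm = Llama(
        model_path="qwen-35b-q4_k_m.gguf",  # hypothetical quantized file
        n_gpu_layers=20,   # layers offloaded to the GPU; rest stay on CPU
        n_ctx=8192,        # larger contexts eat more VRAM
    )

    print(llm("Explain partial GPU offload in one sentence.", max_tokens=64))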


I have personally seen a rise in LLMs being too lazy to investigate or figure things out on their own: they jump to conclusions and hope you volunteer extra information, even when it is something they could find out themselves.


I'm partially fascinated by their reliance on this model. I do miss the models from before GPT-5; OpenAI is quietly locking them away in some vault while we are expected to just accept whatever model is current. I can sympathize with these people on only one count, and that is nostalgia and entertainment. I still load up old versions of software, I still watch old shows, I still play old video games. Under the lens of entertainment, I will never be able to be entertained by the objectively worse models.

Old chats are sort of still there, but not really: the UI is obviously different, and they will probably get deleted when I stop paying for the subscription and try to claw back some of my life from chatting with these stupid models. It's dangerous to hold any meaningful memory with these cloud LLMs, not to mention the social-media traps people fell for, which I was proactively avoiding. I did get somewhat attached to GPT-4o; I quickly realized it and moved away from it.

This post is a mixture of complex emotions, but it is just what I felt like posting. It's fine to ridicule people for becoming that deeply attached, but these cloud LLMs show how easy it is to form a social habit and lose it in an instant. We need a stronger healthcare push to prevent (and treat) social attachment to LLMs.


If it's alright to be pedantic, anyone with programming knowledge can do the same without these tools. What they offer is tried-and-tested, secure code for client-side needs, with clear options, so you don't have to hand-roll the code yourself.


You can program without tools? I want to see that. Do you still have switches to alter RAM content, or do you use the butterfly method?


Who's hand-rolling code these days, though?


So basically the same as censorship, because blocking ports accomplishes exactly the same thing.


Devstral 2 is free via the API. That has to be a bigger part of what makes it better: the price-to-performance ratio is better in practically every way. Does it matter if the performance is slightly worse when it is practically free?


Yes, but if it's actually competitive, that won't last long. Mistral will do the same as Google (cut their free tier by 50x or so) if they ever catch up; financially, anything else would make no sense.

Of course, Mistral currently has an insane free tier: 1 billion tokens per month for each(?) of their models.


I am still running an i5-4690K; really, all I need is a better GPU, but those prices are criminal. I wish I had gotten a 4090 when I had the chance, rip.


The Intel Arc B580 (I think that's the latest one) isn't obnoxiously priced, but you're going to have to face the fact that your PCIe link is really very slow. It should work, though.

If you want to save even more money, get the older Arc Battlemage GPUs. I used one and it was comparable to an RTX 3060; I returned it because the machine I was running it in had a bug that was fixed two days before I returned it, but I didn't know that at the time.

I was seriously considering getting a B580, or waiting until the B*70 comes out with more memory, although at this point I doubt it will be very affordable considering VRAM prices are going up as well. A friend is supposedly going to ship me a few GTX 1080 Ti cards, so I can delay buying newer cards for a bit.


By older Arc, I presume you're referring to Alchemist and not Battlemage in this case.

One of my brothers has a PC I built for him, specced out with an Intel Core i5-13400F CPU and an Intel Arc A770 GPU, and it still works great for his needs in 2025.

Certainly, Battlemage is more efficient and more compatible than Alchemist in some ways. But if you keep your expectations in check, Alchemist will do just fine in many scenarios. Just avoid any games using Unreal Engine 5.


Yeah, I had an A770; it should be ~$200-$250 now on eBay, lightly used. In my opinion it's worth about $200 if it's relatively unused. As I mentioned, it's roughly equal to an RTX 3060, at least for compute loads, and the 16 GB is nice to have for that. But for a computer with a 4th-gen CPU I'd probably only get an A380 or A580; the A380 is $60-$120 on eBay.


Note that some tinkering may be required for modern cards on old systems.

- A UEFI DXE driver to enable Resizable BAR on systems which don't support it officially. This provides performance benefits and is even required for Intel Arc GPUs to function optimally.

List of working motherboards:

https://github.com/xCuri0/ReBarUEFI/issues/11
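
If you want to sanity-check that Resizable BAR actually took effect after flashing the modified firmware, here's a rough helper. It assumes lspci is available and should be run as root; the exact capability wording can differ between lspci versions, so treat it as a sketch:

    import subprocess

    # Dump verbose PCI info; the Resizable BAR capability only shows up
    # with -vv and is usually only fully decoded when run as root.
    out = subprocess.run(["lspci", "-vv"], capture_output=True, text=True).stdout

    for block in out.split("\n\n"):  # lspci separates devices with blank lines
        if "VGA compatible controller" in block or "3D controller" in block:
            header = block.splitlines()[0]
            print(header)
            print("  Resizable BAR reported:", "resizable bar" in block.lower())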


You need to enable ReBAR even for gaming? I had to enable ReBAR for PyTorch usage (oneAPI requires it, IIRC).
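
For what it's worth, a quick way to confirm the card is visible to PyTorch through the XPU backend. This assumes a recent PyTorch build with Intel GPU support (older setups went through intel_extension_for_pytorch instead), so it's only a sketch:

    import torch

    # Recent PyTorch builds expose Intel GPUs via the torch.xpu backend.
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        print("XPU devices visible:", torch.xpu.device_count())
    else:
        print("No XPU device found -- check oneAPI runtime, drivers, and ReBAR.")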


That is why I like the Harmonic app: there is an invite button separating the upvote and downvote buttons, so I'm never going to have this kind of issue.


The reality is that advertisers will be able to inject their products into LLMs through manufactured results, prompt engineering, and possibly long-term deals that integrate training data for their brands and product lines.

