tobyhinloopen · 2026-04-02T05:58:47 1775109527

How would you know the invocation is correct when written by a human? Don’t humans make mistakes?

rmunn · 2026-04-02T17:39:25 1775151565

Sure, humans make mistakes... but rarely, vanishingly rarely about commands they use often. Are you going to make a non-typo kind of mistake when typing `ls -l`? AI hallucinations don't happen all the time, but they happen so much more often than "vanishingly rarely".

That's why you can't just vibe-code something and expect it to work 100% correctly with no design flaws; you need to check the AI's output and correct its mistakes. Just yesterday I corrected a Claude-generated PR that my colleague had started checking but hadn't had time to finish before he went on vacation. He'd caught most of its mistakes, but there was one unit test that showed that Claude had completely misunderstood how a couple of our services are intended to work together. It was the kind of mistake a human would never have made: a novice wouldn't have understood those services enough to use them in the first place, and an expert would have understood them and how they are supposed to work together.

You always, always, have to double-check the output of LLMs. Their error rate is quite low, thankfully, but on work of any significant size their error rate is pretty much never zero. So if you don't double-check them then you're likely to end up introducing more bugs than you're fixing in any given week, leading to a codebase whose quality is slowly getting worse.

tobyhinloopen · 2026-04-01T07:17:24 1775027844

Russia is the aggressor, Iran is a defender. That’s a huge difference.

tobyhinloopen · 2026-03-27T16:42:29 1774629749

How many users are using Lockdown Mode?

avazhi · 2026-03-27T18:40:54 1774636854

I’ve been using it for more than a year.

Parts of it are pretty inconvenient, like with iMessage and FaceTime not working normally, but aside from that it’s not noticeable for my use case.

Despite the inconveniences, unless animated emojis are important to you I don’t know why you wouldn’t enable it given how strong its protections are.

snailmailman · 2026-03-27T16:58:39 1774630719

Everyday users? Probably not many. It forcibly disables lots of nice-to-have features.

But users who need a highly secure phone? It’s entirely possible to use the phone without media embeds in iMessage, or shared photo albums, or websites loading in 900 fonts. It’s a trade-off likely worth making in some situations.

ectospheno · 2026-03-27T17:31:16 1774632676

You can make a shared photo album with family members. It’s everyone else that is problematic with the feature enabled. In my case I only want to share with my wife and son so it wasn’t a detractor for me.

ectospheno · 2026-03-27T17:03:39 1774631019

I’ve used it on my personal iPhone since the feature was released. The impact on my life has been minor. I can’t share some things with my wife in the Health app and my son can’t SharePlay with me in the car while I use CarPlay.

tgv · 2026-03-27T18:15:36 1774635336

I turned it on, out of curiosity, and the impact is minimal, for me.

captn3m0 · 2026-03-27T17:02:24 1774630944

I was using it on my iPhone 13 Mini until the iOS 26 upgrade. It became so sluggish and unusable that I had to disable it. It clearly isn't tested well.

JumpCrisscross · 2026-03-27T18:42:11 1774636931

I turn it on when I travel overseas, and have considered turning it on when I’m near border regions in America.

It’s mostly that I don’t want to be that guy that leaks my company’s secrets.

tobyhinloopen · 2026-03-24T11:36:56 1774352216

I use Preact without reactivity. That way we can have familiar components that look like React (including strong typing, TypeScript / TSX) and server-side rendering, and still have explicit render calls using an MVC pattern.

yde_java · 2026-03-24T11:43:07 1774352587

How and when do your components update in such an architecture?

tobyhinloopen · 2026-03-24T13:05:47 1774357547

View triggers an event -> Controller receives event, updating the model as it sees fit -> Controller calls render to update views

Model knows nothing about controller or views, so they're independently testable. Models and views are composed of a tree of entities (model) and components (views). Controller is the glue. Also, API calls are done by the controller.

So it is more of an Entity-Boundary-Control pattern.
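A minimal sketch of the event -> controller -> model -> explicit render flow described above. This is an illustration, not the commenter's code: `CounterModel`, `counterView`, `CounterController`, and `runDemo` are hypothetical names, and the view returns a plain string to keep the example dependency-free. In the setup described, the view would presumably be a Preact component and `mount` would wrap Preact's render call.

```typescript
// Model: plain state. It knows nothing about the controller or the views.
class CounterModel {
  count = 0;

  increment(): void {
    this.count += 1;
  }
}

// View: a pure function of its props. It reports events through callbacks
// and holds no state of its own. (Stand-in for a Preact component.)
type CounterViewProps = { count: number; onIncrement: () => void };

function counterView(props: CounterViewProps): string {
  return `<button>Clicked ${props.count} times</button>`;
}

// Controller: the glue. It receives view events, updates the model,
// and then calls render explicitly. Nothing re-renders on its own.
class CounterController {
  constructor(
    private readonly model: CounterModel,
    private readonly mount: (html: string) => void,
  ) {}

  // Passed to the view as its event callback.
  handleIncrement = (): void => {
    this.model.increment();
    this.render();
  };

  render(): void {
    this.mount(
      counterView({ count: this.model.count, onIncrement: this.handleIncrement }),
    );
  }
}

// Usage: collect each rendered frame instead of touching a real DOM.
function runDemo(): string[] {
  const frames: string[] = [];
  const controller = new CounterController(new CounterModel(), (html) => {
    frames.push(html);
  });
  controller.render();          // initial render
  controller.handleIncrement(); // simulate the view firing its event
  return frames;
}
```

Because the model and the view never reference the controller, each can be tested in isolation, which matches the separation the comment describes.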

gr4vityWall · 2026-03-24T11:48:44 1774352924

From what I can tell, they do full page reloads when visiting a different page, and use Preact for building UIs using components. Those components and pages then get rendered on the server, as with typical template engines.

threatofrain · 2026-03-25T01:14:12 1774401252

Could you show an example?

tobyhinloopen · 2026-02-22T21:34:26 1771796066

Neat! I was looking for something like this

harshdoesdev · 2026-02-22T21:36:48 1771796208

thanks! let me know how it goes

tobyhinloopen · 2026-02-06T15:31:49 1770391909

Way too expensive, I'll wait for a free/open source browser optimized to be used by agents.

antves · 2026-02-06T15:48:08 1770392888

Our approach is actually very cost-effective compared to alternatives. Our browser uses a token-efficient LLM-friendly representation of the webpage that keeps context size low, while also allowing small and efficient models to handle the low-level navigation. This means agents like Claude can work at a higher abstraction level rather than burning tokens on every click and scroll, which would be far more expensive

verdverm · 2026-02-06T15:56:50 1770393410

If a potential user says it is too expensive, better to ask why than to tell them they are wrong. You likely have assumptions you have not validated

antves · 2026-02-06T16:22:24 1770394944

Definitely! Making Smooth as cost-effective as possible has been a core goal for us, so we'd really love to hear your thoughts on this

We'll continue to make Smooth more affordable and accessible as this is a core principle of our work (https://www.smooth.sh/images/comparison.gif)

verdverm · 2026-02-06T16:44:04 1770396244

Are your evals / comparisons reproducible publicly or by a third party?

If it's "trust me, I did a fair comparison", that's not going to fly today. There's too much lying in society. Trusting people who are trying to sell you something to tell the truth is not the default anymore; skepticism is

antves · 2026-02-06T16:56:19 1770396979

That's a great point, we'll publish everything on our docs as soon as possible

tobyhinloopen · 2026-02-10T07:57:51 1770710271

I'm paying a fixed amount on Claude and other agents, so "more tokens" is "free" for me. There's a lot of niche tools out there but I think we all have "subscription fatigue".

But maybe that's just me - maybe I'm just not your target audience :)

tobyhinloopen · 2026-02-03T14:47:35 1770130055

Same! If I put the skill's instructions in the general AGENTS.md, it works just fine.

tobyhinloopen · 2026-02-03T14:47:07 1770130027

ln -s to the rescue!

smithkl42 · 2026-02-03T15:18:57 1770131937

That doesn't work very well if your developers are on Windows (and most are). Uneven Git support for symbolic links across platforms is going to end up causing more problems than it solves.

tobyhinloopen · 2026-02-06T13:42:11 1770385331

Win developers aren't using WSL?

flurdy · 2026-02-03T15:29:04 1770132544

It's why I wrapped my tiny skills repo [1] with a script that softlinks them into whichever skills folder you use, defaulting to Claude's, though it could be any other.

I treat my skills the same way I treated the tiny bash scripts and fish functions I wrote in days gone by: they simplify my life by letting me write 2 words instead of 2 sentences. A tiny improvement that only makes sense for a programmer at heart.

[1] https://github.com/flurdy/agent-skills
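A rough sketch of what such a linking script could look like, written for Node.js. This is not the script from the linked repo: `linkSkills` and `DEFAULT_SKILLS_DIR` are hypothetical names, the default of `~/.claude/skills` is an assumption, and so is the layout of one directory per skill at the top level of the repo.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed default target: a per-user Claude skills folder.
// Pass a different path for any other agent.
const DEFAULT_SKILLS_DIR = path.join(os.homedir(), ".claude", "skills");

// Symlink every skill directory found in `repoDir` into `skillsDir`.
// Returns the names of the skills that were newly linked.
function linkSkills(repoDir: string, skillsDir: string = DEFAULT_SKILLS_DIR): string[] {
  fs.mkdirSync(skillsDir, { recursive: true });
  const linked: string[] = [];

  for (const entry of fs.readdirSync(repoDir, { withFileTypes: true })) {
    // Skip plain files and hidden directories such as .git.
    if (!entry.isDirectory() || entry.name.startsWith(".")) continue;

    const target = path.resolve(repoDir, entry.name);
    const link = path.join(skillsDir, entry.name);

    // Leave anything that already exists at the destination untouched.
    if (fs.lstatSync(link, { throwIfNoEntry: false })) continue;

    fs.symlinkSync(target, link, "dir");
    linked.push(entry.name);
  }
  return linked;
}
```

Calling `linkSkills("/path/to/agent-skills")` would link into the assumed default folder, and a second run is a no-op because existing entries are skipped. Creating symlinks on Windows may need extra privileges, which is the concern raised earlier in the thread.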

davidkunz · 2026-02-03T14:48:03 1770130083

The root cause should be fixed.

xrd · 2026-02-03T14:56:45 1770130605

Why not hardlinks?

dmd · 2026-02-03T15:03:00 1770130980

You can't hardlink a directory.

tobyhinloopen · 2026-01-22T19:19:07 1769109547

I had to read it twice as well, I was so confused hah. I’m still confused

rtkwe · 2026-01-22T19:30:26 1769110226

Internally, they probably organize individual accounts the same way as organization accounts for larger groups of users at the same company, since it all rolls up to one billing. That's my first-pass guess at least.

tobyhinloopen · 2026-01-22T19:18:19 1769109499

So you were generating and evaluating the performance of your CLAUDE.md files? And you got banned for it?

Aurornis · 2026-01-22T19:50:39 1769111439

I think it's more likely that their account was disabled for other reasons, but they blamed the last thing they were doing before the account was closed.

pocksuppet · 2026-01-22T20:28:08 1769113688

And why wouldn't you? It's the only information available to you.

alistairSH · 2026-01-22T19:23:46 1769109826

It reads like he had a circular prompt process running, where multiple instances of Claude were solving problems, feeding results to each other, and possibly updating each other's control files?

Hackbraten · 2026-01-22T20:49:29 1769114969

They were trying to optimize a CLAUDE.md file which belonged to a project template. The outer Claude instance iterated on the file. To test the result, the human in the loop instantiated a new project from the template, launched an inner Claude instance along with the new project, and assessed whether inner Claude worked as expected with the CLAUDE.md in the freshly generated project. They then passed that feedback back to outer Claude.

So, no circular prompt feeding at all. Just a normal iterate-test-repeat loop that happened to involve two agents.

epolanski · 2026-01-22T19:27:14 1769110034

What would be bad in that?

Writing the best possible specs for these agents seems the most productive goal they could achieve.

NitpickLawyer · 2026-01-22T19:46:35 1769111195

I think the idea is fine, but what might end up happening is that one agent gets unhinged and "asks" another agent to do more and more crazy stuff, and they get in a loop where everything gets flagged. Remember that "bots configured to add a book at +$0.01 on Amazon reached $1M for the book" story from a while ago? Kinda like that, but with prompts.

epolanski · 2026-01-22T19:49:30 1769111370

I still don't get it. Make your models better at handling this far-fetched case; don't ban users for a legitimate use case.

alistairSH · 2026-01-22T21:12:32 1769116352

Nothing necessarily or obviously bad about it, just trying to think through what went wrong.

andrelaszlo · 2026-01-22T19:46:19 1769111179

Could anyone explain to me what the problem is with this? I thought I was fairly up to date on these things, but this was a surprise to me. I see the sibling comment getting downvoted but I promise I'm asking this in good faith, even if it might seem like a silly question (?) for some reason.

alistairSH · 2026-01-22T21:18:16 1769116696

From what I'm reading in other comments, the problem was Claude1 got increasingly "frustrated" with Claude2's inability to do whatever the human was asking, and started breaking its own rules (using ALL CAPS).

Sort of like MS's old chatbot that turned into a Nazi overnight, but this time with one agent simply getting tired of the other agent's lack of progress (for some definition of progress - I'm still not entirely sure what the author was feeding into Claude1 alongside errors from Claude2).