
Even if ways exist to work around what Google is doing, that isn’t the right way forward. If people don’t agree with Google’s move, the only correct course of action is to ditch Chrome (and all Chromium browsers). Hit them where it hurts and take away their monopoly over the future direction of the web.

I sympathize with the pedantry here and found Fielding's paper to be interesting, but this is a lost battle. When I see "REST API" I can safely assume the following:

- The API returns JSON

- CRUD actions are mapped to POST/GET/PUT/DELETE

- The team constantly bikesheds over correct status codes and at least a few are used contrary to the HTTP spec

- There's a decent chance listing endpoints were changed to POST to support complex filters

Like Agile, CI, or DevOps, you can insist on the original definition or submit to the semantic diffusion and use the terms as they are commonly understood.
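
To make that concrete, here is roughly what a "REST API" in this de facto sense tends to look like. A minimal sketch, assuming Flask; the routes and resource names are purely illustrative:

    from flask import Flask, jsonify, request

    app = Flask(__name__)

    # CRUD "read" mapped to GET, returning JSON
    @app.route("/widgets/<int:widget_id>", methods=["GET"])
    def get_widget(widget_id):
        return jsonify({"id": widget_id, "name": "example"})

    # CRUD "create" mapped to POST; whether this should return 200 or 201 is
    # exactly the kind of status-code bikeshed described above
    @app.route("/widgets", methods=["POST"])
    def create_widget():
        return jsonify(request.get_json()), 201

    # The listing endpoint quietly turned into a POST so complex filters fit in the body
    @app.route("/widgets/search", methods=["POST"])
    def search_widgets():
        return jsonify({"results": [], "filters": request.get_json()})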


My most recent example of this is mentoring young, ambitious, but inexperienced interns.

Not only did they produce about the same amount of code in a day that they used to produce in a week (or two), several other things made my work harder than before:

- During review, they hadn't thought as deeply about their code so my comments seemed to often go over their heads. Instead of a discussion I'd get something like "good catch, I'll fix that" (also reminiscent of an LLM).

- The time spent on trivial issues went down a lot, to almost zero, but the remaining issues were much more subtle and time-consuming to find and describe.

- Many bugs were of a new kind (to me), the code would look like it does the right thing but actually not work at all, or just be much more broken than code with that level of "polish" would normally be. This breakdown of pattern-matching compared to "organic" code made the overhead much higher. Spending decades reviewing code and answering Stack Overflow questions often makes it possible to pinpoint not just a bug but how the author got there in the first place and how to help them avoid similar things in the future.

- A simple, but bad (inefficient, wrong, illegal, ugly, ...) solution is a nice thing to discuss, but the LLM-assisted junior dev often cooks up something much more complex, which can be bad in many ways at once. The culture of slowly growing a PR from a little bit broken, thinking about design and other considerations, until it's high quality and ready for a final review doesn't work the same way.

- Instead of fixing the things in the original PR, I'd often get a completely different approach as the response to my first review. Again, often broken in new and subtle ways.

This led to a kind of effort inversion, where senior devs spent much more time on these PRs than the junior authors themselves. The junior dev would feel (I assume) much more productive and competent, but the response to their work would eventually lack most of the usual enthusiasm or encouragement from senior devs.

How do people work with these issues? One thing that worked well for me initially was to always require a lot of (passing) tests, but eventually these tests would suffer from many of the same problems.


I think two things can be true simultaneously:

1. LLMs are a new technology and it's hard to put the genie back in the bottle with that. It's difficult to imagine a future where they don't continue to exist in some form, with all the timesaving benefits and social issues that come with them.

2. Almost three years in, companies investing in LLMs have not yet discovered a business model that justifies the massive expenditure of training and hosting them, the majority of consumer usage is at the free tier, the industry is seeing the first signs of pulling back investments, and model capabilities are plateauing at a level where most people agree that the output is trite and unpleasant to consume.

There are many technologies that have seemed inevitable and seen retreats under the lack of commensurate business return (the supersonic jetliner), and several that seemed poised to displace both old tech and labor but have settled into specific use cases (the microwave oven). Given the lack of a sufficiently profitable business model, it feels as likely as not that LLMs settle somewhere a little less remarkable, and hopefully less annoying, than today's almost universally disliked attempts to cram them everywhere.


IMO, other than the Microsoft IP issue, the biggest thing that has shifted since this acquisition was first in the works is that Claude Code has absolutely exploded. Forking an IDE and all the expense that comes with that feels like a waste of effort, considering the number of free/open source CLI agentic tools that are out there.

Let's review the current state of things:

- Terminal CLI agents are several orders of magnitude less $$$ to develop than forking an entire IDE.

- CC is dead simple to onboard (use whatever IDE you're using now, with a simple extension for some UX improvements).

- Anthropic is free to aggressively undercut their own API margins (and middlemen like Cursor) in exchange for more predictable subscription revenue + training data access.

What does Cursor/Windsurf offer over VS Code + CC?

- Tab completion model (Cursor's remaining moat)

- Some UI niceties like "add selection to chat", etc.

Personally I think this is a harbinger of where things are going. Cursor was fastest to $900M ARR and IMO will be fastest back down again.


I went on a deep dive on this scandal about a year or so ago. One thing that struck me is the class element.

Basically, the Post Office leadership could not understand why someone would buy a PO franchise. It's a substantial amount of money up front, and people aren't allowed to buy multiple franchises, so every PO was an owner/operator position. Essentially people were "buying a job".

The people in leadership couldn't understand why someone would buy the opportunity to work long hours at a retail position and end up hopefully clearing a middle class salary at the end of the year. They assumed that there must be a real reason why people were signing up and the real reason was to put their hands in the till.

So they ended up assuming the postmasters were stealing, and the purpose of the accounting software was to detect the fraud so it could be prosecuted. When the accounting software started finding vast amounts of missing funds, they ignored questions about the software because it was working as intended. I bet if the opposite had happened, and it found very little fraud, they would have become suspicious of the software because their priors were that the postmasters were a bunch of thieves.


Can this ever work? I understand what you're trying to do here, but this is a lot like trying to sanitize user-provided Javascript before passing it to a trusted eval(). That approach has never, ever worked.

It seems weird that your MCP would be the security boundary here. To me, the problem seems pretty clear: in a realistic agent setup doing automated queries against a production database (or a database with production data in it), there should be one LLM context that is reading tickets, and another LLM context that can drive MCP SQL calls, and then agent code in between those contexts to enforce invariants.

I get that you can't do that with Cursor; Cursor has just one context. But that's why pointing Cursor at an MCP hooked up to a production database is an insane thing to do.
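
A rough sketch of the split I mean is below. All of the names (read_tickets_llm, sql_agent_llm, run_mcp_query) are hypothetical placeholders, not any real library's API; the point is only that the invariants live in plain agent code between the two contexts:

    ALLOWED_TABLES = {"tickets", "customers"}

    def read_tickets_llm(prompt: str) -> dict:
        # Placeholder for context 1: the only context that ever sees raw ticket text.
        raise NotImplementedError("call your ticket-reading model here")

    def sql_agent_llm(prompt: str) -> str:
        # Placeholder for context 2: only ever sees a validated, structured request.
        raise NotImplementedError("call your SQL-writing model here")

    def run_mcp_query(sql: str, read_only: bool = True) -> str:
        # Placeholder for whatever MCP / database plumbing actually runs the query.
        raise NotImplementedError("execute the query via your MCP server here")

    def enforce_invariants(request: dict) -> dict:
        # Plain agent code between the two contexts: no prompts, just hard checks.
        if request.get("table") not in ALLOWED_TABLES:
            raise ValueError("table not allowed")
        if request.get("action") != "read":
            raise ValueError("only reads are permitted against this database")
        return request

    def answer_ticket(ticket_text: str) -> str:
        req = enforce_invariants(
            read_tickets_llm(f"Summarize the data needed for this ticket as JSON: {ticket_text}")
        )
        sql = sql_agent_llm(f"Write a single read-only SQL query for: {req}")
        return run_mcp_query(sql, read_only=True)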


I know there are plenty of more serious issues people have with Mozilla's direction and focus, but patronizing stuff like this really grinds my gears.

> Which animal best represents your Firefox browsing style? [List of emoji animals]

The marketing/PR trend of speaking to communities as though they're kindergartners is distracting and off-putting. This is the most egregious part but the whole post has a similar tone.

I'll note that I'm not saying outreach should necessarily be professional or devoid of fun/humor. There's just a sterile, saccharine way about Mozilla's community engagement that evokes artificiality.


I used to want to donate to the Mozilla Foundation, but I've long lost any hope that the corporation would spend that money in a way that makes sense to me. The pessimist in me would expect donated money to be spent on more built-in "campaigns", "studies" or ads. Or maybe a bonus for their executives.

I just want Firefox to be faster. I'm donating to Floorp (a Firefox fork), at least they seem focused on making the browser better.


> You might be asking: why did you rewrite tmux in Rust? And yeah, I don’t really have a good reason. It’s a hobby project. Like gardening, but with more segfaults.

I love this attitude. We don’t necessarily need a reason to build new things. Who knows what will come out of a hobby project. Thanks to the author for the great write up!

Also, my gardening is full of segfaults; coding a new project is definitely safer for my yard.


The "spreadsheet" example video is kind of funny: guy talks about how it normally takes him 4 to 8 hours to put together complicated, data-heavy reports. Now he fires off an agent request, goes to walk his dog, and comes back to a downloadable spreadsheet of dense data, which he pulls up and says "I think it got 98% of the information correct... I just needed to copy / paste a few things. If it can do 90 - 95% of the time consuming work, that will save you a ton of time"

It feels like either finding that 2% that's off (or dealing with 2% error) will be the time consuming part in a lot of cases. I mean, this is nothing new with LLMs, but as these use cases encourage users to input more complex tasks, that are more integrated with our personal data (and at times money, as hinted at by all the "do task X and buy me Y" examples), "almost right" seems like it has the potential to cause a lot of headaches. Especially when the 2% error is subtle and buried in step 3 of 46 of some complex agentic flow.


The best bypass is to use Firefox. uBlock Origin works best in Firefox:

https://github.com/gorhill/uBlock/wiki/uBlock-Origin-works-b...


According to Indian regulators, every trading day Jane Street would:

1) buy large volumes of stocks and/or stock futures that are part of an index tracking India’s banking sector, early in the day,

2) subsequently place large options trades, betting that the index would decline or volatility would spike later in the day, and

3) later in the day, cash out of the large long positions, dragging the index lower, making far more money on the options trades than on the long positions.

Jane Street can and likely will claim the firm was only arbitraging away pricing inefficiencies, nothing more, nothing less. It was just business as usual, etc., etc.

However, given the scale of the operation, Jane Street's actions sure look like textbook market manipulation. Calling it like I see it.


Decades ago in my first abnormal psych course, the prof warned us that there was an almost iron-clad law that students will immediately start self diagnosing themselves with “weak” versions of every disorder we learn about. In my years since then, it has absolutely held true and now is supercharged by a whole industry of TikTok self-diagnoses.

But there are a few things we can learn from this:

- if you give people the chance to place a label on themselves that makes them feel unique, they’ll take it.

- if you give people the chance to place a label on themselves to give a name/form to a problem, they’ll take it.

- most mental disorders are an issue of degree and not something qualitatively different from a typical experience. People should use this to gain greater empathy for those who struggle.


It's weird that you say she had no material power while also seeming to imply that the valuation drop and lawsuits were due to her ineptitude?

Anyway she volunteered to be a puppet for a man who is clearly off the rails and her legacy will forever be stained.


Fun fact: movie sales, in terms of tickets sold, peaked in 2002. [1] All the 'box office records' since then are the result of charging way more to a continually plummeting audience size.

And this is highly relevant for things like this. People often argue that if movies were so bad then people would stop watching them, unaware that people actually have stopped watching them!

Even for individual movies. For all the men-in-spandex movies, the best selling movie (by tickets sold) in modern times is Titanic, 27 years ago.

[1] - https://www.the-numbers.com/market/


> One user, who asked not to be identified, said it has been impossible to advance his project since the usage limits came into effect.

Vibe limit reached. Gotta start doing some thinking.


> It's really sad to me how we have completely fucked a lot of youth with social media, smart phones,

You have to be careful with Gen Z threads like this on Reddit and Twitter. They are inherently biased toward Gen Z people who are chronically online and deep into social media.

If you spend time with kids in the real world, you learn very rapidly that most of them aren't on platforms like Reddit and Twitter. Of those who use Reddit, few of them actually post anything or even have accounts.

The subset of Gen Z who actually post on Reddit is small and a lot of them fit the description of chronically online, so it's no wonder that Reddit Gen Z people speak as if their generation is not socially engaged at all.


I've found this to be one of the most useful ways to use (at least) GPT-4 for programming. Instead of telling it how an API works, I make it guess, maybe starting with some example code to which a feature needs to be added. Sometimes it comes up with a better approach than I had thought of. Then I change the API so that its code works.

Conversely, I sometimes present it with some existing code and ask it what it does. If it gets it wrong, that's a good sign my API is confusing, and how.

These are ways to harness what neural networks are best at: not providing accurate information but making shit up that is highly plausible, "hallucination". Creativity, not logic.

(The best thing about this is that I don't have to spend my time carefully tracking down the bugs GPT-4 has cunningly concealed in its code, which often takes longer than just writing the code the usual way.)

There are multiple ways that an interface can be bad, and being unintuitive is the only one that this will fix. It could also be inherently inefficient or unreliable, for example, or lack composability. The AI won't help with those. But it can make sure your API is guessable and understandable, and that's very valuable.

Unfortunately, this only works with APIs that aren't already super popular.


I think the amount of turmoil around these deals is giving more weight to the possibility that we’re in a massive bubble that’s quite divorced from any kind of fundamentals. Sooner or later the bubble’s gonna burst.

If you’re one of today’s lucky 10,000 and haven’t heard the original 500-mile email story, you can read it at https://web.mit.edu/jemorris/humor/500-miles.

(discussed previously on HN 5 years ago – https://news.ycombinator.com/item?id=23775404 – and 10 years ago – https://news.ycombinator.com/item?id=9338708)


It's so, so, so hard to walk the line between persistence (which leads to glory) and stubbornness (which leads to more time thrown after time already wasted).

Congratulations for walking this line correctly.

I agree that some sort of market validation is necessary to at least pretend you are on the former not the latter. Those early usage spikes are helpful reminders that there is a business here somewhere.

I'll also make a note that you spent time on marketing from the early days. Writing blog posts, promoting said posts, having a Discord server, committing to answering emails: all of this is marketing, and it likely contributed to the success more than the code did.

I noticed that whenever there was a dip in revenue, marketing (in the form of more blog posts) was the response. I suspect that was intentional, and definitely a better approach than "let me go away and silently code more features."

So there are valuable lessons to others here. Congratulations not just on the current success but also on sharing the path that leads to success. Ultimately you can show the way, but you can't make people learn from it.

Oh, and I like the bootstrapping approach. I did the same, and I'm not sorry. It's longer and harder but also skips an enormous amount of extra work.


The law enforcement agencies behaved the way law enforcement agencies always behave, and did what anyone with even the slightest familiarity with how law enforcement acts would have expected them to do with the data. This outcome was 1000% predictable even if the details were not.

If you're gonna be angry at someone, be angry at the people among us who were in favor of the creation of this data set because they foolishly thought it would be used to combat mundane property crime, or because perhaps they thought that subjecting motorists to an increased dragnet would be a good thing for alternative transportation or some other cause, and who think that they have done no wrong despite warnings about the potential for something like this being raised way back when the cameras and the ALPRs were being put up.

These things will keep happening until it is no longer socially acceptable to advocate for the creation of the data collection programs that are a necessary precondition for them.


The problem is that these meetings are so low in information density that even an AI summary is not worth my time. And it’s not some elitist mindset. It’s like the entire reason these regular meetings exist is to make some mid-level person feel better. They like giving directions vocally because that authority is harder to question than if they wrote up a memo that all the receivers could poke holes in. I’m convinced most meetings are to make up for poor writing skills.

One factor is the ongoing campaigns from a number of moral crusading groups who lobby them to cut off payment processing for things they don't approve of. NCOSE has been working on the project for decades, and targeting credit card companies has been a successful tactic for them for a decade or so.

[1] https://www.eff.org/deeplinks/2020/12/visa-and-mastercard-ar...

[2] https://www.newsweek.com/why-visa-mastercard-being-blamed-on...

[3] https://scholarworks.iu.edu/dspace/bitstreams/761eb6c3-9377-...


There's a term I read about a long time ago, I think it was "aesthetic completeness" or something like that. It was used in the context of video games whose art direction was fully realized in the game, i.e. increases in graphics hardware or capabilities wouldn't add anything to the game in an artistic sense. The original Homeworld games were held up as examples.

Anyway, this reminded me of that. Making these pictures in anything but the tools of the time wouldn't just change them, they'd be totally different artworks. The medium is part of the artwork itself.


Here's the full paper, which has a lot of details missing from the summary linked above: https://metr.org/Early_2025_AI_Experienced_OS_Devs_Study.pdf

My personal theory is that getting a significant productivity boost from LLM assistance and AI tools has a much steeper learning curve than most people expect.

This study had 16 participants, with a mix of previous exposure to AI tools - 56% of them had never used Cursor before, and the study was mainly about Cursor.

They then had those 16 participants work on issues (about 15 each), where each issue was randomly assigned a "you can use AI" vs. "you can't use AI" rule.

So each developer worked on a mix of AI-tasks and no-AI-tasks during the study.

A quarter of the participants saw increased performance, 3/4 saw reduced performance.

One of the top performers for AI was also someone with the most previous Cursor experience. The paper acknowledges that here:

> However, we see positive speedup for the one developer who has more than 50 hours of Cursor experience, so it's plausible that there is a high skill ceiling for using Cursor, such that developers with significant experience see positive speedup.

My intuition here is that this study mainly demonstrated that the learning curve on AI-assisted development is steep enough that asking developers to bake it into their existing workflows reduces their performance while they climb that learning curve.


The one time they let her speak publicly, it turned out to be a disaster. She never had any say, and the worst part is she was not even a good fall guy; it was clear who was pulling the strings. The most immaterial and inconsequential hire ever.

I love all the replies on Twitter thanking her but during her time the valuation dropped 80% and they were suing advertisers for not advertising. Remarkably inept.


"Fair enough. Since this was our first OSS project, we didn’t realize at first. We’ve now revised it. Thanks for your contribution."

We didn't notice that we copied your codebase, changed the name then pretended to have built it in four days?

Good grief.


To the NY Times: please don't say they died by suicide. The passive voice makes it sound like some act of God, something regrettable but unavoidable that just somehow happened. It's important not to sugarcoat what happened: the postmasters killed themselves because the British state was imprisoning them for crimes they didn't commit, based on evidence from a buggy financial accounting system. Don't blur the details of what happened by making it sound like a natural disaster.

Horizon is the case that should replace Therac-25 as a study in what can go wrong if software developers screw up. Therac-25 injured/killed six people; Horizon has ruined hundreds of lives and ended dozens. And the horrifying thing is, Horizon wasn't something anyone would have previously identified as safety-critical software. It was just an ordinary point-of-sale and accounting system. The suicides weren't directly caused by the software, but by an out-of-control justice and social system in which people blindly believed in public institutions that were actually engaged in a massive deep state cover-up.

It is reasonable to blame the suicides on the legal and political system that allowed the Post Office to act in that way, and which put such low quality people in charge. Perhaps also on the software engineer who testified repeatedly under oath that the system worked fine, even as the bug tracker filled up with cases where it didn't. But this is HN, so from a software engineering perspective what can be learned?

Some glitches were of their time and wouldn't occur these days, e.g. malfunctions in resistive touch screens that caused random clicks on POS screens to occur overnight. But most were bugs due to loss of transactionality or lack of proper auditing controls. Think message replays lacking proper idempotency, things like that. Transactions were logged that never really occurred, and when the cash was counted some appeared to be missing, so the Post Office accused the postmasters of stealing from the business. They hadn't done so, but this took place over decades, and decades ago people had more faith in institutions than they do now. And these post offices were often in small villages where the post office was the center of the community, so the false allegations against postmasters were devastating to their social and business lives.

Put simply - check your transactions! And make sure developers can't rewrite databases in prod.
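
As a toy illustration of that last point (idempotency plus transactional writes), here is a minimal sketch using sqlite3 from the Python standard library; the table and column names are made up:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE ledger (
            message_id   TEXT PRIMARY KEY,  -- unique ID carried by every message
            branch_id    TEXT NOT NULL,
            amount_pence INTEGER NOT NULL
        )
    """)

    def record_transaction(message_id: str, branch_id: str, amount_pence: int) -> None:
        # INSERT OR IGNORE makes a replayed message a no-op instead of a duplicate
        # ledger entry that later looks like missing cash at the counter.
        with conn:  # commits on success, rolls back on error: no half-applied writes
            conn.execute(
                "INSERT OR IGNORE INTO ledger VALUES (?, ?, ?)",
                (message_id, branch_id, amount_pence),
            )

    # The same message delivered twice changes nothing:
    record_transaction("msg-001", "branch-42", 1500)
    record_transaction("msg-001", "branch-42", 1500)
    assert conn.execute("SELECT SUM(amount_pence) FROM ledger").fetchone()[0] == 1500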

