The most useful frame here is looking at log odds. Going from 15% -> 16% means
-log_2(.15/(1-.15)) -> -log_2(.16/(1-.16))
=
2.5 -> 2.39
So saying 16% instead of 15% implies an additional tenth of a bit of evidence in favor (alternatively, the odds ratio (.16/.84)/(.15/.85) ~= 1.08 ~= 2^0.11).
I don't know if I can weigh in on whether humans should drop a tenth of a bit of evidence to make their conclusion seem less confident. In software (e.g. a spam detector), dropping that much information to make the conclusion more presentable would probably be a mistake.
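To make the arithmetic concrete, here's a small Python sketch (the helper name is just for illustration):

```python
import math

def log_odds_bits(p: float) -> float:
    """Evidence against the event, in bits: -log2(p / (1 - p))."""
    return -math.log2(p / (1 - p))

print(round(log_odds_bits(0.15), 2))  # 2.5
print(round(log_odds_bits(0.16), 2))  # 2.39
# The gap -- the extra evidence implied by saying 16% instead of 15% --
# is about a tenth of a bit:
print(round(log_odds_bits(0.15) - log_odds_bits(0.16), 2))  # 0.11
```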
It would be a political catastrophe right now if the EU blocked US companies for complying with temporary US court orders. My guess is this will be swept under the rug and permitted on the basis of a legal obligation.
I remember a while back that they did (and still do) train on repositories, to the point that I never wanted to use GitHub for anything other than submitting bug reports to projects.
Maybe the non-training only applies if you pay protection money? But then, if the repository is public, nothing stops some other AI company that isn't MS from accessing it and training on it.
They could read the whole git history and have all the issue tracker tickets in context, maybe even recordings from meetings. It remains to be seen, though, whether such a large context will yield usable results.
Windsurf and Cursor feel like temporary stopgaps, products of a narrow window in time before the landscape shifts again.
Microsoft has clearly taken notice. They're already starting to lock down the upstream VSCode codebase, as seen with recent changes to the C/C++ extension [0]. It's not hard to imagine that future features like TypeScript 7.0 might be limited or even withheld from forks entirely. At the same time, Microsoft will likely replicate Windsurf and Cursor's features within a year, and deliver them with far greater stability and polish.
Both Windsurf and Cursor are riddled with bugs that don't exist upstream, _especially_ in their AI assistant features beyond the VSCode core. Context management, which is supposed to be the core feature they add, is itself incredibly poorly implemented [1].
Ultimately, the future isn't about a smarter editor, it's about a smarter teammate. Tools like GitHub Copilot or future agents will handle entire engineering tickets: generating PRs with tests, taking feedback, and iterating like a real collaborator.
Humans can figure out a lot given enough time. While all the hype for us is finance, management, machines, electronics, software and so on, it is not unthinkable that a previous civilization went all in on soil. Terra Preta seems to be quite sophisticated.
Feature request: Overlapping time periods. I was a high school student from 1993-1998, but I was a university student from 1994-2000. (And then a student at a different university from 2001-2005.) Many people won't have high school and university overlapping, but overlapping education and career often do.
It might also be interesting to add relationships (at the very least marriage?) as visible time periods.
The classic audio diagnostic on Linux/Unix is a series of beeps, plus kernel debug text on all virtual terminals, when something really horrible has gone wrong.
If you want to make a beep in a linux console, try adding printf '\a' (the ASCII BEL character) to your shell scripts. It should trigger the default system "beep" sound. I used to include this in scripts that ran really long tests or cluster jobs, to wake me up so I could check the results.
Sadly, on many laptops and PCs today, there seems to be no action from the PC speaker at all, but you can configure this system beep code to trigger a sound of your choice from your favorite GUI console application.
In Python I believe you can trigger the system 'beep' sound with either of the following (both print the same ASCII BEL byte, 0x07, so they are equally cross-platform):
print("\a")
print('\007')
My favorite GUI console, Konsole, does not make any sound via these standard methods unless you manually configure it to play a file for "Bell in focused session" under Notifications.
Of course this plays a wav/ogg file instead of triggering the PC speaker.
It can be really hard to make a little beep these days, when you consider that you may have 4 different sound outputs (one for each display and video card, one or two for the motherboard), application-specific audio levels that get set to quiet or muted by default arbitrarily, and then application-specific opt-ins needed just to support a little sound that used to be a failsafe notification.
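For completeness, a minimal Python sketch of the old failsafe: write the BEL byte to stdout and let the terminal emulator decide what, if anything, to do with it. Whether it is actually audible depends on all the configuration above.

```python
import sys

# "\a", "\x07", and "\007" are all the same ASCII BEL byte.
assert "\a" == "\x07" == "\007"

def bell() -> None:
    # Send BEL to the terminal; the emulator decides whether to beep,
    # flash, raise a notification, or silently drop it.
    sys.stdout.write("\a")
    sys.stdout.flush()

bell()
```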
> You might think how do you define that, but the companies have already made a whole science out of it so it's not that abstract anymore
Hardly. Addictiveness isn't binary. There are many people who obsessively check their email or refresh news websites. There is no doubt that social media companies choose the algorithms that maximize engagement, and so they most probably also maximize addiction, but _any_ algorithm will cause addiction to some extent. Where's the limit? How do we even measure it?
Something that is maybe a little more interesting is banning the practice of recommending "negative content" because it produces more engagement than "positive content". How this is defined is also somewhat squishy, but we can at least try to define it -- content that is likely to provoke negative emotions, like anger, fear, aggression, etc.
I think there's a much clearer through-line to argue that recommending negative content on social media produces a substantial negative externality, and that moves this into the category of things like environmental regulations.