Yeah and to give a more recent example, it's exactly like how RAM, storage, and other computer parts have gotten much cheaper over the last 3 years... oh wait.
This is exactly what makes it dangerous. Food can taste ok but actually cause you to get sick. Not all bacteria are going to taste off. I'm assuming you're not a chef because if you were then you'd know how absurd your statement is.
For a super simple example, if you don't properly handle or cook raw meat then you risk getting sick even though the food might not immediately taste bad. Maybe that's obvious to you but might not be to the person preparing the food. Another example: Rhubarb pie is supposed to be made with the leaves and not the stalk because the stalk is poisonous and can cause illness. Just kidding, it's actually the other way around but if you were just reading a ChatGPT recipe that made that mistake maybe you wouldn't have caught it.
> When a question touches restricted data — student PII, sensitive HR information — the agent doesn’t just refuse. It explains what it can’t access and proposes a safe reformulation. "I can’t show individual student names, but here’s the same analysis using anonymized IDs."
This part is scary. It implies that if I'm in a department that shouldn't have access to this data, the AI will still run the query for me and then do some post-processing to "anonymize" the data. This isn't how security is supposed to work... did we learn nothing from SQL injection?
- The bot giving out PII by accident. You ignore it and report it.
- You trying to fool the bot into giving you PII you're not supposed to have. But you've created an audit trail of your 100 failed prompt injections. The company fires you.
This isn't public facing, open to anyone. This is more like a shared printer in the office.
In the strongest interpretation of that, it would offer only data which the user is allowed to access. Why do you assume that, having implemented a feature to prevent PII from being accessed, they would then turn around and return data which the user is not supposed to access?
If it's PII data the best thing for them to do is not even allow the AI to have access to it. They're admitting to that so I doubt they've gone through the effort to forward the user's auth token to the downstream database.
And with security it's always best to assume the worst case (unless you're certain that something is safe) because that would lead you to add more safeguards rather than fewer.
To be fair to them, the architecture description said that each datasource had a unique agent, so the orchestrator AI didn't have direct data access, and that they specifically only allow access to data the user has permissions for.
Unclear if each datasource agent is ALSO AI based though, in which case it has just pushed the same concern down the line one hop.
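To make the distinction concrete, here's a minimal sketch of what "only allow access to data the user has permissions for" could look like when the check lives in the data layer instead of in post-processing. Everything in it (`User`, `run_query`, `agent_answer`, the toy table) is hypothetical and made up for illustration; nothing here is taken from the article's actual architecture.

```python
from dataclasses import dataclass


class AccessDenied(Exception):
    """Raised by the data layer when the caller lacks permission."""


@dataclass(frozen=True)
class User:
    # Hypothetical user record: the columns this person may read.
    name: str
    allowed_columns: frozenset


# Toy stand-in for a real datasource.
TABLE = [
    {"student_id": 1, "student_name": "Ada", "grade": 91},
    {"student_id": 2, "student_name": "Grace", "grade": 87},
]


def run_query(user, columns):
    """Data layer: checks the caller's own permissions before reading any rows."""
    denied = [c for c in columns if c not in user.allowed_columns]
    if denied:
        raise AccessDenied(f"{user.name} may not read: {', '.join(denied)}")
    return [{c: row[c] for c in columns} for row in TABLE]


def agent_answer(user, columns):
    """Agent layer: holds no credentials of its own, it only forwards the user's identity."""
    try:
        return run_query(user, columns)
    except AccessDenied as exc:
        # The restricted rows were never fetched, so there is nothing to "anonymize".
        return str(exc)
```

The point of the design is that a successful prompt injection against `agent_answer` still can't return `student_name` to someone without that permission, because the refusal happens before the data is read rather than after.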
There's a difference between 1000 diverse humans with varied traits making errors that should cancel out because of the law of large numbers vs 10 AI with the same training data making errors that would likely correlate and compound upon each other.
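A rough way to put numbers on that, assuming every rater's error has the same variance and every pair of raters shares the same correlation (a big simplification, and the 0.9 correlation below is an arbitrary illustrative value, not a measurement):

```python
def variance_of_mean(n, sigma2, rho):
    """Variance of the average of n errors, each with variance sigma2
    and identical pairwise correlation rho.

    Var(mean) = sigma2 * (1 + (n - 1) * rho) / n
    With rho = 0 this is the familiar sigma2 / n; with rho > 0 it can
    never drop below rho * sigma2, no matter how large n gets.
    """
    return sigma2 * (1 + (n - 1) * rho) / n


humans = variance_of_mean(1000, 1.0, 0.0)  # independent errors
models = variance_of_mean(10, 1.0, 0.9)    # heavily correlated errors
```

Under those assumptions the 1000 independent humans average down to a variance of about 0.001, while the 10 correlated models stay around 0.91, and adding more models with the same correlation would not push that below 0.9.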
I see a lot of researchers working on newer ideas so I wouldn't be surprised if we get a breakthrough in 5-10 years. After all, the gap between AlexNet and Attention Is All You Need was only about 5 years. And then Scaling Laws was about 3-4 years after that. It might seem like not much progress is being made but I think that's in part because AI labs are extremely secretive now when ideas are worth billions (and in the right hands, potentially more).
Of course 5-10 years is a long time to bang our heads against the wall with untenable costs but I don't know if we can solve our way out of that problem.
Yep, last time we got "a lot of researchers working on newer ideas", it took them 20 years to arrive at a working idea, and another 20 years to get it mature enough to cause an AI boom.
I don't think anyone is talking about it because it's not a very productive conversation to have. I'm not particularly bullish on vibe coding either but if you could explain what exactly about vibe coding causes these specific issues then it could be more interesting to discuss.
But as it stands, the more likely reason is a capacity crunch caused by a chip shortage and demand heavily outpacing supply. Your vibe coding reason is based on as much vibes as their code probably is.
Vibe coding does not usually produce performant code. It produces spaghetti, with the goal of making the user who asked for the work go away as soon as possible with an (often barely) working solution.
I recently vibe-translated a simple project from JavaScript to C. The JavaScript version ran at 30fps, and the first C version produced 1 frame every 20 seconds. After some time trying to get the AI to optimize it, I got the C version up to 1fps. Not a win, but the AI did produce working C code.
I have no doubt that if I had done this myself (which I will do soon), with the appropriate level of care, it would have been 30fps or more.
This is super cool but part of me wishes I could skip to the later levels rather than redo college homework from a decade ago. Maybe that ruins the fun but also slogging through the early levels (especially when the UI is a bit rough around the edges and doesn't support copy paste) isn't fun either.
> Based on current internal deliberations, the company could launch its first touch-screen Mac in 2025
It looks like those leaks aren't too far off what I'm saying. Deadlines slipping by 1-2 years isn't way off especially for such a new/different product direction. And the rumor also said "could" which means even internally, it wasn't a strong claim.
Yes and it's an article about a leak 3 years ago. And there were more "leaks" before that. I just can't be bothered to research and link the obvious to argue against an "opinion".
I think the idea is that by creating these shared .claude files, you tell the agent how to develop for everyone and set shared standards for design patterns/architecture so that each user's agents aren't doing different things or duplicating effort.
> That means that when we close the issue, we believe it has a high chance of being fixed
I agree with this iff it's being done manually after reading the issue. stalebot is indiscriminate and as far as "owing" the user, that's fair, but I'd assume that the person reporting the bug is also doing you a favor by helping you make things more stable and contributing to your repo/tool's community.
I partially agree, but even with stalebots nobody is measuring the maintainers' productivity. So when they made the choice to use stalebots, they did that because they believe that's best for the project. It's different from corporate.
Nobody is measuring their productivity, but people definitely look at how many open issues they have and potentially how long those issues have existed. They’re likely incentivized to close issues for appearances.
With a popular open source project, you'll quickly get to a number of bug reports that you have no chance of ever solving. You will have to focus on the worst ones and ones affecting most users.
At the same time, you want to communicate to users that this is the case so they don't have the wrong expectations. But also, psychologically it is demotivating to have a queue of 1000+ open bugs with no capacity to re-triage and only two maintainers able to put in a few hours every month or every week.
In open source, "won't fix" means either "not in scope — feel free to fork" or "no capacity ever expected — feel free to provide a fix".
The optimization problem is how do you get the most out of very limited time from very few people, and having 1000+ open bugs that nobody can keep in their head or look for duplicates in is mentally draining and stops the devs from fixing even the top 3 bugs users do face.
The problem is that your users also have limited time. If it's clear you're not even looking at issues where someone has put in lots of effort to help you, then you're only going to get lazy issues, and it will actually take more effort from you to do all that work yourself if you want to reach the same software quality.
I think you are missing the point: a user putting in a lot of effort into a bug report is usually trying to help themselves get the bug fixed.
As a maintainer, you will obviously look at that bug with more appreciation: but if you estimate it will take you 3 months of active development to fix it that you will have to spread over a full year of your weekends (which you can't afford), what would you do?
And what would a reasonable user rather see: "Yes, this is an issue, but it's very hard to fix and I don't have the time", or just letting the bug linger?