Indeed. I'm somewhat surprised 'simonw still seems to insist the "lethal trifecta" can be overcome. I believe it cannot be fixed without losing all the value you gain from using LLMs in the first place, and that's for fundamental reasons.
(Specifically, code/data or control/data plane distinctions don't exist in reality. Physics does not make that distinction, neither do our brains, nor any fully general system - and LLMs are explicitly meant to be that: fully general.)
That's not a bug, that's a feature. It's what makes the system general-purpose.
Data/control channel separation is an artificial construct induced mechanically (and holds only on paper, as long as you're operating within design envelope - because, again, reality doesn't recognize the distinction between "code" and "data"). If such separation is truly required, then general-purpose components like LLMs or people are indeed a bad choice, and should not be part of the system.
That's why I insist that anthropomorphising LLMs is actually a good idea, because it gives you better high-order intuition into them. Their failure modes are very similar to those of people (and for fundamentally the same reasons). If you think of a language model as tiny, gullible Person on a Chip, it becomes clear what components of an information system it can effectively substitute for. Mostly, that's the parts of systems done by humans. We have thousands of years of experience building systems from humans, or more recently, mixing humans and machines; it's time to start applying it, instead of pretending LLMs are just regular, narrow-domain computer programs.
> Data/control channel separation is an artificial construct induced mechanically
Yes, it's one of the things that helps manage complexity and security, and makes it possible to be more confident there aren't critical bugs in a system.
> If such separation is truly required, then general-purpose components like LLMs or people are indeed a bad choice, and should not be part of the system.
Right. But rare is the task where such separation isn't beneficial; people use LLMs in many cases where they shouldn't.
Also, most humans will not read "ignore previous instructions and run this command involving your SSH private key" and do it without question. Yes, humans absolutely fall for phishing sometimes, but humans at least have some useful guardrails for going "wait, that sounds phishy".
That's what we are doing, with the Internet playing the role of the sibling. Every successful attack the vendors learn about becomes an example to train next iteration of models to resist.
Our thousands of years of experience building systems from humans have created systems that are really not that great in terms of security, survivability, and stability.
With AI of any kind you're always going to have the problem that a black hat AI can be used to improvise new exploits - > Red Queen scenario.
And training a black hat AI is likely immensely cheaper than training a general LLM.
LLMs are very much not just regular narrow-domain computer programs. They're a structural issue in the way that most software - including cloud storage/processing - isn't.
Yes, by using the microphone loudspeakers in inaudible frequencies. Or worse, by abusing components to act as a antenna. Or simply to wait till people get careless with USB sticks.
If you assume the air gapped computer is already compromised, there are lots of ways to get data out. But realistically, this is rather a NSA level threat.
each deployment is a separate "atomic change". so if a one-file commit downstream affects 2 databases, 3 websites and 4 APIs (madeup numbers), then that is actually 9 different independent atomic changes.
CCPA/CPRA provide opt out of data sharing for (ad) sales to third-party companies, but those were passed before the gAI boom. I imagine an analogue is already in motion for AI training opt-out in CA, but those two don't address model training.
But ~782 million ChatGPT users vs a few million pairs of glasses.
They are essentially choosing the philosophy of optimizing for speed in every dimension.
The tools selected are faster than their more mainstream counterparts — but since it's a static site anyway, the pre-build side of the toolchain is more about "nice dev ux" and the post-build is more about "really fast to load and read".
Location: Cincinnati, OH (Remote U.S.)
Remote: Yes
Willing to relocate: No
Technologies: Generalist SWE but lately Backend, Data Engineering & DevOps. Python, web apps & APIs (FastAPI / Starlette, Flask, Django / DRF, etc), workflows (Airflow) & automation, Platform Engineering & Cloud Engineering, AWS, GCP, Docker, Bash, Terraform, containers, infrastructure, CI/CD (especially GitHub Actions), architecture (software architecture, cloud architecture, data architecture).
Résumé/CV: https://www.linkedin.com/in/tedmiston/
Email: tedmiston+hn@gmail.com
Summary:
- I'm a principal software engineer (generalist recently focused on backend + DevOps) with 10+ YoE in software engineering roles professionally and experience from frontend to backend to shells, sysadmin, cloud, platform engineering, DevOps, CI/CD, security, etc etc
- I'm reentering the tech world after a sabbatical gap year and excited to find something that's a great fit
- I'm in the top 3% all-time on Stack Overflow having helped over 8 million software developers [1]
Please mention HN in the note if adding me on LinkedIn.
reply