I see a lot of people suggesting solutions that, while helpful, wouldn’t actually solve the key problem in supply chain security.
- You can’t trust the package registry because of security flaws in the registry itself (as seen here as well as in NPM just a few months ago[1]). The CDN can be hacked, or an insider can attack the infrastructure.
- You can’t rely on code signing since a maintainer can go rogue and sabotage a package at any time (as happened with colors.js and faker NPM packages in January[2]) or a new maintainer could be added to the project and possess a valid signing key but turn out to be a bad actor (as happened with the event-stream NPM package in 2018[3]).
- You can’t audit every line of code in every dependency because the cost in terms of time and expertise is prohibitive to all but the biggest organizations (e.g. Google) and the most security sensitive applications (e.g. certain crypto projects and financial applications).
The two solutions I’m most excited about are (1) auditing package behavior with static analysis to detect when a package’s behavior changes (e.g. new network connections, filesystem accesses, or install scripts), as we do at https://socket.dev (disclosure: I am the founder), and (2) package sandboxing, of which LavaMoat is the best example. (The performance impact of sandboxing is currently too high to apply it to every package in an app, and it also requires maintaining a policy configuration for each package.)
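For what it's worth, the core idea behind approach (1) can be sketched in a few lines: flag capability-relevant signals in a package, then diff them across versions. Everything below (the signal lists, the function names) is illustrative, not how any real tool is implemented — production scanners parse the AST, resolve dynamic requires, and much more:

```javascript
// Coarse capability "signals" to look for in a package.
const RISKY_SCRIPTS = ['preinstall', 'install', 'postinstall'];
const RISKY_MODULES = ['child_process', 'net', 'http', 'https', 'dns', 'fs'];

// Collect signals from a package's manifest and source text.
function scanPackage(pkgJson, sourceCode) {
  const flags = [];
  for (const s of RISKY_SCRIPTS) {
    if (pkgJson.scripts && pkgJson.scripts[s]) flags.push(`install script: ${s}`);
  }
  for (const m of RISKY_MODULES) {
    if (new RegExp(`require\\(['"]${m}['"]\\)`).test(sourceCode)) {
      flags.push(`uses module: ${m}`);
    }
  }
  return flags;
}

// A version bump that introduces new signals is what deserves human review.
function newCapabilities(oldFlags, newFlags) {
  return newFlags.filter((f) => !oldFlags.includes(f));
}
```

The interesting output isn't "this package uses the network" — it's "version 2.x started using the network when the previous version didn't."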
There are multiple different problems with different solutions of varying impact.
I think you can probably split the two areas of interest into:
1. A package maintainer's credentials are compromised
2. A package repository is compromised
And the two attack vectors into:
1. The build script(s)
2. The runtime library
You can cut off the "repository is compromised" path with signing. An attacker doesn't have the maintainer's private key, so even if they can modify source or packages on the server, they can't "trick" the client into verifying them.
(1) is harder. Let's assume we have package signing: we know that any package we have was signed with a key we hope only the maintainer can access. At this point, the maintainer is either compromised or malicious. We can make compromise harder in a few ways, but should ultimately assume it will happen.
One way to reduce compromise is to have the signing key stored on a hardware token that requires proof-of-presence for signatures.
Still, at this point we're left with "build script bad" and "library bad". Both are much harder problems, with the kind of solutions you've alluded to - that is, sandboxing behaviors.
What this requires is a way to say "this code can do these things". This is how browser extensions / mobile apps work - they have to declare their permissions and you have to ack them any time they change.
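A sketch of that declare-and-ack flow - note the `permissions` manifest field here is hypothetical, modeled on browser extensions; nothing like it exists in npm today:

```javascript
// Given the installed version's manifest and a candidate update's manifest,
// return the permissions the update requests that weren't already granted.
function permissionsToAck(installed, update) {
  const granted = new Set(installed.permissions || []);
  return (update.permissions || []).filter((p) => !granted.has(p));
}

const v1 = { permissions: ['fs:read'] };
const v2 = { permissions: ['fs:read', 'net:egress'] };
// The update adds a network permission, so the user must re-ack.
console.log(permissionsToAck(v1, v2)); // [ 'net:egress' ]
```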
Doing this for build scripts isn't too hard. You can run the scripts in a "hermetic" build system and have each script execute serially in a restricted environment - if one needs networking, give it networking, etc. There's no native support for this but imo it wouldn't be that hard to add it in.
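On Linux you can approximate this today with network namespaces - illustrative commands only, assuming util-linux's `unshare` and a standard npm project:

```shell
# Fetch phase: network allowed, but no package scripts run.
npm ci --ignore-scripts

# Build phase: run the lifecycle scripts with networking removed.
# `unshare --net` gives the child process an empty network namespace.
unshare --net --map-root-user npm rebuild
```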
Doing this for libraries is much harder. You'd need a native capabilities system in the language, and changes to capabilities would break the API. But sandboxing entire processes isn't that hard. The vast majority of services don't require egress to the public internet, so an attacker is already going to have a hell of a time if all they get is a shell on a box they can't even communicate with. So I'd start there: limit what processes can do, and that limits the impact of a compromised library.
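As a concrete example, systemd can enforce egress limits per service without touching the application - a sketch of a unit drop-in, where the allowed range is a placeholder for your own internal networks:

```ini
# Drop-in for a service unit, e.g. /etc/systemd/system/myapp.service.d/lockdown.conf
[Service]
IPAddressDeny=any
IPAddressAllow=10.0.0.0/8   # placeholder: allow only internal ranges
ProtectSystem=strict        # filesystem read-only outside explicit state dirs
NoNewPrivileges=yes
```

With this in place, a compromised library that tries to phone home simply can't reach the attacker's server.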
So altogether, none of these approaches seem super hard. We have signing, sandboxed builds (they can be pretty loosely sandboxed tbh - do a 'fetch', then cut off internet access for build scripts and limit filesystem access), and sandboxed services.
It'd be nice to have something more robust, but today you can do everything above without a ton of effort.
[1]: https://www.theregister.com/2021/11/16/github_npm_flaw/
[2]: https://www.theregister.com/2022/01/10/npm_fakerjs_colorsjs/
[3]: https://www.theregister.com/2018/11/26/npm_repo_bitcoin_stea...