
Hmm, where to begin? This is an old idea. It has all been tried before in the JVM world and yet support for it is now being removed, which is in my view a pity given that Now Is The Time. But the problems encountered trying to make it work well were real and would need to be understood by anyone trying the same in the JS world.

Understand that Java had it relatively easy. Java was designed with a sandbox as part of the design from day one, the venerable SecurityManager. The language has carefully controlled dynamism and is relatively easy to statically and dynamically analyze, at least compared to JavaScript. The libraries were designed more or less with this in mind, and so on.

So what went wrong?

Firstly, the model whereby you start with a powerful "root" capability and then shave bits off doesn't have particularly good developer usability. It requires you to manually thread these little capabilities through the call stack and heap, which is a nightmare refactoring job even in a language like Java, let alone something with sketchy refactoring tooling like JavaScript. Lots of APIs become awkward or impossible, something as basic as:

    var lines = readFile("library-data.txt");
is now impossible because there's no capability there; yet developers do expect to be able to write such code. Instead it would have to look like this:

    function readFile(appDataPath) {
        var file = appDataPath.resolve("library-data.txt");
        return file.readLines();
    }

    readFile(rootFileSystem.resolve("/app/data"));
Can you do it? Yes. Does it make code that was once concise and obvious verbose and non-obvious? Also yes.

Consider also the pain that occurs when you need a module that has higher privileges than the code calling it (e.g. a graphics library that needs to load native code, but you don't want to let the sandboxed code do that). In the pure caps model you end up needing a master process that "tunnels" powerful caps through to the lower layers of the system, breaking abstractions all over the place.
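In code, the workaround tends to look like this (a runnable toy; every name here is hypothetical):

    // The plugin layer must not load native code, but the graphics library
    // beneath it has to, so the root "tunnels" the powerful cap past it.
    function makeGraphicsLib(nativeLoader) {
      return { draw: () => nativeLoader.load('blit') };
    }
    function makePluginHost({ graphics }) {
      // Plugins receive graphics, but can never reach nativeLoader itself.
      return { runPlugin: (plugin) => plugin({ graphics }) };
    }

    const nativeLoader = { load: (op) => console.log('native:', op) };  // stand-in
    const host = makePluginHost({ graphics: makeGraphicsLib(nativeLoader) });
    host.runPlugin(({ graphics }) => graphics.draw());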

Secondly, this model means you can never add new permissions, change the permissions model, or try different approaches, because refining permissions == refactoring all your code, globally, which isn't feasible.

Thirdly, this model imposes cap management costs on everyone, even those who don't care about security because they already know the code is trustworthy, e.g. because their colleagues wrote it, it came from a trustworthy vendor, or it'll run in a process sandbox. Even if you know the code is good, it doesn't matter: you still have to supply it with lots of capabilities, implement callbacks to hand it the capabilities it needs on demand, and so on.
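Concretely, even fully-trusted code ends up paying the tax; the host has to wire up plumbing like this before a library can read so much as a data file (all names here are hypothetical):

    // Hypothetical: the host supplies an on-demand capability callback,
    // even though it fully trusts the parser it is calling.
    function makeParser({ openDataFile }) {
      return { parse: (name) => openDataFile(name).trim().split('\n') };
    }

    const appDataDir = { read: (name) => 'a\nb\n' };  // stand-in directory cap
    const parser = makeParser({ openDataFile: (name) => appDataDir.read(name) });
    console.log(parser.parse('grammar.txt'));  // ['a', 'b']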

These problems caused Java to adopt a mixed capability/ambient permissions model. In the SecurityManager approach you assigned permissions based on where code came from, and stack walks were used to intersect the permissions of all the code sources on the stack. Java also allowed libraries to bundle data files within them, and granted libraries read access to their own resources by default. That solved the above problems but introduced new ones: it lowered performance due to the stack walking, and library developers now had to document what permissions they needed and actually test their code in a sandboxed context. They never did this. The approach was also beaten from time to time by people finding clever ways to construct pseudo-interpreters out of highly dynamic code, such that malicious code could get run without the bad guy being on the stack at all.

Fourthly, it's dependent on everyone playing defense all the time. If your object might get passed in to malicious code, then it has to be designed with that in mind. A classic mistake:

    class Foo {
       private ArrayList<String> commands = new ArrayList<>();

       void addCommand(String command) { commands.add(command); }
       List<String> getCommands() { return commands; }  // oops: hands out the mutable list
    }
The author's intent was to make an object in which you can read the list of commands but not write them. But they're returning the collection directly instead of using an immutable wrapper (e.g. Collections.unmodifiableList). Fine in normal code, but oops: in sandboxed code that's now a CVE. Bugs like this are non-obvious, the tooling needed to find them isn't straightforward, and they're a constant drain on development.
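The same footgun translates directly to JavaScript, where a defensive version has to hand out a copy instead (a sketch of mine, not from the post):

    class Commands {
      #commands = [];
      add(command) { this.#commands.push(command); }
      // The classic leak: callers receive the live internal array and can mutate it.
      getAll() { return this.#commands; }
      // Defensive version: return a snapshot the caller can't use to reach inside.
      getAllSafe() { return [...this.#commands]; }
    }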

Fifthly, Spectre attacks mean that a library that can get data to an attacker via any route can exfiltrate data from anywhere in the process. You may not care about this, and for many libraries there may be no plausible way they can exfiltrate data. But it's another sharp edge.

Finally, it all depends on the ecosystem having minimal native code dependencies. The moment you have native code in the mix, you can't do this kind of sandboxing at all.

Now. All these are challenges but they don't mean it's impossible. Sandboxing of libraries is clearly and obviously where we have to go as an industry. The Java approach didn't fail only due to the fundamental difficulties outlined above - the SecurityManager was poorly documented and not well tuned for the malicious libraries use case, because it was really meant for applets. After the industry gave the Java team so much shit over that, they just sort of gave up on the whole technology rather than continuing to iterate on it. It may be that a team with fresh eyes and fresh enthusiasm can figure out solutions for the above issues and make in-process sandboxing really happen. I wish them the best, but anyone who wants to work on that should start by spending time understanding the SecurityManager architecture and how it ended up the way it did.

https://dl.acm.org/doi/pdf/10.1145/2030256.2034639



Author here. Thank you so much for this summary of Java's approach. I learned Java when I was a kid in the 90s, and I remember seeing some SecurityManager stuff in the Java standard library and having no idea what it was or why I would want any of it. It's funny to think that decades later I would propose re-inventing it.

As for the code, surely something like this could work?

    var lines = readFile("library-data.txt", capabilityToken);
But yeah, even in the example in my post the capability tokens are annoying and feel cumbersome.

Another poster in this thread suggested maybe expressing capabilities in your package.json file. Maybe when you pull in a dependency you can say "oh, and rather than inheriting all my capabilities, only give this library access to capability X". That would provide a nice ramp, but there's a whole new set of problems that way, since you'd need to be able to express something like "the capability you need for the redis client is network access to this specific IP address". And that specific capability needs to be passed all the way through the dependency tree to whatever finally opens the socket.

Expressing this in a granular way in code is easy, but noisy. But if we do it in package.json, maybe that's not going to be expressive enough.
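To make that concrete, the manifest entry might look something like this - purely hypothetical syntax, since no such "capabilities" field exists in npm today:

    {
      "dependencies": {
        "redis": "*"
      },
      "capabilities": {
        "redis": {
          "net": ["connect:10.0.0.5:6379"]
        }
      }
    }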

Anyway, like you, I hope someone smart takes the time to have another stab at this. The security model where we trust all software engineers is obviously breaking down at this point, and short of a model like this, I'm not sure how we can really solve the problem at all. In any case, thank you for sharing your wisdom.


Re: your example. What is "capabilityToken" in this case? What does it grant you, precisely? Is it a directory? A file? Something else? The classical approach to using caps with files is to create a File type of some kind which encapsulates the permissions and lets you derive things from it, e.g. sub-directories or files in that directory, but not navigate up the tree. Or it's associated with some whitelist of files.
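Something like this, in Node terms - a minimal, unhardened sketch of the idea (no symlink handling, and not a real API):

    const path = require('node:path');
    const fs = require('node:fs');

    // A directory capability: callers can derive sub-paths, but nothing in
    // the returned object lets them climb back above `root` (assumed absolute).
    function makeDirCap(root) {
      function resolveInside(name) {
        const p = path.resolve(root, name);
        if (p !== root && !p.startsWith(root + path.sep)) {
          throw new Error(name + ' escapes the capability');
        }
        return p;
      }
      return {
        subdir: (name) => makeDirCap(resolveInside(name)),
        readLines: (name) => fs.readFileSync(resolveInside(name), 'utf8').split('\n'),
      };
    }

    // The host derives a narrow cap and hands only that to the library:
    //   const appData = makeDirCap('/app/data');
    //   appData.readLines('library-data.txt');   // ok
    //   appData.readLines('../../etc/passwd');   // throws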

For that to work you need not only a carefully designed set of types but also they must be able to protect their internals. JavaScript historically hasn't had this, I don't know about modern versions, but the ability to restrict monkey-patching, reflection over private fields etc is a must.


> For that to work you need not only a carefully designed set of types but also they must be able to protect their internals. JavaScript historically hasn't had this, I don't know about modern versions, but the ability to restrict monkey-patching, reflection over private fields etc is a must.

At the bottom of the post I sketched out how we could make this work in practice in JavaScript. We can use a Symbol[1], and then have that be a key into a Map owned by the builtin capabilities library. That would make the token itself safe from being messed with.

But so long as the capabilities library uses whatever the object is as a key in a JS Map (with the value being the token's scope), we could just as easily use anonymous objects or something else.

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...


I think the issue is more the code that uses the capability itself. Like, if I can just read the capability straight out of the object that owns it, or monkey-patch the definition of some other object it calls into so I can use its capabilities indirectly, then you still lose. That's what I meant by playing defense all the time. If you give a bit of sandboxed code a generic utility object, it can all go wrong.


The idea here is that there are two things: the token (a Symbol() or something) and the scope of capabilities which that token gives you. The capabilities themselves are stored in a Map that you don't control. JavaScript function scopes give us everything we need to hide that map and make sure nobody can modify it. The only methods which are exposed are things like getScopeForToken(), which reads from the map (and does a deep clone) then returns that scope object.

In privileged methods like fs.writeFile(), you don't pass the scope. You pass the token. And that method would explicitly go and check if that token has the scope that it needs to write to the passed path.
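A sketch of that design (my own naming; the crucial part is that the Map lives in a closure nothing outside the capabilities module can reach):

    // Inside the built-in capabilities module:
    const scopes = new Map();  // token -> frozen scope, hidden in this closure

    function mintToken(scope) {
      const token = Symbol('capability');
      scopes.set(token, Object.freeze({ ...scope }));
      return token;
    }

    // A privileged API takes the token, never the scope, and checks it itself:
    function writeFile(token, filePath, data) {
      const scope = scopes.get(token);
      const ok = scope && (scope.writePaths || []).some((p) => filePath.startsWith(p));
      if (!ok) throw new Error('token does not permit writing ' + filePath);
      require('node:fs').writeFileSync(filePath, data);
    }

    // const token = mintToken({ writePaths: ['/app/data/'] });
    // writeFile(token, '/app/data/cache.json', '{}');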

But I do hear you about playing defense. I mentioned it in the post - there are probably a bunch of subtle ways you could use JavaScript to mess with things. Covering all of those cases would need some serious rigor.


I don't know how relevant it still is, but did you ever look at the old Google Caja project?

https://en.wikipedia.org/wiki/Caja_project

It was trying to implement capabilities in JavaScript, but failed because JS was too dynamic at the time. It might be that newer language versions have made it possible but it'd be worth researching why they gave up on it.

Caja was designed by Google research scientist Mark S. Miller in 2008 as a JavaScript implementation for "virtual iframes" based on the principles of object-capabilities. It would take JavaScript (technically, ECMAScript 5 strict mode code), HTML, and CSS input and rewrite it into a safe subset of HTML and CSS, plus a single JavaScript function with no free variables. That means the only way such a function could modify an object was if it was given a reference to the object by the host page. Instead of giving direct references to DOM objects, the host page typically gives references to wrappers that sanitize HTML, proxy URLs, and prevent redirecting the page; this allowed Caja to prevent certain phishing and cross-site scripting attacks, and prevent downloading malware. Also, since all rewritten programs ran in the same frame, the host page could allow one program to export an object reference to another program; then inter-frame communication was simply method invocation.
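In modern terms, the shape it produced is roughly this (a loose sketch of the idea, not Caja's actual output):

    // Guest code is rewritten into a single function with no free variables;
    // the host then decides exactly which powers to pass in. (The rewriting
    // step, elided here, is what guarantees nothing else is reachable.)
    const compiledGuestSource = 'host.log("hello from the guest");';
    const guest = new Function('host', compiledGuestSource);
    guest({ log: (msg) => console.log('[guest]', msg) });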


I spent some time with one of the Caja developers back in 2010 or so, before it was made public.

From memory, the problem they were trying to solve was a bit different: they wanted to be able to run potentially hostile, user-supplied JavaScript code inside the JS VM purely using source-code-level validation. So, for example, Caja needed to make sure the sandboxed code didn't access the global object (since then it could escape its sandbox). And because simple code like `(function () { return this })()` evaluates to the global object, they banned the keyword `this` in sandboxed code.
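You can still see the hole they were closing, and why ES5 strict mode (which Caja targeted) mattered:

    // Sloppy mode: a bare function call's `this` is the global object, so
    // any sandboxed snippet containing this expression escapes immediately.
    var g = (function () { return this })();
    console.log(g === globalThis);  // true (in non-strict, non-module code)

    // Strict mode plugs the hole: `this` is undefined here.
    var s = (function () { 'use strict'; return this })();
    console.log(s);  // undefined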

I'm hoping there's a way we can give untrusted code more or less full access to the JS environment, but just limit its access to the rest of the operating system. JavaScript was first developed for web browsers, and to this day most JavaScript still has little to no need to access the rest of the operating system directly.

But JavaScript's obsessively granular modularity works in our favor here. If you look at a library like express, the core library makes vanishingly few calls to the nodejs environment. `app.listen()` is the only method I know of which wouldn't "just work" in this new world I'm proposing, and that's just a convenience wrapper around `require('http').createServer(app)` anyway. All the hard work happens in libraries like express.static - but that's trivially easy to swap out for another package that supports capabilities correctly, if we need to.
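To make that concrete - only host code ends up touching the node http builtin (or, in a capability world, a checked wrapper around it):

    // What app.listen() does today, written out so express itself never
    // needs OS-level access (requires express to be installed):
    const http = require('node:http');
    const express = require('express');

    const app = express();                 // the library never opens a socket
    http.createServer(app).listen(8080);   // only host code exercises 'net'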

A bad library could always be buggy - we can't stop that. I mostly want to stop opportunistic developers from taking advantage of the machines their modules run on, so we can detect (and stop) them doing nasty things. But as a few people have mentioned, this approach might be stuck "always playing defense". The nice thing about Caja is that it was "complete": there were no weird edge cases left over in the language that the sandbox authors hadn't considered. That's what I'm most worried about here.


> Lots of APIs become awkward or impossible, something as basic as[...]

I mean, wouldn't you use a `readFile()` function like that by passing in the file handle? So:

    var lines = readFile(fs.open("library-data.txt"));
...where, if you're in a library somewhere, `fs` may be a capability to a directory that you've been passed rather than a global granting access to the entire filesystem. This doesn't feel much more awkward than your example of:

   var lines = readFile("library-data.txt");
EDIT: I am assuming you have an `fs.open()` that returns a file handle here; Node's doesn't seem to and instead takes a callback as an argument. You get the idea though.
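For what it's worth, Node's promise-based `fs.promises.open()` does return a FileHandle, so the pattern works today:

    // A FileHandle is already a (coarse) capability to one file, which you
    // can pass around without granting filesystem-wide access.
    const fsp = require('node:fs/promises');

    async function main() {
      const handle = await fsp.open('library-data.txt', 'r');
      try {
        const lines = (await handle.readFile('utf8')).split('\n');
        console.log(lines.length);
      } finally {
        await handle.close();
      }
    }
    main();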


That's pretty much what I said, no? It gets awkward: now your library can't just load some data table it needs from a file; it either has to have some sort of initialization step where you give it the capabilities it needs, or it has to take them in the API call itself.

Now let's say you change the implementation such that it needs a new permission. You have to pass that in, which may well mean passing it in from the root of the app through a long call stack. Quite painful. Programmers like conveniences such as being able to give a string instead of a file handle.


I’m sure programmers do like that convenience, but if the consequence is that we’re giving every library access to everything the rest of the app has access to, I don’t think that’s tenable long-term.

> Now let's say you change the implementation such that it needs a new permission. You have to pass that in, which may well mean passing it in from the root of the app through a long call stack.

Sure, but put another way: you can’t change the implementation of your library to grant yourself more access to the system without the calling application being aware of it. Is this potentially inconvenient? Sure. But it does mean that the developer of the calling program knows pretty dang well what access they’re handing over to the library.



