Oh great. What could possibly go wrong with giving javascript access to local storage through some 'hard to trigger' gate. That's just asking for it. Hindsight and all that, but still, this is not a good idea. A browser should not let its own internal language, sandboxed for the web, have access to the local system through some loophole. It's only a matter of time before such a loophole becomes an exploit. I wonder if javascript has access to devices such as cameras and microphones through similar loopholes. That would be a bit of a problem.
You haven't the slightest understanding of software security. PDF.js was written to replace a component authored in a memory-unsafe language, for which exploits were being found at a rate measured in tens per year. Since its introduction PDF.js has had only 2 holes that were directly exploitable, neither leading to remote code execution, which was the default behaviour for pretty much any bug found in Acrobat.
If you don't want a browser that has some notion of "local file context" you should just sell your laptop and go live in a cave. FWIW the entire Firefox UI and every plugin for it is _written_ in Javascript served from local disk. Chrome, Safari and IE aren't far behind.
> If you don't want a browser that has some notion of "local file context" you should just sell your laptop and go live in a cave.
This is kind of a silly statement. Nobody would argue a program shouldn't be able to access local files; in this case, we would presume, PDF content that's been downloaded into a cache. The very simple argument is that the code which deals with opening and reading files from disk should be completely isolated from scripting-language code that runs dynamically in the same object space as the front-end scripting environment, e.g. put the .js in a sandbox by default, the way we used to take for granted (roughly the kind of isolation sketched below).
I understand that in the Mozilla suite, this barn door was left open years ago and the horses are far and wide by now.
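To make that concrete, here is a minimal sketch of the kind of isolation I mean, in plain web-page terms. This is not how PDF.js is actually wired into Firefox; viewer.html, pdfBytes and drawToCanvas are hypothetical placeholders. The point is that the untrusted parser only ever sees bytes and only ever hands back pixels.

    // Minimal sketch: run the untrusted PDF parser in a sandboxed iframe with a
    // unique (null) origin, so a malicious document that compromises the parser
    // still has no access to the embedding page, its cookies, or privileged APIs.
    const frame = document.createElement('iframe');
    frame.setAttribute('sandbox', 'allow-scripts'); // scripts only: no same-origin, no navigation, no plugins
    frame.src = 'viewer.html';                      // hypothetical page bundling the parser
    document.body.appendChild(frame);

    frame.addEventListener('load', () => {
      // Hand over raw bytes only.
      frame.contentWindow.postMessage({ type: 'render', data: pdfBytes }, '*');
    });

    window.addEventListener('message', (e) => {
      // Accept nothing back except rendered bitmaps.
      if (e.source === frame.contentWindow && e.data && e.data.type === 'page-bitmap') {
        drawToCanvas(e.data.bitmap);                // hypothetical helper on the trusted side
      }
    });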
That's not true. There have been PDF.js exploits that led straight to RCE. This has the additional downside of leading to immediate compromise on every platform.
If large amounts of code written in memory-unsafe languages are such a concern, then Mozilla should immediately stop adding large numbers of highly complex new features implemented in unsafe code to Firefox every year, mostly to do things that have absolutely nothing to do with displaying web pages but are enabled by default for political reasons.
Just like switching to PDF.js was a decision taken to try to reduce the security attack surface, the decisions to add webgl, webrtc, webfonts, webm, websockets, new css features and so on were all taken in the full knowledge that adding those things would vastly increase the attack surface and inevitably lead to security exploits. These new web features are responsible for a slew of new vulnerabilities and new classes of information leaks.
> (a) If large amounts of code written in memory-unsafe languages are such a concern, then Mozilla should immediately stop adding large numbers of highly complex new features implemented in unsafe code to Firefox every year,
> (b) ... mostly to do things that have absolutely nothing to do with displaying web pages but are enabled by default for political reasons.
(a) Mozilla is working on adding/replacing parts of Firefox with a language emphasizing security (among other things). The first Rust push in Firefox landed a Rust mp4 parser [1] on 2015-06-17. Others will come; in the meantime, the world keeps turning, and users/web developers expect these new web features, which Moz devs implement with the infrastructure they have and know. They're not going to sit on their hands and declare a moratorium until Rust (or other security-mitigating features/changes) are fully integrated.
(b) Not sure what you mean by political reasons, and maybe you want to stay stuck in 1992, but I don't, and like many users I do want "webgl, webrtc, webfonts, webm, websockets, new css features and so on".
EDIT I'd have added "You can install links if you want a simple browser letting you read static html documents", which you would have answered with "But I can't, every website requires these features now", to which I'd have answered "a. Yeah, not everyone (that's an understatement) does progressive enhancement, but ultimately b. The times they are a-changing"
It doesn't help your position that you are unable to express it without belittling anybody who disagrees with you, using stuff like "stay stuck in 1992" ("If you don't like America you should go to Russia!").
Also, the links/lynx jokes have really gotten tired, plenty of people browse the web with ublock, no(t)script, webgl and webrtc disabled and so on.
The pretense that anybody who tries to retain a modicum of control over what their browser does and does not do is a luddite is frankly irritating.
And the whole language debate is completely off point; we have plenty of safe(r) languages for writing stuff. The misguided idea is that the only way to do so is to use javascript and stick the resulting program inside the browser.
> It doesn't help your position that you are unable to express it without belittling anybody who disagrees with you, using stuff like "stay stuck in 1992" ("If you don't like America you should go to Russia!").
True, that was useless; I could have just said "I and many users do want these features". Thanks, and sorry anon.
> the links/lynx jokes have really gotten tired, plenty of people browse the web with ublock, no(t)script, webgl and webrtc disabled and so on. The pretense that anybody who tries to retain a modicum of control over what their browser does and does not do is a luddite is frankly irritating.
That wasn't a links joke; I could have phrased it with your own words: "You can install ublock, no(t)script, and disable webgl/webrtc if you want a simple browser letting you read static html documents", and "But I can't, every website requires these features now" would still be an answer.
My conclusion isn't that "anyone trying to retain a modicum of control over what their browser does and does not do is a luddite" --and I do use some of these extensions too--, it's that the barebones web experience anon wants is broken now (and probably forever), due to:
a. Sadly, a failure to respect progressive enhancement in cases where it's possible (documents).
b. The fact that _some_ parts of the web are increasingly not documents, but whole apps whose progressive-enhancement baseline (running without all the bells and whistles) would do nothing because they depend on these features.
> And the whole language debate is completely off point; we have plenty of safe(r) languages for writing stuff. The misguided idea is that the only way to do so is to use javascript and stick the resulting program inside the browser.
Yes. Development practices, testing, fuzzing, and safe(r) languages, like Rust.
Ok, I guess that I misread the tone of your post (i.e. we largely agree).
I don't however think that the web is that broken without those features (javascript being the hardest one to police).
Judging from the browsing habits of my family members, they don't spend nearly as much time inside web applications as the HN news cycle would lead me to believe: some news sites, some webmail (and even there, when presented with a decent-looking mail application they happily switched), the most basic functions of facebook, and "utilities", i.e. web banking, travel, university websites.
None of these uses requires the ability to play quake3 inside firefox, nor are they really applications inside a webpage. Same probably goes for all the browsers in the workplace, for instance.
I'll agree with you that few sites will do progressive enhancement (and decent accessibility); I'm just disappointed in the defeatist attitude of browser vendors and expert users: the idea of having a browser safe mode that you can lock down doesn't strike me as such an impossibility, and it would give some incentive to developers to get their act together.
> None of these uses requires the ability to play quake3 inside firefox, nor are they really applications inside a webpage.
Maybe, for now. But WebRTC/WebSockets have a value proposition for real-time interaction in collaborative office suites. Canvas/WebGL have one for performance in authoring tools and for article illustrations. Documents are readable in your default serif/sans-serif set, but WebFonts are a good designer/author tool just like fonts are in print. Etc... Renouncing this added value because each new feature increases the attack surface sounds like throwing the baby out with the bathwater.
> I'm just disappointed in the defeatist attitude of browser vendors and expert users: the idea of having a browser safe mode that you can lock down doesn't strike me as such an impossibility and it would give some incentive to developers to get their act together.
Two thoughts:
1. Such a "Safe mode" disabling features presents high risks of breaking tons of sites, leaving non-expert users in the dark, and these users are the most likely to be clueless about what's wrong and may just switch to another browser.
In the case of JavaScript, Firefox is actually going the opposite way of what you want, by making it harder to disable [1]. The closest to your wish with Firefox is probably to use their LTS version, ESR, where the dust has settled for a little while longer (but which, ironically, was affected by today's exploit ^^).
2. Can what you are proposing be a "mode"? Take the "Reader View" mode of recent Firefox builds, offering a Readability-like mode that streamlines long reads: this one is clearly a _mode_, you click on it, the text turns big, the page gets sepia, side content disappears; you know you're in it and you're not going to constantly browse with it. But would you alternate between "default" mode and "Safe" mode? What a terrible choice to make: you would certainly stay in "Safe" mode, and at that point it becomes apparent that the browser is constantly altering content, deepening the cluelessness of non-expert users in case of breakage.
2.1. EDIT this reminds me a lot of Polaris tracking protection [2], a project/feature of recent Firefox builds to block http requests of trackers, for privacy. I use the feature, and even I, a moderately "expert" user, was left puzzled when it blocked all the images in an article (can't find it again; it was a Russian article/domain of a photographer exploring the remnants of a space shuttle launch military site). Anyway, Polaris had the images' domain in its blacklist and blocked them. Glancing at the console, I saw Polaris blocking them and disabled it for the time of a page refresh. But how do you handle this simply for non-expert users? This is tough to implement, and directly opposes the "don't break userland" equivalent of the web.
1. Yes, you have highlighted a source of frustration: currently, to limit certain features one must either install half a dozen extensions on chromium or firefox, or stick to ESR versions of firefox, or gtkwebkit browsers (which I'm afraid do lag behind the apple upstream when it comes to security fixes). Hopefully with CEF and servo, swapping out one engine for another will be easier, so the situation may improve a bit.
In an ideal world, this would be what standards are for: all the browsers agree on a set of minimum features, and security-conscious users or administrators can decide to stick to that (I have no clue whether other browser vendors would be interested).
This would break websites in a predictable manner. After all, sooner or later browser vendors will probably decide to break all tls-less websites.
Some websites would be broken, but for people using a screen reader the web is already broken, and at least they would have a clear metric to point at when dealing with banks/news sites/institutions: if it breaks firefox/chrome/safari/edge safe mode, the web designer is doing something wrong.
Similarly, the limits imposed by organizations would help: if you are an enterprise website you must render correctly in this mode. I'm convinced that administrators enforcing a "no IE" policy in the workplace did help move us away from a world in which frontpage's HTML was acceptable.
My parents and users of enterprise workstations don't have a choice of browser anyway: they cannot install software.
2. Sure, the problem with modes is the problem with UAC: you end up asking permission so often that you devalue the role of permissions, or you require the user to constantly check the current status of the application (e.g. the lock icon for SSL), which most users won't do.
Polaris probably suffers from similar problems, as all "restrictive" extensions do.
I'll admit that my solution is squarely aimed at users who cannot switch browser (or cannot switch browser mode), similar to the gatekeeper role of apple on the iphone, only giving the power to switch to administrators/technically advanced users, which apple does not.
>Mozilla is working on adding/replacing parts of Firefox with a language emphasizing security (among other things).
The safety of the implementation language is far from the only concern when considering the security impact of modern browser features. The recent WebRTC issues are well documented, as was the HSTS 'supercookies' issue. Even something seemingly fairly innocuous like css keyframe animation can be used to do remote timing attacks without js to leak browser state such as browsing history [1]. SVG filters in Firefox allowed information to be read from arbitrary pages through timing attacks, till they removed some of the optimisations [2]. Those kinds of things are not solvable with a safer language (in some cases that probably makes fixing timing attacks more difficult/impossible). I'm sure there are more of these kinds of things to be found. Some of them are realistically never going to be fixed now because they are baked into the standards, and the browser vendors clearly care more about animating gizmos and not breaking existing sites than about leaking users' browser state.
>I'd have added "You can install links if you want a simple browser letting you read static html documents", which you would have answered with "But I can't, every website requires these features now", to which I'd have answered "a. Yeah, not everyone (that's an understatement) does progressive enhancement, but ultimately b. The times they are a-changing"
I'm not concerned about myself. I disable stuff like WebGL that I don't use, and I block most Javascript etc etc. My concern is for the average user who has absolutely no idea these features even exist, never mind knowing which ones they can turn off without breaking the sites they use. The general insecurity of the web affects me (and everybody else). When a site gets hacked because one of the admins was exploited by a browser vulnerability and my details get leaked that affects me.
> The safety of the implementation language is far from the only concern when considering the security impact of modern browser features. [...] Those kinds of things are not solvable with a safer language (in some cases that probably makes fixing timing attacks more difficult/impossible). I'm sure there are more of these kinds of things to be found. Some of them are realistically never going to be fixed now because they are baked into the standards, and the browser vendors clearly care more about animating gizmos and not breaking existing sites than about leaking users' browser state.
Good points, I didn't know the SVG exploit had taken so long to fix. Rust (which, as you say, is no silver bullet) is one data point showing Mozilla's commitment to security, but the variance in time-to-fix for exploits is worth considering. Today's exploit was fixed in one day; SVG took 18 months. Why? Did Moz do a good job of prioritizing based on the severity/availability of exploits in the wild, or was the long time to fix SVG just caused by technical difficulties? I don't know, maybe a mozillian involved can comment.
> If you don't want a browser that has some notion of "local file context" you should just sell your laptop and go live in a cave.
Thank you for your constructive advice.
And I note that so far my stuff written in 'memory unsafe languages' has been in production since '99 or so without a compromise to date, over hundreds of billions of requests.
Maybe it's not just the language.
And what business does a browser have with a .pdf file anyway? Where does that end? Excel sheets? Word documents? Proprietary format 'x'? Web browsers should stick to web browsing, or at least have a mode where they will stick to just web browsing.
> And what business does a browser have with a .pdf file anyway? Where does that end? Excel sheets? Word documents? Proprietary format 'x'? Web browsers should stick to web browsing, or at least have a mode where they will stick to just web browsing.
Displaying arbitrary media content is web browsing; the web is an interconnected network of servers providing hypermedia content that is self-describing as to content type so that clients (like browsers) can appropriately choose how to handle content based on its type.
It's true that early web browsers only handled HTML, plain text, and a few image formats internally, and relied on external software to handle all other media -- but all of that, including the parts for which they relied on external software, is part of "web browsing".
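For what it's worth, the "self-describing" part is just the Content-Type header; here is a rough sketch of the dispatch a client does (the URL and the handling choices in the comments are purely illustrative):

    // Rough sketch: the server labels the payload, the client picks a handler.
    async function handleLink(url) {
      const response = await fetch(url);
      const type = response.headers.get('Content-Type'); // e.g. "application/pdf"
      if (type && type.startsWith('text/html')) {
        // parse and display as a page
      } else if (type && type.startsWith('application/pdf')) {
        // render with a built-in viewer, hand off to a plugin, or offer a download
      } else {
        // unknown media type: fall back to saving the file
      }
    }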
Sure, and if I install some plug-in to deal with a proprietary format that's my own doing and risk. But by default a browser should stick to a sensible subset otherwise we might as well author our web-pages in .pdf format instead of HTML.
Anyway, I've already been called grumpy and been told to sell my laptop and go live in a cave, so I'll give HN a miss for the next couple of days or so.
You know, they would try to render Word documents and Excel sheets if they could, if the formats weren't quite as Lovecraftian to display properly as they are.
I bet they'd even try to render .PSD files.
So maybe PDF just hits the bad spot of not being too arcane to implement, yet still being crazy enough to be a gigantic attack surface.
With external pdf viewers the browser asked each time before downloading and showing a pdf. Now it displays pdfs by default. That's why an exploit could sneak through as an advertisement; it couldn't have done that before.
You're getting old and grumpy, Jacques, before you know it you will start your sentences with "back in the day..." :)
On a more serious note, I guess this is the toll we have to pay for innovation pushing. I can understand the reasoning behind writing everything in JS: it allows you to consolidate a lot of mechanisms in a single platform. Once you have that platform secure, any application you will write will (should?) be secure too.
Too bad that theory and practice are usually not the same, in practice..
What innovation and what benefits do I reap by using pdf.js? It's slower and has fewer features than okular. It's stuck inside a firefox window, so I cannot add a window rule for it (barring adding one for firefox in general).
The same holds on windows: why would I use pdf.js when there are faster, lighter pdf readers (e.g. sumatra) or the actual adobe acrobat reader and its eight billion features?
Heck, I've also noticed that many users will skim the file and then forget to save it, so it doesn't even help less tech-savvy users.
There are genuine improvements in the new web technologies, but they are mixed with a lot of stuff that simply does not belong there, and with the insufferable attitude "you can do it in javascript, hence you should do it in javascript" (I'm not criticizing you, eh, and there is some security argument to be made).
Everyone is punished for Windows' Adobe Reader. I never got it either. PDF is not a web format. I would not want to read doc files in my browser either. Evince(-light) starts up in milliseconds.
> I would not want to read doc files in my browser either.
This doesn't make sense to me. Why should the viewer care about the implementation details of a document? If I click on a link to a document, I want to see the result in the browser, and I think that's the correct default. Only if I'm clicking on something which produces something that isn't intended to be a document (an archive, for example) does opening another program make sense as the default.
I agree with you that this innovation isn't a particularly good one. However, a single platform in a language that allows you to develop and test rapidly (which, arguably, javascript is) is a consequence of the ever-increasing push for innovation, which I can understand.
In addition to that, I am very glad that Chrome and Firefox ship with their own PDF readers and I don't have to deal with Adobe anymore to read a portable document format.
The benefit I guess is that it acts like any other webpage. If I click on a link to a pdf, that pdf replaces the page I was looking at. If I click on a link in the pdf, the target of the link replaces the current page. I can have them open in Firefox tabs just like every other thing I look at online.
Printing from pdf.js in linux is a bit of a headache as well, compared to okular or evince. It usually takes about 10 times as long (no joke) to print a pdf from inside firefox.
That's if it works at all! I've found that pdf.js fails to print entirely when the document is sufficiently large. For example, when printing a scanned white paper from 20+ years ago.
I have mupdf in firefox (iceweasel) using mozplugger. I could always set it up to not display pdfs and only download them, to use mupdf through mozplugger, or to use the built-in pdf.js viewer. Having said that, I'm not sure that mupdf is safer than pdf.js, but it's much faster.
I was debating the merit of having the pdf reader bundled and on by default, instead of having the pdf file downloaded like most other files.
One can certainly change firefox's behaviour in the settings (unlike webgl).
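For anyone reading along, a sketch of the pref involved, e.g. dropped into a user.js; the preference name is an assumption based on current Firefox builds, so check about:config on your own version before relying on it:

    // user.js sketch -- pref name assumed from current Firefox builds; verify in about:config.
    user_pref("pdfjs.disabled", true); // fall back to downloading / an external viewer instead of pdf.js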
I disagree that you are the arbiter of defining what innovation is, and the rest of your comment is similarly far outside of the bounds of the applicability of your opinion.
I'm aware of those. It's the 'in-band/out-of-band' problem of old rearing its ugly head again: if you mix code/control and data in one stream, it's asking for trouble.
The whole reader is written in Javascript: https://mozilla.github.io/pdf.js/