
Firefox has rewritten its entire rendering engine, and its entire CSS engine. Writing a browser is incredibly hard, but not infeasible.


While writing a rendering engine isn't easy, it's extremely easy compared to the HTML part. Even Mozilla isn't in a hurry to rewrite all that code.


Depends what you mean by the "HTML part". The HTML parser, which is the only HTML-specific part, was replaced in Firefox 4, with one implementing the algorithm defined in the WHATWG HTML spec.

There's definitely less motivation to rewrite the DOM code, in large part because there's a lot less benefit to be had from rewriting large parts of it (versus layout or style, where there's much more entanglement across the codebase).


HTML parsing is not that hard compared to CSS/layout/fonts (or even figuring out layout), a competitive JavaScript engine, and the myriad of APIs and site compatibility problems OP talked about.

My HTML parser uses SGML, which is more generic: it takes the HTML grammar (a DTD) as a parameter and computes state machine tables etc. dynamically from it. That makes it a bit harder, but still very much doable.


Does that HTML parser follow all the HTML5 parsing/error-handling rules, so that it conforms to the spec's behavior for random tag soup full of broken markup? Or are you assuming "clean" HTML?


No, it follows the normative description of HTML as specified in chapter 4 of the HTML spec. The redundant procedural spec for parsing HTML is strictly aimed at browser implementers, and in particular at reaching the same behaviour across browsers in the presence of errors. Note that the covered fragment still contains the rich tag omission/inference rules for HTML and other minute details, based on formal SGML techniques, though.
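
To give a rough idea of what grammar-driven tag inference means, here is a minimal Python sketch. It is not the parser described above and does no real SGML/DTD processing: `TOY_GRAMMAR`, `STRICT_GRAMMAR` and `normalize` are hypothetical names, the content models are made up for illustration, and the grammar is interpreted directly instead of being compiled into state machine tables. The only point is that the omission rules live in data passed as a parameter, not in the parsing code.

```python
import re

# Hypothetical toy grammar, NOT the real HTML DTD: for each element, which
# child elements it may contain directly and whether its end tag may be omitted.
TOY_GRAMMAR = {
    "ul": {"children": {"li"}, "end_optional": False},
    "li": {"children": {"ul", "p"}, "end_optional": True},
    "p": {"children": set(), "end_optional": True},
}

# Same content models, but no end tag may be omitted.
STRICT_GRAMMAR = {
    name: {"children": rule["children"], "end_optional": False}
    for name, rule in TOY_GRAMMAR.items()
}

# Attribute-less tags and plain text only; anything else is silently skipped.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)>|([^<]+)")


def normalize(markup, grammar):
    """Return markup with every omitted end tag written out explicitly.

    Text placement is not validated; only element nesting is checked.
    """
    out, stack = [], []

    def close_top(reason):
        # Close the innermost open element, but only if the grammar allows
        # its end tag to be inferred.
        top = stack[-1]
        if not grammar[top]["end_optional"]:
            raise ValueError(f"cannot infer </{top}>: {reason}")
        out.append(f"</{stack.pop()}>")

    for match in TOKEN.finditer(markup):
        closing, name, text = match.groups()
        if text is not None:
            out.append(text)
            continue
        name = name.lower()
        if name not in grammar:
            raise ValueError(f"unknown element <{name}>")
        if not closing:
            # Start tag: infer end tags until the open element accepts it.
            while stack and name not in grammar[stack[-1]]["children"]:
                close_top(f"<{name}> not allowed here")
            stack.append(name)
            out.append(f"<{name}>")
        else:
            if name not in stack:
                raise ValueError(f"unmatched </{name}>")
            # End tag: infer end tags of elements nested inside the match.
            while stack[-1] != name:
                close_top(f"</{name}> seen first")
            out.append(f"</{stack.pop()}>")

    # End of input: everything still open must have an omissible end tag.
    while stack:
        close_top("end of input")
    return "".join(out)


if __name__ == "__main__":
    print(normalize("<ul><li>a<li>b</ul>", TOY_GRAMMAR))
```

With `TOY_GRAMMAR` the missing `</li>` tags are inferred; passing `STRICT_GRAMMAR` to the same function makes the same input an error, which is the sense in which the grammar is a parameter. Note that this rejects anything outside the grammar instead of recovering from it, so it says nothing about tag-soup handling.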



