Ask HN: How to develop/plan a document format
2 points by mgualt on July 14, 2012 | 2 comments
This is a question about how to approach the design and development of a document format.

The basic problem is this: in math (and other disciplines), we write a LaTeX markup file and then compile it to PDF. PDF is basically a paper simulator and lacks many features which webpages/sites can have.

An alternative would be to have a versatile but still constrained markup language (e.g. an extension of LaTeX, or a constrained org-mode) which can be compiled or "exported" to HTML among other formats.

Some of the benefits would include

- folding of text: hiding parts of the document until further details are desired

- more sophisticated linking between (parts of) documents

- nonlinear/hierarchical document structure

- including media such as animation/tutorials

- running code in the document

The question is how to approach the development of such a system. For example, many of these features could be implemented in jQuery or something similar, but the system should be independent of any particular implementation. How does one proceed in a future-proof way whereby the choices are not regretted down the line?

Note that this is not about typesetting -- I am aware of the web typesetting problem. This is about the inadequacy of PDF as a document format for the future.
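One way to picture "independent of implementation" is a document held as plain data, with each output format written as a separate function over it. The sketch below is only an illustration under assumptions: the node types (`text`, `link`, `fold`) and their field names are hypothetical, and the HTML renderer happens to use the standard `<details>`/`<summary>` elements for folding.

```javascript
// Hypothetical document tree: plain data, no reference to any renderer.
const doc = {
  type: 'fold',
  summary: 'Proof',
  children: [
    { type: 'text', value: 'See ' },
    { type: 'link', target: '#lemma-1', children: [{ type: 'text', value: 'Lemma 1' }] },
  ],
};

// Escape characters that are special in HTML text and attribute values.
const escapeHtml = (s) =>
  s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;').replace(/"/g, '&quot;');

// Exporter 1: HTML. Folding maps onto <details>/<summary>.
function toHtml(node) {
  const inner = () => node.children.map(toHtml).join('');
  switch (node.type) {
    case 'text': return escapeHtml(node.value);
    case 'link': return `<a href="${escapeHtml(node.target)}">${inner()}</a>`;
    case 'fold': return `<details><summary>${escapeHtml(node.summary)}</summary>${inner()}</details>`;
    default: throw new Error(`unknown node type: ${node.type}`);
  }
}

// Exporter 2: plain text. Same tree, no folding or hyperlinks available.
function toPlainText(node) {
  const inner = () => node.children.map(toPlainText).join('');
  switch (node.type) {
    case 'text': return node.value;
    case 'link': return `${inner()} (${node.target})`;
    case 'fold': return `${node.summary}: ${inner()}`;
    default: throw new Error(`unknown node type: ${node.type}`);
  }
}

console.log(toHtml(doc));
console.log(toPlainText(doc));
```

If the browser-side code ever has to be replaced, only `toHtml` changes; the tree and the other exporter stay as they are.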



> How does one proceed in a future-proof way whereby the choices are not regretted down the line?

What's wrong with HTML that has classnames, or the new data attributes that HTML5 has? They're not dependent on any one implementation and HTML seems like it's the most future-proof thing out there apart from plain text.

In order:

- <div class='expando'>Hide/Show this text.</div>

- <a href="#something">Go somewhere</a>

- <a href="otherdoc.html">Hyperlink</a>

- <video src="hello.mp4" controls></video>

- <code>console.log('hello world!');</code>

Almost any JS developer can make these work without any library, using only what the browser provides. Give the textContent of the code element to eval, for example. It seems like the most future-proof way to go, and I'll be happy to be proven wrong and find out about something I don't know about. But neither Markdown nor any of its derivatives seems to be good enough.
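A minimal sketch of that wiring with no library, assuming the markup above: the `expando` class name comes from the list, while the function names and the generated button are made up for illustration. It reads `textContent` rather than `innerHTML`, since `innerHTML` returns entity-escaped markup (`<` comes back as `&lt;`) that would not run as written.

```javascript
// Flip the standard `hidden` property on an element; returns the new state.
function toggleFold(el) {
  el.hidden = !el.hidden;
  return el.hidden;
}

// Run the text of a <code> element as a function body and return its result.
// Like eval, this executes arbitrary code: only use it on trusted documents.
function runCode(el) {
  return new Function(el.textContent)();
}

// Attach behaviour to every foldable block and every code snippet in `doc`.
function wireDocument(doc) {
  for (const el of doc.querySelectorAll('.expando')) {
    const button = doc.createElement('button');
    button.textContent = 'Show/Hide';
    button.addEventListener('click', () => toggleFold(el));
    el.before(button); // keep the control visible while the block is hidden
  }
  for (const el of doc.querySelectorAll('code')) {
    el.addEventListener('click', () => runCode(el));
  }
}

// Only wire things up when a browser DOM is actually present.
if (typeof document !== 'undefined') {
  document.addEventListener('DOMContentLoaded', () => wireDocument(document));
}
```

The markup itself carries no behaviour, so this script could be swapped for a different one later without touching the documents.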

Of course, you could also write a Markdown parser that has constructs that will compile to the above too.
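As a rough sketch of that idea, here is a toy compiler from a made-up constrained markup to the HTML constructs above. The syntax is an assumption invented for this example, not Markdown or any existing format: `{{fold` and `}}` on their own lines delimit a foldable block, and `[[target|label]]` is a link.

```javascript
// Escape characters that are special in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Escape the line first, then turn [[target|label]] into a hyperlink.
function renderInline(text) {
  return escapeHtml(text).replace(
    /\[\[([^|\]]+)\|([^\]]+)\]\]/g,
    (_, target, label) => `<a href="${target}">${label}</a>`
  );
}

// Compile the source line by line; blank lines are skipped.
function compile(source) {
  const out = [];
  for (const raw of source.split('\n')) {
    const line = raw.trim();
    if (line === '') continue;
    if (line === '{{fold') out.push("<div class='expando'>");
    else if (line === '}}') out.push('</div>');
    else out.push(`<p>${renderInline(line)}</p>`);
  }
  return out.join('\n');
}

console.log(compile('{{fold\nSee [[#something|the proof]].\n}}'));
```

This version does not check that `{{fold` and `}}` are balanced, so a real compiler would need to validate nesting before emitting HTML.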

> This is about the inadequacy of PDF as a document format for the future.

And since we're having this discussion in an HTML document, I'm guessing HTML will be sufficient instead.


Thanks for the reply.

I don't know about HTML5, so I will look into that. But my concern is more about how to design the system. Let me explain.

LaTeX is quite a formidable system nowadays, with a well-developed markup language optimized for mathematics and technical writing, as well as a compiler which does advanced typesetting (for PDF output, that is). Because of the way the system is organized, there are thousands of packages available for LaTeX which essentially extend the markup language and provide new functionality to the compiler to produce diagrams, for example, or certain alternate PDF formats, etc. The amount of work done in the LaTeX domain in terms of development is quite amazing.

In view of this, is there a way of extending LaTeX in such a way that all this functionality ports over to the creation of more active documents/webpages?



