
pandoc


To reinforce this: pandoc has been the go-to for a long, long time, and its maintainers have encountered and addressed tons of issues, which is especially important for two underspecified and over-provisioned formats like HTML and PDF.

Go through the revision and bug history to see a sample of issues you're avoiding by using a highly-trafficked, well-supported solution.

The only reason not to use it is when they say they don't support a given feature that you need; and the nice thing there is that they'll usually say it, and have a good reason why.

The other reason to use pandoc is that while you might currently want PDF as your outbound format, you might end up preferring some other format (structured logically instead of by layout); with pandoc that change would be easy.

Finally, pandoc is extensible. If you do find that you want different output in some respect, you can easily write a plugin (in Python or Haskell or ...) to make exactly the tweak you need.
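
To make the extensibility point concrete: a pandoc JSON filter is just a program that reads the document's JSON AST on stdin and writes a modified AST to stdout. Here is a minimal sketch using only the Python standard library. The names `walk` and `shout` are made up for this example, and upper-casing text is only a stand-in for whatever tweak you actually need.

```python
#!/usr/bin/env python3
"""Sketch of a pandoc JSON filter: upper-case every Str node."""
import json
import sys


def walk(node, fn):
    """Rebuild the AST bottom-up, applying fn to every dict node."""
    if isinstance(node, list):
        return [walk(item, fn) for item in node]
    if isinstance(node, dict):
        rebuilt = {key: walk(value, fn) for key, value in node.items()}
        return fn(rebuilt)
    return node


def shout(node):
    """Upper-case the text of Str inline elements; leave everything else alone."""
    if node.get("t") == "Str":
        return {"t": "Str", "c": node["c"].upper()}
    return node


def main():
    raw = sys.stdin.read()
    if raw.strip():  # pandoc always pipes a document in; do nothing on empty input
        json.dump(walk(json.loads(raw), shout), sys.stdout)


if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

Saved as an executable `shout.py` (hypothetical file name), it would be wired in with something like `pandoc input.html --filter ./shout.py -o output.pdf`.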


Doesn't pandoc rely on some engine itself?


Yep, you need something like XeTeX in order to render the PDF.


Curious why that matters to you?

I mean everything has dependencies (some of the solutions elsewhere require Chrome and other common solutions require the JVM). At least Pandoc is GPL.


It matters because pandoc is not rendering the website to PDF; it converts the HTML to LaTeX and then uses a LaTeX engine to render the PDF.
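
Roughly, the pipeline described here looks like this from the command line. This is a sketch, not a recommendation: the file names are placeholders, `xelatex` is assumed to be the installed engine, and the helper names `one_step`, `two_step`, and `run` are invented for illustration. The pandoc options used (`-f`, `-t`, `-s`, `-o`, `--pdf-engine`) are standard ones.

```python
import subprocess


def one_step(src, pdf, engine="xelatex"):
    """Single call: pandoc generates LaTeX internally and invokes the engine."""
    return ["pandoc", src, "-o", pdf, f"--pdf-engine={engine}"]


def two_step(src, tex, engine="xelatex"):
    """Same pipeline made explicit: pandoc writes a standalone .tex file,
    then the LaTeX engine is run on that file separately."""
    return [
        ["pandoc", src, "-f", "html", "-t", "latex", "-s", "-o", tex],
        [engine, tex],
    ]


def run(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Example (requires pandoc and a LaTeX distribution to be installed):
# run([one_step("page.html", "page.pdf")])
# run(two_step("page.html", "page.tex"))
```

Either way, nothing in this pipeline runs a browser layout engine, which is the distinction being drawn.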


Forgive me, but I don’t understand why that matters to you, and I am trying to understand what the issue with LaTeX is.

Because lots of things work this way. For example, compilers built on LLVM use an intermediate representation, and Python uses bytecode.

I suspect some HTML-to-PDF tools go through PostScript.


There are multiple ways to "depend", so if pandoc has some external tool do all of the work, then you might as well use that external tool directly. You will get more control over how the conversion happens, know what to search for when you're in trouble, etc.


My understanding and experience is that LaTeX has a significant learning curve and Pandoc provides a gentler front end.

Of course LaTeX gives you fine control to hand-tune the engine…but that doesn’t seem like what the OP is looking for.


Sure, I don't mean that anyone would look at the LaTeX in between. I'm just saying that if tool x directly calls tool y to do the job, then you might as well use tool y directly.

Since hammers and nails are a common tool-workpiece example…consider the nail gun.

Theoretically you can drive nails with a .22 caliber blank cartridge without making the “call” through a nail gun. But you won’t finish laying shingles as quickly and easily…

Or to put it another way, there’s a reason assemblers are almost always better than machine code and compilers are almost always better than assemblers for the ends people care about.

I mean, why use LaTeX at all when you could write your own typesetting language? Maybe because you are not a Knuth.


You're confusing wrappers with alternatives. The comparison is more like this: if somebody published a script called html-to-pdf.sh which directly calls, e.g., chrome, would you want to use this script or use chrome directly? I would prefer the latter because (1) I would know what actually does the conversion, and (2) I would know what to search for on the web should I need to tweak the output. This knowledge gives me more power, as I know the actual converter. The wrapper script perhaps only helps with what the command line should be.

Does pandoc do JavaScript? For stuff that is rendered (I don't want animated, interactive PDFs...).
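
For what it's worth, a wrapper like the hypothetical html-to-pdf.sh would probably boil down to a single headless Chrome invocation. A sketch, assuming the binary is on the PATH as `chromium` (the name varies by platform and install) and with the helper name `chrome_pdf_command` made up here:

```python
import subprocess


def chrome_pdf_command(url, pdf, chrome="chromium"):
    """Build the headless-Chrome command that prints a page to PDF.

    `chrome` is an assumed binary name; it may be `google-chrome`,
    `chromium-browser`, or a full path on your system.
    """
    return [chrome, "--headless", f"--print-to-pdf={pdf}", url]


# Example (requires a Chrome/Chromium install):
# subprocess.run(chrome_pdf_command("https://example.com", "page.pdf"), check=True)
```

Knowing that this is all the wrapper does is what tells you to search for Chrome's printing options, not the script's, when the output needs tweaking.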



