edit: changed ".pdf" from a slightly longer approach to ".html" which actually exists in this workflow. Thanks @TheCabin!
edit(2): ... and gave a valid path to this listdir. Check before I post... check before I post...
edit(3): ... and removed the os.listdir line no longer needed in this approach. Gosh. Just ignore what I was saying and build your own approach. That'll probably be faster at this rate.
Avoids the awkward file system structure by using pdfkit.from_url, but creates one .pdf for each chapter. I tried using a list of urls, but pdfkit failed because my version of wkhtmltopdf did not accept multiple input files.
Edit: pdfkit.from_file also fails on my system when passing multiple files. If that works for you, multiple urls are probably fine, too.
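For anyone who wants to try the one-PDF-per-chapter route, here is a minimal sketch, not the exact code from the answer. The helper names `pdf_name_for_url` and `convert_chapters` are made up for illustration, and the `convert` parameter exists only so the loop can be tried without wkhtmltopdf installed. By default it falls back to `pdfkit.from_url`, called with one URL at a time.

```python
import posixpath
from urllib.parse import urlparse


def pdf_name_for_url(url):
    """Derive a local PDF file name from the last path segment of a chapter URL."""
    stem = posixpath.splitext(posixpath.basename(urlparse(url).path))[0]
    # A URL ending in "/" has no file name, so fall back to "index".
    return (stem or "index") + ".pdf"


def convert_chapters(urls, convert=None):
    """Render each URL to its own PDF and return the output names in order."""
    if convert is None:
        import pdfkit  # third-party; needs the wkhtmltopdf binary installed
        convert = pdfkit.from_url
    outputs = []
    for url in urls:
        out = pdf_name_for_url(url)
        convert(url, out)  # one input per call, never a list of URLs
        outputs.append(out)
    return outputs
```

Calling `convert_chapters([...])` with your own list of chapter URLs should leave one PDF per chapter in the working directory. Merging them is a separate step.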
Absolutely right. That was copied by mistake from a version that converted each HTML file separately, with the intention of merging the results afterwards. Thank you for pointing that out!
For what it's worth, I got pretty decent results converting one file at a time with the wkhtmltopdf version from the Ubuntu repository (0.9.9).
But no, that version can't collect several inputs into a single output file.
I think it was a mistake to try shaving off a couple of lines and a step at the price of a more convoluted install and a brittle process.
Most people would probably be best off just converting one HTML file at a time to PDF (e.g. pdfkit.from_file(filename_in_html, os.path.splitext(filename_in_html)[0] + ".pdf") in some sort of iteration over the file names) and then concatenating the resulting PDFs, for instance from the command line by
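A minimal sketch of that loop, under a few assumptions: the helper names are invented here, `convert` is injectable purely so the logic can be tested without wkhtmltopdf, and the concatenation step uses poppler's `pdfunite`, which is only one of several tools that can do this and not necessarily the one meant above.

```python
import os


def pdf_path_for(html_path):
    """Swap the extension for .pdf (works for both .htm and .html)."""
    return os.path.splitext(html_path)[0] + ".pdf"


def convert_each(html_paths, convert=None):
    """Convert every HTML file to its own PDF; return the PDF paths in order."""
    if convert is None:
        import pdfkit  # third-party; needs the wkhtmltopdf binary installed
        convert = pdfkit.from_file
    pdf_paths = []
    for html_path in html_paths:
        pdf_path = pdf_path_for(html_path)
        convert(html_path, pdf_path)  # a single input file per call
        pdf_paths.append(pdf_path)
    return pdf_paths


def pdfunite_command(pdf_paths, merged_path):
    """Build the argument list for pdfunite: inputs first, output file last."""
    return ["pdfunite", *pdf_paths, merged_path]
```

The list returned by `pdfunite_command(convert_each(files), "book.pdf")` can be handed to `subprocess.run(..., check=True)`, where `book.pdf` is just a placeholder output name.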
Look, I'm not planning on printing the whole thing, binding it and sticking it on my shelf. I want the pdf because then I can use it when I'm offline, across devices and search it easily.
If I want this book in hard copy, then I will purchase it - I've done this regularly with free digital books - but when it is offered free digitally, then in my opinion restricting it to only certain file formats is futile (as evidenced here), and such constraints are ineffective attempts to encourage people to buy the hard copy through inconvenience.
And I must add that this is no slight to the authors, who have my greatest appreciation for compiling their vast knowledge into a book and offering it for free. These guys are legends.
I guess this could be automated. For instance, you could download all HTML files using a plugin like "DownThemAll" with a renaming mask like "inum-nameinum.ext" and then try:
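Whatever conversion step you run on the downloads, the files have to be processed in chapter order, and a plain alphabetical sort puts "10-..." before "2-...". A small sketch of a numeric sort, assuming the running number ends up at the start of each file name. The function names and the example file names are hypothetical.

```python
import re


def numeric_prefix(name):
    """Return the leading integer of a file name, or infinity if there is none."""
    match = re.match(r"\d+", name)
    return int(match.group()) if match else float("inf")


def in_download_order(names):
    """Sort by numeric prefix; names without one go last, ties sort alphabetically."""
    return sorted(names, key=lambda n: (numeric_prefix(n), n))


print(in_download_order(["10-appendix.html", "2-intro.html", "1-cover.html"]))
```

The sorted list can then be fed, one file at a time, to whichever converter works on your system.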
There are also tools to convert the resulting files to a single PDF. The only problem I ran into is that the WOFF fonts are not rendered by "wkhtmltopdf" :-/ Any ideas?