Let’s just get this out in the open: I hate the GPL. I find it absolutely repulsive. The GPL has its uses, but unfortunately, those uses are not what most people assume when they imagine open source software. Part of this is undoubtedly a product of the era and culture in which the GPL was born, but not all of it. The GPL exists and continues to gain popularity in part because it is already popular, as circular as that is.
Converting a single HTML file to an ePub is straightforward, and many free tools exist for that purpose. But if your goal is to convert multiple HTML files, and only a portion of each file, into an eBook with a proper table of contents, cover image, etc., what do you do?
All of these requirements are necessary for creating a professional ePub, yet surprisingly no tool existed that could meet all of them without considerable manual effort. Like any good software developer faced with no tool for the job and manual work as the only other option, I took the laziest path and created a new tool to get the job done.
That new tool is called html2epub, a command-line app which can:
- Generate a professional-looking ePub from a series of web pages
- Strip out unnecessary HTML
- Convert HTML into XHTML to comply with the ePub spec
- Embed images
- Embed Gist code snippets
- Rewrite chapter to chapter links for proper ePub navigation
- Support table of contents navigation
- Support forms-based authentication
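To make the link rewriting step more concrete, here is a minimal Python sketch of the general idea. It is not html2epub's actual implementation; the function name, the `chapter_files` mapping, and the in-book file names are all hypothetical, and a regular expression stands in for a real HTML parser.

```python
import re
from urllib.parse import urljoin, urldefrag


def rewrite_chapter_links(html, page_url, chapter_files):
    """Point links to other chapters at their in-book files.

    chapter_files is a hypothetical mapping from an absolute source URL
    to an in-book file name such as "chapter2.xhtml".
    """
    def replace(match):
        quote, href = match.group(1), match.group(2)
        # Resolve relative links against the page they came from,
        # and split off any "#fragment" so the lookup ignores it.
        absolute, fragment = urldefrag(urljoin(page_url, href))
        target = chapter_files.get(absolute)
        if target is None:
            return match.group(0)  # Not a chapter: leave the link alone.
        if fragment:
            target = f"{target}#{fragment}"
        return f"href={quote}{target}{quote}"

    # Simplification: matches quoted href attributes only.
    return re.sub(r'href=(["\'])(.*?)\1', replace, html)
```

With a mapping such as `{"https://example.com/ch2": "chapter2.xhtml"}`, a link to `/ch2#intro` found on `https://example.com/ch1` becomes `chapter2.xhtml#intro`, while links to pages outside the book are left untouched.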
I have tried to keep this utility as simple to use as possible, despite its many features. Let’s look at how to get started.
On macOS, Homebrew makes installing html2epub simple. Just run:
brew install jwhitehorn/brew/html2epub
This will download and install html2epub and its dependencies, and register the command in your PATH. With that completed, you can generate an ePub as easily as:
html2epub --url https://www.datasyncbook.com \
  --toc ./example/toc.xhtml \
  --cover ./example/cover.png \
  --contents ./example/contents.json \
  --title "Data Synchronization" \
  --subtitle "Patterns, Tools, & Techniques" \
  --author "Jason Whitehorn"
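Once a book has been generated, a quick sanity check of the file can be useful. The sketch below is not part of html2epub; it checks two container rules from the ePub spec: the first zip entry must be an uncompressed file named mimetype containing application/epub+zip, and META-INF/container.xml must be present. The function name is made up for illustration, and the path you pass in is whatever file your run produced, since the output location is not covered here.

```python
import zipfile


def check_epub_container(path):
    """Return a list of problems found in the ePub's zip container."""
    problems = []
    with zipfile.ZipFile(path) as book:
        entries = book.infolist()
        if not entries or entries[0].filename != "mimetype":
            problems.append("first entry is not 'mimetype'")
        else:
            first = entries[0]
            # The mimetype entry must be stored, not deflated.
            if first.compress_type != zipfile.ZIP_STORED:
                problems.append("'mimetype' is compressed")
            if book.read(first) != b"application/epub+zip":
                problems.append("'mimetype' has the wrong content")
        if "META-INF/container.xml" not in book.namelist():
            problems.append("META-INF/container.xml is missing")
    return problems
```

An empty list means the container passed these basic checks. It does not validate the book's contents; a full validator such as EPUBCheck is the better choice for that.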