Book, Open Source

Generating ePub Books From HTML

Converting a single HTML file to an ePub is straightforward, with many free tools available for this purpose. But, if your goal is to convert multiple HTML files, and only a portion of each file, into an eBook with a proper table of contents, cover image, etc., what do you do?

This was exactly the crossroads I found myself at when attempting to create an ePub version of my book. Each chapter of the book was represented by a unique web page, and I needed an automated way of quickly downloading all of them and combining them into an eBook. To make things more interesting, only a portion of each page was necessary – who wants to see a web page’s header, footer, and navigation bar in an ePub? Additionally, images needed to be downloaded and embedded into the ePub, and GitHub Gist code snippets needed to be downloaded and represented without the use of GitHub’s JavaScript tags.

All of these requirements are necessary for creating a professional ePub, yet surprisingly no tool existed that could meet them all without considerable manual effort. Like any good software developer faced with a job that no tool can do and manual work as the only alternative, I took the laziest path and created a new tool to get the job done.

Introducing html2epub

That new tool is called html2epub and is a command line app which can:

  • Generate a professional looking ePub from a series of web pages
  • Strip out unnecessary HTML
  • Convert HTML into XHTML to comply with the ePub spec
  • Embed images
  • Embed Gist code snippets
  • Rewrite chapter-to-chapter links for proper ePub navigation
  • Support Table of Contents navigation
  • Support forms-based authentication
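To make the "strip out unnecessary HTML" and "convert HTML into XHTML" items concrete, here is a minimal Python sketch of the general idea. It is not html2epub's actual implementation; the names ContentExtractor and extract_fragment and the "chapter" id are hypothetical, and the code assumes the page markup is well nested.

```python
from html import escape
from html.parser import HTMLParser

# Void elements never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ContentExtractor(HTMLParser):
    """Collects the markup inside the first element whose id matches."""

    def __init__(self, target_id):
        super().__init__()  # character references arrive already decoded
        self.target_id = target_id
        self.depth = 0      # nesting depth inside the target element
        self.parts = []
        self.done = False

    @staticmethod
    def _render(tag, attrs, self_closing=False):
        # Re-escape attribute values; bare attributes get their name as value.
        rendered = "".join(
            f' {name}="{escape(value if value is not None else name, quote=True)}"'
            for name, value in attrs
        )
        return f"<{tag}{rendered}{' /' if self_closing else ''}>"

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0:
            # Still outside the target: only look for the element to keep.
            if dict(attrs).get("id") == self.target_id and tag not in VOID_TAGS:
                self.depth = 1
            return
        if tag in VOID_TAGS:
            # XHTML requires void elements to be self-closed.
            self.parts.append(self._render(tag, attrs, self_closing=True))
        else:
            self.depth += 1
            self.parts.append(self._render(tag, attrs))

    def handle_endtag(self, tag):
        if self.done or self.depth == 0 or tag in VOID_TAGS:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True  # closed the target element itself
        else:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(escape(data, quote=False))


def extract_fragment(html, target_id):
    """Return the inner markup of the element with the given id as XHTML-style text."""
    parser = ContentExtractor(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


page = ('<html><body><nav>Menu</nav><div id="chapter"><h1>Intro</h1>'
        '<p>A &amp; B<br>next</p><img src="a.png"></div>'
        '<footer>bye</footer></body></html>')
fragment = extract_fragment(page, "chapter")
```

The navigation bar and footer are dropped, and the kept markup is re-serialized with self-closed void elements and re-escaped text. A real converter would also need to handle badly nested HTML, scripts, and namespaces, which this sketch ignores.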

I have tried to keep this utility as simple to use as possible, despite its many features. Let’s look at how to get started.

Getting Started

On macOS, brew makes installing html2epub simple. Just run:

brew install jwhitehorn/brew/html2epub

This will download and install html2epub and its dependencies, and register the command in your PATH. With that completed, you can generate an ePub as easily as:

html2epub --url https://www.datasyncbook.com \
  --toc ./example/toc.xhtml \
  --cover ./example/cover.png \
  --contents ./example/contents.json \
  --title "Data Synchronization" \
  --subtitle "Patterns, Tools, & Techniques" \
  --author "Jason Whitehorn"

Internet

IPv6 Test Website

I put together a quick, but fun, website for testing your local IPv6 compatibility – 🐟🌮.ws.

If you can visit 🐟🌮.ws, then your current network has functioning IPv6. If you get a “server not found” (or similar) error, then you cannot currently browse IPv6-only websites.
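The same check can be approximated from a script. The following is a rough Python sketch, not how the test website works: the function name can_reach_ipv6 is hypothetical, and the result only means something if the host you pass actually publishes an IPv6 address.

```python
import socket


def can_reach_ipv6(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host succeeds over IPv6."""
    try:
        # Ask only for IPv6 (AAAA) results.
        candidates = socket.getaddrinfo(
            host, port, socket.AF_INET6, socket.SOCK_STREAM
        )
    except OSError:
        return False  # name did not resolve to any IPv6 address

    for family, socktype, proto, _canonname, sockaddr in candidates:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return True
        except OSError:
            continue  # try the next address, if any
    return False
```

A False result can mean either that the host has no IPv6 address or that your network cannot route IPv6, so it is most useful against a host known to be IPv6-only.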

Objective-C, Quick Tips

Constructing an NSArray with NSString Copies

The scenario might sound specific, but I am confident you’ve encountered something similar before. You need to construct an array containing a known number of duplicates of a string. Perhaps you’re constructing a template and need a fixed number of placeholder elements, or you’re parameterizing a query and need a dynamic number of placeholders. In either case, you were probably left writing a rather ugly bit of logic in the middle of a routine that was otherwise focused on the task at hand.
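To illustrate the query scenario, here is a small Python sketch. It only shows the shape of the problem and is not the Objective-C technique the post describes; the placeholders helper and the books table are made up for the example.

```python
def placeholders(count, token="?"):
    """Return a list holding `count` copies of `token`."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return [token] * count


# One placeholder per value, joined into an IN (...) clause.
ids = [3, 5, 8]
query = "SELECT * FROM books WHERE id IN ({})".format(
    ", ".join(placeholders(len(ids)))
)
```

Here query ends up with exactly one "?" per id, ready to be passed to a database driver along with the values.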

Humor, Javascript

The Mostly Last Day of February

Jason / February 28, 2018

Today is the last day of February – mostly. It is a day overshadowed by the far more popular, but less frequently occurring, Leap Day. Why is February 29th called “Leap Day”? It’s not the one being leaped over; that’s poor February 28th – what did it ever do?