The EPUB file format

EPUB is short for electronic publication. It is a file format for publishing books and other types of content in a reflowable fashion, which means that the content adapts itself to fit the available screen space. An EPUB file can be viewed on a 3.5″ cell phone as well as on a 10″ tablet or the 22″ monitor of a desktop computer. If the line width changes, the text reflows to make optimum use of the available screen real estate. Images can be scaled to achieve the same effect.
An EPUB publication consists of a single file that has the file extension .epub. Below is the official logo for EPUB files, published in June 2010 by IDPF.
Logo for the EPUB file format
This page provides a general overview of the EPUB file format. It discusses

  • The basics of the file structure
  • Font handling
  • Image embedding
  • How to create EPUB files
  • How to view or read EPUB files
  • Troubleshooting EPUB files
  • A comparison with other file formats

The page ends with some pointers to other interesting sites.

The file format

EPUB is based on three open standards:

  • Open Publication Structure (OPS) – An EPUB 2.0 file uses XHTML 1.1 to construct the content of a publication. In essence, this means that an EPUB file consists of one or more web pages. Even though you could include the entire content of a book or newspaper in a single page, it is better to keep each such file under 300 KB, both for performance and compatibility reasons. Just like with regular web pages, the styling and layout are defined using cascading style sheets (CSS). EPUB files must use a subset (a limited set of properties) of CSS2. Many of the new features of CSS3, such as rounded boxes or drop shadows, are not available yet. For backwards compatibility, a creator can also use DTBook instead of XHTML to encode the content.
  • Open Packaging Format (OPF) – This part of the specs deals with structural information such as metadata (who is the author and the publisher, what is the title, ...), the manifest (a list of all the files inside an EPUB file) and the table of contents. These data are all stored as XML.
  • Open Container Format (OCF) – As the descriptions above should have made clear, an EPUB document consists of a series of files. The OCF specs define how all those files end up being packaged in one single container file. ZIP compression is used for this. If you take an EPUB file and change the .epub extension to .zip, you can decompress the publication and take a look at all those files.
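
To make the three layers a bit more tangible, here is a minimal Python sketch that opens an EPUB as a ZIP archive, follows META-INF/container.xml to the OPF package file, and reads the title and the manifest from it. The sample files, the function names and the "Sample Book" title are made up for illustration, and the sample is deliberately stripped down: it is not a complete EPUB and would not pass a validator.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the container file and the OPF package file.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, stripped-down sample files (not a complete, valid EPUB).
CONTAINER_XML = (
    '<container version="1.0" '
    'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
    '<rootfiles><rootfile full-path="OEBPS/content.opf" '
    'media-type="application/oebps-package+xml"/></rootfiles></container>'
)
OPF_XML = (
    '<package version="2.0" xmlns="http://www.idpf.org/2007/opf">'
    '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<dc:title>Sample Book</dc:title></metadata>'
    '<manifest><item id="ch1" href="chapter1.xhtml" '
    'media-type="application/xhtml+xml"/></manifest>'
    '<spine><itemref idref="ch1"/></spine></package>'
)
CHAPTER_XHTML = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Chapter 1</title>'
    '</head><body><p>Hello, reader.</p></body></html>'
)


def build_sample_epub():
    """Build a tiny EPUB-like archive in memory (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # OCF: the mimetype file comes first and is stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        zf.writestr("OEBPS/content.opf", OPF_XML)          # OPF layer
        zf.writestr("OEBPS/chapter1.xhtml", CHAPTER_XHTML)  # OPS layer
    buf.seek(0)
    return buf


def describe_epub(source):
    """Return (mimetype, title, manifest hrefs) for a path or file object."""
    with zipfile.ZipFile(source) as zf:
        mimetype = zf.read("mimetype").decode("ascii").strip()
        # container.xml tells us where the OPF package file lives.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", NS)
        package = ET.fromstring(zf.read(rootfile.get("full-path")))
        title = package.findtext("opf:metadata/dc:title", namespaces=NS)
        hrefs = [item.get("href")
                 for item in package.findall("opf:manifest/opf:item", NS)]
    return mimetype, title, hrefs


mimetype, title, hrefs = describe_epub(build_sample_epub())
print(mimetype, title, hrefs)
```

The same describe_epub function should also accept the path of a real .epub file on disk, since zipfile does not care about the file extension.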

Font handling in EPUB files

There are two mechanisms for handling fonts in EPUB files:

  • Since an EPUB file is a collection of web pages, fonts can be referenced in the CSS.
  • Embedding fonts: Font embedding is a technique in which character shapes (glyphs) are included in a file. As far as I know, it is best done using OpenType fonts and @font-face in CSS.
    • Useful to maintain a specific look & style.
    • Practical for math books or to maintain diacriticals (accents,…) and other non-ASCII characters.
    • Not all EPUB readers support embedded fonts.
    • Not many font licenses allow for this type of distribution.
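
One way to see which of the two mechanisms a given file uses is to look inside the archive for font files and for @font-face rules in its style sheets. The Python sketch below does this under a few assumptions: fonts are recognized only by the .otf and .ttf extensions, the CSS is parsed with simple regular expressions rather than a real CSS parser, and the "Demo Serif" font and all file names are hypothetical.

```python
import io
import re
import zipfile

FONT_EXTENSIONS = (".otf", ".ttf")  # assumption: only these two are checked

# Hypothetical style sheet that embeds one font with @font-face.
SAMPLE_CSS = """@font-face {
  font-family: "Demo Serif";
  src: url(fonts/DemoSerif.otf);
}
body { font-family: "Demo Serif", serif; }
"""


def build_sample_archive():
    """Build an in-memory archive with a style sheet and a fake font file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("OEBPS/style.css", SAMPLE_CSS)
        # Placeholder bytes, not a real font.
        zf.writestr("OEBPS/fonts/DemoSerif.otf", b"not a real font")
    buf.seek(0)
    return buf


def find_embedded_fonts(source):
    """Return (font file names, font families declared via @font-face)."""
    with zipfile.ZipFile(source) as zf:
        names = zf.namelist()
        font_files = [n for n in names if n.lower().endswith(FONT_EXTENSIONS)]
        families = []
        for name in names:
            if not name.lower().endswith(".css"):
                continue
            css = zf.read(name).decode("utf-8", errors="replace")
            # Grab the body of every @font-face rule in this style sheet.
            for block in re.findall(r"@font-face\s*\{([^}]*)\}", css):
                match = re.search(r"font-family\s*:\s*([^;]+)", block)
                if match:
                    families.append(match.group(1).strip().strip("'\""))
    return font_files, families


font_files, families = find_embedded_fonts(build_sample_archive())
print(font_files, families)
```

If both lists come back empty, the publication most likely only references fonts by name in its CSS and relies on whatever the reading device has installed.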

Images in EPUB files

The EPUB file format supports the following image file formats:

  • GIF
  • JPG
  • PNG
  • SVG, which means an EPUB file can also contain vector graphics, e.g. for logos or maps

The standard is flexible, meaning that it is permitted to use additional file formats, such as Flash. If that is the case, the EPUB file needs to include an alternate rendering of that file in one of the four supported file formats listed above. This means that you can embed a movie in an EPUB file, but the software creating the file should then, for example, take one of the initial frames of the movie and embed it as a JPEG image. A reader device such as an e-book reader that cannot display movies can then display the JPEG image instead.
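
In an EPUB 2 package file, this alternate rendering is declared with the fallback attribute on a manifest item, which points to the id of another item. The Python sketch below walks those fallback chains and reports items that never reach a supported format. It is only an illustration: the CORE_TYPES set is a partial list that I assume is sufficient for this example, and the manifest, ids and file names are hypothetical.

```python
import xml.etree.ElementTree as ET

OPF = "{http://www.idpf.org/2007/opf}"

# Assumption: a partial list of natively supported media types, covering the
# four image formats above plus XHTML, CSS and the NCX table of contents.
CORE_TYPES = {
    "image/gif", "image/jpeg", "image/png", "image/svg+xml",
    "application/xhtml+xml", "text/css", "application/x-dtbncx+xml",
}

# Hypothetical manifest: one movie with a JPEG fallback, one without.
SAMPLE_OPF = (
    '<package version="2.0" xmlns="http://www.idpf.org/2007/opf"><manifest>'
    '<item id="movie" href="movie.swf" '
    'media-type="application/x-shockwave-flash" fallback="movie-still"/>'
    '<item id="movie-still" href="movie.jpg" media-type="image/jpeg"/>'
    '<item id="orphan" href="clip.swf" '
    'media-type="application/x-shockwave-flash"/>'
    '</manifest></package>'
)


def items_missing_fallback(opf_xml):
    """Return ids of manifest items with no fallback in a supported format."""
    package = ET.fromstring(opf_xml)
    items = {item.get("id"): item for item in package.iter(OPF + "item")}
    missing = []
    for item_id, item in items.items():
        seen = set()
        current = item
        # Follow the fallback chain until a supported type is reached.
        while current is not None and current.get("media-type") not in CORE_TYPES:
            if current.get("id") in seen:  # guard against circular chains
                current = None
                break
            seen.add(current.get("id"))
            fallback_id = current.get("fallback")
            current = items.get(fallback_id) if fallback_id else None
        if current is None:
            missing.append(item_id)
    return missing


print(items_missing_fallback(SAMPLE_OPF))
```

In the sample, the item with id "movie" is fine because its chain ends at a JPEG image, while "orphan" has no fallback at all and is reported.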

How to view EPUB files

I often use Adobe Digital Editions, a free EPUB viewer from Adobe, to read EPUB files. Digital Editions can be a bit quirky. If an EPUB file contains errors, it may crash.
Since Firefox is my main browser, I also have the EPUBReader plug-in installed.

With the exception of the Amazon Kindle, most other e-readers on the market can display EPUB files. Since I am not that happy with the currently available devices, I don’t have much experience with them.

How to create EPUB files

For people who are familiar with designing for print, using Adobe InDesign to create EPUB files may be the easiest solution. InDesign's “Export for Digital Editions” menu option can be used to export a publication to an EPUB file. You can find a set of tutorials here.
You can find an interesting summary of EPUB creation tools here.

Troubleshooting EPUB files

No information available yet

Other sources of information

The full specifications of EPUB can be found on the IDPF site.

8 August 2013

11 Responses to “The EPUB file format”

  1. Parisien says:

    Very useful…. !!

  2. suba says:

    Thank u very much.

  3. Deborah says:

    Thanks! My cover art (jpg) is showing as page one and the cover that exports comes out generic. Do you know what I am doing wrong?

    Thanks!
    Deborah

    • Laurens says:

      My knowledge is limited to using Adobe InDesign and Agfa Apogee Media to generate EPUB files. It isn’t even clear if you are using either application or how the cover art is embedded in the source file. Sorry, but troubleshooting this type of problem can be difficult and time consuming.

  4. Laurie says:

    My graphics (jpg) are showing up way too large when I convert to epub, and in a different location on the page than they are in the original file. Any ideas on how to fix that? Also, my title page is repeated as the first page of the document with wacky spacing.

    By the way, there’s an apostrophe issue in this article – in the “images” section above, the word “logos” shouldn’t include an apostrophe – it’s plural, not possessive. Just thought I’d mention it.

    • Prem CM says:

      You can change the graphics to PNG format. If you want to keep the JPG format, you can change the resolution using any image editor. Regarding the title page, maybe you are using a converter for ePUB. In case of any repetition of the title page, cover, etc., you can delete those files in the OEBPS folder. I am not clear about the apostrophe.

  5. Edwin M. says:

    I have one copy of an out-of-print book I wrote years ago, containing both text and visuals. I want to convert these pages to an eBook format (ePub?).

    Is this possible, or can such a conversion only come from a digital file … not hard copy?

    Can I scan each page first and somehow convert these scanned pages into an ePub file?

    Ed

    • Laurens says:

      ePub files aren’t meant to contain scans of full pages. Given the limited to non-existent scaling options in ebook readers, a file created that way would not be very practical. It is better to use an OCR application to convert the scanned text to actual text and redo the layout of the book as a digital file.

  6. Bastette says:

    “ZIP compression is used for this. If you take an EPUB file and change the .epub extension to .zip, you can decompress the publication and take a look at all those files.”

    Thank you for this excellent advice! This worked perfectly when nothing else did.

  7. Fraggy says:

    You can also create EPUBs with Atlantis Word Processor:

    http://www.atlantiswordprocessor.com/en/help/ebook.htm

    It can convert any existing document to EPUB. You can also use it to create a new EPUB from scratch.

  8. Malcolm says:

    How do I place tables in EPUB? I have tables with Greek and Hebrew letters in them. Also, would these fonts have to be embedded in the EPUB file?

