Difference between revisions of "ePub"

(documenting exported dir/file structure)
(section about zipping)
 
(8 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
{{todo|This page is work in progress and an update to the current pages [[Epub]] and [[Epub_Sample]].}}
 
{{todo|This page is work in progress and an update to the current pages [[Epub]] and [[Epub_Sample]].}}
  
< [[XML]] | [[HTML]] | [[Epub|Old ePub docs]] | [[Epub_Sample|Old ePub Sample]] >
+
< [[XML]] | [[Export]] | [[Epub|Old ePub docs]] | [[Epub_Sample|Old ePub Sample]] >
  
Beware, these ePub/HTML facilities of ConTeXt work only with MkIV since some version in December 2014 and require additional work on your source code!
+
The ePub facilities of ConTeXt are based on its [[Export]] of XML/XHTML. Make sure you get useful export output from your project before you try ePub.
  
= Minimal example and structure of export files =
+
= Documentation about ePub =
  
<texcode>
+
* [https://en.wikipedia.org/wiki/EPUB Wikipedia entry] with some examples
% mode=mkiv
+
* [http://idpf.org/epub/20/spec/OPF_2.0.1_draft.htm Open Packaging Format] (OPF) 2.0.1 specs; contains also information about NCX
\setupbackend[export=yes]
+
* [https://www.iso.org/standard/53255.html ePub3 ISO/IEC TS 30135-1:2014] (Buy specs for 118 CHF?)
 
+
* [https://github.com/w3c/epubcheck ePub Check] validator (most validation tools are based on this)
\starttext
 
\input tufte
 
\stoptext
 
</texcode>
 
 
 
If you compile this as {{code|1=minimal.tex}}, you get a directory structure like this:
 
 
 
<pre>
 
minimal.tex
 
minimal.log
 
minimal.pdf
 
minimal.tuc
 
minimal-export
 
├── cover.xhtml
 
├── images
 
├── minimal-div.xhtml
 
├── minimal-pub.lua
 
├── minimal-raw.xml
 
├── minimal-tag.xhtml
 
└── styles
 
    ├── minimal-defaults.css
 
    ├── minimal-images.css
 
    ├── minimal-styles.css
 
    └── minimal-templates.css
 
</pre>
 
 
 
We will further refer to these files without the prefix ("minimal-"). We reformatted the code copies a bit to make them smaller and better readable.
 
 
 
== div.xhtml ==
 
 
 
<xmlcode>
 
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
 
<!--
 
    input filename  : minimal
 
    processing date  : Sat Jan 17 17:43:58 2015
 
    context version  : 2014.12.29 10:01
 
    exporter version : 0.33
 
-->
 
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:math="http://www.w3.org/1998/Math/MathML">
 
    <head>
 
        <meta charset="utf-8"/>
 
        <title></title>
 
<link type="text/css" rel="stylesheet" href="styles/minimal-defaults.css" />
 
<link type="text/css" rel="stylesheet" href="styles/minimal-images.css" />
 
<link type="text/css" rel="stylesheet" href="styles/minimal-styles.css" />
 
    </head>
 
    <body>
 
        <div class="warning">Rendering can be suboptimal because there is no default/fallback css loaded.</div>
 
<div>
 
We thrive in information--thick worlds because of our marvelous and everyday capacity to select, edit, single out, structure, highlight, group, pair, merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats.
 
</div>
 
    </body>
 
</html>
 
</xmlcode>
 
 
 
== tag.xhtml ==
 
 
 
<xmlcode>
 
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
 
<!--
 
    input filename  : minimal
 
    processing date  : Sat Jan 17 17:43:58 2015
 
    context version  : 2014.12.29 10:01
 
    exporter version : 0.33
 
-->
 
<?xml-stylesheet type="text/css" href="styles/minimal-defaults.css" ?>
 
<?xml-stylesheet type="text/css" href="styles/minimal-images.css" ?>
 
<?xml-stylesheet type="text/css" href="styles/minimal-styles.css" ?>
 
<document href="minimal" language="en" date="Sat Jan 17 17:43:58 2015" context="2014.12.29 10:01" xmlns:m="http://www.w3.org/1998/Math/MathML" file="minimal" version="0.33">
 
We thrive in information--thick worlds because of our marvelous and everyday capacity to select, edit, single out, structure, highlight, group, pair, merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats.
 
</document>
 
</xmlcode>
 
 
 
== raw.xml ==
 
 
 
<xmlcode>
 
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
 
<!--
 
    input filename  : minimal
 
    processing date  : Sat Jan 17 17:43:58 2015
 
    context version  : 2014.12.29 10:01
 
    exporter version : 0.33
 
-->
 
<?xml-stylesheet type="text/css" href="styles/minimal-defaults.css" ?>
 
<?xml-stylesheet type="text/css" href="styles/minimal-images.css" ?>
 
<?xml-stylesheet type="text/css" href="styles/minimal-styles.css" ?>
 
<document language="en" date="Sat Jan 17 17:43:58 2015" context="2014.12.29 10:01" xmlns:m="http://www.w3.org/1998/Math/MathML" file="minimal" version="0.33">
 
We thrive in information--thick worlds because of our marvelous and everyday capacity to select, edit, single out, structure, highlight, group, pair, merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats.
 
</document>
 
</xmlcode>
 
 
 
== pub.lua ==
 
 
 
<texcode>
 
return {
 
["htmlfiles"]={ "minimal-div.xhtml" },
 
["htmlroot"]="minimal-div.xhtml",
 
["identifier"]="3ce74458-4cdd-829d-ace4-cf535fb00519",
 
["imagefile"]="styles/minimal-images.css",
 
["imagepath"]="images",
 
["images"]={},
 
["language"]="en",
 
["name"]="minimal",
 
["stylepath"]="styles",
 
["styles"]={ "minimal-defaults.css", "minimal-images.css", "minimal-styles.css" },
 
["xhtmlfiles"]={ "minimal-tag.xhtml" },
 
["xmlfiles"]={ "minimal-raw.xml" },
 
}
 
</texcode>
 
  
 
= Minimal ePub =
 
= Minimal ePub =
  
If you run {{code|1=mtxrun --script epub --make minimal}} on above example, you get a structure like:
+
If you already ran ConTeXt on your project and got the file structure as described in [[Export]], you can run {{code|1=mtxrun --script epub --make minimal}} to get a structure like:
  
 
<pre>
 
<pre>
Line 144: Line 35:
 
</pre>
 
</pre>
  
* epub: The epub file is just a zipped version of the directory structure.
+
* .epub: The epub file is just a zipped version of the directory structure.
 
* mimetype: contains only "{{code|1=application/epub+zip}}"
 
* mimetype: contains only "{{code|1=application/epub+zip}}"
 
* container.xml points to the root file "minimal.opf"
 
* container.xml points to the root file "minimal.opf"
 
* cover.xhtml is the cover image shown by ePub readers
 
* cover.xhtml is the cover image shown by ePub readers
* opf: list of all resources, this keeps the "book" together. See also [[http://www.idpf.org/epub/20/spec/OPF_2.0.1_draft.htm|OPF specs]].
+
* -div.xhtml: your content
* ncx: table of contents
+
* .opf: list of all resources, this keeps the "book" together. See also [http://www.idpf.org/epub/20/spec/OPF_2.0.1_draft.htm OPF specs].
 
+
* .ncx: table of contents
That’s nice, but the contents make no sense yet. We have to add more structure and metadata.
 
 
 
= Required structuring =
 
 
 
The export contains usable content only for content that is "well structured" in an XML sense. That means, you need to mark everything, from markup spans over paragraphs and enumeration items to chapters and parts with {{code|1=\start... … \stop...}}, for example:
 
 
 
<texcode>
 
...
 
\startsection[title={A section}]
 
 
 
\startparagraph
 
 
 
\input tufte
 
 
 
\startitemize[packed,joinup]
 
  \startitem First \stopitem
 
  \startitem Second \stopitem
 
  \startitem Third \stopitem
 
  \startitem Fourth\stopitem
 
\stopitemize
 
 
 
\stopparagraph
 
 
 
\startparagraph
 
\input knuth
 
\stopparagraph
 
  
\startparagraph
+
= Compressing the structure =
\input hagen
 
\stopparagraph
 
  
\stopsection
+
The file structure outlined above gets packaged into a ZIP archive with the .epub extension.
...
 
</texcode>
 
  
= Minimal useful example =
+
* <tt>mimetype</tt> must be the first directory entry of the archive.
 +
* You must not use "extra file attributes", e.g. from macOS (AKA resource forks) or NTFS; use the X parameter of zip.
 +
* You don’t want some other lurking files like macOS’s <tt>.DS_Store</tt> directory information or PDF images; exclude them from recursive packaging.
  
<texcode>
+
    EPUB=minimal.epub
% mode=mkiv
+
    zip -u9X  $EPUB mimetype
\setupbackend[export=yes]
+
    zip -u9X  $EPUB META-INF/container.xml
\setupmainlanguage[en]
+
    zip -u9rX $EPUB OEBPS -x OEBPS/.DS_Store OEBPS/Text/.DS_Store OEBPS/Images/.DS_Store "*/.DS_Store" "OEBPS/Images/*.pdf"
  
\starttext
 
  
\startparagraph
+
= TODO =
\input tufte
 
\stopparagraph
 
  
\stoptext
+
Write about
</texcode>
+
* Cover
 +
* ToC
 +
* Readers
 +
* Styling
 +
* Images
 +
* Better workflow

Latest revision as of 09:35, 6 March 2020


TODO: This page is a work in progress; it updates the current pages Epub and Epub_Sample. (See: To-Do List)



The ePub facilities of ConTeXt are based on its Export of XML/XHTML. Make sure you get useful export output from your project before you try ePub.

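Documentation about ePub

  • Wikipedia entry with some examples (https://en.wikipedia.org/wiki/EPUB)
  • Open Packaging Format (OPF) 2.0.1 specs; also contains information about NCX (http://idpf.org/epub/20/spec/OPF_2.0.1_draft.htm)
  • ePub3 ISO/IEC TS 30135-1:2014 (https://www.iso.org/standard/53255.html; buy specs for 118 CHF?)
  • ePub Check validator; most validation tools are based on this (https://github.com/w3c/epubcheck)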

Minimal ePub

Once you have run ConTeXt on your project and obtained the file structure described in Export, run mtxrun --script epub --make minimal to get a structure like this:

minimal.epub
minimal-epub
├── META-INF
│   └── container.xml
├── OEBPS
│   ├── cover.xhtml
│   ├── images
│   ├── minimal-div.xhtml
│   ├── minimal.opf
│   ├── nav.xhtml
│   ├── styles
│   │   ├── minimal-defaults.css
│   │   ├── minimal-images.css
│   │   └── minimal-styles.css
│   └── toc.ncx
└── mimetype
  • .epub: The epub file is just a zipped version of the directory structure.
  • mimetype: contains only "application/epub+zip"
  • container.xml: points to the root file "minimal.opf"
  • cover.xhtml: the cover page shown by ePub readers
  • -div.xhtml: your content
  • .opf: list of all resources; this keeps the "book" together. See also the OPF specs.
  • nav.xhtml: navigation document (table of contents) for ePub 3 readers
  • .ncx: table of contents (NCX, for ePub 2 readers)
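
To illustrate how container.xml "points to" the root file, here is a minimal Python sketch that reads the full-path attribute of the first rootfile element. The sample XML is an assumption based on the general OCF container layout; the file that mtxrun actually writes may differ in detail, and the function name find_root_file is made up for this example.

```python
import xml.etree.ElementTree as ET

# XML namespace used by OCF container files.
CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"

# Illustrative container.xml (assumption: not copied from real mtxrun output).
SAMPLE_CONTAINER = b"""<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/minimal.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def find_root_file(container_xml: bytes) -> str:
    """Return the path of the package (.opf) file named in container.xml."""
    root = ET.fromstring(container_xml)
    rootfile = root.find(
        f"{{{CONTAINER_NS}}}rootfiles/{{{CONTAINER_NS}}}rootfile"
    )
    if rootfile is None:
        raise ValueError("container.xml has no rootfile element")
    return rootfile.get("full-path")


if __name__ == "__main__":
    print(find_root_file(SAMPLE_CONTAINER))
```

An ePub reader does the same lookup: it opens META-INF/container.xml first and then loads the .opf file found there, which in turn lists all other resources.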

Compressing the structure

The file structure outlined above gets packaged into a ZIP archive with the .epub extension.

  • mimetype must be the first entry of the archive and must be stored uncompressed.
  • Do not store "extra file attributes", e.g. from macOS (AKA resource forks) or NTFS; use the -X option of zip.
  • Exclude stray files such as macOS’s .DS_Store directory information or PDF images from recursive packaging.
   # run inside the minimal-epub directory, starting from a fresh archive
   EPUB=minimal.epub
   rm -f "$EPUB"
   zip -X0  "$EPUB" mimetype                  # stored uncompressed, first entry
   zip -X9  "$EPUB" META-INF/container.xml
   zip -X9r "$EPUB" OEBPS -x "*/.DS_Store" "OEBPS/images/*.pdf"
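
Where the zip command line tool is not available, the same packaging rules can be sketched with Python's standard zipfile module. This is only an illustration under assumptions, not the way mtxrun builds the archive: the function names build_epub, is_excluded and mimetype_is_first are invented for this example, and PDF files are skipped everywhere rather than only in the images directory.

```python
import os
import zipfile

MIMETYPE = "application/epub+zip"


def is_excluded(rel_path: str) -> bool:
    """Skip macOS .DS_Store files and PDF images (assumption: PDFs anywhere)."""
    name = os.path.basename(rel_path)
    return name == ".DS_Store" or name.lower().endswith(".pdf")


def build_epub(source_dir: str, epub_path: str) -> None:
    """Package source_dir (e.g. minimal-epub) into an .epub archive."""
    with zipfile.ZipFile(epub_path, "w") as archive:
        # mimetype goes in first and uncompressed; a plain ZipInfo entry
        # carries no extra field.
        archive.writestr(
            zipfile.ZipInfo("mimetype"), MIMETYPE,
            compress_type=zipfile.ZIP_STORED,
        )
        for folder, subdirs, files in os.walk(source_dir):
            subdirs.sort()  # deterministic order of directories
            for name in sorted(files):
                full = os.path.join(folder, name)
                rel = os.path.relpath(full, source_dir).replace(os.sep, "/")
                if rel == "mimetype" or is_excluded(rel):
                    continue
                archive.write(full, rel, compress_type=zipfile.ZIP_DEFLATED)


def mimetype_is_first(epub_path: str) -> bool:
    """Check that the first entry is an uncompressed, correct mimetype file."""
    with zipfile.ZipFile(epub_path) as archive:
        first = archive.infolist()[0]
        return (
            first.filename == "mimetype"
            and first.compress_type == zipfile.ZIP_STORED
            and archive.read(first) == MIMETYPE.encode("ascii")
        )
```

A call such as build_epub("minimal-epub", "minimal.epub") followed by mimetype_is_first("minimal.epub") covers the first requirement above; a full check of the result is still best left to ePub Check.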


TODO

Write about

  • Cover
  • ToC
  • Readers
  • Styling
  • Images
  • Better workflow