Epub Sample

Revision as of 14:08, 17 January 2015 by Hraban

TODO: Beware, this doesn’t fit the current export files and ePub workflow as of January 2015! (See: To-Do List)



Creating an ebook with ConTeXt is still tedious and needs a lot of manual work. That will not change, since everyone has different needs and uses different structures. Here I’ll show you my workflow for creating ebooks of my song booklets (which use LilyPond via the filter module for the notes).

I’m using ConTeXt’s project structure, separating content into products (for me: single booklets) and components (for me: single songs) with a common stylesheet (environment).

Beware, you need a current beta version of ConTeXt, since Hans fixed some export-related bugs in the last few days!

--Hraban 13 September 2014.

ConTeXt setup

In your environment or product, you need these settings (perhaps not all of them):

\setupexport[
	hyphen=yes,
	%firstpage={cover.jpg}, % is ignored
	title={Songbook},
	subtitle={},
	author={Hraban}
]
\setupbackend[export=export.xml]
\settaggedmetadata[
	% here you can set as many metadata entries as you like
	title={Songbook},
	name=ebook, % this becomes the name of the output directory
	author={Hraban},
	subtitle={},
	version={\expanded\currentdate} % doesn’t work
]
\setupinteraction[state=start,
	color=,contrastcolor=,
	% these settings are for PDF metadata
	title={Songbook},
	subtitle={},
	keywords={},
	author={Hraban}
]

\definehighlight[emph][style=italic] % use \emph{something} instead of {\em something}

Make sure to tag all your structural elements with \start... and \stop... pairs, e.g. \startchapter and \stopchapter, and even paragraphs with \startparagraph and \stopparagraph!

In places where \startparagraph does not work, such as itemizations, where it causes a blank line after the bullet and before the item text, use \bpar (and closing \epar) to tag paragraphs.

Then you can call ConTeXt and its ePub script:

context mysongbook
mtxrun --script epub --make mysongbook

The first creates export.xml and a bunch of other files. The second creates a directory ebook.tree with the proper structure for ePub. The ePub file in the tree directory is unusable.

We’ll mostly work with "export.xml", which contains all your content (check it: anything that was not properly tagged will be missing).
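
A quick way to make that check is to count a few element names in the export. This is only a rough sketch: count_tag is a hypothetical helper name, and the element names (section, paragraph, image) are simply the ones that the XSL on this page handles, so adapt them to your own structure.

```shell
#!/bin/bash
# count_tag FILE ELEMENT: number of opening tags of ELEMENT in FILE
# (hypothetical helper, not part of ConTeXt); it only matches tags
# whose name is followed by a space or ">"
count_tag() {
	grep -o "<$2[ >]" "$1" | wc -l | tr -d ' '
}

# only report if an export exists in the current directory
if [ -f export.xml ]; then
	for TAG in section paragraph image; do
		echo "$TAG: $(count_tag export.xml "$TAG")"
	done
fi
```

If a count is much lower than you expect from your sources, some of that content was probably not tagged.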

ePub structure

This is the directory structure that we need for our ePub:

/songbook.tree/
├── META-INF
│   └── container.xml
├── OEBPS
│   ├── Fonts
│   │   └── somefont.otf
│   ├── Images
│   │   ├── ...
│   │   ├── c_farewell-1.png
│   │   ├── c_farewell-2.png
│   │   ├── c_farewell.png
│   │   ├── ...
│   │   └── cover.jpg
│   ├── Styles
│   │   └── style.css
│   ├── Text
│   │   ├── _intro.html
│   │   ├── aut_1.html
│   │   ├── ...
│   │   └── aut_99.html
│   ├── cover.html
│   ├── songbook.opf
│   └── toc.ncx
└── mimetype

We use mimetype and container.xml unchanged from ConTeXt’s epub script and (re)create everything else. At the end this structure is just zipped and given an "epub" file extension.
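
Before zipping, it can help to verify that the tree really matches this layout. The following sketch assumes the directory name songbook.tree from the listing; check_tree is a hypothetical helper, and the list of files it looks for is just a selection from the tree above.

```shell
#!/bin/bash
# check_tree DIR: report files missing from an ePub tree (hypothetical helper)
check_tree() {
	tree=$1
	status=0
	# mimetype must contain this string
	if [ "$(cat "$tree/mimetype" 2>/dev/null)" != "application/epub+zip" ]; then
		echo "mimetype missing or wrong"
		status=1
	fi
	for f in META-INF/container.xml OEBPS/toc.ncx OEBPS/cover.html OEBPS/Styles/style.css; do
		if [ ! -f "$tree/$f" ]; then
			echo "missing: $f"
			status=1
		fi
	done
	return $status
}

# example call; prints nothing if the tree is complete
if [ -d songbook.tree ]; then
	check_tree songbook.tree || echo "songbook.tree is incomplete"
fi
```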

Unchanged files

For the record:

mimetype

application/epub+zip

container.xml

<?xml version="1.0" encoding="UTF-8"?>

<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
    <rootfiles>
        <rootfile full-path="OEBPS/songbook.opf" media-type="application/oebps-package+xml"/>
    </rootfiles>
</container>

Transform XML to HTML

Even though the ePub format is supposed to work with any XML, most readers only accept (X)HTML. I use the free version of Saxon and some XSL transformations for the conversion.

The call looks like saxon -o:content.xhtml -s:export.xml -xsl:export2html.xsl. I installed Saxon on my Mac with MacPorts; in that case you must call java -jar /opt/local/share/java/saxon9he.jar instead of just "saxon".
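
If you switch between machines, a small wrapper can choose between the two ways of calling Saxon. This is a sketch under assumptions: it presumes that a command named saxon, if present, accepts the same options as the jar, and transform is a hypothetical helper that only prints the call.

```shell
#!/bin/bash
# Use a plain "saxon" command if one is on the PATH,
# otherwise fall back to the MacPorts jar mentioned above.
if command -v saxon >/dev/null 2>&1; then
	SAXON="saxon"
else
	SAXON="java -jar /opt/local/share/java/saxon9he.jar"
fi

# transform OUT XSL: print the Saxon call that turns export.xml into OUT
# (hypothetical helper; remove the "echo" to really run the transformation)
transform() {
	echo $SAXON -o:"$1" -s:export.xml -xsl:"$2"
}

transform content.xhtml export2html.xsl
```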

export2html.xsl

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version= "2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output 
	method="xml"
	encoding="utf-8"
	indent="yes"
	omit-xml-declaration="yes"
/>

<xsl:variable name="within-paragraph">0</xsl:variable >
<xsl:variable name="within-section">0</xsl:variable >
<xsl:variable name="previous-section">0</xsl:variable >

<xsl:template match='section'>
<!--
	<xsl:if test="@detail='part'">
		<xsl:text disable-output-escaping="yes"><![CDATA[</div></body></html>]]></xsl:text>
	</xsl:if>
-->
    <xsl:result-document method="xml" href="aut_{@implicit}.html">
		<html xmlns="http://www.w3.org/1999/xhtml">
		<head>
			<meta charset="utf-8" />
			<title><xsl:value-of select='./sectiontitle'/></title>
			<xsl:for-each select="//metavariable">
			<meta>
				<xsl:attribute name="name">
					<xsl:value-of select="@name"/>
				</xsl:attribute>
				<xsl:attribute name="content">
					<xsl:apply-templates/>
				</xsl:attribute>
			</meta>
			</xsl:for-each>
			<link rel="stylesheet" href="../Styles/style.css" type="text/css" ></link>
		</head>
		<body>
			<xsl:attribute name="lang">
				<xsl:value-of select='//document/@language'/>
			</xsl:attribute>
			<!--
			<xsl:variable name="previous-section">{$within-section}</xsl:variable >
			<xsl:variable name="within-section">{@detail}</xsl:variable >
			-->
			<xsl:apply-templates/>
			<!--
			<xsl:variable name="within-section">{$previous-section}</xsl:variable >
			<xsl:variable name="previous-section">0</xsl:variable >
			-->
		</body>
		</html>
    </xsl:result-document>
</xsl:template> 


<xsl:template match="sectiontitle">
	<xsl:choose>
		<xsl:when test="../@detail='part'">
			<h1><xsl:apply-templates/></h1>
		</xsl:when>
		<xsl:when test="../@detail='chapter'">
			<h2><xsl:apply-templates/></h2>
		</xsl:when>
		<xsl:when test="../@detail='Titel'">
			<h2><xsl:apply-templates/></h2>
		</xsl:when>
		<xsl:when test="../@detail='TitelKlein'">
			<h2><xsl:apply-templates/></h2>
		</xsl:when>
		<xsl:when test="../@detail='section'">
			<h3><xsl:apply-templates/></h3>
		</xsl:when>
		<xsl:when test="../@detail='subsection'">
			<h4><xsl:apply-templates/></h4>
		</xsl:when>
		<xsl:otherwise>
			<h6 class="{../@detail}"><xsl:apply-templates/></h6>
		</xsl:otherwise>
	</xsl:choose>
</xsl:template>

<xsl:template match="sectioncontent">
<div class="{../@detail}-content">
	<xsl:apply-templates/>
</div>
</xsl:template>

<xsl:template match="externalfilter">
<div class="{@detail}">
	<xsl:apply-templates/>
</div>
</xsl:template>

<xsl:template match="lines">
<div class="lines">
	<xsl:apply-templates/>
</div>
</xsl:template>

<xsl:template match="line">
	<xsl:apply-templates/> <br />
</xsl:template>

<xsl:template match="list">
<ul>
	<xsl:apply-templates/>
</ul>
</xsl:template>

<xsl:template match="listitem">
<li><xsl:apply-templates/></li>
</xsl:template>

<xsl:template match="listcontent">
<span class="listcontent"><xsl:apply-templates/></span>
</xsl:template>

<xsl:template match="listpage">
<span class="listpage"><xsl:apply-templates/></span>
</xsl:template>

<xsl:template match="break">
<!-- XSLT variables cannot be reassigned, so test the context instead of a flag -->
<xsl:choose>
	<xsl:when test="ancestor::paragraph">
		<br/>
	</xsl:when>
	<xsl:otherwise>
		<xsl:text disable-output-escaping="yes"><![CDATA[</p><p>]]></xsl:text>
	</xsl:otherwise>
</xsl:choose>
</xsl:template>

<xsl:template match="paragraph">
<p><xsl:apply-templates/></p>
</xsl:template>

<xsl:template match="delimited">
<span class="delim-{@detail}"><xsl:apply-templates/></span>
</xsl:template>

<xsl:template match="construct">
<div class="struct-{@detail}"><xsl:apply-templates/></div>
</xsl:template>

<xsl:template match="highlight">
<xsl:if test="@detail = 'emph'">
<em><xsl:apply-templates/></em>
</xsl:if>
</xsl:template>

<!-- all the images in my songbook are notes generated by LilyPond via t-filter; in ePub I shorten the file names -->


<xsl:template match="image">
<img src="../Images/{substring-after(@name,'prd_hraban-temp-lilypond-')}.png" id="{@id}" alt="{@name}" />
</xsl:template>

<xsl:template match="link">
<a href="{@location}" title="{@destination}"><xsl:apply-templates/></a>
</xsl:template>

<xsl:template match="register">
</xsl:template>

<xsl:template match="registerentry">
</xsl:template>

<xsl:template match="registerpage">
</xsl:template>


</xsl:stylesheet>

style.css

Now that you have (hopefully usable) (X)HTML files, you need a CSS file for the styling.

A simple example:

body {
	font-family: TeX Gyre Schola, Century Schoolbook, serif;
	font-size: 12pt;
	margin: 0 auto;
	max-width: 40em;
	line-height: 1.44;
	-webkit-hyphens: auto;
	-moz-hyphens: auto;
	-ms-hyphens: auto;
	hyphens: auto;
}

h1, h2, h3, h4, h5, h6 {
	font-family: TeX Gyre Heros, Helvetica, Arial, sans-serif;
	color: #286000;
	-webkit-hyphens: manual;
	-moz-hyphens: manual;
	-ms-hyphens: manual;
	hyphens: manual;
}

.Titel, .chapter {
	margin-top: 1em;
	border-top: 1px solid #600a00;
}

img {
	max-width: 100%;
	max-height: 100%;
}

.lilypond {
	margin: 1em 0;
}

.lilypond img {
	width: 100%;
	margin-bottom: 0.25em;
}

Create a Cover

Create a cover.html:

export2cover.xsl

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version= "2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output 
	method="xml"
	encoding="utf-8"
	indent="yes"
	omit-xml-declaration="yes"
/>

<xsl:template match="/">
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <title><xsl:value-of select='//metavariable[@name="title"]'/></title>
	<link rel="stylesheet" href="Styles/style.css" type="text/css" ></link>
    </head>
    <body>
        <div>
		<img src="Images/cover.jpg">
			<xsl:attribute name="alt">
				<xsl:value-of select='//metavariable[@name="title"]'/>
			</xsl:attribute>
		</img>
        </div>
    </body>
</html>
</xsl:template>

</xsl:stylesheet>

If you don’t have a different cover picture, create one from the first page of your PDF (using ImageMagick’s convert):

convert -density 196 songbook.pdf'[0]' +repage cover.jpg

Create OPF

The OPF file ties the others together: it lists all resources of the ebook.

You will have to adapt the listing of images (since I use only PDFs converted to PNG) and add fonts manually, if you ship any.

export2opf.xsl

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version= "2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output 
	method="xml"
	encoding="utf-8"
	indent="yes"
/>

<xsl:template match="/">
<package version="2.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="BookId">
<!-- see http://www.idpf.org/epub/20/spec/OPF_2.0.1_draft.htm -->

    <!-- OPF 2.0 expects all metadata inside the metadata element, as Dublin Core (dc:) elements -->
    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
        <dc:title><xsl:value-of select='//metavariable[@name="title"]'/></dc:title>
        <dc:language><xsl:value-of select='//document/@language'/></dc:language>
        <dc:identifier id="BookId" opf:scheme="UUID">urn:uuid:3bbe3f04-4275-841f-44d7-7d3a4e0794a5</dc:identifier>
        <dc:identifier id="docid"><xsl:value-of select='//metavariable[@name="identifier"]'/></dc:identifier>
        <dc:identifier id="isbn" opf:scheme="ISBN"><xsl:value-of select='//metavariable[@name="isbn"]'/></dc:identifier>
        <dc:creator><xsl:value-of select='//metavariable[@name="author"]'/></dc:creator>
        <dc:subject><xsl:value-of select='//metavariable[@name="subject"]'/></dc:subject>
        <dc:description><xsl:value-of select='//metavariable[@name="description"]'/></dc:description>
        <dc:publisher><xsl:value-of select='//metavariable[@name="publisher"]'/></dc:publisher>
        <dc:rights><xsl:value-of select='//metavariable[@name="rights"]'/></dc:rights>
        <dc:date><xsl:value-of select='//document/@date'/></dc:date>
        <!-- content is the manifest id of the cover image, not of the cover page -->
        <meta name="cover" content="cover" />
    </metadata>

    <manifest>
        <item id="cover-html" href="cover.html" media-type="application/xhtml+xml"/>
        <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
        <item id="style" href="Styles/style.css" media-type="text/css"/>
        <item id="intro-html" href="Text/_intro.html" media-type="application/xhtml+xml"/>
  		<xsl:for-each select='//section'>
  		<item media-type="application/xhtml+xml">
  			<xsl:attribute name="id">aut_<xsl:value-of select="@implicit"/>-html</xsl:attribute>
  			<xsl:attribute name="href">Text/aut_<xsl:value-of select="@implicit"/>.html</xsl:attribute>
		</item>
	  	</xsl:for-each>
  		<item media-type="image/jpeg" id="cover" href="Images/cover.jpg"/>
 		<!-- again, LilyPond related images -->

  		<xsl:for-each select='//image'>
  		<item media-type="image/png" id="{@id}">
  			<xsl:attribute name="href">Images/<xsl:value-of select="substring-after(@name,'prd_hraban-temp-lilypond-')"/>.png</xsl:attribute>
  		</item>
	  	</xsl:for-each>
	  	<!-- add fonts manually -->

	  	<!--
	  	<item media-type="application/x-font-otf" id="" href="Fonts/xyz.otf" />
	  	<item media-type="application/x-font-ttf" id="" href="Fonts/xyz.ttf" />
	  	-->
    </manifest>

    <spine toc="ncx">
        <itemref idref="cover-html" />
        <itemref idref="intro-html" />
  		<xsl:for-each select='//section'>
  		<itemref>
  			<xsl:attribute name="idref">aut_<xsl:value-of select="@implicit"/>-html</xsl:attribute>
		</itemref>
	  	</xsl:for-each>
    </spine>

</package>
</xsl:template>

</xsl:stylesheet>
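
The manifest above leaves the font entries to be added by hand. If you ship several fonts, a few lines of shell can print the item elements for pasting into the OPF. This is a sketch only: font_items is a hypothetical helper, the "font-" id prefix is an arbitrary choice, and the media types are the ones from the commented-out lines in the stylesheet.

```shell
#!/bin/bash
# font_items DIR: print one manifest <item> per .otf/.ttf file in DIR
# (hypothetical helper; the "font-" id prefix is an arbitrary choice)
font_items() {
	for f in "$1"/*.otf "$1"/*.ttf; do
		[ -f "$f" ] || continue   # skip patterns that matched nothing
		name=$(basename "$f")
		case "$name" in
			*.otf) type="application/x-font-otf" ;;
			*.ttf) type="application/x-font-ttf" ;;
		esac
		echo "<item media-type=\"$type\" id=\"font-${name%.*}\" href=\"Fonts/$name\" />"
	done
}

if [ -d songbook.tree/OEBPS/Fonts ]; then
	font_items songbook.tree/OEBPS/Fonts
fi
```

File names with spaces or other special characters would give invalid ids, so rename such fonts first.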

Create NCX (table of contents)

This one will probably differ from your setup – I (ab)use an index for an alphabetically ordered table of contents. The structure that ConTeXt outputs for index entries is somewhat uncomfortable...

export2ncx.xsl

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version= "2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output 
	method="xml"
	encoding="utf-8"
	indent="yes"
/>

<xsl:template match="/">
<!--
<!DOCTYPE ncx PUBLIC "-//NISO//DTD ncx 2005-1//EN" "http://www.daisy.org/z3986/2005/ncx-2005-1.dtd">
-->
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">

    <head>
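        <!-- dtb:uid has to repeat the unique identifier (dc:identifier) from the OPF -->
        <meta name="dtb:uid"           content="urn:uuid:3bbe3f04-4275-841f-44d7-7d3a4e0794a5" />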
        <meta name="dtb:depth"         content="1" />
        <meta name="dtb:totalPageCount" >
        	<xsl:attribute name="content">
        	<xsl:value-of select='count(//section[starts-with(@detail,"Titel")])' />
        	</xsl:attribute>
        </meta>
        <meta name="dtb:maxPageNumber">
        	<xsl:attribute name="content">
        	<xsl:value-of select='count(//section[starts-with(@detail,"Titel")])' />
        	</xsl:attribute>
        </meta>
    </head>

    <docTitle>
        <text><xsl:value-of select='//metavariable[@name="title"]'/></text>
    </docTitle>

    <docAuthor>
        <text><xsl:value-of select='//metavariable[@name="author"]'/></text>
    </docAuthor>

    <navMap>
        <navPoint id="aut_0" origin="aut_0" playOrder="0">
            <navLabel>
                <text>Cover</text>
            </navLabel>
            <content src="cover.html"/>
        </navPoint>
        <navPoint id="aut_1" origin="aut_1" playOrder="1">
            <navLabel>
                <text>Start</text>
            </navLabel>
            <content src="Text/_intro.html"/>
        </navPoint>
        
	<xsl:for-each select="//registerentry">
		<xsl:variable name="location">
			<xsl:value-of select='(./descendant::link/@location)[1]'/>
		</xsl:variable >
		<xsl:variable name="origin">
			<xsl:value-of select='//registerlocation[@internal=$location]/ancestor::section[starts-with(@detail,"Titel")]/@implicit'/>
		</xsl:variable >

		<navPoint>
			<xsl:attribute name="id">
				<xsl:value-of select='$location'/>
			</xsl:attribute>
			<xsl:attribute name="origin"><xsl:value-of select="$origin" /></xsl:attribute>
			<xsl:attribute name="playOrder">
				  <xsl:value-of select="2 + count(preceding-sibling::registerentry)" />
			</xsl:attribute>
				<navLabel>
					<text><xsl:value-of select='./registercontent'/></text>
				</navLabel>
				<content>
			<xsl:attribute name="src">Text/aut_<xsl:value-of select='$origin' />.html</xsl:attribute>
				</content>
		</navPoint>
	</xsl:for-each>

    </navMap>
</ncx>
</xsl:template>

</xsl:stylesheet>

The links to a named anchor should work, but don’t in my reader. I must investigate further...

Converting images

In my case, all images are note lines, generated by LilyPond in one temporary directory. Your mileage will vary. ConTeXt’s epub script creates a list of images in export-images.css; you might want to parse that to find all images.

cd "$LILYPONDTEMPDIR" || exit 1
for IMG in prd_songbook-temp-lilypond-*.pdf; do
	NEWIMG=${IMG#prd_songbook-temp-lilypond-}
	convert -density 196 "${IMG}[0]" -trim +repage "../songbook.tree/OEBPS/Images/${NEWIMG%.pdf}.png"
done
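
If you would rather take the image list from export-images.css, something like the following might be a starting point. It rests on an assumption that is not verified here, namely that the file references each image as a url(...) value; check your own export-images.css first. list_css_images is a hypothetical helper name.

```shell
#!/bin/bash
# list_css_images FILE: print every url(...) target in a CSS file, once each
# (hypothetical helper; assumes the images are referenced as url(...) values)
list_css_images() {
	grep -o 'url([^)]*)' "$1" |
		sed -e 's/^url(//' -e 's/)$//' -e "s/^['\"]//" -e "s/['\"]\$//" |
		sort -u
}

if [ -f export-images.css ]; then
	list_css_images export-images.css
fi
```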

Create ePub

Look at the tree above - are all your files in the right location? Then zip your tree!

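cd songbook.tree && rm -f ../songbook.epub
# mimetype must be the first entry and stored uncompressed
zip -X0q ../songbook.epub mimetype
zip -Xrq9 ../songbook.epub META-INF OEBPS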

Now test it with your reader or editor (e.g. Calibre). (Apple iBooks or Adobe Digital Editions don’t work.)
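
When a reader rejects the file, one thing worth checking is the mimetype entry: the ePub container format expects it to be the first entry in the archive, stored uncompressed. The sketch below looks at the raw bytes of the archive; epub_mimetype_ok is a hypothetical helper, and it assumes the entry was written without zip extra fields.

```shell
#!/bin/bash
# epub_mimetype_ok FILE: succeed if "mimetype" is the first, uncompressed entry.
# A zip local file header is 30 bytes long, followed by the file name and,
# if the entry is stored without extra fields, directly by its content.
epub_mimetype_ok() {
	[ "$(head -c 58 "$1" | tail -c 28)" = "mimetypeapplication/epub+zip" ]
}

if [ -f songbook.epub ]; then
	if epub_mimetype_ok songbook.epub; then
		echo "mimetype entry looks fine"
	else
		echo "mimetype is not the first stored entry"
	fi
fi
```

This only tests one requirement; a reader can still reject the book for other reasons.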

Shell script

Of course I don’t do all these steps above manually. Here’s my little shell script.

epub.sh

#!/bin/bash
# call as "./epub.sh productname"

# ConTeXt is not in my usual path (setuptex takes too long to be called with every shell);
# your installation is probably at a different location
if [ "$TEXROOT" == "" ]; then
	source ~/Library/texmf/tex/setuptex ~/Library/texmf/tex
fi

# all my products are named like "prd_something.tex"
PRD=$1
if [ ! -e prd_${PRD}.tex ]; then
	echo "Product ${PRD} not found!"
	exit 1
fi

# all XSL files are in this WORKDIR
WORKDIR=epub-workflow

# my ConTeXt wrapper "makeit.sh" script creates PDFs with a date
ISODATE=`date +"%Y-%m-%d"`
PRDPDF=${PRD}_${ISODATE}.pdf

# LilyPond is run in this TEMPDIR and creates its note images there
TEMPDIR=lilytemp

EPUBPREFIX=epub_
TREE=${PRD}.tree
OEBPS=${TREE}/OEBPS
SAXON="java -jar /opt/local/share/java/saxon9he.jar"
# http://saxonica.com/documentation/html/using-xsl/commandline.html

if [ ! -f export.xml ]; then
	echo "File export.xml not found! Trying to run ConTeXt..."
	# you should replace this call with your own
	./makeit.sh ${PRD}
	exit 2
fi

if [ ! -f $PRDPDF ]; then
	echo Product $PRDPDF not found!
	echo Please run "makeit.sh $PRD"
	exit 3
fi

if [ ! -d $OEBPS ]; then
	echo Directory $OEBPS missing!
	echo Creating new ePub ...
	mtxrun --script epub --make ${PRD}
fi

if [ -d $OEBPS ]; then
	echo "Creating directories (might exist) ..."
	mkdir -p $OEBPS/Styles
	mkdir -p $OEBPS/Images
	mkdir -p $OEBPS/Fonts
	mkdir -p $OEBPS/Text

	echo Creating HTML ...
	$SAXON -o:$OEBPS/Text/_intro.html -s:export.xml -xsl:$WORKDIR/export2html.xsl
	echo Creating OPF ...
	$SAXON -o:$OEBPS/${PRD}.opf -s:export.xml -xsl:$WORKDIR/export2opf.xsl
	echo Creating ToC NCX ...
	$SAXON -o:$OEBPS/toc.ncx -s:export.xml -xsl:$WORKDIR/export2ncx.xsl
	echo Creating Cover ...
	$SAXON -o:$OEBPS/cover.html -s:export.xml -xsl:$WORKDIR/export2cover.xsl
	echo Creating cover image from first page of PDF ...
	convert -density 196 $PRDPDF'[0]' +repage $OEBPS/Images/cover.jpg
	
	echo Copying files ...
	cp style.css $OEBPS/Styles/
	# This copies only fonts from your project directory; adapt.
	cp *.?tf $OEBPS/Fonts/

	echo Converting images from PDF to PNG ...
	cd $TEMPDIR
	for IMG in prd_${PRD}-temp-lilypond-*.pdf; do
		NEWIMG=../$OEBPS/Images/${IMG#prd_${PRD}-temp-lilypond-}
		NEWIMG=${NEWIMG%.pdf}.png
		if [ ! -f $NEWIMG ]; then
			convert -density 196 $IMG'[0]' -trim +repage ${NEWIMG}
		fi
	done
	cd ..
	
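	# delete old epub
	rm -f $TREE/${PRD}.epub
	rm -f ${EPUBPREFIX}${PRD}.epub
	echo Creating ePub ${EPUBPREFIX}${PRD}.epub ...
	cd $TREE
	# mimetype must be the first entry and stored uncompressed
	zip -X0q ../${EPUBPREFIX}${PRD}.epub mimetype
	zip -Xrq9 ../${EPUBPREFIX}${PRD}.epub META-INF OEBPS
	cd ..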
fi
echo Ready.