RTL

Here we collect several tips and tricks for dealing with right-to-left (RTL) texts as well as bidirectional (BiDi) texts. The many hooks in different commands make it possible to use CONTEXT's support for such documents, but there are limitations and bugs as well. Everything here was tested in MkIV. We recommend trying one of the most recent versions, since certain bugs are fixed and features added only in the beta. All the credit should go to Hans, Wolfgang and others who respond to questions on the mailing list.

Basics

Use \setupalign to change the text direction to RTL. This command sets the page, paragraph and text directions.

\setupalign[r2l]
\starttext
\input knuth
\stoptext

BiDi text

To work with documents with mixed RTL and LTR text, we need the bidi algorithm implemented in \setupdirections. Since numbers are LTR elements in Arabic, Persian and other RTL languages, we almost always need this.

% mode=mkiv
\usemodule[simplefonts]
\setmainfont[dejavusans][features=arabic,range=arabic]
% Some fonts have glyphs for ZWNJ, etc.  Use the following to suppress them.
\setcharacterstripping[1]
\setupdirections[bidi=global,method=one]
\setupalign[r2l]
\starttext
 این نمونه ساده از متنی فارسی است که در \CONTEXT\ تهیه شده است.
توجه کنید که عدد ۱۰۰۰ به صورت صحیح نمایش داده می‌شود.
\stoptext

LTR paragraph in RTL document

In addition to short LTR text pieces, sometimes one needs an LTR paragraph in a mainly RTL document. Using \lefttoright at the beginning of a paragraph (i.e., in vertical mode) achieves this. This command needs to be placed inside a group to limit its scope. I use the following.

\definestartstop[LTR] [before={\begingroup\lefttoright},after=\endgroup]

This can be used either as \LTR{some English text} for short pieces (for example, inside an RTL paragraph) or as a start/stop construct to produce LTR paragraphs.
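
As a minimal sketch of both uses, assuming the definition above behaves as described, the following document sets an RTL paragraph containing a short LTR piece, followed by a complete LTR paragraph. The sample files and the explicit \par are illustrative choices, not part of the original setup.

\definestartstop[LTR] [before={\begingroup\lefttoright},after=\endgroup]
\setupalign[r2l]
\starttext
% An RTL paragraph with a short LTR piece inside it.
\input tufte \LTR{some English text}

% A complete LTR paragraph; the blank line above puts us in vertical mode.
\startLTR
\input knuth \par
\stopLTR
\stoptext

The blank line before \startLTR matters: \lefttoright only sets the paragraph direction when it is issued before the paragraph starts.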

Indic numerals

Arabic, Persian and Urdu documents typically use a different set of digits, called Indic; it's kind of ironic that the normal digits used in Latin languages are called Arabic! The ConTeXt beta has several number conversion methods to achieve this: persiandecimals, arabicdecimals. Note that these are different from persiannumerals and arabicnumerals. The naming of the latter is inspired by Roman numerals and the like, which use letters for displaying numbers.

To use these conversion methods, look for a conversion or numberconversion key in the command of interest. Setting it to one of the above values achieves the desired behavior. However, a few elements lack proper BiDi support, so it is best to place the above conversions inside a left-to-right block to guarantee correct number display.

For instance, to get the page numbers in Indic digits, we can use the following.

\setuppagenumber[numberconversion=persiandecimals]
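
To check what a conversion produces before wiring it into a setup command, one can call it directly with \convertnumber. The sketch below reuses the font setup from the BiDi example above and assumes a ConTeXt version in which persiandecimals and arabicdecimals are available; the number 1389 is an arbitrary test value. The document is left in its default left-to-right direction, in line with the advice above.

% mode=mkiv
\usemodule[simplefonts]
\setmainfont[dejavusans][features=arabic,range=arabic]
\starttext
% Digit-by-digit conversion to Persian digits.
\convertnumber{persiandecimals}{1389}\par
% Digit-by-digit conversion to Arabic (Indic) digits.
\convertnumber{arabicdecimals}{1389}\par
\stoptext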

Indic numerals in math mode

In Persian (and perhaps Arabic and Hebrew) we often want to have Indic numerals in math formulas. We can use a fallback mechanism to substitute all Latin digits by the Indic ones.

\resetfontfallback  [mathdigits]
% use arabicindic for standard Arabic (Indic) digits
\definefontfallback [mathdigits] [dejavusansmono] [digitsextendedarabicindic] [check=yes,force=yes,offset=digitsnormal]
\definefontfallback [mathdigits] [dejavusansmonobold] [digitsextendedarabicindic] [check=yes,force=yes,offset=digitsbold]
\definefontsynonym[MathRoman][file:xits-math.otf][features=math\mathsizesuffix,goodies=xits-math,fallbacks=mathdigits]
\setupbodyfont[dejavu]
\starttext
$3+2=5 \quad \bf 3+2=5$ \endgraf
\stoptext

This works in the beta but not in the stable version installed on this wiki.

By (re)defining certain macros, we can use commands like \digits to properly translate decimal points, thousand separators, etc. to their Arabic/Persian equivalents.

\def\digitsperiodsymbol{٫}
\digits{1.5}

Structural elements

Footnotes

One may have complex requirements for footnotes in a bidirectional document[1], but there are two basic elements that most RTL documents ask for.

  • The footnote text had better go through the BiDi algorithm because it may contain numbers or other LTR material.
  • The footnote rule should either default to the right-hand side or agree with the direction of the first footnote paragraph.

\setupfootnotes[rule=paragraph] % available in beta only; set it to right to have it always on the right.
\startsetups[bidi:footnotes]
  \setupdirections[bidi=on]
\stopsetups
\setupnotes[footnote][setups=bidi:footnotes]

To change the numbers to Indic numerals (used in Arabic, Persian and Urdu documents), you can use the following.

\setupnotation[footnote][numberconversion=persiandecimals] % available in beta only

In general, footnotes originating in RTL paragraphs are typeset on the right and those within LTR text produce LTR footnotes. With the above change, both types end up with Indic digits. More work is needed if this is not what you want.
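
Putting the pieces together, a minimal test document might look like the sketch below. It only combines the settings shown in this section with the font and direction setup from the BiDi example. The Persian text is reused from that example, and a beta that supports rule=paragraph and persiandecimals is assumed.

% mode=mkiv
\usemodule[simplefonts]
\setmainfont[dejavusans][features=arabic,range=arabic]
\setupdirections[bidi=global,method=one]
\setupalign[r2l]
% Footnote rule follows the direction of the first footnote paragraph.
\setupfootnotes[rule=paragraph]
% Run the footnote text through the BiDi algorithm.
\startsetups[bidi:footnotes]
  \setupdirections[bidi=on]
\stopsetups
\setupnotes[footnote][setups=bidi:footnotes]
% Indic digits for the footnote markers.
\setupnotation[footnote][numberconversion=persiandecimals]
\starttext
این نمونه ساده از متنی فارسی است.\footnote{توجه کنید که عدد ۱۰۰۰ به صورت صحیح نمایش داده می‌شود.}
\stoptext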

Sectioning

To get the section numbers right as well as their directions both in the head and in the table of contents, I use the following setup.

\setuphead[part,chapter,section][conversion=LTRpersiandecimals,numberstyle=\righttoleft]

Itemize

Use persiandecimals or the other values to number itemizations.

\startitemize[persiandecimals]
\item First
\item Second
\stopitemize

The multi-column versions (using keys columns or horizontal) have a default left-to-right direction. To fix that, use either of the two solutions below. One sets it per \startitemize while the other sets it globally.

\startitemize[columns,two][direction=reverse]
\item First
\item Second
\item Third
\item Fourth
\stopitemize
% The global solution
\setupmixedcolumns[itemgroupcolumns][direction=reverse]

Enumerations and descriptions

Enumerations are a kind of description, hence I focus on the former here. The same restrictions apply to the latter as well. The discussion here is based on recent betas, which differ slightly from the (old) stable version here.

In MkIV we use the alternative option in \defineenumeration to pick its general form. The only value that works properly is serried.

% mode=mkiv
\usemodule[simplefonts]
\setmainfont[dejavusans][features=arabic,range=arabic]
\setupdirections[bidi=global,method=one]
\setupalign[r2l]
\defineenumeration[theorem][alternative=serried,hang=2,width=fit,text=قضیه ,numberconversion=persiandecimals]
% persiandecimals came back in a beta and is not available on the stable from texlive 2015.
\starttext
\starttheorem
مجموعه اعداد صحیح نسبت به اعمال جمع و تفریق بسته است.
\stoptheorem
\stoptext

In the output rendered on this wiki, the alignment of the second line is not right, but it works fine on my machine. Maybe the garden's CONTEXT version is old.

Using alternative=left or alternative=right produces almost fine results in RTL but the placement of head text on the right-hand side adds some extra space. To avoid this, one can set width=fit. Using alternative=top places the head on the left-hand side even in RTL mode. This can be fixed via setting headalign (if the text is itself RTL).

In any case, the worst result comes from setting hang which places the hanging on the incorrect side (left for RTL) and pushes the head text into the margin on the correct side (right in RTL)!

Multi-column document

Similar to the above, we can pick right-to-left ordering for the columns.

\startcolumns[n=2,direction=left]
\input knuth
\stopcolumns

There is currently a bug that leads to incorrect section numbering inside RTL columns.

Tables

CONTEXT has several mechanisms for typesetting tables: see Tables_Overview. Natural tables are the recommended construct, but they are somewhat verbose. For quick small tables it is easier to use the older macros adapted (and enhanced) from the TaBlE package. This mechanism is not actively developed, so we do not expect new RTL features for it.

Using them on their own ignores the paragraph direction and sets an LTR table. However, when wrapped in \leftaligned or its friends, we get RTL/LTR tables depending on the paragraph direction.

\usemodule[simplefonts]
\setmainfont[dejavusans][features=arabic,range=arabic]
\setupdirections[bidi=global,method=one]
\setupalign[r2l]
\starttext
\midaligned{\starttable[|c|c|]
\NC \REF[cB]{ماه} \NC \REF[cB]{تعداد روز} \NC \AR
\NC ژانویه \NC ۳۱ \NC \AR
\NC فوریه \NC  ۲۸ \NC \AR
\NC مارس \NC ۳۱ \NC \AR
\stoptable}
\stoptext

TODO

Here are some of the things to be added:

  • Natural tables
  • Float numbering
  • Margin notes
  • Dates and time
  • Indices and sorting


For more information, you can look at Dabeer, a sample set of macros (ongoing work) for typesetting Persian documents using ConTeXt.