[tex4ht] indexing4ht and accented characters

LianTze Lim liantze at gmail.com
Sun Mar 8 07:25:59 CET 2020


Hello,

I've been trying to get tex4ebook working with indices containing accented
characters.

My sample.tex file:

%%%%%%%
\documentclass{book}
\usepackage[noautomatic]{imakeidx}
\makeindex

\begin{document}
\chapter{Foo}
\section{Bar}
Lorem ipsum dolor\index{dolor} sit amet, consectetur adipiscing elit, sed
do eiusmod tempor incididunt ut labore et dolore magna aliqua.

\chapter{Baz}
Ut enim ad minim veniam, quis nostrud exercitation ullamco
laboris\index{láboris} nisi ut aliquip ex ea commodo consequat. Duis aute
irure dolor in reprehenderit in voluptate velit esse cillum
dolore\index{dolore|seealso{dolor}} eu fugiat nulla pariatur. Excepteur
sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt
mollit anim id est laborum.

\printindex
\end{document}
%%%%%%%

(Note that if \index{láboris} is changed to \index{laboris}, this compiles
with tex4ebook + xindy without any errors.)
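
A reduced test file may help isolate the accented entry. The following is
only a sketch: it is the sample above with everything except the
\index{láboris} call removed. The file name sample-min.tex is hypothetical,
and it has not been confirmed that this reduced file reproduces the same
.xref error.

%%%%%%%
% sample-min.tex (hypothetical name): sample.tex cut down to one index entry
\documentclass{book}
\usepackage[noautomatic]{imakeidx}
\makeindex

\begin{document}
\chapter{Baz}
% the only index entry, containing the accented character
laboris\index{láboris}

\printindex
\end{document}
%%%%%%%

It would be built with the same sample.cfg, sample.mk4 and latexmkrc
described below, substituting sample-min.tex for sample.tex.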

My sample.cfg, using indexing4ht from
https://github.com/michal-h21/helpers4ht:

%%%%%%
\RequirePackage{indexing4ht}
\Preamble{xhtml}
\begin{document}
\EndPreamble
%%%%%%

I prefer using latexmk, so my sample.mk4 contains

%%%%%%%
Make:latexmk {}
%%%%%%%

and my latexmkrc invokes xindy during the tex4ebook run:

%%%%%%
$makeindex = 'xindy -L english -C utf8 -M %R.xdy %O -o %D %S';
%%%%%%

These files are in the attached tex4ebook-indexing.zip. When running
tex4ebook with

tex4ebook -c sample.cfg -e sample.mk4 sample.tex

I get the errors

----------
[ERROR]   htlatex: Compilation errors in the htlatex run
[ERROR]   htlatex: Filename Line Message
[ERROR]   htlatex: ./sample.xref 26 Missing \endcsname inserted.
[ERROR]   htlatex: ./sample.xref 26 LaTeX Error: Missing \begin{document} in `sample.cfg'.
[ERROR]   htlatex: ./sample.xref 26 Extra \endcsname.
[ERROR]   htlatex: ./sample.xref 26 Missing \endcsname inserted.
[ERROR]   htlatex: ./sample.xref 26 Missing \endcsname inserted.
[ERROR]   htlatex: ./sample.xref 26 Extra \endcsname.
----------

Looking in sample.log I see:

----------
(./sample.xref
! Missing \endcsname inserted.
<to be read again>
                   \let
l.26 ...name acp:c\endcsname {3}aboris}{idxkw3}{9}
                                                  %
The control sequence marked <to be read again> should
not appear between \csname and \endcsname.
-----------


And the offending line in sample.xref looks like this:

%%%%%%
\:CrossWord{idxkwl\let \prOteCt \relax \let \prOteCt \relax \Protect \csname acp:c\endcsname {3}aboris}{idxkw3}{9}%
%%%%%%

The generated .epub looks fine, and the index entry is rendered correctly
too (as "láboris").

My question is whether the error in the .xref file could cause any real
problems in the .epub, and whether it can be avoided.


(I do have another question regarding indexing4ht. There's a comment in
indexing4ht.ht that says

% configure variable which is saved on every index call
% section mark is used by default, other possible values may be paragraphs,
% or links to individual index entries
\NewConfigure{indexvalue}{1}
\Configure{indexvalue}{\getCurrentSectionNumber}
\NewConfigure{indexidentifier}{1}
\Configure{indexidentifier}{\CurSecHaddr}


Is it easy to override this and use "links to individual index entries"
instead, with a line or two of \Configure{indexvalue} and
\Configure{indexidentifier} in one's own .cfg file? I've tried looking at
the imakeidx.4ht at https://tex.stackexchange.com/a/350012 but could not
extract the minimal set of lines to copy.)

Thanks,
LianTze
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tex4ebook-indexing.zip
Type: application/zip
Size: 1595 bytes
Desc: not available
URL: <https://tug.org/pipermail/tex4ht/attachments/20200308/7a6739ac/attachment.zip>

