[tex4ht] curiosity about unicode.4hf
gamboz at medialab.sissa.it
Tue Mar 14 10:30:12 CET 2017
On Mon, 13 Mar 2017 22:53:47 +0100,
Karl Berry wrote:
> Hi Matteo,
> I get "a.html" that contains:
> I guess you're expecting the literal UTF-8 right single quote instead of
> the entity syntax?
> > AFAIK, ' and " are illegal in attributes,
> I have used those characters in attribute values. Anyway, how are
> attributes related to the example? I'm baffled here, sorry.
My fault, sorry. More correctly:
' is illegal in attributes delimited by ' (e.g. xxx='aa'aa')
" is illegal in attributes delimited by "
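A minimal sketch of that rule, assuming Python's standard-library XML parser as the checker (the element name `a` and the helper `parses` are made up for illustration; `xxx` is the attribute from the example above):

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def parses(fragment: str) -> bool:
    """Return True if the fragment is well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# The delimiter itself may not appear raw inside the attribute value.
bad_single = "<a xxx='aa'aa'/>"
bad_double = '<a xxx="aa"aa"/>'

# The other quote character is fine, and so is an entity.
ok_mixed = "<a xxx=\"aa'aa\"/>"
ok_entity = '<a xxx="aa&quot;aa"/>'

for fragment in (bad_single, bad_double, ok_mixed, ok_entity):
    print(fragment, "->", "well-formed" if parses(fragment) else "NOT well-formed")

# quoteattr picks a delimiter (and escapes if needed) so the value is always legal.
print(quoteattr("aa'aa"))
print(quoteattr('aa"aa'))
```

So a literal quote only needs to become an entity when it collides with the delimiter actually in use.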
I wrote that only because unicode.4hf forces " to be an entity, and I
thought I could see the reason in the "delicacy" of using " in some
> > (and #x2018 is not in the file - texlive2016).
> > Does anyone know why &#x2019; ended up in unicode.4hf?
> I don't know why Eitan decided to translate ASCII ' to the Unicode
> entity value and leave ASCII ` output as literal UTF-8 (with your options).
> I don't know what the implications would be of changing it, either; not
> something I would want to do lightly.
> Briefly looking at the source file (tex4ht-fonts-4hf.tex), I don't see
> any explanation. Could have missed it.
> Does outputting the entity cause some problem?
not directly (see below)
> > htlatex a "xhtml" " -cunihtf -utf8"
> Why do you want to use those options in the first place?
> (Just wondering.)
I use tex4ht to transform some TeX fragments to XML.
For instance, the authors' names of some physics articles:
which is correctly shown by any decent XML viewer as:
(for instance https://repo.scoap3.org/record/19196/files/main.xml)
Sometimes I need to compare the author's name from these XML files to
what we have in our DB, and what we have in our DB is always in the form
(with a simple ' instead of &#x2019; or ’)
This is not a big problem (I just replace ’ with ' and do my
But I was wondering why the use of ’ is forced and, as you noted,
I did not want to change it without knowing the rationale behind it.
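
For reference, the replacement step described above could be sketched roughly as follows. This is only an illustration in Python: the function names and the sample name "D'Amico" are made up, and `html.unescape` is used here merely as a convenient way to decode `&#x2019;` if the text has not already been through an XML parser.

```python
import html

RIGHT_SINGLE_QUOTE = "\u2019"  # the character that &#x2019; refers to


def normalize_name(name: str) -> str:
    """Decode character references, then map U+2019 to the ASCII apostrophe."""
    decoded = html.unescape(name)
    return decoded.replace(RIGHT_SINGLE_QUOTE, "'")


def same_author(xml_name: str, db_name: str) -> bool:
    """Compare a name taken from the XML with the form stored in the DB."""
    return normalize_name(xml_name) == normalize_name(db_name)


# Hypothetical sample values: entity form, literal UTF-8 form, DB form.
print(same_author("D&#x2019;Amico", "D'Amico"))
print(same_author("D\u2019Amico", "D'Amico"))
```

Normalizing both sides keeps the comparison independent of whether unicode.4hf emits the entity or the literal character.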