[l2h] Any way of accurately identifying/converting em- and en-dashes?

Stuart Rossiter monsieurrigsby at googlemail.com
Wed Dec 9 16:50:49 CET 2009


  This revisits issues raised (but not resolved) in a 2003 post:

It appears that latex2html is (still) converting em- and en-dashes to
-- and - respectively. Since hyphens are also left as -, there is then
no way to distinguish (in the HTML) between things that were en-dashes
and normal hyphens (so you can't do the conversions to &ndash; etc.
manually, even if you want to).
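
To make the information loss concrete, here is a minimal Python sketch that
models the behaviour I am seeing. It is not latex2html code: the function
name observed_conversion is made up for illustration, and the mapping is
only my reading of the output (--- becomes --, -- becomes -, - stays -).

```python
import re

# Hypothetical model of the observed output, NOT taken from latex2html itself.
# Maps a run of TeX dashes to what appears to end up in the HTML.
OBSERVED = {3: "--", 2: "-", 1: "-"}

def observed_conversion(latex_text):
    """Shorten each run of 1 to 3 dashes the way the HTML output appears to."""
    return re.sub(r"-{1,3}", lambda m: OBSERVED[len(m.group())], latex_text)

# An en-dash range and a plain hyphen end up identical in the output,
# so the HTML alone cannot tell you which one the source contained.
print(observed_conversion("pages 10--20"))
print(observed_conversion("pages 10-20"))
```

Both calls print the same string, which is exactly why a manual fix-up pass
over the HTML cannot work.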

Also, the main script has do_cmd_textemdash and do_cmd_textendash
routines (to convert to --- and -- respectively), but these don't seem
to get used when you explicitly use \textemdash and \textendash
commands, which I thought would be a way round this problem (it still
does the conversions to -- and -).

So it appears that:

-- latex2html can't distinguish these dashes properly (I assume that,
as for quotes, this is an issue with being able to definitively
identify them), although it's distinguishing *something* in doing the
conversions to -- and - ! (so maybe this *can* be fixed?)

-- there is also no way to "preserve" the dashes from the original in
a way which would allow for accurate manual adjustments afterwards.
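
The kind of "preservation" I have in mind could be sketched outside
latex2html roughly as follows (Python, untested against a real latex2html
run). The token names L2HEMDASH and L2HENDASH and both function names are
invented for this example, and it assumes that latex2html passes plain
alphabetic tokens through to the HTML unchanged. It also makes no attempt to
skip math mode, verbatim environments or comments, where -- should be left
alone.

```python
import re

# Hypothetical placeholder tokens; any string that cannot occur in the
# document would do. These names are NOT part of latex2html.
EM_TOKEN = "L2HEMDASH"
EN_TOKEN = "L2HENDASH"

def protect_dashes(latex_source):
    """Replace --- and -- in the LaTeX source with placeholder tokens.

    Run this on the .tex file before latex2html. Runs of exactly three
    dashes are handled first so they are not mistaken for en-dashes;
    runs of four or more dashes are left untouched.
    """
    text = re.sub(r"(?<!-)---(?!-)", EM_TOKEN, latex_source)
    text = re.sub(r"(?<!-)--(?!-)", EN_TOKEN, text)
    return text

def restore_dashes(html):
    """Turn the placeholder tokens in the generated HTML into entities."""
    return html.replace(EM_TOKEN, "&mdash;").replace(EN_TOKEN, "&ndash;")

# Round trip on a small sample, skipping the latex2html step in between.
sample = "pages 10--20 --- a well-known range"
print(restore_dashes(protect_dashes(sample)))
```

Plain hyphens are never touched, so after the second step the three kinds
of dash stay distinguishable in the HTML.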

Am I missing something, or is there any advice people can offer?

Thanks in advance,

