[XeTeX] hyphenating words with a hyphen
Ricard Roca
ricardroca at gmail.com
Mon Nov 6 02:35:55 CET 2006
Hi
> It sounds like you'd be able to get around your problem by using a
> different character for the "hyphenation hyphen" than the ascii one. I
> don't really know this stuff, but marking up your text with the
> unicode char "2010 (‐) would seem to fix the conundrum above.
>
> \lccode"2010="2010
I have tested with char "2010 (unicode hyphen), present only in some
fonts, and with char "00AD (soft hyphen), present in any font with a
complete latin1 charset (more useful). If I input any of these chars
instead of char "002D (the normal hyphen) words can break in places
other than the hyphen, but when the optimal break point coincides with
the hyphen, TeX breaks the word adding another hyphen after the written
one (--). Obviously, once I have set \lccode"00AD="00AD, TeX treats
char "00AD as a normal letter and tries to add a hyphen after it. I could
change the hyphenation patterns so that TeX never hyphenates a word after
an explicit hyphen, of course, but that's ridiculous... With this method
of changing the input hyphen you can choose to change
\defaulthyphenchar or not: the result is the same. For me it's clear that
the input char must be the normal hyphen (char "002D), because TeX has
to know it's possible to break a word just after that sign, but without
adding another hyphen sign, as it would do after a letter. If we use
another char, and we tell TeX it's a letter (with \lccode), TeX will put
a hyphen sign after it (--); if we don't change the \lccode, TeX won't
break the word if the optimal break point coincides with the explicit
hyphen.
If we use the normal hyphen as the input char and change the
\defaulthyphenchar, compound words can break at points other than the
hyphen, but will never break if the optimal break point coincides with
the hyphen.
The main reason I would like to change this TeX behaviour is the
following: when I copy text from another place and want good
hyphenation, I have to change all the hyphens from - to |-|, while taking
care with the hyphens that appear in math mode, the ones found in
measures like -2in, and dashes (-- ---). I can refine the
find-and-substitute algorithm of the text editor and, e.g. only change
those hyphens that appear between letters, but that would substitute
$a-b$ too. So I always have to revise everything carefully, and
sometimes it is a bit tedious.
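A minimal sketch of such a refined find-and-substitute, written in Python
rather than in the text editor. The function name mark_compound_hyphens and
both regular expressions are assumptions made up for illustration, not part
of any TeX tool. It rewrites only a single hyphen that sits directly between
two letters, and it leaves anything inside $...$ untouched, so $a-b$, -2in,
-- and --- are not changed. It knows nothing about \( \), $$ $$, escaped
dollar signs or verbatim material, so the output still has to be checked.

```python
import re

# Hypothetical helper for illustration only.
# Inline math delimited by single dollar signs; these segments are kept as-is.
MATH = re.compile(r"(\$[^$]*\$)")
# Exactly one hyphen with a letter immediately before and after it.
# [^\W\d_] matches any Unicode letter, so accented words are covered too.
HYPHEN = re.compile(r"(?<=[^\W\d_])-(?=[^\W\d_])")

def mark_compound_hyphens(text: str) -> str:
    """Replace letter-hyphen-letter with |-| outside $...$ math."""
    parts = MATH.split(text)
    # With one capturing group, re.split returns text at even indices
    # and the matched math segments at odd indices.
    for i in range(0, len(parts), 2):
        parts[i] = HYPHEN.sub("|-|", parts[i])
    return "".join(parts)

sample = "A well-known width of -2in -- see $a-b$ and x--y."
print(mark_compound_hyphens(sample))
# prints: A well|-|known width of -2in -- see $a-b$ and x--y.
```

Even with a filter like this, the substituted hyphens are at least visible
as |-| in the source, which makes the manual check easier.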
So, changing the character used doesn't work for me: in fact, it's
better for me to have |-|, because I can see which hyphens I have
changed (otherwise they look the same), and I would have to do the
find-and-substitute operation anyway.
Thanks,
Ricard