[XeTeX] hyphenating words with a hyphen

Ricard Roca ricardroca at gmail.com
Mon Nov 6 02:35:55 CET 2006


Hi
> It sounds like you'd be able to get around your problem by using a 
> different character for the "hyphenation hyphen" than the ascii one. I 
> don't really know this stuff, but marking up your text with the 
> unicode char "2010 (‐) would seem to fix the conundrum above.
>
> \lccode"2010="2010
I have tested with char "2010 (the Unicode hyphen), present only in some 
fonts, and with char "00AD (the soft hyphen), present in any font with a 
complete Latin-1 charset (more useful). If I input either of these chars 
instead of char "002D (the normal hyphen), words can break in places 
other than the hyphen, but when the optimal break point coincides with 
the hyphen, TeX breaks the word and adds another hyphen after the 
written one (--). Obviously, if I have set \lccode"00AD="00AD, TeX thinks 
char "00AD is a normal letter and tries to add a hyphen after it. I 
could change the hyphenation patterns so that TeX never hyphenates a 
word after an explicit hyphen, of course, but that's ridiculous...

With this method of changing the input hyphen you can choose to change 
\defaulthyphenchar or not: the result is the same. For me it's clear that 
the input char must be the normal hyphen (char "002D), because TeX has 
to know that it's possible to break a word just after that sign, but 
without adding another hyphen sign, as it would do after a letter. If we 
use another char and tell TeX it's a letter (with \lccode), TeX will put 
a hyphen sign after it (--); if we don't change the \lccode, TeX won't 
break the word when the optimal break point coincides with the explicit 
hyphen.

If we use the normal hyphen as the input char and change the 
\defaulthyphenchar, compound words can break at points other than the 
hyphen, but will never break if the optimal break point coincides with 
the hyphen.

The main reason I would like to change this TeX behaviour is the 
following: when I copy text from another place and want good 
hyphenation, I have to change all the hyphens from - to |-|, while 
taking care with the hyphens that appear in math mode, in dimensions 
like -2in, or in dashes (-- ---). I can refine the text editor's 
find-and-substitute pattern and, e.g., only change those hyphens that 
appear between letters, but that would substitute $a-b$ too. So I always 
have to check everything carefully, and sometimes it is a bit tedious.
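
As a rough illustration of that find-and-substitute step, here is a 
minimal Python sketch. It is not something I actually use. The function 
name mark_compound_hyphens and both patterns are hypothetical. It 
assumes math is written only as $...$ or $$...$$, so \( \), \[ \], 
escaped \$ and math environments are not handled. It changes a single 
hyphen only when there is a letter on each side, so dashes and 
dimensions like -2in are left alone.

```python
import re

# Assumption: math appears only as $$...$$ or $...$ in the input text.
# The capturing group makes re.split() keep the math segments in the result.
MATH = re.compile(r"(\$\$.*?\$\$|\$[^$]*\$)", re.DOTALL)

# A single hyphen with a letter immediately before and after it.
# [^\W\d_] matches any Unicode letter (word character, minus digits and "_").
COMPOUND_HYPHEN = re.compile(r"(?<=[^\W\d_])-(?=[^\W\d_])")


def mark_compound_hyphens(text: str, replacement: str = "|-|") -> str:
    """Replace hyphens between letters outside math with `replacement`."""
    parts = MATH.split(text)
    # With one capturing group, even indices are plain text and
    # odd indices are the math segments, which are left untouched.
    for i in range(0, len(parts), 2):
        parts[i] = COMPOUND_HYPHEN.sub(replacement, parts[i])
    return "".join(parts)


if __name__ == "__main__":
    sample = "A well-known case: $a-b$ set at -2in -- and --- too."
    print(mark_compound_hyphens(sample))
```

Even with a filter like this the output still needs checking by hand, 
since hyphens between letters in file names, URLs or macro arguments 
would also be changed.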

So, changing the character used doesn't work for me: in fact, it's 
better for me to have |-|, because I can see which hyphens I have 
changed (otherwise they look the same), and I would have to do the 
find-and-substitute operation anyway.

Thanks,

Ricard

