[tex-hyphen] Unicode code points or UTF-8 codes?

Claudio Beccari claudio.beccari at gmail.com
Wed Apr 13 17:19:47 CEST 2016


On 13/04/2016 01:01, Mojca Miklavec wrote:
> Well, you'll probably end up with one weird-looking pattern "8́"
> (looking like "eight with acute" and in fact saying "do not hyphenate
> before the combining acute accent"), but such is life ...
This very weird-looking way of setting a no-break code before the
combining acute accent is valid for both lualatex and xelatex (I used
the value 4, but I will probably end up with the value 2).
The less weird-looking way of setting the no-break code by means of
^^^^0301 is good for xelatex, but it does not work with lualatex.
I thought this would be an interesting piece of information for you and
your team when it's time to generate the 8-bit (EC) compliant pattern
files, because you certainly cannot transform a U+0301 code point into a
non-existent T1 code.
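
To make the point about U+0301 concrete, here is a minimal Python sketch
(not part of any tex-hyphen tooling; all names in it are hypothetical). It
shows the pattern from the quoted message as code points and as UTF-8 bytes,
builds the four-caret form mentioned above, and checks that the code point
lies outside the range of a single 8-bit slot.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

# The pattern from the quoted message: digit 8 followed by the combining acute.
pattern = "8" + COMBINING_ACUTE

# The same pattern seen as Unicode code points and as UTF-8 bytes.
code_points = [f"U+{ord(ch):04X}" for ch in pattern]
utf8_bytes = " ".join(f"{b:02x}" for b in pattern.encode("utf-8"))


def tex_caret_notation(ch):
    """Return the four-caret form used in the message, e.g. ^^^^0301."""
    return f"^^^^{ord(ch):04x}"


# An 8-bit encoding has only 256 slots (0x00..0xFF); this check says nothing
# about which glyphs a particular 8-bit encoding assigns to those slots.
fits_in_8_bits = ord(COMBINING_ACUTE) <= 0xFF

# A non-zero canonical combining class marks a combining character.
is_combining = unicodedata.combining(COMBINING_ACUTE) != 0

print(code_points)
print(utf8_bytes)
print(tex_caret_notation(COMBINING_ACUTE))
print(fits_in_8_bits, is_combining)
```

The single code point U+0301 occupies two bytes in UTF-8, so neither the code
point nor its byte sequence maps onto one 8-bit position.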

Mojca, I am sure that you and the team do wonderful work when it's
time to make up the various files necessary for ptex, pdftex, xetex, and
luatex; but the suggestion you gave me to avoid worrying about these
details (because in due course you'll do the whole job with a suitable
script) is not valid for me and, I suppose, for any other author of
pattern files. The reason is simple: before sending you our files we have
to test them; in order to do so we have to generate correct files, save
them in the proper places of our personal or local tree, create or edit
the local language.{def|dat|dat.lua} files, generate the overall
language.* files, create the formats, and run suitable tests; only when
we have working files can we send you our results.

Only through this delicate process can I (we) avoid sending you trash
files that are buggy or do not perform as intended.
Unfortunately I am not fluent with luatex and the Lua scripting
language; therefore I have to handle my tests with TeX and LaTeX code.
Apparently everything is now working properly with xelatex and
lualatex as well, and gloss-latin.ldf apparently handles the loaded
pattern files correctly, as I wanted from the very beginning, but I
want to be "more sure" of the correctness of the results. Since I got
involved with the GregorioTeX group and we are trying to have good and
reliable pattern files to hyphenate liturgical Latin, my tests might
last longer than I hoped; therefore do not expect me to send you the
edited and hopefully correct pattern files soon ... unless
you ask me to send you temporary files, so you can experiment with the
newly proposed directory structure and the difficult transformation to
8-bit patterns. Please let me know in case you'd like a temporary set
of pattern files.

Thank you for the precious suggestions you gave me. All the best
Claudio

