[XeTeX] Sanskrit hyphenation
Yves Codet
ycodet at club-internet.fr
Tue Mar 29 09:23:32 CEST 2005
Hello.
On 28 March 05, at 16:12, somadevah at aol.com wrote:
> One question. The sanhyph.tex file defines patterns for Devanagari
> script only. Would it not make sense to add other scripts too (using
> the same patterns)? In Southern India Sanskrit is often written with
> local scripts (as it is sometimes in Bengal etc)...
It's a good idea, but I don't know all of those scripts very well and
the patterns would have to be checked. Besides, I wonder how the initial
loop:
\newcount\n \n="0901
\loop \lccode\n=\n \ifnum\n<"0963 \advance\n by 1 \repeat
can be modified so as to make it go through 0981--09CD (Bengali),
0B82--0BCD (Tamil)... 0D02--0D4D (Malayalam). Or could it simply be:
\newcount\n \n="0901
\loop \lccode\n=\n \ifnum\n<"0D4D \advance\n by 1 \repeat
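A minimal sketch of the first option, one loop per range, assuming plain
TeX's \loop (\selflccode is a hypothetical name, not something defined in
sanhyph.tex, and this is untested against the actual file):
% Hypothetical helper: give every code point from #1 to #2 (inclusive)
% an \lccode equal to itself, so that patterns can use those characters.
\newcount\n
\def\selflccode#1#2{%
  \n=#1\relax
  \loop \lccode\n=\n \ifnum\n<#2\relax \advance\n by 1 \repeat}
\selflccode{"0901}{"0963}% Devanagari
\selflccode{"0981}{"09CD}% Bengali
\selflccode{"0B82}{"0BCD}% Tamil
\selflccode{"0D02}{"0D4D}% last range listed above
Compared with the single long loop, this leaves the code points between
the ranges untouched, at the cost of one line per script.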
> and then should it not include roman transliteration too? However, I
> must note that if the exact same patterns are used for the diacritic
> Roman transliteration the result is readable but a bit strange (you
> get lines beginning with consonant clusters).
I think it looks strange because we compare it with hyphenation habits in
English, German, French... But if we bear in mind that:
tya-
ktvā
is a mere transposition of:
त्य-
क्त्वा
in Latin script, it doesn't seem so strange. If we don't want such
hyphenations (personally I'm not shocked by them), I guess there would
have to be etymological patterns, and it may take a fairly long time to
define them. Also, they would be best defined in a separate file, I
suppose; otherwise the above loop would become:
\newcount\n \n="0061
\loop \lccode\n=\n \ifnum\n<"0D4D \advance\n by 1 \repeat
and we might hit some memory limit.
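For the transliteration case, an explicit list may be a safer sketch than
one long loop: in plain TeX the letters a--z already have their \lccode
set, and most precomposed transliteration letters (for instance U+1E6D,
t with dot below) lie above "0D4D, so the loop would not reach them
anyway. The macro name \translit is hypothetical and the list is only a
sample to be extended:
% Hypothetical helper: make one code point its own lowercase.
\def\translit#1{\lccode#1=#1\relax}
\translit{"0101}% a with macron
\translit{"012B}% i with macron
\translit{"016B}% u with macron
\translit{"015B}% s with acute
\translit{"1E5B}% r with dot below
\translit{"1E6D}% t with dot below
\translit{"1E0D}% d with dot below
\translit{"1E47}% n with dot below
\translit{"1E63}% s with dot below
\translit{"1E43}% m with dot below
\translit{"1E25}% h with dot below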
Kind regards,
Yves