[tex-hyphen] Hyphenation patterns for Belarusian
Arthur Reutenauer
arthur.reutenauer at normalesup.org
Wed Aug 31 16:38:27 CEST 2016
> Here it is http://extensions.services.openoffice.org/en/project/dict-be-official
Thanks.
> The file itself is in cp1251 and needs conversion to UTF-8
> iconv -f cp1251 -t UTF-8 < ./hyph_be_BY.dic > ./hyph_be_BY.txt
> + some hand editing to put the content inside \patterns{}
Thanks, I know how to do that :-)
> According to the comment on line 1414, the intention behind including such awkward patterns
> was to prohibit hyphenation if any part is composed solely of consonants.
There’s something odd anyway. I still suspect the actual list of
patterns does not reflect the intention of the author.
> Ok, I'll ask.
Thanks. I don’t mind being copied on the conversation, even if it is
in Belarusian. You should contact Sviatlana Liasovich as well, since
she’s mentioned as having made corrections; in fact I think it would be
accurate to consider her as the sole author of the OpenOffice file,
since I can’t discern any trace of the original patterns.
>> That’s correct, but actually I would just write
>>
>> д2ж
>> д2з
>> .пад3
>>
>> Using lower numbers to begin with makes it easier to refine later.
>>
>> That being said, is пад really always a prefix?
>
> This would make life too easy :) In some words it is a part of the root and is hyphenated differently.
> E.g.: па-да-ру-нак, па-дзел, вы-па-дак, па-да-плё-ка.
OK, that’s what I suspected :-) In that case it’s probably safer to
stick to
д2ж
д2з
.па2д3ж
.па2д3з
and input падзел as an exception: \hyphenation{па-дзел}.
You need an even number after .па because of patterns of the type CVn,
with n an odd number to allow a break; the OpenOffice patterns have C8V3,
but I would recommend CV1.
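
To illustrate how these values interact, here is a minimal Python sketch of
the usual pattern-matching rule (highest number wins at each position, odd
means a break is allowed). The function names, the single stand-in CV1
pattern па1, and the minimum fragment lengths of 2 are assumptions made for
the example only; they are not taken from any existing pattern file.

```python
def parse_pattern(pattern):
    """Split a pattern such as '.па2д3з' into its letters and inter-letter values."""
    letters, values = [], [0]
    for ch in pattern:
        if ch.isdigit():
            values[-1] = int(ch)      # number sits in the gap before the next letter
        else:
            letters.append(ch)
            values.append(0)
    return "".join(letters), values


def hyphenate(word, patterns, exceptions=None, left_min=2, right_min=2):
    """Apply patterns to one word; left_min/right_min are assumed values."""
    if exceptions and word in exceptions:
        return exceptions[word]       # plays the role of \hyphenation{...}
    dotted = "." + word + "."         # dots mark the word boundaries
    scores = [0] * (len(dotted) + 1)
    for pattern in patterns:
        letters, values = parse_pattern(pattern)
        start = dotted.find(letters)
        while start != -1:
            for k, v in enumerate(values):
                scores[start + k] = max(scores[start + k], v)
            start = dotted.find(letters, start + 1)
    pieces, last = [], 0
    for j in range(left_min, len(word) - right_min + 1):
        if scores[j + 1] % 2 == 1:    # odd value: break allowed after j letters
            pieces.append(word[last:j])
            last = j
    pieces.append(word[last:])
    return "-".join(pieces)


# па1 stands in for the CV1 family; the other four are the patterns above.
PATTERNS = ["па1", "д2ж", "д2з", ".па2д3ж", ".па2д3з"]

print(hyphenate("падзел", PATTERNS))
print(hyphenate("падзел", PATTERNS, exceptions={"падзел": "па-дзел"}))
```

In this sketch the 2 after .па outranks the 1 from па1, so the patterns alone
break падзел after пад, which is exactly why the word has to go into the
exception list.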
> Hyphenation right before й or ў is prohibited at all times, no exceptions. So 8 will be just right, I believe.
That sounds right. It’s of course all right to use 8 when a break is
really prohibited, but the current files use far too many of them.
Best,
Arthur