[tex-hyphen] hyphenation for Bulgarian language
anton at lml.bas.bg
Sat Oct 21 22:57:22 CEST 2017
[I am sending a CC of this message to Georgi Boshnakov and Stoyan Dimitrov]
As far as I understand, in order to accept new hyphenation patterns
you need:
1. to have permission to distribute them under a free license;
2. to have them encoded in UTF-8;
3. since the patterns are generated algorithmically, to have the
script which generates them;
4. to have an analysis of the differences between the new patterns and
the existing patterns by Mr. Boshnakov;
5. to have the opinion of Mr. Boshnakov about the new patterns.
I believe we were able to satisfy all these requirements. You already
got a message from Mr. Boshnakov. At the end of this message you
will find URLs from which you can download a shell script
`hyph-bg.sh` with a permissive license, which can be used in
the following ways.
This will print short usage instructions.
This will generate (on the standard output) a text about the Bulgarian
hyphenation, including an analysis of the differences between the
Bulgarian hyphenation patterns by Mr. Boshnakov and the proposed new
patterns.
If the system you use has pandoc installed, then you can also use one
of the following options in order to get an easier-to-read document:
In order to generate Bulgarian hyphenation patterns for TeX, the
following options should be used:
hyph-bg.sh --safe-morphology --standalone-tex
Both the left and the right hyphen mins are 2.
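To illustrate what hyphen mins of 2 mean, here is a small shell sketch.
It is not part of `hyph-bg.sh`; the function name `allowed_breaks` and
the transliterated sample word are made up for this illustration. It
only lists the positions that the mins leave open; whether a break is
really allowed at such a position is still decided by the patterns.

```shell
#!/bin/sh
# Print every break position in a word that respects the hyphen mins.
# A position N means "break after the Nth letter".
# Illustration only: the patterns must also allow a break there.
# Assumes one byte per letter (ASCII input), so ${#word} counts letters.
allowed_breaks() {
    word=$1
    left=${2:-2}     # left hyphen min, default 2
    right=${3:-2}    # right hyphen min, default 2
    len=${#word}
    i=$left
    while [ "$i" -le $((len - right)) ]; do
        printf '%s\n' "$i"
        i=$((i + 1))
    done
}

# A 5-letter word with mins 2 and 2 can break only after letter 2 or 3.
allowed_breaks kniga
```

With both mins equal to 2, a word of three letters or fewer has no
allowed position at all.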
One important difference between the line-breaking algorithm used by
TeX and the line-breaking algorithm used by most other software is
that the algorithm of TeX is smart and can produce perfect results
even with fewer hyphenation possibilities. Because of this, with TeX
it makes sense to use hyphenation patterns which separate the words
only in the preferred positions. On the other hand, with software
using a dumb line-breaking algorithm, it is perhaps preferable to use
hyphenation patterns which provide more hyphenation possibilities.
If it is possible to provide two different sets of the Bulgarian
hyphenation patterns, then the other software (not TeX!) should use
patterns produced in the following way:
(The option --no-hyphen-mins is because the current versions of Mozilla
ignore the hyphen mins in words containing a dash.)
The following are the URLs you can use to download the
script `hyph-bg.sh` and the results produced by it.
The script itself:
Documentation about the Bulgarian hyphenation:
The same in PDF format:
Hyphenation patterns for TeX:
Hyphenation patterns for other software: