[tex-hyphen] Serbian (Serbo-Croatian) hyphenation patterns.

Mojca Miklavec mojca.miklavec.lists at gmail.com
Wed Jun 11 13:04:07 CEST 2008


On Wed, Jun 11, 2008 at 12:02 PM, Dejan Muhamedagic wrote:
> Hi Mojca,
>
> On Mon, Jun 09, 2008 at 08:31:46PM +0200, Mojca Miklavec wrote:
>> On Fri, Jun 6, 2008 at 12:42 PM, Dejan Muhamedagic wrote:
>> > Hi,
>> >
>> > On Fri, Jun 06, 2008 at 12:10:43PM +0200, Mojca Miklavec wrote:
>> >> Dear Dejan Muhamedagić,
>> >>
>> >> I'm trying to reach the author of Serbo-Croatian hyphenation patterns,
>> >> and I hope that one of the addresses above is the right one, if not,
>> >> please ignore the message (or let me know that I wrote to the wrong
>> >> person).
>> >>
>> >> The problematic part of the patterns (shhyphl.tex on CTAN) is
>> >> apparently this one:
>> >>
>> >> % General permission for use and non-profit redistribution is granted.
>> >> % For commercial use, contact the below address.
>> >
>> > Yes, I've been aware that this precluded wider distribution.
>> > Sorry for not acting earlier. I'll replace the license with one
>> > that allows free distribution.
>>
>> Thanks a lot!
>>
>> Well, distributing the patterns alone is not a problem; the
>> question is how to make them work (including them
>> automatically).
>>
>> From that point of view it might make no sense to keep updates on the
>> shhyph.tex file. If you would agree with LPPL licence in the attached
>> file (I'm no licence expert - you may choose another one), that file
>> would be included on TeX Live, and it would make sense to make changes
>> to that file (if you decide to make any further changes to the file).
>>
>> We would probably add some more statements to make it clear what we
>> did with the original files and from where those files came, but I
>> would prefer to have all the files under the roof before figuring out
>> that I forgot to write something and then repeat the whole process of
>> adding yet another piece of information to dozens of files.
>
> Agreed.
>
>> Once you decide on the licence (to satisfy Robin and the TeX Live
>> team), I have some more questions for you. Currently, serbian in
>> language.dat loads Latin labels:
>>
>>   \def\prefacename{Predgovor}%
>>   \def\refname{Literatura}%
>>   \def\abstractname{Sa\v{z}etak}%
>>   \def\bibname{Bibliografija}%
>>   \def\chaptername{Glava}%
>
> This should probably read "Poglavlje".

Can you send a patch to the Babel author? Or if someone has a better
suggestion ...
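
Until a fix lands in Babel, the label can be overridden per document.
A minimal sketch, assuming the `serbian` Babel option and the standard
`book` class (the chapter title is only an illustration, and this is
not a replacement for a proper patch):

```latex
\documentclass{book}
\usepackage[serbian]{babel}
% Assumption: babel's serbian option defines \captionsserbian.
% \addto appends our redefinition so it survives language switches.
\addto\captionsserbian{\renewcommand{\chaptername}{Poglavlje}}
\begin{document}
\chapter{Uvod}
\end{document}
```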

>>   \def\appendixname{Dodatak}%
>>   \def\contentsname{Sadr\v{z}aj}%
>>
>> but Cyrillic patterns. That makes no sense to me, but I might be
>> missing something obvious (except for many posts I have seen on
>> the internet complaining that serbian doesn't work or at least asking
>> how/where to get cyrillic labels).
>
> No idea, didn't produce that file.
>
>> >> People argue that the patterns have a no-commercial licence, so the
>> >> patterns do not make it into TeX Live.
>> >>
>> >> The obscure part is that Cyrillic patterns which are nothing else but
>> >> a derived work with exactly the same patterns and the following
>> >> copyright note are fine and included in TeX Live:
>> >>
>> >> % This is `srhyphc.tex' file. It contains hyphenation patterns for Serbian
>> >> % language in the Cyrillic alphabet. TeX font encoding is T2A.
>> >> %
>> >> % This file is distributed under the terms of the GNU General Public License.
>> >> % Latest version of the license is at <http://www.gnu.org/copyleft/gpl.html>.
>> >> %
>> >> % Version: 1.0a
>> >> % Last change: 2003-06-09
>> >
>> > IIRC, I was never consulted regarding these changes. One or both of the
>> > authors did contact me, but right now can't remember the outcome.
>> > At the least, I am sure that I've never approved the change of
>> > the name of the patterns/language. I do however recognize the
>> > need to adjust them to various new circumstances, such as new
>> > alphabet and new language names. And for a long time I haven't
>> > been active in the TeX community.
>> >
>> >> %
>> >> % Credits:
>> >> %  - Initial hyphenation patterns for T1 font encoding by Dejan Muhamedagi\'c
>> >> %  - Improvements and adaptation to T2A font encoding by Strahinja Radi\'c
>> >> %  - Further improvements and integration into one file by Aleksandar Jelenak
>> >
>> > It would also be interesting to see the improvements.
>>
>> See the attached files (that definitely need some statements about
>> what we have done with them), but apart from the notes ... this is how
>> they should look. You can't do anything with the attached file
>> yet, but I can send you the other file needed to load the patterns
>> properly. (Or you can get them in the svn repository
>> svn://tug.org/texhyphen/trunk/tex/)
>>
>> We did not modify any functionality - the words won't hyphenate any
>> different (neither worse nor better), but it will be easier to handle
>> the patterns with XeTeX and LuaTeX that can both handle the patterns
>> natively.
>>
>> I'm sending you the files to check the content and licence.
>
> Thanks, I'll take a look. It's been rather busy this week, so I
> probably won't be able to do anything on the matter before the
> weekend.

OK.
But we're really asking about the licence permission in the first
place. Any other modifications to the text can be added at any time.
Also, please take a short glimpse at the cyrillic patterns, just to
check if the conversion was OK. I hope it was, but it's always good to
have a second opinion.

> ***
>
> There's the language/patterns name issue I would like to raise.
>
> Since the country fell apart (Yugoslavia), there are now three
> language names: Bosnian, Croatian, and Serbian, where there was
> only one: Serbocroatian. Please note that the patterns name is
> shhyphl.tex where "sh" stands for srpskohrvatski (Serbocroatian)
> and "l" for the latinic alphabet.
>
> There can't be any dispute about whether the patterns are usable
> for all three languages, so I think that the patterns file should
> be shared between the three.

I probably agree about that, but there's no additional "Bosnian"
support in Babel either. I do not know what they use.

Croatians have some patterns that are different. I have no idea
whether that means better or worse, but since they have them, I would
not try to convince them to use shhyph without any comparison being
made.

> Therefore, I think that the prefix
> of the patterns file should remain "sh". Last time I looked,
> there has been proliferation of new names such as srhyph or
> hrhyph (though I believe that hrhyph was a different pattern set
> altogether).

Yes, hrhyph are different patterns, and srhyph are cyrillic ones
(loaded by default together with latin labels!!!).

But we're modifying + renaming all of them now, so shhyph would not be
used at all, so it makes no sense to try to modify & include it now.
Let's focus on new patterns instead.

See http://www.tug.org/svn/texhyphen/trunk/tex/patterns/utf8/ for the
new patterns. I did not commit yours yet since I'm waiting for your
approval to modify the licence, but the patterns would have been named
hyph-sr-latn.tex. I think that if Bosnians decide to use them one day,
they can still add an entry to language.dat or load the patterns
inside Bosnian pattern loader.
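
For illustration, such a language.dat entry might look roughly like the
fragment below. This is only a sketch: the loader file name
`loadhyph-sr-latn.tex` is an assumption (nothing under that name is
committed yet), and the language name `bosnian` is hypothetical as
long as Babel has no Bosnian support.

```latex
% language.dat fragment (one language per line: <name> <loader file>)
% Assumption: loadhyph-sr-latn.tex is a hypothetical loader that
% inputs hyph-sr-latn.tex with the right encoding for the engine.
serbian  loadhyph-sr-latn.tex
% Hypothetical Bosnian entry reusing the same patterns:
bosnian  loadhyph-sr-latn.tex
```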

Unless you're willing to add support for Bosnian to Babel as well.
Even though I understand the Serbo-Croatian language(s), I don't hear
any difference between them (I do not distinguish them), so I cannot
be of any help here.

I'm pretty sure that if we call the patterns Serbocroatian now, some
people will pop up at some time complaining that the language doesn't
exist any more and they will try to convince Karl to rename them. A
similar situation with "Norwegian".

A really nice thing to do would be adding support for Cyrillic script
for Serbian to Babel though.

Mojca

A short summary:
[priority]
- Do you agree with the LaTeX licence (LPPL)?
[once you have time]
- Please check both files for possible conversion mistakes & licence statement.
- If you want, try to figure out how to commit the change in the label
to Babel (Glava seems a somewhat odd translation to me as well)
- Any chance to convince someone to submit Cyrillic labels to Babel?
(It's really only about translating a dozen labels.)
- If you want, you may inspect the situation with Bosnians, but if
they don't do that alone, nobody can help them (if they won't even
know that their language exists in Babel, what's the use?)

