[tex-hyphen] Newest GitHub additions into CTAN?

Stojan Trajanovski stojan.trajanovski at gmail.com
Wed Dec 30 17:19:53 CET 2020


Dear Mojca,

As you correctly pointed out below, there is unfortunately no slot for
cyrgj and cyrkje in T2A (nor in the other Cyrillic encodings such as
OT2, LWN, LCY, X2, T2C, T2B).
As far as I am aware, T2A has the biggest overlap, but these
two characters are still missing.

I don't recall exactly, but this was probably the reason I used cyrgj
and cyrkje with apostrophes when making the babel contribution [1, page 12].
[image: image.png]

When I removed the hyphenation patterns containing cyrgj and cyrkje and
manually edited the files that are supposed to be generated by a Ruby script,
pdflatex worked; otherwise it did not work for me.
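
A minimal sketch of that filtering step, written in Python rather than the
Ruby used by the hyph-utf8 scripts. The function name and the sample patterns
are made up for illustration; only the two code points (U+0453 cyrgj and
U+045C cyrkje) come from the table quoted further down in this message.

```python
# Characters with no slot in T2A, per the table quoted later in this message.
MISSING_IN_T2A = {"\u0453", "\u045c"}  # cyrgj, cyrkje


def drop_unencodable(patterns, missing=MISSING_IN_T2A):
    """Keep only the patterns that contain none of the missing characters."""
    return [p for p in patterns if not (set(p) & missing)]


# Made-up sample patterns, not taken from the real Macedonian pattern file.
sample = ["а1б", "1\u0453а", "о\u045c1", "2љ1"]
kept = drop_unencodable(sample)
```

This only mirrors the manual removal described above; it does not re-encode
the remaining patterns into T2A.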

I have no strong preference as to whether T2A, some extension of T2A, or
a Macedonian-specific encoding is the best choice, as long as
hyphenation works in both Unicode-aware and 8-bit engines. I trust
your choice and will leave it up to you.

Best,
Stojan

[1]
http://ctan.math.washington.edu/tex-archive/macros/latex/contrib/babel-contrib/macedonian/macedonian.pdf

P.S. An encoding that contains the full alphabet (apart from the standard
Unicode, MAC., Windows- etc.) is the old 7-bit JUS I.B1.004 (ISO-IR-147),
defined in '88 in the former Yugoslavia:
https://www.itscj.ipsj.or.jp/iso-ir/147.pdf but I don't think this
information is of much use. It seems no one has made the effort since then :)

On Wed, 30 Dec 2020 at 11:03, Mojca Miklavec <mojca.miklavec.lists at gmail.com>
wrote:

> Dear Stojan,
>
> On Wed, 23 Dec 2020 at 21:30, Stojan Trajanovski wrote:
> >
> > I wanted to ask, what is the planned timing for uploading the most
> > recent hyph-utf8 changes into CTAN?
>
> We wanted to do the upload in 2020, but we're currently stuck at
> consistency checking.
>
> Can you please clarify which encoding is (mainly) being used for
> typesetting Macedonian? (No, we are not going to support 6 variants of
> Cyrillic encodings.)
> When we first added the patterns we ended up assuming a special
> encoding that was fit for Macedonian only.
> (Not that I understood how that would be useful given that almost no
> fonts come with support for that encoding, but that's a different
> topic :)
>
> You asked for removal of two characters from 8-bit versions of
> patterns based on the argument that they are missing from T2A. But
> then I tried to compare T2A and our definition of "macedonian"
> encoding [1] and nothing matches any longer, so it's no longer clear
> to me what exactly the Macedonians would want.
>
> MAC.  Unicode  name     T2A
> 0x83  U+0453   cyrgj  # -
> 0x9A  U+0459   cyrlje # 0xA7
> 0x9C  U+045A   cyrnje # 0xBB
> 0x9D  U+045C   cyrkje # -
> 0x9F  U+045F   cyrdzh # 0xB6
> 0xBC  U+0458   cyrj   # 0x6A
> 0xBE  U+0455   cyrdze # 0xAF
>
> If we claim that we support the T2A encoding as opposed to a custom
> one, then *ALL* patterns will change, rather than just those that you
> manually removed from the patterns.
>
> I would be grateful for some clarification here, and hopefully we get
> some feedback from Vasil as well.
>
> Thank you,
>     Mojca
>
> [1]
> https://github.com/hyphenation/tex-hyphen/tree/master/hyph-utf8/source/generic/hyph-utf8/data/encodings
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 92016 bytes
Desc: not available
URL: <https://tug.org/pipermail/tex-hyphen/attachments/20201230/856dfc74/attachment-0001.png>

