[XeTeX] Incorrect Bengali ZWJ behavior and v2 script spec
khaledhosny at eglug.org
Mon Mar 18 19:45:18 CET 2013
On Mon, Mar 18, 2013 at 10:13:16AM -0500, Ian-Mathew Hornburg wrote:
> Hello, and thanks to Khaled and contributors for their great work on this
> project! I do work in Indic scripts, and XeTeX has been immensely helpful
> in setting them.
> I may have identified two possible bugs in 0.9999.0: the release notes
> indicate that the version-2 OpenType Indic script tags are now supported,
> and I’ve been testing various Bengali-script fonts with the git version of
> XeTeX and a current install of TeX Live 2012 to check them for correct
> shaping behavior. I’ve posted a MWE reproducing some examples from the
> Microsoft standard here: [http://pastebin.com/mgAX8c7U].
> Microsoft ships two Bengali fonts (Vrinda and Shonar Bangla; both v6.80)
> with Windows 8 that support both the older (beng) and newer (bng2) Bengali
> script specs. (Older versions of each are shipped with Windows 7 and other
> Microsoft products.) The fonts behave correctly when using the beng script
> feature, with the exception of a particular ZWJ sequence: the Microsoft
> spec [https://www.microsoft.com/typography/OpenTypeDev/bengali/intro.htm]
> says that the sequence of consonant-hasant-ZWJ-consonant should prevent a
> ligature of the two consonants, then render a half-form of the first
> consonant. XeTeX currently fails to suppress the ligature.
This looks like a HarfBuzz bug: I can reproduce it with HarfBuzz's own
test tool, and Uniscribe indeed gives a different result. If you can
report this to the HarfBuzz developers, please do, since you will be able
to describe the issue better than I can; otherwise I’ll try to report it.
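
For anyone who wants to build the two test strings by hand, here is a minimal sketch of the sequences under discussion. The consonant pair (KA and SSA) is only an illustrative choice and is not taken from the MWE; the helper name `describe` is hypothetical. The first string should normally shape as a conjunct, while the second inserts ZWJ after the hasant, which per the Microsoft spec should suppress the ligature and yield a half-form of the first consonant.

```python
# Code points from the Unicode Bengali block. KA and SSA are an
# illustrative consonant pair (an assumption, not taken from the MWE).
KA = "\u0995"      # BENGALI LETTER KA
SSA = "\u09B7"     # BENGALI LETTER SSA
HASANT = "\u09CD"  # BENGALI SIGN VIRAMA (hasant)
ZWJ = "\u200D"     # ZERO WIDTH JOINER


def describe(text):
    """Return the code points of `text` in U+XXXX notation."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# consonant-hasant-consonant: a shaper would normally form the conjunct
conjunct = KA + HASANT + SSA

# consonant-hasant-ZWJ-consonant: the ligature should be suppressed and
# a half-form of the first consonant rendered instead
half_form = KA + HASANT + ZWJ + SSA

print(describe(conjunct))   # U+0995 U+09CD U+09B7
print(describe(half_form))  # U+0995 U+09CD U+200D U+09B7
```

Either string can then be pasted into a test document or handed to a shaping tool to compare the output against Uniscribe.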
> However, when passing the bng2 script to the fonts, they both fail to
> render correctly at all. I might guess that something’s going wrong because
> the font contains both versions of the script tag, and I don’t know what
> happens on the HarfBuzz side when selecting which script to use when
> shaping a font.
That is a XeTeX bug: the tags were parsed as ISO codes rather than
OpenType tags. It should be fixed in the next point release.
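
To illustrate why the two interpretations differ, here is a rough sketch. It is not XeTeX's actual code, and the function names `ot_tag` and `looks_like_iso15924` are hypothetical. An OpenType tag is four printable ASCII characters packed into a 32-bit value, whereas an ISO 15924 script code consists of four letters (for example "Beng"), so a tag such as bng2 cannot be a valid ISO code.

```python
import re


def ot_tag(tag):
    """Pack a 4-character OpenType tag into a big-endian 32-bit integer."""
    if len(tag) != 4 or not all(0x20 <= ord(c) <= 0x7E for c in tag):
        raise ValueError(f"not a valid OpenType tag: {tag!r}")
    return int.from_bytes(tag.encode("ascii"), "big")


def looks_like_iso15924(code):
    """ISO 15924 script codes are exactly four letters, e.g. 'Beng'."""
    return re.fullmatch(r"[A-Za-z]{4}", code) is not None


for tag in ("beng", "bng2"):
    print(tag, hex(ot_tag(tag)), looks_like_iso15924(tag))
# beng 0x62656e67 True
# bng2 0x626e6732 False
```

Under this assumption, beng happens to survive being read as an ISO code, but bng2 does not, which would be consistent with only the version-2 tag failing.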