[texhax] Justification through glyph variants

Pierre MacKay pierre.mackay at comcast.net
Sun Dec 4 15:12:42 CET 2011

On 12/2/2011 7:21 PM, Reinhard Kotucha wrote:
> On 2011-12-02 at 12:11:52 +0000, Philip TAYLOR wrote:
>   >  Joel C. Salomon wrote:
>   >
>   >  >  In some older Hebrew books, and in Hebrew calligraphy, a
>   >  >  technique used to align text to the outer margin is stretching
>   >  >  letters.  Certain letters are particularly stretchable; in fact,
>   >  >  Unicode has several "wide letters" encoded in the Alphabetic
>   >  >  Presentation Forms area.
>   >  >
>   >  >  For reference, compare:
>   >  >
>   >  >      א = ﬡ, ד = ﬢ, ה = ﬣ, כ = ﬤ, ל = ﬥ, ם = ﬦ, ר = ﬧ, ת = ﬨ.
>   >  >
>   >  >  At any rate, is there any way to make (any version of) TeX use
>   >  >  these to help justify lines?
>   >
>   >  I personally know of no way of instructing TeX to consider these
>   >  when optimising the layout of a paragraph, but Hàn Thế Thành's
>   >  microtypographic extensions to PdfTeX offer an alternative.  It
>   >  seems to me that, in an ideal world, what one would actually want
>   >  is a combination of the two such that given (for example) "ת" and
>   >  "ﬨ" as the lower- and upper- bound respectively, a variant of
>   >  Thành's work might usefully interpolate between the two.  What this
>   >  might add to the complexity of TeX's already complex paragraphing
>   >  algorithm [2], I do not like to think !
> It's a matter of fact that Thành's microtypographic extensions are a
> vast improvement.
> However, there is no way to interpolate between two glyphs of the same
> character.  Another problem is that pdfTeX doesn't support Unicode.
> Hence, even if a font provides "wide letters", an enormous amount of
> work is required to make them accessible.
> I absolutely agree with Arno.  I'm convinced that if there is a
> reasonable solution at all, it's definitely LuaTeX.
> Regards,
>    Reinhard
I ought not to get into this, because I don't know Hebrew, and have set 
at most a line of it by glyph identities alone, but the opportunity once 
again to support Phil is too good to miss.

Let's start with the statement that pdfTeX doesn't support Unicode.  I 
have written a short, simple and extendable plain TeX routine that
allows me to set pages of mixed English and Korean (both modern and 
Classical forms) from the set of 256-character Unicode fonts that the 
Korean TeX Users Group has made available.  At the price of keeping a 
large library of fonts rather than a composite OpenType font, it will
accommodate even glyphs in the U+1FFFF world, although I have never
needed to.  I hope that pdfTeX has not deactivated the ordinary macros 
of plain TeX altogether.
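
The routine itself is not reproduced in this message, but the
arithmetic behind any scheme built on 256-character fonts can be
sketched.  The Python below is only an illustration under assumptions:
the function name, the family prefix "uni" and the hexadecimal naming
of the subfonts are all hypothetical, and the Korean TeX Users Group
fonts may well be organized differently.

```python
def subfont_slot(codepoint, family="uni"):
    """Split a Unicode code point into (subfont name, slot 0..255).

    Assumption: one 256-glyph font per block of 256 code points, named
    by the family prefix plus the block number in hex.  Both the prefix
    and the naming pattern are made up for this sketch.
    """
    if not 0 <= codepoint <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    block, slot = divmod(codepoint, 256)
    return f"{family}{block:02x}", slot


# U+FB28 (HEBREW LETTER WIDE TAV) falls in block 0xFB.
print(subfont_slot(0xFB28))
# A code point above the Basic Multilingual Plane needs no special case.
print(subfont_slot(0x1FFFF))
```

The cost is visible in the sketch: every block of 256 code points that
a document touches needs its own font loaded.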

In the antediluvian years before TeX, I set a text in mediaeval Arabic 
(Diocles, /On Burning Mirrors/) which made full use of the far more 
elaborate system of alternative characters in Arabic.  The font was a 
VideoComp stroked font designed by myself and my colleague Walter 
Andrews, and it was managed through two programs, KATIB and HATTAT, 
which were written in Fortran, with C subroutines, and even some 
embedded SDS assembler code---this was in 1974--75, and the programs
are obviously not retrievable.  They amounted to a sort of kindergarten
version of TeX.

The first step was to break up each paragraph into lines, after it had 
been set ragged-left, and then apply the aesthetic criteria of a
calligrapher (hence the name HATTAT) to each line.  This could not be a 
blind mechanical substitution of long-form characters, but required 
several passes through a hierarchy of tests.  For example, the long form 
of Kaf (I regret that the Thunderbird editor makes it too difficult to 
illustrate this elegant character) or of final Sad and Dad almost always 
enhance a line of Arabic, but they must not be clustered, and I think I
remember that I allowed only two long-form Kafs in any one line.  Two
long-form Kafs stacked one above the other on adjacent lines were quite
unacceptable, and that had to be dealt with by hand coding, like
proof-correction.  You have to have a way of selectively turning off the
substitution in the input code.
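
A hierarchy of tests of this kind might be sketched as follows.  This
is not a reconstruction of HATTAT, which is lost; it is a minimal
Python illustration, and the glyph names, the `allow` flag (standing in
for the manual switch in the input), and the `min_gap` spacing rule
used to prevent clustering are all assumptions made for the example.

```python
# Hypothetical glyph names; a real font would use its own.
LONG_FORMS = {
    "kaf": "kaf.long",
    "sad.fina": "sad.fina.long",
    "dad.fina": "dad.fina.long",
}


def apply_long_forms(line, max_kaf=2, min_gap=2):
    """Substitute long forms in one line that has already been broken.

    line is a list of (glyph_name, allow) pairs.  allow=False marks a
    glyph whose substitution was switched off by hand in the input.
    """
    out = []
    kaf_count = 0
    last_long = None  # index of the most recent long form in this line
    for i, (name, allow) in enumerate(line):
        long_name = LONG_FORMS.get(name)
        if long_name is None or not allow:
            out.append(name)          # no long form, or suppressed by hand
        elif name == "kaf" and kaf_count >= max_kaf:
            out.append(name)          # at most max_kaf long Kafs per line
        elif last_long is not None and i - last_long < min_gap:
            out.append(name)          # long forms must not be clustered
        else:
            out.append(long_name)
            last_long = i
            if name == "kaf":
                kaf_count += 1
    return out
```

Stacking on adjacent lines is deliberately left out of the sketch: as
described above, that case was handled by hand, which is what the
`allow` flag is for.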

One of the other messages in this chain suggests that the substitution 
should be shut off for the last line of a paragraph.  Certainly not so 
in Arabic, and probably not so in Hebrew.  There is little that enhances 
a paragraph so much as a swooping final Sad or Dad, or Kaf at the end of
the last line.  Even the simple "tooth-letters," Ba, Ta, Tha, Nun and
Ya, have elegant long forms that are very effective here.

The least satisfactory aesthetic adjustment is the extension of the join 
line, which is favored by flat-bottomed fonts in Arabic, but that is 
obviously not a problem in Hebrew.

Breaking up a paragraph into lines after a first pass through the 
paragrapher is already dealt with in EDMAC, which delivers the result as 
a sequence of one-line paragraphs.  I found EDMAC overly quirky the last 
time I used it, but it could probably be smoothed out.  With the speed 
of present-day processors, even on laptops, and with really generous RAM
for them to play in, I suspect that the time consumed in making Unicode 
substitutions in each line would be quite acceptable and, like Phil, I 
see no reason to think that it would be impossible in basic TeX macros.  
I wouldn't like to take on the job myself, but at 78 years old, I tend 
not to take on such things.
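
For the Hebrew case that started this thread, the per-line
substitution step could look something like the Python sketch below.
The code points are the eight wide letters of the Alphabetic
Presentation Forms block; everything else is an assumption made for
illustration: the function name, the fixed `gain` in width per
substitution (a real font's metrics would supply this per glyph), and
the arbitrary choice to work from the end of the line backwards.

```python
# Ordinary Hebrew letter -> wide presentation form (U+FB21..U+FB28).
WIDE = {
    "\u05D0": "\uFB21",  # alef
    "\u05D3": "\uFB22",  # dalet
    "\u05D4": "\uFB23",  # he
    "\u05DB": "\uFB24",  # kaf
    "\u05DC": "\uFB25",  # lamed
    "\u05DD": "\uFB26",  # final mem
    "\u05E8": "\uFB27",  # resh
    "\u05EA": "\uFB28",  # tav
}


def widen_line(text, deficit, gain):
    """Greedily swap in wide letters until the line's deficit is used up.

    deficit is how far the line falls short of the measure and gain is
    the assumed extra width of one wide letter, in the same units.
    Returns the new text and the deficit left for ordinary glue.
    """
    chars = list(text)
    for i in range(len(chars) - 1, -1, -1):
        if deficit < gain:
            break                     # another wide letter would overfill
        wide = WIDE.get(chars[i])
        if wide is not None:
            chars[i] = wide
            deficit -= gain
    return "".join(chars), deficit
```

Run over each one-line paragraph in turn, this is a single cheap pass
per line; the aesthetic tests a calligrapher would apply are not
modelled here.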

The price of beauty is a little pain.

Pierre MacKay