[XeTeX] New feature planned for xetex

Kamal Abdali k.abdali at acm.org
Fri Feb 19 16:26:08 CET 2016

Hi Zdeněk,

Kudos! You figured one out correctly and came close on the second one; you
missed it only because I gave you the wrong clue! Sorry. The two word-separated parsings
of the second text are:
جمالو   ہار   گیا۔
جما   لوہار   گیا۔
meaning: "Jamaloo was defeated" and "The ironsmith Jumma has left". While
we are on fun and games, it's worth mentioning an embarrassment related to
the same Nastaleeq ambiguity that a Pakistani TV channel suffered
recently.  An announcer pronounced the Urdu transliteration of the English
phrase "motor vehicle" as "MoTroo Haikal", most likely thinking of it as a
proper noun.

Kamal Abdali

On Fri, Feb 19, 2016 at 5:18 AM, Zdenek Wagner <zdenek.wagner at gmail.com>
wrote:

> 2016-02-19 4:25 GMT+01:00 Kamal Abdali <k.abdali at acm.org>:
>> On Thu, Feb 18, 2016 at 7:38 PM, Zdenek Wagner <zdenek.wagner at gmail.com>
>> wrote:
>>> I have compared both and personally I like Jonathan's version. Of
>>> course, I am not an expert. I do not have any collection of high quality
>>> Urdu documents. I have only seen Mirza Ghalib's manuscript in his museum in
>>> New Delhi and some Urdu documents in the museum in Lal Qila. My knowledge
>>> of Urdu is very weak. Spoken Urdu is basically the same language as Hindi
>>> so that I can listen to BBC Urdu and understand almost everything but
>>> reading is difficult for me and I know nothing about calligraphy. It will
>>> take me hours to read the sample text, I can only recognize from the title
>>> that it is the Universal Declaration of Human Rights. Anyway, the larger
>>> interword spaces do not help me to read the text.
>>> As an example I am attaching the text from the Jama Masjid in New Delhi.
>>> Look at the beginning of the first line. There is a considerable space
>>> between آ and پ although آپ is a single word. The interword space between
>>> آپ and جامع is smaller than the space in the middle of جامع and there is
>>> almost no space between جامع and مسجد. There is no space between پر and
>>> زیارت but I still can see the words. In the third line the largest space is
>>> in the middle of پرکشش. Of course, it helped me to see the same text in
>>> Devanagari, I would probably be unable to read the Urdu text without it.
>> Zdeněk,
>> If each word in Urdu (or in any language written using Arabic characters)
>> formed a connected figure, then any amount of interword space (including
>> zero) would be OK. But since some letters connect with the next letter and
>> some do not, words often consist of two or more separate figures. Having
>> interword spaces then helps to delimit each word. Stringing words together
>> without any space between them is an incessant source of ambiguities and
>> problems. That's why all scripts for the Arabic alphabet other than
>> Nastaleeq now use interword spaces. This forum is not a place to go into
>> more details, so I'll just give you two examples in the form of
>> entertaining puzzles. Without interword spaces, you can read a certain Urdu
>> text (word string) as:
>> EITHER "He is eighty-four years old."
>> OR "That thief is eighty years old."
>> Another one can be read
>> EITHER "Jamaloo was defeated."
>> OR "Jumma went to Lahore."
>> (Jamaloo and Jumma are both common nicknames.) New learners are
>> constantly frustrated because the printed shapes in front of them provide
>> no visual help in separating the words. Basically, the script assumes that
>> you already know what you're trying to learn by reading!
>> Again, I am not calling for a ban on tight kerning, but I am asking
>> Jonathan to be flexible about interword spaces for anyone who wants it. At
>> present most Urdu word processors make it very difficult to overcome
>> interword space suppression in Nastaleeq fonts.
>> Kamal Abdali
> Hi Kamal,
> thank you for the examples. I see the problem of چوراسی and چور اسی without
> and with the interword space. The spaces will be needed especially in
> textbooks of Urdu and in dictionaries.
> Could you, please, send me the second example in Urdu? It is interesting
> to me. I can guess that the second sentence ends with حلاحور گیا and by
> similarity with Hindi I could imagine the verb حارنا but then the first
> sentence would end with حار گیا
> The ending is thus different (حار versus حور) but as I wrote, I may be
> mistaken.
> I hope the first example in full is:
> وہ چوراسی سال کا ہے،
> وہ چور اسی سال کا ہے۔
> Zdeněk Wagner
> http://ttsm.icpf.cas.cz/team/wagner.shtml
> http://icebearsoft.euweb.cz
>> --------------------------------------------------
>> Subscriptions, Archive, and List information, etc.:
>>   http://tug.org/mailman/listinfo/xetex