[XeTeX] [INDOLOGY] {भारतीयविद्वत्परिषत्} Re: Issues with Sanskrit 2003 font
Zdenek Wagner
zdenek.wagner at gmail.com
Mon Jun 19 10:58:30 CEST 2017
I should add one more note. Indian languages are complex: the same word may
have two or more orthographies, all of them correct. Even the word Hindi has
two forms in Devanagari, हिंदी and हिन्दी, and both are found with more or less
the same frequency. There are two forms of independent A, and both are in use.
The character group nna is sometimes written as a conjunct, sometimes as
half-na + na. I have a leaflet in which two parts of the text are printed in
different fonts: one part contains the nna conjunct, the other contains
half-na + na. When working on @modernhindi in the Velthuis Devanagari system,
Anshuman Pandey asked both Indian typographers and ordinary people what they
prefer to see. This selection is implemented in freefont.
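
To make the difference concrete, here is a minimal Python sketch that lists
the code points behind the two spellings of Hindi and behind the two nna
variants. The variable names and the describe() helper are only my
illustration. Putting ZWJ after the virama is the Unicode convention for
requesting a half form; whether a given font and shaping engine honour it,
or draw the conjunct for the plain sequence, is the font's choice, so the
rendering is not guaranteed.

```python
import unicodedata

ANUSVARA = "\u0902"  # DEVANAGARI SIGN ANUSVARA
VIRAMA = "\u094d"    # DEVANAGARI SIGN VIRAMA (halant)
ZWJ = "\u200d"       # ZERO WIDTH JOINER
NA = "\u0928"        # DEVANAGARI LETTER NA

# The two spellings of "Hindi" mentioned above, written as escapes
# so that the code points are explicit.
hindi_anusvara = "\u0939\u093f\u0902\u0926\u0940"        # ha, i, anusvara, da, ii
hindi_half_na = "\u0939\u093f\u0928\u094d\u0926\u0940"   # ha, i, na, virama, da, ii

# "nna": the plain sequence leaves the choice (conjunct or half form)
# to the font; the ZWJ sequence explicitly asks for half-na + na.
nna_default = NA + VIRAMA + NA
nna_half_form = NA + VIRAMA + ZWJ + NA


def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


if __name__ == "__main__":
    for label, text in [
        ("Hindi with anusvara", hindi_anusvara),
        ("Hindi with half-na", hindi_half_na),
        ("nna, font decides", nna_default),
        ("nna, half form requested", nna_half_form),
    ]:
        print(label)
        for line in describe(text):
            print("   ", line)
```

Both spellings of Hindi are different code point sequences, so a plain
string comparison or search treats them as different words; the font plays
no role there, only in the nna case.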
A somewhat extreme case can be found in the Czech-Hindi dictionary
published by the Central Hindi Directorate and available as a preview
online:
http://hindinideshalaya.nic.in/hindi/onlinebook/czechHindiDictionary.asp
Use "Next Page" or the direct link below to navigate to the colophon
page and look at how "dvitiy" is printed:
http://hindinideshalaya.nic.in/hindi/onlinebook/czechHindiDictionary.asp?currentPage=4
The result is that there are often several forms and all of them are
equally correct. We cannot say that one form is correct and another is
wrong. The implementation is thus a typographer's choice. And, of course,
languages evolve and orthography evolves. You may notice that some people
are not able to pronounce Brahma properly; they pronounce it Bramha and even
write it this way in Devanagari, for instance here:
http://aajtak.intoday.in/story/arti-of-lord-shiva-1-870255.html
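
In terms of code points the Brahma/Bramha difference is only the order of ha
and ma around the virama. A small Python sketch, assuming the bare stem
without a final vowel sign; the names HMA, MHA and hm_order are mine and
purely illustrative:

```python
BA, RA = "\u092c", "\u0930"                      # DEVANAGARI LETTER BA, RA
HA, MA, VIRAMA = "\u0939", "\u092e", "\u094d"    # HA, MA, SIGN VIRAMA

HMA = HA + VIRAMA + MA   # traditional order, as in "Brahma"
MHA = MA + VIRAMA + HA   # swapped order, as in "Bramha"

# Assumption: the stem is spelled ba + virama + ra followed by the cluster.
brahma = BA + VIRAMA + RA + HMA
bramha = BA + VIRAMA + RA + MHA


def hm_order(word):
    """Return 'hma', 'mha', or None depending on which cluster the word contains."""
    if HMA in word:
        return "hma"
    if MHA in word:
        return "mha"
    return None


if __name__ == "__main__":
    print(hm_order(brahma), hm_order(bramha))
```

Both strings contain exactly the same code points, only in a different
order, so a font has to provide both the hma and the mha conjunct (or fall
back to half forms) if it is to render either spelling.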
Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz
2017-06-19 8:55 GMT+02:00 Zdenek Wagner <zdenek.wagner at gmail.com>:
> 2017-06-19 1:16 GMT+02:00 Christian Boitet <Christian.Boitet at imag.fr>:
>
>> Hi, 18/6/17
>>
>> Marathi (script) is surely not a subset of Hindi (script) as, for example,
>> there are 2 letters "L" in Marathi and only 1 in Hindi.
>>
>
> You are right. I forgot Salman Khan who has this second L in his name.
>
>>
>> Maybe some colleagues from India or Pakistan could help. I put 3 in copy.
>>
>> IITB/CFILT has been working on Hindi, Marathi, Bengali and more for years
>> under Prof. Pushpak Bhattacharyya. Ritesh Shah is finishing his PhD with
>> both of us and is a Gujarati native speaker.
>> Prof. Pushpak is currently President of the ACL and knows everybody in
>> NLP in India. He can certainly answer many questions and give pointers to
>> colleagues who know all the details of the "Indo-Pak" languages. Abbas
>> Malik, who also did his PhD with us, probably knows the most about Indo-Pak
>> languages and their scripts, as he did his PhD on transliteration between
>> scripts of these languages (many have 2).
>>
>
> Yes, IITB/CFILT produces, among other things, online dictionaries and
> corpora for both Hindi and Marathi. They use their own transliteration,
> which is not very intuitive; for instance, they use "w" for dental T, which
> was a source of quite a lot of errors. I reported the errors and Jaya
> Saraswati corrected them. I think that their dictionaries are the best on
> the web.
>
>>
>> Best,
>> Christian Boitet
>>
>>
>>
>> On 18 June 2017 at 17:35, Zdenek Wagner <zdenek.wagner at gmail.com> wrote:
>>
>> 2017-06-18 16:38 GMT+02:00 Mike Maxwell <maxwell at umiacs.umd.edu>:
>>
>>> On 6/18/2017 4:04 AM, Zdenek Wagner wrote:
>>>
>>>> as far as I know the Devanagari fonts are either Sanskrit with all
>>>> conjuncts that cannot be switched off or Hindi without the Sanskrit
>>>> conjuncts.
>>>>
>>>
>>> Do other languages that use Devanagari, like Gujarati, use the same
>>> conjuncts as Hindi?
>>>
>>
>> Gujarati is written in the Gujarati script. Devanagari is used for Marathi
>> and Nepali. There is a Nepali Linux Group; I offered to create xindy rules
>> for them, and Steve White asked me about the conjuncts so that he could
>> implement the Nepali language, but I got no reply from them. I have no
>> response from Marathi users either, but I have some printed documents in
>> Marathi and it seems that the set of conjuncts is the same as in
>> present-day Hindi. (Marathi does not use characters with nuktas, thus the
>> name of the Bollywood actress Priyanka Chopra is written as प्रियंका चोपड़ा in
>> Hindi newspapers and as प्रियांका चोप्रा in Marathi newspapers.) I have not
>> been to Rajasthan, so I do not know whether Rajasthani, Mewari, or Marwari
>> have differences, but probably not.
>>
>> So the result is that Marathi is most probably a subset of Hindi, hence
>> Language=Hindi can also be used for Marathi. A strictly Marathi font may be
>> unusable for Hindi because the characters with nuktas, and especially their
>> conjuncts and half forms, need not be available in the font. I saw such a
>> font a few years ago but it was fixed.
>>
>>
>> Zdeněk Wagner
>> http://ttsm.icpf.cas.cz/team/wagner.shtml
>> http://icebearsoft.euweb.cz
>>
>>
>>
>>> --
>>> Mike Maxwell
>>> "My definition of an interesting universe is
>>> one that has the capacity to study itself."
>>> --Stephen Eastmond
>>>
>>
>>
>>
>> --------------------------------------------------
>> Subscriptions, Archive, and List information, etc.:
>> http://tug.org/mailman/listinfo/xetex
>>
>>
>> -------------------------------------------------------------------------
>>
>> Christian Boitet
>>
>> (Pr. émérite Université Grenoble Alpes)
>> Laboratoire d'Informatique de Grenoble
>> L I G
>>
>> Groupe d'Etude pour la Traduction Automatique
>>
>> et le Traitement Automatisé des Langues et de la Parole
>>
>> G E T A L P
>>
>>
>> --- Adresse postale ---
>> GETALP, LIG-campus
>> Bâtiment IMAG, bureau 339
>> CS 40700
>> 38058 Grenoble Cedex 9
>> France
>>
>>
>