[XeTeX] [INDOLOGY] {भारतीयविद्वत्परिषत्} Re: Issues with Sanskrit 2003 font

Zdenek Wagner zdenek.wagner at gmail.com
Mon Jun 19 08:55:14 CEST 2017


2017-06-19 1:16 GMT+02:00 Christian Boitet <Christian.Boitet at imag.fr>:

> Hi, 18/6/17
>
> Marathi (script) is surely not a subset of Hindi (script) as, for example,
> there are 2 letters "L" in Marathi and only 1 in Hindi.
>

You are right. I forgot Salman Khan, who has this second L in his name.

>
> Maybe some colleagues from India or Pakistan could help. I put 3 in copy.
>
> IITB/CFILT has been working on Hindi, Marathi, Bengali and more for years
> under Prof. Pushpak Bhattacharyya. Ritesh Shah is finishing his PhD with
> both of us and is a native Gujarati speaker.
> Prof. Pushpak is currently President of the ACL and knows everybody in NLP
> in India. He can certainly answer many questions and give pointers to
> colleagues who know all the details of the "Indo-Pak" languages. Abbas
> Malik, who also did his PhD with us, probably knows the most about Indo-Pak
> languages and their scripts, as he did his PhD on transliteration between
> the scripts of these languages (many have 2).
>

Yes, IITB/CFILT produces, among other things, online dictionaries and corpora
for both Hindi and Marathi. They use their own transliteration, which is not
very intuitive; for instance, they use "w" for the dental T, which was a source
of quite a lot of errors. I reported them and Jaya Saraswati corrected
them. I think that their dictionaries are the best on the web.

>
> Best,
> Christian Boitet
>
>
>
> Zdeněk Wagner
> http://ttsm.icpf.cas.cz/team/wagner.shtml
> http://icebearsoft.euweb.cz
>
> On 18 June 2017 at 17:35, Zdenek Wagner <zdenek.wagner at gmail.com> wrote:
>
> 2017-06-18 16:38 GMT+02:00 Mike Maxwell <maxwell at umiacs.umd.edu>:
>
>> On 6/18/2017 4:04 AM, Zdenek Wagner wrote:
>>
>>> as far as I know the Devanagari fonts are either Sanskrit with all
>>> conjuncts that cannot be switched off or Hindi without the Sanskrit
>>> conjuncts.
>>>
>>
>> Do other languages that use Devanagari, like Gujarati, use the same
>> conjuncts as Hindi?
>>
>
> Gujarati is written in the Gujarati script. Devanagari is used for Marathi
> and Nepali. There is a Nepali Linux Group; I offered to create xindy rules
> for them, and Steve White asked me about the conjuncts so that he could
> implement the Nepali language, but I got no reply from them. I have had no
> response from Marathi users either, but I have some printed documents in
> Marathi and it seems that the set of conjuncts is the same as in present-day
> Hindi (Marathi does not use characters with nuktas, thus the name of the
> Bollywood actress Priyanka Chopra is written as प्रियंका चोपड़ा in Hindi
> newspapers and as प्रियांका चोप्रा in Marathi newspapers). I have not been
> to Rajasthan, so I do not know whether Rajasthani, Mewari, or Marwari
> differ, but probably not.
>
> So the result is that Marathi is most probably a subset of Hindi, hence
> Language=Hindi can also be used for Marathi. A strictly Marathi font may be
> unusable for Hindi because the characters with nuktas, and especially their
> conjuncts and half forms, need not be available in the font. I saw such a
> font a few years ago but it was fixed.
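
The nukta difference between the two spellings quoted above can be checked mechanically. The following is a minimal Python sketch and not part of the original discussion: the helper name `has_nukta` is made up for illustration, and it assumes that a nukta can always be detected as U+093C after NFD normalization (the precomposed letters such as U+095C decompose into base letter plus nukta).

```python
import unicodedata

NUKTA = "\u093C"  # DEVANAGARI SIGN NUKTA


def has_nukta(text: str) -> bool:
    """Return True if text contains a nukta, precomposed or combining."""
    # NFD splits precomposed nukta letters (e.g. U+095C) into base + U+093C.
    return NUKTA in unicodedata.normalize("NFD", text)


# The two newspaper spellings of "Priyanka Chopra" quoted in the message.
hindi = "प्रियंका चोपड़ा"
marathi = "प्रियांका चोप्रा"

for label, name in (("Hindi", hindi), ("Marathi", marathi)):
    print(label, name, "nukta:", has_nukta(name))
```

A check of this kind only says whether a text needs nukta glyphs; whether a given font actually provides them, together with their conjuncts and half forms, still has to be verified against the font itself.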
>
>
>
>> --
>>    Mike Maxwell
>>    "My definition of an interesting universe is
>>    one that has the capacity to study itself."
>>          --Stephen Eastmond
>>
>
>
>
> --------------------------------------------------
> Subscriptions, Archive, and List information, etc.:
>  http://tug.org/mailman/listinfo/xetex
>
>
> -------------------------------------------------------------------------
>
> Christian Boitet
> (Pr. émérite Université Grenoble Alpes)
> Laboratoire d'Informatique de Grenoble (LIG)
> Groupe d'Etude pour la Traduction Automatique et le Traitement Automatisé
> des Langues et de la Parole (GETALP)
>
>
> --- Adresse postale ---
> GETALP, LIG-campus
> Bâtiment IMAG, bureau 339
> CS 40700
> 38058 Grenoble Cedex 9
> France
>
>
>
>