[XeTeX] babel

Javier Bezos listas at tex-tipografia.com
Thu Mar 24 08:21:26 CET 2016


Thank you very much. Very useful, and you confirm my suspicion that
the data in the CLDR is not always reliable. Furthermore, it is
obviously intended mainly for displaying plain text in
some specific contexts and not for fine typesetting. At first
my idea was to synchronize the ini files with the CLDR more or
less regularly, but now I'm not sure it's a good idea.

> I do not understand the meaning of the encoding field.

The goal is to provide information about which encodings
support or have supported the language, even partially
(certainly, one couldn't say OT1 supports any language
except English and a few others). This field is essentially

> I understand hyphenchar (should be the same as in English in all mentioned
> languages) but do not understand the other hyphen* fields.

Most of them are intended for luatex (only for the languages
where they make sense, of course).



> The minus sign in both Czech and Slovak should be –
> The quotes in both Czech and Slovak are „ and “ (the closing quote has its
> codepoint in Unicode but is rarely present in fonts, it is better to use
> English opening quote which has the same shape).
> In Czech (and maybe also in Slovak) the time separator is a period, in
> sport results and time tables a colon is used.
> Slovak: the characters Ä Ď Ô Ť in the index look strange to me; this should
> be verified by a native Slovak speaker.
> Hindi
> ====
> See the note on the encoding above
> A few misprints and missing items in the captions
> bib = संदर्भ-ग्रन्थ (or संदर्भ-ग्रंथ)
> contents - the version you have is one of the alternatives suggested by
> Anshuman Pandey but most books I have bought in India contain अनुक्रम
> part = खण्ड (or खंड)
> page = पृष्ठ
> proof = प्रमाण
> glossary = शब्दार्थ सूची
> cc, encl, and headto make no sense, I am probably the only man who writes
> business e-mails in Hindi...
> I have never seen abbreviated months (a native Hindi speaker should help).
> The only abbreviations for days of week I have seen at the Aligarh railway
> station are:
> Monday = सो॰, Tuesday = मं॰, Wednesday = बु॰, Thursday = बृह॰, Friday = शुक॰
> (or शुक्र॰, the plate was not clearly readable), Saturday = शनि॰, Sunday =
> रवि॰. I would not be surprised if the ॰ punctuation were omitted.
> [characters] ङ  and ञ are not used in Hindi, they should be removed from index
> frenchspacing – I am afraid that it makes no sense in Hindi or in other
> Indic languages. The proper spacing was implemented in GNU Freefont (at
> least for Hindi) and is activated automatically by language switching. The
> rules are explained (in Hindi only, links to other languages switch to a
> different text) at
> https://hi.wikipedia.org/wiki/%E0%A4%B5%E0%A4%BF%E0%A4%95%E0%A4%BF%E0%A4%AA%E0%A5%80%E0%A4%A1%E0%A4%BF%E0%A4%AF%E0%A4%BE:%E0%A4%B9%E0%A4%BF%E0%A4%A8%E0%A5%8D%E0%A4%A6%E0%A5%80_%E0%A4%AE%E0%A5%87%E0%A4%82_%E0%A4%B8%E0%A4%BE%E0%A4%AE%E0%A4%BE%E0%A4%A8%E0%A5%8D%E0%A4%AF_%E0%A4%97%E0%A4%B2%E0%A4%A4%E0%A4%BF%E0%A4%AF%E0%A4%BE%E0%A4%81
> punctuation: danda । and double danda ॥ should be listed as the most
> important punctuation
> quotes: either English double quotes or English single quotes are used
> (depends on the preference of an author and/or a publisher)
> number: Both Devanagari and Arabic digits are used, it is hard to say which
> one should be the default
> counters: the way list items are numbered does not conform to the LaTeX
> system. I have a normative document how it should be done, it is written in
> Marathi and I probably have also a Hindi version. Unfortunately I have not
> found time to implement it so far.
> Zdeněk Wagner
> http://ttsm.icpf.cas.cz/team/wagner.shtml
> http://icebearsoft.euweb.cz
> 2016-03-23 19:31 GMT+01:00 Javier Bezos <listas at tex-tipografia.com
> <mailto:listas at tex-tipografia.com>>:
>     Hi all,
>     I'm working on a new version of babel, with a new way to define
>     languages in a descriptive way, more than in a programmatic one (of
>     course, the latter won't be excluded because it's still necessary).
>     The idea is to create a set of ini files like those you can find on
>     https://latex-project.org/svnroot/latex2e-public/trunk/required/babel/locales/
>     They are tentative and some of them are incomplete. I'm working on the
>     code to read and 'transform' their data, but in the meanwhile I'd like
>     to improve the ini files. The first step in the roadmap is to provide
>     real utf-8 strings for captions and dates with current styles so
>     that they can be useable even without fontenc.
>     Any help or comments would be greatly appreciated.
>     [Crossposted to xetex and luatex lists.]
>     Javier
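
To illustrate the descriptive approach mentioned in the quoted message,
here is a minimal sketch in Python of reading UTF-8 captions from an
ini-style file. The [captions] section name, the key names and the
read_captions helper are assumptions made for illustration only; they are
not the actual babel locale format or code. The Hindi strings are the
ones suggested above.

```python
import configparser

# Hypothetical ini fragment: the section and key names are assumptions
# for illustration, not the real babel locale file layout.
SAMPLE_INI = """
[captions]
part = खण्ड
page = पृष्ठ
proof = प्रमाण
glossary = शब्दार्थ सूची
"""


def read_captions(text):
    """Return the entries of the [captions] section as a plain dict."""
    # Interpolation is disabled so that a literal '%' in a caption
    # would not be treated as a substitution marker.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return dict(parser["captions"])


captions = read_captions(SAMPLE_INI)
for name, value in captions.items():
    print(f"{name}: {value}")
```

The point of the sketch is only that such data is declarative: the
strings are stored as real UTF-8 text, and any code that consumes them
is kept separate from the language description itself.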
