[XeTeX] Localised [Xe][La]TeX (was : Localized XeLaTeX (was : Greek XeLaTeX))

Keith J. Schultz keithjschultz at web.de
Fri Oct 15 13:15:52 CEST 2010

I am aware that it is not a trivial task.

I have stated before that English is to be the default and,
far more importantly, the actual working language!
This would keep backward compatibility; in other words,
using "legacy" files would remain a trivial task.

If one wanted to use a localized language, one would have to load a
list of translations for commands, units, etc., and use a set command.

This approach would also allow for the switching between localized versions.
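As a rough sketch of what such a translation list might look like (the macro names and the \setmarkuplanguage command below are purely illustrative; no such package exists):

```latex
% Hypothetical German "translation list", mapping localized
% control words onto standard LaTeX commands.
\newcommand{\abschnitt}{\section}       % "section"
\newcommand{\unterabschnitt}{\subsection}
\newcommand{\fett}{\textbf}             % "bold"
\newcommand{\kursiv}{\textit}           % "italic"

% Hypothetical "set command": load the translation list for a
% given language (file naming scheme is an assumption).
\newcommand{\setmarkuplanguage}[1]{\input{markup-#1.tex}}
```

Switching between localized versions would then amount to loading a different translation file, while the underlying English commands keep working as before.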

Where the code could be injected to provide the functionality is, as you mentioned, important.

I think I can safely assume that the TeX code is somewhat archaic and not easy to patch,
though I have not looked at the sources since the 80s.


Am 15.10.2010 um 11:53 schrieb Philip Taylor (Webmaster, Ret'd):

> Keith J. Schultz wrote:
> >> 	Like I said, the best point to confront the problem is in the parser, at
> >> 	a low level, directly in the xetex engine, so that "normal" text is distinguished
> >> 	from the markup.
>> 	There seems to be a consensus that it would be a good idea to have the
>> 	markup localized. The idea seems workable.
> >> 	But the question remains: is someone willing to do the work on the xetex engine?
> >> 	Or is there anybody interested in doing so?
> Much as I am in favour of any proposal which will allow more
> people to use TeX (or any other system) through the medium
> of their own first language, I think that there are many Many
> MANY problem areas to be resolved before we start discussing
> who might do the work.  And it is not even clear to me that
> implementing this at the level of the parser is necessarily
> the optimal solution, since unless there were a matching
> "unparser" one could still not obtain (say) the American
> English equivalent of a document marked up in (say) Sinhala,
> as a result of which only those familiar with Sinhala could
> help another Sinhala speaker with his/her problems.  So
> let me outline what I see as some of the more difficult
> problems and ask if solutions to these are obvious to others.
> 1) The TeX language consists in part of control words,
> control symbols and keywords, together with a small
> number of characters that are "reserved" in some sense
> as a result of their having a particular category code.
> And only during the processing of a TeX document is it
> possible to know with absolute certainty what role any
> particular character or sequence of characters is playing,
> since TeX allows (almost) /everything/ to change its
> meaning on the fly.  Thus the only program that could,
> with 100% certainty, convert a document marked up in
> Language A to one marked up in Language B would be TeX
> itself.  But as TeX was not written with this functionality
> in mind, it would have to be retrofitted to the TeX
> source itself : a distinctly non-trivial task.
> Comment : (1) deals solely with monolithic documents :
> those that make no reference to anything other than
> themselves.
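A small plain TeX example (my own, not Philip's) illustrates the category-code point:

```latex
% In TeX, (almost) any character's role can change mid-document.
% Here the exclamation mark is given category code 0 ("escape"),
% so "!bf" afterwards means exactly what "\bf" meant before.
\catcode`\!=0
!bf This text is bold.
% A static translator scanning only for backslash-prefixed
% control words would miss !bf entirely.
```

Only a full TeX-compatible macro processor can know, at each point, which characters are currently markup.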
> 2) In real life, monolithic documents have virtually disappeared.
> Even the simplest letter that I write, for example,
> will \input A4-Letter, which will itself \input A4 and
> \input Letter; and most things that I write will \input
> many files rather than just one.  And as a Plain TeX
> user, I am one of a tiny, vanishing, minority : most
> will be using LaTeX and some will be using ConTeXt.
> In both cases, there will be an automatic requirement
> that other files will be \input.  Consider the following :
> 	\documentclass {minimal}
> 	\begin {document}
> 	\end {document}
> and then consider its log file, which reads (in part)
> 	(e:/TeX/Live/2010/texmf-dist/tex/latex/base/minimal.cls
> 		Document Class: minimal 2001/05/25 Standard LaTeX minimal class
> 	)
> Note that even this most trivial of documents requires at least
> one adjunct file : "minimal.cls", in this case.
> Now "minimal.cls" is written using standard LaTeX markup, which
> is based on American English; thus at the point of \inputting
> this file, a "Universal" TeX processor would have to detect that
> a new file was being processed, ascertain the language in which
> it was marked up (and no such files currently carry any "Markup-
> language" pragmat within them to indicate the markup language),
> process this file under a different language régime, and then
> revert to the original language régime once the file being
> \input had ended.
> Comment : (2) deals with the class of documents that process
> the whole of another document before returning to continue
> to process themselves.
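One conceivable shape for such a pragma, modeled on the "% !TeX program = ..." magic comments that some editors already recognise (this is a hypothetical syntax; no engine currently reads it):

```latex
% Hypothetical per-file markup-language pragma:
% !TeX markup-language = en-US
% The processor would switch its command-name tables to the
% declared language for the duration of this file, and restore
% the previous tables when the file ends.
```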
> 3) But TeX is not restricted to \inputting files; it can
> also open files for reading, and read them a line at a
> time.  It may then elect to process those lines as if
> they were TeX source, in which case each file opened would
> have to carry markup-language information, and the processing
> system would have to switch markup language before and after
> processing each line.
> Comment : (3) deals with the class of documents that process
> other documents on a line-by-line basis.
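The line-by-line reading Philip describes looks like this in plain TeX (the file name is illustrative):

```latex
% Read a file one line at a time.  Each \read tokenises its line
% under the CURRENT category codes and command meanings, so a
% localised processor would have to switch markup language
% around every single \read.
\newread\datafile
\openin\datafile=data.txt   % illustrative file name
\read\datafile to \currentline
\closein\datafile
```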
> 4) Of course, this is just touching the tip of the iceberg.
> We cannot know, a priori, whether a command \foo, embedded
> in a file "bar.tex", is referencing \foo from a document
> marked up in American English, French, or any other language
> that uses the Unicode characters "f" and "o".  The real
> complexities are absolutely horrific, and some very serious
> research and investigation would need to be carried out before
> this project could move from a "Wouldn't it be nice" to a "This
> is feasible and will take $n$ man-millennia to complete" state.
> This is not to suggest that we shouldn't start.  But nor
> should we underestimate the magnitude of the task that
> we are setting ourselves.
> Philip Taylor
> --------------------------------------------------
> Subscriptions, Archive, and List information, etc.:
> http://tug.org/mailman/listinfo/xetex
