[XeTeX] XeTeX 0.9 and utf8accents.sty

Fri Feb 25 19:16:23 CET 2005

On 26 Feb 2005, at 3:39 AM, Bruno Voisin wrote:
>
> On 25 Feb 05, at 17:42, Jonathan Kew wrote:
>
>> I seem to recall there was some discussion a while ago about whether  
>> this should be implemented at a different level, as a font encoding  
>> or something, but I don't remember the details, or know how feasible  
>> this would be.
>
> I'm just realizing such issues in standard LaTeX are considered to  
> arise at the encoding level, and are defined in the two files (part of  
> the inputenc package)
>
> 	/usr/local/teTeX/share/texmf.tetex/tex/latex/base/utf8.def
> 	/usr/local/teTeX/share/texmf.tetex/tex/latex/base/utf8enc.dfu
>
> The former, for example, includes
>
> 	\DeclareUnicodeCharacter{00A9}{\textcopyright}
> 	...<snip>...
>
> while the latter includes these and a lot more. Don't know whether  
> that's related to both questions.

Yes, it is.
While the utf8accents stuff is the sort of thing that is defined in  
font encoding definitions, there is a bit of a grey area in trying to  
subset everything so that the definition is enforced the way the  
LaTeX3 team would like (that is, EVERY single character in an encoding  
definition MUST ALWAYS be present in any font that is referred to by  
that specific encoding).
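
For anyone who hasn't looked at those files, here is a minimal sketch of  
the mechanism Bruno quotes, as seen from a document. It assumes  
standard LaTeX (pdflatex, not XeTeX) with the inputenc package's utf8  
option, and a source file saved as UTF-8. The declaration just repeats  
the one quoted above, to show what it does:
=============
\documentclass{article}
\usepackage[utf8]{inputenc}
% Map the input character U+00A9 to the internal LaTeX command
% \textcopyright. The standard files already do this; it is
% repeated here only to show the mechanism.
\DeclareUnicodeCharacter{00A9}{\textcopyright}
\begin{document}
© 2005 % typeset via \textcopyright, not passed to the font as U+00A9
\end{document}
=============
The input character never reaches the font directly; it is turned into  
an internal command, and the font encoding then decides how that  
command is typeset.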

The advantage would be that when using a font in XeTeX you could give  
it the "OSX" encoding, say, (as I previously endorsed, but later  
abandoned after the discussion JK referred to) and it would  
automatically load utf8accents for you. In an effort to emulate this  
transparency, fontspec loads the utf8accents package as a matter of  
course.

To actually create the encoding, you can simply use the following file  
in an appropriate place in the texmf tree:
=============
\ProvidesFile{osxenc.def}
\DeclareFontEncoding{OSX}{\newcommand\UTFencname{OSX}\RequirePackage{utf8accents}}{}
\endinput
=============

But it's not that simple. This works for the short term, but the TeX  
world doesn't like short term solutions.

Ideally, smarter people than me need to agree on a standard for these  
methods, covering all Unicode input in TeX-like programs, before the  
LaTeX3 team gives it a proper font encoding name. (Although Frank  
Mittelbach recommended splitting Unicode up into chunks and referring  
to them as eu1, eu2, etc. (eu == experimental unicode).)

But then we get back to the question: how many Unicode characters do we  
REALLY need internal LaTeX commands for, if XeTeX passes Unicode  
characters through directly? There are many issues to be discussed,  
but I don't know where or when or by whom.
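
To make "passes through directly" concrete, here is a minimal plain  
XeTeX sketch. It assumes the file is saved as UTF-8 and compiled with  
xetex, and that a font called "Times New Roman" is installed (the font  
name is only a placeholder; any installed font with the glyph will do):
=============
% Plain XeTeX, not LaTeX. The font name is an assumption.
\font\body="Times New Roman" at 12pt
\body
% Typed directly: XeTeX reads the UTF-8 input and asks the font
% for U+00E9 itself.
café

% The same character requested by code point, again with no
% LaTeX-level definition involved.
caf\char"00E9
\bye
=============
Neither line needs a \DeclareUnicodeCharacter-style definition. The  
open question is only which commands (accent commands like \'e, for  
instance) still need a mapping onto Unicode characters.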

Will
...on the tip of the iceberg...