[XeTeX] anti-xunicode ;-)

Ralf Stubner ralf.stubner at physik.uni-erlangen.de
Tue Jul 25 20:58:20 CEST 2006


Hi Ross,

Ross Moore <ross at ics.mq.edu.au> writes:

> One thing that I noticed in doing this work is that it's
> impossible to tell exactly what goes into the PDF file,
> as the streams are encoded for compression.
> Is there a way to turn off compression ?  (on a Mac)
> JK ?
>
>   xetex --help   tells nothing about this.

Just for the record: With xdvipdfmx this can be done with the option
'-z 0'. I don't know if xdv2pdf has a similar option. One could probably
also look at the xdv file, but I haven't tried that.

>> I am just not sure if LaTeX needs to know that a certain accented
>> character is available, since XeTeX seems to find that character when
>> presented with suitable decomposed input.
>
> That's true when you know that you have a "smart"(-ish) font.
>
> It's not so with old 7- or 8-bit fonts that you may be using
> within the same document, and which may still be needed for
> accents.
> So you need the means to be able to tell the difference.

I am unable to produce an example file for this. Inputting
<e><ogonekcomb> gives me the prebuilt eogonek with every native font I
have tried, including "dumb" Type 1 fonts.

> However, we don't want people writing ad hoc macros that solve the
> problem in a very limited way. Later they will try to extend these
> macros into more complicated situations, fail at this, then ask
> for help fixing them.
>
> Better is to do it right, in the most general way, first off.

Agreed. I just think there are things which cannot be done via TeX
macros at the moment.

>> * With XeTeX one can test for a glyph for the precomposed character.
>>   One can also test for a combining accent. One cannot test for the
>>   existence of suitable smart font features that make the latter work
>>   properly, though. Imagine the case of U+1E0B ḋ together with a font
>>   that contains a combining dot accent but no prebuilt ḋ. The ideal
>>   rendering for this character might have the dot to the left of the
>>   ascender of the d. Such a behaviour can be implemented via a 'mark'
>>   feature, but we can't be sure. Just letting XeTeX render <d><dotcomb>
>>   might also produce results where TeX's fallback of centering the dot
>>   above the <d> would be preferable.
>
> Such things can only be determined by a human, having a look and
> being dissatisfied with what the system can give automatically.

I think if a font contains a 'mark' feature for a given base-accent
combination, one can safely assume that the accent will be correctly
positioned using it. The problem is that you cannot check within XeTeX
if such a feature exists.

> So yes, get around it with the active-character trick.

The problem with active characters is that you cannot extend the trick
to cases where Unicode does not provide a precomposed character. And
these cases exist in the real world since AFAIK Unicode has basically
stopped adding more precomposed characters in the Latin script.
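
To make the last point concrete, here is a minimal sketch (Python, standard
library only) that asks whether a base letter plus a combining accent has a
precomposed code point at all. It only looks at the Unicode character
database via NFC normalization; it says nothing about what a particular font
or XeTeX does with the sequence. The helper name `precomposed` and the
choice of <q><dotcomb> as a counterexample are mine, purely for
illustration.

```python
import unicodedata
from typing import Optional

DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE
OGONEK = "\u0328"     # COMBINING OGONEK


def precomposed(base: str, combining: str) -> Optional[str]:
    """Return the single precomposed character for base + combining,
    or None if Unicode defines no such character (hypothetical helper)."""
    composed = unicodedata.normalize("NFC", base + combining)
    # NFC collapses the pair into one code point only if a canonical
    # precomposed form exists; otherwise the sequence stays two long.
    return composed if len(composed) == 1 else None


if __name__ == "__main__":
    for base, mark in [("d", DOT_ABOVE), ("e", OGONEK), ("q", DOT_ABOVE)]:
        result = precomposed(base, mark)
        if result is None:
            print(f"<{base}> + U+{ord(mark):04X}: no precomposed character")
        else:
            print(f"<{base}> + U+{ord(mark):04X}: U+{ord(result):04X}")
```

For <d><dotcomb> and <e><ogonekcomb> a precomposed character exists, so an
active character could be mapped to it. For <q><dotcomb> there is nothing to
make active, and one has to rely on the font (or on TeX's accent fallback).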

cheerio
ralf


