[XeTeX] xelatex and perltex: incompatibility?

Sat Dec 8 23:05:10 CET 2007

On 8 Dec 2007, at 4:13 am, Nickkk wrote:

> Hi
>
> There seems to be some incompatibility between perltex and xelatex over
> the use of pipes.  But I might be making a basic mistake.
>
> I asked Scott Pakin, the perltex creator, if there might be an
> incompatibility between the two.
>
>
> This is the problem:
>
> I'm using the latest perltex 1.5.  I'm using the texlive package from
> debian testing.
>
> I used two different documents, including a very minimal:
>
> ==============================
>
> \documentclass[12pt]{report}
> %\usepackage{xltxtra,fontspec,xunicode}
> \usepackage{perltex}
> %\setmainfont{Myriad Pro}
> \perlnewcommand{\Hello}
> {
> 	return "goodbye, cruel world";
> }
> \begin{document}
> 	\Hello
> \end{document}
>
> ==============================

This works fine with the version of perltex that's on my
(TeXLive-based) system, but in looking at the perltex code from the
latest package I believe I know why it fails there; see below.

>
> (the problem also reproduces with xltxtra,fontspec,xunicode, and  
> Myriad Pro.)
>
>
> It works fine with pdflatex.
> It works fine with xelatex, so long as I have not used  
> \perlnewcommand.
>
> But with xelatex as the latex command ( perltex --latex=xelatex test.tex )
> and \perlnewcommand in the preamble, I get this error:
>
> ==============
>  LaTeX Error: Missing \begin{document}.
>
> See the LaTeX manual or LaTeX Companion for explanation.
> Type  H <return>  for immediate help.
>  ...
>
> l.1 n
>      dinput
> ?
> ==============
>
> AND I get two or more "ndinput"s in the pdf before the actual contents
> of the command, which otherwise seems to have worked fine.
>
> Interestingly, if I put the command after \begin{document} I don't get
> the error, but I do get the "ndinput"s.
>
>
>
> Scott Pakin's response:
>
>> I haven't used XeLaTeX myself, but your minimal \perlnewcommand call
>> looks correct -- especially given that it works with pdflatex.  Yours
>> is the first report I've received of problems with PerlTeX+XeLaTeX
>> although I have heard that one of my other packages also has problems
>> with XeLaTeX (a completely different problem from yours, however).
>>
>> ...
>>
>> Strange.  Something certainly seems awry with xelatex's processing of
>> \endinput.  It's like it's discarding the "\e" then trying to typeset
>> the "ndinput".  If you create a file that ends with an \endinput line
>> and \input that into your main document, does xelatex process the
>> \endinput correctly?
>>
>> Do things improve with PerlTeX if you replace the line
>>
>>     if (!eval {mkfifo($pipe, 0600)}) {
>>
>> in perltex.pl with
>>
>>     if (1) {
>>
>> ?  I'm wondering if xelatex handles named pipes differently from how
>> pdflatex does.  The preceding change disables PerlTeX's named-pipe
>> support.
>
>
> I had a go at making the file that ends in \endinput, but it seems to
> choke in the same way.
>
> When I amended the perltex file, it ran smoothly on this, and on a more
> complex document.
>
> Scott Pakin thought people on this list would have a good insight into
> how xelatex handles pipes:
>
>> Ask if there's anything unusual about the way xelatex
>> reads files that could be expected to break named pipes.  In the
>> meantime, I suppose I should add a --nopipe command-line option to
>> perltex for the sake of xelatex users.
>
>
> Any comments?

Yes, there is indeed something "unusual" that would explain this.  
Well, it's not really unusual, but I can see why it's different from  
(pdf)tex.

The problem is that xetex assumes it can use fseek() on the input  
file to reset its position, and that fails with pipes. The assumption  
was that TeX input files, as processed with \input or \read, etc.,  
will be normal disk files.

What happens when xetex opens an input file is that it reads the  
first few bytes from the file, in order to detect whether it is a  
UTF-16 (big- or little-endian) file, or has a UTF-8 "BOM" signature;  
if so, it will automatically set the appropriate encoding mode, and  
skip the BOM. But in the (common) case of a plain ASCII file (or  
UTF-8 with no BOM), it uses fseek() to reset the read position to the  
beginning of the file. That operation fails on pipes, so the bytes
already read are never put back; that's why you're losing the first
two characters of \endinput (the "\e") and seeing "ndinput" typeset.
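
To make the mechanism concrete, here is a small Python model of the
behaviour just described. It is only a sketch, not the xetex source:
the function names are made up for illustration, and sniffing two
bytes first (with a third read only for a possible UTF-8 BOM) is an
assumption chosen to match the two lost characters in the report.

```python
import os
import tempfile


def sniff_then_rewind(f):
    """Model only: look for a BOM, otherwise try to seek back to the start."""
    head = f.read(2)
    if head == b"\xfe\xff":
        return "utf-16-be"
    if head == b"\xff\xfe":
        return "utf-16-le"
    if head == b"\xef\xbb" and f.read(1) == b"\xbf":
        return "utf-8-bom"
    try:
        f.seek(0)      # works on an ordinary disk file
    except OSError:
        pass           # a pipe is not seekable, so the sniffed bytes stay consumed
    return "no-bom"


def read_via_pipe(data):
    """Feed data through an anonymous pipe and return what is left after sniffing."""
    r, w = os.pipe()
    os.write(w, data)
    os.close(w)
    with os.fdopen(r, "rb") as f:
        sniff_then_rewind(f)
        return f.read()


def read_via_file(data):
    """Same as read_via_pipe, but through a regular temporary file."""
    with tempfile.TemporaryFile() as f:
        f.write(data)
        f.seek(0)
        sniff_then_rewind(f)
        return f.read()


print(read_via_file(b"\\endinput\n"))
print(read_via_pipe(b"\\endinput\n"))
```

On a system where pipes are not seekable, the disk-file read should
return the whole line while the pipe read should come back without its
first two bytes, which is the symptom in the report.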

Modifying xetex to eliminate the use of fseek() would be possible,  
but it's a bit of a nuisance; it'll still need to peek at the initial  
two or three bytes of the file, and then buffer them if they turn out  
to be normal characters rather than an encoding signature. I'll  
consider this...
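
A minimal sketch of that peek-and-buffer idea, in Python rather than
anything resembling xetex's own code. The names SniffedInput and
pipe_reader are hypothetical, and only the three signatures mentioned
above are checked; the point is just that sniffed bytes are kept in a
small buffer and handed back first, so no seek is ever needed.

```python
import os

# Signatures mentioned in the text; longest first so UTF-8 is tested before UTF-16.
BOMS = [
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
]


class SniffedInput:
    """Wrap a binary stream, detect a BOM, and never call seek()."""

    def __init__(self, raw):
        self._raw = raw
        self.encoding = None
        head = raw.read(3)
        self._pending = head          # bytes to hand back before reading more
        for bom, name in BOMS:
            if head.startswith(bom):
                self.encoding = name
                self._pending = head[len(bom):]   # drop only the BOM itself
                break

    def read(self, n=-1):
        if n < 0:
            out, self._pending = self._pending + self._raw.read(), b""
            return out
        out, self._pending = self._pending[:n], self._pending[n:]
        if len(out) < n:
            out += self._raw.read(n - len(out))
        return out


def pipe_reader(data):
    """Return a binary reader on an anonymous pipe preloaded with data."""
    r, w = os.pipe()
    os.write(w, data)
    os.close(w)
    return os.fdopen(r, "rb")


src = SniffedInput(pipe_reader(b"\\endinput\n"))
print(src.encoding, src.read())
```

The cost is the extra bookkeeping in read(), which is the nuisance
referred to above: every read path has to drain the pending bytes
before touching the underlying file.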

Meanwhile, a rather hackish workaround would be for perltex to put a  
few leading spaces on the \endinput line that it writes. Then the  
fact that the initial bytes get lost won't actually affect the  
behavior, and leading spaces should be harmless to other engines.
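
The effect of the padding can be checked with a small simulation. This
is a sketch under assumptions: the helper names are invented, four
spaces is an arbitrary choice for "a few", and the number of bytes
swallowed (two) is taken from the symptom reported above.

```python
import os


def read_after_failed_rewind(data, sniffed=2):
    """Send data through a pipe and drop the bytes a BOM sniff would consume."""
    r, w = os.pipe()
    os.write(w, data)
    os.close(w)
    with os.fdopen(r, "rb") as f:
        f.read(sniffed)    # consumed while sniffing, never restored on a pipe
        return f.read()


def endinput_line(pad=4):
    """The line a patched writer might emit: spaces, then \\endinput."""
    return b" " * pad + b"\\endinput\n"


print(read_after_failed_rewind(b"\\endinput\n"))
print(read_after_failed_rewind(endinput_line()))
```

With the padding, what survives should still be spaces followed by an
intact \endinput, and TeX skips spaces at the start of a line.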

JK