[texhax] best way to revise a large existing text
W.J. Metzger
wes at hef.ru.nl
Fri Oct 30 13:16:08 CET 2009
Frederik, Thanks for looking at this. So it appears that if I upgraded to
the newest perl, I would get a warning message instead of a segmentation
fault. Unfortunately, that is tricky, since I work on several computers for
which I am not the administrator, and the administrator prefers to wait for
an rpm from Scientific Linux. Anyway, by adding the comment line I
avoid the problem for now. Please inform me when you have a new version of
latexdiff. And thanks for your interest.
Regards, Wes
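
For anyone who wants to look for similar lines in their own files, here is a
minimal sketch in Python that lists comment lines containing unmatched braces.
It is not taken from latexdiff and does not reproduce its parsing; the function
names are made up for illustration. It assumes that a % not preceded by a
backslash starts a comment, and it ignores verbatim environments. It only
points to candidate lines: as described further down in this thread, a brace in
a comment alone did not reproduce the crash in a small test file.

```python
def split_comment(line):
    """Split one line into (code, comment); the comment excludes the first '%'."""
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\":
            i += 2  # skip the escaped character, e.g. \% or \\
            continue
        if ch == "%":
            return line[:i], line[i + 1:]
        i += 1
    return line, ""


def brace_balance(text):
    """Count unescaped '{' minus unescaped '}' in text."""
    balance = 0
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # \{ and \} are literal braces, not grouping
            continue
        if ch == "{":
            balance += 1
        elif ch == "}":
            balance -= 1
        i += 1
    return balance


def unbalanced_comment_lines(source):
    """Return [(line_number, balance)] for comments whose braces do not match."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        _, comment = split_comment(line)
        balance = brace_balance(comment)
        if balance != 0:
            hits.append((number, balance))
    return hits
```

Calling unbalanced_comment_lines(open("pap7.tex").read()) gives the line
numbers to inspect by hand; a positive balance means an unclosed { in the
comment, a negative one a stray }.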
On Thu, 29 Oct 2009, Frederik Tilmann wrote:
> Wes
>
> apologies for not having read your previous email carefully enough. My perl
> still does not crash but I get a warning message that gives a clue why you
> get a segfault:
> "Complex regular subexpression recursion limit (32766) exceeded at
> /home/tilmann/bin/latexdiff line 1159."
> and as you say this occurs during the initial parsing (see below). Even with
> that error I still get a reasonable result out (same as before) but of course
> this could be different once differences are introduced. This bug is not very
> straightforward to fix but I have been thinking for some time about parsing
> the comments in a completely different way which would also solve this
> problem, and your problem is an impetus to do this with high priority. Even
> so unfortunately it will be some time before I get round to this, and in the
> meantime I can only recommend upgrading your perl version which at least
> allows approximate parsing of the troublesome documents.
>
> Regards
> Frederik
>
>
>
>
>> ~/tmp 85> latexdiff -V pap7.tex pap7.tex > pap7-diff.tex
>> This is LATEXDIFF 0.5 (Algorithm::Diff 1.15 so)
>> (c) 2004-2007 F J Tilmann
>> Preamble Internal Type UNDERLINE
>> Preamble Internal Type SAFE
>> Preamble Internal Type FLOATSAFE
>> Differencing preamble.
>> amsmath package detected.
>> Preprocessing body. (0.11 s)
>> Splitting into latex tokens
>> Parsing pap7.tex
>> Complex regular subexpression recursion limit (32766) exceeded at
>> /home/tilmann/bin/latexdiff line 1159.
>>
>> WARNING: Inconsistency in length of input string and parsed string:
>> This often indicates faulty or non-standard latex code.
>> In many cases you can ignore this and the following warning messages.
>> Note that character numbers in the following are counted beginning after
>> \begin{document} and are only approximate.DEBUG Original length 109458
>> Parsed length 109456
>> Complex regular subexpression recursion limit (32766) exceeded at
>> /home/tilmann/bin/latexdiff line 1191.
>>
>> in terms of $Q$ the \taumodel\ provides a good description, much bette
>> ^^^^^^^^^^^
>> Missing characters near word 7199 character index: 101834-101845 Length: 9
>> Match: |provides | (expected match marked above).
>> Parsing pap7.tex
>> Complex regular subexpression recursion limit (32766) exceeded at
>> /home/tilmann/bin/latexdiff line 1159.
>>
>> WARNING: Inconsistency in length of input string and parsed string:
>> This often indicates faulty or non-standard latex code.
>> In many cases you can ignore this and the following warning messages.
>> Note that character numbers in the following are counted beginning after
>> \begin{document} and are only approximate.DEBUG Original length 109458
>> Parsed length 109456
>> Complex regular subexpression recursion limit (32766) exceeded at
>> /home/tilmann/bin/latexdiff line 1191.
>>
>> in terms of $Q$ the \taumodel\ provides a good description, much bette
>> ^^^^^^^^^^^
>> Missing characters near word 7199 character index: 101834-101845 Length: 9
>> Match: |provides | (expected match marked above).
>> (0.65 s)
>> Pass 1: Expanding text commands and merging isolated identities with
>> changed blocks
>> 7848 matching tokens in 0 blocks.
>> 0 discarded tokens in 0 blocks.
>> 0 appended tokens in 0 blocks.
>> (0.08 s)
>> Pass 2: inserting DIF tokens and mark up.
>> 7848 matching tokens.
>> 0 discarded tokens in 0 blocks.
>> 0 appended tokens in 0 blocks.
>> (0.11 s)
>> Postprocessing body.
>> (0.03 s)
>> Done.
>
> On 28/10/09 11:28, W.J. Metzger wrote:
>> Dear Frederik,
>>
>> This is what I get too for latexdiff pap7.tex pap7.tex > pap7-diff.tex
>> But did you try it with line 635 removed? That is the line
>> %%%%%%%% } ADDING THIS LINE PREVENTS segmentation fault in latexdiff
>> If I remove that line I get the segmentation fault.
>>
>> Cheers, Wes
>>
>> On Tue, 27 Oct 2009, Frederik Tilmann wrote:
>>
>> > Dear Wes,
>> >
>> > I can't reproduce the error: see the following transcript.
>> >
>> > > ~/tmp 56> latexdiff pap7.tex pap7.tex > pap7-diff.tex
>> > >
>> > > WARNING: Inconsistency in length of input string and parsed string:
>> > > This often indicates faulty or non-standard latex code.
>> > > In many cases you can ignore this and the following warning messages.
>> > > Note that character numbers in the following are counted beginning
>> > > after
>> > > \begin{document} and are only approximate.DEBUG Original length 109535
>> > > Parsed length 109533
>> > >
>> > > in terms of $Q$ the \taumodel\ provides a good description, much bette
>> > > ^^^^^^^^^^^
>> > > Missing characters near word 7149 character index: 101911-101922
>> > > Length: 9
>> > > Match: |provides | (expected match marked above).
>> > >
>> > > WARNING: Inconsistency in length of input string and parsed string:
>> > > This often indicates faulty or non-standard latex code.
>> > > In many cases you can ignore this and the following warning messages.
>> > > Note that character numbers in the following are counted beginning
>> > > after
>> > > \begin{document} and are only approximate.DEBUG Original length 109535
>> > > Parsed length 109533
>> > >
>> > > in terms of $Q$ the \taumodel\ provides a good description, much bette
>> > > ^^^^^^^^^^^
>> > > Missing characters near word 7149 character index: 101911-101922
>> > > Length: 9
>> > > Match: |provides | (expected match marked above).
>> > > ~/tmp 57> latexdiff --version
>> > > This is LATEXDIFF 0.5 (Algorithm::Diff 1.15 so)
>> > > (c) 2004-2007 F J Tilmann
>> > > ~/tmp 58> perl --version
>> > >
>> > > This is perl, v5.10.0 built for i386-linux-thread-multi
>> > >
>> > > Copyright 1987-2007, Larry Wall
>> > >
>> > > Perl may be copied only under the terms of either the Artistic
>> > > License or
>> > > the
>> > > GNU General Public License, which may be found in the Perl 5 source
>> > > kit.
>> > >
>> > > Complete documentation for Perl, including FAQ lists, should be found
>> > > on
>> > > this system using "man perl" or "perldoc perl". If you have access to
>> > > the
>> > > Internet, point your browser at http://www.perl.org/, the Perl Home
>> > > Page.
>> > >
>> >
>> > pap7-diff.tex seems to contain reasonable output. Only change to
>> > pap7.tex is
>> > that some newlines get removed. (particularly before comments, or
>> > where there
>> > are multiple newlines)
>> >
>> > It is not the perl version either; I ran the same sequence on another
>> > machine, which has perl v5.8.0, and get the same output as above. Also
>> > get
>> > the same result with latexdiff-so 0.5, and latexdiff-fast 0.42.
>> >
>> > If anyone else is reading this thread, can someone else reproduce?
>> >
>> > Your other reported bug (ignoring "\ ") is a real shortcoming leading to
>> > the warnings, and I will try to address this in the next version.
>> >
>> > Frederik
>> >
>> >
>> >
>> >
>> >
>> >
>> > On 27/10/09 16:20, W.J. Metzger wrote:
>> > > On Mon, 26 Oct 2009, Frederik Tilmann wrote:
>> > >
>> > > > Dear Wes
>> > > >
>> > > > I have never had any reports of segfaults, and I know some people
>> > > > have
>> > > > used
>> > > > it on their PhD thesis, so length should not really be an issue. It
>> > > > should
>> > > > really bail with a Perl error if there was anything wrong with the
>> > > > latexdiff
>> > > > code.
>> > > > What's your system and perl version? Did you try latexdiff-fast,
>> > > > which
>> > > > might
>> > > > be more robust if there is a memory problem with perl?
>> > > >
>> > > > Frederik
>> > >
>> > > Dear Frederik,
>> > >
>> > > I run on Scientific Linux 5.3, which is a clone of Red Hat Enterprise
>> > > 5.
>> > > The perl version is v5.8.8 built for i386-linux-thread-multi
>> > > latexdiff, latexdiff-fast and latexdiff-so all gave the segmentation
>> > > fault.
>> > >
>> > > I tried doing it also on another machine with a slightly older
>> > > version of
>> > > perl v5.8.5, but with twice the memory. It also gave the segmentation
>> > > fault.
>> > >
>> > > I've played around with the tex file and found that the segmentation
>> > > fault
>> > > could be avoided by adding a comment line -- line 635 of the attached
>> > > file.
>> > > If that line is removed, I get the segmentation fault.
>> > >
>> > > The segmentation fault occurs very quickly, almost immediately. So I
>> > > think
>> > > that latexdiff has not started looking for the differences yet.
>> > >
>> > > I thought that the problem might be misinterpreting a { that was in a
>> > > comment, since adding a comment with a } got rid of the segmentation
>> > > fault.
>> > > I attempted to isolate the problem in a small test file, containing
>> > > only
>> > > the \begin{figure} - \end{figure} in which line 635 occurs. But I did
>> > > not
>> > > get a segmentation fault with or without line 635.
>> > > So the problem is more complicated than just the { in a comment.
>> > >
>> > >
>> > > Another problem, but only a slightly annoying one, is an apparent
>> > > misparsing of a line ending in a \
>> > > e.g.
>> > > use \ell\
>> > > rather than l to avoid confusion with 1
>> > > Apparently the blank after \ell\ is not seen and results in warning
>> > > messages.
>> > >
>> > > Further, differences in equations sometimes lead to incorrect
>> > > mathmode in
>> > > the difference file resulting in latex needing to insert a $.
>> > >
>> > > All in all, latexdiff seems to work well for text, but has some
>> > > problems
>> > > when things get complicated.
>> > >
>> > >
>> > > > W.J. Metzger wrote:
>> > > > > On Fri, 23 Oct 2009, martin f. krafft wrote:
>> > > > >
>> > > > > > also sprach Boris Veytsman <borisv at lk.net> [2009.09.22.1700
>> > > > > > +0200]:
>> > > > > > > Try latexdiff,
>> > > > > > > http://www.ctan.org/tex-archive/support/latexdiff/
>> > > > > >
>> > > > > > That was a marvelous suggestion. Thanks.
>> > > > >
>> > > > > It sounded good to me too. So I downloaded it and tried it --
>> > > > > works
>> > > > > fine
>> > > > > on small tex files, but when I tried it on 'real' files it results
>> > > > > in a
>> > > > > segmentation fault. Do others also have this experience?
>> > >
>> > > Cheers, Wes
>> > > --
>> > >
>> > > Dr. W. J. Metzger Experimental High Energy Physics Group
>> > > tel. +31-24-3653127 Faculty of Science
>> > > +31-24-3652099 (secr.) Radboud University Nijmegen
>> > > fax. +31-24-3652191 Heyendaalseweg 135
>> > > 6525 AJ Nijmegen, The Netherlands
>> > > e-mail: wes at hef.ru.nl or Wesley.Metzger at cern.ch
>> > > http://home.cern.ch/metzger/ or http://www.hef.ru.nl/~wes
>> >
>> >
>> > --
>> > Frederik Tilmann
>> > Bullard Laboratories Tel. +44 1223 765545
>> > Department of Earth Sciences Fax. +44 1223 360779
>> > University of Cambridge email: tilmann at esc.cam.ac.uk
>> > Madingley Road http://bullard.esc.cam.ac.uk/~tilmann
>> > Cambridge CB3 0EZ
>> > UK
>> >
>
>
>
--
Dr. W. J. Metzger Experimental High Energy Physics Group
tel. +31-24-3653127 Faculty of Science
+31-24-3652099 (secr.) Radboud University Nijmegen (**)
fax. +31-24-3652191 Heyendaalseweg 135
6525 AJ Nijmegen, The Netherlands
e-mail: wes at hef.ru.nl or Wesley.Metzger at cern.ch
http://home.cern.ch/metzger/ or http://www.hef.ru.nl/~wes
(**) On 1 Sept. 2004 the University of Nijmegen changed its name