[OS X TeX] Meta-data facilities: was TeXShop's %& ugly bug

Tue Sep 14 16:17:05 CEST 2004

On 14 Sept. 2004, at 15:11, Curtis Clifton wrote:

>
> On Sep 14, 2004, at 4:48 AM, Jérôme Laurens wrote:
>
>>> How can "intended typesetting script" be too complex to include in 
>>> an XML file?  XML can represent arbitrary directed graphs.
>>
>> Yes, of course, you can easily implement the %! program = feature 
>> that TeXShop implements. But this is extremely limited.
>> How would you encode the BibTeX options that German users prefer?
>> How do you encode the engine used for the bibliography (there is a 
>> forthcoming MLBibTeX which extends BibTeX)?
>
> I'm sorry, I assumed that since you did development work you would 
> understand "arbitrary directed graphs".  Anything that can be 
> described by a context-free grammar (i.e., any common programming 
> language, scripting language, or command line command) can be 
> represented by a directed graph.  (In fact, by an acyclic directed 
> graph.)  I'm not suggesting a particular encoding; I'm merely refuting 
> your claim that "intended typesetting script" is too complex to encode 
> in XML.  That is clearly false.

Oh sorry, I agree that in theory you can encode almost anything you want 
with XML.
But I was focused on a practical implementation.
Before discussing how we should store information, we must all agree 
on -what- information should be stored.
It was easy when only the file encoding and similar settings were 
involved,
but typesetting issues involve so many frontends on so many 
systems...
If we want something really general to be shared, it must be a 
consensus adopted by many people.
My "complex" referred only to this task, not to the use of XML itself.
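
To make the -what- a little more concrete, here is a minimal Python 
sketch of one possible encoding of a typesetting sequence. The element 
names (typesetting, step, option), the helper functions and the chosen 
sequence of steps are all assumptions made for illustration; they are 
not part of any agreed specification, nor of TeXShop or iTeXMac.

```python
import xml.etree.ElementTree as ET


def build_typesetting_xml(steps):
    """Encode an ordered list of (engine, options) pairs as an XML tree."""
    root = ET.Element("typesetting")  # hypothetical root element
    for engine, options in steps:
        step = ET.SubElement(root, "step", engine=engine)
        for opt in options:
            ET.SubElement(step, "option").text = opt
    return root


def to_command_lines(root, jobname):
    """Turn the XML tree back into argument lists a frontend could run."""
    return [
        [step.get("engine")]
        + [o.text for o in step.findall("option")]
        + [jobname]
        for step in root.findall("step")
    ]


# Illustrative sequence only; the real content is what needs a consensus.
STEPS = [
    ("pdflatex", ["-interaction=nonstopmode"]),
    ("bibtex", ["-min-crossrefs=3"]),
    ("pdflatex", ["-interaction=nonstopmode"]),
    ("pdflatex", ["-interaction=nonstopmode"]),
]

if __name__ == "__main__":
    root = build_typesetting_xml(STEPS)
    print(ET.tostring(root, encoding="unicode"))
    for cmd in to_command_lines(root, "paper"):
        print(" ".join(cmd))
```

The encoding itself is the easy part. Which engines, options and 
conditions belong in such a file is exactly the open question.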

>
>> This "typesetting script encoding" should not be TeXShop centric, nor 
>> even mac centric: this must fit the widest range of frontends 
>> whatever system is used.
>
> I am certainly not trying to argue that TeXShop is the right answer to 
> all problems.  I regularly collaborate with co-authors running both 
> Solaris and Windows, so I'm also not trying to find a Mac-centric 
> solution.  I am trying to get a clear, accurate description from you 
> of what it is you are proposing.  Spurious claims that "intended 
> typesetting scripts" are too complex to include in an XML file, and so 
> we must adopt your wrapper idea, do not further your argument.

This is not at all what I had in mind! Sorry if my words did not 
match my thoughts.
The specifications of a TeX wrapper actually contain material about 
encoding and language in an XML file.
But no specs are given yet for typesetting, simply because I don't 
know what to put in them.
In fact, I have my own ideas, so I have written specifications for 
iTeXMac's private use only.
Once we have a good idea of what should go in, new specs will appear; 
there is room for that.

> Rather, they weaken it by demonstrating a lack of understanding.  This 
> isn't to say that I think your wrapper idea is faulty, just that parts 
> of your argument are.
>
>>> Is your TUG paper available on-line?
>>
>> See my TeX Wrapper Structure mail.
>
> Thanks!
>
>>
>>>
>>>> All the frontend specific data have nothing to do with that and 
>>>> must live somewhere else
>>>
>>> Why couldn't the XML schema for the project be extensible to include 
>>> front-end specific elements?  It would be simple to require that 
>>> front-ends ignore elements that they do not understand.  This way 
>>> iTeXMac and TeXShop specific data (for example) could live in 
>>> separate branches of the XML tree without interfering.  One could 
>>> also use attributes attached to a "frontend" element:
>>>
>>>
>>
>> - well XML files are not suitable for everything.
>
> I heartily agree.
>
>> - If you only need the information stored in a subtree, you must 
>> manage the whole tree.
>
> As I said before, standard library code can easily manage the rest of 
> the tree.

Yes, but there is a risk of corruption.
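
As a rough illustration of both positions, the Python sketch below 
updates only one utility's branch of a shared XML file and replaces the 
file in a single rename. The file layout (a "frontend" element with a 
"name" attribute) and the function name are hypothetical. Note that the 
standard parser used here silently drops XML comments, which is one 
concrete way a well-behaved utility can still damage data it does not 
own, and that the rename does nothing against two utilities overwriting 
each other's updates.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def update_frontend_data(path, frontend, key, value):
    """Set one attribute in our own <frontend> branch, leaving the rest alone."""
    tree = ET.parse(path)  # default parser: comments are not preserved
    root = tree.getroot()

    # Find our own branch; elements we do not understand are never touched.
    node = None
    for el in root.findall("frontend"):
        if el.get("name") == frontend:
            node = el
            break
    if node is None:
        node = ET.SubElement(root, "frontend", name=frontend)
    node.set(key, value)

    # Write to a temporary file in the same directory, then rename over the
    # original, so a crash never leaves a half-written file behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        tree.write(tmp, encoding="utf-8", xml_declaration=True)
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise
```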

>
>> - If 2 apps want to manage their own data (for example a text editor 
>> and a pdf viewer), they might have to write the same files 
>> concurrently. This may lead to sync problems...
>
> Now this is a good argument.
>
>> - The file system already has a tree structure.
>>
>> So if the information follows a tree structure, some subtrees might 
>> be stored as XML files, some leaves might be stored as single files 
>> and other subtrees as directories. For some data, a single XML file 
>> is more suitable; for other kinds of information, a mixed mode seems 
>> more suitable.
>
> OK, now your wrapper idea is starting to make sense to me.
>
>> Moreover, having a meta-data folder in which you can put any kind of 
>> thing is much more comfortable than an XML tree. For example, the TeX 
>> engine knows about the file system but knows nothing about XML. We 
>> can imagine creating a smart copy of the TeX sources in the 
>> meta-data folder: duplicate the file hierarchy as smart links and run 
>> TeX against the links. All the aux files are then created in the smart 
>> hierarchy and the original folders are left untouched. This would be 
>> consistent because these aux files are just cached data, another kind 
>> of meta information.
>
> I don't understand how putting the aux files in a separate directory 
> maintains cross-platform capability.  For example, while working on a 
> paper, one of my co-authors did not want to deal with my bibliography 
> (.bib) files.  So I would generate the .bbl file with BibTeX and 
> include it in the CVS repository that we shared.  He could then 
> re-typeset the paper---provided he didn't change the 
> references---without access to the bibliography files.  If the .bbl 
> file were put in a separate "meta-data" directory, how would LaTeX on 
> another platform find it?

If a frontend implements the scheme I described above without 
notifying the user, and without giving him a chance to work the way we 
have been used to working for decades, it is a preemptive frontend.
Bad frontend...
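
For reference, the "smart copy" scheme could be sketched in Python as 
an explicit, opt-in step like the one below. The function name and the 
location of the shadow directory are assumptions, and the "smart links" 
are taken to be plain symbolic links. A frontend would then run TeX 
with the shadow directory as its working directory, so that aux files 
land there rather than next to the sources.

```python
import os


def _contains(parent, child):
    """True if child is parent itself or lies somewhere below it."""
    return os.path.commonpath([parent, child]) == parent


def mirror_with_links(source_dir, shadow_dir):
    """Recreate the directory tree of source_dir under shadow_dir,
    with one symbolic link per source file. Returns the links created."""
    source_dir = os.path.abspath(source_dir)
    shadow_dir = os.path.abspath(shadow_dir)
    created = []
    for dirpath, dirnames, filenames in os.walk(source_dir):
        # Never descend into the folder that holds the shadow tree itself.
        dirnames[:] = [
            d for d in dirnames
            if not _contains(os.path.join(dirpath, d), shadow_dir)
        ]
        rel = os.path.relpath(dirpath, source_dir)
        target_dir = os.path.normpath(os.path.join(shadow_dir, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            link = os.path.join(target_dir, name)
            if not os.path.lexists(link):  # keep existing links and aux files
                os.symlink(os.path.join(dirpath, name), link)
                created.append(link)
    return created
```

Because the original folders are left untouched, a user who prefers 
the traditional layout simply never calls this step.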

>
> I wonder if the community would be better served by a discussion of 
> the requirements for a meta-data facility, rather than lobbying for a 
> particular solution.  So far it seems that the requirements are:
>
> 1. The meta-data facility must provide information needed by a variety 
> of front-ends, viewers, or typesetting engines (hereafter, 
> "utilities") without interference:
>    a. It should not be possible for meta-data for one utility to be 
> accidentally interpreted as meta-data for another utility.
>    b. If two utilities are running simultaneously, they should not 
> corrupt each other's meta-data.
>    c. Newer systems must respect the meta-data facilities of existing 
> systems.
> 2. The meta-data facility must be cross-platform compatible.
>    a. At a minimum, it must be possible to use standard utilities on 
> non-Mac platforms without corrupting the meta-data.
>    b. The meta-data should be useable by meta-data-aware utilities on 
> non-Mac platforms.
> 3. The meta-data facility should help to hide the complexity of TeX.
>
> TeXShop's %& "bug" fails to satisfy 1a and c.

For 1a, consider this an accident. But all meta-data embedded in the 
source file fails to satisfy 1b and 2a (even the simplest text editor 
lets the user edit this information).

> A single (XML) meta-data file fails to satisfy 1b.  I question whether 
> 3 is a valid requirement---certainly 1 and 2 would take priority over 
> 3.  For example, storing .bbl files in the same directory as the 
> source document is part of the meta-data facility for BibTeX.  Thus, 
> putting this data in a subdirectory to satisfy 3 would violate 1c.  So 
> I would argue that the .bbl files must stay in the source document 
> directory.
>
> Are there other requirements for a meta-data facility?

3 is definitely a valid requirement, simply because not all users are 
TeXperts, and those who are not need assistance.

Add the following:

4- The meta-data facility should help to -manage- the complexity of TeX.
5- Private meta-data should exist.
6- Shared meta-data and private meta-data should be clearly separated.
7- Utilities must be free to choose the format for their own private 
meta-data storage.

Shared meta-data is information needed by several utilities; for 
example, the text editor and the spell checker both need to know the 
file encoding. For this kind of information there is a possible 
synchronisation problem, which might need some special management.
For private meta-data only one utility is involved, and the 
synchronisation problem is practically nonexistent.
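
One way to picture requirements 5 to 7 is a folder with a shared area 
and one private area per utility. The Python sketch below is only an 
assumption about how that separation could look: the directory names 
"shared" and "private" and both helper functions are hypothetical and 
do not describe the actual wrapper specification.

```python
from pathlib import Path

SHARED = "shared"    # hypothetical: data several utilities read and write
PRIVATE = "private"  # hypothetical: one subfolder per utility


def private_dir(wrapper, utility):
    """Return (and create) the folder reserved for a single utility.

    The utility is free to store anything, in any format, inside it."""
    if not utility or utility in (".", "..") or any(c in utility for c in "/\\"):
        raise ValueError("utility name must be a plain folder name")
    path = Path(wrapper) / PRIVATE / utility
    path.mkdir(parents=True, exist_ok=True)
    return path


def shared_file(wrapper, name):
    """Return the path of a shared item such as the file encoding record.

    Writers of shared items still need some synchronisation scheme."""
    folder = Path(wrapper) / SHARED
    folder.mkdir(parents=True, exist_ok=True)
    return folder / name
```

Since each private folder has exactly one writer, only the items under 
the shared area need any coordination between utilities.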

Once I have finished my implementation, if it works as expected, I will 
start lobbying.
I expect to have a working demo in a couple of weeks.
--------------------- Info ---------------------
Mac-TeX Website: http://www.esm.psu.edu/mac-tex/
           & FAQ: http://latex.yauh.de/faq/
TeX FAQ: http://www.tex.ac.uk/faq
List Post: <mailto:MacOSX-TeX at email.esm.psu.edu>