[tex-live] Better ways to find packages and documentation

Thu Jul 5 01:10:28 CEST 2007

Hello,

Quoting Florent Rougon <f.rougon at free.fr>:

> Hi,
>
> Norbert Preining <preining at logic.at> wrote:
[...]

> > Question: Is any file included in more than 1 TLPOBJ?
> >
> > Answer:
> > 	current format:
> > 		grep '^ ' texlive.tlpdb | sort | uniq --repeated
>
> Beeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeep!
>
> Nope, this only works if two packages have the exact same file *paths*.
> The more useful thing would be to detect files with the same basename
> (which is, mmmyes, easy,

Yes, easy.
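
To make "easy" concrete, here is a minimal sketch of the basename check in Python. It assumes a flat text database in which each package record starts with a "name <package>" line and every file line begins with a space followed by the path (optionally followed by attributes); that layout and the helper name duplicate_basenames are assumptions for illustration, not the agreed format.

```python
import posixpath
from collections import defaultdict


def duplicate_basenames(tlpdb_lines):
    """Return {basename: set of packages} for basenames found in more than one package."""
    owners = defaultdict(set)
    package = None
    for line in tlpdb_lines:
        line = line.rstrip("\n")
        if line.startswith("name "):
            # Assumed record header: "name <package>"
            package = line[len("name "):].strip()
        elif line.startswith(" ") and package is not None:
            # Assumed file line: leading space, path, optional attributes
            fields = line.split()
            if fields:
                owners[posixpath.basename(fields[0])].add(package)
    return {base: pkgs for base, pkgs in owners.items() if len(pkgs) > 1}


# Made-up sample data, only to exercise the function
sample = """\
name foo
runfiles size=2
 texmf-dist/tex/latex/foo/common.sty
 texmf-dist/tex/latex/foo/foo.sty

name bar
runfiles size=1
 texmf-dist/tex/latex/bar/common.sty
""".splitlines()

for base, pkgs in sorted(duplicate_basenames(sample).items()):
    print(base, sorted(pkgs))
```

Unlike the grep/sort/uniq pipeline, this compares only the last path component, so the same file name shipped in two different directories is reported too.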

But what is not easy is following the discussion if one is not
already deep into the texlive / TeX directory structure, the
CTAN filesystem and the tagging stuff.

Me... I'm only a programmer, and what I like most is writing code from
scratch. That's the reason why I stopped working with Norbert on his
perl stuff.

Tell me what we need as input and what we need as output,
then I can start.

Pointing me at thousands of documents that I would have to read,
or telling me "invent a subroutine xyz, there already is a stub,
change it as you like and make it run" and afterwards giving me
the answer "well, it doesn't fit into the concept", is not the
way I like to work.
Let me build the black box, and I can help.
If there are things to write from scratch, fine.

If there is no documentation about what to do, but tons of
files which are from the old or the new structure, or ...
..this is annoying to me.
I'm a hacker, not an archaeologist. ;-)

So, somewhere you asked whether really only two people are working on
that whole stuff, and I think that is true.
I'm not good at taking over other people's code. And so it seems
you both have to do that.

If there is a task of the kind mentioned above,
and I can pick the language myself, I could help.

>
> > 	xml format:
> > 		shoot yourself ...
>
> Well, well, well, it's just a few Python lines of code away. Not shell,
> I admit.
>
> > If you need more examples ...
>
> OTOH, XML is eXtensible. *I*'ll give an example: when I started with my
> little movie catalog in XML, I had no <comments> elements yet (for
> storing my comments about a movie). When I found that useful, I was able
> to add them in the DB and the Python script was still working with *no*
> *modification* *at all*.

This can also be done with simple text-based fileformats.
It depends on how your parser is programmed.

It would also be possible to raise an error message for
your extended file format (I don't know whether a DTD could achieve this,
but I think so; I'm not an XML expert; but with some checks in the
parser code that analyses what the XML parser gives back as results,
this should also be possible => extending forbidden or allowed?!)

> It simply ignored these new elements. When I

This might be a good choice, but also a bad one,
depending on the application. A seeming extension might be
a typo...
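
A minimal sketch of that trade-off for a simple "key value" text format, in Python. The key set KNOWN_KEYS and the function parse_record are hypothetical, chosen only for illustration: in lenient mode unknown keys are collected and otherwise ignored (the "it still just works" behaviour), in strict mode they are rejected, which is what catches a typo.

```python
KNOWN_KEYS = {"name", "category", "tags"}  # hypothetical set of keys the reader knows


def parse_record(lines, strict=False):
    """Parse 'key value' lines into a dict.

    Lenient mode skips unknown keys and reports them; strict mode raises.
    """
    record, unknown = {}, []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        key, _, value = line.partition(" ")
        if key in KNOWN_KEYS:
            record[key] = value.strip()
        elif strict:
            raise ValueError(f"unknown key {key!r}")
        else:
            unknown.append(key)
    return record, unknown


# 'tagz' could be a new extension or a misspelled 'tags'; the parser cannot tell
lines = ["name hyperref", "category Package", "tagz hyperlinks pdf"]

record, unknown = parse_record(lines)
print(record, unknown)

try:
    parse_record(lines, strict=True)
except ValueError as err:
    print("rejected:", err)
```

Returning the list of skipped keys is a middle way: the reader keeps working on extended files, but a caller can still warn about anything it did not recognise.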

>
> So, you can extend the file format with no modification whatsoever to
> the program reading it, it still just works. When you have time, you can
> then extend the program to take advantage of the new data. Same thing
> with attributes (X-rated="yes" and such ;-) I hadn't thought of at
> first.

Yes, this *can* be a good way.

>
> One nice thing with XML is that the structure in the file can be
> directly mirrored to a structured object in
> Python/Perl/whatever-decent-programming-language (shell isn't one).

OCaml of course. :)

[...]
> > Furthermore I propose that we could extend the format of the docfiles
> > lines as follows:
> > docfiles size=*****
> >  file1 attrib1=value1 attrib2=value2 ...
> >  file2 attrib1=value1 ...
>
> This is more or less OK, but looks more and more like XML. :)

Well, yes, it goes in that direction ;-)
but XML is more bloaty...

[...]
>
> And you'll have to come even closer to XML, because you need to quote
> the attribute values, since I need the "details" attributes from the
> Catalogue in order to display a nice description of each document in the
> UI:
>
>   <documentation details='Manual, PDF version:'  language='en'
>                  href='ctan:/macros/latex/contrib/hyperref/doc/manual.pdf'/>
>   <documentation details='Summary of options:'  language='en'
>                  href='ctan:/macros/latex/contrib/hyperref/doc/options.pdf'/>
>
> So, it's not:
>
>  file1 attrib1=value1 attrib2=value2 ...
>  file2 attrib1=value1 ...
>
> but rather:
>
>  file1 attrib1="value1" attrib2="value2" ...
>  file2 attrib1="value1" ...
>
> and you'll have to make up yet another quoting scheme for the cases
> where we need a double quote in an attribute value... See how you're
> slowly reinventing XML? :)

But some lines later you tell us that this case is unlikely. ;-)
(...blank lines and newlines...)
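
For what it is worth, a quoting scheme does not have to be invented from scratch. As a sketch, Python's standard shlex module already splits a line with shell-like quoting rules, including backslash-escaped double quotes inside double-quoted values. The line layout (path first, then attrib="value" pairs), the sample paths and the function parse_file_line are assumptions for illustration, not the proposed format.

```python
import shlex


def parse_file_line(line):
    """Split 'path attr="value" ...' into (path, {attr: value}) using shell-like quoting."""
    tokens = shlex.split(line)
    if not tokens:
        raise ValueError("empty file line")
    path, attrs = tokens[0], {}
    for token in tokens[1:]:
        # Split on the first '=' only, so values may themselves contain '='
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"attribute without '=': {token!r}")
        attrs[key] = value
    return path, attrs


# Hypothetical file lines; the second one has double quotes inside a value
print(parse_file_line(' doc/manual.pdf details="Manual, PDF version:" language="en"'))
print(parse_file_line(r' doc/intro.pdf details="The \"short\" guide" language=en'))
```

One caveat of this choice: shlex also treats single quotes as quoting characters, so a value containing an apostrophe has to be wrapped in double quotes.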

>
> ,----
> | Theorem (F. Rougon, 2007)
> |
> | Any custom text file format tends to become a degraded version of XML
> | as adding features requires to extend it.

One could also invent a kind of simple language
that makes things possible through implemented commands
instead of a bloated file format.
But you could also argue that XML can be used for that
purpose as well ;-)

...but it would be more bloaty... or would you rewrite your
Python code in an XML style?! ;-)

[...]
> > Furthermore, we add (optionally) for every TLPOBJ a line
> > 	tags <tag1> <tag2> <tag3> ...
> > to get the per packages tagging.
>
> OK (yes, I admit we don't really need spaces/newlines in tag names).
>

see above ;-)

[...]
> > If you agree on that, we should start (in private email) to write a
> > decent proposal with:
> > - rational
>
> That's spellt "rationale"...

Something like documentation of the whole process?
That's what I was looking for when I subscribed to this list.

[...]
>
> (yes, I know you may be doing that intentionally to make XML look
> complicated :)
>
> (and yes, I do think XML is complicated *if* you want to understand all
> the myriad of extensions around it such as XPath, XLink, XML Schemas,
> RELAX-NG, XWhatever, but basic XML is simple)
>
> With your answers to this mail, I should be able to start working (which
> doesn't mean you'll see immediate results, because I'll have to learn Qt
> again and see how to work with libtagcoll, but being able to assemble
> all the parts in my little head will greatly improve my peace of mind
> :).

Qt?

Will the next texlive-DVD include graphical installers based on Qt?!

Or have I completely lost track of the discussion?

Maybe discussing by mail is not helping here,
because I will not read 10,000 fragments of docs and code snippets
just because there is no overview of the concept that is planned to be implemented.

Ciao,
   Oliver

-- 
http://me.in-berlin.de/~first/