[tug-summer-of-code] A couple of project proposals

Daniel Kirsch danishkirel at googlemail.com
Tue Jun 30 16:10:58 CEST 2009


> Very cool!  That's just what I was thinking of when I proposed
> handwriting-based symbol search for Google Summer of Code.  If you
> don't mind my asking, about how long did it take you to write that?
> (I don't know if you read the rest of this thread, but we had some
> discussion about whether this project would take far more than a
> summer to complete, and your answer can certainly help us calibrate
> project scope for next year.)

I am sorry to say that I have absolutely no idea how much time the
whole thing took. I have to stress that I was new to pattern
recognition and inexperienced in LaTeX when I started detexify (first
commit on May 24). So most of the time was spent getting into the
topic, and I am still not sure whether I chose the best
technologies/algorithms/features.

> I trained a few symbols on your site and noticed that many of the
> symbols are accented letters, just because there are so many of them:
> \acute{a}, \acute{b}, \acute{c}, ..., \acute{z}, \hat{a}, \hat{b},
> ..., \hat{z}, etc.  Maybe you could bias the training requests towards
> some of the more obscure or hard-to-name symbols?

I have now integrated the training into the searching, which makes
more sense anyway. The old training is still available and will always
offer the symbols with the fewest samples to be trained.
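
To illustrate the "fewest samples first" idea, here is a minimal
sketch in Ruby. It is not the actual detexify code: the sample counts,
the constant SAMPLE_COUNTS and the method least_trained are all made
up for the example.

```ruby
# Hypothetical sample counts per symbol; not detexify's real data.
SAMPLE_COUNTS = {
  '\\acute{a}'        => 12,
  '\\hat{b}'          => 3,
  '\\aleph'           => 0,
  '\\rightsquigarrow' => 1
}

# Return the `limit` symbols that have the fewest training samples,
# so that poorly covered symbols are offered for training first.
def least_trained(counts, limit = 1)
  counts.sort_by { |_symbol, n| n }      # ascending by sample count
        .first(limit)                    # keep the least-trained pairs
        .map { |symbol, _n| symbol }     # drop the counts
end

puts least_trained(SAMPLE_COUNTS, 2)
```

With counts like these, the heavily trained accented letters only come
up once the rarer symbols have caught up.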

> Have you already trained the program on the typeset versions of the
> symbols, or do you require handwritten input?

Everything is based on handwritten input. That was a performance
decision. I have experimented with analyzing image data and found it
to be too slow (in Ruby with RMagick at least).

> Once again, good job!  I hope you manage to get the program trained on
> lots of symbols in the near future.

I would really like to support the Comprehensive List of LaTeX
Symbols, but as already noted I am not very experienced with LaTeX.
The system should work with any kind of hand-drawn symbol, but right
now my problem is that I don't know how to get all these symbols
rendered for the web. I am using MathTeX
(http://www.forkosh.com/mathtex.html) to render the symbols.

Daniel

