Google Summer of Code 2008 and TUG

This year TUG is participating in Google's Summer of Code program as a “mentoring organization”.

All organizational SoC-related discussions for TUG happen on the mailing list; feel free to subscribe.

This page has all the SoC-related information for TUG: our project ideas, a suggested project application template, information on submitting more ideas for the list, and a few general links.

The first three projects listed here were funded to go forward. The final proposals are listed on TUG's summary page at Google.

Project ideas

- Better Unicode compliance

The most recent variants of TeX (XeTeX, LuaTeX) allow direct UTF-8 input and are therefore thought of as ‘Unicode-enabled’. Nevertheless, they lack many capabilities that Unicode requires. For example, none of the various TeX extensions (also known as ‘engines’) handles combining characters correctly: in some cases, sequences including combining characters will render correctly because the current font supports them, but no provision is made for general handling of Unicode combining characters. At the same time, TeX has always had an outstanding tradition of handling complicated diacritic marks (using the \accent primitive), which predates Unicode by almost 15 years, but there has been almost no attempt to relate this to Unicode. An interesting addition to TeX could make that relation explicit while implementing the processes specified by Unicode (in particular normalization, which defines canonical transformations between fully composed and fully decomposed character sequences).
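As a rough illustration of the relation between Unicode normalization and TeX accents, the following Python sketch decomposes text with the standard `unicodedata` module and rewrites each base letter plus combining mark as a plain TeX accent macro. The `ACCENT_MACROS` table and the `to_tex_accents` function are hypothetical names invented for this example; they cover only a handful of marks and are not part of any TeX engine or of the proposed project.

```python
import unicodedata

# Hypothetical mapping from Unicode combining marks to plain TeX accent
# macros; a real implementation would need a far larger table.
ACCENT_MACROS = {
    "\u0300": "\\`",   # COMBINING GRAVE ACCENT
    "\u0301": "\\'",   # COMBINING ACUTE ACCENT
    "\u0302": "\\^",   # COMBINING CIRCUMFLEX ACCENT
    "\u0303": "\\~",   # COMBINING TILDE
    "\u0308": '\\"',   # COMBINING DIAERESIS
}


def to_tex_accents(text):
    """Rewrite accented letters as TeX accent macros (one mark per letter)."""
    out = []
    # NFD splits precomposed characters into base letter + combining marks.
    for ch in unicodedata.normalize("NFD", text):
        macro = ACCENT_MACROS.get(ch)
        if macro is not None and out:
            base = out.pop()
            out.append(f"{macro}{{{base}}}")
        else:
            out.append(ch)
    return "".join(out)


composed = "\u00e9"      # é as a single precomposed code point
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

# The two forms are canonically equivalent: normalization maps one to the other.
print(unicodedata.normalize("NFD", composed) == decomposed)
print(unicodedata.normalize("NFC", decomposed) == composed)
print(to_tex_accents("caf\u00e9"))
```

Marks that are not in the table are passed through unchanged, which is exactly the kind of gap that general handling of combining characters would have to close.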

Another example is bidirectional typesetting: although experiments in mixing right-to-left with left-to-right text in TeX were made as early as 1987 (thus again predating Unicode), little effort has been made to make TeX compliant with the Unicode bidi algorithm. A third example is line breaking: TeX's hyphenation algorithm is universally regarded as very good, but it does not take Unicode line breaking properties into account.
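The Unicode bidi algorithm starts from a per-character property, the bidirectional class. The short Python sketch below only looks up that property with the standard `unicodedata` module; it does not implement the algorithm itself, and the helper names `bidi_classes` and `has_rtl` are made up for this illustration.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, bidirectional class) pairs for each character."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


def has_rtl(text):
    """True if the text contains a strongly right-to-left character."""
    # "R" is used for Hebrew letters, "AL" for Arabic letters.
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


# Latin letter, space, Hebrew alef, Arabic alef, European digit.
sample = "a \u05d0\u0627" + "1"
for ch, cls in bidi_classes(sample):
    print(f"U+{ord(ch):04X} {cls}")

print(has_rtl("TeX"))
print(has_rtl("TeX \u05d0"))
```

A compliant engine would feed these classes into the resolution and reordering steps that the Unicode bidi algorithm defines; this sketch stops at the lookup.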

These properties could be implemented in a number of ways, using either or both of the newest TeX engines (XeTeX and LuaTeX, as already mentioned), and doing so would give rise to a much more fully Unicode-compliant extension of TeX.

Proposed by the potential student; mentor would be Eric Muller.

- texshow overhaul

texshow (sources, ConTeXt command list) is a web-based tool for displaying ConTeXt documentation. It is in need of significant work.

Proposed by the potential student; mentors would be Hans Hagen and Taco Hoekwater.

- Improve JavaScript support in MathTran

MathTran is a public web service that translates TeX-notation mathematics into high-quality bitmap images, primarily for inclusion on web pages. The development of MathTran was funded by JISC and The Open University. At present MathTran serves about 30,000 images a day. MathTran is similar in many ways to Google Charts.

There is already some JavaScript code that helps us use MathTran on web pages, but there is not nearly enough, and what we have is somewhat scrappy and dependent on the browser.

Much more could be done with better JavaScript, and that's why the mathtran-javascript project has been set up.

This project is a good opportunity for someone who is interested in writing high-quality low-level JavaScript. No knowledge of TeX is required, although an interest in mathematics or related areas would be helpful.

Mentor would be Jonathan Fine. Continuing information available on Jonathan's MathTran blog.

- TeX Live packaging API bindings

Implement a C and/or Lua binding of the TeX Live package management system. Details.

Proposed by the potential student; mentors would be Norbert Preining and Karl Berry.

- Dublin Core metadata interface for TeX

Improve Dublin Core metadata support in TeX. This requires significant experience in programming TeX macros, although changes in the TeX implementation should not be needed. Details.

Mentors would be Matthew Leingang and Peter Flynn.

For students: project application template

Student proposals are submitted via the Google Summer of Code web site. When proposing your project, please make sure you include the following information. Before submitting anything, however, it is wise to talk with the existing maintainers and other developers to establish contact, make sure there are no unexpected questions or issues, and so on.

We are not necessarily expecting each of these questions to be individually answered in a proposal, but the information should be present.

Name, email address
We need to be able to communicate with you!

Project name, summary
Please write these out, rather than just referring to the suggestion (if you are proposing one of the ideas here), to help avoid misunderstanding.

Please explain how TeX users will benefit from the project.

What software will be added or changed? What parts of the project's code will be affected? Which documentation will you update?

When we read this section of your proposal, we will be trying to figure out how well you understand what needs to be done. We're more likely to accept proposals from students who show us that they know what needs to be done.

Please indicate how you and your mentor will track your progress as you work on the project, and how the mid-term evaluation of your project will be made. An important part of the mid-term evaluation is that it's the point where everybody has to make a judgement about whether the project is going to be completed in time. It's important to have specific criteria for both the mid-term evaluation point and the fully completed project. That is, how will you know when you're halfway, and how will you know when you're done?

What will you be working on, and how long will each part of the work take? What objective results will be visible at each stage? How will you know if you are ahead or behind schedule? If you are unable to complete the project, are the results from part-way through still useful? How?

How will everybody know whether things are on-track at the halfway evaluation point?

Please mention any periods during the summer when you won't actually be available to work on the project (though remember, the Summer of Code project is expected to be your main activity).

Our experience from talking with other Summer of Code projects is that good communication is essential to students' success. Students who communicate clearly and frequently with their mentor are more likely to be successful. Please indicate the ways in which you will contact your mentor (and an approximate schedule or timeline for doing so) to ensure that they're always aware of your progress. While email is undoubtedly useful, real-time forms of contact help a lot. Also, how will your mentor be able to see your code as it progresses?

Why did this project appeal to you? How will you benefit from it? Why are you particularly suited to work on this? What will you do once the project is "finished"? Have you worked on any TeX-related software before? Of the skills that you will need to complete the project, which do you already have? What will you need to learn?

(Thanks to the GNU Project for allowing us to use their text as a starting point.)

More ideas welcome (especially with mentors)

More ideas are certainly welcome (the sooner the better, though of course not after the student application deadline of March 31). There are plenty of good possibilities in the TeX world. However, it is critical that ideas be accompanied by a willing mentor (and ideally a backup mentor) who can commit to the nontrivial amount of time needed to work with the student, pinging them to be sure there are no stoppers, etc.

In addition, please make sure that the description of your idea contains enough information (perhaps in the form of pointers to other information or mailing lists) for students to be able to research the feasibility of their implementing the idea.

Send project ideas to the mailing list, and feel free to subscribe to that list.

$Date: 2008/05/06 13:49:29 $;