[texhax] more on math rendering for the web (including Microsoft Word Symbol font and TeX for web)
Dan Doernberg
dan at fairness.com
Sun Jun 27 16:51:59 CEST 2010
Reinhard, Brandon,
I'm following up on our email from last week, where you graciously answered several of our questions. We have been trying to do more research ourselves so we can ask the most intelligent questions possible, but dealing with Word's Symbol font is apparently a very tough problem, and it is in an area where we are not experts.
A. I'd like for our software to be able to handle TeX documents. Two general questions come to mind (presumably easy ones, so I haven't tried to research TeX before asking):
1. Would it be trivial or difficult for our software to render TeX input documents for the Web? Would LaTeX and/or other variants be the same?
2. Does TeX have any built-in translators for dealing with legacy documents from MS Word? What do other people do when confronted with problematic Word documents?
B. In case helpful to you... I came across someone who appears to be knowledgeable about Symbol font issues from the Microsoft community: Jay Freedman <http://www.word.mvps.org>. One comment I found in a post of his:
http://www.eggheadcafe.com/software/aspnet/35945671/symbol-font--not-unicode-compliant-how-to-searchreplace.aspx
Question--- The Symbol "font" available through the Insert Symbol dialog is not Unicode
compliant (for example, the ® [Registered] glyph in Symbol comes from codes
x00D2 OR x00E2 (or their decimal equivalents; two different versions of it!),
instead of the proper Unicode x00AE)....
Jay Freedman answer:
Word's way of hiding the "true" font of symbols inserted through the Insert Symbol dialog is a historical leftover that has caused untold heartburn for two decades. The convoluted method for finding and replacing them is given at http://www.word.mvps.org/FAQs/MacrosVBA/FindReplaceSymbols.htm.
C. Here's our most recent attempt to explain our situation better:
We're doing some text --> HTML conversion work (Ruby on Rails in a Linux environment) and are running into problems rendering MS Word documents that have Symbol font characters (e.g. characters created via the Insert Symbol menu).
Here's one stripped down, step-by-step example of the problem:
1. I create a simple 1-line Word 2008 (Mac) document using the default Cambria font.
2. I type a few standard characters, then use Word's Insert | Symbol command to insert the Greek letter epsilon into that document.
3. I do a "Save As HTML".
4. Both Safari and Firefox can load the resulting HTML file, but both render the inserted epsilon character as a square (in effect, fail to render it).
5. When I view the source that resulted from the "Save As HTML" operation it shows that the epsilon character has been converted to the ASCII letter "e" AND rendered in the Symbol font (changed from Cambria).
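The behavior in step 5 suggests one possible automated repair: find text marked as Symbol font and translate each character to its real Unicode equivalent. What follows is only a minimal sketch in Python (the same table would port directly to Ruby). The names `SYMBOL_TO_UNICODE`, `symbol_char_to_unicode`, and `fix_symbol_spans` are hypothetical, the table is deliberately partial, and the regular expression assumes the Symbol run is a `<span>` or `<font>` element containing plain text only. Folding both Symbol "registered" codes onto U+00AE, and treating U+F020 through U+F0FF as Symbol codes offset by 0xF000, are assumptions about how the input was produced.

```python
import re

# Partial Symbol-font table (assumption: extend it from a full published mapping).
SYMBOL_TO_UNICODE = {
    0x61: "\u03b1",  # a -> Greek small alpha
    0x62: "\u03b2",  # b -> Greek small beta
    0x65: "\u03b5",  # e -> Greek small epsilon
    0x70: "\u03c0",  # p -> Greek small pi
    0xD2: "\u00ae",  # registered sign, serif variant
    0xE2: "\u00ae",  # registered sign, sans variant (folded onto U+00AE)
}

# Matches <span ...Symbol...>text</span> or <font ...Symbol...>text</font>,
# but only when the element holds plain text (no nested tags).
SYMBOL_SPAN = re.compile(
    r"<(span|font)\b[^>]*\bSymbol\b[^>]*>([^<]*)</\1>",
    re.IGNORECASE,
)


def symbol_char_to_unicode(ch):
    """Translate one Symbol-font character; unknown characters pass through."""
    code = ord(ch)
    # Assumption: private-use code points U+F020..U+F0FF are Symbol codes + 0xF000.
    if 0xF020 <= code <= 0xF0FF:
        code -= 0xF000
    return SYMBOL_TO_UNICODE.get(code, ch)


def fix_symbol_spans(html):
    """Replace each plain-text Symbol run with translated text, dropping the wrapper."""
    def repl(match):
        return "".join(symbol_char_to_unicode(c) for c in match.group(2))
    return SYMBOL_SPAN.sub(repl, html)
```

For example, `fix_symbol_spans('abc <span style="font-family:Symbol">e</span>')` returns the string `abc ε`, which browsers can render without a Symbol font installed. A real HTML parser would be more robust than a regular expression for anything beyond simple cases like this one.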
Our processing stack is quite a bit more involved than this, and there could be issues with Symbol at one or more points in the stack (and maybe other fonts in addition to Symbol might cause problems as well!?).
One article we found simply recommended not using Symbol font for the web. That's probably good advice, but training users "how to do things right" isn't an option: users of our software are not from any one organization, we may have to deal with legacy documents, etc.
What software approaches (automated!) might be available to us to detect and deal with problems so that Word documents can be rendered for the Web?
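As one illustration of the "detect" half of that question, a converter could flag characters that are likely to render as empty boxes before the page is published. The sketch below is an assumption-laden example, not a known solution: the function name `find_unrenderable` is hypothetical, and it treats only two cases as suspicious, namely private-use code points (Unicode general category `Co`) and the replacement character U+FFFD.

```python
import unicodedata


def find_unrenderable(text):
    """Return (index, code point) pairs for characters likely to show as boxes."""
    hits = []
    for i, ch in enumerate(text):
        # "Co" is the Unicode general category for private-use characters;
        # U+FFFD is what decoders emit when they could not decode a byte.
        if unicodedata.category(ch) == "Co" or ch == "\ufffd":
            hits.append((i, "U+%04X" % ord(ch)))
    return hits
```

A report like this could be logged per document, so that problem files are caught automatically instead of relying on users to notice them.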
As before, all suggestions and informational tips will be appreciated...
Dan Doernberg, President, Fairness.com LLC
Email: dan at fairness.com
Web: http://nowcomment.com/
Phone/Fax: 434-975-0780
New: NowComment® <http://nowcomment.com/>
Turning Documents into Conversations℠
///////////////////////////////////////////////
> It's a good idea to ask here.
> I think that he deserves a good answer, even if he doesn't use TeX.
> Regards,
> Reinhard