[texhax] files related to just-sent email

Reinhard Kotucha reinhard.kotucha at web.de
Sat Jun 19 02:11:38 CEST 2010


On 18 June 2010 Dan Doernberg wrote:

 > On Jun 17, 2010, at 8:45 PM, Reinhard Kotucha wrote:
 > 
 > > On 17 June 2010 Brandon Kuczenski wrote:
 > > 
 > >> Curious that you would come to a TeX list for a question unrelated to 
 > >> TeX.  Your problems are due to the fact that MS does bizarre things in 
 > >> its typesetting.
 > 
 > Fair point... The history of the question is that TUG was a
 > referral of sorts (by Jim Hafner, IBM Research) from Joe Halpern
 > (Cornell Univ., former JACM editor).
 > 
 > > Dan asks on this list because he expects more expert knowledge
 > > here than he can expect from any M$-Word related mailing list.
 > 
 > Very true! That was my hope and has already been realized;
 > Brandon's initial explanation probably already solved and/or gives
 > us good clues to a couple of problems we faced.... and my
 > impression is that a quick glance was probably enough for him to
 > diagnose those issues. Maybe others might have ideas (without too
 > much pain or time) about the Greek characters rendering as squares?

Hi Dan,
it would be helpful to know the content of the HTML file.  If web
browsers print rectangles instead of glyphs, it's very likely that the
HTML file is correct but the font being used doesn't support these
glyphs.

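Does your HTML file contain a <meta> tag with the attribute

     content="text/html; charset=utf-8"

? Without a declared charset, the browser has to guess the encoding,
and non-ASCII characters such as Greek letters are likely to come out
wrong.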
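
To illustrate why the declared charset matters, here is a small Python
sketch. It is not taken from Dan's file; the sample letter and the
variable names are assumptions chosen for illustration. The same two
bytes are a Greek alpha when read as UTF-8 and two unrelated characters
when read as ISO-8859-1 (Latin-1).

```python
# A Greek small alpha, stored the way a UTF-8 encoded HTML file stores it.
greek = "\u03b1"
raw = greek.encode("utf-8")          # two bytes: CE B1

# Decoded with the correct charset, the letter comes back unchanged.
as_utf8 = raw.decode("utf-8")

# Decoded as Latin-1 (a wrong guess), each byte becomes its own character.
as_latin1 = raw.decode("latin-1")

print(raw.hex(" "))                  # the bytes on disk
print(as_utf8, as_latin1)            # correct reading vs. wrong reading
```

Note that a wrong charset usually produces wrong characters, as above,
while rectangles rather point to a font without the required glyphs.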

Could you provide a ***tiny*** HTML file which shows this error?

I ask for a tiny file because I recently wrote a program for debugging
UTF-8 encoded files.  The program is based on the databases provided
by http://unicode.org .  It converts

------------------------------------------------------------------
\documentclass{article}

\begin{document}
------------------------------------------------------------------

to

------------------------------------------------------------------
==[ line 1 ]==
FEFF: ZERO WIDTH NO-BREAK SPACE
005C: REVERSE SOLIDUS
0064: LATIN SMALL LETTER D
006F: LATIN SMALL LETTER O
0063: LATIN SMALL LETTER C
0075: LATIN SMALL LETTER U
006D: LATIN SMALL LETTER M
0065: LATIN SMALL LETTER E
006E: LATIN SMALL LETTER N
0074: LATIN SMALL LETTER T
0063: LATIN SMALL LETTER C
006C: LATIN SMALL LETTER L
0061: LATIN SMALL LETTER A
0073: LATIN SMALL LETTER S
0073: LATIN SMALL LETTER S
007B: LEFT CURLY BRACKET
0061: LATIN SMALL LETTER A
0072: LATIN SMALL LETTER R
0074: LATIN SMALL LETTER T
0069: LATIN SMALL LETTER I
0063: LATIN SMALL LETTER C
006C: LATIN SMALL LETTER L
0065: LATIN SMALL LETTER E
007D: RIGHT CURLY BRACKET
000A: <control> LINE FEED (LF)

==[ line 2 ]==
000A: <control> LINE FEED (LF)

==[ line 3 ]==
005C: REVERSE SOLIDUS
0062: LATIN SMALL LETTER B
0065: LATIN SMALL LETTER E
0067: LATIN SMALL LETTER G
0069: LATIN SMALL LETTER I
006E: LATIN SMALL LETTER N
007B: LEFT CURLY BRACKET
0064: LATIN SMALL LETTER D
006F: LATIN SMALL LETTER O
0063: LATIN SMALL LETTER C
0075: LATIN SMALL LETTER U
006D: LATIN SMALL LETTER M
0065: LATIN SMALL LETTER E
006E: LATIN SMALL LETTER N
0074: LATIN SMALL LETTER T
007D: RIGHT CURLY BRACKET
000A: <control> LINE FEED (LF)
------------------------------------------------------------------

I think that you understand now why I'm asking for a *tiny* file which
demonstrates your problem. :)
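
A listing of this kind can be approximated with Python's standard
unicodedata module. This is only a sketch, not the program mentioned
above; the function name dump_codepoints is made up for illustration.
One known difference: unicodedata has no names for control characters,
so they are printed as plain <control> without the "LINE FEED (LF)"
part.

```python
import unicodedata


def dump_codepoints(text):
    """List every code point of `text` as 'XXXX: NAME', grouped by line."""
    # Split on LF only, so that other unusual characters stay visible
    # inside their line; the LF itself is kept at the end of each line.
    pieces = text.split("\n")
    lines = [piece + "\n" for piece in pieces[:-1]]
    if pieces[-1]:
        lines.append(pieces[-1])  # last line without a trailing LF

    out = []
    for number, line in enumerate(lines, start=1):
        if out:
            out.append("")  # blank line between two line blocks
        out.append(f"==[ line {number} ]==")
        for ch in line:
            # Control characters have no name in unicodedata, so a
            # placeholder is used instead of raising ValueError.
            is_control = unicodedata.category(ch) == "Cc"
            fallback = "<control>" if is_control else "<no name>"
            out.append(f"{ord(ch):04X}: {unicodedata.name(ch, fallback)}")
    return "\n".join(out)


# The sample starts with a byte order mark, as in the listing above.
print(dump_codepoints("\ufeff\\begin{document}\n"))
```

When reading from a file, open it with encoding="utf-8" rather than
"utf-8-sig"; the latter silently drops a leading byte order mark, which
is exactly the kind of thing such a listing should reveal.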

Regards,
  Reinhard

-- 
----------------------------------------------------------------------------
Reinhard Kotucha			              Phone: +49-511-3373112
Marschnerstr. 25
D-30167 Hannover	                      mailto:reinhard.kotucha at web.de
----------------------------------------------------------------------------
Microsoft isn't the answer. Microsoft is the question, and the answer is NO.
----------------------------------------------------------------------------

