[texhax] how to create a non-breaking hyphen

Paul Tremblay phthenry at iglou.com
Mon May 15 06:06:25 CEST 2006

On Mon, May 15, 2006 at 12:09:26AM +0200, Uwe Lück wrote:
> This seems to be an XML or some script issue,
> having nothing to do with LaTeX -- therefore I can't help,
> having only a very rough idea of XML or of AWK/Perl etc.
> (that kind of regular expressions).
> And I don't know who types which kind of text input
> with your application. If it is /you/ who types it,
> can't you type `//hyphen//' instead of `-', so the
> replacing is nearly 100% reliable!? And I thought that
> XML gives the possibility of /macro/ or "alias" definitions ...

XML handles non-breaking hyphens very elegantly by allowing you to
type the character in directly. Indeed, the XML I am working with is a
conversion from Microsoft RTF. The writer of the document knows
nothing about XML but can simply tell the program not to break this

The result looks like this:

<p>This is a paragraph with a non&#x2011;breaking hyphen.</p>

(The &#x2011; is the XML numeric character reference for U+2011, the
Unicode non-breaking hyphen.)

The XML is already produced. I have no control over it. It is correct
XML, the type that everyone knows how to read. There is no simple way
to convert the word containing the non-breaking hyphen to
\mbox{non-breaking}, because normal methods (using XSLT) don't handle
text very well.
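
For illustration only, here is a minimal sketch of what that whole-word
conversion could look like outside XSLT, in plain Python. The function
name box_unbreakable_words and the rule that a "word" is a run of
letters or digits joined by U+2011 are assumptions made for this
example; they are not part of any existing script mentioned in this
thread.

```python
import re

# Assumption: a "word" is two or more runs of word characters joined by U+2011.
_NB_WORD = re.compile(r"\w+(?:\u2011\w+)+")


def box_unbreakable_words(text: str) -> str:
    """Wrap each word containing U+2011 in \\mbox{...}, using a plain hyphen inside."""

    def wrap(match: re.Match) -> str:
        # Swap the non-breaking hyphen for an ordinary one; the \mbox prevents the break.
        return r"\mbox{" + match.group(0).replace("\u2011", "-") + "}"

    return _NB_WORD.sub(wrap, text)


print(box_unbreakable_words("a paragraph with a non\u2011breaking hyphen."))
```

A U+2011 that is not surrounded by word characters on both sides is
left untouched by this pattern, so it would still need separate
handling.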

There is a sophisticated Python script that converts XML to LaTeX.
This Python script works with the text, converting characters like
$ into \textdollar. However, it can only convert one character at a
time.

So neither the normal methods of XSLT nor the Python script can handle
the non-breaking hyphen. And the ucs.sty that handles most of the
Unicode characters can't handle it either.

However, as long as I can simply replace &#x2011; with \mbox{-}, my
problem is solved.
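
A minimal sketch of that single-character replacement, assuming the
text has already been pulled out of the XML with Python's standard
xml.etree.ElementTree parser (which decodes &#x2011; to the character
U+2011). The helper name replace_nb_hyphen is made up for this example.

```python
import xml.etree.ElementTree as ET

NB_HYPHEN = "\u2011"  # what &#x2011; becomes once the XML is parsed


def replace_nb_hyphen(text: str) -> str:
    """Replace every U+2011 with \\mbox{-}, one character for one macro."""
    return text.replace(NB_HYPHEN, r"\mbox{-}")


paragraph = ET.fromstring(
    "<p>This is a paragraph with a non&#x2011;breaking hyphen.</p>"
)
print(replace_nb_hyphen(paragraph.text))
# prints: This is a paragraph with a non\mbox{-}breaking hyphen.
```

Because this maps one input character to one fixed output string, it
fits the one-character-at-a-time style of conversion described above.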



*Paul Tremblay         *
*phthenry at iglou.com    *
