# [XeTeX] charlint?

Dominik Wujastyk wujastyk at gmail.com
Thu Mar 3 18:34:48 CET 2011

I guess this isn't really the place for this query, but perhaps someone here has a suggestion.

I'm using XeTeX and TeXWorks for academic work, like many of us.  I
cut and paste bibliographical information from sites like copac.ac.uk and
worldcat.org into JabRef for use in my documents.  What I'm finding,
though, is that several of these big online bibliographical databases have
their records in un-normalized Unicode, and it doesn't print nicely with
XeTeX.

Rather than struggle with XeTeX's accent placement, which seems to be an
unattractively per-font problem in any case, it makes better sense to me to
normalize the Unicode to NFC form.  I.e.,
http://en.wiktionary.org/wiki/Appendix:Unicode_normalization
What I'd like, ideally, is a little filter to run on my bib files
periodically to clean up any char+non-spacing-accent sequences.
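For what it's worth, a minimal sketch of such a filter in Python (this is my own illustration, not a tool mentioned in the thread; the filename nfc_filter.py is invented) would just run the standard library's unicodedata.normalize over the file:

```python
#!/usr/bin/env python3
"""Normalize a UTF-8 file (e.g. a .bib file) to Unicode NFC.

Hypothetical usage:  python nfc_filter.py < refs.bib > refs-nfc.bib
"""
import sys
import unicodedata


def to_nfc(text: str) -> str:
    # Compose base-char + combining-accent sequences into the
    # precomposed code points NFC defines, e.g. "e" + U+0301
    # (COMBINING ACUTE ACCENT) becomes U+00E9 LATIN SMALL LETTER
    # E WITH ACUTE.  Characters with no precomposed form are
    # left as they are.
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    sys.stdout.write(to_nfc(sys.stdin.read()))
```

Run periodically over the .bib files, this should fold the decomposed sequences that trip up XeTeX's accent placement into single precomposed characters.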

I've been looking around for an appropriate tool for this job.  All I could
find is charlint.pl.  But I can't for the life of me get it to work.  It
halts, throwing up errors along the lines of
\begin{verbatim}
Checking duplicates, takes some time.
Finished processing character data file(s).

Line 1: Non-Existing codepoints.
Giving up!
\end{verbatim}

Has anyone any better suggestions than charlint, or experience getting
charlint working?

Best,
Dominik