[XeTeX] TECkit map for Latin alphabet to Unicode IPA

Daniel Greenhoe dgreenhoe at gmail.com
Thu Oct 27 10:06:52 CEST 2011


On Thu, Oct 27, 2011 at 1:01 PM, Andy Lin <kiryen at gmail.com> wrote:
> I've looked at the tipa code, and basically what they did was redefine
> \:, etc. to produce the characters. With that in mind...
> Add this to your tex document...

Thank you for your hard work on my behalf. And thank you for the
solution. It is certainly better than my solution --- which was no
solution.

I know that "beggars can't be choosers" and that "nobody likes a
whiner", but setting aside what perhaps wisdom should prevent, let me
say this:

What I would really like is a "drop in" solution involving a TECkit
map only. That is, I would like to be able to hand such a map off to a
linguist, and to tell him/her to simply add in something like this to
his/her tex file:
   \addfontfeatures{Mapping=tipa2uni}.
And that's it --- just one support file: a TECkit map file.
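
In context, the hoped-for usage would look roughly like the sketch below. It is only an illustration under stated assumptions: tipa2uni.tec is the compiled form of the map discussed here, Charis SIL stands in for any installed font that covers the IPA block, and the sample input assumes the map also handles tipa's backslash-free shortcuts (S, @), which may not be the case.

```latex
% Compile with xelatex.
% Assumptions: tipa2uni.tec is where XeTeX can find it (for example next
% to this file), and Charis SIL is installed. Both names are placeholders
% for whatever map and font are actually in use.
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Charis SIL}
\begin{document}
Ordinary text is left alone.

% The mapping is switched on only inside this group.
% "S@" is hypothetical tipa-style input; whether the map covers these
% shortcuts is an assumption.
{\addfontfeatures{Mapping=tipa2uni}S@}
\end{document}
```

The point of the design is that the linguist's document needs nothing beyond fontspec and the one .tec file.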

What you proposed requires the use of two support files: the TECkit
mapping file (to specify the mapping) and a tex file (a kind of style
file to redefine commands). I would like to basically replace the
<tipa> style package. But if I use your solution, I have taken away
the simple <tipa> style package and have replaced it with a new style
package plus an extra mapping file. In the end, how is this solution
better than the original <tipa> package solution?

In my mind (and again quite possibly in my mind only) the real problem
is with the fontspec package. I note that if I use my map, not with
XeLaTeX, but with the TECkit utility txtconv.exe, then there is no
problem converting "\:t" to U+0288. That is, if I have an input file
called in.txt that contains the character sequence "\:t" and run
this from the command line:
   txtconv -t tipa2uni.tec -i in.txt -o out.txt
then the character sequence "\:t" is replaced by U+0288 in the output
file out.txt.

If I can do this conversion from the command line, why can't fontspec
handle it correctly? That is, before fontspec tries to interpret a
sequence beginning with "\" as a command, why can't it first check to
see if the sequence is up for replacement by a font mapping?

Would that be possible? If so, that would lead to what in my mind
(...) is a very clean and fairly elegant solution.
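
To make the question concrete, here is a minimal sketch of the contrast, using the tex-text mapping that is distributed with XeTeX instead of tipa2uni so that it needs no extra files. It relies on fontspec's default font, and it assumes nothing about how fontspec works internally; it only shows that a sequence of plain characters is mapped, while "\:" already has a meaning as a command by the time TeX reads it.

```latex
% Compile with xelatex. No extra fonts or maps are assumed: tex-text is
% the mapping shipped with XeTeX, and fontspec's default font is used.
\documentclass{article}
\usepackage{fontspec}
\begin{document}
% Plain characters reach the font, so the mapping can turn the three
% hyphens into a single dash (U+2014).
{\addfontfeatures{Mapping=tex-text}a---b}

% \: is read by TeX as a control symbol before any font is involved;
% \meaning prints whatever definition it currently has.
\texttt{\meaning\:}
\end{document}
```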

Dan


On Thu, Oct 27, 2011 at 1:01 PM, Andy Lin <kiryen at gmail.com> wrote:
> I've looked at the tipa code, and basically what they did was redefine
> \:, etc. to produce the characters. With that in mind...
>
> Add this to your tex document
> \newcommand{\setTIPAcommands}{%
>   % Replace FE50 with your choice of Unicode code point. I've chosen the
>   % small punctuation set because I haven't encountered them in the wild
>   % and I can't imagine someone entering these in by hand (as opposed to
>   % using a font's OpenType small caps feature).
>   \def\*{\char"FE50 }%
>   \def\;{\char"FE51 }% Ditto
>   % etc...
> }
>
> Change your mapping definitions like the following
> U+FE51 latin_small_letter_t > latin_small_letter_t_with_retroflex_hook
> (I've used the unicode here, but you can redefine it, maybe semicolon_operator?)
> (Also, I've changed the bidirectional assignment to one-way... I seem
> to recall the bidirectional assignment was for things like ligatures
> in connected scripts, or contextual reassignments, where you're trying
> to assign a semantic equivalency between the two sides. Which is not
> what we're trying to hack here. Is it an important distinction? Not
> that I've seen. But then again, I have never tried searching for a
> retroflex t in Acrobat, so I couldn't tell you.)
>
> NB: The tipa manual mentions that these commands are not 100% safe.
> Keep that in mind if your code begins breaking in magical ways.


