[luatex] dump a tfm to a file

luigi scarso luigi.scarso at gmail.com
Wed Jun 22 11:23:14 CEST 2011


On Wed, Jun 22, 2011 at 10:56 AM, Ulrike Fischer <luatex at nililand.de> wrote:
> Am Tue, 21 Jun 2011 14:09:04 -0500 schrieb Luis Rivera:
>
>>>> tex4ht doesn't use only these htf-files. At first it loads the tfm's
>>>> of the fonts mentioned in the dvi (I don't know exactly why).
>>>
>>> If it really needs TFMs, I do not think it can ever be compatible with
>>> luatex (or xetex, for that matter).
>>>
>>
>> afaik, tex4ht doesn't load the tfms to generate the dvi: that's
>> required by some tex engine (pdfTeX, afaik) which generates a dvi with
>> some specially tailored macros (it makes three passes, to ensure the bbl,
>> idx and other stuff are properly compiled); then, in the second stage,
>> tex4ht deciphers the dvi with the htf files to generate some xml
>> files; and in the third stage, t4ht assembles the final html/odt file
>> with the xmls, the images, and all the other stuff generated by the
>> previous steps. I collect that from reading the oolatex script, which
>> actually controls the whole process.
>>
>
>
>
> To avoid some confusion: there is tex4ht as a "system", a large package
> with various files, folders, configuration etc. And there is the
> central application tex4ht.exe.
>
> tex4ht.exe is a dvi-driver. It takes a dvi and generates e.g. an
> html-file. To be able to do this the dvi must contain a lot of
> \specials. These specials are inserted by tex4ht.sty and various
> 4ht-files during the previous (lua)latex runs.
>
>
>> So, in my not very enlightened opinion, htf files are necessary only
>> because the tfm files that generate a dvi may have different
>> encodings, so the resulting dvi files are spaghetti encoded, and there
>> is some need to ensure that appropriate utf8 sequences are produced
>> from the messy dvi into the generated xml files. htf files are
>> mainly maps from 8 bit encoded fonts into utf8.
>>
>> If a TeX engine could read and write files properly UTF8 encoded, the
>> need for htf files would be bypassed; tex4ht would only have to
>> translate typesetting instructions (from a target successor of dvi
>> format) into xml tags, since the encoding would be UTF8 right from the
>> beginning.
>
> The htf-files don't do only reencoding or mapping. They are used to
> control the "look" of the output. E.g.
>
> 'b' ''     98
>
> will give the expected "b" if the input is char98 (= b). But in
> another htf-file you find at position 98 this:
>
> 'B' '4'    98
>
> and this will give
>
> <span class="small-caps">B</span>
>
> (The <span>  comes from the '4' which is a class number).
>
> So the htf-files give you a low-level mapping of characters to other
> representations (like html entities) and of fonts to font features
> in html like small-caps, bold etc.
>
>
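
A minimal sketch of how the two entries quoted above could be read, written in Python purely for illustration (it is not tex4ht code). It assumes each entry is a quoted string, a quoted class and a decimal position, exactly as in the quoted lines; the names parse_entry, render and CLASS_NAMES are hypothetical, and the only class number it knows is the '4' from the example.

```python
import re

# Two single-quoted fields followed by the position, as in the quoted entries.
ENTRY = re.compile(r"^'([^']*)'\s+'([^']*)'\s+(\d+)\s*$")

# Assumption: only the class number mentioned in the mail is listed here.
CLASS_NAMES = {"4": "small-caps"}


def parse_entry(line):
    """Return (position, text, class) for one htf-style entry line."""
    match = ENTRY.match(line.strip())
    if match is None:
        raise ValueError("not an entry line: %r" % line)
    text, cls, pos = match.groups()
    return int(pos), text, cls


def render(text, cls):
    """Wrap text in a span when the entry carries a class, else return it bare."""
    if not cls:
        return text
    # Unknown class numbers fall back to the raw number (illustrative choice).
    return '<span class="%s">%s</span>' % (CLASS_NAMES.get(cls, cls), text)


if __name__ == "__main__":
    for line in ("'b' ''     98", "'B' '4'    98"):
        pos, text, cls = parse_entry(line)
        print(pos, render(text, cls))
```

With the two quoted entries this prints the plain b for the first and the small-caps span for the second, which is the behaviour described above.
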
> The generation of the dvi works fine with luatex. The problems
> start at the dvi -> html step with tex4ht (if the document uses
> system fonts). The dvi contains font names like "file:lm-modern..."
> and tex4ht looks (for still unknown reasons) for its tfm and can't
> find it.
>
> For a simple document I got around the problem by using the
> low-level command \font\test=Arial and renaming an arbitrary tfm to
> arial.tfm.  Currently I seem to be able to use ASCII and öäü, but
> the € is output as ÿ. This looks like a 256-barrier ;-(. But
> perhaps one can get around it by extending the htf-files.

With ConTeXt MkII (TeX Live 2009, pdftex engine; test.mkii is a UTF-8 file):

%%% test.mkii
\enableregime[utf]
\starttext
goo
€æß@¢¢
\stoptext
%%%

$>texexec --dvi  test.mkii
$>tex4ht test.dvi


----------------------------
tex4ht.c (2009-01-31-07:33 kpathsea)
/opt/TeXLive2009/tl2009/bin/i386-linux/tex4ht test.dvi
(/opt/TeXLive2009/tl2009/texmf-dist/tex4ht/base/unix/tex4ht.env)
(/opt/TeXLive2009/tl2009/texmf-dist/tex4ht/ht-fonts/iso8859/1/charset/unicode.4hf)
(/opt/TeXLive2009/tl2009/texmf-dist/fonts/tfm/hoekwater/context/fmvr8x.tfm)
(/opt/TeXLive2009/tl2009/texmf-dist/tex4ht/ht-fonts/unicode/marvosym/fmvr8x.htf)
(/opt/TeXLive2009/tl2009/texmf-dist/fonts/tfm/public/lm/ec-lmr12.tfm)
(/opt/TeXLive2009/tl2009/texmf-dist/tex4ht/ht-fonts/alias/lm/lm-ec/ec-lm.htf)
Searching `lm-ec.htf' for `ec-lmr12.htf'
(/opt/TeXLive2009/tl2009/texmf-dist/tex4ht/ht-fonts/unicode/lm/lm-ec.htf)
[1 file test.html
]
Execute script `test.lg'


test.lg contains:
----------------------
htfcss: ec-lmbo  font-style: oblique;
htfcss: ec-lmbx  font-weight: bold;
htfcss: ec-lmbxi  font-style:italic; font-weight: bold;
htfcss: ec-lmbxo  font-style: oblique; font-weight: bold;
htfcss: ec-lmcsco  font-style: oblique;
htfcss: ec-lmri  font-style:italic;
htfcss: ec-lmro  font-style: oblique;
htfcss: ec-lmss  font-family: sans-serif;
htfcss: ec-lmssbo  font-family: sans-serif; font-style: oblique; font-weight: bold;
htfcss: ec-lmssbx  font-family: sans-serif; font-weight: bold;
htfcss: ec-lmssdc  font-family: sans-serif;
htfcss: ec-lmssdo  font-family: sans-serif; font-style: oblique;
htfcss: ec-lmsso  font-family: sans-serif; font-style: oblique;
htfcss: ec-lmssq  font-family: sans-serif;
htfcss: ec-lmssqbo  font-family: sans-serif; font-style: oblique; font-weight: bold;
htfcss: ec-lmssqbx  font-family: sans-serif; font-weight: bold;
htfcss: ec-lmssqo  font-family: sans-serif; font-style: oblique;
htfcss: ec-lmtcsc  font-family: monospace;
htfcss: ec-lmtcso  font-style: oblique; font-family: monospace;
htfcss: ec-lmtk  font-family: monospace;
htfcss: ec-lmtko  font-style: oblique; font-family: monospace;
htfcss: ec-lmtl  font-weight: light; font-family: monospace;
htfcss: ec-lmtlc  font-weight: light; font-family: monospace;
htfcss: ec-lmtlco  font-weight: light; font-style: oblique; font-family: monospace;
htfcss: ec-lmtlo  font-weight: light; font-style: oblique; font-family: monospace;
htfcss: ec-lmtt  font-family: monospace;
htfcss: ec-lmtti  font-family: monospace; font-style:italic;
htfcss: ec-lmtto  font-style: oblique; font-family: monospace;
htfcss: ec-lmvtk  font-family: monospace;
htfcss: ec-lmvtko  font-style: oblique; font-family: monospace;
htfcss: ec-lmvtl  font-weight: light; font-family: monospace;
htfcss: ec-lmvtlo  font-weight: light; font-style: oblique; font-family: monospace;
htfcss: ec-lmvtt  font-family: monospace;
htfcss: ec-lmvtto  font-style: oblique; font-family: monospace;
File: test.html
--- characters ---
Font("ec-lmr","12","12","100")
Font("fmvr8x","","10","120")
------------------
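
To make the listing easier to inspect, here is a small sketch that turns the htfcss: lines into a dictionary of CSS declarations per font. It assumes only the line shape visible above ("htfcss:", a font name, then "property: value;" pairs); parse_htfcss and parse_decls are hypothetical names, and a line holding nothing but declarations is folded into the previous entry in case a mail client wrapped it.

```python
def parse_decls(css):
    """Split 'prop: value; prop: value;' into a dict."""
    decls = {}
    for part in css.split(";"):
        if ":" in part:
            prop, _, value = part.partition(":")
            decls[prop.strip()] = value.strip()
    return decls


def parse_htfcss(lines):
    """Collect the CSS declarations of every 'htfcss:' line, keyed by font name."""
    fonts = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("htfcss:"):
            rest = line[len("htfcss:"):].strip()
            name, _, css = rest.partition(" ")
            current = name
            fonts[current] = parse_decls(css)
        elif current is not None and ":" in line and line.endswith(";"):
            # Declaration-only line: treat it as the wrapped tail of the
            # previous htfcss entry.
            fonts[current].update(parse_decls(line))
        else:
            # Anything else (e.g. 'File: test.html') ends the current entry.
            current = None
    return fonts


if __name__ == "__main__":
    sample = [
        "htfcss: ec-lmbxi  font-style:italic; font-weight: bold;",
        "htfcss: ec-lmssbo  font-family: sans-serif; font-style: oblique;",
        "font-weight: bold;",
        "File: test.html",
    ]
    for font, decls in parse_htfcss(sample).items():
        print(font, decls)
```

The values are kept exactly as tex4ht wrote them; the sketch does not check whether they are valid CSS.
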


The result is

1
goo &#x20AC;æß@cc


(where 1 is the page number in the header)
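
For a quick check of which characters survived, the following sketch decodes the HTML output and compares it with the input, character by character. The two strings are copied from test.mkii and from the result above; joining "goo" and the second input line with a single space is my assumption about how TeX treats the line break, and diff_chars is a hypothetical helper.

```python
import html

# Input text from test.mkii (line break replaced by one space: an assumption).
source_text = "goo €æß@¢¢"
# Text found in test.html, copied from the result above.
html_text = "goo &#x20AC;æß@cc"


def diff_chars(expected, actual):
    """Return (index, expected_char, actual_char) for every mismatch."""
    return [(i, e, a) for i, (e, a) in enumerate(zip(expected, actual)) if e != a]


# Resolve numeric character references such as &#x20AC; to real characters.
decoded = html.unescape(html_text)

if __name__ == "__main__":
    # The euro sign lies above the 8-bit range but is still written correctly.
    print(hex(ord("€")), ord("€") > 255)
    print(diff_chars(source_text, decoded))
```

Run on the strings above, the only mismatches it reports are the two ¢ characters, which arrive as plain c; the € comes through as a numeric character reference.
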




-- 
luigi


