[pdftex] Tweaking pdf outputs to produce one box per word.
Ross Moore
ross.moore at mq.edu.au
Sat Jul 3 02:57:33 CEST 2010
Hello CFP,
On 02/07/2010, at 4:48 PM, CFP wrote:
> Hello everyone!
> (I think this is where I should be asking, but I’m not totally sure… Forgive
> me if I'm mistaken :))
>
> I’m trying to tweak the output of the pdflatex command to make it produce
> one box per word. Consider this example:
> \documentclass{minimal}
>
> \begin{document}
> This is an example sentence.
> \end{document}
>
> When opened in a PDF editor after compilation, this sample will appear as
> one text box containing the sentence "This is an example sentence". This is
> fine for most full-featured PDF readers. Yet on my Sony e-reader, selection
> of words is based on boxes; therefore my PDF reader will select the full
> sentence, hence failing to find a definition for the word I clicked.
Not sure what your definition of "boxes" is here.

What I think is happening is that, normally, the output that pdfLaTeX
generates contains no space characters; words are simply placed at
the right horizontal positions. Thus your e-reader sees just a single
word consisting of all the characters on a single line of the PDF,
unless there are punctuation symbols, which may then be treated as
word boundaries.

To test this theory, please try the attached PDF, which *does*
have space characters to define word boundaries. Indeed it has
full PDF tagging of structure, including the mathematics.
It would be very interesting to see what your e-reader does
with it.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sphere_volume-2-readaloud.pdf
Type: application/pdf
Size: 64502 bytes
Desc: not available
URL: <http://tug.org/pipermail/pdftex/attachments/20100702/30db0400/attachment-0001.pdf>
-------------- next part --------------
>
> I noticed that pdflatex stops at punctuation marks. How can I proceed to
> make it create one box per word? In the output, I would then have one box
> for "This", one for "is", one for "an", and so on.
Try my attached PDF and tell me how well it works for you.
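For reference, here is a minimal sketch of one way to ask pdfTeX itself
for space characters between words. It assumes a pdfTeX recent enough to
provide the \pdfinterwordspaceon primitive, which is not present in every
release and which, as far as I know, also relies on a dummy-space font
shipped with the TeX distribution. It is not necessarily how the attached
PDF was made. The \ifdefined guard simply skips the request on engines
that lack the primitive.

```latex
% Sketch only: \pdfinterwordspaceon is assumed to be available
% (it is absent from older pdfTeX releases).
\documentclass{minimal}

% Ask pdfTeX to write a real space character between words,
% but only if this engine actually defines the primitive.
\ifdefined\pdfinterwordspaceon
  \pdfinterwordspaceon
\fi

\begin{document}
This is an example sentence.
\end{document}
```

If the theory above is right, a reader that splits words at space
characters should then be able to select "This", "is", "an", and so on
individually, rather than the whole line.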
>
> Thanks a lot!
> CFP.
Cheers,
Ross
------------------------------------------------------------------------
Ross Moore                            ross.moore at mq.edu.au
Mathematics Department                office: E7A-419
Macquarie University                  tel: +61 (0)2 9850 8955
Sydney, Australia 2109                fax: +61 (0)2 9850 8114
------------------------------------------------------------------------