[XeTeX] Making nice tables

Mike Maxwell maxwell at umiacs.umd.edu
Thu Nov 22 06:07:05 CET 2007


I know on some forums I'd take flak for this posting, but maybe you 
guys are nicer.

We are writing grammars in DocBook.  The way we converted the last one 
to PDF was rather convoluted: we used a converter that changes the 
DocBook XML into a Microsoft Word format, then created the PDF from 
that.  (The normal route goes through XSL-FO, and I won't go into the 
reasons we didn't do that.)

The next grammar will be for Urdu.  This is a right-to-left writing 
system, and even worse than typical Arabic in the sense that Urdu is 
normally typeset in the Nasta'liq style, which employs a lot more 
ligatures and other fun stuff than the standard Naskh style of the 
Arabic script.  AFAIK, MsWord is not adept at typesetting Nasta'liq.

I ran across a program (dblatex) that converts DocBook XML into LaTeX. 
It turns out it also works (with a few twists) for conversion into XeTeX. 
Ah-hah, says I: a way to nicely typeset Urdu.

I haven't actually tried Urdu yet, but I ran our previous Bengali 
grammar through dblatex and xetex; Bengali has a reasonably complex, 
albeit left-to-right script, so it seemed a good way to start.  I'm 
afraid I have to say that I am not as yet able to get as nice output as 
I can by going through Word.  (Yes, I know that's heresy.)

I've found solutions for many of the problems, but tables seem to be a 
more difficult problem.  Some of our tables run longer than a single 
page, so I have given dblatex a parameter (table.in.float="0") that 
tells it to use the package 'longtable'.  The result is bizarre in 
certain cases; I'm attaching a jpg of one of the PDF pages.  As you can 
see, table 3.3 (which is broken across this page and the next) has a 
very odd bottom line; it extends slightly to the left of the left-hand 
vertical border, and over half of the right-hand part of the line is 
missing.  Moreover, the last word in the second column overlaps the 
bottom line, and it isn't leaving enough room above the footnote.  All 
in all, it's just a mess.
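
To make the setup concrete, here is a minimal sketch of the kind of 
longtable input I mean.  This is not the actual dblatex output: the 
document class, column spec, caption, and cell contents are all 
placeholders, and I am only assuming the generated code follows this 
general pattern (a rule in the page-break foot, a footnote in a body cell).

```latex
% Minimal sketch only; NOT the dblatex-generated input.
% Class, column spec, caption, and cell contents are placeholders.
\documentclass{book}
\usepackage{fontspec}   % font loading under XeLaTeX
\usepackage{longtable}  % tables that may break across pages
\begin{document}
\begin{longtable}{|l|p{6cm}|}
\caption{Placeholder caption}\\
\hline
Form & Gloss \\
\hline
\endfirsthead           % header used on the first page of the table
\hline
Form & Gloss \\
\hline
\endhead                % header repeated on continuation pages
\hline
\endfoot                % rule drawn wherever the table breaks or ends
form-1 & gloss with a note\footnote{Placeholder footnote text.} \\
\hline
form-2 & another gloss \\
\end{longtable}
\end{document}
```

Note that longtable records column widths in the .aux file, so a sketch 
like this generally needs more than one xelatex run before the columns 
and rules settle.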

In contrast, when we output this through Word, this table is simply 
floated on to the next page.  This leaves a substantial white area at 
the bottom of this page; Word does not attempt to put any text in that 
area, and indeed there is a succession of several tables at this point, 
so trying to put text into the white area would mean that text from 
several pages later would be showing up here instead.

A couple of things surprise me about this.  First, I am a bit surprised 
that TeX prefers to break this table across the page, rather than move 
it whole to the next page as Word does.  I guess you could argue about 
the ugliness of white space, but I for one prefer it in this case to the 
alternative of either breaking the table, or trying to fill in the white 
space with text which logically belongs much later.
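
For comparison, this is roughly how I imagine the Word-like behavior 
would look in hand-written LaTeX, for a table that fits on one page.  It 
is only a sketch under my own assumptions: the minipage wrapper, the use 
of \captionof from the caption package, and all the contents are my 
guesses, not anything dblatex produces.

```latex
% Sketch only: keep a short table in one piece, without floating it.
% All contents are placeholders; this cannot help tables longer than a page.
\documentclass{book}
\usepackage{fontspec}   % font loading under XeLaTeX
\usepackage{caption}    % provides \captionof for captions outside floats
\raggedbottom           % leave white space at the page bottom, don't stretch
\begin{document}
Text before the table.

% A minipage is an unbreakable box: if it does not fit in the space left
% on the page, the whole thing moves to the next page, and no later text
% is pulled back to fill the gap.
\noindent
\begin{minipage}{\textwidth}
\centering
\captionof{table}{Placeholder caption}
\begin{tabular}{|l|p{6cm}|}
\hline
Form & Gloss \\
\hline
form-1 & gloss \\
\hline
\end{tabular}
\end{minipage}

Text after the table.
\end{document}
```

One caveat with this approach: a \footnote inside a minipage is set at 
the bottom of the minipage rather than the page, so tables with 
footnotes would need extra care.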

But even more surprising to me is the mess that *tex makes of the last 
line of this broken table.  I realize tables are hard, and footnotes make 
the problem of page layout still harder, but I would have thought it 
would get this right.

So my question: If you were doing this in xetex/latex, so that you could 
use whatever table package you wanted (instead of converting from 
DocBook), what would you do with these tables?  Should I be using some 
other package, instead of longtable?  Or maybe the problem is with the 
standard footnote package?  (I can of course provide the xetex input, if 
anyone is brave enough to look at it.)
-- 
	Mike Maxwell
	maxwell at umiacs.umd.edu
	"For over a thousand years, the British Empire was the guardian
	 of good grammar and the English language.
	 Before the dark times.  Before the Americans."
	--Bob Kenobi (Ben Kenobi's younger brother)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Table.JPG
Type: image/jpeg
Size: 120124 bytes
Desc: not available
Url : http://tug.org/pipermail/xetex/attachments/20071122/28d0eba1/attachment-0001.jpe 

