<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Hi All,<div><br></div><div>Sorry to go off-topic here, but it is necessary for the debate.</div><div>Please forgive me.</div><div><br></div><div>I have to side more with Philip.</div><div><br></div><div>What most are forgetting is what (Xe)TeX is intended for.</div><div>It is, for the most part, a typesetting program (you do mention this below).</div><div>It was not designed to handle different languages or to truly</div><div>do word processing in the modern sense. </div><div><br></div><div>Due to the power of the TeX engine, it evolved to deal with different languages</div><div>and newer output methods and encodings. The problem with TeX is that the basic </div><div>engine has not been redesigned to handle these new developments well.</div><div>The internals need to be completely revamped.</div><div><br><div><div>On 17.11.2011 at 20:36, Ross Moore wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div bgcolor="#FFFFFF"><div>Hi Phil,</div><div><br>On 17/11/2011, at 23:53, Philip TAYLOR <<a href="mailto:P.Taylor@Rhul.Ac.Uk">P.Taylor@Rhul.Ac.Uk</a>> wrote:<br><br></div><div></div><blockquote type="cite"><div><span></span><span>Keith J. 
Schultz wrote:</span><br><blockquote type="cite"><span></span></blockquote><blockquote type="cite"><blockquote type="cite"><font class="Apple-style-span" color="#005001"><br></font></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>You mention in a later post that you do consider a space as a printable character.</span><br></blockquote></blockquote><blockquote type="cite"><span> This line should read as:</span><br></blockquote><blockquote type="cite"><span> You mention in a later post that you consider a space as a non-printable character.</span><br></blockquote><span></span><br><span>No, I don't think of it as a "character" at all, when we are talking</span><br><span>about typeset output (as opposed to ASCII (or Unicode) input). </span></div></blockquote><div><br></div>This is fine, when all that you require of your output is that it be visible on<div>a printed page. But modern communication media go well beyond that.</div><div>A machine needs to be able to tell where words and lines end, to reflow paragraphs when appropriate, and to produce a flat extraction of all the text, perhaps also with some indication of the purpose of that text (e.g. by structural tagging).</div></div></blockquote><span class="Apple-tab-span" style="white-space:pre">	</span>I would agree with you, but TeX was not designed as a communications program; it was designed for creating printed media.</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>Furthermore, it may be desirable in the modern world to have every program's output usable as input for another program.</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>This ideal is utopian. If you need to pass the output of one program (medium) to another, then you will need an intermediate program/filter</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>in order to reformat/convert the differences. 
As with all types of communication, there will be structures missing/lacking in the other</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>system. So a one-to-one conversion will not be possible. You will need to use some kind of heuristics or, in modern terms, intelligence.<br><blockquote type="cite"><div bgcolor="#FFFFFF"><div><br></div><div>In short, what is output for one format should also be able to serve as input for another.</div></div></blockquote><span class="Apple-tab-span" style="white-space:pre">	</span>This assertion is completely idealistic. Then again, it is true. It is possible, today, to design a system that goes from audio, to TeX, to printed documents,</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>to audio again. Yet you will need a lot of effort, and most likely the results will be far from perfect. Still, it is workable, though it requires considerable</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>resources.<br><blockquote type="cite"><div bgcolor="#FFFFFF"><div><br></div><div>Thus the space certainly does play the role of an output character – though the presence of a gap in the positioning of visible letters may serve this role in many, but not all, circumstances.</div></div></blockquote><span class="Apple-tab-span" style="white-space:pre">	</span>This depends on what you are outputting. 
For a printed page that is consumed by a human it does not matter, because humans do not process space characters, just space, and they even</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>at times ignore them completely, because they are irrelevant to their natural language processing.</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>For computers, on the other hand, the use of a space character can be very relevant.</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre">	</span>In the early days of TeX and LaTeX I have known people to create their e-mail with TeX. So you can see TeX is capable of producing character-based output.</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>Furthermore, TeX could be used to produce any character-based format as its output. <br><blockquote type="cite"><div bgcolor="#FFFFFF"><div><br><blockquote type="cite"><div><span>Clearly</span><br><span>it is a character on input, but unless it generates a glyph in the</span><br><span>output stream (which TeX does not, for normal spaces) then it is not</span><br><span>a character (/qua/ character) on output but rather a formatting</span><br><span>instruction not dissimilar to (say) end-of-line.</span><br></div></blockquote><div><br></div>But a formatting instruction for one program cannot serve as reliable input for another.</div><div>A heuristic is then needed, to attempt to infer that a programming instruction must have been used, and guess what kind of instruction it might have been. This is not 100% reliable, so is deprecated in modern methods of data storage and document formats.</div></div></blockquote><span class="Apple-tab-span" style="white-space:pre">	</span>Are you not contradicting yourself here? See above.<br><blockquote type="cite"><div bgcolor="#FFFFFF"><div>XML based formats use tagging, rather than programming instructions. 
This is the modern way, which is used extensively for communicating data between different software systems.</div></div></blockquote><span class="Apple-tab-span" style="white-space:pre">	</span>True, it is used for communicating data. Yet you are mistaken in thinking that it truly solves any of the problems involving different data types or content!</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>You can get a parse tree of the data, yet if a program cannot understand or process the data/content, it is useless. </div><div><span class="Apple-tab-span" style="white-space:pre">	</span>Agreed, the XML file contains information about its structure and is human-readable, yet it does NOTHING to convert from one format to another. You still need a parser/filter to </div><div><span class="Apple-tab-span" style="white-space:pre">	</span>convert into another format. </div><div><span class="Apple-tab-span" style="white-space:pre">	</span>Do not forget you can put practically anything in an XML file: a program, an image, a TeX file, a PDF, etc., though I would not advise it.<br><blockquote type="cite"><div bgcolor="#FFFFFF"><div><br></div><div><blockquote type="cite"><div><span></span><br><span>** Phil.</span><br></div></blockquote><br></div><div>TeX's strength is in its superior ability to position characters on the page for maximum visual effect. This is done by producing detailed programming instructions within the content stream of the <span class="Apple-style-span" style="-webkit-tap-highlight-color: rgba(26, 26, 26, 0.296875); -webkit-composition-fill-color: rgba(175, 192, 227, 0.230469); -webkit-composition-frame-color: rgba(77, 128, 180, 0.230469); ">PDF output. 
However, this is not enough to meet the needs of formats such as EPUB, non-visual reading software, archival formats, searchability, and other needs.</span></div></div></blockquote><span class="Apple-tab-span" style="white-space:pre">	</span>You are probably a little young to know this, but TeX's original output format was a DVI file. Only more recent engines produce PDF. It is possible to create engines that output EPUB. If your TeX skills are adequate, you</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>do not even need to create a new engine. TeX can output files in any format if you know how to do it. </div><div><br><blockquote type="cite"><div bgcolor="#FFFFFF"><div>Tagged PDF can be viewed as Adobe's response to address these requirements as an extension of the visual aspects of the PDF format. It is a direction in which TeX can (<span class="Apple-style-span" style="-webkit-tap-highlight-color: rgba(26, 26, 26, 0.292969); -webkit-composition-fill-color: rgba(175, 192, 227, 0.230469); -webkit-composition-frame-color: rgba(77, 128, 180, 0.230469); ">and surely must) move, to stay relevant within the publishing industry of the future.</span></div></div></blockquote><span class="Apple-tab-span" style="white-space:pre">	</span>TeX used to be an industry standard. Processing power has since advanced to the point where using TeX in the publishing industry has become inefficient, and other systems are</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>easier and faster for humans to operate.</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre">	</span>That TeX has survived this long is amazing. Still, it remains one of the most powerful and least expensive typesetting systems to date. 
</div><div><br></div><div>regards</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>Keith.</div><div><br></div><div><br></div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span></div><br></div></body></html>