<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;">
Hi Ulrike, Karl, Thanh and others.
<div><br>
</div>
<div>Happy New Year.</div>
<div><br>
<div><br>
</div>
<div>Hmm. I missed seeing this message months back.</div>
<div><br>
</div>
<div>I agree with Ulrike: (fake) space characters are now missing at line-breaks within a paragraph.</div>
<div>This affects text extraction, in particular when deriving HTML, as consecutive words get concatenated.</div>
<div>It’s an effect which is ‘liveable-with’ but virtually impossible to detect and fix in any automatic way.</div>
<div><br>
</div>
<div>Previously (going back roughly 12+ years, to when \pdffakespace was first introduced) spaces</div>
<div>*were* included at line-breaks. (Even after hyphenations, but these are detectable automatically.)</div>
<div>So at some point (in the last 1-2 years?) the algorithm must have changed,</div>
<div>or some parameter has been given a different value.</div>
<div><br>
</div>
<div>Can we please revisit this?</div>
<div><br>
</div>
<div>All the best.</div>
<div><br>
</div>
<div> Ross</div>
<div><br>
<div><br>
<blockquote type="cite">
<div>On 24 Jul 2024, at 7:23 pm, Ulrike Fischer &lt;news3@nililand.de&gt; wrote:</div>
<br class="Apple-interchange-newline">
<div>
<div>If one changes the font there are no space chars at the border. E.g.<br>
with<br>
<br>
\pdfcompresslevel0<br>
\pdfobjcompresslevel0<br>
\font\test=cmss10 <br>
\pdfinterwordspaceon<br>
<br>
text text {\test cmss cmss} text text<br>
<br>
\bye<br>
<br>
there is a space char between "text text" and "cmss cmss":<br>
<br>
[(text)]TJ/F51 9.9626 Tf( )Tj/F1 9.9626 Tf 20.756 0 Td [(text)] <br>
[(cmss)]TJ/F51 9.9626 Tf( )Tj/F20 9.9626 Tf 23.302 0 Td [(cmss)]<br>
<br>
But nothing between "text cmss" and "cmss text"<br>
<br>
[(text)]TJ/F20 9.9626 Tf 20.755 0 Td [(cmss)] <br>
[(cmss)]TJ/F1 9.9626 Tf 23.301 0 Td [(text)]<br>
<br>
One can insert the missing chars manually with \pdffakespace but<br>
perhaps an automatic solution is possible?<br>
<br>
-- <br>
Ulrike Fischer <br>
<a href="http://www.troubleshooting-tex.de">http://www.troubleshooting-tex.de/</a><br>
</div>
</div>
</blockquote>
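<div><br>
</div>
<div>For reference, a minimal sketch of the manual workaround mentioned in the quoted message. It assumes that a \pdffakespace placed by hand at each font change is enough to supply the missing space characters. The exact placement (directly before the inter-word glue) is an assumption, and this has not been checked against current pdfTeX output.</div>
<pre>
\pdfcompresslevel0
\pdfobjcompresslevel0
\font\test=cmss10
\pdfinterwordspaceon

% \pdffakespace added by hand at the two font changes;
% the {} keeps the primitive from swallowing the following space
text text\pdffakespace{} {\test cmss cmss}\pdffakespace{} text text

\bye
</pre>
<div>If this works as assumed, the uncompressed content stream should show a ( )Tj in the dummy-space font between "text" and "cmss" at both borders, as it already does between the other word pairs.</div>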
</div>
<br>
</div>
</div>
</body>
</html>