<div dir="ltr">I believe this is largely a poppler problem. I'd be happy to discuss it a bit more if you would like.<div><br></div><div>-tom</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Nov 24, 2019 at 2:47 AM Mike Marchywka <<a href="mailto:marchywka@hotmail.com">marchywka@hotmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Sun, Nov 24, 2019 at 12:11:07AM +0000, Mike Marchywka wrote:<br>
> <br>
> I have never seen this before, but it looks like a stupid font problem<br>
> that is likely to be common with many PDFs now. If I just run <br>
> "pdftotext" on my output, I get weird boxes where each "fi"<br>
> is. If I use "-enc ASCII7", the entire thing is deleted.<br>
> <br>
> I could probably create a minimal working example but thought someone<br>
> may know offhand. Thanks. <br>
<br>
Never mind, I figured it out :) I had added this stupid thing<br>
<br>
\usepackage[T1]{fontenc}<br>
<br>
to fix another problem. So if you are finding that pdftotext output<br>
is jumbled, or you want to use the pdf (and maybe dvi) format<br>
to obscure information that would be readable in a normal text file,<br>
this seems to work:<br>
<br>
<br>
<br>
\documentclass{article}<br>
\usepackage[T1]{fontenc}<br>
\usepackage{hyperref}<br>
\hypersetup{<br>
pdfinfo={<br>
x-bib-author = {A. Writer},<br>
x-bib-journal = {Test},<br>
x-bib-buy-url = {<a href="https://buyexpensivejunk" rel="noreferrer" target="_blank">https://buyexpensivejunk</a>}<br>
}<br>
}<br>
<br>
\newcommand{\addbib}[2]<br>
{<br>
\hypersetup{<br>
pdfinfo={ x-bib-#1 = {#2} } }<br>
<br>
}<br>
\addbib{author}{marchywka}<br>
\addbib{title}{my title}<br>
\addbib{something}{foobar abstract asdfasdfa }<br>
<br>
\begin{document}<br>
test<br>
a word that defines the problem, d e f i n e s<br>
\end{document}<br>
<br>
<br>
Compiling to pdf and converting back to text gives this (note the missing "fi" in "de nes"):<br>
<br>
cat schumann.pdf | pdftotext - - <br>
test a word that de nes the problem, d e f i n e s<br>
<br>
1<br>
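<br>
For anyone who wants the opposite effect (text that extracts cleanly), here is a minimal sketch.<br>
It assumes pdflatex is in use and that the lost "fi" is a ligature glyph with no Unicode mapping<br>
in the PDF; neither assumption has been verified against the file above. The packages cmap and<br>
lmodern and the pdfTeX primitive \pdfgentounicode are not part of the original example.<br>
<br>
```latex
% Sketch only: assumes pdfTeX (pdflatex) and that the "fi" is lost because
% the ligature glyph has no Unicode mapping in the generated PDF.
\documentclass{article}
\usepackage{cmap}          % attaches character maps; load it before fontenc
\usepackage[T1]{fontenc}
\usepackage{lmodern}       % outline (Type 1) fonts in T1 encoding, instead of a possible bitmap fallback
\input{glyphtounicode}     % pdfTeX's glyph-name to Unicode table
\pdfgentounicode=1         % ask pdfTeX to write ToUnicode maps for the fonts
\begin{document}
test
a word that defines the problem, d e f i n e s
\end{document}
```

<br>
Running "pdffonts" (also from poppler) on the resulting pdf lists each font's type and whether it<br>
carries a ToUnicode map (the "uni" column), which may help confirm whether the assumption holds.<br>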
<br>
<br>
<br>
<br>
> <br>
> This is the version,<br>
> <br>
> pdftotext -v<br>
> pdftotext version 0.41.0<br>
> Copyright 2005-2016 The Poppler Developers - <a href="http://poppler.freedesktop.org" rel="noreferrer" target="_blank">http://poppler.freedesktop.org</a><br>
> Copyright 1996-2011 Glyph & Cog, LLC<br>
> <br>
> and basic info on the pdf file,<br>
> exifutil -list vitaprop.pdf<br>
> ExifTool Version Number : 11.75<br>
> File Name : vitaprop.pdf<br>
> Directory : .<br>
> File Size : 287 kB<br>
> File Modification Date/Time : 2019:11:23 06:17:53-05:00<br>
> File Access Date/Time : 2019:11:23 06:17:53-05:00<br>
> File Inode Change Date/Time : 2019:11:23 06:17:53-05:00<br>
> File Permissions : rw-rw-r--<br>
> File Type : PDF<br>
> File Type Extension : pdf<br>
> MIME Type : application/pdf<br>
> PDF Version : 1.5<br>
> Linearized : No<br>
> Page Count : 12<br>
> Page Mode : UseOutlines<br>
> Author : <br>
> Title : <br>
> Subject : <br>
> Creator : LaTeX with hyperref package<br>
> Producer : pdfTeX-1.40.16<br>
> Create Date : 2019:11:23 06:17:52-05:00<br>
> Modify Date : 2019:11:23 06:17:52-05:00<br>
> Trapped : False<br>
> PTEX Fullbanner : This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015/Debian) kpathsea version 6.2.1<br>
> <br>
> <br>
> -- <br>
> <br>
> mike marchywka<br>
> 306 charles cox<br>
> canton GA 30115<br>
> USA, Earth <br>
> <a href="mailto:marchywka@hotmail.com" target="_blank">marchywka@hotmail.com</a><br>
> 404-788-1216<br>
> ORCID: 0000-0001-9237-455X<br>
> <br>
<br>
<br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div>-- <a href="http://cube20.org/" target="_blank">http://cube20.org/</a> -- <a href="http://golly.sf.net/" target="_blank">http://golly.sf.net/</a> --</div></div></div></div></div>