lualatex vs pdflatex- char encoding issue not fixed by suggestions found in CTAN etc

Mike Marchywka marchywka at
Thu Feb 17 14:02:29 CET 2022

On Thu, Feb 17, 2022 at 12:27:25PM +0000, David Carlisle wrote:
>    On Thu, 17 Feb 2022 at 10:39, Mike Marchywka <marchywka at> wrote:
>      On Thu, Feb 17, 2022 at 10:55:18AM +0100, Ulrike Fischer wrote:
>      > Am Wed, 16 Feb 2022 18:29:02 -0500 schrieb Mike Marchywka:
>      >
>      > > I recently changed my default builds from pdflatex to lualatex
>      > > and now the foreign chars are creating cryptic failure modes.
>      >
>      > Well it would help if you would mention the actual error (and would
>      > show a concrete example)
>      yes, I just thought this was a well known problem and it
>      had "A{n}" accepted solution...
>      >
>      > > When I finally figured it out, neither of these suggestions
>      > > seemed to help.
>      > >
>      > >
>      []
>      df
>      >
>      > > \usepackage[OT1]{fontenc}
>      > >
>      > > \usepackage{luainputenc}
>      >
>      > Do not use fontenc (OT1 or T1) with lualatex.
>      > Do not use luainputenc.
>      >
>      I thought these were supposed to make lualatex act like pdflatex
>      which was my only immediate objective.
>    It does (more or less), by choosing 7-bit legacy fonts without even European accented letters,
>    so it basically disables all LuaTeX Unicode support, in which case you may as well use pdflatex.
>    In particular, I would not use the version you used:
>    This is LuaTeX, Version beta-0.80.0
>    There were many changes in LuaTeX before the release of LuaTeX version 1 (which has been stable for some years; the LuaTeX
>    team is already working on its successor). The current version has the banner
>    This is LuaTeX, Version 1.13.2
>      >
>      > Encode your files in utf8 or use pure ascii.
>      >
>      I was trying to avoid the issue but since everything seemed to work
>      I did not worry about it. I have been changing my bibtex
>      downloader, TooBib. I guess I could have checked the download
>      dates of the offending entries but since it worked on pdflatex
>      I didn't think it was a factor.
>      In essence I did make everything ASCII although I never took a
>      char histogram looking for high bit set.
>      However, instead of converting offenders to latex I just picked
>      a close ASCII char.
>    I would have thought the main feature of a bibliography tool would be to handle the names of people cited.

In essence, the citation AFAICT should give the reader some way to evaluate
important attributes of the source and decide on "hitting the link."
Personally I use the titles and maybe the journal, but sure, a lot of people look at
authors. In that case, anything close should "ring a bell" for the ad hominem conscious :)
But yes, I was very worried about search tools other than Google, which can auto-correct
or suggest. I'm not sure how you search for LaTeX commands within names, or if one is
better than the other. And if the link doesn't work, the reader has the DOI...

And of course I don't like typos or inaccuracies in the bib files. So everything I
download with TooBib contains a "srcurl" to replay the bibtex discovery. This
allows for detection of errors in a prior download or in prior versions on the source site.
If the original source webpage is no good (some of these stupid sites have time-limited
generated strings in the URL), hopefully the overall bibtex entry is good
enough to find a better link, either from a DOI or by manual work.

>    If you silently change names without warning, is that really usable other than for testing?
>    How do you choose close ascii for Greek or Cyrillic or Chinese names? And even for accented latin alphabet
>    you can't drop the accents without changing people's names, or introducing errors into titles of papers?

See the above. Certainly typos are bad, but these should be easy to correct when it
matters, and a reader who is familiar with an author can mentally figure it out.
AFAICT, Greek and Cyrillic transliterate well and I tend to ignore accents etc :)
That said, transliteration of Russian names has been a problem, and you can see things
like -ov vs -off, IIRC, that vary with the translator (again IIRC; you are probably
more familiar with this, and to me it is just a quirk or oddity).
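
For accented Latin letters, an alternative to picking a close ASCII char is to
emit the LaTeX accent command instead. Below is a minimal Python sketch of that
idea, assuming the input is already decoded Unicode text. The name to_latex and
the small ACCENTS table are made up for illustration and are not part of TooBib;
the table is incomplete, and anything it does not cover is passed through unchanged.

```python
import unicodedata

# Combining mark -> LaTeX accent command (hypothetical, deliberately small table).
ACCENTS = {
    "\u0300": "`",   # grave
    "\u0301": "'",   # acute
    "\u0302": "^",   # circumflex
    "\u0303": "~",   # tilde
    "\u0308": '"',   # diaeresis / umlaut
    "\u030c": "v",   # caron
    "\u0327": "c",   # cedilla
}

def to_latex(text):
    """Replace accented Latin letters with braced LaTeX accent commands."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)            # plain ASCII stays as it is
            continue
        # Canonical decomposition: e.g. U+00FC becomes "u" + U+0308.
        decomp = unicodedata.normalize("NFD", ch)
        if len(decomp) == 2 and ord(decomp[0]) < 128 and decomp[1] in ACCENTS:
            base, mark = decomp
            out.append("{\\%s{%s}}" % (ACCENTS[mark], base))
        else:
            out.append(ch)            # unknown: leave untouched, do not guess
    return "".join(out)
```

Something like to_latex("Müller") would then give M{\"{u}}ller, which keeps the
name intact and is still pure ASCII in the .bib file.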

I guess I will have to fix this for a variety of reasons, although even for
a proof-of-concept for "bill of materials" TeX it is not the immediate concern.
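
As a first step, the "char histogram looking for high bit set" mentioned above
could look something like the following. This is only a sketch in Python;
high_bit_histogram and offending_lines are hypothetical names, not TooBib
functions. It works on raw bytes, so it makes no assumption about the encoding
of the file.

```python
from collections import Counter

def high_bit_histogram(data):
    """Count each byte value >= 0x80 (high bit set) in raw file data."""
    return Counter(b for b in data if b >= 0x80)

def offending_lines(data):
    """Return (line number, raw line) pairs for lines containing non-ASCII bytes."""
    hits = []
    for lineno, line in enumerate(data.splitlines(), 1):
        if any(b >= 0x80 for b in line):
            hits.append((lineno, line))
    return hits
```

Reading a .bib file with open(path, "rb").read() and passing the result to
these two functions should show whether the file is really pure ASCII, and if
not, which entries need attention.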


>    David


mike marchywka
306 charles cox
canton GA 30115
USA, Earth 
marchywka at
ORCID: 0000-0001-9237-455X
