Lots of invalid UTF-8 sequences when using Arabic Babel

outlook user RACP at outlook.fr
Sun May 12 15:35:27 CEST 2024


 
> l.138 \or ����������� �  \or 
>                    ������������  \or �����������^^@�^^@
> A funny symbol that I can't read has just been (re)read.
> Just continue, I'll change it to 0xFFFD.


This is with LuaHBTeX 1.16.0 from TeX Live 2023, running LuaLaTeX 2023.8.28. The error appears as soon as `\usepackage{babel}` is changed to `\usepackage[arabic]{babel}`.
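
A minimal test file along these lines should be enough to check the behaviour. It is only a sketch: it assumes the file is saved as UTF-8 and compiled with `lualatex`, and that the `arabic` option is the one that pulls in arabicore.sty. The body text is a placeholder.

```latex
% Minimal test file (assumption: saved as UTF-8, compiled with lualatex)
\documentclass{article}
% Changing this line back to \usepackage{babel} is reported to make the error disappear
\usepackage[arabic]{babel}
\begin{document}
Test
\end{document}
```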

Maybe Babel tries to use cp1256. Or maybe arabicore.sty expects everything to be in Arabic with no mixing whatsoever, or maybe it trips over invisible characters somewhere. I don't know.

From what I understand, LuaTeX uses UTF-8 by default, and loading inputenc conflicts with that because it tries to redefine the encoding, which is fixed in LuaTeX and XeTeX, so it ends in a "crash". So Babel (or whatever invokes it) shouldn't do that when the engine is not pdfTeX (or whatever else needs it), as in this case. I think it tries to use cp1256 here, even though I never asked for it. Worse, some calls take priority, so loading inputenc again to ask for UTF-8 won't fix the problem, because only the first call counts.
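
The idea that an input-encoding package should only be loaded under pdfTeX can be sketched in a user preamble as below. This is an illustration under assumptions, not a fix for the error above: it relies on the iftex package and its `\ifPDFTeX` test, which are not part of my original setup, and it does not change whatever the `arabic` option loads internally.

```latex
\documentclass{article}
\usepackage{iftex}              % provides \ifPDFTeX, \ifLuaTeX, \ifXeTeX
\ifPDFTeX
  % 8-bit engine: declare the encoding of the source file explicitly
  \usepackage[utf8]{inputenc}
\fi
% LuaTeX and XeTeX read UTF-8 natively, so nothing is loaded for them
\begin{document}
Test
\end{document}
```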


