regression in TeX Live 2023 concerning some math characters

Fri Nov 24 01:49:59 CET 2023

On 2023-11-23 17:38:50 +0100, Ulrike Fischer wrote:
> Are you using the same pdftotext to extract the text?

By default, no. But I also tested with the same pdftotext version
by scp'ing the generated PDF file to the other machine (stable →
unstable and unstable → stable). In both cases, the output doesn't
depend on which pdftotext is used.

> If you add 
> \ExplSyntaxOn\sys_ensure_backend:\pdf_uncompress:\ExplSyntaxOff
> to the begin and recompile you get uncompressed PDFs and can try to
> diff them. Are then any differences in the ToUnicode?

The ToUnicode CMap data get removed, except for:

1 begincodespacerange
<00> <FF>
endcodespacerange
0 beginbfrange
endbfrange
3 beginbfchar
<66> <0066>
<69> <0069>
<6C> <006C>
endbfchar
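
For reference, a minimal test file using the line suggested above
could look like the following sketch. The document class and the
body text are assumptions for illustration, not the file actually
used for these tests.

```latex
\documentclass{article}
% Line from the quoted suggestion: make sure the backend is loaded,
% then switch off compression so that the PDF objects can be diffed.
\ExplSyntaxOn
\sys_ensure_backend:
\pdf_uncompress:
\ExplSyntaxOff
\begin{document}
% Placeholder content (an assumption); any text and math will do.
Some text and math: $f$, $i$, $l$.
\end{document}
```

Compiling this with pdflatex on both machines should give PDF files
whose ToUnicode CMap streams are readable as plain text and can be
compared with diff.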

BTW, I usually have

\pdfglyphtounicode{f}{0066}
\pdfglyphtounicode{i}{0069}
\pdfglyphtounicode{l}{006C}

in a glyphtounicode.tex file in my home directory. But I removed the
file for the tests, and I could see with strace that there is no
attempt to read such a file anyway. I hope that pdflatex doesn't use
some form of stale cache.
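
As a sketch of how such mappings are commonly enabled explicitly in a
pdfLaTeX document (this is an assumption about a typical setup, not a
description of the configuration tested here; the class and body text
are placeholders):

```latex
\documentclass{article}
% Load the standard glyph-name to Unicode table found on the TeX
% search path and ask pdfTeX to write ToUnicode CMaps.
\input{glyphtounicode}
\pdfgentounicode=1
% Per-glyph entries of the same form as the three lines above.
\pdfglyphtounicode{f}{0066}
\pdfglyphtounicode{i}{0069}
\pdfglyphtounicode{l}{006C}
\begin{document}
% Placeholder content (an assumption).
Some text and math: $f$, $i$, $l$.
\end{document}
```

With these settings given in the document itself, the result should
not depend on any personal glyphtounicode.tex, which may help to
separate the effect of the format from that of local files.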

-- 
Vincent Lefèvre <vincent at vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)