Thu 11 May: TeX Hour: Using LaTeXML to access audit arXiv LaTeX source files

Jonathan Fine jfine2358 at gmail.com
Wed May 10 21:42:33 CEST 2023


Hi

The arXiv has about 2.5 million articles, most of which have been processed
with LaTeX to produce PDF. In addition, most of these LaTeX articles have
been processed with LaTeXML, to produce HTML. Recently the arXiv has
announced it will be making this HTML available, to improve accessibility.
Tomorrow's TeX Hour is about using LaTeXML to audit accessibility of the
arXiv LaTeX source.

TeX Hour: Thursday 11 May, 6:30 to 7:30pm BST
More information:
https://texhour.github.io/2023/05/11/latex-access-audit-latex/
Zoom URL:
https://us02web.zoom.us/j/78551255396?pwd=cHdJN0pTTXRlRCtSd1lCTHpuWmNIUT09

LaTeXML produces a log file, containing warnings and errors. It provides to
some degree an accessibility audit of the LaTeX source files on the arXiv.
Tomorrow's TeX Hour is an informal preliminary report on my efforts to use
these log files to audit arXiv source for accessibility. Results so far are
outnumbered by problems, but it's early days.
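
As a rough illustration of how such a log might be summarised, here is a
minimal Python sketch. It assumes that each log message starts with a severity
word (Info, Warning, Error or Fatal) followed by a colon and a category, as in
"Warning:missing_file:foo.sty". The function name tally_log and the sample log
are made up for illustration and are not taken from a real arXiv article.

```python
import re
from collections import Counter

# Assumed message shape: "Severity:category:object rest of message".
MESSAGE = re.compile(r"^(Info|Warning|Error|Fatal):([^:\s]+)")


def tally_log(text):
    """Count (severity, category) pairs in the text of a conversion log."""
    counts = Counter()
    for line in text.splitlines():
        match = MESSAGE.match(line)
        if match:
            counts[match.group(1), match.group(2)] += 1
    return counts


# Made-up sample log, for illustration only.
SAMPLE = """\
Warning:missing_file:foo.sty Can't find package foo
Error:undefined:\\foo The token is not defined
Warning:missing_file:bar.sty Can't find package bar
"""

if __name__ == "__main__":
    # Print the most frequent kinds of message first.
    for (severity, category), n in tally_log(SAMPLE).most_common():
        print(severity, category, n)
```

Run over many logs, a tally like this could show which kinds of problem are
most common, which may be a useful first step before looking at individual
articles.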

Going to https://ar5iv.labs.arxiv.org/feeling_lucky will send you to a
random arXiv article in HTML. At the bottom of that page there is a link to
the LaTeX-to-HTML conversion report (the log file), and also the arXiv PDF.
Getting the LaTeX source is more work. Automating all this is one of the
early problems.
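
One possible starting point for the automation, sketched in Python. The URL
patterns used here (/html/<id> and /log/<id> on ar5iv, /pdf/<id> and
/e-print/<id> on arxiv.org) are assumptions that should be checked against the
live sites, and the function names are hypothetical.

```python
import re
from urllib.request import urlopen

# Assumption: ar5iv article pages look like .../html/<arXiv id>.
ARTICLE = re.compile(r"^https://ar5iv\.labs\.arxiv\.org/html/(?P<id>[^?#]+)$")


def related_urls(html_url):
    """Map an ar5iv article URL to guessed URLs for its log, PDF and source."""
    match = ARTICLE.match(html_url)
    if match is None:
        raise ValueError("not an ar5iv article URL: " + html_url)
    arxiv_id = match.group("id")
    return {
        "id": arxiv_id,
        "log": "https://ar5iv.labs.arxiv.org/log/" + arxiv_id,  # assumed pattern
        "pdf": "https://arxiv.org/pdf/" + arxiv_id,  # assumed pattern
        "source": "https://arxiv.org/e-print/" + arxiv_id,  # assumed pattern
    }


def random_article_url():
    """Follow the feeling_lucky redirect and return the final URL (needs network)."""
    with urlopen("https://ar5iv.labs.arxiv.org/feeling_lucky") as response:
        return response.geturl()
```

Calling related_urls(random_article_url()) would then give, for one random
article, the places to look for the log, the PDF and the LaTeX source, provided
the assumed patterns hold.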

wishing you safe and accessible TeXing

Jonathan