\pgfmathparse improvements #2485

Open: xworld21 wants to merge 15 commits into master


xworld21
Contributor

Fix #2482 and lots of other small issues with the parser.

My intention was to simply 'linearise' the grammar, i.e., make sure it never backtracks when parsing the rules. This should improve performance quite a bit (the parser is now guaranteed to run in linear time). I am wondering how much of that can be done for the maths grammar, which has dramatic performance issues.

However, while writing some tests, I discovered that the parser is very broken. So I fixed the easy stuff (wrong functions, string handling) and added a new test to exercise the basic functionality (right now, just the functions).

There are still major issues, e.g. you can't even evaluate bin(185) without LaTeXML claiming an overflow. Strings should become tokens in case they contain unexpanded primitives. The blacklist is very fragile. Maybe the LaTeXML parser should fall back to the PGF one when it fails.
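
For reference, a minimal test document for the `bin(185)` case could look like the sketch below. It is only an illustration, not a file from this PR, and it assumes real PGF under pdfTeX as the reference behaviour that LaTeXML should match.

```latex
% Hypothetical minimal test case; not taken from the PR's test suite.
\documentclass{article}
\usepackage{pgf}
\begin{document}
% 185 = 128 + 32 + 16 + 8 + 1, so its binary representation is 10111001.
% The digits are a base-2 string, not a decimal number to range-check.
\pgfmathparse{bin(185)}\pgfmathresult
\end{document}
```

The interesting part is that the result only looks like a large decimal number, which is presumably why treating it as one triggers the overflow.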

@dginev
Collaborator

dginev commented Jan 12, 2025

Thank you for looking into this! Could you also add the new test files to MANIFEST, so that we get CI working?

The one comment I will leave for the archive is that, yes indeed, the Perl handling for pgf math has been fragile, buggy and incomplete all at once. But importantly, so has the raw interpretation of the TeX sources of that package. So incremental progress here is good (and I am especially grateful you took the time to add so many new tests).

@xworld21 xworld21 changed the title from "Pgfmathparse" to "\pgfmathparse improvements" Jan 12, 2025
@xworld21
Contributor Author

The math parser is now slightly less broken, but still very broken. The operator precedences are all slightly wrong – they should have been set by reading the \pgfmathdeclareoperator commands in the sources.

Still wrong: the factorial operator binds higher than multiplication, but lower than power; the radians 'unit' is between addition and multiplication. Hopefully I got the rest right.
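
A sketch of test expressions that would expose these cases is given below. It assumes the description above is the intended PGF behaviour; the expressions themselves are hypothetical and not taken from the PR's tests, and the groupings in the comments follow from that description rather than from a verified run.

```latex
% Hypothetical precedence probes; the expected groupings are assumptions
% derived from the comment above, not verified output.
\documentclass{article}
\usepackage{pgf}
\begin{document}
% Factorial above multiplication: 2*(3!) = 12, whereas (2*3)! = 720.
\pgfmathparse{2*3!}\pgfmathresult\par
% Factorial below power: (2^2)! = 24, whereas 2^(2!) = 4.
\pgfmathparse{2^2!}\pgfmathresult\par
% Postfix r (radians to degrees) between addition and multiplication:
% 1+((pi/2) r) = 1 + 90, whereas (1+pi/2) r and 1+pi/(2 r) differ.
\pgfmathparse{1+pi/2 r}\pgfmathresult
\end{document}
```

Each expression is chosen so that the two plausible groupings give different values, which makes a wrong precedence visible in the output.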

@dginev are there notable examples from arXiv that might need all of this?

@dginev
Collaborator

dginev commented Jan 12, 2025

> @dginev are there notable examples from arXiv that might need all of this?

Oh yes, certainly, pgf is a very commonly used package. Looking at the loaded files report for ar5iv, pgfmathparser.code.tex gets loaded in 436,976 articles.

If you want to fish for examples, I remember from #2237 that most/all pgfplots use the math parser, and there are ~59,000 articles using that library. That previous PR patch was based on arXiv:2104.00602, and you can find a few other reported pgf-related issues at the ar5iv tracker. But naturally arXiv readers do not distill the problems down to the unit test level. It is possible that converting the pgf/tikz showcase sites is a better way to fish out relevant examples.

But yes, this is very much a sizeable need in arXiv.

@xworld21
Contributor Author

Uhm, the <pagination role="newpage"></pagination> error changes with the TeX Live version. Is there a standard solution for that particular problem?

@dginev
Collaborator

dginev commented Jan 13, 2025

> Is there a standard solution for that particular problem?

I am not really familiar with this kind of texlive version mismatch. So is the issue that there are two trailing \pars deposited just before \end{document}?

One simple workaround could be to replace the \par in your test macro with something like \newline, I suppose. But I would really need to go back to a texlive 2021 to test it out.

@dginev dginev requested review from dginev and brucemiller January 13, 2025 15:43
@dginev
Collaborator

dginev commented Jan 13, 2025

@xworld21 OK, tried on a server with an old texlive, it looks like there is an extra \clearpage somehow added to the end of even the smallest pgf-loading document:

\documentclass{article}
\usepackage{pgf}
\begin{document}\end{document}

What you can do for the CI test is to add \let\clearpage\relax after \usepackage{pgf}, as I assume you won't be needing that feature in this particular math-oriented setup. That works in my setup.
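
Put together, the test document would look roughly like the sketch below. The `\pgfmathparse{1+1}` line is only a placeholder expression and is not taken from the actual test file.

```latex
\documentclass{article}
\usepackage{pgf}
% Workaround: neutralise \clearpage so that the extra page break seen
% with the older texlive does not end up in the test output.
\let\clearpage\relax
\begin{document}
% Placeholder body; the real test exercises the math functions.
\pgfmathparse{1+1}\pgfmathresult
\end{document}
```

Note that this disables page clearing for the whole document, which is acceptable only because the test is about math evaluation and not layout.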

@xworld21
Contributor Author

> What you can do for the CI test is to add \let\clearpage\relax after \usepackage{pgf}

Done!

In the meantime, I changed my mind. I can definitely fix the radians operator at least, and implement gcd. I'll do more tweaks and throw in some long expressions without parentheses in the tests, to catch other precedence errors.
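
As a rough idea of what such tests might contain, here is a sketch with a `gcd` call and a few unparenthesised expressions. The specific expressions are made up for illustration, and the values in the comments are the mathematical results under conventional precedence and left associativity, not captured output.

```latex
% Hypothetical test expressions; not the ones added in this PR.
\documentclass{article}
\usepackage{pgf}
\let\clearpage\relax
\begin{document}
% gcd(42,56) = 14
\pgfmathparse{gcd(42,56)}\pgfmathresult\par
% 1 + (2*(3^2)) - (4/2) = 1 + 18 - 2 = 17
\pgfmathparse{1+2*3^2-4/2}\pgfmathresult\par
% Left-associative subtraction: (10-4)-3 = 3, whereas 10-(4-3) = 9
\pgfmathparse{10-4-3}\pgfmathresult
\end{document}
```

Long chains like the second line exercise several precedence levels at once, so a single misplaced level tends to change the result.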

@xworld21 xworld21 marked this pull request as ready for review January 18, 2025 18:24
Successfully merging this pull request may close these issues.

\pgfmathparse calls every function *twice*
2 participants