for a list of math symbols in all the papers, what are the various definitions of each symbol? #3

Open
bhpayne opened this issue May 29, 2021 · 4 comments

Comments

@bhpayne
Member

bhpayne commented May 29, 2021

Suppose we can pick out math symbols from all the papers.

  • For the most common symbol, what are the definitions for that symbol?
  • For the second most common symbol, what are the definitions for that symbol?
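
The ranking step in the two questions above can be sketched as follows. This is a minimal sketch under stated assumptions, not code from the repository: it assumes inline math is delimited by single dollar signs and approximates a "symbol" as a LaTeX command or a single letter with an optional simple subscript. The names `count_symbols`, `INLINE_MATH`, `SYMBOL`, and `LAYOUT` are hypothetical.

```python
import re
from collections import Counter

# Assumption: inline math is written as $...$ (display math is ignored here).
INLINE_MATH = re.compile(r"\$([^$]+)\$")
# Assumption: a symbol is a LaTeX command (\alpha) or one letter with an
# optional simple subscript (v_u, m_{e}).
SYMBOL = re.compile(r"\\[A-Za-z]+|[A-Za-z](?:_(?:[A-Za-z0-9]|\{[^{}]*\}))?")
# Assumption: a small, incomplete set of layout commands that are not symbols.
LAYOUT = {r"\frac", r"\left", r"\right", r"\sqrt"}


def count_symbols(papers):
    """Count how often each symbol appears across an iterable of LaTeX strings."""
    counts = Counter()
    for text in papers:
        for math in INLINE_MATH.findall(text):
            for token in SYMBOL.findall(math):
                if token not in LAYOUT:
                    counts[token] += 1
    return counts


if __name__ == "__main__":
    papers = [
        r"the fine structure constant $\alpha$ and the bound $v_u$",
        r"We find that $\frac{v_u}{c}=\alpha$",
    ]
    # Print the two most frequent symbols with their counts.
    for symbol, n in count_symbols(papers).most_common(2):
        print(symbol, n)
```

The ranked symbols would then be the keys for looking up definitions, one symbol at a time.
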
@bhpayne
Member Author

bhpayne commented May 29, 2021

Symbol definitions, if not in the paper itself, might be in the papers it cites; bibliographic citation tracing could locate them.
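
The first step of citation tracing, collecting the citation keys a paper uses, might look like the sketch below. It is an illustration only: it assumes citations use `\cite`-style commands (`\cite`, `\citep`, `\citet`, with optional bracketed notes), and the name `cited_keys` is hypothetical. Resolving each key to an actual paper is outside this sketch.

```python
import re

# Assumption: citations look like \cite{a,b}, \citep[note]{c}, \citet*{d}.
CITE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")


def cited_keys(tex):
    """Return the unique citation keys in a LaTeX string, in order of first use."""
    keys = []
    for group in CITE.findall(tex):
        for key in group.split(","):
            key = key.strip()
            if key and key not in keys:
                keys.append(key)
    return keys


if __name__ == "__main__":
    sample = r"as in \cite{a1, b2} and \citep[p.~3]{c3}, see also \cite{a1}"
    print(cited_keys(sample))
```
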

@msgoff
Contributor

msgoff commented May 21, 2023

xz -d HEP_TEX.model.xz
pip install -r requirements.txt
python resolve_symbol_definitions.py tex_file

The script tries to map every variable name to its definition(s) and/or properties in the file.
It currently maps 10-30% of definitions. For the rest, it builds a concordance: a dictionary that maps each variable to the list of sentences that use it.
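
The concordance fallback described above can be sketched as a dictionary of lists. This is a hedged illustration, not the code in resolve_symbol_definitions.py: the function name `build_concordance`, the naive sentence splitter, and the plain substring match are all assumptions.

```python
import re
from collections import defaultdict


def build_concordance(text, variables):
    """Map each variable to the list of sentences in which it appears."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    concordance = defaultdict(list)
    for sentence in sentences:
        for var in variables:
            # Plain substring match: \alpha would also match inside \alpha_s.
            if var in sentence:
                concordance[var].append(sentence)
    return dict(concordance)


if __name__ == "__main__":
    text = (
        r"Let $\alpha$ be the fine structure constant. "
        r"The bound $v_u$ depends on $\alpha$. Nothing here."
    )
    for var, sentences in build_concordance(text, [r"\alpha", "v_u"]).items():
        print(var, sentences)
```

Keeping every sentence per variable means an unresolved symbol can still be inspected by hand, at the cost of a larger dictionary.
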

I have found roughly 50k variables used in HEP, many of which do not share a single definition.

@bhpayne
Member Author

bhpayne commented Nov 24, 2023

Is HEP_TEX.model.xz in the git repo?

@msgoff
Contributor

msgoff commented Nov 24, 2023

The file HEP_TEX.model.xz was removed.
I will update the Python files to use the results from scanner.out for word tokenization.

A first pass at resolving symbol definitions can be found in the utils directory.
Run make variable_definitions in the utils directory, then:
#./variable_definitions.out ../sound1.tex |grep '<:.*?:>' -oP
The results look like:
<:, the fine structure constant $\alpha$:>
<:and the proton-to-electron mass ratio $\frac{m_p}{m_e}$:>
<:the upper bound for the speed of sound in condensed phases, $v_u$:>
<:We find that $\frac{v_u}{c}=\alpha\left(\frac{m_e}{2m_p}\right)^{\frac{1}{2}}$:>
...
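
For post-processing, the `<:...:>` spans shown above can also be pulled out in Python with the same non-greedy pattern the grep command uses. This is a sketch under assumptions: the function name `parse_marked_definitions` is hypothetical, and treating the `$...$` part of each span as the defined symbol is a guess based on the sample output, not documented behavior of variable_definitions.out.

```python
import re

# Same non-greedy pattern as the grep call; '.' does not cross line breaks.
MARKED = re.compile(r"<:(.*?):>")
# Assumption: the symbol being defined is the inline math inside the span.
INLINE_MATH = re.compile(r"\$([^$]+)\$")


def parse_marked_definitions(output):
    """Return (phrase, symbols) pairs for each <:...:> span in the output."""
    results = []
    for phrase in MARKED.findall(output):
        symbols = INLINE_MATH.findall(phrase)
        results.append((phrase.strip(), symbols))
    return results


if __name__ == "__main__":
    sample = "\n".join([
        r"<:, the fine structure constant $\alpha$:>",
        r"<:the upper bound for the speed of sound in condensed phases, $v_u$:>",
    ])
    for phrase, symbols in parse_marked_definitions(sample):
        print(symbols, "<-", phrase)
```
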

The results from the python version can be found in
https://github.com/allofphysicsgraph/latex-in-arxiv/blob/master/symbol_definitions
