Replies: 1 comment 3 replies
-
Hi Brando, If I'm understanding it correctly, by "proving two theorems equivalent", you have two theorems, You can certainly enter this theorem into LeanDojo and apply various tactics to it. However, how do you plan to prove the theorem? In general, it is quite hard to prove if two theorems are equivalent, and I don't think any existing proof automation in Lean (linarith or ReProver) comes even close to that. Also, there is no hammer in Lean as of now. |
Beta Was this translation helpful? Give feedback.
3 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
In general I want to know how to verify if a theorem generated from a LLM is true in Lean Dojo using a prover.
More details:
I want to build a static eval benchmark for my autoformalization projects using LeanDojo and a Prover (e.g., linarith,LeanHammer, ReProver).
For this the AF LLM model will output formal string that will be made into proofs of the form
formal_stmt_mdl_af ===_prover formal_stmt_ground_truth
and I want to be able to check if the two are equivalent using some prover (essentially using a Prover to prove equivalences). Initially anything that makes the code run and is simple is likely good. Basically using a prover to prove equivalences).Given all the engineering efforts in LeanDojo I wasn't 100% sure how to start this. Is the right level of abstraction to think of LeanDojo as an Lean env I can give two formal strings to check if they are equivelent using some powerful prover e.g.,
equiv(f_stmt1, f_stmt2, prover, env=LeanDojo)
? Or does LeanDojo work better by processing things in bulk? Or maybe due to the complexity of LeanDojo and the engineering efforts, there might be another detail/unknown I'm not even aware of.Note, I have seen https://github.com/lean-dojo/LeanDojo/blob/main/scripts/generate-benchmark-lean4.ipynb and read around the docs and repo and realized the project is complex, so wanted to ask before spending to much time on any approach.
Thanks in advance!
Beta Was this translation helpful? Give feedback.
All reactions