[BUG] REPL accepts incorrect proofs #44
Comments
Your two examples are not the same. Did you try attempting to complete the proof using It is expected that calling |
I forgot to paste the full output:
{"cmd": "theorem test (p q : Prop) (hp : p) (hq : q) : p ∧ q ∧ p := by sorry"}
{"sorries":
[{"proofState": 0,
"pos": {"line": 1, "column": 62},
"goal": "p q : Prop\nhp : p\nhq : q\n⊢ p ∧ q ∧ p",
"endPos": {"line": 1, "column": 67}}],
"messages":
[{"severity": "warning",
"pos": {"line": 1, "column": 8},
"endPos": {"line": 1, "column": 12},
"data": "declaration uses 'sorry'"}],
"env": 0}
{"tactic": "apply test", "proofState": 0}
{"proofState": 1,
"goals":
["case hp\np q : Prop\nhp : p\nhq : q\n⊢ p",
"case hq\np q : Prop\nhp : p\nhq : q\n⊢ q"]}
{"tactic": "exact hp", "proofState": 1}
{"proofState": 2, "goals": ["case hq\np q : Prop\nhp : p\nhq : q\n⊢ q"]}
{"tactic": "exact hq", "proofState": 2}
{"proofState": 3, "goals": []}
As you can see, there is no error message here. |
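For comparison, the same three tactic steps can be written as a single Lean declaration. This is only a sketch assembled from the transcript above, not a file taken from the issue. The assumption is that Lean itself rejects it when the declaration is added to the environment (most likely because the self-reference is treated as a recursive call that fails the termination check), which is a step the tactic-mode session never reaches.

```lean
-- Sketch: the tactic steps from the REPL transcript, written as one declaration.
-- Assumption: Lean is expected to reject this when it tries to add `test` to the
-- environment, because the proof refers to `test` itself. The exact error text
-- is not reproduced here.
theorem test (p q : Prop) (hp : p) (hq : q) : p ∧ q ∧ p := by
  apply test
  exact hp
  exact hq
```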
Also |
Is this actually a complete proof though? I'm not super familiar with the REPL API but I would expect some kind of qed-equivalent to complete the proof once all the goals are finished. It may well be that this is a (design) bug in the REPL in that it has no way to express errors that occur when entering the definition into the environment. PS: |
I just followed the example shown in the README (https://github.com/leanprover-community/repl/blob/master/README.md), I may be wrong, but it doesn't look like there is something like qed from the examples. |
So apparently QED is not implemented yet. If it were, this is the step that would give the error. |
Hi, So, unfortunately, in this context, it seems that this makes the
|
The However, it only works if the theorem is initialized in an existing environment, e.g.
{"cmd": "#eval 1 + 1"}
{"cmd": "theorem ex : False := sorry", "env": 0}
{"proofState": 0, "tactic": "exact?"}
|
Interesting, thanks! |
I found another example where the REPL accepts an incorrect proof: amc12a_2002_p12 from minif2f.
This can be reproduced with the following REPL commands:
I've tested on PR #63, but it still fails to catch the invalid proofs. |
Hello, I have encountered the same issue. Have you resolved it now? |
You can use our tool: It is based on the REPL but doesn't accept wrong proofs. It also allows parallel proof execution, handles memory issues that might arise when you use too many tactics in a proof with the REPL, and includes a lot of other fixes. It is written in Python and can run in parallel on Ray clusters. |
Hi @amit9oct, I ran the REPL from https://github.com/amit9oct/repl with Lean 4 version v4.16.0:
{"cmd": "theorem amc12a_2002_p12 (f : ℝ → ℝ) (k : ℝ) (a b : ℕ) (h₀ : ∀ x, f x = x ^ 2 - 63 * x + k)\n (h₁ : f a = 0 ∧ f b = 0) (h₂ : a ≠ b) (h₃ : Nat.Prime a ∧ Nat.Prime b) : k = 122 := by sorry", "env": 0}
{"tactic": "apply (mul_right_inj' (sub_ne_zero.2 ?_)).1", "proofState": 0}
{"tactic": "ring_nf at h₁ h₂ ⊢", "proofState": 1}
but it still accepts wrong proofs. |
No, use the Python interface, not the native REPL, to run your code. See the README at https://github.com/trishullab/itp-interface (we have also tested our code on miniF2F and it works). The key is to not use tactic mode at all. |
Our Python code uses the REPL only for barebones Lean 4 support; we don't use pickling or tactic mode, which might be prone to bugs. |
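The "no tactic mode" idea can be sketched as follows: instead of stepping through `proofState` requests, join the statement and all tactics into one `cmd` request so that Lean checks the whole declaration. This is a minimal sketch, not the code of itp-interface or of the REPL. The helper names `build_full_proof_command` and `proof_accepted` are hypothetical, and the only response fields assumed are the ones visible in the transcripts above (`messages`, `severity`, `sorries`).

```python
import json


def build_full_proof_command(statement, tactics, env=None):
    """Build one REPL `cmd` request holding the statement and every tactic.

    Hypothetical helper: `statement` is the theorem header without `:= by`,
    and `tactics` is the list of tactic strings that would otherwise be sent
    one by one in tactic mode.
    """
    body = "\n".join("  " + tactic for tactic in tactics)
    request = {"cmd": f"{statement} := by\n{body}"}
    if env is not None:
        request["env"] = env
    return request


def proof_accepted(response):
    """Return True only if the response reports no error and no `sorry`.

    Assumes the response shape shown in this thread: an optional `messages`
    list whose items carry a `severity` field, and an optional `sorries` list.
    """
    messages = response.get("messages", [])
    has_error = any(m.get("severity") == "error" for m in messages)
    has_sorry = bool(response.get("sorries"))
    return not has_error and not has_sorry


request = build_full_proof_command(
    "theorem test (p q : Prop) (hp : p) (hq : q) : p ∧ q ∧ p",
    ["apply test", "exact hp", "exact hq"],
)
# Serialized form that would be written to the REPL's standard input.
payload = json.dumps(request, ensure_ascii=False)
```

With this approach a proof only counts as accepted if the full declaration elaborates cleanly, so a self-applying proof like the one in this issue would be judged on whatever Lean reports for the complete declaration rather than on an empty goal list.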
I found a peculiar bug in the REPL: it accepts any proof that applies the theorem itself.
Example:
In the above example, the proof cannot be completed by applying the theorem to itself, yet the REPL does not raise any error message. In the VS Code IDE for Lean 4, however, the proof below produces an error:
Expected Error message: