Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] REPL accepts incorrect proofs #44

Open
amit9oct opened this issue May 29, 2024 · 15 comments
Open

[BUG] REPL accepts incorrect proofs #44

amit9oct opened this issue May 29, 2024 · 15 comments

Comments

@amit9oct
Copy link

I found a peculiar bug in REPL, where it can accept any proof that applies the theorem itself.

Example:

{"cmd": "theorem test (p q : Prop) (hp : p) (hq : q) : p ∧ q ∧ p := by sorry"}

{"sorries":
 [{"proofState": 0,
   "pos": {"line": 1, "column": 62},
   "goal": "p q : Prop\nhp : p\nhq : q\n⊢ p ∧ q ∧ p",
   "endPos": {"line": 1, "column": 67}}],
 "messages":
 [{"severity": "warning",
   "pos": {"line": 1, "column": 8},
   "endPos": {"line": 1, "column": 12},
   "data": "declaration uses 'sorry'"}],
 "env": 0}

{"tactic": "apply test", "proofState": 0}

{"proofState": 1,
 "goals":
 ["case hp\np q : Prop\nhp : p\nhq : q\n⊢ p",
  "case hq\np q : Prop\nhp : p\nhq : q\n⊢ q"]}

In the above example, we cannot complete the proof by applying itself. However, REPL does not raise any error messages. However, on VS Code IDE for Lean 4, we get the error message for the proof below:

theorem test (p q : Prop) (hp : p) (hq : q) : p ∧ q ∧ p := by
    apply test
    exact hp
    exact hq

Expected Error message:

fail to show termination for
  Lean4Proj1.test
with errors
structural recursion cannot be used

Could not find a decreasing measure.
The arguments relate at each recursive call as follows:
(<, ≤, =: relation proved, ? all proofs failed, _: no proof attempted)
          hp hq
1) 5:8-12  _  _
Please use `termination_by` to specify a decreasing measure.
@digama0
Copy link
Member

digama0 commented May 30, 2024

Your two examples are not the same. Did you try attempting to complete the proof using exact hp; exact hq in the REPL?

It is expected that calling apply test does not result in an immediate error. This is a valid proof strategy for proofs by induction, and the termination check only runs once the definition or proof is complete.

@amit9oct
Copy link
Author

amit9oct commented May 30, 2024

I forgot to paste the full output

{"cmd": "theorem test (p q : Prop) (hp : p) (hq : q) : p ∧ q ∧ p := by sorry"}

{"sorries":
 [{"proofState": 0,
   "pos": {"line": 1, "column": 62},
   "goal": "p q : Prop\nhp : p\nhq : q\n⊢ p ∧ q ∧ p",
   "endPos": {"line": 1, "column": 67}}],
 "messages":
 [{"severity": "warning",
   "pos": {"line": 1, "column": 8},
   "endPos": {"line": 1, "column": 12},
   "data": "declaration uses 'sorry'"}],
 "env": 0}

{"tactic": "apply test", "proofState": 0}

{"proofState": 1,
 "goals":
 ["case hp\np q : Prop\nhp : p\nhq : q\n⊢ p",
  "case hq\np q : Prop\nhp : p\nhq : q\n⊢ q"]}

{"tactic": "exact hp", "proofState": 1}

{"proofState": 2, "goals": ["case hq\np q : Prop\nhp : p\nhq : q\n⊢ q"]}

{"tactic": "exact hq", "proofState": 2}

{"proofState": 3, "goals": []}

As you can there is no error message here.
Also NOTE: The VS Code IDE shows the error message as soon as we use apply test tactic.

@amit9oct
Copy link
Author

amit9oct commented May 30, 2024

Also exact hp; exact hq won't work in REPL because it doesn't support tactical (;)

@digama0
Copy link
Member

digama0 commented May 30, 2024

Is this actually a complete proof though? I'm not super familiar with the REPL API but I would expect some kind of qed-equivalent to complete the proof once all the goals are finished. It may well be that this is a (design) bug in the REPL in that it has no way to express errors that occur when entering the definition into the environment.

PS: ; isn't a tactical, it is just a list of tactics. You have to pass them separately with the REPL API since it takes single tactics, or enclose it in parentheses like (exact hp; exact hq).

@amit9oct
Copy link
Author

I just followed the example shown in the README (https://github.com/leanprover-community/repl/blob/master/README.md), I may be wrong, but it doesn't look like there is something like qed from the examples.

@digama0
Copy link
Member

digama0 commented May 30, 2024

At present there is nothing you can do with a completed proof state: we would like to extend this so that you can replace the original sorry with your tactic script, and obtain the resulting Environment.

So apparently QED is not implemented yet. If it was, this is the step that would give the error.

@augustepoiroux
Copy link

augustepoiroux commented Jan 1, 2025

Hi,
Happy new year!

So, unfortunately, in this context, it seems that this makes the exact? tactic slightly dangerous to use as it will "prove" anything:

{"cmd": "theorem ex : False := sorry"}

{"sorries":
 [{"proofState": 0,
   "pos": {"line": 1, "column": 22},
   "goal": "⊢ False",
   "endPos": {"line": 1, "column": 27}}],
 "messages":
 [{"severity": "warning",
   "pos": {"line": 1, "column": 8},
   "endPos": {"line": 1, "column": 10},
   "data": "declaration uses 'sorry'"}],
 "env": 0}

{"proofState": 0, "tactic": "exact?"}

{"proofState": 1,
 "messages":
 [{"severity": "info",
   "pos": {"line": 0, "column": 0},
   "endPos": {"line": 0, "column": 0},
   "data": "Try this: exact _root_.ex"}],
 "goals": []}

exact? is referring to the theorem itself to prove it as suggested by "Try this: exact _root_.ex".

@llllvvuu
Copy link
Contributor

llllvvuu commented Jan 8, 2025

The exact? case is handled here:

6651502

However, it only works if the theorem is initialized in an existing environment, e.g.

{"cmd": "#eval 1 + 1"}

{"cmd": "theorem ex : False := sorry", "env": 0}

{"proofState": 0, "tactic": "exact?"}

exact ex still works here. I guess for recursion it resolves from somewhere else than ctx.env (resolveLocalName it seems?)

The question is, why does the InfoTree from elaborating the definition have as its root a ContextInfo with an Environment that includes in constants the definition itself, if it is not necessary for recursion and messes with exact?? It becomes an issue when the definition creates e.g. foo and foo.match_1 and you want only foo.match_1 to be in the env of the sorry. As it is currently, 6651502 ensures that neither foo nor foo.match_1 are in the env. #66

@augustepoiroux
Copy link

Interesting, thanks!
Having the definition in constants is probably useful/necessary for the pickle feature.

@RexWzh
Copy link

RexWzh commented Feb 20, 2025

I found another example where the REPL accepts an incorrect proof:

amc12a_2002_p12 from minif2f.

import Mathlib
theorem amc12a_2002_p12 (f : ℝ → ℝ) (k : ℝ) (a b : ℕ) (h₀ : ∀ x, f x = x ^ 2 - 63 * x + k)
    (h₁ : f a = 0 ∧ f b = 0) (h₂ : a ≠ b) (h₃ : Nat.Prime a ∧ Nat.Prime b) : k = 122 := by
  apply (mul_right_inj' (sub_ne_zero.2 ?_)).1
  ring_nf at h₁ h₂ ⊢

This can be reproduced with the following REPL commands:

{"cmd": "import Mathlib"}

{"cmd": "theorem amc12a_2002_p12 (f : ℝ → ℝ) (k : ℝ) (a b : ℕ) (h₀ : ∀ x, f x = x ^ 2 - 63 * x + k)\n    (h₁ : f a = 0 ∧ f b = 0) (h₂ : a ≠ b) (h₃ : Nat.Prime a ∧ Nat.Prime b) : k = 122 := by sorry", "env": 0}

{"tactic": "apply (mul_right_inj' (sub_ne_zero.2 ?_)).1", "proofState": 0}

{"tactic": "ring_nf at h₁ h₂ ⊢", "proofState": 1}

I've tested on PR #63, but it still fails to catch the invalid proofs.

@wuseguang
Copy link

I found another example where the REPL accepts an incorrect proof:

amc12a_2002_p12 from minif2f.

import Mathlib
theorem amc12a_2002_p12 (f : ℝ → ℝ) (k : ℝ) (a b : ℕ) (h₀ : ∀ x, f x = x ^ 2 - 63 * x + k)
    (h₁ : f a = 0 ∧ f b = 0) (h₂ : a ≠ b) (h₃ : Nat.Prime a ∧ Nat.Prime b) : k = 122 := by
  apply (mul_right_inj' (sub_ne_zero.2 ?_)).1
  ring_nf at h₁ h₂ ⊢

This can be reproduced with the following REPL commands:

{"cmd": "import Mathlib"}

{"cmd": "theorem amc12a_2002_p12 (f : ℝ → ℝ) (k : ℝ) (a b : ℕ) (h₀ : ∀ x, f x = x ^ 2 - 63 * x + k)\n    (h₁ : f a = 0 ∧ f b = 0) (h₂ : a ≠ b) (h₃ : Nat.Prime a ∧ Nat.Prime b) : k = 122 := by sorry", "env": 0}

{"tactic": "apply (mul_right_inj' (sub_ne_zero.2 ?_)).1", "proofState": 0}

{"tactic": "ring_nf at h₁ h₂ ⊢", "proofState": 1}

I've tested on PR #63, but it still fails to catch the invalid proofs.

Hello, I have encountered the same issue. Have you resolved it now?

@amit9oct
Copy link
Author

amit9oct commented Mar 3, 2025

You can use our tool:
https://github.com/trishullab/itp-interface

It is based on REPL but doesn't accept wrong proofs. It also allows parallel proof execution, handling memory issues which might arise when you use too many tactics in your proof while using REPL, and a lot of other fixes. It is in Python and can run parallely on ray clusters.

@wuseguang
Copy link

You can use our tool: https://github.com/trishullab/itp-interface

It is based on REPL but doesn't accept wrong proofs. It also allows parallel proof execution, handling memory issues which might arise when you use too many tactics in your proof while using REPL, and a lot of other fixes. It is in Python and can run parallely on ray clusters.

Hi, @amit9oct , I run the repl at https://github.com/amit9oct/repl with lean4 version v4.16.0.
The commads is:
{"cmd": "import Mathlib"}

{"cmd": "theorem amc12a_2002_p12 (f : ℝ → ℝ) (k : ℝ) (a b : ℕ) (h₀ : ∀ x, f x = x ^ 2 - 63 * x + k)\n (h₁ : f a = 0 ∧ f b = 0) (h₂ : a ≠ b) (h₃ : Nat.Prime a ∧ Nat.Prime b) : k = 122 := by sorry", "env": 0}

{"tactic": "apply (mul_right_inj' (sub_ne_zero.2 ?_)).1", "proofState": 0}

{"tactic": "ring_nf at h₁ h₂ ⊢", "proofState": 1}

but it still accept wrong proofs.

@amit9oct
Copy link
Author

amit9oct commented Mar 3, 2025

No use the python interface and not the native REPL to run your code. See the README https://github.com/trishullab/itp-interface (we have also tested our code on miniF2F and it works). The key is to not use tactic mode at all

@amit9oct
Copy link
Author

amit9oct commented Mar 3, 2025

Our python code uses REPL but only for barebones Lean 4 support, we don't use pickling or tactic mode which might be prone to bugs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants