[BUG] REPL accepts incorrect proofs #44

amit9oct · 2024-05-29T19:33:31Z

I found a peculiar bug in REPL, where it can accept any proof that applies the theorem itself.

Example:

{"cmd": "theorem test (p q : Prop) (hp : p) (hq : q) : p ∧ q ∧ p := by sorry"}

{"sorries":
 [{"proofState": 0,
   "pos": {"line": 1, "column": 62},
   "goal": "p q : Prop\nhp : p\nhq : q\n⊢ p ∧ q ∧ p",
   "endPos": {"line": 1, "column": 67}}],
 "messages":
 [{"severity": "warning",
   "pos": {"line": 1, "column": 8},
   "endPos": {"line": 1, "column": 12},
   "data": "declaration uses 'sorry'"}],
 "env": 0}

{"tactic": "apply test", "proofState": 0}

{"proofState": 1,
 "goals":
 ["case hp\np q : Prop\nhp : p\nhq : q\n⊢ p",
  "case hq\np q : Prop\nhp : p\nhq : q\n⊢ q"]}

In the above example, we cannot complete the proof by applying itself. However, REPL does not raise any error messages. However, on VS Code IDE for Lean 4, we get the error message for the proof below:

theorem test (p q : Prop) (hp : p) (hq : q) : p ∧ q ∧ p := by
    apply test
    exact hp
    exact hq

Expected Error message:

fail to show termination for
  Lean4Proj1.test
with errors
structural recursion cannot be used

Could not find a decreasing measure.
The arguments relate at each recursive call as follows:
(<, ≤, =: relation proved, ? all proofs failed, _: no proof attempted)
          hp hq
1) 5:8-12  _  _
Please use `termination_by` to specify a decreasing measure.

The text was updated successfully, but these errors were encountered:

digama0 · 2024-05-30T05:19:09Z

Your two examples are not the same. Did you try attempting to complete the proof using exact hp; exact hq in the REPL?

It is expected that calling apply test does not result in an immediate error. This is a valid proof strategy for proofs by induction, and the termination check only runs once the definition or proof is complete.

amit9oct · 2024-05-30T05:53:09Z

I forgot to paste the full output

{"cmd": "theorem test (p q : Prop) (hp : p) (hq : q) : p ∧ q ∧ p := by sorry"}

{"sorries":
 [{"proofState": 0,
   "pos": {"line": 1, "column": 62},
   "goal": "p q : Prop\nhp : p\nhq : q\n⊢ p ∧ q ∧ p",
   "endPos": {"line": 1, "column": 67}}],
 "messages":
 [{"severity": "warning",
   "pos": {"line": 1, "column": 8},
   "endPos": {"line": 1, "column": 12},
   "data": "declaration uses 'sorry'"}],
 "env": 0}

{"tactic": "apply test", "proofState": 0}

{"proofState": 1,
 "goals":
 ["case hp\np q : Prop\nhp : p\nhq : q\n⊢ p",
  "case hq\np q : Prop\nhp : p\nhq : q\n⊢ q"]}

{"tactic": "exact hp", "proofState": 1}

{"proofState": 2, "goals": ["case hq\np q : Prop\nhp : p\nhq : q\n⊢ q"]}

{"tactic": "exact hq", "proofState": 2}

{"proofState": 3, "goals": []}

As you can there is no error message here.
Also NOTE: The VS Code IDE shows the error message as soon as we use apply test tactic.

amit9oct · 2024-05-30T05:54:22Z

Also exact hp; exact hq won't work in REPL because it doesn't support tactical (;)

digama0 · 2024-05-30T05:58:18Z

Is this actually a complete proof though? I'm not super familiar with the REPL API but I would expect some kind of qed-equivalent to complete the proof once all the goals are finished. It may well be that this is a (design) bug in the REPL in that it has no way to express errors that occur when entering the definition into the environment.

PS: ; isn't a tactical, it is just a list of tactics. You have to pass them separately with the REPL API since it takes single tactics, or enclose it in parentheses like (exact hp; exact hq).

amit9oct · 2024-05-30T06:11:05Z

I just followed the example shown in the README (https://github.com/leanprover-community/repl/blob/master/README.md), I may be wrong, but it doesn't look like there is something like qed from the examples.

digama0 · 2024-05-30T06:21:05Z

At present there is nothing you can do with a completed proof state: we would like to extend this so that you can replace the original sorry with your tactic script, and obtain the resulting Environment.

So apparently QED is not implemented yet. If it was, this is the step that would give the error.

augustepoiroux · 2025-01-01T13:47:43Z

Hi,
Happy new year!

So, unfortunately, in this context, it seems that this makes the exact? tactic slightly dangerous to use as it will "prove" anything:

{"cmd": "theorem ex : False := sorry"}

{"sorries":
 [{"proofState": 0,
   "pos": {"line": 1, "column": 22},
   "goal": "⊢ False",
   "endPos": {"line": 1, "column": 27}}],
 "messages":
 [{"severity": "warning",
   "pos": {"line": 1, "column": 8},
   "endPos": {"line": 1, "column": 10},
   "data": "declaration uses 'sorry'"}],
 "env": 0}

{"proofState": 0, "tactic": "exact?"}

{"proofState": 1,
 "messages":
 [{"severity": "info",
   "pos": {"line": 0, "column": 0},
   "endPos": {"line": 0, "column": 0},
   "data": "Try this: exact _root_.ex"}],
 "goals": []}

exact? is referring to the theorem itself to prove it as suggested by "Try this: exact _root_.ex".

llllvvuu · 2025-01-08T07:45:57Z

The exact? case is handled here:

6651502

However, it only works if the theorem is initialized in an existing environment, e.g.

{"cmd": "#eval 1 + 1"}

{"cmd": "theorem ex : False := sorry", "env": 0}

{"proofState": 0, "tactic": "exact?"}

exact ex still works here. I guess for recursion it resolves from somewhere else than ctx.env (resolveLocalName it seems?)

The question is, why does the InfoTree from elaborating the definition have as its root a ContextInfo with an Environment that includes in constants the definition itself, if it is not necessary for recursion and messes with exact?? It becomes an issue when the definition creates e.g. foo and foo.match_1 and you want only foo.match_1 to be in the env of the sorry. As it is currently, 6651502 ensures that neither foo nor foo.match_1 are in the env. #66

augustepoiroux · 2025-01-08T09:12:53Z

Interesting, thanks!
Having the definition in constants is probably useful/necessary for the pickle feature.

RexWzh · 2025-02-20T08:48:25Z

I found another example where the REPL accepts an incorrect proof:

amc12a_2002_p12 from minif2f.

import Mathlib
theorem amc12a_2002_p12 (f : ℝ → ℝ) (k : ℝ) (a b : ℕ) (h₀ : ∀ x, f x = x ^ 2 - 63 * x + k)
    (h₁ : f a = 0 ∧ f b = 0) (h₂ : a ≠ b) (h₃ : Nat.Prime a ∧ Nat.Prime b) : k = 122 := by
  apply (mul_right_inj' (sub_ne_zero.2 ?_)).1
  ring_nf at h₁ h₂ ⊢

This can be reproduced with the following REPL commands:

{"cmd": "import Mathlib"}

{"cmd": "theorem amc12a_2002_p12 (f : ℝ → ℝ) (k : ℝ) (a b : ℕ) (h₀ : ∀ x, f x = x ^ 2 - 63 * x + k)\n    (h₁ : f a = 0 ∧ f b = 0) (h₂ : a ≠ b) (h₃ : Nat.Prime a ∧ Nat.Prime b) : k = 122 := by sorry", "env": 0}

{"tactic": "apply (mul_right_inj' (sub_ne_zero.2 ?_)).1", "proofState": 0}

{"tactic": "ring_nf at h₁ h₂ ⊢", "proofState": 1}

I've tested on PR #63, but it still fails to catch the invalid proofs.

wuseguang · 2025-03-03T06:35:14Z

I found another example where the REPL accepts an incorrect proof:

amc12a_2002_p12 from minif2f.

import Mathlib
theorem amc12a_2002_p12 (f : ℝ → ℝ) (k : ℝ) (a b : ℕ) (h₀ : ∀ x, f x = x ^ 2 - 63 * x + k)
    (h₁ : f a = 0 ∧ f b = 0) (h₂ : a ≠ b) (h₃ : Nat.Prime a ∧ Nat.Prime b) : k = 122 := by
  apply (mul_right_inj' (sub_ne_zero.2 ?_)).1
  ring_nf at h₁ h₂ ⊢

This can be reproduced with the following REPL commands:

{"cmd": "import Mathlib"}

{"cmd": "theorem amc12a_2002_p12 (f : ℝ → ℝ) (k : ℝ) (a b : ℕ) (h₀ : ∀ x, f x = x ^ 2 - 63 * x + k)\n    (h₁ : f a = 0 ∧ f b = 0) (h₂ : a ≠ b) (h₃ : Nat.Prime a ∧ Nat.Prime b) : k = 122 := by sorry", "env": 0}

{"tactic": "apply (mul_right_inj' (sub_ne_zero.2 ?_)).1", "proofState": 0}

{"tactic": "ring_nf at h₁ h₂ ⊢", "proofState": 1}

I've tested on PR #63, but it still fails to catch the invalid proofs.

Hello, I have encountered the same issue. Have you resolved it now?

amit9oct · 2025-03-03T06:59:32Z

You can use our tool:
https://github.com/trishullab/itp-interface

It is based on REPL but doesn't accept wrong proofs. It also allows parallel proof execution, handling memory issues which might arise when you use too many tactics in your proof while using REPL, and a lot of other fixes. It is in Python and can run parallely on ray clusters.

wuseguang · 2025-03-03T08:03:44Z

You can use our tool: https://github.com/trishullab/itp-interface

It is based on REPL but doesn't accept wrong proofs. It also allows parallel proof execution, handling memory issues which might arise when you use too many tactics in your proof while using REPL, and a lot of other fixes. It is in Python and can run parallely on ray clusters.

Hi, @amit9oct , I run the repl at https://github.com/amit9oct/repl with lean4 version v4.16.0.
The commads is:
{"cmd": "import Mathlib"}

{"cmd": "theorem amc12a_2002_p12 (f : ℝ → ℝ) (k : ℝ) (a b : ℕ) (h₀ : ∀ x, f x = x ^ 2 - 63 * x + k)\n (h₁ : f a = 0 ∧ f b = 0) (h₂ : a ≠ b) (h₃ : Nat.Prime a ∧ Nat.Prime b) : k = 122 := by sorry", "env": 0}

{"tactic": "apply (mul_right_inj' (sub_ne_zero.2 ?_)).1", "proofState": 0}

{"tactic": "ring_nf at h₁ h₂ ⊢", "proofState": 1}

but it still accept wrong proofs.

amit9oct · 2025-03-03T08:10:55Z

No use the python interface and not the native REPL to run your code. See the README https://github.com/trishullab/itp-interface (we have also tested our code on miniF2F and it works). The key is to not use tactic mode at all

amit9oct · 2025-03-03T08:13:40Z

Our python code uses REPL but only for barebones Lean 4 support, we don't use pickling or tactic mode which might be prone to bugs

augustepoiroux mentioned this issue Jan 3, 2025

Add proof validation to tactic mode #63

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] REPL accepts incorrect proofs #44

[BUG] REPL accepts incorrect proofs #44

amit9oct commented May 29, 2024

digama0 commented May 30, 2024 •

edited

Loading

amit9oct commented May 30, 2024 •

edited

Loading

amit9oct commented May 30, 2024 •

edited

Loading

digama0 commented May 30, 2024 •

edited

Loading

amit9oct commented May 30, 2024

digama0 commented May 30, 2024

augustepoiroux commented Jan 1, 2025 •

edited

Loading

llllvvuu commented Jan 8, 2025 •

edited

Loading

augustepoiroux commented Jan 8, 2025

RexWzh commented Feb 20, 2025

wuseguang commented Mar 3, 2025

amit9oct commented Mar 3, 2025 •

edited

Loading

wuseguang commented Mar 3, 2025

amit9oct commented Mar 3, 2025 •

edited

Loading

amit9oct commented Mar 3, 2025

[BUG] REPL accepts incorrect proofs #44

[BUG] REPL accepts incorrect proofs #44

Comments

amit9oct commented May 29, 2024

digama0 commented May 30, 2024 • edited Loading

amit9oct commented May 30, 2024 • edited Loading

amit9oct commented May 30, 2024 • edited Loading

digama0 commented May 30, 2024 • edited Loading

amit9oct commented May 30, 2024

digama0 commented May 30, 2024

augustepoiroux commented Jan 1, 2025 • edited Loading

llllvvuu commented Jan 8, 2025 • edited Loading

augustepoiroux commented Jan 8, 2025

RexWzh commented Feb 20, 2025

wuseguang commented Mar 3, 2025

amit9oct commented Mar 3, 2025 • edited Loading

wuseguang commented Mar 3, 2025

amit9oct commented Mar 3, 2025 • edited Loading

amit9oct commented Mar 3, 2025

digama0 commented May 30, 2024 •

edited

Loading

amit9oct commented May 30, 2024 •

edited

Loading

amit9oct commented May 30, 2024 •

edited

Loading

digama0 commented May 30, 2024 •

edited

Loading

augustepoiroux commented Jan 1, 2025 •

edited

Loading

llllvvuu commented Jan 8, 2025 •

edited

Loading

amit9oct commented Mar 3, 2025 •

edited

Loading

amit9oct commented Mar 3, 2025 •

edited

Loading