TIP-907: Track overflow in melvm #63

jaybutera · 2022-01-19T21:08:58Z


TIP	907
Category	9xx (consensus-breaking)
Author	Jay Butera

Summary

Proposing the addition of a sticky overflow flag to melvm, and Oflo/Cloflo opcodes which push the flag's value to the stack and clears the flag to false respectively. This will allow programs to check if an arithmetic operation overflowed.

Motivation

Right now overflow on arithmetic is not explicitly reported in melvm. While it can be determined in some cases by the remaining value of an arithmetic expression, an overflow flow flag can help in the case that a sequence of arithmetic is done concluding with a final check.

Proposed changes

Add an overflow boolean to melvm's execution state. When an arithmetic operation (+,-,*,^) overflows it sets the overflow flag to true. When an op does not overflow it sets the flag to false. So in (x+y)+z the first addition might overflow but if the second does not then the overflow flag will be false after this expression. Therefore a user should check overflow directly after an operation where needed.

The Oflo opcode simply pushes the value in the overflow flag to the stack.

** edit: considering changes discussed below

The text was updated successfully, but these errors were encountered:

nullchinchilla · 2022-01-20T16:26:18Z

This is certainly an important functionality that MelVM is curently missing.

However, an interesting, non-trivial question is how higher-level languages (Mil and Melodeon, in particular) can expose this functionality.

Strategy 1: directly expose an "intrinsic" function

The easiest way would be just to have a special expression that compiles to the Oflo opcode. For example, this can be exposed as a built-in function in Melodeon:

def __overflowed() : Bool

We could use it in contexts like this:

def saturating_add(x: Nat, y: Nat) : Nat =
   let res = x + y in
   if __overflowed() then
      ~0 # bitwise not of zero = 2^256-1
   else
      res

The problem here is that the presence of __overflowed() in the language causes every arithmetic expression to have an invisible global side-effect --- it affects whether __overflowed() returns true or false.

This is especially bad because Melodeon is otherwise a purely functional language. Being purely functional means that the compiler (in this case, really the Mil compiler) is free to do many sorts of rearrangements. For example, in general

let x = f() in
let y = g() in
some_expr

and

let y = g() in
let x = f() in
some_expr

are equivalent. Strictly speaking, the language doesn't even have an evaluation order that says that f() must be evaluated before g() --- for all we know everything could be lazy-evaluated, and we cannot observe the difference.

Except if __overflowed() is involved anywhere! If g() calls some other function, which calls some other function, which may or may not call __overflowed(), and if f() does any math, then suddenly the order matters a lot. Otherwise, g() may not observe the overflows in f().

Thus, both the Melodeon and the Mil compiler must globally keep track of any function that calls __overflowed either directly or indirectly, and handle them specially when doing transformations or optimizations. In fact, this already impacts an optimization we do in the Mil compiler: eliminating let bindings when they are used only once. That is, expression like

(let (x some-expr)
  (... x ....))

where x is only used once in the body, are currently rewritten to

(... some-expr ...)

to save on loads and stores. But an overflow check breaks this:

(let (x (+ foo bar))
  (if (__overflow) x 0))

=>

;; oof
(if (__overflow) (+ foo bar) 0)

This strategy thus will impose significant complications on designing compiler optimizations and internal representations.

Strategy 2: "Checked" blocks

In this strategy, the TIP will be modified to make the overflow bit "sticky", and a Cloflo opcode can clear this bit. Mil then exposes a construct for checking whether overflow occurred within an arbitrarily complex expression.

In particular, Mil exposes an expression (checked <arith-expr> <fail-case>), which is defined to

Evaluate <arith-expr>, which is an arbitrary expression
If overflow occurred, return <fail-case> (this is implemented by checking then clearing the overflow bit)
Otherwise, return <arith-expr>

This can directly be exposed in Melodeon safely:

def saturating_add(x: Nat, y: Nat) =
    checked
        x + y
    catch ~0

The implementation overhead of this is much lower than strategy 1. It also makes it possible to catch overflow after a series of arithmetic operations, to save on constant checking in the middle.

The costs:

TIP needs to be modified to make the bit sticky and clearable.
Clearing the bit over and over is an overhead in the case where we do want to check after every operation. We could just have Oflo clear the bit for us though.
This still involves a bit of "action at a distance" that doesn't feel very functional. In particular, consider something like
```
checked
  some_complicated_expression
catch "hello"
```
, where stuff deep in some_complicated_expression "unknowingly" would affect the behavior of the top-level checked expression. Whether this counts as a side effect depends on your definition of a side effect...

Strategy 3: checked arithmetic in Mil

Instead of a general checked-block solution, we can introduce simply operations like (chk+ x y fail-case), which adds x and y, and if it overflows, returns fail-case instead of x+y. These can simply be called through extern in Melodeon.

The advantage of this over 2 is that we do not need to change the TIP, and everything is quite "functional" and nice. The disadvantage is a proliferation of Mil forms (chk+, chk-, etc), as well as forcing a check after every arithmetic operation. In particular, the check can be quite expensive, as it at least involves a Oflo opcode, a branch, and whatever code is in fail-case, which is at least 3 times more expensive than an unchecked arithmetic operation.

jaybutera · 2022-01-20T18:37:39Z

I chose an auto overflow flip originally for ease of use but am actually a fan of the performance benefits of a sticky flag. We would need the second opcode cloflo to clear the flag in the case the a program simply wants to ignore whatever value is in there. The oflo opcode can also reset the flag since it seems unlikely that a program would need to read the same value twice.

I agree the functional semantics don't fit so well with the stack machine model of melvm. But I don't see much of an efficient alternative.

jaybutera mentioned this issue Jan 19, 2022

Feature/overflow (TIP-907) mel-project/melstf#11

Closed

nullchinchilla added the TIP Themelio Improvement Proposal label Jan 20, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TIP-907: Track overflow in melvm #63

TIP-907: Track overflow in melvm #63

jaybutera commented Jan 19, 2022 •

edited

Loading

nullchinchilla commented Jan 20, 2022 •

edited

Loading

jaybutera commented Jan 20, 2022

TIP-907: Track overflow in melvm #63

TIP-907: Track overflow in melvm #63

Comments

jaybutera commented Jan 19, 2022 • edited Loading

Summary

Motivation

Proposed changes

nullchinchilla commented Jan 20, 2022 • edited Loading

Strategy 1: directly expose an "intrinsic" function

Strategy 2: "Checked" blocks

Strategy 3: checked arithmetic in Mil

jaybutera commented Jan 20, 2022

jaybutera commented Jan 19, 2022 •

edited

Loading

nullchinchilla commented Jan 20, 2022 •

edited

Loading