Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TIP-907: Track overflow in melvm #63

Open
jaybutera opened this issue Jan 19, 2022 · 2 comments
Open

TIP-907: Track overflow in melvm #63

jaybutera opened this issue Jan 19, 2022 · 2 comments
Labels
TIP Themelio Improvement Proposal

Comments

@jaybutera
Copy link
Collaborator

jaybutera commented Jan 19, 2022

TIP 907
Category 9xx (consensus-breaking)
Author Jay Butera

Summary

Proposing the addition of a sticky overflow flag to melvm, and Oflo/Cloflo opcodes which push the flag's value to the stack and clears the flag to false respectively. This will allow programs to check if an arithmetic operation overflowed.

Motivation

Right now overflow on arithmetic is not explicitly reported in melvm. While it can be determined in some cases by the remaining value of an arithmetic expression, an overflow flow flag can help in the case that a sequence of arithmetic is done concluding with a final check.

Proposed changes

Add an overflow boolean to melvm's execution state. When an arithmetic operation (+,-,*,^) overflows it sets the overflow flag to true. When an op does not overflow it sets the flag to false. So in (x+y)+z the first addition might overflow but if the second does not then the overflow flag will be false after this expression. Therefore a user should check overflow directly after an operation where needed.

The Oflo opcode simply pushes the value in the overflow flag to the stack.

** edit: considering changes discussed below

@nullchinchilla nullchinchilla added the TIP Themelio Improvement Proposal label Jan 20, 2022
@nullchinchilla
Copy link
Member

nullchinchilla commented Jan 20, 2022

This is certainly an important functionality that MelVM is curently missing.

However, an interesting, non-trivial question is how higher-level languages (Mil and Melodeon, in particular) can expose this functionality.

Strategy 1: directly expose an "intrinsic" function

The easiest way would be just to have a special expression that compiles to the Oflo opcode. For example, this can be exposed as a built-in function in Melodeon:

def __overflowed() : Bool

We could use it in contexts like this:

def saturating_add(x: Nat, y: Nat) : Nat =
   let res = x + y in
   if __overflowed() then
      ~0 # bitwise not of zero = 2^256-1
   else
      res

The problem here is that the presence of __overflowed() in the language causes every arithmetic expression to have an invisible global side-effect --- it affects whether __overflowed() returns true or false.

This is especially bad because Melodeon is otherwise a purely functional language. Being purely functional means that the compiler (in this case, really the Mil compiler) is free to do many sorts of rearrangements. For example, in general

let x = f() in
let y = g() in
some_expr

and

let y = g() in
let x = f() in
some_expr

are equivalent. Strictly speaking, the language doesn't even have an evaluation order that says that f() must be evaluated before g() --- for all we know everything could be lazy-evaluated, and we cannot observe the difference.

Except if __overflowed() is involved anywhere! If g() calls some other function, which calls some other function, which may or may not call __overflowed(), and if f() does any math, then suddenly the order matters a lot. Otherwise, g() may not observe the overflows in f().

Thus, both the Melodeon and the Mil compiler must globally keep track of any function that calls __overflowed either directly or indirectly, and handle them specially when doing transformations or optimizations. In fact, this already impacts an optimization we do in the Mil compiler: eliminating let bindings when they are used only once. That is, expression like

(let (x some-expr)
  (... x ....))

where x is only used once in the body, are currently rewritten to

(... some-expr ...)

to save on loads and stores. But an overflow check breaks this:

(let (x (+ foo bar))
  (if (__overflow) x 0))

=>

;; oof
(if (__overflow) (+ foo bar) 0)

This strategy thus will impose significant complications on designing compiler optimizations and internal representations.

Strategy 2: "Checked" blocks

In this strategy, the TIP will be modified to make the overflow bit "sticky", and a Cloflo opcode can clear this bit. Mil then exposes a construct for checking whether overflow occurred within an arbitrarily complex expression.

In particular, Mil exposes an expression (checked <arith-expr> <fail-case>), which is defined to

  • Evaluate <arith-expr>, which is an arbitrary expression
  • If overflow occurred, return <fail-case> (this is implemented by checking then clearing the overflow bit)
  • Otherwise, return <arith-expr>

This can directly be exposed in Melodeon safely:

def saturating_add(x: Nat, y: Nat) =
    checked
        x + y
    catch ~0

The implementation overhead of this is much lower than strategy 1. It also makes it possible to catch overflow after a series of arithmetic operations, to save on constant checking in the middle.

The costs:

  • TIP needs to be modified to make the bit sticky and clearable.
  • Clearing the bit over and over is an overhead in the case where we do want to check after every operation. We could just have Oflo clear the bit for us though.
  • This still involves a bit of "action at a distance" that doesn't feel very functional. In particular, consider something like
    checked
      some_complicated_expression
    catch "hello"
    , where stuff deep in some_complicated_expression "unknowingly" would affect the behavior of the top-level checked expression. Whether this counts as a side effect depends on your definition of a side effect...

Strategy 3: checked arithmetic in Mil

Instead of a general checked-block solution, we can introduce simply operations like (chk+ x y fail-case), which adds x and y, and if it overflows, returns fail-case instead of x+y. These can simply be called through extern in Melodeon.

The advantage of this over 2 is that we do not need to change the TIP, and everything is quite "functional" and nice. The disadvantage is a proliferation of Mil forms (chk+, chk-, etc), as well as forcing a check after every arithmetic operation. In particular, the check can be quite expensive, as it at least involves a Oflo opcode, a branch, and whatever code is in fail-case, which is at least 3 times more expensive than an unchecked arithmetic operation.

@jaybutera
Copy link
Collaborator Author

I chose an auto overflow flip originally for ease of use but am actually a fan of the performance benefits of a sticky flag. We would need the second opcode cloflo to clear the flag in the case the a program simply wants to ignore whatever value is in there. The oflo opcode can also reset the flag since it seems unlikely that a program would need to read the same value twice.

I agree the functional semantics don't fit so well with the stack machine model of melvm. But I don't see much of an efficient alternative.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
TIP Themelio Improvement Proposal
Projects
None yet
Development

No branches or pull requests

2 participants