Refactor generation of proof hint "rule applied" event #862

Baltoli · 2023-10-17T08:31:20Z

This PR packages up @gtrepta's initial work towards part (1) of #860; a refactoring of the emission of rewrite / rule match events in the proof trace.

At a high level, the code changes here cause the interpreter to emit "rule matched" events for all rules, not just top-level rewrite rules; in the trace below we can see function rules emitting these events. Currently, the term produced by a function event is not yet emitted into the trace, but it could be recovered by evaluating the function on its bound arguments, which are in the trace.

To evaluate the current state of the trace format, consider the following definition:

module TEST
  imports INT

  syntax Int ::= "inc" Int [function, total]
  rule inc I => I +Int 1

  syntax Foo ::= foo(Int) | bar(Int)
  rule foo(I) => bar(inc I)
endmodule

When run with --proof-output, the sequence of events that is generated in the trace is as follows (with lengthy KORE terms omitted here for my own sanity when extracting this information):

function: _|->_
function: .Map
function: _Map_
function: initGeneratedTopCell
rewrite :
  ordinal: 162 (rule initGeneratedTopCell(Init) => ...)
  VarInit -> ...
function: initKCell
rewrite:
  ordinal: 163 (rule initKCell(Init) => ...)
  VarInit -> ...
function: Map:lookup
function: project:KItem
rewrite:
  ordinal: 207 (rule project:KItem(K) => ...)
  VarK -> ...
function: initGeneratedCounterCell
rewrite:
  ordinal: 161 (rule initGeneratedCounterCell() => ...)
<initial configuration>
rewrite:
  ordinal: 126 (rule foo(I) => ...)
  _DotVar0 -> ...
  _DotVar1 -> ...
  I -> ...
function: inc__TEST_Int_Int
rewrite:
  ordinal: 160 (rule inc__TEST_Int_Int(I) => ...)
  I -> ...
function: _+Int_
<next configuration>

The things lacking from this trace are currently:

Return values from hooked functions (Hook event for proof hints #873, New proof hint features #860)
Side conditions (New proof hint features #860)

Additionally, the parser will need to be updated to reflect this new format as previously.

I would also like to clean up the code a little now that I better understand the trace format and the events being emitted.

gtrepta · 2023-11-02T16:29:56Z

lib/codegen/Decision.cpp

+  auto Call = llvm::CallInst::Create(applyRule, args, "", d->CurrentBlock);
+
+  // How do you emit the resulting config from here?
+
  setDebugLoc(Call);
  Call->setCallingConv(llvm::CallingConv::Tail);


This still needs to be implemented. The call instruction is meant to be tail recursive, I tried changing that but from what I remember it wasn't that simple.

Once this is implemented, the call that's left in CreateTerm to emit the config on rewrites can be removed, right? i.e. that it's there and not here is why we have configs and not function returns in the trace.

I believe so, yes.

As discussed previously, this PR is an initial cleanup and refactoring effort for the existing proof-hint generation code implemented for rewrite, function and hook events. The changes in this PR don't add any new features to the traces; they just reorganise the existing code. Most of the changes made are documentation and identifying duplicated code across different call-sites that can be merged together. I plan to port the reorganisation in #862 over to the infrastructure in this PR once it is merged. The trace format is not tested in the backend currently (this is future work), but I have verified that a proof trace (that uses all event types) generated using this branch is byte-for-byte identical to one generated from the current master branch.

@gtrepta

This PR is a reworking of the changes initially made by @gtrepta in #862, to include the cleaned-up proof infrastructure in #876. The changes here are pretty minimal; we need to split the "rewrite" event into a pre and post part so that the arity / ordinal and resulting configuration can be emitted separately. Other than that, it's just a question of reusing some of the boilerplate in #862 that assembles the substitution information in the event. ~~There is one major issue with the change here, which @gtrepta also identified in his initial version of this change. We can only emit resulting configurations for top-level rewrites (that is, we cannot print the resulting _term_ from calling a function). Doing so breaks TCO for the generated interpreter badly enough that some backend tests crash; I have not dug into this in detail as I'm reasonably sure that's the reason things break.~~ * @nishantjr confirms that we can safely omit the result of function calls from the trace. ## Outstanding Issues I am working my way through the example below to ensure that the format being generated is precisely what we expect to see from the documentation. The issues I have identified are as follows: * [x] Duplicated KORE comments below are caused by function events not being bracketed as rewrites and hooks are; we need to pull the header out to the front, allocate args, then do the existing event. * [x] We need to handle binary KORE delimitation more consistently; at the moment it's wavy whether we emit header / footer or not. Should probably standardise on using the 1.2.0 self-limiting format. ## Worked Example Similarly to #862, I have generated a trace for the program `foo(0)` in the following definition: ```k module TEST imports INT syntax Int ::= "inc" Int [function, total] rule inc I => I +Int 1 syntax Foo ::= foo(Int) | bar(Int) rule foo(I) => bar(inc I) endmodule ``` The trace looks like: ``` hook: MAP.element function: Lbl'UndsPipe'-'-GT-Unds'{} (0) arg: kore[60] arg: kore[109] hook result: kore[207] hook: MAP.unit function: Lbl'Stop'Map{} (0) hook result: kore[51] hook: MAP.concat function: Lbl'Unds'Map'Unds'{} (0) arg: kore[51] arg: kore[207] hook result: kore[207] function: LblinitGeneratedTopCell{} (0) arg: kore[207] rule: 162 1 VarInit = kore[207] function: LblinitKCell{} (0:0) arg: kore[207] rule: 163 1 VarInit = kore[207] function: Lblproject'Coln'KItem{} (0:0:0) hook: MAP.lookup function: LblMap'Coln'lookup{} (0:0:0:0:0) arg: kore[207] arg: kore[60] hook result: kore[109] arg: kore[129] rule: 207 1 VarK = kore[109] function: LblinitGeneratedCounterCell{} (0:1) rule: 161 0 config: kore[237] rule: 126 3 Var'Unds'DotVar0 = kore[61] VarI = kore[50] Var'Unds'DotVar1 = kore[10] function: Lblinc'UndsUnds'TEST'Unds'Int'Unds'Int{} (0:0:0:0:0:0) arg: kore[50] rule: 160 1 VarI = kore[50] hook: INT.add function: Lbl'UndsPlus'Int'Unds'{} (0) arg: kore[50] arg: kore[50] hook result: kore[50] config: kore[237] ``` And was generated with the following (really horrible, quick and dirty) Python recogniser: <details><summary>Recogniser</summary> ```python #!/usr/bin/env python3 import sys import struct def load(fn): with open(fn, 'rb') as f: return f.read() def word(byte): return byte * 8 class Parser: def __init__(self, data): self.input = data self.depth = 0 def print(self, *args): print(f'{" " * self.depth}', end='') print(*args) def peek(self, cst): return self.input[:len(cst)] == cst def eof(self): return len(self.input) == 0 def skip(self, n): self.input = self.input[n:] def read_constant(self, cst): if self.input[:len(cst)] != cst: print(self.input[:40]) assert False self.input = self.input[len(cst):] def read_word(self, byte): assert len(byte) == 1 self.read_constant(byte * 8) def read_until(self, constant): index = self.input.find(constant) ret = self.input[:index] self.input = self.input[index:] return ret def read_string(self): data = self.read_until(b'\x00') self.read_constant(b'\x00') return str(data, 'ascii') def read_uint64(self): index = 8 raw = self.input[:index] self.input = self.input[index:] little_endian_long_long = '<Q' return struct.unpack(little_endian_long_long, raw)[0] def read_arg(self): self.depth += 1 if self.peek(b'\x7F'): l = self.read_kore() self.print(f'arg: kore[{l}]') self.depth -= 1 return True elif self.peek(word(b'\xDD')): self.read_function() self.depth -= 1 return True elif self.peek(word(b'\xAA')): self.read_hook() self.depth -= 1 return True elif self.peek(word(b'\x11')) or self.peek(word(b'\xBB')): self.depth -= 1 return False else: self.read_rule() self.depth -= 1 return True def read_hook(self): self.read_word(b'\xAA') name = self.read_string() self.print(f'hook: {name}') while True: if not self.read_arg(): break self.read_word(b'\xBB') result = self.read_kore() self.print(f'hook result: kore[{result}]') def read_function(self): self.read_word(b'\xDD') label = self.read_string() loc = self.read_string() self.print(f'function: {label} ({loc})') while True: if not self.read_arg(): break self.read_word(b'\x11') def read_kore(self): self.read_constant(b'\x7FKORE') self.skip(6) pattern_len = self.read_uint64() self.skip(pattern_len) return pattern_len def read_rule(self): ordinal = self.read_uint64() arity = self.read_uint64() self.print(f'rule: {ordinal} {arity}') for i in range(arity): self.read_variable() def read_variable(self): name = self.read_string() kore_len = self.read_kore() self.print(f' {name} = kore[{kore_len}]') self.read_word(b'\xCC') def read_config(self): self.read_word(b'\xFF') l = self.read_kore() self.print(f'config: kore[{l}]') self.read_word(b'\xCC') def read_trace(self): while not self.eof(): if self.peek(word(b'\xAA')): self.read_hook() elif self.peek(word(b'\xDD')): self.read_function() elif self.peek(word(b'\xFF')): self.read_config() else: self.read_rule() if __name__ == '__main__': data = load(sys.argv[1]) parser = Parser(data) parser.read_trace() ``` </details> I have verified by hand that this recogniser accepts all the proof traces from the proof generation repo's master branch, if they are regenerated with the LLVM backend pointed at this PR (`kup install k --override llvm-backend hint_emitPlace_2`).

gtrepta and others added 5 commits October 10, 2023 21:15

WIP

bfac04e

WIP

669a3b1

Merge branch 'master' into hint_emitPlace

540b453

Merge branch 'master' into hint_emitPlace

abaf477

Merge branch 'master' into hint_emitPlace

c200aca

Baltoli marked this pull request as ready for review November 2, 2023 14:06

gtrepta requested changes Nov 2, 2023

View reviewed changes

Baltoli mentioned this pull request Nov 6, 2023

Hook event for proof hints #873

Merged

Merge branch 'master' into hint_emitPlace

3ae3fe9

This was referenced Nov 9, 2023

Refactoring proof trace code #876

Merged

Move rewrite event emission #877

Merged

Baltoli closed this Nov 9, 2023

Baltoli deleted the hint_emitPlace branch December 1, 2023 15:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor generation of proof hint "rule applied" event #862

Refactor generation of proof hint "rule applied" event #862

Baltoli commented Oct 17, 2023 •

edited

Loading

gtrepta Nov 2, 2023

Baltoli Nov 2, 2023

gtrepta Nov 2, 2023

Refactor generation of proof hint "rule applied" event #862

Refactor generation of proof hint "rule applied" event #862

Conversation

Baltoli commented Oct 17, 2023 • edited Loading

gtrepta Nov 2, 2023

Choose a reason for hiding this comment

Baltoli Nov 2, 2023

Choose a reason for hiding this comment

gtrepta Nov 2, 2023

Choose a reason for hiding this comment

Baltoli commented Oct 17, 2023 •

edited

Loading