Refactor generation of proof hint "rule applied" event #862
Conversation
```cpp
auto Call = llvm::CallInst::Create(applyRule, args, "", d->CurrentBlock);

// How do you emit the resulting config from here?

setDebugLoc(Call);
Call->setCallingConv(llvm::CallingConv::Tail);
```
This still needs to be implemented. The call instruction is meant to be tail recursive; I tried changing that, but from what I remember it wasn't that simple.
Once this is implemented, the call that's left in CreateTerm to emit the config on rewrites can be removed, right? That is, the fact that it lives there and not here is why the trace contains configs rather than function returns.
I believe so, yes.
As discussed previously, this PR is an initial cleanup and refactoring effort for the existing proof-hint generation code implemented for rewrite, function and hook events. The changes in this PR don't add any new features to the traces; they just reorganise the existing code. Most of the changes are documentation, along with identifying duplicated code across different call sites that can be merged together. I plan to port the reorganisation in #862 over to the infrastructure in this PR once it is merged. The trace format is not tested in the backend currently (this is future work), but I have verified that a proof trace (that uses all event types) generated using this branch is byte-for-byte identical to one generated from the current master branch.
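The byte-for-byte check mentioned above can be as simple as the following sketch (illustrative only, not necessarily how the check was actually performed; `master.hints` and `branch.hints` are placeholder file names, not paths used by the backend or its tests):

```python
#!/usr/bin/env python3
# Minimal byte-for-byte comparison of two proof-hint files. The default file
# names are placeholders for traces produced by the master build and this
# branch respectively.
import filecmp
import sys

if __name__ == '__main__':
    a, b = sys.argv[1:3] if len(sys.argv) >= 3 else ('master.hints', 'branch.hints')
    same = filecmp.cmp(a, b, shallow=False)
    print('identical' if same else 'different')
    sys.exit(0 if same else 1)
```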
This PR is a reworking of the changes initially made by @gtrepta in #862, to include the cleaned-up proof infrastructure in #876. The changes here are pretty minimal; we need to split the "rewrite" event into a pre and post part so that the arity / ordinal and the resulting configuration can be emitted separately. Other than that, it's just a question of reusing some of the boilerplate in #862 that assembles the substitution information in the event.

~~There is one major issue with the change here, which @gtrepta also identified in his initial version of this change. We can only emit resulting configurations for top-level rewrites (that is, we cannot print the resulting _term_ from calling a function). Doing so breaks TCO for the generated interpreter badly enough that some backend tests crash; I have not dug into this in detail, as I'm reasonably sure that's the reason things break.~~

* @nishantjr confirms that we can safely omit the result of function calls from the trace.

## Outstanding Issues

I am working my way through the example below to ensure that the format being generated is precisely what we expect to see from the documentation. The issues I have identified are as follows:

* [x] Duplicated KORE comments below are caused by function events not being bracketed as rewrites and hooks are; we need to pull the header out to the front, allocate args, then do the existing event.
* [x] We need to handle binary KORE delimitation more consistently; at the moment it is inconsistent whether we emit a header / footer or not. We should probably standardise on using the 1.2.0 self-limiting format.

## Worked Example

Similarly to #862, I have generated a trace for the program `foo(0)` in the following definition:

```k
module TEST
  imports INT

  syntax Int ::= "inc" Int [function, total]
  rule inc I => I +Int 1

  syntax Foo ::= foo(Int) | bar(Int)
  rule foo(I) => bar(inc I)
endmodule
```

The trace looks like:

```
hook: MAP.element
function: Lbl'UndsPipe'-'-GT-Unds'{} (0)
arg: kore[60]
arg: kore[109]
hook result: kore[207]
hook: MAP.unit
function: Lbl'Stop'Map{} (0)
hook result: kore[51]
hook: MAP.concat
function: Lbl'Unds'Map'Unds'{} (0)
arg: kore[51]
arg: kore[207]
hook result: kore[207]
function: LblinitGeneratedTopCell{} (0)
arg: kore[207]
rule: 162 1
 VarInit = kore[207]
function: LblinitKCell{} (0:0)
arg: kore[207]
rule: 163 1
 VarInit = kore[207]
function: Lblproject'Coln'KItem{} (0:0:0)
hook: MAP.lookup
function: LblMap'Coln'lookup{} (0:0:0:0:0)
arg: kore[207]
arg: kore[60]
hook result: kore[109]
arg: kore[129]
rule: 207 1
 VarK = kore[109]
function: LblinitGeneratedCounterCell{} (0:1)
rule: 161 0
config: kore[237]
rule: 126 3
 Var'Unds'DotVar0 = kore[61]
 VarI = kore[50]
 Var'Unds'DotVar1 = kore[10]
function: Lblinc'UndsUnds'TEST'Unds'Int'Unds'Int{} (0:0:0:0:0:0)
arg: kore[50]
rule: 160 1
 VarI = kore[50]
hook: INT.add
function: Lbl'UndsPlus'Int'Unds'{} (0)
arg: kore[50]
arg: kore[50]
hook result: kore[50]
config: kore[237]
```

And was generated with the following (really horrible, quick and dirty) Python recogniser:

<details><summary>Recogniser</summary>

```python
#!/usr/bin/env python3

import sys
import struct


def load(fn):
    with open(fn, 'rb') as f:
        return f.read()


def word(byte):
    return byte * 8


class Parser:
    def __init__(self, data):
        self.input = data
        self.depth = 0

    def print(self, *args):
        print(f'{" " * self.depth}', end='')
        print(*args)

    def peek(self, cst):
        return self.input[:len(cst)] == cst

    def eof(self):
        return len(self.input) == 0

    def skip(self, n):
        self.input = self.input[n:]

    def read_constant(self, cst):
        if self.input[:len(cst)] != cst:
            print(self.input[:40])
            assert False
        self.input = self.input[len(cst):]

    def read_word(self, byte):
        assert len(byte) == 1
        self.read_constant(byte * 8)

    def read_until(self, constant):
        index = self.input.find(constant)
        ret = self.input[:index]
        self.input = self.input[index:]
        return ret

    def read_string(self):
        data = self.read_until(b'\x00')
        self.read_constant(b'\x00')
        return str(data, 'ascii')

    def read_uint64(self):
        index = 8
        raw = self.input[:index]
        self.input = self.input[index:]
        little_endian_long_long = '<Q'
        return struct.unpack(little_endian_long_long, raw)[0]

    def read_arg(self):
        self.depth += 1

        if self.peek(b'\x7F'):
            l = self.read_kore()
            self.print(f'arg: kore[{l}]')
            self.depth -= 1
            return True
        elif self.peek(word(b'\xDD')):
            self.read_function()
            self.depth -= 1
            return True
        elif self.peek(word(b'\xAA')):
            self.read_hook()
            self.depth -= 1
            return True
        elif self.peek(word(b'\x11')) or self.peek(word(b'\xBB')):
            self.depth -= 1
            return False
        else:
            self.read_rule()
            self.depth -= 1
            return True

    def read_hook(self):
        self.read_word(b'\xAA')
        name = self.read_string()
        self.print(f'hook: {name}')

        while True:
            if not self.read_arg():
                break

        self.read_word(b'\xBB')
        result = self.read_kore()
        self.print(f'hook result: kore[{result}]')

    def read_function(self):
        self.read_word(b'\xDD')
        label = self.read_string()
        loc = self.read_string()
        self.print(f'function: {label} ({loc})')

        while True:
            if not self.read_arg():
                break

        self.read_word(b'\x11')

    def read_kore(self):
        self.read_constant(b'\x7FKORE')
        self.skip(6)
        pattern_len = self.read_uint64()
        self.skip(pattern_len)
        return pattern_len

    def read_rule(self):
        ordinal = self.read_uint64()
        arity = self.read_uint64()
        self.print(f'rule: {ordinal} {arity}')
        for i in range(arity):
            self.read_variable()

    def read_variable(self):
        name = self.read_string()
        kore_len = self.read_kore()
        self.print(f' {name} = kore[{kore_len}]')
        self.read_word(b'\xCC')

    def read_config(self):
        self.read_word(b'\xFF')
        l = self.read_kore()
        self.print(f'config: kore[{l}]')
        self.read_word(b'\xCC')

    def read_trace(self):
        while not self.eof():
            if self.peek(word(b'\xAA')):
                self.read_hook()
            elif self.peek(word(b'\xDD')):
                self.read_function()
            elif self.peek(word(b'\xFF')):
                self.read_config()
            else:
                self.read_rule()


if __name__ == '__main__':
    data = load(sys.argv[1])
    parser = Parser(data)
    parser.read_trace()
```

</details>

I have verified by hand that this recogniser accepts all the proof traces from the proof generation repo's master branch, if they are regenerated with the LLVM backend pointed at this PR (`kup install k --override llvm-backend hint_emitPlace_2`).
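For readers who would rather not reverse-engineer the script, the framing it relies on is summarised below. The byte values and field widths are taken directly from the recogniser above; the constant names are shorthand of my own and do not correspond to identifiers in the backend.

```python
# Sentinel words matched by the recogniser above. Each marker byte is repeated
# eight times to fill a 64-bit word (see `word()` in the script); the names
# here are descriptive labels, not identifiers from the backend.
HOOK_EVENT     = b'\xAA' * 8  # hook event: marker, hook name, args..., HOOK_RESULT
HOOK_RESULT    = b'\xBB' * 8  # followed by a single binary KORE term
FUNCTION_EVENT = b'\xDD' * 8  # function event: marker, label, location, args..., FUNCTION_END
FUNCTION_END   = b'\x11' * 8
CONFIG_EVENT   = b'\xFF' * 8  # configuration: marker, binary KORE term, END_MARKER
END_MARKER     = b'\xCC' * 8  # also closes each substitution binding

# Rule ("rule applied") events carry no marker word: they are a 64-bit
# little-endian ordinal, a 64-bit arity, then `arity` bindings of the form
#   <variable name> b'\x00' <binary KORE term> END_MARKER
#
# Binary KORE terms are framed as the recogniser expects them: the magic
# b'\x7fKORE', six bytes that the script skips (the format version), then an
# 8-byte little-endian length followed by that many bytes of serialised
# pattern (the 1.2.0 self-limiting format mentioned above).
KORE_MAGIC = b'\x7fKORE'
```

Arguments to hook and function events may themselves be KORE terms, nested hook or function events, or rule events, which is why `read_arg` dispatches on the next sentinel.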
This PR packages up @gtrepta's initial work towards part (1) of #860: a refactoring of the emission of rewrite / rule match events in the proof trace.
At a high level, the code changes here cause the interpreter to emit "rule matched" events for all rules, not just top-level rewrite rules; in the trace below we can see function rules emitting these events. The term produced by a function event is not yet emitted into the trace, but it could be recovered by re-evaluating the function on its bound arguments, which are recorded in the trace.
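As a toy illustration of that last point (plain Python values standing in for KORE terms, and a hypothetical event layout rather than the real hint format): if an event records the function's label and the arguments bound by the matching rule, the omitted result can be reproduced by re-running the function on those arguments.

```python
# Toy illustration only: plain Python in place of KORE, and a hypothetical
# event dict rather than the real binary hint format. 'inc' mirrors the
# `rule inc I => I +Int 1` example used elsewhere in this thread.

def inc(i: int) -> int:
    return i + 1

FUNCTIONS = {'inc': inc}

def recover_result(event: dict):
    # Only the inputs are stored in the trace; re-evaluate to get the result.
    return FUNCTIONS[event['label']](*event['args'])

assert recover_result({'label': 'inc', 'args': [0]}) == 1
```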
To evaluate the current state of the trace format, consider the following definition:
When run with `--proof-output`, the sequence of events that is generated in the trace is as follows (with lengthy KORE terms omitted here for my own sanity when extracting this information):

The things lacking from this trace are currently:
Additionally, the parser will need to be updated to reflect this new format, as it has been for previous changes.
I would also like to clean up the code a little now that I better understand the trace format and the events being emitted.