diff --git a/doc/development/compressed_state_table/main.md b/doc/development/compressed_state_table/main.md
index 9000de86..8fc7a686 100644
--- a/doc/development/compressed_state_table/main.md
+++ b/doc/development/compressed_state_table/main.md
@@ -1,8 +1,45 @@
 # Compressed State Table
 
 LR parser generates two large tables, action table and GOTO table.
-Action table is a matrix of current state and token. Each cell of action table indicates next action (shift, reduce, accept and error).
-GOTO table is a matrix of current state and nonterminal symbol. Each cell of GOTO table indicates next state.
+Action table is a matrix of states and tokens. Each cell of action table indicates next action (shift, reduce, accept and error).
+GOTO table is a matrix of states and nonterminal symbols. Each cell of GOTO table indicates next state.
+
+Action table of "parse.y":
+
+| |EOF| LF|NUM|'+'|'*'|'('|')'|
+|--------|--:|--:|--:|--:|--:|--:|--:|
+|State 0| r1| | s1| | | s2| |
+|State 1| r3| r3| r3| r3| r3| r3| r3|
+|State 2| | | s1| | | s2| |
+|State 3| s6| | | | | | |
+|State 4| | s7| | s8| s9| | |
+|State 5| | | | s8| s9| |s10|
+|State 6|acc|acc|acc|acc|acc|acc|acc|
+|State 7| r2| r2| r2| r2| r2| r2| r2|
+|State 8| | | s1| | | s2| |
+|State 9| | | s1| | | s2| |
+|State 10| r6| r6| r6| r6| r6| r6| r6|
+|State 11| | r4| | r4| s9| | r4|
+|State 12| | r5| | r5| r5| | r5|
+
+GOTO table of "parse.y":
+
+| |$accept|program|expr|
+|--------|------:|------:|---:|
+|State 0| | g3| g4|
+|State 1| | | |
+|State 2| | | g5|
+|State 3| | | |
+|State 4| | | |
+|State 5| | | |
+|State 6| | | |
+|State 7| | | |
+|State 8| | | g11|
+|State 9| | | g12|
+|State 10| | | |
+|State 11| | | |
+|State 12| | | |
+
 Both action table and GOTO table are sparse.
 Therefore LR parser generator compresses both tables and creates these tables.
 
@@ -377,7 +414,7 @@ yydefgoto = [
 Both of them are Rule table.
 `yyr1` specifies nonterminal symbol id of rule's Left-Hand-Side.
 `yyr2` specifies the length of the rule, number of symbols on the rule's Right-Hand-Side.
-Index 0
+Index 0 is not used because Rule id starts with 1.
 
 ```ruby
 yyr1 = [
@@ -394,6 +431,10 @@ yyr2 = [
 
 ## How to use tables
 
+See also "parse.rb", which implements an LALR parser based on the "parse.y" file.
+
+First, define the important constants and arrays:
+
 ```ruby
 YYNTOKENS = 9
 
@@ -419,6 +460,9 @@ yyr2 = [ 0, 2, 0, 2, 1, 3, 3, 3]
 
 Determine what to do next based on current state (`state`) and next token (`yytoken`).
 
+The first step in deciding the action is to look up the `yypact` table by the current state.
+If only default reduce exists for the current state, `yypact` returns `YYPACT_NINF`.
+
 ```ruby
 # Case 1: Only default reduce exists for the state
 #
@@ -438,6 +482,11 @@ if offset == YYPACT_NINF # true
 end
 ```
 
+If both shift and default reduce exist for the current state, `yypact` returns an offset into `yytable`.
+The index is the sum of `offset` and `yytoken`.
+Check the index by consulting `yycheck` before accessing `yytable`.
+The index can be out of range because blank cells at the head and tail are omitted, so also check that the index is not less than 0 and not greater than `YYLAST`; see how `yycheck` is constructed in the example above.
+
 ```ruby
 # Case 2: Both shift and default reduce exists for the state
 #
@@ -493,6 +542,26 @@ end
 
 ### Execute (default) reduce
 
+Once the next action is decided to be default reduce, determine:
+
+1. the rule to be applied
+2. the next state from GOTO table
+
+The rule id for the default reduce is stored in `yydefact`.
+`0` in `yydefact` means syntax error, so check that the value is not `0` before continuing.
+
+Once the rule is determined, the length of the rule can be read from `yyr2` and the LHS nonterminal can be read from `yyr1`.
+
+The next state is determined by the LHS nonterminal and the state after the reduce.
+The GOTO table is also compressed into `yytable`, so the process to decide the next state is similar to that for `yypact`.
+
+1. Look up `yypgoto` by LHS nonterminal. Note `yypact` is indexed by state but `yypgoto` is indexed by nonterminal.
+2. Check whether the value in `yypgoto` is `YYPACT_NINF` or not.
+3. Check whether the index, the sum of offset and state, is out of range or not.
+4. Check the `yycheck` table before accessing `yytable`.
+
+Finally, push the state onto the stack.
+
 ```ruby
 # State 11
 #