yacc - BNF: input going to wrong nonterminal -


i'm developing bnf chess algebraic notation , ran interesting case, input going wrong non-terminal.

my start bnf rule follows (note intentionally doesn't include castling or notes):

algebraic_notation : piece start_position capture end_position promotion 

piece, start_position, capture, , promotion can empty, allowing move 'd4'. problem when such move entered, input ('d4') taken start_position, resulting in error b/c there no more input end_position, cannot empty.

the obvious hack/workaround allow end_position empty , check see if got input , act accordingly.

this work, know if there way deal this. possible input not go first matching symbol if causes entire expression not match?

another question whether standard behaviour bnf, or problem yaccer i'm using: ply v 3.3.

tried using flex/bison , got same thing. appears not specific ply.

here relevant rules completeness:

algebraic_notation : piece start_position capture end_position promotion  piece : king         | queen         | bishop         | knight         | rook         | pawn  pawn : empty  start_position : file                 | number                 | file number                 | empty  end_position : file number                 | empty                 // line hack/workaround  capture : capture         | empty  promotion : equal queen             | equal rook             | equal knight             | equal bishop             | empty  empty :  

the problem you're ignoring shift/reduce conflict parser generator. while yacc/bison (and presumably ply) resolve errors you, resolution might not doing want, , might result in parser parses langauge other 1 trying parse.

whenever shift/reduce (or reduce/reduce) conflict lr parser generator, need understand conflict (and why occurs) know whether can ignore or whether need fix it. lets fix grammar getting rid of 'hack' (which wrong , not want parse), useless 'empty' rule (which confuses things):

%token file number %% algebraic_notation : piece start_position capture end_position promotion piece : 'k' | 'q' | 'b' | 'n' | 'r' | /*pawn*/ start_position : file | number | file number | /*empty*/ end_position : file number capture : 'x' | /*empty*/ promotion : '=' 'q' | '=' 'r' | '=' 'n' | '=' 'b' | /*empty*/ 

now when run through 'bison -v' (always use -v verbose output file -- i'm not sure ply's equivalent is), message shift/reduce conflict, , if in .output file can see is:

state 7      1 algebraic_notation: piece . start_position capture end_position promotion      file    shift, , go state 9     number  shift, , go state 10      file      [reduce using rule 11 (start_position)]     $default  reduce using rule 11 (start_position)      start_position  go state 11 

this telling after seeing piece, when next token file, doesn't know whether should shift (treating file (part of) start_position) or reduce (giving empty start_position). that's because needs more lookahead see if there's second position use end_position know do, ignoring conflict result in parser fails parse lots of valid things (basically, empty start_position , capture).

the best way solve lookahead-related shift-reduce conflict involving empty production (or pretty conflict involving empty production, really) unfactor grammar -- rid of empty rule , duplicate rule uses non-terminal both , without it. in case, gives rules:

algebraic_notation : piece capture end_position promotion algebraic_notation : piece start_position capture end_position promotion start_position : file | number | file number  

(the other rules unchanged) still have shift-reduce conflict:

state 7      1 algebraic_notation: piece . capture end_position promotion     2                   | piece . start_position capture end_position promotion      file    shift, , go state 9     number  shift, , go state 10     'x'     shift, , go state 11      file  [reduce using rule 14 (capture)]      start_position  go state 12     capture         go state 13 

basically, we've moved conflict 1 step , have problem empty capture rule. unfactor well:

algebraic_notation : piece end_position promotion algebraic_notation : piece capture end_position promotion algebraic_notation : piece start_position end_position promotion algebraic_notation : piece start_position capture end_position promotion capture : 'x' 

and bison reports no more conflicts, can reasonably confident parse way want. can simplify bit more getting rid of capture rule , using literal 'x' in algebraic_notation rule. prefer this, think clearer avoid unnecessary indirection:

%token file number %% algebraic_notation : piece end_position promotion algebraic_notation : piece 'x' end_position promotion algebraic_notation : piece start_position end_position promotion algebraic_notation : piece start_position 'x' end_position promotion piece : 'k' | 'q' | 'b' | 'n' | 'r' | /*pawn*/ start_position : file | number | file number end_position : file number promotion : '=' 'q' | '=' 'r' | '=' 'n' | '=' 'b' | /*empty*/ 

Comments

Popular posts from this blog

asp.net - repeatedly call AddImageUrl(url) to assemble pdf document -

java - Android recognize cell phone with keyboard or not? -

iphone - How would you achieve a LED Scrolling effect? -