Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

order of tokens in choice matters? #26

Open
stef opened this issue May 13, 2015 · 2 comments
Open

order of tokens in choice matters? #26

stef opened this issue May 13, 2015 · 2 comments

Comments

@stef
Copy link

stef commented May 13, 2015

If a token is a prefix of another token in a choice, then depending on the order of their listing in the choice influences the results of the parser. an example:

#include <hammer/hammer.h>
#include <stdio.h>
#include <string.h>

// test with
// gcc -DBUG -o prefixbug prefixbug.c $(pkg-config --cflags --libs libhammer)
// echo "asdfv" | ./prefixbug # should return yay, does boo
// c.f.
// gcc -o prefixbug prefixbug.c $(pkg-config --cflags --libs libhammer)
// echo "asdfv" | ./prefixbug # correctly^Was expected returns yay

int main(int argc, char *argv[]) {
    uint8_t input[1024];
    size_t inputsize;

    HParser *eol = h_token("\n",1);
    HParser *words = h_choice(
#ifdef BUG
                              h_token("asdf",strlen("asdf")),
                              h_token("asdfv",strlen("asdfv")),
#else // no bug
                              h_token("asdfv",strlen("asdfv")),
                              h_token("asdf",strlen("asdf")),
#endif
                              NULL);
    HParser *rule = h_sequence(words,eol,NULL);

    inputsize = fread(input, 1, sizeof(input), stdin);

    HParseResult *result = h_parse(rule, input, inputsize);
    if(result) {
        printf("yay!\n");
    } else {
        printf("boo!\n");
    }
}

sorry if this is due to my poor understanding and in fact an intended feature.

@abiggerhammer
Copy link
Owner

Hi! I tried to reply to the email you sent, but it finally bounced today after several days. I'll reproduce it here.

The problem you're running into is a known one, and has to do with the fact that Hammer's default backend is packrat parsing. The choice operator in packrat is ordered, which means that if you define a parser that chooses between "asdf" and "asdfv", the first choice in its list that the input string matches will succeed. This is why it works when "asdfv" is the first choice and "asdf" is the second one.

This is an inherent limitation of packrat, it's confusing, and we don't like it either, which is why we're planning on implementing ALL* (from ANTLR) and having it be the default backend, although we'll keep packrat around. All the other backends have the unordered choice you're used to, so if the grammar you're implementing is actually regular or context-free, you can use one of them:

  HParser *p = ...
  if (h_compile(p, PB_LALR, NULL)) { // returns false if grammar not CF
    // parse tables successfully generated, nothing to do here
  } else { ... }
  h_parse(p, input, len);

h_compile is a call-once function (it sets up parse tables and stuff like that for the backends that need them), so you invoke it once your grammar is fully declared and then you never have to touch it again.

I need to write this up for the docs still, so thanks for the reminder.

@stef
Copy link
Author

stef commented May 14, 2015

Thanks for the response! Makes sense I think, but i don't get one thing: If i parse your response correctly (which i doubt, as i'm confused) then that would mean on the "asdfv" input the parser should match and return "asdf", no? The negative branch of the if(result) { false branch means that there was no result in parsing the input. This is not quite what i understand when parsing your sentence:

a parser that chooses between "asdf" and "asdfv", the first choice in its list that the input string matches will succeed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants