Implement implicit "OR".
If an identifier is given without a keyword, the most recent keyword is assumed. For example,

   not host vs and ace

is short for

   not host vs and host ace

Which should not be confused with

   not ( host vs or ace )

This patch fixes issue #17.
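
The elision rule described above can be illustrated on a flat token list. This is a minimal sketch and not the code in this patch: `expand_elided`, the `keywords` set and the `operators` set are hypothetical names, and it assumes a filter whose tokens are separated by whitespace.

```lua
-- Hypothetical helper: makes elided keywords explicit in a filter string.
-- Assumes whitespace-separated tokens and a tiny, illustrative keyword set.
local keywords = { host = true, port = true, net = true }
local operators = {
   ["not"] = true, ["and"] = true, ["or"] = true, ["("] = true, [")"] = true
}

local function expand_elided(str)
   local out = {}
   local last_keyword = nil
   local expect_primitive = true   -- true where a new primitive may start
   for tok in str:gmatch("%S+") do
      if operators[tok] then
         -- A primitive may follow any operator except a closing paren.
         expect_primitive = (tok ~= ")")
      elseif keywords[tok] then
         last_keyword = tok
         expect_primitive = false
      else
         -- Bare identifier: reuse the most recent keyword if one was elided.
         if expect_primitive then
            assert(last_keyword, "no keyword to reuse for "..tok)
            out[#out + 1] = last_keyword
         end
         expect_primitive = false
      end
      out[#out + 1] = tok
   end
   return table.concat(out, " ")
end

print(expand_elided("not host vs and ace"))
```

The sketch only rewrites the token stream; how the expanded expression is grouped by `not`, `and` and `or` is a separate question for the parser.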
dpino committed Aug 14, 2014
1 parent 550251b commit 2f685f2
Showing 1 changed file with 30 additions and 9 deletions.
39 changes: 30 additions & 9 deletions src/pf/parse.lua
@@ -216,12 +216,15 @@ local function tokens(str)
       local tok = next(opts)
       assert(tok == expected, "expected "..expected..", got: "..tok)
    end
+   local function put(tok)
+      str = string.sub(str, 1, pos - 1)..tok..string.sub(str, pos + 1)
+   end
    local function check(expected, opts)
       if peek(opts) ~= expected then return false end
       next()
       return true
    end
-   return { peek = peek, next = next, consume = consume, check = check }
+   return { peek = peek, next = next, consume = consume, check = check, put = put }
 end

 local addressables = set(
@@ -619,6 +622,16 @@ function parse_arithmetic(lexer, tok, max_precedence, parsed_exp)
    end
 end

+local last_keyword = (function()
+   local keyword = nil
+   return function(arg)
+      if arg ~= nil then
+         keyword = arg
+      end
+      return keyword
+   end
+end)()
+
 local function parse_primitive_or_arithmetic(lexer)
    local tok = lexer.next({maybe_arithmetic=true})
    if (type(tok) == 'number' or tok == 'len' or
@@ -627,15 +640,18 @@ local function parse_primitive_or_arithmetic(lexer)
    end

    local parser = primitives[tok]
-   if parser then return parser(lexer, tok) end
+   if parser then
+      last_keyword(tok)
+      return parser(lexer, tok)
+   else
+      lexer.put(tok)
+      local parser = primitives[last_keyword()]
+      if parser then
+         return parser(lexer, last_keyword())
+      end
+   end

-   -- At this point the official pcap grammar is squirrely. It says:
-   -- "If an identifier is given without a keyword, the most recent
-   -- keyword is assumed. For example, `not host vs and ace' is
-   -- short for `not host vs and host ace` and which should not be
-   -- confused with `not (host vs or ace)`." For now we punt on this
-   -- part of the grammar.
-   error("keyword elision not implemented "..tok)
+   error(string.format("Unexpected token: %s", tok))
 end

 local logical_precedence = {
@@ -703,6 +719,7 @@ function parse(str)
    if not lexer.peek({maybe_arithmetic=true}) then return { 'true' } end
    local expr = parse_logical(lexer)
    assert(not lexer.peek(), "unexpected token", lexer.peek())
+   last_keyword("")
    return expr
 end

@@ -788,5 +805,9 @@ function selftest ()
                 { "-", { "-", { "[ip]", 2, 1 },
                               { "<<", { "&", { "[ip]", 0, 1 }, 15 }, 2 } },
                        { ">>", { "&", { "[tcp]", 12, 1 }, 240 }, 2 } }, 0 } })
+   parse_test("not host vs and ace",
+              { 'not', { 'and', { 'host', 'vs' }, { 'host', 'ace' } } } )
+   parse_test("not ( host vs or ace )",
+              { 'not', { 'or', { 'host', 'vs' }, { 'host', 'ace' } } } )
    print("OK")
 end

1 comment on commit 2f685f2

dpino (Member, Author) commented on 2f685f2 on Aug 14, 2014


In 'parse_primitive_or_arithmetic' I tried to peek at the next token and consume it later, but that caused trouble with other expressions. In the end I added a method to the lexer to put a token back. For example, in the expression:

not host vs and ace

When the lexer reaches 'ace', it cannot find a matching primitive, but since I keep track of the last keyword I know which keyword to apply. However, the lexer's pos has already advanced. The token consumed at that point should have been the last keyword (in this case 'host'), so I put 'ace' back into the string at the current position.
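
For comparison, putting a token back can also be done with a small pushback stack instead of splicing the source string. The following is a sketch under simplifying assumptions, not the lexer from src/pf/parse.lua: `make_lexer`, the single-entry `primitives` table and `parse_primitive` are hypothetical stand-ins, and tokens are assumed to be separated by whitespace.

```lua
-- Hypothetical lexer with a pushback stack; tokens are whitespace-separated.
local function make_lexer(str)
   local toks = {}
   for tok in str:gmatch("%S+") do toks[#toks + 1] = tok end
   local pos, pushed = 1, {}
   local lexer = {}
   function lexer.next()
      -- Tokens that were put back are returned before fresh input.
      if #pushed > 0 then return table.remove(pushed) end
      local tok = toks[pos]
      if tok ~= nil then pos = pos + 1 end
      return tok
   end
   function lexer.put(tok)
      pushed[#pushed + 1] = tok
   end
   return lexer
end

-- Illustrative primitive table: 'host' takes exactly one argument.
local primitives = {
   host = function(lexer, tok) return { tok, lexer.next() } end
}
local last_keyword = nil

local function parse_primitive(lexer)
   local tok = lexer.next()
   local parser = primitives[tok]
   if parser then
      last_keyword = tok
      return parser(lexer, tok)
   end
   -- Not a keyword: put the token back and retry with the last keyword.
   lexer.put(tok)
   parser = primitives[last_keyword]
   if parser then return parser(lexer, last_keyword) end
   error(string.format("Unexpected token: %s", tostring(tok)))
end
```

With this shape, `put` never touches the input string or `pos`, so there is no offset arithmetic to get wrong when the returned token and the consumed text differ in length.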

Another thing: I'm not sure the tree for "not host vs and ace" is correct. I got the following for its equivalent expression "not host vs and host ace":

{not, {and, {host, vs}, {host, ace}}}

Shouldn't it be:

{and, {not, {host, vs}}, {host, ace}}

Usually 'not' has higher precedence than any other logical operator. I noticed that 'not' is not present in the precedence table; maybe that has something to do with it. WDYT?
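
To make the question concrete, here is a minimal sketch of a parser in which 'not' binds to the single operand that follows it, while 'and' and 'or' share one level and associate left to right (which is how I read the pcap-filter manual). It is not the pflua parser: this `parse` is a hypothetical standalone function that supports only the 'host' primitive and whitespace-separated tokens.

```lua
-- Hypothetical mini-parser: 'not' is handled as a unary prefix operator.
local function parse(str)
   local toks, pos = {}, 1
   for tok in str:gmatch("%S+") do toks[#toks + 1] = tok end
   local function next_tok()
      local tok = toks[pos]
      pos = pos + 1
      return tok
   end

   local parse_logical

   local function parse_unary()
      local tok = next_tok()
      if tok == "not" then
         -- 'not' applies only to the operand right after it.
         return { "not", parse_unary() }
      elseif tok == "(" then
         local expr = parse_logical()
         assert(next_tok() == ")", "expected )")
         return expr
      end
      assert(tok == "host", "expected host, got: "..tostring(tok))
      return { "host", next_tok() }
   end

   function parse_logical()
      local expr = parse_unary()
      -- 'and' and 'or' have equal precedence and associate left to right.
      while toks[pos] == "and" or toks[pos] == "or" do
         local op = next_tok()
         expr = { op, expr, parse_unary() }
      end
      return expr
   end

   local expr = parse_logical()
   assert(toks[pos] == nil, "unexpected token")
   return expr
end
```

Under these assumptions, "not host vs and host ace" yields the second tree above, with 'and' at the root, and 'not' wraps a whole conjunction or disjunction only when parentheses are used.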
