Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for OR and Parenthesis in AST #58

Merged
merged 1 commit into from
Jan 28, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 16 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,6 @@ func Example(headerValue string) (*epsearchast_v3.AstNode, error) {
If the error that comes back is a ValidationErr you should treat it as a 400 to the caller.



### Aliases

This package provides a way to support aliases for fields, this will allow a user to specify multiple different names for a field, and still have it validated and converted properly:
Expand Down Expand Up @@ -61,15 +60,15 @@ Aliases can also match Regular Expressions. Regular expresses are specified star

**Note**: Regular expressions are an advanced use case, and care is needed as the validation involved is maybe more limited than expected. In general if more than one regular expression can a key, then it's not defined which one will be used. Some errors may only be caught at runtime.

**Note**: Another catch concerns the fact that `.` is a wild card in regex and often a path seperator in JSON, so if you aren't careful you can allow or create inconsistent rules. In general, you should escape `.` in separators to `\.` and use `([^.]+)` to match a wild card part of the attribute name (or maybe even `[a-zA-Z0-9_-]+`)
**Note**: Another catch concerns the fact that `.` is a wild card in regex and often a path separator in JSON, so if you aren't careful you can allow or create inconsistent rules. In general, you should escape `.` in separators to `\.` and use `([^.]+)` to match a wild card part of the attribute name (or maybe even `[a-zA-Z0-9_-]+`)

**Incorrect**: `^attributes.locales..+.description$` - This would match `attributesXlocalesXXXdescription`, it would also match `attributes.locales.en-US.foo.bar.description`

**Correct**: `^attributes\.locales\.([a-zA-Z0-9_-]+)\.description$`

### Validation

This package provides a concise way to validate that the operators and fields specified in the header are permitted, as well as contrain the allowed values to specific types such as Boolean, Int64, and Float64:
This package provides a concise way to validate that the operators and fields specified in the header are permitted, as well as constrain the allowed values to specific types such as Boolean, Int64, and Float64:

```go
package example
Expand Down Expand Up @@ -132,6 +131,19 @@ func Example(ast *epsearchast_v3.AstNode) error {
}
```

#### OR Filter Restrictions

By default, when using validation in this library, it will cap the complexity of OR queries to 4. The terminology we use internally is effective index intersection count and conceptually it is computed as follows:

1. The value is 1 for every leaf node in the AST.
2. For AND nodes it is the product of the children.
3. For OR nodes it is the sum of the children.

For example if you were searching for (a=1 OR b=2) AND (c=3 OR d=4 OR e=5), we compute that there might be 6 index intersections needed, (a=1,c=3),(a=1,d=4),(a=1,e=5),... This provides a heuristic to cap costs and prevent
runaway queries from being generated. It was actually intended that we look at the number of index scans needed, and maybe that's a closer measure to expense in the DB, but the math would only be slightly different.

Over time this value and argument might change as we get more experience, in the interm you can use 0 as a value to allow everything (say if the collection is small).

#### Regular Expressions

Regular Expressions can also be set when using the Validation functions, the same rules apply as for aliases (see above). In general aliases are resolved prior to validation rules and operator checks.
Expand Down Expand Up @@ -178,7 +190,7 @@ func Example(ast *epsearchast_v3.AstNode, query *gorm.DB, tenantBoundaryId strin
1. The GORM builder does not support aliases (easy MR to fix).
2. The GORM builder does not support joins (fixable in theory).
3. There is no way currently to specify the type of a field for SQL, which means everything gets written as a string today (fixable with MR).
4. The `text` operator implementation makes a number of assumptions, and you likely will want to override it's implementation:
4. The `text` operator implementation makes a number of assumptions, and you likely will want to override its implementation:
* English is hard coded as the language.
* Postgres recommends using a [distinct tsvector column and using a stored generated column](https://www.postgresql.org/docs/current/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX). The current implementation does not support this and, you would need to override the method to support it. A simple MR could be made to allow for the Gorm query builder to know if there is a tsvector column and use that.

Expand Down
50 changes: 48 additions & 2 deletions external/epsearchast/v3/ast.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,9 +19,35 @@ func (a *AstNode) AsFilter() string {
sb := strings.Builder{}
switch a.NodeType {
case "AND":
for _, c := range a.Children {
for idx, c := range a.Children {

if c.Children != nil {
sb.WriteString("(")
}
sb.WriteString(c.AsFilter())
sb.WriteString(":")

if c.Children != nil {
sb.WriteString(")")
}

if idx < len(a.Children)-1 {
sb.WriteString(":")
}
}
case "OR":
for idx, c := range a.Children {
if c.Children != nil {
sb.WriteString("(")
}
sb.WriteString(c.AsFilter())

if c.Children != nil {
sb.WriteString(")")
}

if idx < len(a.Children)-1 {
sb.WriteString("|")
}
}
default:
sb.WriteString(strings.ToLower(a.NodeType))
Expand Down Expand Up @@ -85,6 +111,8 @@ type AstVisitor interface {
PostVisit() error
PreVisitAnd(astNode *AstNode) (bool, error)
PostVisitAnd(astNode *AstNode) error
PreVisitOr(astNode *AstNode) (bool, error)
PostVisitOr(astNode *AstNode) error
VisitIn(astNode *AstNode) (bool, error)
VisitEq(astNode *AstNode) (bool, error)
VisitLe(astNode *AstNode) (bool, error)
Expand Down Expand Up @@ -123,6 +151,8 @@ func (a *AstNode) accept(v AstVisitor) error {
switch a.NodeType {
case "AND":
descend, err = v.PreVisitAnd(a)
case "OR":
descend, err = v.PreVisitOr(a)
case "IN":
descend, err = v.VisitIn(a)
case "EQ":
Expand Down Expand Up @@ -166,6 +196,12 @@ func (a *AstNode) accept(v AstVisitor) error {
case "AND":
err = v.PostVisitAnd(a)

if err != nil {
return err
}
case "OR":
err = v.PostVisitOr(a)

if err != nil {
return err
}
Expand All @@ -186,6 +222,16 @@ func (a *AstNode) checkValid() error {
if len(a.Children) < 2 {
return fmt.Errorf("and should have at least two children")
}
case "OR":
for _, c := range a.Children {
err := c.checkValid()
if err != nil {
return err
}
}
if len(a.Children) < 2 {
return fmt.Errorf("or should have at least two children")
}
case "IN":
if len(a.Children) > 0 {
return fmt.Errorf("operator %v should not have any children", strings.ToLower(a.NodeType))
Expand Down
48 changes: 48 additions & 0 deletions external/epsearchast/v3/ast_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -383,6 +383,54 @@ func TestAndReturnsErrorWithAnInvalidChild(t *testing.T) {
require.Nil(t, astNode)
}

func TestOrReturnsErrorWithOneChildren(t *testing.T) {
// Fixture Setup
// language=JSON
jsonTxt := `
{
"type": "OR",
"children": [{
"type": "EQ",
"args": [ "status", "paid"]
}]
}
`
// Execute SUT
astNode, err := GetAst(jsonTxt)

// Verify
require.Error(t, err)
require.ErrorContains(t, err, "or should have at least two children")
require.ErrorAs(t, err, &ValidationErr{})
require.Nil(t, astNode)
}

func TestOrReturnsErrorWithAnInvalidChild(t *testing.T) {
// Fixture Setup
// language=JSON
jsonTxt := `
{
"type": "OR",
"children": [{
"type": "EQ",
"args": [ "status", "paid"]
},
{
"type": "FOO"
}
]
}
`
// Execute SUT
astNode, err := GetAst(jsonTxt)

// Verify
require.Error(t, err)
require.ErrorContains(t, err, "unsupported operator foo()")
require.ErrorAs(t, err, &ValidationErr{})
require.Nil(t, astNode)
}

func TestValidObjectThatIsUrlEncodedReturnsAst(t *testing.T) {
// Fixture Setup
// language=JSON
Expand Down
141 changes: 141 additions & 0 deletions external/epsearchast/v3/ast_visitor_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -598,6 +598,137 @@ func TestPreAndPreVisitAndEqAndPostVisitCalledOnAcceptWithError(t *testing.T) {
require.ErrorContains(t, err, "foo")
}

func TestPreAndPostOrEqAndAndCalledOnAccept(t *testing.T) {
// Fixture Setup
// language=JSON
jsonTxt := `
{
"type": "OR",
"children": [
{
"type": "EQ",
"args": [ "amount", "5"]
},{
"type": "EQ",
"args": [ "amount", "5"]
}]
}
`

mockObj := new(MyMockedVisitor)
mockObj.On("PreVisit").Return(nil).
On("PostVisit").Return(nil).
On("VisitEq", mock.Anything).Return(true, nil).
On("PreVisitOr", mock.Anything).Return(true, nil).
On("PostVisitOr", mock.Anything).Return(nil)

astNode, err := GetAst(jsonTxt)
require.NoError(t, err)

// Execute SUT
err = astNode.Accept(mockObj)

// Verification
require.NoError(t, err)

}

func TestPreAndPreVisitOrCalledOnAcceptWithError(t *testing.T) {
// Fixture Setup
// language=JSON
jsonTxt := `
{
"type": "OR",
"children": [
{
"type": "EQ",
"args": [ "amount", "5"]
},{
"type": "EQ",
"args": [ "amount", "5"]
}]
}
`

mockObj := new(MyMockedVisitor)
mockObj.On("PreVisit").Return(nil).
On("PreVisitOr", mock.Anything).Return(true, fmt.Errorf("foo"))

astNode, err := GetAst(jsonTxt)
require.NoError(t, err)

// Execute SUT
err = astNode.Accept(mockObj)

// Verification
require.ErrorContains(t, err, "foo")
}

func TestPreAndPreVisitOrEqCalledOnAcceptWithError(t *testing.T) {
// Fixture Setup
// language=JSON
jsonTxt := `
{
"type": "OR",
"children": [
{
"type": "EQ",
"args": [ "amount", "5"]
},{
"type": "EQ",
"args": [ "amount", "5"]
}]
}
`

mockObj := new(MyMockedVisitor)
mockObj.On("PreVisit").Return(nil).
On("PreVisitOr", mock.Anything).Return(true, nil).
On("VisitEq", mock.Anything).Return(true, fmt.Errorf("foo"))

astNode, err := GetAst(jsonTxt)
require.NoError(t, err)

// Execute SUT
err = astNode.Accept(mockObj)

// Verification
require.ErrorContains(t, err, "foo")
}

func TestPreAndPreVisitOrEqAndPostVisitCalledOnAcceptWithError(t *testing.T) {
// Fixture Setup
// language=JSON
jsonTxt := `
{
"type": "OR",
"children": [
{
"type": "EQ",
"args": [ "amount", "5"]
},{
"type": "EQ",
"args": [ "amount", "5"]
}]
}
`

mockObj := new(MyMockedVisitor)
mockObj.On("PreVisit").Return(nil).
On("PreVisitOr", mock.Anything).Return(true, nil).
On("VisitEq", mock.Anything).Return(true, nil).
On("PostVisitOr", mock.Anything).Return(fmt.Errorf("foo"))

astNode, err := GetAst(jsonTxt)
require.NoError(t, err)

// Execute SUT
err = astNode.Accept(mockObj)

// Verification
require.ErrorContains(t, err, "foo")
}

type MyMockedVisitor struct {
mock.Mock
}
Expand All @@ -622,6 +753,16 @@ func (m *MyMockedVisitor) PostVisitAnd(astNode *AstNode) error {
return args.Error(0)
}

func (m *MyMockedVisitor) PreVisitOr(astNode *AstNode) (bool, error) {
args := m.Called(astNode)
return args.Bool(0), args.Error(1)
}

func (m *MyMockedVisitor) PostVisitOr(astNode *AstNode) error {
args := m.Called(astNode)
return args.Error(0)
}

func (m *MyMockedVisitor) VisitIn(astNode *AstNode) (bool, error) {
args := m.Called(astNode)
return args.Bool(0), args.Error(1)
Expand Down
10 changes: 10 additions & 0 deletions external/epsearchast/v3/es/es_query_builder.go
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,16 @@ func (d DefaultEsQueryBuilder) PostVisitAnd(rs []*JsonObject) (*JsonObject, erro
}), nil
}

func (d DefaultEsQueryBuilder) PostVisitOr(rs []*JsonObject) (*JsonObject, error) {
return (*JsonObject)(&map[string]interface{}{
"bool": map[string]interface{}{
"should": rs,
// https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-minimum-should-match.html
"minimum_should_match": 1,
},
}), nil
}

func (d DefaultEsQueryBuilder) VisitIn(args ...string) (*JsonObject, error) {
return (*JsonObject)(&map[string]interface{}{
"terms": map[string]interface{}{
Expand Down
Loading
Loading