[BUG] 2.15 Hybrid Search With query_text and radial vector search producing array out of bounds #973

Open
iyoung opened this issue Nov 1, 2024 · 6 comments
Assignees
Labels
bug Something isn't working hybrid search

Comments

@iyoung

iyoung commented Nov 1, 2024

What is the bug?

Running a hybrid query that sets a min_score below 0.5 on the vector side and provides a query_text for lexical search fails for certain searches (possibly related to the number of matches) with the following response:

{
    "error": {
        "root_cause": [
            {
                "type": "index_out_of_bounds_exception",
                "reason": "index_out_of_bounds_exception: null"
            }
        ],
        "type": "search_phase_execution_exception",
        "reason": "all shards failed",
        "phase": "query",
        "grouped": true,
        "failed_shards": [
            {
                "shard": 0,
                "index": "xxxxxx",
                "node": "xxxxxx",
                "reason": {
                    "type": "index_out_of_bounds_exception",
                    "reason": "index_out_of_bounds_exception: null"
                }
            }
        ]
    },
    "status": 500
}

How can one reproduce the bug?

This is the structure of the query I am using, which always throws the exception.

{
  "size": 50,
  "track_total_hits": true,
  "query": {
    "hybrid": {
      "queries": [
        {
          "knn": {
            "vector": {
              "vector": [
                -0.018432617,
                0.05419922,
                ...
              ],
              "min_score": 0.4
            }
          }
        },
        {
          "query_string": {            
            "fields": [
              "essential_words^2",
              "caption^3",
              "description^3",
              "main_words",
              "extra_words"
            ],
            "query": "\"cold weather\"",
            "default_operator": "AND"
          }
        }
      ]
    }
  },
  "post_filter": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "vector"
          }
        }
      ]
    }
  },
  "search_pipeline": {
    "description": "Inline post processor for hybrid search",
    "phase_results_processors": [
      {
        "normalization-processor": {
          "normalization": {
            "technique": "min_max"
          },
          "combination": {
            "technique": "arithmetic_mean",
            "parameters": {
              "weights": [
                0.5,
                0.5
              ]
            }
          }
        }
      }
    ]
  }
}
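
For readers unfamiliar with the pipeline settings in the request above, here is a minimal sketch of what `min_max` normalization followed by a weighted `arithmetic_mean` does in principle. It uses the textbook definitions only; the plugin's actual implementation may handle edge cases (missing scores, identical scores) differently, and the raw scores below are made-up illustrative values, not data from the failing index.

```python
def min_max(scores):
    """Scale scores to [0, 1] using the textbook min-max formula."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores equal: avoid division by zero.
        # Returning 1.0 here is an assumption of this sketch.
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def weighted_mean(values, weights):
    """Weighted arithmetic mean of one document's per-subquery scores."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical raw scores for the same three documents from each subquery.
knn_scores = [0.25, 0.5, 0.75]   # vector (knn) subquery
text_scores = [2.0, 8.0, 5.0]    # query_string subquery

knn_norm = min_max(knn_scores)
text_norm = min_max(text_scores)

# Weights 0.5 / 0.5 as in the search_pipeline of the request.
combined = [
    weighted_mean([k, t], [0.5, 0.5])
    for k, t in zip(knn_norm, text_norm)
]
print(combined)
```

Note that the exception in this report is thrown during the query phase, before this normalization step would run, so the sketch only illustrates the intended scoring and not the failure itself.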

What is the expected behaviour?

Search results are returned.

What is your host/environment?

AWS-managed OpenSearch 2.15

Do you have any additional context?

Raising the radial search threshold, that is, setting min_score on the vector search to a value between 0.5 and 1, stops the error. Changing the query term or the fields included, and therefore the set of matched items, also avoids it.

There are numerous exact phrase search terms which cause this issue for us, such as "group hiking" and "cold weather"
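
To compare thresholds and phrases systematically, the request bodies could be generated rather than edited by hand. The following is a rough sketch, assuming the same request structure as in the reproduction section; the function name `build_hybrid_query` and the three-element vector are placeholders of this sketch (the real query vector is elided in this report), and sending the bodies to a cluster is left out.

```python
import json


def build_hybrid_query(vector, phrase, min_score, fields):
    """Return a hybrid search body shaped like the failing request."""
    return {
        "size": 50,
        "track_total_hits": True,
        "query": {
            "hybrid": {
                "queries": [
                    # Radial vector search: min_score instead of k.
                    {"knn": {"vector": {"vector": vector, "min_score": min_score}}},
                    {
                        "query_string": {
                            "fields": fields,
                            "query": f'"{phrase}"',  # exact-phrase search
                            "default_operator": "AND",
                        }
                    },
                ]
            }
        },
        "post_filter": {"bool": {"must": [{"exists": {"field": "vector"}}]}},
    }


# Placeholder vector; NOT the vector from the failing query.
placeholder_vector = [0.1, 0.2, 0.3]
fields = ["essential_words^2", "caption^3", "description^3", "main_words", "extra_words"]

# One body per threshold on either side of the 0.5 boundary.
bodies = {
    ms: build_hybrid_query(placeholder_vector, "cold weather", ms, fields)
    for ms in (0.3, 0.4, 0.5, 0.6)
}
print(json.dumps(bodies[0.4]["query"]["hybrid"]["queries"][1], indent=2))
```

The inline `search_pipeline` from the original request is omitted here for brevity and would need to be added back before comparing results.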

In the error logs I get:

Failed to execute phase [query], all shards failed; shardFailures {[xxxxxxx][xxxxxxx][0]: RemoteTransportException[[f6b3bd921afde7de466c508873aece19][__IP__][__PATH__[__PATH__]]]; nested: QueryPhaseExecutionException[Query Failed [Failed to execute main query]]; nested: NotSerializableExceptionWrapper[index_out_of_bounds_exception: null]; }
	at org.opensearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:780)
	at org.opensearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:397)
	at org.opensearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:820)
	at org.opensearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:558)
	at org.opensearch.action.search.AbstractSearchAsyncAction$1.onFailure(AbstractSearchAsyncAction.java:318)
	at org.opensearch.action.search.SearchExecutionStatsCollector.onFailure(SearchExecutionStatsCollector.java:104)
	at org.opensearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:75)
	at org.opensearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:764)
	at org.opensearch.transport.TransportService$9.handleException(TransportService.java:1729)
	at org.opensearch.security.transport.SecurityInterceptor$RestoringTransportResponseHandler.handleException(SecurityInterceptor.java:436)
	at org.opensearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1515)
	at org.opensearch.transport.NativeMessageHandler.lambda$handleException$5(NativeMessageHandler.java:454)
	at org.opensearch.common.util.concurrent.OpenSearchExecutors$DirectExecutorService.execute(OpenSearchExecutors.java:412)
	at org.opensearch.transport.NativeMessageHandler.handleException(NativeMessageHandler.java:452)
	at org.opensearch.transport.NativeMessageHandler.handlerResponseError(NativeMessageHandler.java:444)
	at org.opensearch.transport.NativeMessageHandler.handleMessage(NativeMessageHandler.java:172)
	at org.opensearch.transport.NativeMessageHandler.messageReceived(NativeMessageHandler.java:126)
	at org.opensearch.transport.InboundHandler.messageReceivedFromPipeline(InboundHandler.java:121)
	at org.opensearch.transport.InboundHandler.inboundMessage(InboundHandler.java:113)
	at org.opensearch.transport.TcpTransport.inboundMessage(TcpTransport.java:800)
	at org.opensearch.transport.nativeprotocol.NativeInboundBytesHandler.forwardFragments(NativeInboundBytesHandler.java:157)
	at org.opensearch.transport.nativeprotocol.NativeInboundBytesHandler.doHandleBytes(NativeInboundBytesHandler.java:94)
	at org.opensearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:143)
	at org.opensearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:119)
	at org.opensearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:95)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
	at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:280)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1475)
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1338)
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1387)
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1407)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:918)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:994)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at __PATH__(Thread.java:1583)
Caused by: NotSerializableExceptionWrapper[index_out_of_bounds_exception: null]
	at java.nio.Buffer$1.apply(Buffer.java:757)
	at java.nio.Buffer$1.apply(Buffer.java:754)
	at jdk.internal.util.Preconditions$4.apply(Preconditions.java:213)
	at jdk.internal.util.Preconditions$4.apply(Preconditions.java:210)
	at jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:98)
	at jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:106)
	at jdk.internal.util.Preconditions.checkIndex(Preconditions.java:302)
	at java.nio.Buffer.checkIndex(Buffer.java:768)
	at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:342)
	at org.apache.lucene.store.ByteBufferGuard.getByte(ByteBufferGuard.java:119)
	at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.readByte(ByteBufferIndexInput.java:583)
	at org.apache.lucene.codecs.lucene90.Lucene90NormsProducer$3.longValue(Lucene90NormsProducer.java:389)
	at org.apache.lucene.search.LeafSimScorer.getNormValue(LeafSimScorer.java:47)
	at org.apache.lucene.search.LeafSimScorer.score(LeafSimScorer.java:60)
	at org.apache.lucene.search.PhraseScorer.score(PhraseScorer.java:83)
	at org.apache.lucene.search.DisjunctionMaxScorer.score(DisjunctionMaxScorer.java:65)
	at org.apache.lucene.search.DisjunctionScorer.score(DisjunctionScorer.java:178)
	at org.opensearch.neuralsearch.query.HybridQueryScorer.hybridScores(HybridQueryScorer.java:193)
	at org.opensearch.neuralsearch.search.collector.HybridTopScoreDocCollector$1.collect(HybridTopScoreDocCollector.java:100)
	at org.opensearch.common.lucene.search.FilteredCollector$1.collect(FilteredCollector.java:79)
	at org.apache.lucene.search.MultiCollector$MultiLeafCollector.collect(MultiCollector.java:226)
	at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreRange(Weight.java:296)
	at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:236)
	at org.opensearch.search.internal.CancellableBulkScorer.score(CancellableBulkScorer.java:71)
	at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:38)
	at org.opensearch.search.internal.ContextIndexSearcher.searchLeaf(ContextIndexSearcher.java:334)
	at org.opensearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:285)
	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:552)
	at org.opensearch.search.query.QueryPhase.searchWithCollector(QueryPhase.java:361)
	at org.opensearch.search.query.QueryPhase$DefaultQueryPhaseSearcher.searchWithCollector(QueryPhase.java:468)
	at org.opensearch.neuralsearch.search.query.HybridQueryPhaseSearcher$DefaultQueryPhaseSearcherWithEmptyQueryCollectorContext.searchWithCollector(HybridQueryPhaseSearcher.java:199)
	at org.opensearch.search.query.QueryPhase$DefaultQueryPhaseSearcher.searchWith(QueryPhase.java:438)
	at org.opensearch.neuralsearch.search.query.HybridQueryPhaseSearcher.searchWith(HybridQueryPhaseSearcher.java:65)
	at org.opensearch.search.query.QueryPhase.executeInternal(QueryPhase.java:284)
	at org.opensearch.search.query.QueryPhase.execute(QueryPhase.java:157)
	at org.opensearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:589)
	at org.opensearch.search.SearchService.executeQueryPhase(SearchService.java:653)
	at org.opensearch.search.SearchService$2.lambda$onResponse$0(SearchService.java:622)
	at org.opensearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:74)
	at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:89)
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
	at org.opensearch.threadpool.TaskAwareRunnable.doRun(TaskAwareRunnable.java:78)
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
	at org.opensearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:59)
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:950)
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.lang.Thread.run(Thread.java:1583)
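
Because most of the trace above is transport and Netty plumbing, it can help to filter it down to the frames from the packages of interest. The helper below is a small sketch for doing that; the function name `frames_in_packages` and the regular expression are this sketch's own, and the sample is a four-line excerpt copied from the "Caused by" section above.

```python
import re

# Matches lines like "at pkg.Class$1.method(File.java:123)".
FRAME = re.compile(r"^\s*at\s+([\w.$]+)\.([\w$<>]+)\(([^)]*)\)")


def frames_in_packages(trace, prefixes):
    """Return (class, method, location) for frames whose class starts with a prefix."""
    found = []
    for line in trace.splitlines():
        m = FRAME.match(line)
        if m and m.group(1).startswith(tuple(prefixes)):
            found.append(m.groups())
    return found


sample = """
    at org.apache.lucene.codecs.lucene90.Lucene90NormsProducer$3.longValue(Lucene90NormsProducer.java:389)
    at org.apache.lucene.search.LeafSimScorer.getNormValue(LeafSimScorer.java:47)
    at org.apache.lucene.search.PhraseScorer.score(PhraseScorer.java:83)
    at org.opensearch.neuralsearch.query.HybridQueryScorer.hybridScores(HybridQueryScorer.java:193)
"""

for cls, method, where in frames_in_packages(sample, ["org.apache.lucene", "org.opensearch.neuralsearch"]):
    print(f"{cls}.{method} ({where})")
```

On the full trace, this narrows the view to the path from `HybridQueryScorer.hybridScores` through `PhraseScorer.score` down to the norms read in `Lucene90NormsProducer`.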
@iyoung iyoung added bug Something isn't working untriaged labels Nov 1, 2024
@minalsha minalsha removed the untriaged label Nov 4, 2024
@martin-gaievski
Member

@iyoung thank you for informing us about this scenario.
I need some additional information from you: what is the mapping for your index, what is the index configuration (number of nodes, primary and replica shards), and how many documents do you have? Do you expect the failing query to return search hits and, if so, approximately how many?

I have tried the following scenario, and it works fine on my side:

  1. create index with knn vector field
PUT /index-test
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
      "vector": {
        "type": "knn_vector",
        "dimension": 3,
        "method": {
          "name": "hnsw",
          "space_type": "l2",
          "engine": "lucene"
        }
      },
      "field1": {
        "type": "integer"
      },
      "name": {
        "type": "text"
      }
    }
  }
}
  2. ingest several documents with vectors and text fields:
POST /index-test/_bulk?refresh
{"index":{}}
{"field1": 2,"vector": [0.4, 0.5, 0.2],"title": "basic", "name": "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .", "category": "novel", "price": 20}
{"index":{}}
{ "name": "I brought home the trophy", "category": "story", "price": 20, "field1": 10,"vector": [0.2, 0.2, 0.3],"title": "java"}
{"index":{}}
{"field1": 50,"vector": [4.2, 5.5, 8.9],"name": "Why would he go to all that effort for a free pack of ranch dressing?", "category": "story", "price": 10 }
{"index":{}}
{"vector": [0.3, 0.12, 3.3],"title": "python","name": "In the next 40-50 years I plan on opening up my own business.","category": "poem","price": 100}
{"index":{}}
{  "field1": 100,"vector": [0.2, 0.2, 0.3],"title": "java", "name": "Does he have a big family?", "category": "biography", "price": 70}
{"index":{}}
{"name": "She is my younger sister","category": "workbook","price": 25}
  3. run search with hybrid query
GET /index-test/_search
{
    "size": 50,
    "track_total_hits": true,
    "query": {
        "hybrid": {
            "queries": [
                {
                    "knn": {
                        "vector": {
                            "vector": [
                                0.15,
                                0.3,
                                1.1
                            ],
                            "min_score": 0.2
                        }
                    }
                },
                {
                    "query_string": {
                        "fields": [
                            "title^2",
                            "name^3"
                        ],
                        "query": "\"small\"",
                        "default_operator": "AND"
                    }
                }
            ]
        }
    },
    "post_filter": {
        "bool": {
            "must": [
                {
                    "exists": {
                        "field": "vector"
                    }
                }
            ]
        }
    },
    "search_pipeline": {
        "description": "Inline post processor for hybrid search",
        "phase_results_processors": [
            {
                "normalization-processor": {
                    "normalization": {
                        "technique": "min_max"
                    },
                    "combination": {
                        "technique": "arithmetic_mean",
                        "parameters": {
                            "weights": [
                                0.5,
                                0.5
                            ]
                        }
                    }
                }
            }
        ]
    }
}

I tried multiple search words and different values of min_score.

@max-shyman

max-shyman commented Nov 8, 2024

I see the same for my query after upgrading from 2.13 to 2.15.
It is exactly as described by @iyoung and also very similar to the description in #497.

Especially this:

Honestly, it's very hard to reproduce the bug.
It only happens for Hybrid search.
I observe a pattern that queries with more than one word tend to be more likely to have this error than simple queries. Queries that failed are like "horror movies", "teen mom", "news radio".
I also observed that when I changed the index data, some queries started working, and other queries started failing.

The issue happens randomly and can only be reproduced for a window of several minutes or hours. Afterwards I cannot reproduce it with exactly the same query (probably because changes to the index data affect this).

2 nodes, 1 primary shard, 1 replica shard, ~600k documents (~13GB), hnsw, faiss

The query is mostly the same as the topic starter's, but with three subqueries (text search plus two knn). There is also no min_score on the knn queries (because it doesn't exist in 2.13); instead, each knn subquery is wrapped in a function_score with its own min_score.

Any ideas?
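
For reference, the query shape described in this comment (one text subquery plus two knn subqueries, each knn wrapped in `function_score` with its own `min_score`) might look roughly like the sketch below. The field names `vector_a`, `vector_b` and `title`, the `k` value, the vectors and the thresholds are all hypothetical; the actual query was not posted.

```python
import json


def knn_with_min_score(field, vector, k, min_score):
    """Wrap a knn query in function_score so a min_score can be applied to it."""
    return {
        "function_score": {
            "query": {"knn": {field: {"vector": vector, "k": k}}},
            "min_score": min_score,
        }
    }


# Hypothetical field names and values, for illustration only.
body = {
    "query": {
        "hybrid": {
            "queries": [
                {"query_string": {"fields": ["title"], "query": '"horror movies"'}},
                knn_with_min_score("vector_a", [0.1, 0.2, 0.3], k=100, min_score=0.4),
                knn_with_min_score("vector_b", [0.3, 0.2, 0.1], k=100, min_score=0.4),
            ]
        }
    }
}
print(json.dumps(body, indent=2))
```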

@iyoung
Author

iyoung commented Nov 8, 2024

Thank you. I am in contact with the OpenSearch managed service team within AWS about this issue and am looking to replicate it with a smaller index. The index we're using has around 2 million documents. Once I have a more concrete way to reproduce this, I will post an update. Thank you for replying.
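
One possible starting point for a smaller reproduction is to generate a synthetic `_bulk` payload for the `index-test` mapping used in the reproduction attempt earlier in this thread. This is only a sketch: the function name `bulk_lines`, the phrase list, the document count and the share of documents without a vector are assumptions, and there is no evidence yet that such data triggers the exception.

```python
import json
import random


def bulk_lines(n_docs, dim, phrases, seed=0):
    """Build an NDJSON _bulk body with random vectors and phrase text."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    lines = []
    for i in range(n_docs):
        lines.append(json.dumps({"index": {}}))
        doc = {"name": rng.choice(phrases), "field1": i}
        # Leave roughly one document in ten without a vector, since the
        # failing request filters on the existence of the vector field.
        if rng.random() > 0.1:
            doc["vector"] = [round(rng.uniform(-1, 1), 4) for _ in range(dim)]
        lines.append(json.dumps(doc))
    # The _bulk API expects a trailing newline.
    return "\n".join(lines) + "\n"


payload = bulk_lines(
    n_docs=1000,
    dim=3,  # matches "dimension": 3 in the index-test mapping
    phrases=["cold weather hiking", "group hiking trip", "warm weather"],
)
print(payload[:200])
```

The payload can then be sent with `POST /index-test/_bulk?refresh`, as in the ingestion step of the reproduction attempt.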

@minalsha
Collaborator

minalsha commented Dec 2, 2024

@iyoung any update on this issue with reproducing the bug? Thanks

@minalsha
Collaborator

Hi @iyoung any update on this issue with reproducing the bug? Thanks

@martin-gaievski
Member

@iyoung do you have any updates on the steps to reproduce this? If not, I plan to close this issue next week.
