When I added more examples to the documents array in the tf-idf example, the wrong document was reported as the most similar. For me, with scikit-learn version 0.24.1, the cosine similarities don't include the input document, so the index 'i' is one less than the index of the corresponding document in the documents array. The most similar document is therefore documents[highest_score_index + 1].
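A minimal sketch of the off-by-one described above, assuming the example scores the first document against the remaining ones. The original example code is not shown here, so that layout is an assumption. To keep the sketch dependency-free, the `tfidf_vectors` and `cosine` helpers are hypothetical hand-rolled stand-ins for scikit-learn's vectorizer and cosine similarity, not the library's API. The sample sentences are also made up.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Hypothetical stand-in for a TF-IDF vectorizer: one {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


documents = [
    "the cat sat on the mat",      # index 0: the input document
    "stock prices fell sharply",   # index 1
    "a cat sat on a mat",          # index 2
    "rain is expected tomorrow",   # index 3
]

vectors = tfidf_vectors(documents)

# Assumption: the input document is scored against documents[1:] only,
# so scores[i] belongs to documents[i + 1], not documents[i].
scores = [cosine(vectors[0], v) for v in vectors[1:]]
highest_score_index = max(range(len(scores)), key=scores.__getitem__)

wrong = documents[highest_score_index]      # off by one: picks an unrelated document
right = documents[highest_score_index + 1]  # shifted back into documents' indexing

print("wrong:", wrong)
print("right:", right)
```

With only two documents the unshifted lookup goes unnoticed, which may be why the problem only appeared after adding more examples.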
Thank you! I had the same issue and spent quite some time messing around until I saw your comment.