Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update similarity_fn_name = 'cosine' to MaxSim #77

Open
sam-hey opened this issue Jan 13, 2025 · 4 comments
Open

Update similarity_fn_name = 'cosine' to MaxSim #77

sam-hey opened this issue Jan 13, 2025 · 4 comments
Assignees

Comments

@sam-hey
Copy link

sam-hey commented Jan 13, 2025

When loading a model, the similarity_fn_name attribute defaults to cosine instead of max_sim. Since SentenceTransformer does not support max_sim natively, this functionality must be implemented using pylate.

self.model = colbert_model.ColBERT("colbert-ir/colbertv2.0")
@NohTow
Copy link
Collaborator

NohTow commented Jan 13, 2025

Hello,
Yes indeed, I did not overwrite it as we are not directly using it in the training/inference code.
It happens that I am working on another PR to add NanoBEIREvaluator and I added the SimilarityFunctions to PyLate, with only MaxSim for now.
See #76 for when it gets merged.
Also, at some point I should change the code to use these calls instead of directly calling scores in the training and inference, but that will be another PR.

@NohTow NohTow self-assigned this Jan 13, 2025
@sam-hey
Copy link
Author

sam-hey commented Jan 13, 2025

Awesome, thanks for the quick response—I'm really looking forward to it!

By the way, I absolutely love this project—fantastic work!

@NohTow
Copy link
Collaborator

NohTow commented Jan 13, 2025

Thanks for the kind words!
Actually, could you try the branch with your use case (I suppose you have one) and report if anything is not working for you?
Maybe it will be in another PR, but it's nice to see if there is something wrong that I could fix.
The PR itself should be merged soon, but if you want to try it now, it might prevent merging a bug!

@sam-hey
Copy link
Author

sam-hey commented Jan 13, 2025

Sure, it works for me without any issues!
Thanks for the implementation—I’m looking forward to the new version being available!

sam-hey added a commit to sam-hey/mteb that referenced this issue Jan 13, 2025
@sam-hey sam-hey changed the title Update similarity_fn_name = 'cosine' to max_sim Update similarity_fn_name = 'cosine' to MaxSim Jan 13, 2025
KennethEnevoldsen pushed a commit to embeddings-benchmark/mteb that referenced this issue Jan 17, 2025
* add dotwrapper

* lint

* make cleaner

* add poc similarity_fn in ModelMeta

* ref: rename EvaluationFunction to ScoringFunction

Co-authored-by: Isaac Chung <[email protected]>

* make cos_sim default

* Revert "make cleaner"

This reverts commit 7d1e949.

* Revert "add dotwrapper"

This reverts commit d71718b.

* lint

* fix: _run_eval no co tracking

* fix: bm25s

* add enum to models

* add mapping st sim fn name to mteb sim fn name

* fix model meta use new fn for sim operators

* add max_sim

* fix: colbert &  rm similarity_fn_name

* ci: skip AfriSentiLID for now (#1785)

* skip AfriSentiLID for now

* skip relevant test case instead

---------

Co-authored-by: Isaac Chung <[email protected]>

* test: add test for bm25s and ColBERT

* lint

* feat: add mapping for max_sim from pylate

lightonai/pylate#77

* test: bm25s skip

* fix: MaxSim as max_sim match pylate &  rm Enum in models

* rm enum

* update tests skip

---------

Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: sam021313 <[email protected]>
Co-authored-by: Isaac Chung <[email protected]>
Co-authored-by: Isaac Chung <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants