
Inference Speed #47

Open
vefalun opened this issue Mar 5, 2025 · 1 comment

vefalun commented Mar 5, 2025

Hello, I would like to ask: roughly what inference speed can the model reach after deployment?

Thanks for your great work!

XMHZZ2018 (Contributor) commented

@vefalun

Thanks for your interest in our work! The inference speed largely depends on the data (e.g., image resolution) and the available computing resources (for batching the inputs). Additionally, we have several versions of VLM2Vec, each with a different number of parameters. For reference, on MMEB-eval, a 7B model takes about 10 ~ 20 GPU hours on an H100. Our model is also integrated into vLLM, which I believe further enhances inference speed. Let me know if this answers your question!
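For anyone benchmarking this themselves, here is a minimal sketch of batched embedding inference through vLLM's pooling API. It assumes a recent vLLM release that supports `task="embed"` and uses a placeholder checkpoint ID (`TIGER-Lab/VLM2Vec-Full`); it also only shows text-side inputs, so check the repo README for the exact model ID, prompt format, and how to pass images:

```python
# Minimal sketch: batched embedding inference with vLLM's pooling API.
# Assumptions: a recent vLLM version with task="embed" support, and a
# placeholder VLM2Vec checkpoint ID -- adjust both to your setup.
from vllm import LLM

llm = LLM(model="TIGER-Lab/VLM2Vec-Full", task="embed")  # assumed checkpoint ID

prompts = [
    "Represent the given caption for retrieval: a dog running on the beach",
    "Represent the given caption for retrieval: a bowl of fresh fruit",
]

# vLLM batches the requests internally; each output carries one embedding vector.
outputs = llm.embed(prompts)
for prompt, out in zip(prompts, outputs):
    vec = out.outputs.embedding
    print(f"{prompt[:40]}... -> {len(vec)}-dim embedding")
```

Throughput will still depend on image resolution, batch size, and the GPU, as noted above, so this sketch is only a starting point for measuring speed on your own data.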
