[Feat] Add support for llava_hf video, better loading logic for llava_hf ckpt #260

Merged on Sep 17, 2024 (1 commit)

kcz358 commented Sep 16, 2024

This PR updates llava_hf to enable evaluation of the newer series of llava models, such as llava-onevision and llava-next (stronger). It also enables video evaluation using llava_hf.

Note that video evaluation is only supported with llava onevision hf and will likely fail if you use another version of llava. Since the newest transformers version has not been released yet, you have to run

pip install git+https://github.com/huggingface/transformers.git

to install transformers from source if you want to use llava onevision hf.
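To confirm that the source install took effect before starting an evaluation, a small check like the following may help. It is only a sketch: the helper name `has_llava_onevision` is hypothetical, and it assumes that the llava onevision model is exposed by transformers as `LlavaOnevisionForConditionalGeneration`, which is not stated in this PR.

```python
import importlib
import importlib.util


def has_llava_onevision(
    module_name: str = "transformers",
    # Assumed class name for llava onevision in transformers; adjust if it differs.
    class_name: str = "LlavaOnevisionForConditionalGeneration",
) -> bool:
    """Return True if `module_name` is installed and exposes `class_name`."""
    # find_spec returns None when the top-level package is not installed.
    if importlib.util.find_spec(module_name) is None:
        return False
    try:
        module = importlib.import_module(module_name)
        return hasattr(module, class_name)
    except Exception:
        # A broken or partial install is treated the same as a missing one.
        return False


if __name__ == "__main__":
    if has_llava_onevision():
        print("transformers exposes the llava onevision class")
    else:
        print("llava onevision class not found; install transformers from source")
```

If the check reports that the class is missing, rerunning the pip command above in the same environment used for evaluation is the likely fix.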

However, after experimenting, I think the performance still differs significantly from the original llava. Thus, using this model to obtain results for the original llava or llava-onevision is not recommended. It is only recommended for those who want a quick baseline or who have finetuned their model using llava-hf.

kcz358 commented Sep 16, 2024

Video result aligned for 0.5-ov

[image attachment]

Luodian merged commit 9f8d1b4 into main on Sep 17, 2024
2 checks passed
kcz358 deleted the dev/llava_hf branch on October 22, 2024, 07:31