
A strange bug #27

Open
LiJunscs opened this issue Dec 30, 2024 · 1 comment

Comments

@LiJunscs

LiJunscs commented Dec 30, 2024

When I evaluate videoxl on the videomme dataset with lmms-eval, I want to implement a function that saves the evaluation results when the run is interrupted. To test this, I created a debug dataset containing the first 12 rows of videomme, covering 4 videos and 12 questions.

The bug is as follows:
1. When I use all 12 samples, the answers to the first 3 questions are C, A, C respectively.
2. When I use only the first 3 questions, the answers are A, A, A.
3. When I use only the first question, the answer is A.
4. When I resume the evaluation from the cached results, the answers to questions 0 and 2 are A and C.

The results cached at interruption are correct and load successfully. Note that only the serializable parts of Instance (lmms_eval.api.instance) are cached; the other fields are set to None. If the evaluation finishes without interruption, the caching function is never called and the whole process is the same as the original.
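
For reference, here is a minimal sketch of the kind of interrupt-time caching I mean (the helper name and fields are illustrative, not my actual patch):

```python
import pickle

def cache_instances(instances, path):
    """Save only the picklable parts of each lmms-eval Instance; drop the rest."""
    slim = []
    for inst in instances:
        record = {}
        for key, value in vars(inst).items():
            try:
                pickle.dumps(value)   # keep fields that serialize cleanly
                record[key] = value
            except Exception:
                record[key] = None    # non-serializable parts are set to None
        slim.append(record)
    with open(path, "wb") as f:
        pickle.dump(slim, f)
```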

Is this a problem on my side, in videoxl, or in lmms-eval?

When I evaluate qwen2-vl with lmms-eval, the answers to the first 3 questions are always A, A, A.

@shuyansy
Collaborator

Sorry for the late reply. I suppose this is because the compression ratio is sampled randomly under the default settings. To fix it, you can set self.model.config.beacon_ratio = [8] in the Python file to use a fixed compression ratio.
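
A minimal illustration of the suggested change, assuming the model wrapper used by lmms-eval exposes the loaded model as `self.model` (the exact file and attribute path depend on your setup):

```python
# In the videoxl model wrapper used by lmms-eval, after the model is loaded:
self.model.config.beacon_ratio = [8]  # fixed compression ratio instead of the random default
```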
