Some badcase. #9

Open
ApolloRay opened this issue Oct 30, 2024 · 1 comment

Comments

@ApolloRay

I ran inference with the model on a few cases and found that it does not handle fine details well. For example, for this link http, my prompt was "When was the goal scored in the game and provide a specific match time", but the output was "The goal was scored towards the end of the match, specifically at a match time of 86.42". I suspect that because CLIP is used for encoding, the loss of detail is still quite significant.

@shuyansy
Collaborator

I acknowledge that the current version of VideoXL still has limited capability in some domains. In the case you provided, it is weak at recognizing on-screen text in video and at spotting sports events. We will add more data to improve these abilities in the future.
