The quality of the reproduced results is a little worse #17
It's due to reconstruction issues; try higher DDIM Inversion steps. You need 1000 DDIM Inversion steps to get better results. All our demo results were generated with 1000 inversion steps, but the HuggingFace demo page GPU has limited uptime, so we made 100 steps the default.
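For context on why more inversion steps help: DDIM inversion approximates the reverse ODE with a finite number of steps, and the discretization error shrinks as the step count grows. This can be illustrated with a toy numpy sketch (this is not AnyV2V's actual pipeline; `eps_pred` here is a hypothetical stand-in for the diffusion model's noise predictor):

```python
import numpy as np

def alphas_cumprod(T=1000):
    # linear beta schedule as in DDPM; abar_t = prod(1 - beta_s)
    betas = np.linspace(1e-4, 0.02, T)
    return np.cumprod(1.0 - betas)

def eps_pred(x, t):
    # toy deterministic "noise predictor" standing in for the UNet
    return 0.1 * x

def ddim_invert(x0, steps, abar):
    # map a clean sample to a latent with `steps` DDIM inversion steps
    ts = np.linspace(0, len(abar) - 1, steps + 1).astype(int)
    x = x0
    for i in range(steps):
        a_cur, a_next = abar[ts[i]], abar[ts[i + 1]]
        eps = eps_pred(x, ts[i])
        x0_hat = (x - np.sqrt(1 - a_cur) * eps) / np.sqrt(a_cur)
        x = np.sqrt(a_next) * x0_hat + np.sqrt(1 - a_next) * eps
    return x

def ddim_sample(xT, steps, abar):
    # deterministic DDIM sampling (eta = 0), reversing the schedule above
    ts = np.linspace(0, len(abar) - 1, steps + 1).astype(int)
    x = xT
    for i in range(steps, 0, -1):
        a_cur, a_prev = abar[ts[i]], abar[ts[i - 1]]
        eps = eps_pred(x, ts[i])
        x0_hat = (x - np.sqrt(1 - a_cur) * eps) / np.sqrt(a_cur)
        x = np.sqrt(a_prev) * x0_hat + np.sqrt(1 - a_prev) * eps
    return x

abar = alphas_cumprod()
x0 = np.ones(4)
for steps in (50, 1000):
    x_rec = ddim_sample(ddim_invert(x0, steps, abar), steps, abar)
    print(steps, np.abs(x_rec - x0).max())
```

Running this, the invert-then-sample reconstruction error at 1000 steps is smaller than at 50 steps, which mirrors why the 100-step demo default reconstructs worse than the 1000-step setting used for the paper results.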
Thank you for your reply. I increased the steps to 999, and the results are indeed a bit better, but not significantly. I originally thought the sequential images in the demo folder were generated, but it turns out they are the original video frames. As for the video description prompt, should I just write the original video's name there? I've tried leaving it blank and adding "snowing", and the results become quite strange.
The description prompt should be the prompt for your target output (e.g., "A couple in a public display of affection, snowing").
In fact, the result is worse with "snowing", and the character's face breaks up, as shown in the video below: snowing_100.mp4
Were you able to reproduce the results in the examples from the demo? I think the seed and the injection-weight hyperparameters matter, but indeed AnyV2V is not robust enough.
Ok, thanks for your reply, I will try I2VEdit.
The quality of the reproduced results is a little worse, and I don't know what went wrong. I used the following parameters, run on a 3090 under WSL:
edited_video.mp4