
[Feature] What is the meaning of argument "multi_phased_distill_schedule" in distill.py #130

Open
1145284121 opened this issue Jan 4, 2025 · 6 comments


@1145284121

Motivation

What does the parameter multi_phased_distill_schedule in distill.py mean? Its default is "4000-1", but the 4000 seems to be unused; moreover, the 1 corresponds to only one segment in PCM (I recall it being 4 or 8 in PCM). What is the reasoning behind this?

Related resources

No response

@rlsu9
Collaborator

rlsu9 commented Jan 4, 2025

Thanks for your feedback!

Yes, our ablation experiments suggest that multi-step distillation does not significantly improve quality. In this context, 4000 refers to the number of steps used for multi-phase distillation. For example, we first distill the model into 8 segments, then distill those into 4 segments. You can refer to this experiment for more details.
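For illustration, here is a minimal sketch of how a "steps-segments" schedule string could be parsed into phases. The helper name `parse_schedule` and the comma-separated multi-phase format are assumptions based on this thread, not necessarily what distill.py actually does:

```python
def parse_schedule(schedule: str) -> list[tuple[int, int]]:
    """Parse e.g. "4000-1" -> [(4000, 1)]: 4000 steps with 1 segment.

    A hypothetical multi-phase recipe such as "2000-8,2000-4" would
    first distill into 8 segments, then continue into 4 segments.
    """
    phases = []
    for phase in schedule.split(","):
        steps, segments = phase.split("-")
        phases.append((int(steps), int(segments)))
    return phases
```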

@1145284121
Author


Thank you for your reply. So, 4000-1 is equivalent to directly distilling a consistency model with only one sub-segment using 4000 steps, right? Are the 6-step checkpoints of the released HunyuanVideo obtained through distillation with 4000-1?

@rlsu9 rlsu9 closed this as completed Jan 5, 2025
@rlsu9 rlsu9 reopened this Jan 5, 2025
@rlsu9
Collaborator

rlsu9 commented Jan 5, 2025

1. *4000-1 is equivalent to directly distilling a consistency model with only one sub-segment using 4000 steps?*
   Yes.
2. *Are the 6-step checkpoints of the released HunyuanVideo obtained through distillation with 4000-1?*
   Yes, you can refer to this script for our final distill recipe.

@1145284121
Author


Thank you very much for your response. I have another question: in the code, the 6-step inference does not iteratively re-add noise as LCM does, but instead solves directly with iterative Euler steps. Is this because Hunyuan's pre-training is based on flow matching, so fine-tuning the v-prediction model yields better results? Or did your team have other considerations?

```python
# compute the previous noisy sample x_t -> x_t-1
latents = self.scheduler.step(
    noise_pred, t, latents, **extra_step_kwargs, return_dict=False
)[0]
```

in `step`:

```python
if self.config.solver == "euler":
    prev_sample = sample + model_output.to(torch.float32) * dt
```
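For context, here is a minimal sketch of the few-step sampling loop this Euler step implies, assuming a flow-matching model whose output is a velocity (the function signature, sigma schedule, and dt computation are simplified assumptions, not FastVideo's exact pipeline):

```python
import torch

@torch.no_grad()
def euler_sample(model, latents, sigmas, prompt_embeds):
    # sigmas: descending noise levels, e.g. 7 values from 1.0
    # down to 0.0 for 6-step inference
    for i in range(len(sigmas) - 1):
        # the model predicts the velocity (v-prediction) at the current level
        v = model(latents, sigmas[i], prompt_embeds)
        dt = sigmas[i + 1] - sigmas[i]  # negative: stepping toward data
        # first-order Euler update of the probability-flow ODE
        latents = latents + v.to(torch.float32) * dt
    return latents
```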

@rlsu9
Collaborator

rlsu9 commented Jan 5, 2025

Hi, I am not quite sure what you mean by "iteratively add noise like LCM." But I guess your question about the denoising part might be related to Section 4.1 (Consistency Distillation) of the LCM paper.

@1145284121
Author

1145284121 commented Jan 7, 2025

> add noise like LCM

Thanks for your reply. "Add noise like LCM" refers to predicting the clean sample at each step of multi-step inference and then re-adding noise via $\hat{\mathbf{x}}_{\tau_n} \leftarrow \mathbf{x} + \sqrt{\tau_n^2 - \epsilon^2}\,\mathbf{z}$ (as shown in the figure below, taken from the Consistency Models paper). In FastVideo, multi-step inference is instead achieved by integrating the v-prediction, i.e., $v \cdot dt$, which appears to differ from the theory of Consistency Models?

[Figure: multistep consistency sampling illustration from the Consistency Models paper]
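For comparison, a minimal sketch of the multistep consistency sampling being described, following Algorithm 1 of the Consistency Models paper (`f_theta` and the `taus` schedule are placeholders, not FastVideo code):

```python
import torch

@torch.no_grad()
def cm_multistep_sample(f_theta, x_T, T, taus, eps=0.002):
    # Multistep consistency sampling (Consistency Models, Alg. 1).
    # taus: decreasing noise levels T > tau_1 > ... > tau_{N-1} > eps.
    x = f_theta(x_T, T)  # one-step jump to a clean estimate
    for tau in taus:
        z = torch.randn_like(x)
        # re-noise the clean estimate back up to level tau:
        # x_hat_tau = x + sqrt(tau^2 - eps^2) * z
        x_hat = x + (tau**2 - eps**2) ** 0.5 * z
        x = f_theta(x_hat, tau)  # denoise again in a single step
    return x
```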
