About Parallel encoder #4
My point is that if time t is a key timestep and t+1, t+2, t+3 are non-key, then the decoders at t+1, t+2, t+3 all reuse the features f_t from time t. According to the parallel steps in the paper, t+1, t+2, t+3 all decode f_t, but these timesteps do not run the encoder. So what is the purpose of the results obtained from this decoding? I hope I have made my question clear. Thanks!
Even though the UNet encoder is not run at non-key timesteps, the decoder still receives the shared encoder features from the most recent key timestep and outputs the predicted noise for each step.
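A minimal sketch of this idea, with hypothetical stand-in functions rather than the repo's actual API: the encoder runs only at key timesteps and its output is cached; at non-key timesteps the decoder reuses the cached features but still produces a noise prediction that advances the sampling loop.

```python
# Toy sketch of encoder-feature reuse across diffusion timesteps.
# `toy_encoder`, `toy_decoder`, and the update rule are illustrative
# placeholders, not the paper's actual UNet.

def toy_encoder(x, t):
    # Stand-in for the UNet encoder: returns a list of feature maps
    # (here just scalars) that feed the decoder's skip connections.
    return [x + t, x * 2.0]

def toy_decoder(feats, t):
    # Stand-in for the UNet decoder: consumes encoder features plus
    # the current timestep and returns the predicted noise.
    return sum(feats) - t

def denoise(x0, timesteps, key_steps):
    x = x0
    cached = None
    for t in timesteps:
        if t in key_steps:
            cached = toy_encoder(x, t)  # full pass: encode only at key steps
        # The decoder runs at EVERY step, key or not, so each non-key
        # step still yields its own noise estimate and updates the sample.
        eps = toy_decoder(cached, t)
        x = x - 0.1 * eps               # toy update rule
    return x
```

The point the sketch makes concrete: skipping the encoder does not make a non-key step useless, because the decoder's output (the noise prediction) still depends on the current timestep t and still moves the sample forward.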
Thank you for your answer; it has nicely resolved my doubts. I made a silly mistake.
Great work on the study, but I have some queries I'd like to ask.
If the timesteps considered non-key skip the encoder entirely, how are the features encoded at the key timestep used to decode images at those non-key timesteps? Since the encoder is skipped at non-key timesteps, there is no encoding at time t+1 either. Why not skip the non-key steps altogether?