Feature Request: Text (and other modality) conditioning + CFG #9

moiseshorta · 2025-01-07T01:14:20Z

Hello,

Thanks so much for open sourcing the code.

I have been training an unconditional RF model on audio latents, with really good quality results.

I've been trying to implement text conditioning using embeddings from the T5-base model, but so far I haven't gotten good results in training.
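To make the CFG part of the request concrete, here is a minimal, framework-agnostic sketch of the two pieces classifier-free guidance usually needs: dropping the condition at random during training, and blending conditional and unconditional predictions at sampling time. Plain Python lists stand in for tensors, and every name here (`maybe_drop_condition`, `cfg_combine`, `guided_velocity`, the `model(x, t, embed)` call signature, `null_embed`) is hypothetical and not taken from this repo's API.

```python
import random

def maybe_drop_condition(text_embed, null_embed, drop_prob, rng=random):
    # Training-time condition dropout: with probability drop_prob, replace the
    # text embedding with a "null" embedding so the same network also learns
    # the unconditional prediction.
    return null_embed if rng.random() < drop_prob else text_embed

def cfg_combine(v_cond, v_uncond, guidance_scale):
    # Classifier-free guidance blend, elementwise:
    #   v = v_uncond + scale * (v_cond - v_uncond)
    # scale = 0 gives the unconditional output, scale = 1 the conditional one.
    return [u + guidance_scale * (c - u) for c, u in zip(v_cond, v_uncond)]

def guided_velocity(model, x, t, text_embed, null_embed, guidance_scale):
    # Sampling-time step: query the (assumed) model twice, once with the text
    # embedding and once with the null embedding, then blend the two outputs.
    v_cond = model(x, t, text_embed)
    v_uncond = model(x, t, null_embed)
    return cfg_combine(v_cond, v_uncond, guidance_scale)
```

In a real setup the text embedding would come from the T5 encoder and the null embedding would be either learned or a fixed placeholder; which of those works better for audio latents is exactly the kind of thing I have not been able to pin down.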

Any chance this will be implemented in a future version?

Thanks again!

lucidrains · 2025-01-07T14:56:48Z

very cool! yes, I'll get around to it, but I'm backlogged with too many ongoing projects
