What is the VRAM requirement for 14B models? #41
In the README, we learned that the 1.3B model requires as little as 8 GB of VRAM to run. What about the 14B models? Any suggestions?
Comments
I tested the 1.3B model, and it actually used 24 GB of video memory.
I tested the 14B model, and a total of 40 GB × 8 GPUs of memory was used.
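For reference, the repo's README documents a multi-GPU path for the 14B model that shards the DiT and T5 with FSDP and uses Ulysses sequence parallelism, which lines up with the 8-GPU report above. A minimal sketch based on that README recipe; the checkpoint directory and prompt are placeholders:

```bash
# Multi-GPU 14B text-to-video, sharded across 8 GPUs (per the repo README).
# ./Wan2.1-T2V-14B and the prompt below are placeholders.
torchrun --nproc_per_node=8 generate.py \
  --task t2v-14B \
  --size 1280*720 \
  --ckpt_dir ./Wan2.1-T2V-14B \
  --dit_fsdp --t5_fsdp --ulysses_size 8 \
  --prompt "Two anthropomorphic cats boxing on a spotlighted stage"
```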
14B works even on 24 GB with optimization, but it takes around 3 hours on an RTX 3090 Ti; it takes about 30 minutes on an H200 :D and used 56 GB of VRAM in my app.
The 1.3B model actually doesn't run on 8 GB of VRAM; if it does, then the documentation in the README is not up to date. I have been at it since yesterday, trying everything I can with my 3060 12 GB, but I am unable to get this to work.
1.3B works with as little as 3.5 GB of VRAM, and at full usage it takes around 6.5 GB, so it should run perfectly on 8 GB GPUs. Here is a step-by-step tutorial where I show this with evidence: https://youtu.be/hnAhveNy-8s
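The low-VRAM behavior reported above matches the memory-saving flags the README documents for single-GPU 1.3B runs: offloading model weights to CPU between steps and keeping the T5 encoder on CPU. A sketch along the lines of the README's example (the README quotes roughly 8 GB of VRAM with these options); paths and prompt are placeholders:

```bash
# Single-GPU 1.3B text-to-video with the README's memory-saving flags.
# --offload_model moves weights to CPU between steps; --t5_cpu keeps T5 off the GPU.
python generate.py \
  --task t2v-1.3B \
  --size 832*480 \
  --ckpt_dir ./Wan2.1-T2V-1.3B \
  --offload_model True --t5_cpu \
  --sample_shift 8 --sample_guide_scale 6 \
  --prompt "Two anthropomorphic cats boxing on a spotlighted stage"
```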
An Nvidia L40 with 48 GB of VRAM is not enough for the 14B model, even at size 480*832, but with the --fp8 option (from commit #80) it works fine: 36.74 s/it.
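If that FP8 patch is applied, the invocation would presumably look something like the sketch below. Note that `--fp8` is not an upstream flag; it comes from the patch referenced in the comment above and may differ or change:

```bash
# Hypothetical 14B run on a single 48 GB L40, assuming the FP8 patch (#80) is applied.
# --fp8 is not part of the upstream CLI; it comes from that patch.
python generate.py \
  --task t2v-14B \
  --size 480*832 \
  --ckpt_dir ./Wan2.1-T2V-14B \
  --fp8 \
  --prompt "Two anthropomorphic cats boxing on a spotlighted stage"
```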
In short, 6 GB VRAM (RTX 3060 Laptop) + 16 GB RAM should work for both the 1.3B and 14B models, but quantized models are needed for the 14B ones. 6 GB VRAM + 16 GB RAM is definitely enough to run the 1.3B T2V DiT model comfortably without quantization (CLIP, VAE, and DiT weights straight from the original repo, without modification). I have succeeded in generating videos at 512x768x121 (it takes a while, around 20 minutes; smaller sizes run a lot faster, down to a few minutes; I have also tested 512x768x161 but forgot how long it took), though you might need to use tiled VAE decoding. BTW, I am using the ComfyUI native implementation (with Sage Attention) on Windows for all the experiments. This repo might take up more resources, as it is not as optimized as ComfyUI.
Do ComfyUI, Diffusers, and the Wan2.1 repo have different code? If so, which one is best optimized?
AFAIK, ComfyUI comes with its own implementation and memory management system, and (the Comfy native implementation) does not use Diffusers as a backend. I'm not entirely sure about the difference between the implementation in this repo and Diffusers, but both are probably quite different from ComfyUI's anyway.