Add macOS Compatibility #69

Open · wants to merge 27 commits into main from macos-compatibility

Commits (27)
c132cc0 · Update README.md · WanX-Video-1 · Feb 25, 2025
656b915 · Update README.md · WanX-Video-1 · Feb 26, 2025
9ab8f96 · Update requirements.txt · WanX-Video-1 · Feb 26, 2025
fb6dbad · Update text2video.py to reduce GPU memory by emptying cache (#44) · g7adrian · Feb 26, 2025
4c503a8 · os.path.sep instead of / (#12) · cocktailpeanut · Feb 26, 2025
4169800 · update gradio (#58) · WanX-Video-1 · Feb 26, 2025
8d75c01 · add modelscope download cli · yingdachen · Feb 26, 2025
9d3d4d7 · Check for cuda is available for macos · bakhti-uzb · Feb 26, 2025
04bf739 · Merge remote-tracking branch 'myfork/main' · bakhti-uzb · Feb 26, 2025
5872652 · Adapted model for macOS with M1 Pro chip and other improvements · bakhti-uzb · Feb 26, 2025
b6a0d1e · Update README with macOS setup and usage instructions · bakhti-uzb · Feb 26, 2025
e2287e5 · Update text2video.py to reduce GPU memory by emptying cache (#44) · g7adrian · Feb 26, 2025
1881816 · os.path.sep instead of / (#12) · cocktailpeanut · Feb 26, 2025
3f0dde1 · update gradio (#58) · WanX-Video-1 · Feb 26, 2025
b562f86 · add modelscope download cli · yingdachen · Feb 26, 2025
cf578ab · Adapted model for macOS with M1 Pro chip and other improvements · bakhti-uzb · Feb 26, 2025
60ecbf4 · Update README with macOS setup and usage instructions · bakhti-uzb · Feb 26, 2025
2beb726 · Update README.md · bakhti-uzb · Feb 27, 2025
68ae718 · Merge main branch and resolve conflicts in README.md · bakhti-uzb · Feb 27, 2025
e0317b2 · Merge branch 'main' into macos-compatibility · bakhti-uzb · Feb 27, 2025
5cb67c6 · Fix MPS compatibility for I2V by adjusting device usage and dtype · bakhti-uzb · Feb 27, 2025
7ae5058 · Merge remote-tracking branch 'myfork/macos-compatibility' into macos-… · bakhti-uzb · Feb 27, 2025
ac1bcfa · Merge branch 'main' into macos-compatibility · bakhti-ai · Mar 3, 2025
f7bd4d1 · Merge branch 'main' into macos-compatibility · bakhti-ai · Mar 4, 2025
bd180d1 · Merge branch 'main' into macos-compatibility · bakhti-ai · Mar 5, 2025
abdcd2b · Merge branch 'main' into macos-compatibility · bakhti-ai · Mar 6, 2025
b3e6943 · Merge branch 'main' into macos-compatibility · bakhti-ai · Mar 7, 2025
2 changes: 2 additions & 0 deletions .gitignore
@@ -34,3 +34,5 @@ Wan2.1-T2V-14B/
Wan2.1-T2V-1.3B/
Wan2.1-I2V-14B-480P/
Wan2.1-I2V-14B-720P/
venv_wan/
venv_wan_py310/
106 changes: 88 additions & 18 deletions README.md
@@ -1,31 +1,78 @@
# Wan2.1
# Wan2.1 Text-to-Video Model

<p align="center">
<img src="assets/logo.png" width="400"/>
<p>
This repository contains the Wan2.1 text-to-video model, adapted for macOS with the M1 Pro chip. The adaptation lets macOS users run the model efficiently by working around CUDA-specific limitations.

<p align="center">
💜 <a href=""><b>Wan</b></a> &nbsp&nbsp | &nbsp&nbsp 🖥️ <a href="https://github.com/Wan-Video/Wan2.1">GitHub</a> &nbsp&nbsp | &nbsp&nbsp🤗 <a href="https://huggingface.co/Wan-AI/">Hugging Face</a>&nbsp&nbsp | &nbsp&nbsp🤖 <a href="https://modelscope.cn/organization/Wan-AI">ModelScope</a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="">Paper (Coming soon)</a> &nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://wanxai.com">Blog</a> &nbsp&nbsp | &nbsp&nbsp💬 <a href="https://gw.alicdn.com/imgextra/i2/O1CN01tqjWFi1ByuyehkTSB_!!6000000000015-0-tps-611-1279.jpg">WeChat Group</a>&nbsp&nbsp | &nbsp&nbsp 📖 <a href="https://discord.gg/AKNgpMK4Yj">Discord</a>&nbsp&nbsp
<br>

-----
## Introduction
The Wan2.1 model is an open-source text-to-video generation model. It transforms textual descriptions into video sequences, leveraging advanced machine learning techniques.

## Changes for macOS

This version includes modifications to make the model compatible with macOS, specifically for systems using the M1 Pro chip. Key changes include:

- Adaptation of CUDA-specific code to work with MPS (Metal Performance Shaders) on macOS.
- Environment variable settings so that MPS falls back to the CPU for unsupported operations (see the quick check after this list).
- Adjustments to command-line arguments for better compatibility with macOS.
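Before running anything heavy, it is worth confirming that PyTorch can actually see the Metal backend (a minimal check; it assumes a PyTorch build recent enough to ship MPS support):

```bash
# Check that the MPS backend is built and the device is reachable
python3 -c "import torch; print('MPS available:', torch.backends.mps.is_available())"

# Route operations that MPS does not yet implement to the CPU
export PYTORCH_ENABLE_MPS_FALLBACK=1
```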

## Installation Instructions

Follow these steps to set up the environment on macOS:

1. **Install Homebrew**: If it is not already installed, install it to manage the remaining packages.
```bash
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```

2. **Install Python 3.10+**:
```bash
brew install python@3.10
```

3. **Create and Activate a Virtual Environment**:
```bash
python3.10 -m venv venv_wan
source venv_wan/bin/activate
```
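Optionally, verify the virtual environment is active before installing anything (the expected path assumes the `venv_wan` name used above):

```bash
# Should resolve inside the virtual environment, e.g. .../venv_wan/bin/python3
which python3
```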

4. **Install Dependencies**:
```bash
pip install -r requirements.txt
pip install einops
```
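A quick import check can confirm the dependencies installed cleanly (a minimal sketch using only packages named above):

```bash
# Import the core packages and print the installed torch version
python3 -c "import torch, einops; print('torch', torch.__version__)"
```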

5. **Download models using huggingface-cli**:
```bash
pip install "huggingface_hub[cli]"
huggingface-cli download Wan-AI/Wan2.1-T2V-1.3B --local-dir ./Wan2.1-T2V-1.3B
```
**Or download models using modelscope**:
```bash
pip install modelscope
modelscope download Wan-AI/Wan2.1-T2V-1.3B --local_dir ./Wan2.1-T2V-1.3B
```
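Either way, the checkpoint directory should now contain the model weights (exact filenames may vary between releases):

```bash
# List the downloaded checkpoint files
ls -lh ./Wan2.1-T2V-1.3B
```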

## Usage

To generate a video, use the following command:

```bash
export PYTORCH_ENABLE_MPS_FALLBACK=1
python generate.py --task t2v-1.3B --size "480*832" --frame_num 16 --sample_steps 25 --ckpt_dir ./Wan2.1-T2V-1.3B --offload_model True --t5_cpu --device mps --prompt "Lion running under snow in Samarkand" --save_file output_video.mp4
```
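Here `--device mps` selects the Metal backend, `--t5_cpu` keeps the T5 text encoder on the CPU, and `--offload_model True` moves model weights out of GPU memory when they are not in use; all three should lower peak memory at some cost in speed. Once generation finishes, the output lands wherever `--save_file` points:

```bash
# Confirm the rendered video exists (path given by --save_file above)
ls -lh output_video.mp4
```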

[**Wan: Open and Advanced Large-Scale Video Generative Models**]("") <br>
## Optimization Tips

In this repository, we present **Wan2.1**, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation. **Wan2.1** offers these key features:
- 👍 **SOTA Performance**: **Wan2.1** consistently outperforms existing open-source models and state-of-the-art commercial solutions across multiple benchmarks.
- 👍 **Supports Consumer-grade GPUs**: The T2V-1.3B model requires only 8.19 GB VRAM, making it compatible with almost all consumer-grade GPUs. It can generate a 5-second 480P video on an RTX 4090 in about 4 minutes (without optimization techniques like quantization). Its performance is even comparable to some closed-source models.
- 👍 **Multiple Tasks**: **Wan2.1** excels in Text-to-Video, Image-to-Video, Video Editing, Text-to-Image, and Video-to-Audio, advancing the field of video generation.
- 👍 **Visual Text Generation**: **Wan2.1** is the first video model capable of generating both Chinese and English text, featuring robust text generation that enhances its practical applications.
- 👍 **Powerful Video VAE**: **Wan-VAE** delivers exceptional efficiency and performance, encoding and decoding 1080P videos of any length while preserving temporal information, making it an ideal foundation for video and image generation.
- **Use CPU for Large Models**: If you encounter memory issues, run with `--device cpu`.
- **Reduce Resolution and Frame Count**: Use smaller resolutions and fewer frames to reduce memory usage (see the sketch after this list).
- **Monitor System Resources**: Keep an eye on memory usage and adjust parameters as needed.
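A scaled-down invocation might look like the following (an illustrative sketch: the flag values are smaller variants of the Usage command above, not tuned recommendations):

```bash
export PYTORCH_ENABLE_MPS_FALLBACK=1

# Halve the frame count and trim sampling steps relative to the Usage example
python generate.py --task t2v-1.3B --size "480*832" --frame_num 8 --sample_steps 15 \
  --ckpt_dir ./Wan2.1-T2V-1.3B --offload_model True --t5_cpu --device mps \
  --prompt "Lion running under snow in Samarkand" --save_file output_low_mem.mp4

# In another terminal, sample overall memory pressure while it runs
top -l 1 | grep PhysMem
```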

## Video Demos
## Acknowledgments

<div align="center">
<video src="https://github.com/user-attachments/assets/4aca6063-60bf-4953-bfb7-e265053f49ef" width="70%" poster=""> </video>
</div>
This project is based on the original Wan2.1 model. Special thanks to the original authors and contributors for their work.

## 🔥 Latest News!!

* Mar 3, 2025: 👋 **Wan2.1**'s T2V and I2V have been integrated into Diffusers ([T2V](https://huggingface.co/docs/diffusers/main/en/api/pipelines/wan#diffusers.WanPipeline) | [I2V](https://huggingface.co/docs/diffusers/main/en/api/pipelines/wan#diffusers.WanImageToVideoPipeline)). Feel free to give it a try!
* Feb 27, 2025: 👋 **Wan2.1** has been integrated into [ComfyUI](https://comfyanonymous.github.io/ComfyUI_examples/wan/). Enjoy!
@@ -36,6 +83,10 @@ If your work has improved **Wan2.1** and you would like more people to see it, p
- [TeaCache](https://github.com/ali-vilab/TeaCache) now supports **Wan2.1** acceleration, capable of increasing speed by approximately 2x. Feel free to give it a try!
- [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) provides more support for **Wan2.1**, including video-to-video, FP8 quantization, VRAM optimization, LoRA training, and more. Please refer to [their examples](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/wanvideo).

<div align="center">
<video src="https://github.com/user-attachments/assets/4aca6063-60bf-4953-bfb7-e265053f49ef" width="70%" poster=""> </video>
</div>


## 📑 Todo List
- Wan2.1 Text-to-Video
@@ -49,6 +100,8 @@ If your work has improved **Wan2.1** and you would like more people to see it, p
- [x] Multi-GPU Inference code of the 14B model
- [x] Checkpoints of the 14B model
- [x] Gradio demo
- [x] ComfyUI integration
- [x] Diffusers integration
- [ ] Diffusers + Multi-GPU Inference
@@ -157,6 +210,14 @@ torchrun --nproc_per_node=8 generate.py --task t2v-14B --size 1280*720 --ckpt_di

Extending the prompts can effectively enrich the details in the generated videos, further enhancing the video quality. Therefore, we recommend enabling prompt extension. We provide the following two methods for prompt extension:

- Use the Dashscope API for extension.
- Apply for a `dashscope.api_key` in advance ([EN](https://www.alibabacloud.com/help/en/model-studio/getting-started/first-api-call-to-qwen) | [CN](https://help.aliyun.com/zh/model-studio/getting-started/first-api-call-to-qwen)).
- Configure the environment variable `DASH_API_KEY` to specify the Dashscope API key. For users of Alibaba Cloud's international site, you also need to set the environment variable `DASH_API_URL` to 'https://dashscope-intl.aliyuncs.com/api/v1'. For more detailed instructions, please refer to the [dashscope document](https://www.alibabacloud.com/help/en/model-studio/developer-reference/use-qwen-by-calling-api?spm=a2c63.p38356.0.i1).
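Both variables can be set in the shell before running `generate.py` (a sketch; the key value is a placeholder):

```bash
# Placeholder: substitute the key issued by Model Studio
export DASH_API_KEY=your_dashscope_api_key
# Only needed for Alibaba Cloud international-site accounts
export DASH_API_URL='https://dashscope-intl.aliyuncs.com/api/v1'
```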
@@ -483,9 +544,18 @@ The models in this repository are licensed under the Apache 2.0 License. We clai

## Acknowledgements

We would like to thank the contributors to the [SD3](https://huggingface.co/stabilityai/stable-diffusion-3-medium), [Qwen](https://huggingface.co/Qwen), [umt5-xxl](https://huggingface.co/google/umt5-xxl), [diffusers](https://github.com/huggingface/diffusers) and [HuggingFace](https://huggingface.co) repositories, for their open research.


## Contact Us
If you would like to leave a message to our research or product teams, feel free to join our [Discord](https://discord.gg/AKNgpMK4Yj) or [WeChat groups](https://gw.alicdn.com/imgextra/i2/O1CN01tqjWFi1ByuyehkTSB_!!6000000000015-0-tps-611-1279.jpg)!
