
Add macOS Compatibility #69

Open
bakhti-ai wants to merge 27 commits into main from macos-compatibility

Conversation

bakhti-ai

Overview

This pull request introduces compatibility improvements for running the Wan2.1 text-to-video model on macOS systems with M1 Pro chips. It also includes enhancements to the documentation to assist macOS users in setting up and using the model effectively.

Key Changes

  1. MPS Compatibility: Adapted CUDA-specific code to work with Metal Performance Shaders (MPS) on macOS, allowing the model to run on M1 Pro chips.
  2. Environment Variable for Fallback: Implemented the use of PYTORCH_ENABLE_MPS_FALLBACK=1 to enable CPU fallback for operations not supported by MPS (see the sketch after this list).
  3. Command-Line Adjustments: Modified command-line arguments to improve compatibility and performance on macOS.
  4. Documentation Updates: Enhanced the README with detailed installation instructions, usage examples, and optimization tips specifically for macOS users.
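
To illustrate points 1 and 2 above, here is a minimal sketch of the device-selection pattern this relies on (illustrative only, not the exact code in this pull request):

    import os

    # Enable CPU fallback for operations MPS does not implement yet.
    # This must be set before PyTorch initializes the MPS backend.
    os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")

    import torch

    if torch.backends.mps.is_available():
        device = torch.device("mps")   # Apple Silicon GPU
    elif torch.cuda.is_available():
        device = torch.device("cuda")  # NVIDIA GPU
    else:
        device = torch.device("cpu")

    print(f"Running on: {device}")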

Benefits

  • Broader Accessibility: Enables macOS users, particularly those with M1 Pro chips, to utilize the Wan2.1 model without encountering CUDA-related issues.
  • Improved User Experience: Provides clear guidance and best practices for setting up and running the model on macOS, reducing setup time and potential errors.
  • Community Contribution: Shares valuable insights and solutions with the community, potentially benefiting other users facing similar challenges.

Testing

The changes have been tested on a MacBook Pro with an M1 Pro chip to verify that the model runs smoothly with the specified configurations.

Additional Notes

  • Users are encouraged to monitor system resources and adjust parameters as needed to optimize performance and memory usage.
  • Feedback and further suggestions for improvement are welcome.

WanX-Video-1 and others added 11 commits February 25, 2025 22:54
…o#44)

* Update text2video.py to reduce GPU memory by emptying cache

If offload_model is set, empty_cache() must be called after the model is moved to CPU to actually free the GPU. I verified on a RTX 4090 that without calling empty_cache the model remains in memory and the subsequent vae decoding never finishes.

* Update text2video.py only one empty_cache needed before vae decode
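
As a reference, a rough sketch of the pattern that commit message describes (an illustrative snippet, not the actual text2video.py code; the helper name is made up):

    import torch

    def offload_and_free(model):
        # Move the model to CPU, then release the cached GPU memory.
        # Without empty_cache(), the allocator keeps the freed blocks reserved,
        # so the subsequent VAE decode can still run out of GPU memory.
        model.to("cpu")
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        return model
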
@bakhti-ai
Author

Hi @WanX-Video-1,

I hope you're doing well. I've made some changes to adapt the Wan2.1 text-to-video model for macOS with M1 Pro chips. The key changes include:

  • Compatibility improvements for MPS on macOS.
  • Documentation updates for macOS setup and usage.
  • Command-line adjustments for better performance on macOS.

Could you please review the pull request when you have a moment? Your feedback would be greatly appreciated.

Thank you!

Best regards,
Bakhtiyor

@lorihuang

Thank you for your help, but I encountered the following error during the program execution. Could you please let me know how to resolve it? Thank you!
"""
python(16280) MallocStackLogging: can't turn off malloc stack logging because it was not enabled.
0%| | 0/25 [00:00<?, ?it/s]
"""

@bakhti-ai
Author

Thank you for your help, but I encountered the following error during the program execution. Could you please let me know how to resolve it? Thank you! """ python(16280) MallocStackLogging: can't turn off malloc stack logging because it was not enabled. 0%| | 0/25 [00:00<?, ?it/s] """

This is just a warning message from macOS; I also encountered it.
It doesn't impact the program's execution.
The progress bar that follows indicates the program is working as expected.
You can safely ignore this message and just wait for the progress bar to reach 100%.

@lorihuang


Got it! Thanks a lot.

@TreasureJade

I have successfully run your solution, thank you ^ ^

g7adrian and others added 7 commits February 27, 2025 11:38
…o#44)

* Update text2video.py to reduce GPU memory by emptying cache

If offload_model is set, empty_cache() must be called after the model is moved to CPU to actually free the GPU. I verified on a RTX 4090 that without calling empty_cache the model remains in memory and the subsequent vae decoding never finishes.

* Update text2video.py only one empty_cache needed before vae decode
Add model files download step
@bakhti-ai bakhti-ai force-pushed the macos-compatibility branch 4 times, most recently from f8cca94 to 2beb726 on February 27, 2025 07:09
@Volutionn

Thanks @bakhti-uzb ! I was working on the same PR but ran into this error on my M4 Max:
RuntimeError: Input type (MPSFloatType) and weight type (MPSHalfType) should be the same

I'm hitting the same error on your PR, have you run into this?

@bakhti-ai
Author

Thanks @bakhti-uzb ! I was working on the same PR but ran into this error on my M4 Max: RuntimeError: Input type (MPSFloatType) and weight type (MPSHalfType) should be the same

I'm hitting the same error on your PR, have you run into this?

I haven't encountered that exact error on my M1 Pro, but it's a tensor dtype mismatch on the MPS backend: the model is using half-precision weights (FP16) with full-precision inputs (FP32).
You could try a few approaches:

  1. Force everything to use the same precision by adding this before model loading:
    torch.mps.set_per_process_memory_fraction(0.8)  # Optional but helpful
    torch.set_default_dtype(torch.float32)  # Force full precision
  2. Alternatively, if you prefer half precision for memory efficiency:
    torch.set_default_dtype(torch.float16)

  3. You might need to explicitly convert tensors to match types in the model code. Look for where tensors are being sent to the MPS device and ensure consistent typing (see the sketch below).

Let me know if you find a solution that works consistently - I'd be happy to incorporate it into the PR to make it work better across different Apple Silicon chips.
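
For illustration, here is a minimal, self-contained sketch of that kind of explicit dtype alignment (a hypothetical example, not code from this repo):

    import torch
    import torch.nn as nn

    device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

    # Toy stand-in for a model whose weights are half precision on MPS.
    model = nn.Linear(8, 8).to(device=device, dtype=torch.float16)

    x = torch.randn(2, 8)  # created as float32 on the CPU by default

    # Cast the input to the model's device and weight dtype so the MPS backend
    # sees consistent types (avoids the MPSFloatType/MPSHalfType mismatch).
    weight_dtype = next(model.parameters()).dtype
    x = x.to(device=device, dtype=weight_dtype)

    out = model(x)
    print(out.dtype)  # torch.float16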

@Volutionn

Thanks @bakhti-uzb, I managed to make it work. The issue was that I was trying to run the I2V while you focused on the T2V. Here's the change I made to get the I2V working:

  • image2video.py: I removed the cpu() in img[None].cpu() to have the img use the same device when using MPS
  • wan_i2v_14B.py: I changed this line i2v_14B.clip_dtype = torch.float16 to use float32
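
Paraphrasing the two edits described above as code (approximate, based only on this description, not an exact diff):

    # image2video.py: keep the image tensor on the active device instead of
    # forcing it to CPU, so it matches the MPS tensors downstream.
    #   before: img[None].cpu()
    #   after:  img[None]

    # wan_i2v_14B.py: use full precision for the CLIP weights on MPS.
    #   before: i2v_14B.clip_dtype = torch.float16
    #   after:  i2v_14B.clip_dtype = torch.float32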

@tanmay1100 left a comment

  File "/Users/me/Wan2.1/wan/modules/attention.py", line 186, in attention
    out = torch.nn.functional.scaled_dot_product_attention(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/utils/_device.py", line 78, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Invalid buffer size: 47.98 GB

This pops up after making all these changes, on m3 pro.

@bakhti-ai
Author

Thanks @bakhti-uzb, I managed to make it work. The issue was that I was trying to run the I2V while you focused on the T2V. Here's the change I made to get the I2V working:

  • image2video.py: I removed the cpu() in img[None].cpu() to have the img use the same device when using MPS
  • wan_i2v_14B.py: I changed this line i2v_14B.clip_dtype = torch.float16 to use float32

Thanks for letting us know. I've included these changes in the repo.

@bakhti-ai
Author

  File "/Users/me/Wan2.1/wan/modules/attention.py", line 186, in attention
    out = torch.nn.functional.scaled_dot_product_attention(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/utils/_device.py", line 78, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Invalid buffer size: 47.98 GB

This pops up after making all these changes, on m3 pro.

It would be easier to help if you provided some more information, like how you are trying to generate and with what options (frame_num etc.).

For now I can suggest these:

  1. Lower resolution: try a smaller video size like --size "320*576" instead of your current settings.
  2. Reduce frame count: use fewer frames with --frame_num 8
  3. Increase memory efficiency:
    export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.8
  4. Use CPU mode: while slower, it would avoid the memory limitation:
    python generate.py --task t2v-1.3B --device cpu [other parameters]
  5. Set an explicit memory fraction:
    # Add this somewhere at the top of the script
    import torch
    torch.mps.set_per_process_memory_fraction(0.7)  # Adjust value as needed

@tanmay1100

tanmay1100 commented Feb 27, 2025

  File "/Users/me/Wan2.1/wan/modules/attention.py", line 186, in attention
    out = torch.nn.functional.scaled_dot_product_attention(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/utils/_device.py", line 78, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Invalid buffer size: 47.98 GB

This pops up after making all these changes, on m3 pro.

It would be easy to help if you provide some more information like how you are trying to generate with what kind of options (frame_num etc.)

For now I can suggest these:

  1. Lower resolution: Suggest they try a smaller video size like --size "320*576" instead of your current settings.
  2. Reduce frame count: Use fewer frames with --frame_num 8
  3. Increase memory efficiency:
    export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.8
  4. Use CPU mode: While slower, it would avoid the memory limitation:
    python generate.py --task t2v-1.3B --device cpu [other parameters]
  5. Set explicit memory fraction:
#Add this somewhere at the top of the script
import torch
torch.mps.set_per_process_memory_fraction(0.7)  # Adjust value as needed

I'm not explicitly specifying frame numbers; I just copied the example from the readme:

python generate.py  --task t2v-1.3B --size "832*480" --ckpt_dir ../Wan2.1-T2V-1.3B --sample_shift 8 --sample_guide_scale 6 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."

Did this: export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.8

Now it throws: RuntimeError: invalid low watermark ratio 1.4

Also set the torch.mps.set_per_process_memory_fraction(0.7)
Throws:

 File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/mps/__init__.py", line 108, in set_per_process_memory_fraction
   torch._C._mps_setMemoryFraction(fraction)
RuntimeError: invalid low watermark ratio 1.4

Setting the watermark ratio back to 0, and running this command:

python generate.py  --task t2v-1.3B --size "832*480" --ckpt_dir ../Wan2.1-T2V-1.3B --sample_shift 8 --sample_guide_scale 6 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage." --frame_num 5 --offload_model True --t5_cpu --device "cpu"

throws:

  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 610, in forward
    return self._conv_forward(input, self.weight, self.bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 605, in _conv_forward
    return F.conv3d(
           ^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/utils/_device.py", line 78, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
RuntimeError: MPS backend out of memory (MPS allocated: 6.11 GB, other allocations: 2.27 GB, max allowed: 8.40 GB). Tried to allocate 73.12 MB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).

@bakhti-ai
Author

  File "/Users/me/Wan2.1/wan/modules/attention.py", line 186, in attention
    out = torch.nn.functional.scaled_dot_product_attention(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/utils/_device.py", line 78, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Invalid buffer size: 47.98 GB

This pops up after making all these changes, on m3 pro.

It would be easy to help if you provide some more information like how you are trying to generate with what kind of options (frame_num etc.)
For now I can suggest these:

  1. Lower resolution: Suggest they try a smaller video size like --size "320*576" instead of your current settings.
  2. Reduce frame count: Use fewer frames with --frame_num 8
  3. Increase memory efficiency:
    export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.8
  4. Use CPU mode: While slower, it would avoid the memory limitation:
    python generate.py --task t2v-1.3B --device cpu [other parameters]
  5. Set explicit memory fraction:
#Add this somewhere at the top of the script
import torch
torch.mps.set_per_process_memory_fraction(0.7)  # Adjust value as needed

not explicitly specifying frame numbers, just copied their example from readme:

python generate.py  --task t2v-1.3B --size "832*480" --ckpt_dir ../Wan2.1-T2V-1.3B --sample_shift 8 --sample_guide_scale 6 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."

Did this: export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.8

Now it throws: RuntimeError: invalid low watermark ratio 1.4

Also set the torch.mps.set_per_process_memory_fraction(0.7) Throws:

 File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/mps/__init__.py", line 108, in set_per_process_memory_fraction
   torch._C._mps_setMemoryFraction(fraction)`
RuntimeError: invalid low watermark ratio 1.4

Setting the watermark ratio back to 0, and running this command:

python generate.py  --task t2v-1.3B --size "832*480" --ckpt_dir ../Wan2.1-T2V-1.3B --sample_shift 8 --sample_guide_scale 6 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage." --frame_num 5 --offload_model True --t5_cpu --device "cpu"

throws:

  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 610, in forward
    return self._conv_forward(input, self.weight, self.bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 605, in _conv_forward
    return F.conv3d(
           ^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/utils/_device.py", line 78, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
RuntimeError: MPS backend out of memory (MPS allocated: 6.11 GB, other allocations: 2.27 GB, max allowed: 8.40 GB). Tried to allocate 73.12 MB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).

It looks like you're encountering a series of memory-related issues that are common when running large AI models on a Mac.

Here's what I recommend trying:

  1. Combined approach - try this exact command sequence:

    # First unset any existing memory settings
    unset PYTORCH_MPS_HIGH_WATERMARK_RATIO

    # Then run with these specific settings
    python generate.py --task t2v-1.3B --size "416*256" --frame_num 4 --sample_steps 10 --ckpt_dir ../Wan2.1-T2V-1.3B --offload_model True --t5_cpu --device cpu --prompt "Two anthropomorphic cats in comfy boxing gear"

  2. Reduce complexity all around:

    • Use a significantly smaller resolution
    • Reduce the frame count to the minimum (4)
    • Shorten the prompt
    • Reduce the sample steps

  3. Memory management in Python - add these at the top of your script:

    import gc
    import torch

    # Force garbage collection
    gc.collect()

    # Release cached accelerator memory: torch.cuda.empty_cache() on a CUDA machine;
    # on Apple Silicon, recent PyTorch builds expose torch.mps.empty_cache().
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    elif torch.backends.mps.is_available():
        torch.mps.empty_cache()

  4. Close other applications - make sure you don't have other memory-intensive apps running.

The key insight is that when using large models, especially on a Mac, you typically need to be much more conservative with generation parameters than the default examples suggest. Start with minimal settings that work, then gradually increase until you find your system's limit.
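
As a side note on the "invalid low watermark ratio 1.4" error reported above: this appears to happen because the MPS allocator requires the high watermark ratio to be at least as large as the low watermark ratio (its default appears to be 1.4, matching the number in the error), so lowering only the high one trips the check. A hedged sketch of one way around it, assuming these environment variables behave as in recent PyTorch builds (the equivalent shell exports before launching generate.py should work too):

    import os

    # Lower both watermarks together so that high >= low still holds.
    # These must be set before PyTorch initializes the MPS backend.
    os.environ["PYTORCH_MPS_LOW_WATERMARK_RATIO"] = "0.7"
    os.environ["PYTORCH_MPS_HIGH_WATERMARK_RATIO"] = "0.8"

    import torch  # import after setting the variables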

@tanmay1100

  File "/Users/me/Wan2.1/wan/modules/attention.py", line 186, in attention
    out = torch.nn.functional.scaled_dot_product_attention(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/utils/_device.py", line 78, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Invalid buffer size: 47.98 GB

This pops up after making all these changes, on m3 pro.

It would be easy to help if you provide some more information like how you are trying to generate with what kind of options (frame_num etc.)
For now I can suggest these:

  1. Lower resolution: Suggest they try a smaller video size like --size "320*576" instead of your current settings.
  2. Reduce frame count: Use fewer frames with --frame_num 8
  3. Increase memory efficiency:
    export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.8
  4. Use CPU mode: While slower, it would avoid the memory limitation:
    python generate.py --task t2v-1.3B --device cpu [other parameters]
  5. Set explicit memory fraction:
#Add this somewhere at the top of the script
import torch
torch.mps.set_per_process_memory_fraction(0.7)  # Adjust value as needed

not explicitly specifying frame numbers, just copied their example from readme:

python generate.py  --task t2v-1.3B --size "832*480" --ckpt_dir ../Wan2.1-T2V-1.3B --sample_shift 8 --sample_guide_scale 6 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."

Did this: export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.8
Now it throws: RuntimeError: invalid low watermark ratio 1.4
Also set the torch.mps.set_per_process_memory_fraction(0.7) Throws:

 File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/mps/__init__.py", line 108, in set_per_process_memory_fraction
   torch._C._mps_setMemoryFraction(fraction)`
RuntimeError: invalid low watermark ratio 1.4

Setting the watermark ratio back to 0, and running this command:

python generate.py  --task t2v-1.3B --size "832*480" --ckpt_dir ../Wan2.1-T2V-1.3B --sample_shift 8 --sample_guide_scale 6 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage." --frame_num 5 --offload_model True --t5_cpu --device "cpu"

throws:

  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 610, in forward
    return self._conv_forward(input, self.weight, self.bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 605, in _conv_forward
    return F.conv3d(
           ^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.12/site-packages/torch/utils/_device.py", line 78, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
RuntimeError: MPS backend out of memory (MPS allocated: 6.11 GB, other allocations: 2.27 GB, max allowed: 8.40 GB). Tried to allocate 73.12 MB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).

It looks like you're encountering a series of memory-related issues that are common when running large AI models on Mac.

Here's what I recommend trying:

  1. Combined approach - try this exact command sequence:

    # First unset any existing memory settings
    unset PYTORCH_MPS_HIGH_WATERMARK_RATIO
    
    # Then run with these specific settings
    python generate.py --task t2v-1.3B --size "416*256" --frame_num 4 --sample_steps 10 --ckpt_dir ../Wan2.1-T2V-1.3B --offload_model True --t5_cpu --device cpu --prompt "Two anthropomorphic cats in comfy boxing gear"
  2. Reduce complexity all around:

    • Use a significantly smaller resolution
    • Reduce frame count to minimum (4)
    • Reduce the prompt length
    • Reduce sample steps
  3. Memory management in Python - add these at the top of your script:

    import gc
    import torch
    
    # Force garbage collection
    gc.collect()
    torch.cuda.empty_cache() if torch.cuda.is_available() else None
  4. Close other applications - Make sure you don't have other memory-intensive apps running

The key insight is that when using large models, especially on Mac, you typically need to be much more conservative with generation parameters than the default examples suggest. Start with minimal settings that work, then gradually increase until you find your system's limit.

generate.py: error: argument --size: invalid choice: '416*256' (choose from '720*1280', '1280*720', '480*832', '832*480', '1024*1024')

it doesn't let me go below 480p

@agentx-cgn

Still get MPS backend out of memory with --device cpu, but did generate video, just tripped at the end.

@MrShakila

Can we use this model on an M3 Pro chip as well?

@bakhti-ai
Author

can we use this model in m3 Pro chip as well ?

This should probably work, but I haven't tested it specifically on an M3 Pro; I ran it on my M1 Pro and it worked. Let me know if you hit any errors while trying it out.

@HighDoping

Still get MPS backend out of memory with --device cpu, but did generate video, just tripped at the end.

You may try my fork, which will save a few GB of RAM.

can we use this model in m3 Pro chip as well ?

32GB M4 runs T2V-1.3B fine.

@chikiuso

chikiuso commented Mar 3, 2025

Hi @bakhti-ai, thanks for your contribution to the Mac mod version. I just tried to install it on my MacBook, but the process kills itself after loading and creating the pipeline and model. My MacBook is an M2 with just 8 GB of RAM; would that be the reason it doesn't run? I'm already using the 480p model, thanks!!

@bakhti-ai
Author

Hi @bakhti-ai , thanks for your contribution to the Mac mod version, I just tried to install on my MacBook but it will kill the process itself after loading and creating pipeline and model, my MacBook is M2 and just having 8Gb Ram, would that be the reason it doesn't run? I already am using the 480p model, thanks!!

Even at 480p, video generation models are quite memory-intensive, so yes, I think you are hitting an out-of-memory issue. Can you provide any error messages, warnings, or logs from running it on your machine?

@bakhti-ai bakhti-ai changed the title from "Add macOS M1 Pro Compatibility and Documentation Enhancements" to "Add macOS Compatibility" on Mar 4, 2025
@chklovski

chklovski commented Mar 5, 2025

Using --frame_num over 45 leads to: Error: total bytes of NDArray > 2**32

This is on a 128 GB M3 Max, with the options --task t2v-1.3B --size "480*832". Any idea whether this can be fixed?

@bakhti-ai
Author

Using --frame_num over 45 leads to: Error: total bytes of NDArray > 2**32'

This is on a 128 GB M3 Max, with the options --task t2v-1.3B --size "480*832". Any idea whether this can be fixed?

I'm not sure if this will work. I asked ChatGPT about this issue, and here is its response:

Why This Happens

  • NumPy stores large tensors as arrays, and by default, it limits array sizes to 4GB (2^32 bytes) when using 32-bit indexing.

  • Since the code converts NumPy arrays to PyTorch tensors (torch.from_numpy(...)), a large frame count (45+) at 480×832 resolution can exceed this limit.

  • NumPy’s default array dtype (float64) makes this worse because it uses 8 bytes per value, quickly inflating memory usage.

How to Fix It
✅ 1. Use float32 Instead of float64
Modify any instance where NumPy arrays are created and force float32 instead of defaulting to float64.

For example, in fm_solvers.py and fm_solvers_unipc.py, update:

sigmas = torch.from_numpy(sigmas.astype(np.float32)).to(dtype=torch.float32)
self.sigmas = torch.from_numpy(sigmas.astype(np.float32))
self.timesteps = torch.from_numpy(timesteps.astype(np.float32))

This reduces memory usage by 50%.

✅ 2. Convert NumPy Arrays to PyTorch Tensors Earlier
Instead of handling large NumPy arrays, move calculations to PyTorch before NumPy gets too large.

For example, in utils.py:
for frame in tensor.numpy():
Change it to:
for frame in tensor.detach().cpu().float().numpy():
This ensures the tensor is in float32 before NumPy processes it.

✅ 3. Check NumPy’s allow_large_arrays Flag (Not Always Available)
Some versions of NumPy support large arrays, but it depends on platform and settings. Try:

np.seterr(over='ignore')
np.set_printoptions(threshold=np.inf)

If errors persist, upgrading NumPy might help:
pip install --upgrade numpy

Final Verdict
✔ Yes, the error is due to NumPy’s 4GB per-array limit.
✔ Fix it by forcing float32 and using PyTorch tensors earlier.
✔ Reducing frame_num is a temporary workaround, but not a real fix.

@HighDoping

Using --frame_num over 45 leads to: Error: total bytes of NDArray > 2**32'

This is on a 128 GB M3 Max, with the options --task t2v-1.3B --size "480*832". Any idea whether this can be fixed?

I haven't got enough RAM to test it, but it may be due to an MPSNDArray limit (pytorch/pytorch#134177). It might be solved by breaking up the array during the generation process, or by trying a newer software version.
