
Update attention.py #116

Open · shirubei wants to merge 1 commit into main

Conversation

shirubei
Adding support for cards that aren't Ampere architecture
@splendiz

splendiz commented Mar 1, 2025

Thanks shirubei, I tried the code with my 8x 2080 Ti setup.
However, I got the error "TypeError: attention() got an unexpected keyword argument 'version'" on each of the graphics cards, and the errors result in "torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ".
Any solutions?
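For anyone hitting the same TypeError: it suggests the call sites still pass a version keyword (used upstream to pick FlashAttention 2 vs 3) that the patched attention() no longer accepts. Below is a minimal, hypothetical sketch of a tolerant fallback signature; the tensor layout and the use of PyTorch's scaled_dot_product_attention are assumptions, not the actual code in this PR:

import torch
import torch.nn.functional as F

def attention(q, k, v, version=None, **unused_kwargs):
    # Hypothetical sketch only. `version` selected FlashAttention 2 vs 3 upstream;
    # a non-Ampere fallback can simply accept and ignore it.
    # Assumes q/k/v are [batch, seq_len, num_heads, head_dim].
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))   # -> [B, H, L, D]
    out = F.scaled_dot_product_attention(q, k, v)      # plain PyTorch attention
    return out.transpose(1, 2).contiguous()            # -> [B, L, H, D]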

@jimbojd72

python generate.py  --task t2v-1.3B --size '832*480' --ckpt_dir ./Wan2.1-T2V-1.3B --offload_model True --t5_cpu --sample_shift 8 --sample_guide_scale 6 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."                                                                                     (Wan2.1)  ✔  󰌠 3.13.2  23:16:08 
[2025-03-01 23:16:36,760] INFO: Generation job args: Namespace(task='t2v-1.3B', size='832*480', frame_num=81, ckpt_dir='./Wan2.1-T2V-1.3B', offload_model=True, ulysses_size=1, ring_size=1, t5_fsdp=False, t5_cpu=True, dit_fsdp=False, save_file=None, prompt='Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage.', use_prompt_extend=False, prompt_extend_method='local_qwen', prompt_extend_model=None, prompt_extend_target_lang='ch', base_seed=9018256661711622796, image=None, sample_solver='unipc', sample_steps=50, sample_shift=8.0, sample_guide_scale=6.0)
[2025-03-01 23:16:36,760] INFO: Generation model config: {'__name__': 'Config: Wan T2V 1.3B', 't5_model': 'umt5_xxl', 't5_dtype': torch.bfloat16, 'text_len': 512, 'param_dtype': torch.bfloat16, 'num_train_timesteps': 1000, 'sample_fps': 16, 'sample_neg_prompt': '色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走', 't5_checkpoint': 'models_t5_umt5-xxl-enc-bf16.pth', 't5_tokenizer': 'google/umt5-xxl', 'vae_checkpoint': 'Wan2.1_VAE.pth', 'vae_stride': (4, 8, 8), 'patch_size': (1, 2, 2), 'dim': 1536, 'ffn_dim': 8960, 'freq_dim': 256, 'num_heads': 12, 'num_layers': 30, 'window_size': (-1, -1), 'qk_norm': True, 'cross_attn_norm': True, 'eps': 1e-06}
[2025-03-01 23:16:36,760] INFO: Input prompt: Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage.
[2025-03-01 23:16:36,760] INFO: Creating WanT2V pipeline.
[2025-03-01 23:17:19,381] INFO: loading ./Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth
[2025-03-01 23:17:32,059] INFO: loading ./Wan2.1-T2V-1.3B/Wan2.1_VAE.pth
[2025-03-01 23:17:32,481] INFO: Creating WanModel from ./Wan2.1-T2V-1.3B
[2025-03-01 23:17:35,481] INFO: Generating video ...
  0%|                                                                                                                                                                                                                                                                                                                                                                                                               | 0/50 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/data/AI/Wan2.1/generate.py", line 411, in <module>
    generate(args)
    ~~~~~~~~^^^^^^
  File "/mnt/data/AI/Wan2.1/generate.py", line 313, in generate
    video = wan_t2v.generate(
        args.prompt,
    ...<6 lines>...
        seed=args.base_seed,
        offload_model=args.offload_model)
  File "/mnt/data/AI/Wan2.1/wan/text2video.py", line 236, in generate
    noise_pred_cond = self.model(
                      ~~~~~~~~~~^
        latent_model_input, t=timestep, **arg_c)[0]
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/data/miniconda3/envs/Wan2.1/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/mnt/data/miniconda3/envs/Wan2.1/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/data/AI/Wan2.1/wan/modules/model.py", line 564, in forward
    x = block(x, **kwargs)
  File "/mnt/data/miniconda3/envs/Wan2.1/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/mnt/data/miniconda3/envs/Wan2.1/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/data/AI/Wan2.1/wan/modules/model.py", line 298, in forward
    y = self.self_attn(
        self.norm1(x).float() * (1 + e[1]) + e[0], seq_lens, grid_sizes,
        freqs)
  File "/mnt/data/miniconda3/envs/Wan2.1/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/mnt/data/miniconda3/envs/Wan2.1/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/data/AI/Wan2.1/wan/modules/model.py", line 148, in forward
    k=rope_apply(k, grid_sizes, freqs),
      ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/data/miniconda3/envs/Wan2.1/lib/python3.13/site-packages/torch/amp/autocast_mode.py", line 44, in decorate_autocast
    return func(*args, **kwargs)
  File "/mnt/data/AI/Wan2.1/wan/modules/model.py", line 67, in rope_apply
    return torch.stack(output).float()
           ~~~~~~~~~~~~~~~~~~~~~~~~~^^
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 192.00 MiB. GPU 0 has a total capacity of 10.57 GiB of which 244.06 MiB is free. Including non-PyTorch memory, this process has 8.11 GiB memory in use. Of the allocated memory 7.52 GiB is allocated by PyTorch, and 421.31 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

I got further with your branch, but now I'm getting OOM issues with my VRAM. Good try though.
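One thing worth trying before giving up, based on the allocator hint at the end of that traceback (it only helps when fragmentation is the problem, which is an assumption here, so it may not be enough on an 11 GB card):

PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True python generate.py --task t2v-1.3B --size '832*480' --ckpt_dir ./Wan2.1-T2V-1.3B --offload_model True --t5_cpu --sample_shift 8 --sample_guide_scale 6 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."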

@splendiz

splendiz commented Mar 2, 2025


[quotes jimbojd72's command, log, and OOM traceback from the comment above]

Same here: with a single 2080 Ti I get OOM as well, and multiple GPUs are not working. :(

@shirubei
Author

shirubei commented Mar 2, 2025

[quotes jimbojd72's command, log, and OOM traceback from the comment above]

You can go with --frame_num 17, or even less, such as 13 or 9, to test the modification.
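For example, the same command as above with a smaller frame count (9 here is just one of the suggested test values):

python generate.py --task t2v-1.3B --size '832*480' --ckpt_dir ./Wan2.1-T2V-1.3B --offload_model True --t5_cpu --frame_num 9 --sample_shift 8 --sample_guide_scale 6 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."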

@shirubei
Author

shirubei commented Mar 2, 2025

@splendiz @jimbojd72
I ran the command below with a 2080 Ti 22GB:

python generate.py --task t2v-1.3B --size 832*480 --ckpt_dir ./Wan2.1-T2V-1.3B --offload_model False --frame_num 13 --sample_shift 8 --sample_guide_scale 6 --prompt "Ultra-wide angle, night, a young girl in a red dress walks towards the camera from a distance on a noisy street. On both sides of the road, there are continuous shops with soft lighting."

[screenshot of GPU memory usage]

@shirubei
Author

shirubei commented Mar 2, 2025

Finally, the video file was created.
[screenshot]

t2v-1.3B_1_Ultra-wide_angle._night._a_young_girl_in_a_red_dre_20250303_001358.mp4

@jimbojd72

Getting the same error with your new prompt. I don't know enough yet to debug this on my side (first time trying a model on my Arch machine with a 2080 Ti).

My point here is that I should not be a blocker for merging if it doesn't work on my setup.

    /mnt/data/AI/Wan2.1   main !1 ?1  python generate.py --task t2v-1.3B --size '832*480' --ckpt_dir ./Wan2.1-T2V-1.3B --offload_model False --frame_num 13 --sample_shift 8 --sample_guide_scale 6 --prompt "Ultra-wide angle, night, a young girl in a red dress walks towards the camera from a distance on a noisy street. On both sides of the road, there are continuous shops with soft lighting."
[2025-03-02 11:51:36,833] INFO: Generation job args: Namespace(task='t2v-1.3B', size='832*480', frame_num=13, ckpt_dir='./Wan2.1-T2V-1.3B', offload_model=False, ulysses_size=1, ring_size=1, t5_fsdp=False, t5_cpu=False, dit_fsdp=False, save_file=None, prompt='Ultra-wide angle, night, a young girl in a red dress walks towards the camera from a distance on a noisy street. On both sides of the road, there are continuous shops with soft lighting.', use_prompt_extend=False, prompt_extend_method='local_qwen', prompt_extend_model=None, prompt_extend_target_lang='ch', base_seed=1674726556309183157, image=None, sample_solver='unipc', sample_steps=50, sample_shift=8.0, sample_guide_scale=6.0)
[2025-03-02 11:51:36,833] INFO: Generation model config: {'__name__': 'Config: Wan T2V 1.3B', 't5_model': 'umt5_xxl', 't5_dtype': torch.bfloat16, 'text_len': 512, 'param_dtype': torch.bfloat16, 'num_train_timesteps': 1000, 'sample_fps': 16, 'sample_neg_prompt': '色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走', 't5_checkpoint': 'models_t5_umt5-xxl-enc-bf16.pth', 't5_tokenizer': 'google/umt5-xxl', 'vae_checkpoint': 'Wan2.1_VAE.pth', 'vae_stride': (4, 8, 8), 'patch_size': (1, 2, 2), 'dim': 1536, 'ffn_dim': 8960, 'freq_dim': 256, 'num_heads': 12, 'num_layers': 30, 'window_size': (-1, -1), 'qk_norm': True, 'cross_attn_norm': True, 'eps': 1e-06}
[2025-03-02 11:51:36,833] INFO: Input prompt: Ultra-wide angle, night, a young girl in a red dress walks towards the camera from a distance on a noisy street. On both sides of the road, there are continuous shops with soft lighting.
[2025-03-02 11:51:36,833] INFO: Creating WanT2V pipeline.
[2025-03-02 11:52:28,223] INFO: loading ./Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth
[2025-03-02 11:52:36,409] INFO: loading ./Wan2.1-T2V-1.3B/Wan2.1_VAE.pth
[2025-03-02 11:52:36,689] INFO: Creating WanModel from ./Wan2.1-T2V-1.3B
[2025-03-02 11:52:39,375] INFO: Generating video ...
Traceback (most recent call last):
  File "/mnt/data/AI/Wan2.1/generate.py", line 411, in <module>
    generate(args)
    ~~~~~~~~^^^^^^
  File "/mnt/data/AI/Wan2.1/generate.py", line 313, in generate
    video = wan_t2v.generate(
        args.prompt,
    ...<6 lines>...
        seed=args.base_seed,
        offload_model=args.offload_model)
  File "/mnt/data/AI/Wan2.1/wan/text2video.py", line 171, in generate
    self.text_encoder.model.to(self.device)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "/mnt/data/miniconda3/envs/Wan2.1/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1343, in to
    return self._apply(convert)
           ~~~~~~~~~~~^^^^^^^^^
  File "/mnt/data/miniconda3/envs/Wan2.1/lib/python3.13/site-packages/torch/nn/modules/module.py", line 903, in _apply
    module._apply(fn)
    ~~~~~~~~~~~~~^^^^
  File "/mnt/data/miniconda3/envs/Wan2.1/lib/python3.13/site-packages/torch/nn/modules/module.py", line 930, in _apply
    param_applied = fn(param)
  File "/mnt/data/miniconda3/envs/Wan2.1/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1329, in convert
    return t.to(
           ~~~~^
        device,
        ^^^^^^^
        dtype if t.is_floating_point() or t.is_complex() else None,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        non_blocking,
        ^^^^^^^^^^^^^
    )
    ^
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.96 GiB. GPU 0 has a total capacity of 10.57 GiB of which 2.01 GiB is free. Including non-PyTorch memory, this process has 6.28 GiB memory in use. Of the allocated memory 5.78 GiB is allocated by PyTorch, and 335.21 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

@shirubei
Author

shirubei commented Mar 3, 2025

@jimbojd72
Does your 2080 Ti come with 22GB of memory?
From the captured screenshot you can see that frame_num=13 without t5_cpu consumes 21.6 GB of graphics memory plus 2 GB of shared memory (which has to be borrowed from ordinary system RAM).
So if you own an 11GB 2080 Ti, I suggest using --frame_num 5 to test the code.
It may also be better to add the --t5_cpu parameter.

Thank you.
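As a rough back-of-envelope of why frame_num dominates memory, using the config printed in the logs above (vae_stride (4, 8, 8), patch_size (1, 2, 2)); the temporal-latent formula below is an assumption, not taken from the repo:

def dit_tokens(frame_num, height=480, width=832):
    # Sketch only. Temporal latent length assumes the causal-VAE convention (frame_num - 1) / 4 + 1;
    # spatial dims are divided by the VAE stride 8 and then by the patch size 2.
    t_lat = (frame_num - 1) // 4 + 1
    h_tokens, w_tokens = (height // 8) // 2, (width // 8) // 2
    return t_lat * h_tokens * w_tokens

print(dit_tokens(81))  # 32760 tokens at the default frame_num
print(dit_tokens(13))  # 6240 tokens, roughly 5x fewer, which shrinks the attention footprint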

@jimbojd72

11GB

Now I understand the link between frame_num and VRAM 🤦🏻. Thanks for clarifying that for me.

It did work afterward even though the video seems too short!

    /mnt/data/AI/Wan2.1   main !1 ?1  python generate.py  --task t2v-1.3B --size '832*480' --ckpt_dir ./Wan2.1-T2V-1.3B --offload_model True --t5_cpu --sample_shift 8 --sample_guide_scale 6 --frame_num 5 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."                                                            (Wan2.1)  ✔   16s  󰌠 3.13.2  23:15:48 
[2025-03-02 23:15:51,906] INFO: Generation job args: Namespace(task='t2v-1.3B', size='832*480', frame_num=5, ckpt_dir='./Wan2.1-T2V-1.3B', offload_model=True, ulysses_size=1, ring_size=1, t5_fsdp=False, t5_cpu=True, dit_fsdp=False, save_file=None, prompt='Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage.', use_prompt_extend=False, prompt_extend_method='local_qwen', prompt_extend_model=None, prompt_extend_target_lang='ch', base_seed=5029517465784891079, image=None, sample_solver='unipc', sample_steps=50, sample_shift=8.0, sample_guide_scale=6.0)
[2025-03-02 23:15:51,906] INFO: Generation model config: {'__name__': 'Config: Wan T2V 1.3B', 't5_model': 'umt5_xxl', 't5_dtype': torch.bfloat16, 'text_len': 512, 'param_dtype': torch.bfloat16, 'num_train_timesteps': 1000, 'sample_fps': 16, 'sample_neg_prompt': '色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走', 't5_checkpoint': 'models_t5_umt5-xxl-enc-bf16.pth', 't5_tokenizer': 'google/umt5-xxl', 'vae_checkpoint': 'Wan2.1_VAE.pth', 'vae_stride': (4, 8, 8), 'patch_size': (1, 2, 2), 'dim': 1536, 'ffn_dim': 8960, 'freq_dim': 256, 'num_heads': 12, 'num_layers': 30, 'window_size': (-1, -1), 'qk_norm': True, 'cross_attn_norm': True, 'eps': 1e-06}
[2025-03-02 23:15:51,906] INFO: Input prompt: Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage.
[2025-03-02 23:15:51,906] INFO: Creating WanT2V pipeline.
[2025-03-02 23:16:35,034] INFO: loading ./Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth
[2025-03-02 23:16:44,671] INFO: loading ./Wan2.1-T2V-1.3B/Wan2.1_VAE.pth
[2025-03-02 23:16:44,968] INFO: Creating WanModel from ./Wan2.1-T2V-1.3B
[2025-03-02 23:16:47,722] INFO: Generating video ...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [02:45<00:00,  3.30s/it]
[2025-03-02 23:24:00,482] INFO: Saving generated video to t2v-1.3B_832*480_1_1_Two_anthropomorphic_cats_in_comfy_boxing_gear_and__20250302_232400.mp4
[2025-03-02 23:24:00,961] INFO: Finished.
t2v-1.3B_832.480_1_1_Two_anthropomorphic_cats_in_comfy_boxing_gear_and__20250302_232400.mp4

@qianzhouyi2

I have tested this. It needs frame_num under about 12 for a 2080 Ti 22G to run smoothly. It took 5 minutes to generate an 832*480 video.
python generate.py --task t2v-1.3B --size 832*480 --ckpt_dir ./Wan2.1-T2V-1.3B --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage." --frame_num 12

@splendiz

splendiz commented Mar 6, 2025

A single GPU may work, but what about multiple GPUs? Has anyone tested the code?
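For reference, multi-GPU runs go through torchrun with the sequence-parallel and FSDP options that appear in the argument namespace printed in the logs above (ulysses_size, ring_size, dit_fsdp, t5_fsdp). The invocation below is a sketch assuming 8 GPUs; whether it fits in 2080 Ti memory is untested here:

torchrun --nproc_per_node=8 generate.py --task t2v-1.3B --size '832*480' --ckpt_dir ./Wan2.1-T2V-1.3B --dit_fsdp --t5_fsdp --ulysses_size 8 --ring_size 1 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."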
