MoE module training: an asynchronous CUDA error occurs when training on multiple devices. I tried using `wait_stream`, but the error is still raised.
···
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/trainer/ppytorch/utils/torchscript_converter.py", line 502, in forward
obj = torch.pad(_151, [0, int(torch.rsub(_153, 63))])
_154 = (dcn_layers).forward((masknet_layers).forward(obj, ), )
_155 = torch.pad((MoE).forward(_154, ), [1, 0])
~~~~~~~~~~~~ <--- HERE
_156 = torch.split(torch.sigmoid(_155), 1, 1)
_157, _158, _159, _160, _161, _162, _163, _164, _165, _166, _167, _168, _169, = _156
File "code/torch/trainer/ppytorch/mlenv/common/inference_utils/___torch_mangle_382.py", line 11, in forward
module = self.module
x = torch.to(argument_1, 15)
return torch.to((module).forward(x, ), 6)
~~~~~~~~~~~~~~~ <--- HERE
File "code/torch/trainer/ppytorch/mlenv/common/packageable_modules/deepseek_moe.py", line 77, in forward
_41 = annotate(List[Optional[Tensor]], [idx2])
y3 = torch.index_put(y2, _41, _40)
idx3, top3, = torch.where(torch.eq(_0, 39))
~~~~~~~~~~~ <--- HERE
_42 = annotate(List[Optional[Tensor]], [idx3])
_43 = torch.index(y3, _42)
RuntimeError: CUDA error: operation not permitted when stream is capturing
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
···
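One possible reading of the trace, offered as an assumption rather than a confirmed diagnosis: the marked call is the single-argument form of `torch.where`, which behaves like `torch.nonzero(..., as_tuple=True)`. Its output size depends on the data, so it needs a host-device synchronization, and that is not allowed while a CUDA stream is being captured (for example under CUDA graph capture). If that is what happens here, stream ordering via `wait_stream` would not help. The sketch below contrasts index-based routing with a fixed-shape masked variant. The function names, the top-1 routing, and the toy experts are hypothetical and are not taken from `deepseek_moe.py`.

```python
import torch


def moe_dispatch_indexed(x, expert_ids, experts):
    """Index-based routing, similar in spirit to the pattern in the trace."""
    y = torch.zeros_like(x)
    for e, expert in enumerate(experts):
        # Single-argument where: output length depends on the data,
        # which forces a sync on CUDA.
        (idx,) = torch.where(expert_ids == e)
        y[idx] = expert(x[idx])
    return y


def moe_dispatch_masked(x, expert_ids, experts):
    """Fixed-shape routing: every expert runs on every token, a mask picks the result."""
    y = torch.zeros_like(x)
    for e, expert in enumerate(experts):
        mask = (expert_ids == e).unsqueeze(-1)  # [tokens, 1], shape is static
        # Three-argument where is elementwise and needs no sync.
        y = torch.where(mask, expert(x), y)
    return y


if __name__ == "__main__":
    torch.manual_seed(0)
    experts = [torch.nn.Linear(4, 4) for _ in range(3)]  # toy experts (assumption)
    x = torch.randn(8, 4)
    ids = torch.randint(0, 3, (8,))  # top-1 expert id per token (assumption)
    with torch.no_grad():
        same = torch.allclose(
            moe_dispatch_indexed(x, ids, experts),
            moe_dispatch_masked(x, ids, experts),
            atol=1e-6,
        )
    print("outputs match:", same)
```

The masked variant trades compute for static shapes, since each expert processes all tokens. Whether that cost is acceptable depends on the number of experts and the batch size.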