AMD RX550: GPU resets occur at high frequency #89

Open
zhongtaifeng opened this issue Jan 11, 2024 · 3 comments

zhongtaifeng commented Jan 11, 2024

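Hardware: XA61200 + 3A6000
OS: Gentoo
Kernel: 6.6.8, 6.7.0-rc7, 6.7.0

GPU resets occur repeatedly during normal use.

They are extremely frequent while using freerdp, qemu, and WebKit-based browsers.

They are not limited to these applications, though: even gnome-text-edit can trigger a GPU reset, and I have not been able to pin down a pattern.

Several people in the LoongArch discussion group have hit this problem. It is not yet clear whether only this graphics card model is affected, so I have bought an RX6400 card for the next round of verification.

Log sample: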
Dec 31 11:05:53 loongson kernel: [drm:amdgpu_job_timedout [amdgpu]] ERROR ring gfx timeout, signaled seq=152958, emitted seq=152961
Dec 31 11:05:53 loongson kernel: [drm:amdgpu_job_timedout [amdgpu]] ERROR Process information: process WebKitWebProces pid 333260 thread WebKitWebP:cs0 pid 333289
Dec 31 11:05:53 loongson kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset begin!
Dec 31 11:05:57 loongson kernel: amdgpu 0000:07:00.0: amdgpu: failed to suspend display audio
Dec 31 11:05:58 loongson kernel: amdgpu: cp is busy, skip halt cp
Dec 31 11:05:58 loongson kernel: amdgpu: rlc is busy, skip halt rlc
Dec 31 11:05:58 loongson kernel: amdgpu 0000:07:00.0: amdgpu: BACO reset
Dec 31 11:05:58 loongson kernel: azx_single_wait_for_response: 62 callbacks suppressed
Dec 31 11:05:58 loongson kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset succeeded, trying to resume
Dec 31 11:05:58 loongson kernel: [drm] PCIE GART of 256M enabled (table at 0x000000F400200000).
Dec 31 11:05:58 loongson kernel: [drm] VRAM is lost due to GPU reset!
Dec 31 11:05:59 loongson kernel: [drm] UVD and UVD ENC initialized successfully.
Dec 31 11:05:59 loongson kernel: [drm] VCE initialized successfully.
Dec 31 11:05:59 loongson kernel: amdgpu 0000:07:00.0: amdgpu: recover vram bo from shadow start
Dec 31 11:05:59 loongson kernel: amdgpu 0000:07:00.0: amdgpu: recover vram bo from shadow done
Dec 31 11:05:59 loongson kernel: [drm] Skip scheduling IBs!
Dec 31 11:05:59 loongson kernel: [drm] Skip scheduling IBs!
Dec 31 11:05:59 loongson kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset(6) succeeded!
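
Since no pattern has been pinned down yet, one way to look for one is to tally which rings and processes are named in the timeout messages across a longer log. The following is a minimal sketch only: `summarize_resets` is a hypothetical helper, and its regular expressions assume the exact message wording seen in the sample above, which may differ on other kernel versions.

```python
import re
from collections import Counter

# These patterns assume the message wording from the log sample above.
TIMEOUT_RE = re.compile(
    r"ring (\S+) timeout, signaled seq=(\d+), emitted seq=(\d+)"
)
PROCESS_RE = re.compile(
    r"Process information: process (.+?) pid (\d+) thread (.+?) pid (\d+)"
)
RESET_BEGIN = "GPU reset begin!"


def summarize_resets(lines):
    """Tally ring timeouts, reset starts, and processes named in timeouts."""
    rings = Counter()
    processes = Counter()
    resets = 0
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            rings[match.group(1)] += 1
            continue
        match = PROCESS_RE.search(line)
        if match:
            processes[match.group(1)] += 1
            continue
        if RESET_BEGIN in line:
            resets += 1
    return {"resets": resets, "rings": rings, "processes": processes}


# Small demonstration using three lines copied from the sample above.
demo_lines = [
    "Dec 31 11:05:53 loongson kernel: [drm:amdgpu_job_timedout [amdgpu]] "
    "ERROR ring gfx timeout, signaled seq=152958, emitted seq=152961",
    "Dec 31 11:05:53 loongson kernel: [drm:amdgpu_job_timedout [amdgpu]] "
    "ERROR Process information: process WebKitWebProces pid 333260 "
    "thread WebKitWebP:cs0 pid 333289",
    "Dec 31 11:05:53 loongson kernel: amdgpu 0000:07:00.0: amdgpu: "
    "GPU reset begin!",
]
demo = summarize_resets(demo_lines)
print("resets:", demo["resets"])
print("rings:", dict(demo["rings"]))
print("processes:", dict(demo["processes"]))
```

The function accepts any iterable of lines, so a saved kernel log can be passed as an open file object. Counting by process name only shows which process submitted the job that timed out; it does not by itself identify the root cause.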

xry111 commented Jan 12, 2024

It may be related to a hardware issue (a hazard in HyperTransport) explained in chenhuacai/linux@a1e31fe. But it's only a "may": the workaround is for radeon, and no workaround has been developed for amdgpu yet.

zhongtaifeng (Author) commented

For now I have swapped the card for an R5 240, using its DisplayPort output, and things are finally back to normal.

LinuxResearcher commented

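I ran into the same problem. I was running SPEC CPU 2017, and when I checked on it after a while, the screen stayed dark with no display output no matter how I moved the mouse or pressed keys. Logging in over ssh showed that SPEC CPU 2017 was still running normally, but the display had died. The dmesg output showed a GPU reset.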

MarsDoge added a commit to MarsDoge/Firmware that referenced this issue Mar 15, 2024
Fixes : loongson#83
        loongson#89

This is a test version, thanks!

Signed-off-by: Dongyan Qian <[email protected]>
Signed-off-by: Xiangdong Meng <[email protected]>