Causes amdgpu crash #7

Open
hojjatabdollahi opened this issue Jan 8, 2025 · 8 comments
Labels
bug Something isn't working

Comments

@hojjatabdollahi

hojjatabdollahi commented Jan 8, 2025

As soon as I add the applet to the panel I get these in the log:

2025-01-08T14:29:20.408000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: Ignoring unknown button type:
2025-01-08T14:29:20.427000-07:00 Cosmic systemd[1]: Started systemd-timedated.service - Time & Date Service.
2025-01-08T14:29:20.503000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: Ignoring unknown button type:
2025-01-08T14:29:20.503000-07:00 Cosmic kernel: usb 1-4: reset full-speed USB device number 2 using xhci_hcd
2025-01-08T14:29:23.609000-07:00 Cosmic kernel: ------------[ cut here ]------------
2025-01-08T14:29:23.609000-07:00 Cosmic kernel: WARNING: CPU: 2 PID: 127898 at drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dmub_psr.c:223 dmub_psr_enable+0xf8/0x110 [amdgpu]
2025-01-08T14:29:23.609000-07:00 Cosmic kernel: Modules linked in: usbhid cdc_acm igc ccm michael_mic nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6t_REJECT nf_reject_
2025-01-08T14:29:23.610000-07:00 Cosmic kernel:  snd_rpl_pci_acp6x uvc snd_acp_pci industrialio_triggered_buffer videobuf2_memops snd_seq_midi kfifo_buf snd_acp_legacy_common videobuf2_v4l
2025-01-08T14:29:23.610000-07:00 Cosmic kernel:  parport efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 dm_crypt raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: CPU: 2 UID: 0 PID: 127898 Comm: kworker/u64:11 Tainted: G        W          6.13.0-rc6+ #8
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: Tainted: [W]=WARN
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: Hardware name: Framework Laptop 13 (AMD Ryzen 7040Series)/FRANMDCP07, BIOS 03.06 10/14/2024
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: Workqueue: dm_vblank_control_workqueue amdgpu_dm_crtc_vblank_control_worker [amdgpu]
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: RIP: 0010:dmub_psr_enable+0xf8/0x110 [amdgpu]
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: Code: 48 8b 45 d8 65 48 2b 04 25 28 00 00 00 75 1f 48 83 c4 50 5b 41 5c 41 5d 41 5e 5d 31 c0 31 d2 31 c9 31 f6 31 ff e9 b3 33 e6 cb <0f> 0b
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: RSP: 0018:ffffad1dcd78fce0 EFLAGS: 00010246
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: RAX: 0000000000000000 RBX: 00000000000003e9 RCX: 0000000000000000
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: RBP: ffffad1dcd78fd50 R08: 0000000000000000 R09: 0000000000000000
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: R13: ffff9a4c80deae90 R14: 0000000000000000 R15: ffff9a4ce3400000
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: FS:  0000000000000000(0000) GS:ffff9a5301b00000(0000) knlGS:0000000000000000
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: CR2: 000036c402dc00b0 CR3: 00000001303ba000 CR4: 0000000000f50ef0
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: PKRU: 55555554
2025-01-08T14:29:23.610000-07:00 Cosmic kernel: Call Trace:
2025-01-08T14:29:23.610000-07:00 Cosmic kernel:  <TASK>
2025-01-08T14:29:23.610000-07:00 Cosmic kernel:  ? show_regs+0x6c/0x80
2025-01-08T14:29:23.610000-07:00 Cosmic kernel:  ? __warn+0x8d/0x150
2025-01-08T14:29:23.610000-07:00 Cosmic kernel:  ? dmub_psr_enable+0xf8/0x110 [amdgpu]
2025-01-08T14:29:23.610000-07:00 Cosmic kernel:  ? report_bug+0x182/0x1b0
2025-01-08T14:29:23.610000-07:00 Cosmic kernel:  ? handle_bug+0x6e/0xb0
2025-01-08T14:29:23.610000-07:00 Cosmic kernel:  ? exc_invalid_op+0x18/0x80
2025-01-08T14:29:23.610000-07:00 Cosmic kernel:  ? asm_exc_invalid_op+0x1b/0x20
2025-01-08T14:29:23.611000-07:00 Cosmic kernel:  ? dmub_psr_enable+0xf8/0x110 [amdgpu]
2025-01-08T14:29:23.611000-07:00 Cosmic kernel:  ? __pfx_dmub_psr_enable+0x10/0x10 [amdgpu]
2025-01-08T14:29:23.611000-07:00 Cosmic kernel:  edp_set_psr_allow_active+0x1b7/0x330 [amdgpu]
2025-01-08T14:29:23.611000-07:00 Cosmic kernel:  dc_link_set_psr_allow_active+0x26/0x40 [amdgpu]
2025-01-08T14:29:23.611000-07:00 Cosmic kernel:  amdgpu_dm_psr_disable+0x58/0x90 [amdgpu]
2025-01-08T14:29:23.611000-07:00 Cosmic kernel:  amdgpu_dm_crtc_vblank_control_worker+0x2ed/0x310 [amdgpu]
2025-01-08T14:29:23.611000-07:00 Cosmic kernel:  process_one_work+0x178/0x3d0
2025-01-08T14:29:23.611000-07:00 Cosmic kernel:  worker_thread+0x2de/0x410
2025-01-08T14:29:23.611000-07:00 Cosmic kernel:  ? __pfx_worker_thread+0x10/0x10
2025-01-08T14:29:23.611000-07:00 Cosmic kernel:  kthread+0xe1/0x110
2025-01-08T14:29:23.611000-07:00 Cosmic kernel:  ? __pfx_kthread+0x10/0x10
2025-01-08T14:29:23.611000-07:00 Cosmic kernel:  ret_from_fork+0x44/0x70
2025-01-08T14:29:23.611000-07:00 Cosmic kernel:  ? __pfx_kthread+0x10/0x10
2025-01-08T14:29:23.611000-07:00 Cosmic kernel:  ret_from_fork_asm+0x1a/0x30
2025-01-08T14:29:23.611000-07:00 Cosmic kernel:  </TASK>
2025-01-08T14:29:23.611000-07:00 Cosmic kernel: ---[ end trace 0000000000000000 ]---
2025-01-08T14:29:24.431000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: Failed to enumerate a display: Failed to parse EDID for i2c-22784
2025-01-08T14:29:24.568000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: can't get_vcp_feature: DDC/CI error: Expected DDC/CI length bit
2025-01-08T14:29:28.774000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: Failed to enumerate a display: Failed to parse EDID for i2c-22784
2025-01-08T14:29:28.909000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: can't get_vcp_feature: DDC/CI error: Expected DDC/CI length bit
2025-01-08T14:29:33.207000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: Failed to enumerate a display: Failed to parse EDID for i2c-22784
2025-01-08T14:29:33.342000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: can't get_vcp_feature: DDC/CI error: Expected DDC/CI length bit
2025-01-08T14:29:37.812000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: Failed to enumerate a display: Failed to parse EDID for i2c-22784
2025-01-08T14:29:37.902000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: can't get_vcp_feature: DDC/CI I2C error: Remote I/O error (os error 121)
2025-01-08T14:29:42.706000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: Failed to enumerate a display: Failed to parse EDID for i2c-22784
2025-01-08T14:29:42.840000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: can't get_vcp_feature: DDC/CI error: Expected DDC/CI length bit
2025-01-08T14:29:48.442000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: Failed to enumerate a display: Failed to parse EDID for i2c-22784
2025-01-08T14:29:48.577000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: can't get_vcp_feature: DDC/CI error: Expected DDC/CI length bit
2025-01-08T14:29:50.453000-07:00 Cosmic systemd[1]: systemd-timedated.service: Deactivated successfully.
2025-01-08T14:29:55.810000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: Failed to enumerate a display: Failed to parse EDID for i2c-22784
2025-01-08T14:29:55.944000-07:00 Cosmic cosmic-ext-applet-external-monitor-brightness[135682]: can't get_vcp_feature: DDC/CI error: Expected DDC/CI length bit

My laptop display freezes. The external monitors keep working, but they are very laggy until I remove the applet from the panel.
I remember testing your "async" PR in the original repo, and it caused the same problem.

Right now I use this to set my brightness and it works, so I don't think the problem is ddcutil:

BR=50; ddcutil setvcp --bus=16 10 $BR && ddcutil setvcp  --bus=17 10 $BR
hojjatabdollahi added the bug (Something isn't working) label Jan 8, 2025
@wiiznokes
Collaborator

Could you test this branch, @hojjatabdollahi? https://github.com/cosmic-utils/cosmic-ext-applet-external-monitor-brightness/tree/fix-freeze
I'm not sure what the bug was, so it will probably not work, but I made the code more error-resistant.

@hojjatabdollahi
Author

Thank you. It works now. But it still freezes when I add it to the panel, until it finishes trying 5 times. I'm guessing it's trying to read from my laptop display, which doesn't support DDC/CI.

Cosmic cosmic-ext-applet-external-monitor-brightness[416857]: Failed to enumerate a display: Failed to parse EDID for i2c-22784
Cosmic cosmic-ext-applet-external-monitor-brightness[416857]: can't get_vcp_feature: DDC/CI I2C error: Remote I/O error (os error 121)
Cosmic cosmic-ext-applet-external-monitor-brightness[416857]: can't get_vcp_feature: DDC/CI error: Expected DDC/CI length bit
Cosmic cosmic-ext-applet-external-monitor-brightness[416857]: message repeated 5 times: [ can't get_vcp_feature: DDC/CI error: Expected DDC/CI length bit]

And this message is logged every time I open the pop-up:

Cosmic cosmic-ext-applet-external-monitor-brightness[416857]: can't get_vcp_feature: DDC/CI error: Expected DDC/CI length bit

@wiiznokes
Collaborator

Can you retest the same branch, please? I added a commit to filter out displays that do not have this capability.
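
A minimal sketch of what such a filter could look like. This is not the actual commit on the fix-freeze branch: `FakeDisplay` and `probe_capabilities` are hypothetical stand-ins for a DDC handle and its capabilities query. The only idea shown is dropping every display whose probe fails, so that later brightness reads never touch it.

```rust
/// Hypothetical stand-in for a DDC display handle (not a type from ddc-hi).
#[derive(Debug, PartialEq)]
struct FakeDisplay {
    id: &'static str,
    /// Simulates whether the panel answers a DDC/CI capabilities request.
    answers_ddc: bool,
}

/// Hypothetical probe: Ok if the display reports capabilities, Err otherwise.
fn probe_capabilities(display: &FakeDisplay) -> Result<(), String> {
    if display.answers_ddc {
        Ok(())
    } else {
        Err(format!("{}: failed to read capabilities string", display.id))
    }
}

/// Keep only the displays whose probe succeeds; return the errors for logging.
fn filter_supported(displays: Vec<FakeDisplay>) -> (Vec<FakeDisplay>, Vec<String>) {
    let mut supported = Vec::new();
    let mut errors = Vec::new();
    for display in displays {
        match probe_capabilities(&display) {
            Ok(()) => supported.push(display),
            Err(e) => errors.push(e),
        }
    }
    (supported, errors)
}
```

Note that a probe like this still has to talk to every display once, so it can only prevent repeated failures after the initial enumeration, not the first request itself.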

@hojjatabdollahi
Author

The built-in display still freezes for 10 seconds when I add the applet to the panel, and these show up in the logs:

Jan 10 20:41:45 Cosmic cosmic-ext-applet-external-monitor-brightness[166976]: Ignoring unknown button type:
Jan 10 20:41:45 Cosmic cosmic-ext-applet-external-monitor-brightness[166976]: Ignoring unknown button type:
Jan 10 20:41:49 Cosmic cosmic-ext-applet-external-monitor-brightness[166976]: Failed to enumerate a display: Failed to parse EDID for i2c-22784
Jan 10 20:41:57 Cosmic cosmic-ext-applet-external-monitor-brightness[166976]: can't get capabilities Failed to read capabilities string

But the other error that used to be printed every time I opened the pop-up is gone.

@wiiznokes
Collaborator

Could you retest with cargo run? That enables some logging; could you post the output?

The display should only freeze when opening the pop-up this time.

I wonder if there was a freeze with the previous version?
If there is a freeze, I think we could blacklist some displays, using the model name or something similar.
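
A rough sketch of the blacklist idea, under stated assumptions: `DisplayIdent`, `DenyRule`, and `is_denied` are hypothetical names, not part of this applet or of ddc-hi. The rule type accepts either a model name or a manufacturer/model-ID pair, since a display may not report a model name at all. For the check to help, the identity would have to come from a source that does not itself send DDC/CI traffic.

```rust
/// Hypothetical identity of a display, with fields named after EDID data.
struct DisplayIdent {
    manufacturer_id: Option<String>,
    model_id: Option<u16>,
    model_name: Option<String>,
}

/// Hypothetical deny-list entry: match by (manufacturer, model id) or by name.
enum DenyRule {
    Ident { manufacturer_id: &'static str, model_id: u16 },
    ModelName(&'static str),
}

/// Returns true if any rule matches the display.
fn is_denied(display: &DisplayIdent, rules: &[DenyRule]) -> bool {
    rules.iter().any(|rule| match rule {
        DenyRule::Ident { manufacturer_id, model_id } => {
            display.manufacturer_id.as_deref() == Some(*manufacturer_id)
                && display.model_id == Some(*model_id)
        }
        DenyRule::ModelName(name) => display.model_name.as_deref() == Some(*name),
    })
}
```

A caller would skip every display for which `is_denied` returns true before issuing any VCP request to it.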

@hojjatabdollahi
Author

I don't have access to my external monitors over the weekend. But I ran the new changes using cargo run with only my laptop display, and my screen froze. Here is the output:

2025-01-12T00:59:03.309725Z  WARN Ignoring unknown button type:
2025-01-12T00:59:07.538338Z  WARN Failed to enumerate a display: Failed to parse EDID for i2c-22784
2025-01-12T00:59:07.538904Z  INFO DisplayInfo { backend: I2cDevice, id: "22795", manufacturer_id: Some("BOE"), model_id: Some(2399), version: Some((1, 4)), serial: Some(0), manufacture_year: Some(29), manufacture_week: Some(23), model_name: None, serial_number: None, edid_data: Some([0, 255, 255, 255, 255, 255, 255, 0, 9, 229, 95, 9, 0, 0, 0, 0, 23, 29, 1, 4, 165, 28, 19, 120, 3, 222, 80, 163, 84, 76, 153, 38, 15, 80, 84, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 17, 92, 208, 24, 129, 224, 45, 80, 48, 32, 54, 0, 29, 190, 16, 0, 0, 26, 167, 73, 208, 24, 129, 224, 45, 80, 48, 32, 54, 0, 29, 190, 16, 0, 0, 26, 0, 0, 0, 254, 0, 66, 79, 69, 32, 67, 81, 10, 32, 32, 32, 32, 32, 32, 0, 0, 0, 254, 0, 78, 69, 49, 51, 53, 70, 66, 77, 45, 78, 52, 49, 10, 0, 251, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), mccs_version: None, mccs_database: Database { entries: {} } }
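
The `edid_data` bytes in the output above can be decoded by hand using the standard EDID 1.4 base-block layout. The sketch below is illustrative only; the function names are hypothetical and this is not how ddc-hi parses EDID. It reads the manufacturer ID (bytes 8-9), the product code (bytes 10-11), the manufacture year (byte 17, stored as an offset from 1990), and the text of the four 18-byte display descriptors.

```rust
/// Decode the three-letter manufacturer ID packed into EDID bytes 8-9.
fn manufacturer_id(edid: &[u8]) -> Option<String> {
    let raw = u16::from_be_bytes([*edid.get(8)?, *edid.get(9)?]);
    // Three 5-bit fields, most significant first; 1 maps to 'A', 26 to 'Z'.
    [raw >> 10, raw >> 5, raw]
        .iter()
        .map(|&field| match (field & 0x1f) as u8 {
            n @ 1..=26 => Some((b'A' + n - 1) as char),
            _ => None,
        })
        .collect()
}

/// Product code: little-endian u16 in bytes 10-11.
fn product_code(edid: &[u8]) -> Option<u16> {
    Some(u16::from_le_bytes([*edid.get(10)?, *edid.get(11)?]))
}

/// Byte 17 stores the manufacture year as an offset from 1990.
fn manufacture_year(edid: &[u8]) -> Option<u16> {
    Some(1990 + u16::from(*edid.get(17)?))
}

/// Text of every display descriptor carrying `tag`
/// (0xFC = display name, 0xFE = unspecified text).
fn descriptor_text(edid: &[u8], tag: u8) -> Vec<String> {
    [54usize, 72, 90, 108]
        .iter()
        .filter_map(|&start| edid.get(start..start + 18))
        .filter(|d| d[..3] == [0, 0, 0] && d[3] == tag)
        .map(|d| {
            // Text starts at byte 5 and is terminated by a line feed.
            let text: Vec<u8> = d[5..].iter().copied().take_while(|&b| b != 0x0a).collect();
            String::from_utf8_lossy(&text).trim_end().to_string()
        })
        .collect()
}
```

Applied to the bytes in the log, these give "BOE", 2399, and 2019, matching `manufacturer_id` and `model_id` and showing that `manufacture_year: Some(29)` is the raw offset. The strings "BOE CQ" and "NE135FBM-N41" sit in 0xFE (unspecified text) descriptors and there is no 0xFC (display name) descriptor, which may be why `model_name` is `None`.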

@hojjatabdollahi
Author

I have been digging deeper into this. I figured out that Display::enumerate can reliably cause the same freeze.
I dug further, and it looks like ddc-hi, ddc-i2c, and i2c-linux are all very old and more or less abandoned.

I'm still trying to figure out where exactly we send the I2C messages to the wrong device, and why ddcutil doesn't cause the same issue; it quickly detects my built-in display and marks it as invalid because it's a laptop display.

For what it's worth, I don't see the amdgpu crash in the logs anymore. From the logs I've seen, it looks like cosmic-comp thinks the DRM device is paused. Switching to tty3 and back resets cosmic-comp, and everything works afterward.
So feel free to close this issue; I'll report back here if I find something.

Jan 11 17:51:25 Cosmic cosmic-comp[3781]: Failed to submit rendering: Rendering failed: The underlying drm surface encountered an error: Device is currently paused, operation rejected

@wiiznokes
Collaborator

Can you confirm you had the same freeze with the initial code?
I am not sure we can do anything in this repo if we can't even call Display::enumerate().
I agree that ddc seems unmaintained, but we could totally fork it if someone wants to work on this.
Let's keep this issue open for now.
