• Bug#1105017: linux: [amdgpu] X crashed: Errors are "ring gfx timeout" a

    From Salvatore Bonaccorso@1:229/2 to All on Sun Jun 15 17:20:01 2025
    XPost: linux.debian.bugs.dist
    From: carnil@debian.org

    Control: tags -1 + moreinfo

    Hi,

    On Fri, May 09, 2025 at 08:16:56PM -0300, k02 wrote:
    Source: linux
    Version: linux-image-6.1.0-34-amd64
    Severity: normal

    Hi,
    First time I get this error and crash, but noticed somebody at bugzilla kernel <https://bugzilla.kernel.org/show_bug.cgi?id=205089#c61> had a
    similar issue on debian. So better have it reported.
    Using ryzen 3500u, xfce4 (xfwm4 configured to use vblank=xpresent)
    kernel-rt compiled with debian's linux-source and linux-config defaults except for two custom lines:
    CONFIG_HZ_1000=y
    CONFIG_HZ=1000

    I was using libreoffice-write and atril both tiled to each side of the screen, I was scrolling through a pdf then the following happened:

    may 09 19:14:12 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=2363974, emitted seq=2363976
    may 09 19:14:12 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 1891 thread Xorg:cs0 pid 1936
    may 09 19:14:12 kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
    may 09 19:14:12 kernel: amdgpu 0000:04:00.0: amdgpu: free PSP TMR buffer
    may 09 19:14:12 kernel: amdgpu 0000:04:00.0: amdgpu: MODE2 reset
    may 09 19:14:12 kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset succeeded, trying to resume
    may 09 19:14:12 kernel: [drm] PCIE GART of 1024M enabled.
    may 09 19:14:12 kernel: [drm] PTB located at 0x000000F400A00000
    may 09 19:14:12 kernel: [drm] VRAM is lost due to GPU reset!
    may 09 19:14:12 kernel: [drm] PSP is resuming...
    may 09 19:14:12 kernel: [drm] reserve 0x400000 from 0xf47fc00000 for PSP TMR
    may 09 19:14:12 kernel: amdgpu 0000:04:00.0: amdgpu: RAS: optional ras ta ucode is not available
    may 09 19:14:12 kernel: amdgpu 0000:04:00.0: amdgpu: RAP: optional rap ta ucode is not available
    may 09 19:14:13 kernel: [drm] kiq ring mec 2 pipe 1 q 0
    may 09 19:14:13 kernel: [drm] VCN decode and encode initialized successfully(under SPG Mode).
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 1
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 1
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 1
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 1
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: recover vram bo from shadow start
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: recover vram bo from shadow done
    may 09 19:14:13 kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset(2) succeeded!
    may 09 19:14:13 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!

    Does this still happen with the most current 6.1.140 in Debian?

    Does it happen with a recent 6.12 kernel or mainline?

    If the answer to the latter is "yes", can you file a new upstream issue at
    https://gitlab.freedesktop.org/drm/amd (double-check there first, though, as
    there seem to be some similar issues) and report the upstream issue back
    here, please?

    Regards,
    Salvatore
