• Crash probably related to amdgpu

    From LECOQ Vincent@21:1/5 to All on Wed Sep 27 13:30:01 2023
    Hello,

    I took a (quick, maybe too quick?) look at the open bugs without finding anything, so I am reporting my little problem.
    Since my last apt full-upgrade yesterday, my GNOME Wayland session crashes
    fairly quickly.
    My dmesg then shows:
    [ 4765.695352] ------------[ cut here ]------------
    [ 4765.695354] WARNING: CPU: 2 PID: 721753 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:615 amdgpu_irq_put+0x46/0x70 [amdgpu]
    [ 4765.695512] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq rpcsec_gss_krb5 auth_rpcgss nf_tables nfnetlink nfsv4 dns_resolver nfs
    lockd grace fscache netfs qrtr cmac algif_hash algif_skcipher af_alg bnep sunrpc binfmt_misc nls_ascii nls_cp437 intel_rapl_msr vfat
    intel_rapl_common fat edac_mce_amd mt7921e btusb mt7921_common btrtl btbcm kvm_amd mt76_connac_lib btintel btmtk mt76 bluetooth kvm mac80211
    sha3_generic jitterentropy_rng irqbypass uvcvideo drbg videobuf2_vmalloc libarc4 ghash_clmulni_intel uvc videobuf2_memops snd_hda_codec_hdmi
    ansi_cprng videobuf2_v4l2 sha512_ssse3 snd_hda_intel ecdh_generic sha512_generic snd_usb_audio videodev snd_intel_dspcfg ecc cfg80211 snd_intel_sdw_acpi snd_usbmidi_lib snd_hda_codec snd_rawmidi
    videobuf2_common snd_seq_device aesni_intel snd_hda_core crypto_simd mc
    cryptd snd_pci_acp6x snd_hwdep snd_pci_acp5x snd_pcm rfkill
    snd_rn_pci_acp3x rapl wmi_bmof snd_timer snd_acp_config snd_soc_acpi snd
    pcspkr sp5100_tco k10temp ccp watchdog snd_pci_acp3x soundcore joydev sg
    [ 4765.695578] evdev msr parport_pc ppdev lp parport fuse loop efi_pstore configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs blake2b_generic efivarfs raid10 raid456 async_raid6_recov async_memcpy
    async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1
    raid0 multipath linear md_mod hid_cmedia amdgpu hid_generic amdxcp
    drm_buddy gpu_sched i2c_algo_bit drm_suballoc_helper usbhid uas drm_display_helper hid usb_storage sd_mod cec rc_core dm_mod drm_ttm_helper
    ttm ahci drm_kms_helper libahci nvme xhci_pci xhci_hcd nvme_core libata drm t10_pi usbcore scsi_mod igc crc32_pclmul crc64_rocksoft crc32c_intel crc64 crc_t10dif crct10dif_generic crct10dif_pclmul i2c_piix4 crct10dif_common usb_common scsi_common video wmi gpio_amdpt gpio_generic button
    [ 4765.695636] CPU: 2 PID: 721753 Comm: kworker/u64:2 Tainted: G W 6.5.0-1-amd64 #1 Debian 6.5.3-1
    [ 4765.695639] Hardware name: BESSTAR TECH LIMITED B550/B550, BIOS 5.17 03/31/2022
    [ 4765.695640] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
    [ 4765.695646] RIP: 0010:amdgpu_irq_put+0x46/0x70 [amdgpu]
    [ 4765.695796] Code: c0 74 33 48 8b 4e 10 48 83 39 00 74 29 89 d1 48 8d 04
    88 8b 08 85 c9 74 11 f0 ff 08 74 07 31 c0 e9 cf 5d 1d c4 e9 5a fd ff ff
    <0f> 0b b8 ea ff ff ff e9 be 5d 1d c4 b8 ea ff ff ff e9 b4 5d 1d c4
    [ 4765.695798] RSP: 0018:ffffbc5f85a17c80 EFLAGS: 00010246
    [ 4765.695800] RAX: ffff9642e26b1370 RBX: ffff96420e880000 RCX: 0000000000000000
    [ 4765.695801] RDX: 0000000000000000 RSI: ffff96420e8a78a8 RDI: ffff96420e880000
    [ 4765.695802] RBP: ffff96420e880000 R08: ffffeb8d0e5d0000 R09: ffffeb8d0e5cc001
    [ 4765.695803] R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000001050
    [ 4765.695804] R13: ffff96420e8c1218 R14: ffff964358662000 R15: 0000000000000000
    [ 4765.695806] FS: 0000000000000000(0000) GS:ffff9650de280000(0000) knlGS:0000000000000000
    [ 4765.695807] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 4765.695808] CR2: 00007f6a18805760 CR3: 000000010c2ae000 CR4: 0000000000750ee0
    [ 4765.695810] PKRU: 55555554
    [ 4765.695810] Call Trace:
    [ 4765.695813] <TASK>
    [ 4765.695815] ? amdgpu_irq_put+0x46/0x70 [amdgpu]
    [ 4765.695963] ? __warn+0x81/0x130
    [ 4765.695970] ? amdgpu_irq_put+0x46/0x70 [amdgpu]
    [ 4765.696108] ? report_bug+0x191/0x1c0
    [ 4765.696112] ? handle_bug+0x3c/0x80
    [ 4765.696116] ? exc_invalid_op+0x17/0x70
    [ 4765.696118] ? asm_exc_invalid_op+0x1a/0x20
    [ 4765.696123] ? amdgpu_irq_put+0x46/0x70 [amdgpu]
    [ 4765.696250] gfx_v9_0_hw_fini+0x35/0x710 [amdgpu]
    [ 4765.696380] amdgpu_device_ip_suspend_phase2+0x101/0x1a0 [amdgpu]
    [ 4765.696497] ? amdgpu_device_ip_suspend_phase1+0x6f/0xe0 [amdgpu]
    [ 4765.696614] amdgpu_device_ip_suspend+0x36/0x70 [amdgpu]
    [ 4765.696731] amdgpu_device_pre_asic_reset+0xd3/0x2a0 [amdgpu]
    [ 4765.696849] amdgpu_device_gpu_recover+0x4c6/0xd70 [amdgpu]
    [ 4765.696968] amdgpu_job_timedout+0x186/0x270 [amdgpu]
    [ 4765.697112] ? srso_alias_return_thunk+0x5/0x7f
    [ 4765.697118] drm_sched_job_timedout+0x7a/0x110 [gpu_sched]
    [ 4765.697124] process_one_work+0x1e1/0x3f0
    [ 4765.697128] worker_thread+0x51/0x390
    [ 4765.697130] ? _raw_spin_lock_irqsave+0x27/0x60
    [ 4765.697133] ? __pfx_worker_thread+0x10/0x10
    [ 4765.697134] kthread+0xf7/0x130
    [ 4765.697137] ? __pfx_kthread+0x10/0x10
    [ 4765.697140] ret_from_fork+0x34/0x50
    [ 4765.697143] ? __pfx_kthread+0x10/0x10
    [ 4765.697146] ret_from_fork_asm+0x1b/0x30
    [ 4765.697152] </TASK>
    [ 4765.697153] ---[ end trace 0000000000000000 ]---
    [ 4765.697161] ------------[ cut here ]------------

    The screen flickers: black <-> back to the desktop <-> black <-> garbage from RAM
    <-> black
    The machine seems frozen, but switching to a console with CTRL-ALT-Fx, then typing blind to log in and retrieve the log, does work.

    If I let the session run for a few minutes and then launch (for example) a fairly heavy browser (Chrome in my case), things go better, but if I resume after a suspend I get the same crash.

    Under Xorg things are more stable, but I still get the same message,
    and my dmesg also shows:
    [ 3881.415898] [drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to pin framebuffer with error -12
    [ 3881.432742] amdgpu 0000:07:00.0: amdgpu: 0000000083f7ea8e pin failed
    [ 3881.432747] [drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to pin framebuffer with error -12
    [ 3881.449576] amdgpu 0000:07:00.0: amdgpu: 000000002757bd96 pin failed
    [ 3881.449582] [drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to pin framebuffer with error -12
    [ 3881.465611] amdgpu 0000:07:00.0: amdgpu: 0000000083f7ea8e pin failed
    [ 3881.465615] [drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to pin framebuffer with error -12
    (continuously)
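
    Note: the -12 in these messages is a negated errno value, and errno 12 is ENOMEM (out of memory). The following is a minimal Python sketch, assuming the dmesg output has been saved as text, that decodes the code and counts the failures. The function name summarize_pin_failures and the sample string are illustrative assumptions; the pattern tolerates the line wrapping that mail clients introduce.

```python
import errno
import re
from collections import Counter

# Matches the amdgpu message even when a mail client wrapped it after "Failed to".
PIN_ERROR = re.compile(r"Failed to\s+pin framebuffer with error (-?\d+)")


def summarize_pin_failures(dmesg_text):
    """Count 'Failed to pin framebuffer' errors, keyed by symbolic errno name."""
    counts = Counter()
    for match in PIN_ERROR.finditer(dmesg_text):
        code = abs(int(match.group(1)))  # the kernel reports errno values negated
        counts[errno.errorcode.get(code, str(code))] += 1
    return dict(counts)


# Sample copied from the log excerpt above (two of the repeated errors).
sample = """\
[ 3881.415898] [drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to
pin framebuffer with error -12
[ 3881.432742] amdgpu 0000:07:00.0: amdgpu: 0000000083f7ea8e pin failed
[ 3881.432747] [drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to
pin framebuffer with error -12
"""

print(summarize_pin_failures(sample))
```

    On a live system the same function could be fed the text produced by dmesg instead of the sample string.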

    Some information about the environment:
    oktail@b550:~$ uname -a
    Linux b550 6.5.0-1-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.5.3-1 (2023-09-13) x86_64 GNU/Linux
    oktail@b550:~$ cat /etc/debian_version
    trixie/sid
    oktail@b550:~$ lspci
    00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne Root Complex
    00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne IOMMU
    00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
    00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
    00:02.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne PCIe GPP Bridge
    00:02.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne PCIe GPP Bridge
    00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
    00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir Internal PCIe GPP Bridge to Bus
    00:08.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir Internal PCIe GPP Bridge to Bus
    00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 51)
    00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
    00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 0
    00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 1
    00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 2
    00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 3
    00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 4
    00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 5
    00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 6
    00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 7
    01:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] 500 Series Chipset USB 3.1 XHCI Controller
    01:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] 500 Series Chipset SATA Controller
    01:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] 500 Series Chipset Switch Upstream Port
    02:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43ea
    02:03.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43ea
    02:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43ea
    03:00.0 Ethernet controller: Intel Corporation Ethernet Controller I225-V (rev 01)
    04:00.0 Network controller: MEDIATEK Corp. MT7921K (RZ608) Wi-Fi 6E 80MHz
    05:00.0 Non-Volatile memory controller: Kingston Technology Company, Inc. OM3PDP3 NVMe SSD (rev 01)
    06:00.0 Non-Volatile memory controller: Kingston Technology Company, Inc. Device 5017 (rev 03)
    07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] (rev c8)
    07:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Renoir Radeon High Definition Audio Controller
    07:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor
    07:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1
    07:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1
    07:00.5 Multimedia controller: Advanced Micro Devices, Inc. [AMD] ACP/ACP3X/ACP6x Audio Coprocessor (rev 01)
    07:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h/19h HD Audio Controller
    08:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 81)
    08:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 81)

    cpuinfo
    processor : 15
    vendor_id : AuthenticAMD
    cpu family : 25
    model : 80
    model name : AMD Ryzen 7 5700G with Radeon Graphics
    stepping : 0
    microcode : 0xa50000c
    cpu MHz : 400.000
    cache size : 512 KB
    physical id : 0
    siblings : 16
    core id : 7
    cpu cores : 8
    apicid : 15
    initial apicid : 15
    fpu : yes
    fpu_exception : yes
    cpuid level : 16
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
    pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb
    rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid
    aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic
    movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce
    topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2
    erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt
    xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local
    clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock
    nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
    bugs : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass srso
    bogomips : 7586.20
    TLB size : 2560 4K pages
    clflush size : 64
    cache_alignment : 64
    address sizes : 48 bits physical, 48 bits virtual
    power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]

    My PC model: Minisforum B550

    Hoping this is enough grist for the mill!

    Thanks

    --
    LECOQ Vincent
    vincent.lecoq@gmail.com

  • From NoSpam@21:1/5 to All on Wed Sep 27 15:00:02 2023
    Hello

    On 27/09/2023 at 13:03, LECOQ Vincent wrote:
    [...]
    Since my last apt full-upgrade yesterday, my GNOME Wayland session crashes
    fairly quickly.
    [...]
    Linux b550 6.5.0-1-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.5.3-1
    (2023-09-13) x86_64 GNU/Linux
    oktail@b550:~$ cat /etc/debian_version
    trixie/sid

    [...]

    The joys of sid :) The solution is to open a bug report and
    roll back to the previous versions of the affected packages.
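
    To see which packages the full-upgrade changed, and so which ones are candidates for a rollback, the apt history log can be inspected. The following is a minimal Python sketch, assuming the usual /var/log/apt/history.log stanza layout (Start-Date, Commandline, Upgrade lines). The function name upgraded_packages and the sample stanza are hypothetical and are not taken from the reporter's machine.

```python
import re

# One entry of an "Upgrade:" line looks like: name:arch (old_version, new_version)
ENTRY = re.compile(r"([^\s:,]+):([^\s(]+) \(([^,()]+), ([^()]+)\)")


def upgraded_packages(history_text):
    """Return (package, old_version, new_version) tuples found on Upgrade: lines."""
    result = []
    for line in history_text.splitlines():
        if line.startswith("Upgrade: "):
            for name, _arch, old, new in ENTRY.findall(line):
                result.append((name, old, new))
    return result


# Hypothetical stanza with made-up package names and versions.
sample = """\
Start-Date: 2023-09-26  10:00:00
Commandline: apt full-upgrade
Upgrade: examplepkg:amd64 (1.0-1, 1.1-1), libexample0:amd64 (2.3-4, 2.4-1)
End-Date: 2023-09-26  10:05:00
"""

for name, old, new in upgraded_packages(sample):
    print(f"{name}: {old} -> {new}")
```

    An older version listed this way can then be requested with apt install name=old_version, provided that version is still available from a configured archive or a local cache.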

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)