Forum: >>> Magnum BBS <<<

USB SSD randomly unmounting

From Pancho@3:770/3 to All on Fri Jul 7 11:45:24 2023

I had a 500 GB Samsung EVO 850, connected to an rPi4 via a USB to SATA adapter, shared via Samba. It had worked this way for years.

A couple of weeks ago it went offline, and wouldn't remount. I switched
off power and restarted, and it worked OK, for a bit, but it dismounted
again, after about 24 hours. I changed the USB/SATA cable, but the
problem persisted.

As this was my main server, which I needed to work, I bought a new SSD,
and copied everything across. Everything now works fine.

The thing is, I can't now see anything wrong with the problematic SSD.
fsck says it is OK. Smartctl says it is OK, but can't run a long test
(the long test always says "Aborted by host" 90% remaining). The SSD now
stays mounted on another machine, still USB.

My guess is there was something like a bad block, which caused the SSD
to dismount when it was accessed. Now that the SSD isn't being used
heavily, the problem just doesn't show up; it stays mounted. Previously,
it was being used for a security camera, so a fair bit of writing.

I know I should just bin the drive, but I'm curious, is there a better
way of testing it, finding a fault?

--- SoupGate-Win32 v1.05
* Origin: Agency HUB, Dunedin - New Zealand | Fido<>Usenet Gateway (3:770/3)

From Jan Panteltje@3:770/3 to Pancho.Jones@proton.me on Fri Jul 7 11:36:24 2023

I had a similar problem with an old Pi,
and it turned out to be the power supply module,
now replaced with original raspi supply, no more problems.
What did / does
dmesg
show?
I sometimes see this on my Pi4 8GB:
[1484748.627699] hwmon hwmon1: Voltage normalised
[1484750.707706] hwmon hwmon1: Undervoltage detected!
[1484754.867040] hwmon hwmon1: Voltage normalised
that is with a 3.8 TB USB Toshiba harddisk on a Siecom USB hub,
so far no crashes... Runs on a UPS..
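
For anyone wanting to check saved dmesg output for the same messages, here is a minimal Python sketch. It assumes each line has the same shape as the three lines quoted above; the function name `undervoltage_events` is made up for illustration.

```python
import re

# Matches kernel lines such as
#   [1484750.707706] hwmon hwmon1: Undervoltage detected!
# The pattern is an assumption based on the three lines quoted in this post.
LINE_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+hwmon \S+: (Undervoltage detected!|Voltage normalised)"
)


def undervoltage_events(dmesg_text):
    """Return (timestamp_seconds, message) for each hwmon voltage line."""
    events = []
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            events.append((float(match.group(1)), match.group(2)))
    return events


sample = """\
[1484748.627699] hwmon hwmon1: Voltage normalised
[1484750.707706] hwmon hwmon1: Undervoltage detected!
[1484754.867040] hwmon hwmon1: Voltage normalised
"""

events = undervoltage_events(sample)
drops = [t for t, msg in events if msg.startswith("Undervoltage")]
print(f"{len(drops)} undervoltage event(s) out of {len(events)} voltage messages")
```

If the timestamps of any undervoltage events sit close to the time a drive drops out, the power supply becomes a more likely suspect.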

From Pancho@3:770/3 to Computer Nerd Kev on Fri Jul 7 13:58:34 2023

On 07/07/2023 13:44, Computer Nerd Kev wrote:

Pancho <Pancho.Jones@proton.me> wrote:

>> My guess is there was something like a bad block, which caused the SSD
>> to dismount when it was accessed.
>
> That isn't a normal response in Linux. For ext file systems the
> behaviour is:
>
> errors={continue|remount-ro|panic}
> Define the behavior when an error is encountered. (Either ignore
> errors and just mark the filesystem erroneous and continue, or
> remount the filesystem read-only, or panic and halt the system.)
> The default is set in the filesystem superblock, and can be
> changed using tune2fs(8).
> (from EXT4(5))
>
> For FAT filesystems it's:
>
> errors={panic|continue|remount-ro}
> Specify FAT behavior on critical errors: panic, continue without
> doing anything, or remount the partition in read-only mode
> (default behavior).
> (from MOUNT(8))
>
> So between continuing and halting the system, the only other option
> should be remounting read-only, not unmounting the filesystem.
>
> Perhaps there was some service unmounting it automatically because
> it thought the drive had been disconnected? Anyway it shouldn't be
> able to happen due to read/write errors.

Thx, I was just guessing. I posted a bit of syslog in response to Jan.
It didn't make much sense to me, but maybe it will to someone else.

From Computer Nerd Kev@3:770/3 to Pancho on Fri Jul 7 22:44:25 2023

Pancho <Pancho.Jones@proton.me> wrote:

> My guess is there was something like a bad block, which caused the SSD
> to dismount when it was accessed.

That isn't a normal response in Linux. For ext file systems the
behaviour is:

errors={continue|remount-ro|panic}
Define the behavior when an error is encountered. (Either ignore
errors and just mark the filesystem erroneous and continue, or
remount the filesystem read-only, or panic and halt the system.)
The default is set in the filesystem superblock, and can be
changed using tune2fs(8).
(from EXT4(5))

For FAT filesystems it's:

errors={panic|continue|remount-ro}
Specify FAT behavior on critical errors: panic, continue without
doing anything, or remount the partition in read-only mode
(default behavior).
(from MOUNT(8))

So between continuing and halting the system, the only other option
should be remounting read-only, not unmounting the filesystem.

Perhaps there was some service unmounting it automatically because
it thought the drive had been disconnected? Anyway it shouldn't be
able to happen due to read/write errors.
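
To see which of these behaviours an ext filesystem is currently set to, the "Errors behavior" line in the output of tune2fs -l can be read. A minimal Python sketch that extracts it from captured text follows; the function name is made up, and the sample is an abridged, invented excerpt in the shape tune2fs prints.

```python
import re


def errors_behavior(tune2fs_output):
    """Extract the 'Errors behavior' value from `tune2fs -l` text, or None."""
    match = re.search(r"^Errors behavior:\s*(.+)$", tune2fs_output, re.MULTILINE)
    return match.group(1).strip() if match else None


# Abridged, invented sample; real output has many more fields.
sample = """\
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
"""

print(errors_behavior(sample))
```

On a real system the text would come from something like running tune2fs -l on the partition as root (the device name depends on the machine). Note that a mount-time errors= option, if one is given, takes precedence over this superblock default.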

--
__ __
#_ < |\| |< _#

From Pancho@3:770/3 to Jan Panteltje on Fri Jul 7 13:55:10 2023

On 07/07/2023 12:36, Jan Panteltje wrote:

> I had a similar problem with an old Pi,
> and it turned out to be the power supply module,
> now replaced with original raspi supply, no more problems.
> What did / does
> dmesg
> show?
> I sometimes see this on my Pi4 8GB:
> [1484748.627699] hwmon hwmon1: Voltage normalised
> [1484750.707706] hwmon hwmon1: Undervoltage detected!
> [1484754.867040] hwmon hwmon1: Voltage normalised
> that is with a 3.8 TB USB Toshiba harddisk on a Siecom USB hub,
> so far no crashes... Runs on a UPS..

I don't think undervoltage is the problem.

Here is a bit of /var/log/syslog.1 that I think is relevant. The rsnapshot
line is good; I think the line after it is where things start going bad.

Jun 25 16:00:15 rpi4 rsnapshot[1261324]: /usr/bin/rsnapshot alpha:
completed successfully
Jun 25 16:05:01 rpi4 CRON[1261432]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Jun 25 16:05:28 rpi4 kernel: [6195820.664216] sd 0:0:0:0: [sda] tag#20 uas_eh_abort_handler 0 uas-tag 4 inflight: CMD OUT
Jun 25 16:05:28 rpi4 kernel: [6195820.664245] sd 0:0:0:0: [sda] tag#20
CDB: Write(10) 2a 00 0d 14 09 5f 00 00 08 00
Jun 25 16:05:28 rpi4 kernel: [6195820.664781] sd 0:0:0:0: [sda] tag#19 uas_eh_abort_handler 0 uas-tag 3 inflight: CMD OUT
Jun 25 16:05:28 rpi4 kernel: [6195820.664794] sd 0:0:0:0: [sda] tag#19
CDB: Write(10) 2a 00 0d 14 09 7f 00 00 08 00
Jun 25 16:05:28 rpi4 kernel: [6195820.665104] sd 0:0:0:0: [sda] tag#18 uas_eh_abort_handler 0 uas-tag 2 inflight: CMD OUT
Jun 25 16:05:28 rpi4 kernel: [6195820.665115] sd 0:0:0:0: [sda] tag#18
CDB: Write(10) 2a 00 25 c9 01 8f 00 00 18 00
Jun 25 16:05:43 rpi4 kernel: [6195836.024416] sd 0:0:0:0: [sda] tag#27 uas_eh_abort_handler 0 uas-tag 11 inflight: CMD OUT
Jun 25 16:05:43 rpi4 kernel: [6195836.024451] sd 0:0:0:0: [sda] tag#27
CDB: Write(10) 2a 00 0b f3 35 5f 00 04 00 00
Jun 25 16:05:43 rpi4 kernel: [6195836.024927] sd 0:0:0:0: [sda] tag#26 uas_eh_abort_handler 0 uas-tag 10 inflight: CMD OUT
Jun 25 16:05:43 rpi4 kernel: [6195836.024942] sd 0:0:0:0: [sda] tag#26
CDB: Write(10) 2a 00 0b f3 31 5f 00 04 00 00
Jun 25 16:05:43 rpi4 kernel: [6195836.025287] sd 0:0:0:0: [sda] tag#25 uas_eh_abort_handler 0 uas-tag 9 inflight: CMD OUT
Jun 25 16:05:43 rpi4 kernel: [6195836.025301] sd 0:0:0:0: [sda] tag#25
CDB: Write(10) 2a 00 0b f3 2d 5f 00 04 00 00
Jun 25 16:05:43 rpi4 kernel: [6195836.025639] sd 0:0:0:0: [sda] tag#24 uas_eh_abort_handler 0 uas-tag 8 inflight: CMD OUT
Jun 25 16:05:43 rpi4 kernel: [6195836.025652] sd 0:0:0:0: [sda] tag#24
CDB: Write(10) 2a 00 0b f3 29 5f 00 04 00 00
Jun 25 16:05:43 rpi4 kernel: [6195836.025991] sd 0:0:0:0: [sda] tag#23 uas_eh_abort_handler 0 uas-tag 7 inflight: CMD OUT
Jun 25 16:05:43 rpi4 kernel: [6195836.026004] sd 0:0:0:0: [sda] tag#23
CDB: Write(10) 2a 00 0b f3 25 5f 00 04 00 00
Jun 25 16:05:43 rpi4 kernel: [6195836.026327] sd 0:0:0:0: [sda] tag#22 uas_eh_abort_handler 0 uas-tag 6 inflight: CMD OUT
Jun 25 16:05:43 rpi4 kernel: [6195836.026340] sd 0:0:0:0: [sda] tag#22
CDB: Write(10) 2a 00 0b f3 21 5f 00 04 00 00
Jun 25 16:05:43 rpi4 kernel: [6195836.026677] sd 0:0:0:0: [sda] tag#21 uas_eh_abort_handler 0 uas-tag 5 inflight: CMD OUT
Jun 25 16:05:43 rpi4 kernel: [6195836.026690] sd 0:0:0:0: [sda] tag#21
CDB: Write(10) 2a 00 0b f3 21 57 00 00 08 00
Jun 25 16:05:46 rpi4 kernel: [6195838.328427] sd 0:0:0:0: [sda] tag#12 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD
Jun 25 16:05:46 rpi4 kernel: [6195838.328464] sd 0:0:0:0: [sda] tag#12
CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
Jun 25 16:05:46 rpi4 kernel: [6195838.344423] scsi host0: uas_eh_device_reset_handler start
Jun 25 16:05:51 rpi4 kernel: [6195843.437090] usb 2-1: Disable of device-initiated U1 failed.
Jun 25 16:05:56 rpi4 kernel: [6195848.556846] usb 2-1: Disable of device-initiated U2 failed.
Jun 25 16:05:56 rpi4 kernel: [6195848.685697] usb 2-1: reset SuperSpeed
USB device number 2 using xhci_hcd
Jun 25 16:05:56 rpi4 kernel: [6195848.705275] usb 2-1: device firmware
changed
Jun 25 16:05:56 rpi4 kernel: [6195848.713529] scsi host0: uas_eh_device_reset_handler FAILED err -19
Jun 25 16:05:56 rpi4 kernel: [6195848.713558] sd 0:0:0:0: Device
offlined - not ready after error recovery
Jun 25 16:05:56 rpi4 kernel: [6195848.713569] sd 0:0:0:0: Device
offlined - not ready after error recovery
Jun 25 16:05:56 rpi4 kernel: [6195848.713579] sd 0:0:0:0: Device
offlined - not ready after error recovery
Jun 25 16:05:56 rpi4 kernel: [6195848.713587] sd 0:0:0:0: Device
offlined - not ready after error recovery
Jun 25 16:05:56 rpi4 kernel: [6195848.713596] sd 0:0:0:0: Device
offlined - not ready after error recovery
Jun 25 16:05:56 rpi4 kernel: [6195848.713604] sd 0:0:0:0: Device
offlined - not ready after error recovery
Jun 25 16:05:56 rpi4 kernel: [6195848.713612] sd 0:0:0:0: Device
offlined - not ready after error recovery
Jun 25 16:05:56 rpi4 kernel: [6195848.713620] sd 0:0:0:0: Device
offlined - not ready after error recovery
Jun 25 16:05:56 rpi4 kernel: [6195848.713628] sd 0:0:0:0: Device
offlined - not ready after error recovery
Jun 25 16:05:56 rpi4 kernel: [6195848.713637] sd 0:0:0:0: Device
offlined - not ready after error recovery
Jun 25 16:05:56 rpi4 kernel: [6195848.713645] sd 0:0:0:0: Device
offlined - not ready after error recovery
Jun 25 16:05:56 rpi4 kernel: [6195848.713681] sd 0:0:0:0: [sda] tag#12
FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK cmd_age=70s
Jun 25 16:05:56 rpi4 kernel: [6195848.713696] sd 0:0:0:0: [sda] tag#12
CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
Jun 25 16:05:56 rpi4 kernel: [6195848.713714] blk_update_request: I/O
error, dev sda, sector 487857615 op 0x1:(WRITE) flags 0x800 phys_seg 1
prio class 0
Jun 25 16:05:56 rpi4 kernel: [6195848.724869] Aborting journal on device sda1-8.
Jun 25 16:05:56 rpi4 kernel: [6195848.724965] usb 2-1: USB disconnect,
device number 2
Jun 25 16:05:56 rpi4 kernel: [6195848.727015] sd 0:0:0:0: [sda] tag#21
FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK cmd_age=42s
Jun 25 16:05:56 rpi4 kernel: [6195848.727033] sd 0:0:0:0: [sda] tag#21
CDB: Write(10) 2a 00 0b f3 21 57 00 00 08 00
Jun 25 16:05:56 rpi4 kernel: [6195848.727042] blk_update_request: I/O
error, dev sda, sector 200483159 op 0x1:(WRITE) flags 0x0 phys_seg 1
prio class 0
Jun 25 16:05:56 rpi4 kernel: [6195848.727056] EXT4-fs warning (device
sda1): ext4_end_bio:344: I/O error 10 writing to inode 2885595 starting
block 25060395)
Jun 25 16:05:56 rpi4 kernel: [6195848.727071] Buffer I/O error on device
sda1, logical block 25052203
Jun 25 16:05:56 rpi4 kernel: [6195848.727120] sd 0:0:0:0: [sda] tag#22
FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK cmd_age=42s
Jun 25 16:05:56 rpi4 kernel: [6195848.727130] sd 0:0:0:0: [sda] tag#22
CDB: Write(10) 2a 00 0b f3 21 5f 00 04 00 00
Jun 25 16:05:56 rpi4 kernel: [6195848.727137] blk_update_request: I/O
error, dev sda, sector 200483167 op 0x1:(WRITE) flags 0x4000 phys_seg
128 prio class 0
Jun 25 16:05:56 rpi4 kernel: [6195848.727163] sd 0:0:0:0: [sda] tag#23
FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK cmd_age=42s
Jun 25 16:05:56 rpi4 kernel: [6195848.727172] sd 0:0:0:0: [sda] tag#23
CDB: Write(10) 2a 00 0b f3 25 5f 00 04 00 00
Jun 25 16:05:56 rpi4 kernel: [6195848.727179] blk_update_request: I/O
error, dev sda, sector 200484191 op 0x1:(WRITE) flags 0x4000 phys_seg
128 prio class 0
Jun 25 16:05:56 rpi4 kernel: [6195848.727204] sd 0:0:0:0: [sda] tag#24
FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK cmd_age=42s
Jun 25 16:05:56 rpi4 kernel: [6195848.727212] sd 0:0:0:0: [sda] tag#24
CDB: Write(10) 2a 00 0b f3 29 5f 00 04 00 00
Jun 25 16:05:56 rpi4 kernel: [6195848.727219] blk_update_request: I/O
error, dev sda, sector 200485215 op 0x1:(WRITE) flags 0x4000 phys_seg
128 prio class 0
Jun 25 16:05:56 rpi4 kernel: [6195848.727243] sd 0:0:0:0: [sda] tag#25
FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK cmd_age=42s
Jun 25 16:05:56 rpi4 kernel: [6195848.727251] sd 0:0:0:0: [sda] tag#25
CDB: Write(10) 2a 00 0b f3 2d 5f 00 04 00 00
Jun 25 16:05:56 rpi4 kernel: [6195848.727258] blk_update_request: I/O
error, dev sda, sector 200486239 op 0x1:(WRITE) flags 0x4000 phys_seg
128 prio class 0
Jun 25 16:05:56 rpi4 kernel: [6195848.727279] sd 0:0:0:0: [sda] tag#26
FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK cmd_age=42s
Jun 25 16:05:56 rpi4 kernel: [6195848.727287] sd 0:0:0:0: [sda] tag#26
CDB: Write(10) 2a 00 0b f3 31 5f 00 04 00 00
Jun 25 16:05:56 rpi4 kernel: [6195848.727294] blk_update_request: I/O
error, dev sda, sector 200487263 op 0x1:(WRITE) flags 0x4000 phys_seg
128 prio class 0
Jun 25 16:05:56 rpi4 kernel: [6195848.727317] sd 0:0:0:0: [sda] tag#27
FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK cmd_age=42s
Jun 25 16:05:56 rpi4 kernel: [6195848.727325] sd 0:0:0:0: [sda] tag#27
CDB: Write(10) 2a 00 0b f3 35 5f 00 04 00 00
Jun 25 16:05:56 rpi4 kernel: [6195848.727332] blk_update_request: I/O
error, dev sda, sector 200488287 op 0x1:(WRITE) flags 0x4000 phys_seg
128 prio class 0
Jun 25 16:05:56 rpi4 kernel: [6195848.727351] sd 0:0:0:0: [sda] tag#18
FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK cmd_age=58s
Jun 25 16:05:56 rpi4 kernel: [6195848.727359] sd 0:0:0:0: [sda] tag#18
CDB: Write(10) 2a 00 25 c9 01 8f 00 00 18 00
Jun 25 16:05:56 rpi4 kernel: [6195848.727366] blk_update_request: I/O
error, dev sda, sector 633930127 op 0x1:(WRITE) flags 0x0 phys_seg 3
prio class 0
Jun 25 16:05:56 rpi4 kernel: [6195848.727376] EXT4-fs warning (device
sda1): ext4_end_bio:344: I/O error 10 writing to inode 12588833 starting
block 79241266)
Jun 25 16:05:56 rpi4 kernel: [6195848.727387] Buffer I/O error on device
sda1, logical block 79233074
Jun 25 16:05:56 rpi4 kernel: [6195848.727417] EXT4-fs warning (device
sda1): ext4_end_bio:344: I/O error 10 writing to inode 12588833 starting
block 79241268)
Jun 25 16:05:56 rpi4 kernel: [6195848.727428] Buffer I/O error on device
sda1, logical block 79233075
Jun 25 16:05:56 rpi4 kernel: [6195848.727437] Buffer I/O error on device
sda1, logical block 79233076
Jun 25 16:05:56 rpi4 kernel: [6195848.727460] sd 0:0:0:0: [sda] tag#19
FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK cmd_age=58s
Jun 25 16:05:56 rpi4 kernel: [6195848.727468] sd 0:0:0:0: [sda] tag#19
CDB: Write(10) 2a 00 0d 14 09 7f 00 00 08 00
Jun 25 16:05:56 rpi4 kernel: [6195848.727475] blk_update_request: I/O
error, dev sda, sector 219416959 op 0x1:(WRITE) flags 0x0 phys_seg 1
prio class 0
Jun 25 16:05:56 rpi4 kernel: [6195848.727485] EXT4-fs warning (device
sda1): ext4_end_bio:344: I/O error 10 writing to inode 12588834 starting
block 27427120)
Jun 25 16:05:56 rpi4 kernel: [6195848.727496] Buffer I/O error on device
sda1, logical block 27418928
Jun 25 16:05:56 rpi4 kernel: [6195848.727513] EXT4-fs warning (device
sda1): ext4_end_bio:344: I/O error 10 writing to inode 12588834 starting
block 27427116)
Jun 25 16:05:56 rpi4 kernel: [6195848.727523] Buffer I/O error on device
sda1, logical block 27418924
Jun 25 16:05:56 rpi4 kernel: [6195848.727599] sd 0:0:0:0: rejecting I/O
to offline device
Jun 25 16:05:56 rpi4 kernel: [6195848.729583] EXT4-fs (sda1):
ext4_writepages: jbd2_start: 3115 pages, ino 2885595; err -30
Jun 25 16:05:56 rpi4 kernel: [6195848.729637] Buffer I/O error on dev
sda1, logical block 60850176, lost sync page write
Jun 25 16:05:56 rpi4 kernel: [6195848.729648] EXT4-fs error (device
sda1): ext4_journal_check_start:83: comm MQTT-rpi3a: Detected aborted
journal
Jun 25 16:05:56 rpi4 kernel: [6195848.729663] JBD2: Error -5 detected
when updating journal superblock for sda1-8.
Jun 25 16:05:56 rpi4 kernel: [6195848.729731] Buffer I/O error on dev
sda1, logical block 0, lost sync page write
Jun 25 16:05:56 rpi4 kernel: [6195848.729771] EXT4-fs (sda1): I/O error
while writing superblock
Jun 25 16:05:56 rpi4 kernel: [6195848.729780] EXT4-fs (sda1): Remounting filesystem read-only
Jun 25 16:05:56 rpi4 kernel: [6195848.930557] EXT4-fs warning (device
sda1): ext4_end_bio:344: I/O error 10 writing to inode 2885595 starting
block 25061375)
Jun 25 16:05:56 rpi4 kernel: [6195848.930642] EXT4-fs (sda1): failed to
convert unwritten extents to written extents -- potential data loss!
(inode 2885595, error -30)
Jun 25 16:05:56 rpi4 kernel: [6195848.943721] Buffer I/O error on device
sda1, logical block 25052204
Jun 25 16:05:56 rpi4 kernel: [6195848.950710] Buffer I/O error on device
sda1, logical block 25052205
Jun 25 16:05:56 rpi4 kernel: [6195848.957371] Buffer I/O error on device
sda1, logical block 25052206
Jun 25 16:05:56 rpi4 kernel: [6195848.964194] Buffer I/O error on device
sda1, logical block 25052207
Jun 25 16:05:57 rpi4 kernel: [6195849.358845] EXT4-fs error (device
sda1): __ext4_find_entry:1663: inode #2949122: comm tail: reading
directory lblock 0
Jun 25 16:05:57 rpi4 kernel: [6195849.358841] EXT4-fs error (device
sda1): __ext4_find_entry:1663: inode #2949122: comm tail: reading
directory lblock 0
Jun 25 16:05:57 rpi4 kernel: [6195849.358919] Buffer I/O error on dev
sda1, logical block 0, lost sync page write
Jun 25 16:05:57 rpi4 kernel: [6195849.388672] EXT4-fs (sda1): I/O error
while writing superblock
Jun 25 16:05:57 rpi4 kernel: [6195849.394937] Buffer I/O error on dev
sda1, logical block 0, lost sync page write
Jun 25 16:05:57 rpi4 kernel: [6195849.402628] EXT4-fs (sda1): I/O error
while writing superblock
Jun 25 16:05:57 rpi4 kernel: [6195849.859353] EXT4-fs error (device
sda1): ext4_wait_block_bitmap:531: comm MQTT-rpi3a: Cannot read block
bitmap - block_group = 2418, block_bitmap = 79167490
Jun 25 16:05:57 rpi4 kernel: [6195849.873920] Buffer I/O error on dev
sda1, logical block 0, lost sync page write
Jun 25 16:05:57 rpi4 kernel: [6195849.881928] EXT4-fs (sda1): I/O error
while writing superblock
Jun 25 16:05:57 rpi4 kernel: [6195849.881954] EXT4-fs error (device
sda1): ext4_discard_preallocations:5036: comm MQTT-rpi3a: Error -5
loading buddy information for 2418
Jun 25 16:05:57 rpi4 kernel: [6195849.901153] Buffer I/O error on dev
sda1, logical block 0, lost sync page write
Jun 25 16:05:57 rpi4 kernel: [6195849.908875] EXT4-fs (sda1): I/O error
while writing superblock
Jun 25 16:05:57 rpi4 systemd[1]: docker-000b2ff9ea29a56854ce883f927b06ef2a3445ee58dc2d4d0638abe4e9d043c0.scope: Deactivated successfully.
Jun 25 16:05:57 rpi4 systemd[1]: docker-000b2ff9ea29a56854ce883f927b06ef2a3445ee58dc2d4d0638abe4e9d043c0.scope: Consumed 4h 33min 20.171s CPU time.
Jun 25 16:05:58 rpi4 kernel: [6195850.370638] EXT4-fs error (device
sda1): __ext4_find_entry:1663: inode #2949122: comm tail: reading
directory lblock 0
Jun 25 16:05:58 rpi4 kernel: [6195850.382076] Buffer I/O error on dev
sda1, logical block 0, lost sync page write
Jun 25 16:05:58 rpi4 kernel: [6195850.389058] EXT4-fs error (device
sda1): __ext4_find_entry:1663: inode #2949122: comm tail: reading
directory lblock 0
Jun 25 16:05:58 rpi4 kernel: [6195850.390048] EXT4-fs (sda1): I/O error
while writing superblock
Jun 25 16:05:58 rpi4 kernel: [6195850.407332] Buffer I/O error on dev
sda1, logical block 0, lost sync page write
Jun 25 16:05:58 rpi4 kernel: [6195850.415395] EXT4-fs (sda1): I/O error
while writing superblock
Jun 25 16:05:58 rpi4 containerd[909]: time="2023-06-25T16:05:58.855133545+01:00" level=info msg="shim
disconnected" id=000b2ff9ea29a56854ce883f927b06ef2a3445ee58dc2d4d0638abe4e9d043c0
Jun 25 16:05:58 rpi4 containerd[909]: time="2023-06-25T16:05:58.875593006+01:00" level=warning msg="cleaning
up after shim disconnected" id=000b2ff9ea29a56854ce883f927b06ef2a3445ee58dc2d4d0638abe4e9d043c0 namespace=moby
Jun 25 16:05:58 rpi4 containerd[909]: time="2023-06-25T16:05:58.875654302+01:00" level=info msg="cleaning up
dead shim"
Jun 25 16:05:58 rpi4 dockerd[218433]: time="2023-06-25T16:05:58.873599939+01:00" level=info msg="ignoring
event" container=000b2ff9ea29a56854ce883f927b06ef2a3445ee58dc2d4d0638abe4e9d043c0 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Jun 25 16:05:58 rpi4 containerd[909]: time="2023-06-25T16:05:58.976245851+01:00" level=warning msg="cleanup
warnings time=\"2023-06-25T16:05:58+01:00\" level=info msg=\"starting
signal loop\" namespace=moby pid=1261463 runtime=io.containerd.runc.v2\n"
Jun 25 16:05:59 rpi4 kernel: [6195851.105773] EXT4-fs warning (device
sda1): htree_dirblock_to_tree:1072: inode #3145730: lblock 0: comm
python3: error -5 reading directory block
Jun 25 16:05:59 rpi4 kernel: [6195851.139008] EXT4-fs error (device
sda1): __ext4_find_entry:1663: inode #5898243: comm dockerd: reading
directory lblock 0
Jun 25 16:05:59 rpi4 dockerd[218433]: time="2023-06-25T16:05:59.100406766+01:00" level=warning msg="failed to
get endpoint_count map for scope local: open /mnt/ssd/var/lib/docker/network/files/local-kv.db: input/output error"
Jun 25 16:05:59 rpi4 kernel: [6195851.184699] sd 0:0:0:0: [sda]
Synchronizing SCSI cache
Jun 25 16:05:59 rpi4 kernel: [6195851.195331] EXT4-fs error (device
sda1): __ext4_find_entry:1663: inode #5898243: comm dockerd: reading
directory lblock 0
Jun 25 16:05:59 rpi4 kernel: [6195851.216288] EXT4-fs error (device
sda1): __ext4_find_entry:1663: inode #5898243: comm dockerd: reading
directory lblock 0
Jun 25 16:05:59 rpi4 kernel: [6195851.285295] sd 0:0:0:0: [sda]
Synchronize Cache(10) failed: Result: hostbyte=DID_ERROR
driverbyte=DRIVER_OK
Jun 25 16:05:59 rpi4 systemd[1]: Unmounting /mnt/ssd/var/lib/docker/overlay2/5629a6e8d4bea9ddc2810428b2930991bd61f40a11a9ed9dc92b92c8b84f7b09/merged...
Jun 25 16:05:59 rpi4 dockerd[218433]: time="2023-06-25T16:05:59.142223677+01:00" level=warning msg="failed to
get endpoint_count map for scope local: open /mnt/ssd/var/lib/docker/network/files/local-kv.db: input/output error"
Jun 25 16:05:59 rpi4 dockerd[218433]: time="2023-06-25T16:05:59.162758619+01:00" level=warning msg="failed to
get endpoint_count map for scope local: open /mnt/ssd/var/lib/docker/network/files/local-kv.db: input/output error"
Jun 25 16:05:59 rpi4 multipathd[470]: sda: path already removed
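
As a cross-check on the log above, the Write(10) CDB bytes can be decoded into a block address and a length. The sketch below is only an illustration: it assumes the standard Write(10) layout (opcode 0x2a in byte 0, big-endian LBA in bytes 2-5, big-endian transfer length in bytes 7-8), and the function name `decode_write10` is made up.

```python
def decode_write10(cdb_hex):
    """Decode a SCSI Write(10) CDB given as space-separated hex bytes.

    Returns (lba, blocks).
    """
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte Write(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7-8: transfer length
    return lba, blocks


# CDBs copied from the log lines for tag#21, tag#22 and tag#19.
for cdb in (
    "2a 00 0b f3 21 57 00 00 08 00",
    "2a 00 0b f3 21 5f 00 04 00 00",
    "2a 00 0d 14 09 7f 00 00 08 00",
):
    lba, blocks = decode_write10(cdb)
    print(f"{cdb} -> LBA {lba}, {blocks} blocks")
```

The decoded addresses (200483159, 200483167 and 219416959) match the sector numbers in the corresponding blk_update_request lines, and the 1024-block writes line up with the 1024-sector spacing between the failed requests. The timed-out writes are spread over several widely separated addresses rather than clustered at one spot.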

From Vincent Coen@2:250/1 to Pancho on Fri Jul 7 14:32:21 2023

Hello Pancho!

Just a weak guess, but do you have the util-linux package installed, which provides fstrim? It might have a slightly different name on your distro, as you have not specified what you are using.

If so, run sudo fstrim -av once the drive is online and see what you get.

Let it run for 30 minutes, then rerun it and see what the size is (you should ignore the first run after a reboot, but make a note of the size, which should
be the same as the partition size, more or less).
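
A minimal Python sketch for comparing the two runs is below. It assumes the verbose output lines look like "MOUNTPOINT: SIZE (N bytes) trimmed", optionally followed by "on DEVICE"; the sample lines and the mount point are invented for illustration.

```python
import re

# Assumed shape of an `fstrim -v` line, e.g.
#   /mnt/ssd: 400 GiB (429496729600 bytes) trimmed on /dev/sda1
TRIM_RE = re.compile(r"^(?P<mount>\S+): .*\((?P<bytes>\d+) bytes\) trimmed")


def trimmed_bytes(fstrim_output):
    """Map each mount point to the number of bytes fstrim reported trimming."""
    result = {}
    for line in fstrim_output.splitlines():
        match = TRIM_RE.match(line)
        if match:
            result[match.group("mount")] = int(match.group("bytes"))
    return result


# Invented sample output from two runs, 30 minutes apart.
first_run = "/mnt/ssd: 400 GiB (429496729600 bytes) trimmed on /dev/sda1"
second_run = "/mnt/ssd: 1 GiB (1073741824 bytes) trimmed on /dev/sda1"

first = trimmed_bytes(first_run)
second = trimmed_bytes(second_run)
for mount in first:
    print(mount, "first:", first[mount], "second:", second.get(mount))
```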

This does not work when using an M.2 device, at least here. It may be treated differently, but I have no idea why.

I use fstrim on every Linux system that has an SSD installed. I run it every 24 hours on systems that are up 24/7, and every 12 hours on one system because it gets a lot of new or amended data in that period.

The above is to service garbage collection. This is not needed for Windows, as that does it internally and very well.

Vincent

--- Mageia Linux v8 X64/Mbse v1.0.8.3/GoldED+/LNX 1.1.5-b20180707
* Origin: Air Applewood, The Linux Gateway to the UK & Eire (2:250/1)

From The Natural Philosopher@3:770/3 to Pancho on Fri Jul 7 14:44:08 2023

Compare its SMART data with that of the good one...
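
A minimal Python sketch of such a comparison, assuming the attribute table printed by smartctl -A (ten whitespace-separated columns, with the raw value last). The function names are made up and the sample values are invented; they are not readings from either drive in this thread.

```python
def parse_attributes(smartctl_output):
    """Map attribute name -> raw value from a `smartctl -A` attribute table."""
    attrs = {}
    for line in smartctl_output.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(parts) >= 10 and parts[0].isdigit():
            attrs[parts[1]] = " ".join(parts[9:])
    return attrs


def differing(a, b):
    """Return {name: (value_a, value_b)} for shared attributes that differ."""
    return {
        name: (a[name], b[name])
        for name in sorted(a.keys() & b.keys())
        if a[name] != b[name]
    }


# Invented sample tables for an old and a new drive.
old_drive = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
177 Wear_Leveling_Count     0x0013   060   060   000    Pre-fail  Always       -       800
"""
new_drive = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       3
"""

diff = differing(parse_attributes(old_drive), parse_attributes(new_drive))
print(diff)
```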
--
There’s a mighty big difference between good, sound reasons and reasons
that sound good.

Burton Hillis (William Vaughn, American columnist)

From Theo@3:770/3 to Pancho on Fri Jul 7 15:30:42 2023

Pancho <Pancho.Jones@proton.me> wrote:

> Here is a bit of /var/log/syslog.1. I think is relevant. rsnapshot is
> good, I think the line after is things going bad.
>
> Jun 25 16:00:15 rpi4 rsnapshot[1261324]: /usr/bin/rsnapshot alpha:
> completed successfully
> Jun 25 16:05:01 rpi4 CRON[1261432]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> Jun 25 16:05:28 rpi4 kernel: [6195820.664216] sd 0:0:0:0: [sda] tag#20 uas_eh_abort_handler 0 uas-tag 4 inflight: CMD OUT

That looks sick. uas is USB Attached SCSI, so the above means the Linux
kernel tried to issue a UAS command and for some reason it failed.

Eventually the errors stack up and Linux tries things like resetting
the device in the hope that will fix it, but it doesn't.
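
To make that escalation visible in a long log, a rough Python sketch can count the stages. The substrings are taken from the kernel messages Pancho posted; the stage labels and the function name are informal choices for this sketch, and the input is assumed to be the unwrapped syslog file, since the posted copy has line breaks inside some messages.

```python
from collections import Counter

# (informal label, substring as it appears in the posted kernel messages)
STAGES = [
    ("command aborted", "uas_eh_abort_handler"),
    ("device reset started", "uas_eh_device_reset_handler start"),
    ("device reset failed", "uas_eh_device_reset_handler FAILED"),
    ("device offlined", "Device offlined"),
    ("usb disconnect", "USB disconnect"),
]


def summarise(lines):
    """Count how many log lines belong to each error-handling stage."""
    counts = Counter()
    for line in lines:
        for label, needle in STAGES:
            if needle in line:
                counts[label] += 1
    return counts


# A few lines rejoined from the posted log.
sample = [
    "Jun 25 16:05:28 rpi4 kernel: [6195820.664216] sd 0:0:0:0: [sda] tag#20 uas_eh_abort_handler 0 uas-tag 4 inflight: CMD OUT",
    "Jun 25 16:05:46 rpi4 kernel: [6195838.344423] scsi host0: uas_eh_device_reset_handler start",
    "Jun 25 16:05:56 rpi4 kernel: [6195848.713529] scsi host0: uas_eh_device_reset_handler FAILED err -19",
    "Jun 25 16:05:56 rpi4 kernel: [6195848.713558] sd 0:0:0:0: Device offlined - not ready after error recovery",
    "Jun 25 16:05:56 rpi4 kernel: [6195848.724965] usb 2-1: USB disconnect, device number 2",
]

counts = summarise(sample)
for label, _ in STAGES:
    print(f"{label}: {counts[label]}")
```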

The first thing I'd do is try updating the firmware: https://wiki.gentoo.org/wiki/Samsung_SSD_Firmware

That may not work on a Pi as you can't run x86 binaries, so you may have to resort to the Samsung Magician Windows tool. As well as firmware updates,
that supposedly has a health check feature. Maybe that will tell you what's going on.

If you've been doing a lot of security camera writes, it's possible that has worn out the SSD.

Theo

--- SoupGate-Win32 v1.05
* Origin: Agency HUB, Dunedin - New Zealand | Fido<>Usenet Gateway (3:770/3)

From crn@nospam.com@3:770/3 to Pancho on Fri Jul 7 16:12:59 2023

Pancho <Pancho.Jones@proton.me> wrote:

I had an 500 GB Samsung EVO 850, connected to an rPi4 via a USB to SATA adapter, shared via Samba. It had worked this way for years.

A couple of weeks ago it went offline, and wouldn't remount. I switched
off power and restarted, and it worked OK, for a bit, but it dismounted again, after about 24 hours. I changed the USB/SATA cable, but the
problem persisted.

As this was my main server, which I needed to work, I bought a new SSD,
and copied everything across. Everything all now works fine.

The thing is, I can't now see anything wrong with the problematic SSD.
fsck says it is OK. Smartctl says it is OK, but can't run a long test
(the long test always says "Aborted by host" 90% remaining). The SSD now stays mounted on another machine, still USB.

My guess is there was something like a bad block, which caused the SSD
to dismount when it was accessed. Now that the SSD isn't being used
heavily, the problem just doesn't show up, it stays mounted. Previously,
It was being used for a security camera, so a fair bit of writing.

I know I should just bin the drive, but I'm curious, is there a better
way of testing it, finding a fault?

Badblocks
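
badblocks defaults to a read-only pass. Where it isn't installed, a rough read-only stand-in can be sketched as below, assuming a Unix system and enough privileges to open the device node. The function name `read_scan` is hypothetical, and this is not a reimplementation of badblocks: it only reads every chunk once and notes where a read fails.

```python
import os


def read_scan(path, chunk_size=1024 * 1024):
    """Read `path` from start to end.

    Returns (bytes_read, offsets_of_chunks_that_raised_a_read_error).
    """
    bad_offsets = []
    bytes_read = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)  # file or device size in bytes
        offset = 0
        while offset < size:
            want = min(chunk_size, size - offset)
            try:
                data = os.pread(fd, want, offset)
            except OSError:
                # Record the failing chunk and move on to the next one.
                bad_offsets.append(offset)
                offset += want
                continue
            if not data:
                break  # unexpected end of data
            bytes_read += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return bytes_read, bad_offsets
```

If the drive drops off the bus partway through, every later chunk will fail, which is itself the symptom being chased. Reads of recently accessed data may also be served from the page cache rather than the drive.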

--
http://www.netunix.com/

--- SoupGate-Win32 v1.05
* Origin: Agency HUB, Dunedin - New Zealand | Fido<>Usenet Gateway (3:770/3)

From Jim Jackson@3:770/3 to crn@nospam.com on Sat Jul 8 09:40:31 2023

On 2023-07-07, crn@nospam.com <crn@nospam.com> wrote:

Pancho <Pancho.Jones@proton.me> wrote:

I had an 500 GB Samsung EVO 850, connected to an rPi4 via a USB to SATA
adapter, shared via Samba. It had worked this way for years.

A couple of weeks ago it went offline, and wouldn't remount. I switched
off power and restarted, and it worked OK, for a bit, but it dismounted
again, after about 24 hours. I changed the USB/SATA cable, but the
problem persisted.

As this was my main server, which I needed to work, I bought a new SSD,
and copied everything across. Everything all now works fine.

The thing is, I can't now see anything wrong with the problematic SSD.
fsck says it is OK. Smartctl says it is OK, but can't run a long test
(the long test always says "Aborted by host" 90% remaining). The SSD now
stays mounted on another machine, still USB.

My guess is there was something like a bad block, which caused the SSD
to dismount when it was accessed. Now that the SSD isn't being used
heavily, the problem just doesn't show up, it stays mounted. Previously,
It was being used for a security camera, so a fair bit of writing.

I know I should just bin the drive, but I'm curious, is there a better
way of testing it, finding a fault?

Badblocks

Don't SSDs reserve some spare "blocks" (whatever they are called) that
get used to replace faulty ones? Won't smartctl tell you if that has
happened?

--- SoupGate-Win32 v1.05
* Origin: Agency HUB, Dunedin - New Zealand | Fido<>Usenet Gateway (3:770/3)

From druck@3:770/3 to Jim Jackson on Sat Jul 8 20:42:49 2023

On 08/07/2023 10:40, Jim Jackson wrote:

On 2023-07-07, crn@nospam.com <crn@nospam.com> wrote:

Pancho <Pancho.Jones@proton.me> wrote:

I know I should just bin the drive, but I'm curious, is there a better
way of testing it, finding a fault?

Badblocks

Don't SSDs reserve some spare "blocks" (whatever they are called) that
get used to replace faulty ones? Won't smartctl tell you if that has happened?

smartctl was designed for hard discs, which are far simpler, having only
sectors on spinning rust. It may or may not show the relocated sector
count for SSDs, and even that might not be the whole story.

An SSD may have multiple types of flash: the main bulk is slower,
cheaper flash storing 2, 3 or 4 bits per cell, alongside a cache of
faster, more expensive single-bit-per-cell flash. There is also the
storage that holds the crucial mapping of logical addresses to flash blocks.

A failure of a certain percentage of the main flash can normally be
accommodated by over-provisioning, but a failure of the cache or mapping
store may cause the SSD to go into read-only mode.

Sometimes a complete reformat of flash memory devices can appear to
temporarily clear problems, but the wear life doesn't magically come
back, and would you want to trust it again?

---druck

--- SoupGate-Win32 v1.05
* Origin: Agency HUB, Dunedin - New Zealand | Fido<>Usenet Gateway (3:770/3)

From Pancho@3:770/3 to Theo on Sat Jul 8 20:52:41 2023

On 7/7/23 15:30, Theo wrote:

Pancho <Pancho.Jones@proton.me> wrote:

Here is a bit of /var/log/syslog.1 that I think is relevant. The rsnapshot
line is fine; I think the line after it is where things go bad.

Jun 25 16:00:15 rpi4 rsnapshot[1261324]: /usr/bin/rsnapshot alpha: completed successfully
Jun 25 16:05:01 rpi4 CRON[1261432]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Jun 25 16:05:28 rpi4 kernel: [6195820.664216] sd 0:0:0:0: [sda] tag#20 uas_eh_abort_handler 0 uas-tag 4 inflight: CMD OUT

That looks sick. uas is USB Attached SCSI, so the above means the Linux kernel tried to issue a UAS command and for some reason it failed.

Eventually the errors stack up and Linux tries things like resetting the device in the hope that will fix it, but the resets don't help.

The first thing I'd do is try updating the firmware: https://wiki.gentoo.org/wiki/Samsung_SSD_Firmware

That may not work on a Pi as you can't run x86 binaries, so you may have to resort to the Samsung Magician Windows tool. As well as firmware updates, that supposedly has a health check feature. Maybe that will tell you what's going on.

I upgraded the firmware with Samsung Magician. Samsung Magician does not
have diagnostics for the EVO 850. All in all, it is pretty crap, for
anything apart from the firmware update.

If you've been doing a lot of security camera writes it's possible it's worn out the SSD.

SMART says 83% healthy, going by the Wear Levelling Count. The drive has
59 TB written, which is well under the 150 TBW warranty figure. These stats
were better than those of the system SSD in the Windows machine I put it into.

The warranty is also 5 years. Coincidentally, this problem seems to have occurred at almost exactly 5 years Power-on Hours, but I guess the
warranty was ownership time, not power on time.
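
The figures above can be put side by side with a little arithmetic. This is only a sketch: it assumes decimal terabytes and a drive powered continuously for the whole five years, and the helper names are made up for illustration.

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours, averaging in leap years


def warranty_fraction(written_tb, rated_tbw):
    """Fraction of the rated write endurance already used."""
    return written_tb / rated_tbw


def years_to_hours(years):
    """Power-on hours for a drive that is never switched off."""
    return years * HOURS_PER_YEAR


def avg_write_rate_mb_s(written_tb, hours):
    """Average write rate in MB/s over the given power-on hours."""
    return written_tb * 1e12 / (hours * 3600) / 1e6


print(f"{warranty_fraction(59, 150):.1%}")  # 39.3%
print(round(years_to_hours(5)))             # 43830
```

So roughly 39% of the rated writes have been used at about 43,830 power-on hours. The 83% Wear Levelling Count is a separate measure reported by the drive, so the two numbers need not agree.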

One suggestion to check health was to clone the drive, and look for
errors, but I'd already used rsync to copy the entire drive.

It all seems like more hard work than it should be. I'll see if I can
find a simple use for it where I care even less about reliability. Some
kind of cache or sync drive. Maybe I'll see if I can run 3 drives off
the same rpi4 :-).

--- SoupGate-Win32 v1.05
* Origin: Agency HUB, Dunedin - New Zealand | Fido<>Usenet Gateway (3:770/3)

From John Aldridge@3:770/3 to All on Tue Jul 18 17:21:06 2023

In article <u88tbp$26mf3$1@solani.org>, alien@comet.invalid says...

I had an 500 GB Samsung EVO 850, connected to an rPi4 via a USB to SATA adapter, shared via Samba. It had worked this way for years.

A couple of weeks ago it went offline, and wouldn't remount. I switched
off power and restarted, and it worked OK, for a bit, but it dismounted again, after about 24 hours. I changed the USB/SATA cable, but the
problem persisted.

I had a similar problem with an old Pi,
and it turned out to be the power supply module,
now replaced with original raspi supply, no more problems.

FWIW, I've seen behaviour like that too. In the case I saw, it was a USB
stick which would dismount every few minutes. And the fix, like yours,
was to replace a presumably dodgy PSU with an official RPi one.

John

--- SoupGate-Win32 v1.05
* Origin: Agency HUB, Dunedin - New Zealand | Fido<>Usenet Gateway (3:770/3)

From Pancho@3:770/3 to John Aldridge on Tue Jul 18 18:01:27 2023

On 18/07/2023 17:21, John Aldridge wrote:

In article <u88tbp$26mf3$1@solani.org>, alien@comet.invalid says...

I had an 500 GB Samsung EVO 850, connected to an rPi4 via a USB to SATA
adapter, shared via Samba. It had worked this way for years.

A couple of weeks ago it went offline, and wouldn't remount. I switched
off power and restarted, and it worked OK, for a bit, but it dismounted
again, after about 24 hours. I changed the USB/SATA cable, but the
problem persisted.

I had a similar problem with an old Pi,
and it turned out to be the power supply module,
now replaced with original raspi supply, no more problems.

FWIW, I've seen behaviour like that too. In the case I saw, it was a USB stick which would dismount every few minutes. And the fix, like yours,
was to replace a presumably dodgy PSU with an official RPi one.

John

I too have had problems with power and USB drives. One of my SATA
adapters is powered, though not the one for the drive that failed. I think
the drive also failed in the powered adapter, but I wasn't methodical
enough to be sure. I was getting uptight about losing some of my
essential services.

In this instance, I suspect it isn't the problem, given it has worked
for years, and I have a 20 amp USB power supply, no under voltage
entries in the log.
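
Besides looking for under-voltage entries in the log, the Pi firmware keeps its own record. A sketch, assuming `vcgencmd get_throttled` is available on the Pi and prints a line such as `throttled=0x0`; the bit meanings follow the Raspberry Pi documentation, and the helper name `decode_throttled` is hypothetical.

```python
# Bit positions in the value reported by `vcgencmd get_throttled`.
FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output):
    """Turn a line like 'throttled=0x50005' into a list of set flags."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [text for bit, text in sorted(FLAGS.items()) if value & (1 << bit)]
```

The "has occurred" bits stay set until the next reboot, so a result of `throttled=0x0` on a Pi with long uptime would support the view that power was not the cause.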

Anyway, I ummed and ahhed about it. I couldn't decide if the SSD is crap
or not. The new, working replacement SSD draws less power.

So earlier today, I ordered a USB/SATA dual docking station. I'm going
to put the old drive in there and just use it for CCTV, at a higher
refresh rate than before, because I no longer care about it.

We'll see what happens :-).

--- SoupGate-Win32 v1.05
* Origin: Agency HUB, Dunedin - New Zealand | Fido<>Usenet Gateway (3:770/3)
