I had a 500 GB Samsung EVO 850, connected to an rPi4 via a USB to SATA adapter, shared via Samba. It had worked this way for years.
A couple of weeks ago it went offline, and wouldn't remount. I switched
off power and restarted, and it worked OK for a bit, but it dismounted again after about 24 hours. I changed the USB/SATA cable, but the
problem persisted.
As this was my main server, which I needed to work, I bought a new SSD,
and copied everything across. Everything now works fine.
The thing is, I can't now see anything wrong with the problematic SSD.
fsck says it is OK. Smartctl says it is OK, but can't run a long test
(the long test always says "Aborted by host" with 90% remaining). The SSD now stays mounted on another machine, still USB.
My guess is there was something like a bad block, which caused the SSD
to dismount when it was accessed. Now that the SSD isn't being used
heavily, the problem just doesn't show up; it stays mounted. Previously,
it was being used for a security camera, so a fair bit of writing.
I know I should just bin the drive, but I'm curious, is there a better
way of testing it, finding a fault?
Pancho <Pancho.Jones@proton.me> wrote:
My guess is there was something like a bad block, which caused the SSD
to dismount when it was accessed.
That isn't a normal response in Linux. For ext file systems the
behaviour is:
errors={continue|remount-ro|panic}
Define the behavior when an error is encountered. (Either ignore
errors and just mark the filesystem erroneous and continue,
or remount the filesystem read-only, or panic and halt the system.)
The default is set in the filesystem superblock, and can
be changed using tune2fs(8).
(from EXT4(5))
For FAT filesystems it's:
errors={panic|continue|remount-ro}
Specify FAT behavior on critical errors: panic, continue without
doing anything, or remount the partition in read-only mode (default
behavior).
(from MOUNT(8))
So between continuing and halting the system, the only other option
should be remounting read-only, not unmounting the filesystem.
Perhaps there was some service unmounting it automatically because
it thought the drive had been disconnected? Anyway it shouldn't be
able to happen due to read/write errors.
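To tell "remounted read-only" apart from "actually gone", one option is to look at the mount table when the share next drops out. Below is a minimal Python sketch, not a definitive tool: it parses text in the /proc/mounts format and reports whether a mount point is read-only and which errors= option, if any, is listed. The function name, the mount point /mnt/ssd and the sample line are made up for illustration, and I believe ext4 may omit errors= from /proc/mounts when it matches the superblock default, so a missing value doesn't prove much.

```python
def mount_state(mounts_text, mount_point):
    """Return (is_read_only, errors_option) for mount_point, or None if it is not mounted."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        options = fields[3].split(",")
        errors = next(
            (o.split("=", 1)[1] for o in options if o.startswith("errors=")),
            None,
        )
        return ("ro" in options, errors)
    return None


# Illustrative line only (hypothetical device and mount point), in /proc/mounts format.
SAMPLE_MOUNTS = "/dev/sda1 /mnt/ssd ext4 ro,relatime,errors=remount-ro 0 0\n"

print(mount_state(SAMPLE_MOUNTS, "/mnt/ssd"))
print(mount_state(SAMPLE_MOUNTS, "/mnt/other"))
```

On the real machine you would pass in the contents of /proc/mounts instead of the sample string. A result of None would mean the filesystem really was unmounted, which points away from plain read/write errors.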
On a sunny day (Fri, 7 Jul 2023 11:45:24 +0100) it happened Pancho <Pancho.Jones@proton.me> wrote in <u88qc5$1a435$1@dont-email.me>:
I had an 500 GB Samsung EVO 850, connected to an rPi4 via a USB to SATA
adapter, shared via Samba. It had worked this way for years.
A couple of weeks ago it went offline, and wouldn't remount. I switched
off power and restarted, and it worked OK, for a bit, but it dismounted
again, after about 24 hours. I changed the USB/SATA cable, but the
problem persisted.
I had a similar problem with an old Pi,
and it turned out to be the power supply module,
now replaced with original raspi supply, no more problems.
What did / does
dmesg
show?
I sometimes see this on my Pi4 8GB:
[1484748.627699] hwmon hwmon1: Voltage normalised
[1484750.707706] hwmon hwmon1: Undervoltage detected!
[1484754.867040] hwmon hwmon1: Voltage normalised
that is with a 3.8 TB USB Toshiba harddisk on a Siecom USB hub,
so far no crashes... Runs on a UPS..
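If the kernel log is long, pairing up these messages by hand gets tedious. Here is a rough Python sketch that pulls undervoltage episodes out of dmesg-style text. The regular expression is only an assumption based on the three lines quoted above; other kernels may word the message differently, and the function name is my own.

```python
import re

# Pattern modelled on the hwmon lines quoted above (an assumption, not a kernel guarantee).
EVENT = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+hwmon \w+: (Undervoltage detected!|Voltage normalised)"
)


def undervoltage_episodes(log_text):
    """Return (start, end) kernel timestamps in seconds; end is None if never normalised."""
    episodes = []
    start = None
    for line in log_text.splitlines():
        match = EVENT.search(line)
        if not match:
            continue
        timestamp, message = float(match.group(1)), match.group(2)
        if message.startswith("Undervoltage"):
            if start is None:
                start = timestamp
        elif start is not None:
            episodes.append((start, timestamp))
            start = None
    if start is not None:
        episodes.append((start, None))
    return episodes


# The three lines from the post above.
SAMPLE_LOG = """\
[1484748.627699] hwmon hwmon1: Voltage normalised
[1484750.707706] hwmon hwmon1: Undervoltage detected!
[1484754.867040] hwmon hwmon1: Voltage normalised
"""

for begin, end in undervoltage_episodes(SAMPLE_LOG):
    print(begin, end)
```

Feeding it the saved output of dmesg (or a syslog file) and comparing the timestamps with the moments the disk dropped out would show whether the two coincide.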
Here is a bit of /var/log/syslog.1 that I think is relevant. The rsnapshot
line is good; I think the kernel line after it is where things go bad.
Jun 25 16:00:15 rpi4 rsnapshot[1261324]: /usr/bin/rsnapshot alpha: completed successfully
Jun 25 16:05:01 rpi4 CRON[1261432]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Jun 25 16:05:28 rpi4 kernel: [6195820.664216] sd 0:0:0:0: [sda] tag#20 uas_eh_abort_handler 0 uas-tag 4 inflight: CMD OUT
Pancho <Pancho.Jones@proton.me> wrote:
I know I should just bin the drive, but I'm curious, is there a better
way of testing it, finding a fault?
Badblocks
On 2023-07-07, crn@nospam.com <crn@nospam.com> wrote:
Pancho <Pancho.Jones@proton.me> wrote:
I know I should just bin the drive, but I'm curious, is there a better
way of testing it, finding a fault?
Badblocks
Don't SSDs reserve some spare "blocks" (whatever they are called) that
get used to replace faulty ones? Won't smartctl tell you if that has happened?
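For anyone wanting to check this, the attribute table printed by smartctl -A is the place to look. The Python sketch below parses that table and flags non-zero error counters. Treat it as a sketch under assumptions: the attribute names are the ones smartctl usually shows for Samsung SATA SSDs as far as I know, other drives name them differently, and the sample table is invented for the example rather than taken from the drive in this thread. Behind a USB adapter smartctl may also need -d sat to see the attributes at all.

```python
# Counters where any non-zero raw value deserves a closer look.
# Names are an assumption based on typical Samsung SATA SSD output.
ERROR_COUNTERS = (
    "Reallocated_Sector_Ct",
    "Used_Rsvd_Blk_Cnt_Tot",
    "Runtime_Bad_Block",
    "CRC_Error_Count",
)


def parse_attributes(smart_text):
    """Map attribute name -> (normalised value, raw value string)."""
    attrs = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[3].isdigit():
            attrs[fields[1]] = (int(fields[3]), fields[9])
    return attrs


def suspicious(attrs):
    """Return the watched error counters whose raw value is a non-zero integer."""
    hits = {}
    for name in ERROR_COUNTERS:
        raw = attrs.get(name, (None, ""))[1]
        if raw.isdigit() and int(raw) != 0:
            hits[name] = int(raw)
    return hits


# Invented sample in the usual "smartctl -A" layout, for illustration only.
SAMPLE_SMART = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
177 Wear_Leveling_Count     0x0013   091   091   000    Pre-fail  Always       -       180
199 CRC_Error_Count         0x003e   099   099   000    Old_age   Always       -       12
"""

attributes = parse_attributes(SAMPLE_SMART)
print(suspicious(attributes))
print(attributes.get("Wear_Leveling_Count"))
```

A rising reallocated or reserved-block count would suggest the flash itself is wearing out, whereas CRC errors with clean block counters would point more towards the cable, adapter or power.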
Pancho <Pancho.Jones@proton.me> wrote:
Jun 25 16:05:28 rpi4 kernel: [6195820.664216] sd 0:0:0:0: [sda] tag#20 uas_eh_abort_handler 0 uas-tag 4 inflight: CMD OUT
That looks sick. uas is USB Attached SCSI, so the above means the Linux kernel tried to issue a UAS command and for some reason it failed.
Eventually the errors stack up and Linux tries things like resetting the device in the hope that will fix it, but it doesn't.
The first thing I'd do is try updating the firmware: https://wiki.gentoo.org/wiki/Samsung_SSD_Firmware
That may not work on a Pi as you can't run x86 binaries, so you may have to resort to the Samsung Magician Windows tool. As well as firmware updates, that supposedly has a health check feature. Maybe that will tell you what's going on.
If you've been doing a lot of security camera writes, it's possible that has worn out the SSD.
In article <u88tbp$26mf3$1@solani.org>, alien@comet.invalid says...
I had a similar problem with an old Pi,
and it turned out to be the power supply module,
now replaced with original raspi supply, no more problems.
FWIW, I've seen behaviour like that too. In the case I saw, it was a USB stick which would dismount every few minutes. And the fix, like yours,
was to replace a presumably dodgy PSU with an official RPi one.
John