Forum: >>> Magnum BBS <<<

[gentoo-user] e2fsck -c when bad blocks are in existing file?

From Grant Edwards@21:1/5 to All on Tue Nov 8 04:40:01 2022

I've got an SSD that's failing, and I'd like to know what files
contain bad blocks so that I don't attempt to copy them to the
replacement disk.

According to e2fsck(8):

-c This option causes e2fsck to use badblocks(8) program to do a
read-only scan of the device in order to find any bad blocks. If
any bad blocks are found, they are added to the bad block inode
to prevent them from being allocated to a file or directory. If
this option is specified twice, then the bad block scan will be
done using a non-destructive read-write test.

What happens when the bad block is _already_allocated_ to a file?

--
Grant

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)
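
A minimal sketch, not something proposed in the thread, of one read-only way to approach the question above: read every file on the mounted filesystem end to end and note which ones the kernel refuses to return. On Linux an unreadable block normally surfaces as an I/O error, which Python raises as OSError. The name find_unreadable_files and the 1 MiB chunk size are illustrative assumptions. Files already held in the page cache may be served from memory, so a freshly mounted (ideally read-only) filesystem gives the most honest result.

```python
import os


def find_unreadable_files(root, chunk_size=1024 * 1024):
    """Return (path, errno) pairs for regular files under root that
    raise OSError when read from start to end."""
    unreadable = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only regular files are of interest; skip symlinks,
            # sockets, FIFOs and device nodes.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    # Read and discard the contents chunk by chunk.
                    while handle.read(chunk_size):
                        pass
            except OSError as err:
                unreadable.append((path, err.errno))
    return unreadable
```

Calling find_unreadable_files("/mnt/old") (a hypothetical mount point) would return the list of files to leave out of the copy. Note that permission errors also show up as OSError, so the recorded errno is worth checking.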

From Michael@21:1/5 to All on Tue Nov 8 13:20:40 2022

On Tuesday, 8 November 2022 03:31:07 GMT Grant Edwards wrote:

> I've got an SSD that's failing, and I'd like to know what files
> contain bad blocks so that I don't attempt to copy them to the
> replacement disk.
>
> According to e2fsck(8):
>
> -c This option causes e2fsck to use badblocks(8) program to do a
> read-only scan of the device in order to find any bad blocks. If
> any bad blocks are found, they are added to the bad block inode
> to prevent them from being allocated to a file or directory. If
> this option is specified twice, then the bad block scan will be
> done using a non-destructive read-write test.
>
> What happens when the bad block is _already_allocated_ to a file?
>
> --
> Grant

Whether the block was previously allocated to a file, and whether it has
since been reallocated or not, my understanding is that with spinning
disks the data in a bad block stays there unless you've dd'ed some zeros
over it. Even then, read or write operations could fail if the block is
too far gone.[1] Some data recovery applications will try to read data
off a bad block in different patterns to retrieve what's there. Once the
bad block is categorized as such, the filesystem won't write new data to
it again.

With SSDs the situation is less deterministic, because the disk's
internal wear-levelling firmware moves things around according to its
own algorithms and remaps bad blocks. This is all transparent to the
filesystem; the block addresses the fs sees are virtual anyway.
Bypassing the firmware controller to access individual cells on an SSD
requires specialist equipment and your own lab, although things may have
evolved since I last looked into this.

The general advice is to avoid powering down an SSD which is suspected
of corruption until all the data has been copied/recovered off it. If
you power it down, the data on it may never be accessible again without
the aforementioned lab.

BTW, running badblocks in read-write mode on an ailing/aged SSD may
exacerbate the problem without much benefit, by accelerating wear and
causing additional cells to fail. At the same time you would be relying
on the suspect disk's firmware to reach the data in some of its cells
through its virtual map. Data scrubbing (btrfs, zfs) and recent backups
would probably be a better strategy with SSDs.

[1] https://www.smartmontools.org/wiki/BadBlockHowto
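
The howto in [1] maps a failing sector to a file in two steps: convert the sector's LBA into a filesystem block number, then ask debugfs which inode owns that block (icheck) and which path that inode has (ncheck). A minimal sketch of the first step follows, assuming 512-byte logical sectors, a partition start taken from something like fdisk -lu, and a block size from tune2fs -l. The function name and the numbers in the usage note are illustrative.

```python
def lba_to_fs_block(lba, partition_start, fs_block_size, sector_size=512):
    """Convert an absolute LBA into a block number inside the filesystem
    that starts at sector partition_start.

    All arguments are assumptions supplied by the caller; sector_size is
    the logical sector size the LBA is counted in.
    """
    if lba < partition_start:
        raise ValueError("LBA lies before the start of the partition")
    # Byte offset inside the partition, truncated to a whole fs block.
    return (lba - partition_start) * sector_size // fs_block_size
```

For example, lba_to_fs_block(23487977, 63, 4096) gives 2935989, which is the number to hand to debugfs icheck. On a failing SSD the result only says which file the kernel could not read; it says nothing about where the data physically lives.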


From Wols Lists@21:1/5 to Michael on Tue Nov 8 19:30:01 2022

On 08/11/2022 13:20, Michael wrote:

> On Tuesday, 8 November 2022 03:31:07 GMT Grant Edwards wrote:
>
>> I've got an SSD that's failing, and I'd like to know what files
>> contain bad blocks so that I don't attempt to copy them to the
>> replacement disk.
>>
>> According to e2fsck(8):
>>
>> -c This option causes e2fsck to use badblocks(8) program to do
>> a read-only scan of the device in order to find any bad blocks. If any
>> bad blocks are found, they are added to the bad block inode to prevent
>> them from being allocated to a file or directory. If this option is
>> specified twice, then the bad block scan will be done using a
>> non-destructive read-write test.
>>
>> What happens when the bad block is _already_allocated_ to a file?
>>
>> --
>> Grant
>
> Previously allocated to a file and now re-allocated or not, my
> understanding is with spinning disks the data in a bad block stays
> there unless you've dd'ed some zeros over it. Even then read or write
> operations could fail if the block is too far gone.[1] Some data
> recovery applications will try to read data off a bad block in
> different patterns to retrieve what's there. Once the bad block is
> categorized as such it won't be used by the filesystem to write new
> data to it again.
>
> With SSDs the situation is less deterministic, because the disk's
> internal wear levelling firmware moves things around according to its
> algorithms to remap bad blocks. This is all transparent to the
> filesystem, block addresses sent to the fs are virtual anyway.
> Bypassing the firmware controller to access individual cells on an SSD
> requires specialist equipment and your own lab, although things may
> have evolved since I last looked into this.

Which is actually pretty much exactly the same as what happens with
spinning rust.

The primary aim of a hard drive - SSD or spinning rust - is to save the
user's data. If the drive can't read the data it will do nothing save
returning a read error. Think about it - any other action will simply
make matters worse, namely the drive is actively destroying
possibly-salvageable data.

All being well, the user has raid or backups, and will be able to
re-write the file, at which point the drive will attempt recovery, as it
now has KNOWN GOOD data. If the write fails, the block will then be
added to the *drive internal* badblock list, and will be remapped
elsewhere.

MODERN DRIVES SHOULD NEVER HAVE AN OS-LEVEL BADBLOCKS LIST. If they do,
something is seriously wrong, because the drive should be hiding it from
the OS.

> The general advice is to avoid powering down an SSD which is suspected
> of corruption, until all the data is copied/recovered off it first. If
> you power it down, data on it may never be accessible again without
> the aforementioned lab.

Seriously, this is EXTREMELY GOOD advice. I don't know whether it is
still true, but there have been plenty of stories in the past about
SSDs that, once they get too many errors, self-destruct on power-down!!!

This imho is a serious design fault - you can't recover data from an SSD
that won't boot - but the fact is it appears to be a deliberate decision
by the manufacturers.

> BTW, running badblocks in read-write mode on an ailing/aged SSD may
> exacerbate the problem without much benefit by accelerating wear and
> causing additional cells to fail. At the same time you could be
> relying on the suspect disk firmware to access via its virtual map the
> data on some of its cells. Data scrubbing (btrfs, zfs) and recent
> backups would probably be a better strategy with SSDs.

Yup. If you suspect badblocks have damaged your data, you need backups
or raid. And then don't worry about it - apart from making sure your
drives look healthy and replacing any that are dodgy.

Just make sure you interpret smartmontools data correctly - perfectly
healthy drives can drop dead for no apparent reason, and drives that
look at death's door will carry on for ever. In particular, read errors
aren't serious unless they are accompanied by a growing number of
reallocated sectors. If the reallocation count jumps, watch it. If it
doesn't move while you're watching, it was probably a glitch and the
drive is okay. But use your head and be sensible. Any sign of regular
failed writes, BIN THE DRIVE.
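
The rule of thumb above can be written down as a small classifier. This is only a sketch under assumptions: the samples are raw values of a reallocation counter (on many drives the Reallocated_Sector_Ct attribute in smartctl -A output, though attribute names vary by vendor), collected oldest first at whatever interval you choose. The function name and the three labels are made up for illustration.

```python
def classify_reallocations(samples):
    """Classify a series of reallocated-sector counts, oldest first.

    Returns "unknown" with fewer than two samples, "growing" if the
    count rose between the last two samples, "settled" if it rose
    earlier but has since stopped, and "stable" if it never moved.
    """
    if len(samples) < 2:
        return "unknown"
    if samples[-1] > samples[-2]:
        return "growing"   # still climbing: treat the drive as suspect
    if samples[-1] > samples[0]:
        return "settled"   # jumped once, then held: keep watching
    return "stable"        # no movement over the observation window
```

With this, [0, 8, 8, 8] comes out as "settled" (a jump that stopped) while [0, 8, 16] comes out as "growing". It deliberately ignores failed writes, which the advice above treats as a reason to replace the drive regardless.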

(I think my 8TB drive says 1 read error per less-than-two end-to-end
scans is well within spec...)
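
The "less-than-two end-to-end scans" figure can be checked with a little arithmetic. The sketch below assumes a datasheet rating of one unrecoverable read error per 10^14 bits read, which is a common consumer-drive figure but is not stated in this thread, and decimal terabytes. The function name is illustrative.

```python
def full_reads_per_expected_error(capacity_bytes, bits_per_error=10**14):
    """Number of complete end-to-end reads of the drive that add up to
    bits_per_error bits, i.e. one expected unrecoverable read error
    under the assumed datasheet rating."""
    bytes_per_error = bits_per_error / 8
    return bytes_per_error / capacity_bytes


# An 8 TB drive: 10^14 bits is 12.5 TB, so about one and a half scans.
print(full_reads_per_expected_error(8 * 10**12))  # 1.5625
```

Under that assumption, one read error every 1.56 full scans of an 8 TB drive is indeed within spec.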

Cheers,
Wol


From Michael@21:1/5 to All on Wed Nov 9 08:46:43 2022

On Tuesday, 8 November 2022 18:24:41 GMT Wols Lists wrote:

> MODERN DRIVES SHOULD NEVER HAVE AN OS-LEVEL BADBLOCKS LIST. If they
> do, something is seriously wrong, because the drive should be hiding
> it from the OS.

If you run badblocks or e2fsck you'll find the application asks to write
data to the disk at the end of the run. Yes, the drive's firmware should
manage badblocks transparently to the filesystem, but I have observed in
hdparm output that reallocations of badblocks do not happen in real
time. Perhaps the filesystem-level badblocks list, which is LBA based,
acts as an intermediate step until the hardware triggers a reallocation?
Not sure. :-/



From Michael@21:1/5 to All on Sat Nov 12 13:38:36 2022

On Wednesday, 9 November 2022 16:53:13 GMT Laurence Perkins wrote:

> -----Original Message-----
> From: Michael <confabulate@kintzios.com>
> Sent: Wednesday, November 9, 2022 12:47 AM
> To: gentoo-user@lists.gentoo.org
> Subject: Re: [gentoo-user] e2fsck -c when bad blocks are in existing file?
>
>> On Tuesday, 8 November 2022 18:24:41 GMT Wols Lists wrote:
>>
>>> MODERN DRIVES SHOULD NEVER HAVE AN OS-LEVEL BADBLOCKS LIST. If they
>>> do, something is seriously wrong, because the drive should be hiding
>>> it from the OS.
>>
>> If you run badblocks or e2fsck you'll find the application asks to
>> write data to the disk, at the end of the run. Yes, the drive's
>> firmware should manage badblocks transparently to the filesystem, but
>> I have observed in hdparm output reallocations of badblocks do not
>> happen in real time. Perhaps the filesystem level badblocks list which
>> is LBA based, acts as an intermediate step until the hardware triggers
>> a reallocation? Not sure. :-/
>
> Badblocks doesn't ask to write anything at the end of the run. You
> tell it whether you want a read test, a write-read test or a
> read-write-read-replace test at the beginning.

Not to labour the point, but 'e2fsck -v -c' runs a read test and at the
end it informs me "... Updating bad block inode", even if it came across
no read errors (0/0/0) and consequently does not prompt for a fs repair.


