• Re: ext4 FS Crash

    From Jochen Spieker@21:1/5 to All on Wed Dec 4 13:50:01 2024
    Daniel Harris:

> Not sure if I can attach a picture of the error messages, but some of the
> errors are:
> ext4_find_entry
> ext4_journal_check_start
> ext4_setattr
> mounting filesystem read-only

    Please do not send any attachments here. I suggest you just copy and
    paste the log messages from journalctl or wherever you can find them. It
    is really hard to say anything specific without those messages.

> I have done a file system check and everything seems OK, the drive
> doesn't have any error messages, and journalctl doesn't seem to log any
> errors (not that I can find, anyway).

    Have you tried 'journalctl --dmesg'?
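For example, something like the following should show only kernel messages at error priority and above; the second form assumes persistent journaling is enabled (Storage=persistent in journald.conf), otherwise there is no previous boot to query:

journalctl --dmesg --priority=err             # kernel errors, current boot
journalctl --dmesg --boot=-1 --priority=err   # same, but for the previous boot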

> PS: my system is a desktop system, not a laptop, if that makes a difference.

    I do not think this matters here.

    J.
    --
    I wish I had been aware enough to enjoy my time as a toddler.
    [Agree] [Disagree]
    <http://archive.slowlydownward.com/NODATA/data_enter2.html>

  • From Daniel Harris@21:1/5 to All on Wed Dec 4 13:30:01 2024
    Hello

I have been using the stable branch but recently it has not been so
stable. I have experienced some unexpected behavior. I'm not sure if it's
related to this Ubuntu bug ( https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1805816 )

Seeing the similarity, especially that we are both using similar drives
(mine is a Samsung SSD 980 PRO 1TB), makes me think it might be a hardware rather than a software issue.

Anyway, it's interesting that it has only started recently. A different kernel, possibly?

Not sure if I can attach a picture of the error messages, but some of the
errors are:
    ext4_find_entry
    ext4_journal_check_start
    ext4_setattr
    mounting filesystem read-only

I have done a file system check and everything seems OK, the drive
doesn't have any error messages, and journalctl doesn't seem to log any
errors (not that I can find, anyway).

I will have to disable APST, but I thought this bug might have been fixed since 2018; maybe not.

PS: my system is a desktop system, not a laptop, if that makes a difference.

    Thanks Dan

    <div dir="ltr"><div>Hello</div><div><br></div><div>I have been using the stable branch but recently it has not been so stable.  I have experienced some unexpected behavior Not sure if its related to this ubuntu bug ( <a href="https://bugs.launchpad.net/
    ubuntu/+source/linux/+bug/1805816">https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1805816</a> )</div><div><br></div><div>Seeing the similarity especially that we are both using similar drives (mine i( Samsung SSD 980 PRO 1TB)  makes me think it
    might be a hardware instead of a software issue.</div><div><br></div><div>Anyway its interesting that it has only started recently.  Different kernel possibly?</div><div><br></div><div>Not sure if I can attach a picture of the error messages but some of
    the errors are <br></div><div>ext4_find_entry</div><div>ext4_journal_check_start</div><div>ext4_setattr</div><div>mounting filesystem read-only</div><div><br></div><div>I have done a file system check and everything seems ok and the drive doesn&#39;t
    have any error messages and journalctl doesn&#39;t seem to log any errors (not that i can find anyway).</div><div><br></div><div>I will have to disable the apst but i just thought this bug might have been fixed from 2018 but maybe not.</div><div><br></
    <div>ps my system is a desktop system not a laptop if that makes a difference.</div><div><br></div><div>Thanks Dan<br></div></div>

  • From Eike Lantzsch ZP5CGE / KY4PZ@21:1/5 to All on Wed Dec 4 15:40:02 2024
    On Wednesday, 4 December 2024 09:29:17 -03 Daniel Harris wrote:
> Hello
>
> I have been using the stable branch but recently it has not been so
> stable. I have experienced some unexpected behavior. I'm not sure if it's
> related to this Ubuntu bug ( https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1805816 )
>
> Seeing the similarity, especially that we are both using similar drives
> (mine is a Samsung SSD 980 PRO 1TB), makes me think it might be a
> hardware rather than a software issue.
>
> Anyway, it's interesting that it has only started recently. A different
> kernel, possibly?
>
> Not sure if I can attach a picture of the error messages, but some of
> the errors are:
> ext4_find_entry
> ext4_journal_check_start
> ext4_setattr
> mounting filesystem read-only
>
> I have done a file system check and everything seems OK, the drive
> doesn't have any error messages, and journalctl doesn't seem to log any
> errors (not that I can find, anyway).
>
> I will have to disable APST, but I thought this bug might have
> been fixed since 2018; maybe not.

Kernel 4.19? I'd be very surprised if that were related.

> PS: my system is a desktop system, not a laptop, if that makes a
> difference.
>
> Thanks Dan

Hello Dan,

I am using the exact same drive, a Samsung SSD 980 PRO 1TB, in my desktop
and in my laptop:
Desktop with Debian Sid
Laptop with Ubuntu 24 LTS (TUXEDO)
Neither one nor the other has experienced ext4 FS crashes.

I'd save my data a.s.a.p. and install a new NVMe drive if I were in your
shoes.

    have a nice day
    --
    Eike Lantzsch KY4PZ / ZP5CGE

  • From Klaus Singvogel@21:1/5 to Daniel Harris on Wed Dec 4 17:50:02 2024
    Daniel Harris wrote:
> Seeing the similarity, especially that we are both using similar drives
> (mine is a Samsung SSD 980 PRO 1TB), makes me think it might be a hardware rather than a software issue.

    The referenced Samsung SSD 980 PRO has a critical firmware bug.

This bug hit me too. I had to replace my broken SSD with a new one via Samsung support.
I use full-disk encryption, and therefore lost data back to my last backup (five days ago).

    Some more details here: https://www.pugetsystems.com/support/guides/critical-samsung-ssd-firmware-update/
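If anyone wants to check their own drive before worrying: a quick way to read the installed firmware revision (the affected versions are listed in the article above) is one of:

sudo nvme list                # FW Rev column
sudo smartctl -i /dev/nvme0   # "Firmware Version" line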


    Best regards,
    Klaus.
    --
    Klaus Singvogel
    GnuPG-Key-ID: 1024R/5068792D 1994-06-27

  • From Daniel Harris@21:1/5 to All on Wed Dec 4 18:20:01 2024
Thanks for all your replies.
As far as I can tell there are no errors reported using fsck, smartctl, or nvme,
and the firmware is the correct and newest version, so no problems there.

The following are the messages that appear, but only taken from my phone and copied from the photo (lots of scrolling errors repeating over).
I thought these new drives were supposed to last longer than older moving
HDDs, but obviously not.

I guess it's time to buy a new drive :(

EXT4-fs (nvme0n1p2): Remounting filesystem read-only
EXT4-fs error (device nvme0n1p2): ext4_journal_check_start:83: comm systemd-journal: Detected aborted journal
[633582.907324] EXT4-fs error (device nvme0n1p2): __ext4_find_entry:1683: inode #5898248: comm ntpd: reading directory lblock 0
[633582.908250] EXT4-fs (nvme0n1p2): Remounting filesystem read-only
EXT4-fs (nvme0n1p2): Remounting filesystem read-only

EXT4-fs error (device nvme0n1p2): ext4_journal_check_start:83: comm systemd-journal: Detected aborted journal
[633582.912099] EXT4-fs (nvme0n1p2): Remounting filesystem read-only
EXT4-fs error (device nvme0n1p2): ext4_journal_check_start:83: comm systemd-journal: Detected aborted journal
[633582.916126] EXT4-fs (nvme0n1p2): Remounting filesystem read-only

[633583.797550] EXT4-fs error (device nvme0n1p2) in ext4_setattr:5628: Journal has aborted
[633583.798466] EXT4-fs (nvme0n1p2): Remounting filesystem read-only
EXT4-fs error (device nvme0n1p2): ext4_journal_check_start:83: comm systemd-journal: Detected aborted journal
EXT4-fs (nvme0n1p2): Remounting filesystem read-only

EXT4-fs error (device nvme0n1p2): __ext4_find_entry:1683: inode #1966081: comm cron: reading directory lblock 0
EXT4-fs (nvme0n1p2): Remounting filesystem read-only
EXT4-fs error (device nvme0n1p2) in ext4_setattr:5628: Journal has aborted
EXT4-fs (nvme0n1p2): Remounting filesystem read-only



On Wed, Dec 4, 2024 at 4:39 PM Klaus Singvogel <deb-user-ml@singvogel.net> wrote:

> Daniel Harris wrote:
> > Seeing the similarity, especially that we are both using similar drives
> > (mine is a Samsung SSD 980 PRO 1TB), makes me think it might be a hardware
> > rather than a software issue.
>
> The referenced Samsung SSD 980 PRO has a critical firmware bug.
>
> This bug hit me too. I had to replace my broken SSD with a new one via Samsung support.
> I use full-disk encryption, and therefore lost data back to my last backup (five days ago).
>
> Some more details here:
> https://www.pugetsystems.com/support/guides/critical-samsung-ssd-firmware-update/
>
> Best regards,
>         Klaus.
> --
> Klaus Singvogel
> GnuPG-Key-ID: 1024R/5068792D 1994-06-27


    <div dir="ltr"><div>Thanks for all your replies.</div><div>As far as I can tell there are no errors reported using fsck or smartctl or nvme<br></div><div> and the firmware is the correct and newest version so no problems there.</div><div><br></div><div>
    The following are the messages that appear but only taken from my phone and copied from the photo (lots of scrolling errors repeating over).</div><div>I thought these new drives were supposed to last longer than older moving HDD but obviously not <br></
    <div><br></div><div>I guess its time to buy a new drive : (<br></div><div><br></div><div>EXT4-fs  (nvme0n1p2): Remounting filesystem read-onty<br>EXT4-fs  error (device nvme01p2): ext4_journal_check_start:83: comm systemd-journal: Detected aborted
    Jounal<br>[633582.907324] EXT4-fs error (device nvme0n1p2): _ext4_find_entry:1683: inode #5898248: comm ntpd: reading directory iblock 0<br>[633582 908250] EXT4-fs (nvme0n1p2) : Remounting filesystem read-only<br>EXT4-fs (nvme0n1p2): Remounting
    filesystem read-only<br><br><br>EXT4-fs error (device nvme0n1p2) : ext4_journal.check_start:83: comm systemd-journal: Detected aborted journal<br>[633582.912099] EXT4-fs (nvme0n1p2): Remounting filesystem read-only<br>EXT4-fs error (device nvme0n1p2):
    ext4_journal-check_start:83: comm systemd-journal: Detected aborted journal<br>[633582.916126] EXT4-fs (nvme0n1p2): Remounting filesystem read-only<br><br>(633583.797550) EXT4-fs error (device nvme0n1p2) in ext4_setattr:5628: —— Journal has aborted<
    (633583. 798466) EXT4-fs (nvme0n1p2): Remounting filesystem read-only<br>EXT4-fs  error (device nvme01p2): ext4_journal_check_start:83: comm systemd-journal: Detected aborted Jounal<br>EXT4-fs (nvme@nip2): Remounting filesystem read-only<br><br><br>
    EXT4-fs error (device nvme0n1p2): _ext4_find_entry:1683: inode #1966081: comm cron: reading directory Iblock 0<br>EXT4-fs (nvme@nip2): Remounting filesystem read-only<br>EXT4-fs error (device nvme0n1p2) in ext4_setattr:5628: —— Journal has aborted<br>
    EXT4-fs (nvme@nip2): Remounting filesystem read-only<br><br><br></div></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Wed, Dec 4, 2024 at 4:39 PM Klaus Singvogel &lt;<a href="mailto:deb-user-ml@singvogel.
    net">deb-user-ml@singvogel.net</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Daniel Harris wrote:<br>
    &gt; Seeing the similarity especially that we are both using similar drives<br> &gt; (mine i( Samsung SSD 980 PRO 1TB)  makes me think it might be a hardware<br>
    &gt; instead of a software issue.<br>

    The referenced Samsung SSD 980 PRO has a critical firmware bug.<br>

    This bug hit me too. I had to replace my broken SSD by a new one, via the Samsung support.<br>
    I use a full-disk encryption, and therefore had a data loss - up to my last backup (5 days ago).<br>

    Some more details here:<br>
    <a href="https://www.pugetsystems.com/support/guides/critical-samsung-ssd-firmware-update/" rel="noreferrer" target="_blank">https://www.pugetsystems.com/support/guides/critical-samsung-ssd-firmware-update/</a><br>


    Best regards,<br>
            Klaus.<br>
    -- <br>
    Klaus Singvogel<br>
    GnuPG-Key-ID: 1024R/5068792D  1994-06-27<br>
    </blockquote></div>

  • From Andrew M.A. Cater@21:1/5 to All on Wed Dec 4 19:20:01 2024
    On Wed, Dec 04, 2024 at 11:34:41AM -0300, Eike Lantzsch ZP5CGE / KY4PZ wrote:
> On Wednesday, 4 December 2024 09:29:17 -03 Daniel Harris wrote:
> > Hello
> >
> > I have been using the stable branch but recently it has not been so
> > stable. I have experienced some unexpected behavior. I'm not sure if it's
> > related to this Ubuntu bug ( https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1805816 )
> >
> > I have done a file system check and everything seems OK, the drive
> > doesn't have any error messages, and journalctl doesn't seem to log any
> > errors (not that I can find, anyway).
> >
> > I will have to disable APST, but I thought this bug might have
> > been fixed since 2018; maybe not.
>
> Kernel 4.19? I'd be very surprised if that were related.

Agreed: that's the Buster (Debian 10) kernel, not the current kernel.

> > Thanks Dan
>
> I'd save my data a.s.a.p. and install a new NVMe drive if I were in your shoes.
>
> have a nice day
> --
> Eike Lantzsch KY4PZ / ZP5CGE


    All best, as ever,

    Andy Cater
    (amacater@debian.org)



  • From Michael Stone@21:1/5 to Daniel Harris on Thu Dec 5 00:50:02 2024
    On Wed, Dec 04, 2024 at 05:11:47PM +0000, Daniel Harris wrote:
> Thanks for all your replies.
> As far as I can tell there are no errors reported using fsck, smartctl, or
> nvme, and the firmware is the correct and newest version, so no problems there.
>
> The following are the messages that appear, but only taken from my phone and
> copied from the photo (lots of scrolling errors repeating over).
> I thought these new drives were supposed to last longer than older moving
> HDDs, but obviously not.

    Is this during boot? The messages indicate a corrupted journal, which
    generally means a device error, or maybe a device which lost power while writing. It should be possible to mount read-only without replaying the
    journal for recovery purposes, but it's basically unfixable.
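A sketch of that recovery mount, assuming the filesystem is on nvme0n1p2 as in your logs (ext4's "noload" option skips journal replay):

mount -o ro,noload /dev/nvme0n1p2 /mnt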

> I guess it's time to buy a new drive :(

    Did you try "nvme smart-log /dev/nvme0" to look for issues?

  • From Klaus Singvogel@21:1/5 to Jeffrey Walton on Thu Dec 5 08:30:01 2024
Jeffrey Walton wrote:
> On Wed, Dec 4, 2024 at 2:47 PM Klaus Singvogel wrote:
> > Some more details here: https://www.pugetsystems.com/support/guides/critical-samsung-ssd-firmware-update/
>
> That's interesting (in a morbid sort of way).
>
> Do you know if fwupdmgr will detect out-of-date firmware on the drives?

    I'm sorry, but I don't know. I only became aware of fwupdmgr afterwards.

    At least the replacement Samsung SSD was detected by fwupdmgr on my last run.
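For anyone who wants to check their own system: as far as I know, the usual sequence is

fwupdmgr refresh       # fetch current firmware metadata from LVFS
fwupdmgr get-updates   # list devices with pending firmware updates

but whether a given drive shows up depends on the vendor publishing its firmware to LVFS.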

    Best regards,
    Klaus.
    --
    Klaus Singvogel
    GnuPG-Key-ID: 1024R/5068792D 1994-06-27

  • From =?UTF-8?Q?J=C3=B6rg-Volker_Peetz?=@21:1/5 to All on Thu Dec 5 10:30:01 2024
There are two things which could be tried with the SSD:

1. SSDs have some self-healing capabilities (discarding defective sectors) which are
performed when the drive is not mounted. Therefore, enter the BIOS of the computer and let it run for about an hour. Then restart the computer.

2. After making a backup, do a "secure erase" of the SSD. Of course, that requires
reformatting the drive and rebuilding the system. I was able to revive a Samsung
960 Pro this way (a sketch of the command follows below).
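For an NVMe drive, the secure erase can be issued with nvme-cli. A minimal sketch; double-check the device name first, since this destroys all data on the drive:

sudo nvme format /dev/nvme0n1 --ses=1   # ses=1 requests a user-data erase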

    Regards,
    Jörg.

  • From gene heskett@21:1/5 to Klaus Singvogel on Thu Dec 5 11:10:01 2024
    On 12/5/24 02:23, Klaus Singvogel wrote:
> Jeffrey Walton wrote:
> > On Wed, Dec 4, 2024 at 2:47 PM Klaus Singvogel wrote:
> > > Some more details here:
> > > https://www.pugetsystems.com/support/guides/critical-samsung-ssd-firmware-update/
> >
> > That's interesting (in a morbid sort of way).
> >
> > Do you know if fwupdmgr will detect out-of-date firmware on the drives?
>
> I'm sorry, but I don't know. I only became aware of fwupdmgr afterwards.
>
> At least the replacement Samsung SSD was detected by fwupdmgr on my last run.
>
> Best regards,
>         Klaus.
Interesting comments here. I've been using SSDs since 40G was the
biggest. The 256G spinning rust, now 15 years old, is the only spinning
rust left here. And I've a drawer full of Samsung 860/870 series drives
that have all gone wonky but not RO yet. I now have a mixture of stuff
from Taiwan in 2T and 4T sizes, all healthy. I guess I was not the only
one that got questionable drives from Samsung. This is the first time
I've seen them discussed in this context. Thank you for saying something
out loud.

    Cheers, Gene Heskett, CET.
    --
    "There are four boxes to be used in defense of liberty:
    soap, ballot, jury, and ammo. Please use in that order."
    -Ed Howdershelt (Author, 1940)
    If we desire respect for the law, we must first make the law respectable.
    - Louis D. Brandeis

  • From Daniel Harris@21:1/5 to mstone@debian.org on Thu Dec 5 12:00:01 2024
    On Wed, Dec 4, 2024 at 11:43 PM Michael Stone <mstone@debian.org> wrote:

> On Wed, Dec 04, 2024 at 05:11:47PM +0000, Daniel Harris wrote:
> > Thanks for all your replies.
> > As far as I can tell there are no errors reported using fsck, smartctl,
> > or nvme, and the firmware is the correct and newest version, so no
> > problems there.
> >
> > The following are the messages that appear, but only taken from my phone
> > and copied from the photo (lots of scrolling errors repeating over).
> > I thought these new drives were supposed to last longer than older
> > moving HDDs, but obviously not.
>
> Is this during boot? The messages indicate a corrupted journal, which
> generally means a device error, or maybe a device which lost power while
> writing. It should be possible to mount read-only without replaying the
> journal for recovery purposes, but it's basically unfixable.


So it's not actually a crash. On the two occasions it has happened, I have
been away from my computer for a while, and when I return and move the
mouse, I can see messages scrolling on a black screen (no X running). I
can move to a new VT but I cannot log in. When I try to log in I just get
the errors repeating on the screen. After I do a hard reset, everything
works perfectly. No errors anywhere.


> > I guess it's time to buy a new drive :(
>
> Did you try "nvme smart-log /dev/nvme0" to look for issues?


Seems normal to me:

    Smart Log for NVME device:nvme0 namespace-id:ffffffff
    critical_warning : 0
    temperature : 31°C (304 Kelvin)
    available_spare : 100%
    available_spare_threshold : 10%
    percentage_used : 0%
    endurance group critical warning summary: 0
Data Units Read : 807,634 (413.51 GB)
Data Units Written : 5,680,746 (2.91 TB)
host_read_commands : 6,573,734
host_write_commands : 75,990,191
    controller_busy_time : 1,145
    power_cycles : 618
    power_on_hours : 197
    unsafe_shutdowns : 21
    media_errors : 0
    num_err_log_entries : 0
    Warning Temperature Time : 0
    Critical Composite Temperature Time : 0
    Temperature Sensor 1 : 31°C (304 Kelvin)
    Temperature Sensor 2 : 38°C (311 Kelvin)
    Thermal Management T1 Trans Count : 0
    Thermal Management T2 Trans Count : 0
    Thermal Management T1 Total Time : 0
    Thermal Management T2 Total Time : 0


    Thanks Dan

    <div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Wed, Dec 4, 2024 at 11:43 PM Michael Stone &lt;<a href="mailto:mstone@debian.org">mstone@
    debian.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Wed, Dec 04, 2024 at 05:11:47PM +0000, Daniel Harris wrote:<br>
    &gt;Thanks for all your replies.<br>
    &gt;As far as I can tell there are no errors reported using fsck or smartctl or<br>
    &gt;nvme<br>
    &gt; and the firmware is the correct and newest version so no problems there.<br>
    &gt;<br>
    &gt;The following are the messages that appear but only taken from my phone and<br>
    &gt;copied from the photo (lots of scrolling errors repeating over).<br>
    &gt;I thought these new drives were supposed to last longer than older moving HDD<br>
    &gt;but obviously not<br>

    Is this during boot? The messages indicate a corrupted journal, which <br> generally means a device error, or maybe a device which lost power while <br> writing. It should be possible to mount read-only without replaying the <br> journal for recovery purposes, but it&#39;s basically unfixable.<br></blockquote><div><br></div><div>So its not actually a crash.  On the 2 occasions it has happened, I have been away from my computer for a while, and when I return and move the mouse, I
    can see messages scrolling on a black screen (no X running).  I can move to a new vt but I cannot log in.  When I try to log in I just get the errors repeating on the screen.  After I do a hard reset everything works perfectly. No errors anywhere.</
    <div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    &gt;I guess its time to buy a new drive : (<br>

    Did you try &quot;nvme smart-log /dev/nvme0&quot; to look for issues?<br> <br></blockquote><div><br></div><div>seems normal to me<br></div><div><br></div><div> Smart Log for NVME device:nvme0 namespace-id:ffffffff<br>critical_warning                        : 0<br>temperature                           Â
      : 31°C (304 Kelvin)<br>available_spare                         : 100%<br>available_spare_threshold               : 10%<br>percentage_used                         : 0%<br>endurance group critical warning summary: 0<br>
    Data Units Read                         : 807,634 (413.51 GB)<br>Data Units Written                      : 5,680,746 (2.91 TB)<br>host_read_commands                      : 6,573,734<br>host_write_commands           Â
              : 75,990,191<br>controller_busy_time                    : 1,145<br>power_cycles                            : 618<br>power_on_hours                          : 197<br>unsafe_shutdowns               Â
             : 21<br>media_errors                            : 0<br>num_err_log_entries                     : 0<br>Warning Temperature Time                : 0<br>Critical Composite Temperature Time     : 0<br>Temperature
    Sensor 1           : 31°C (304 Kelvin)<br>Temperature Sensor 2           : 38°C (311 Kelvin)<br>Thermal Management T1 Trans Count       : 0<br>Thermal Management T2 Trans Count       : 0<br>Thermal Management T1 Total Time        :
    0<br>Thermal Management T2 Total Time        : 0</div><div><br></div><div><br></div><div>Thanks Dan<br></div></div></div></div></div>

  • From Klaus Singvogel@21:1/5 to gene heskett on Thu Dec 5 13:10:01 2024
    Hi Gene,

gene heskett wrote:
> Interesting comments here. I've been using SSDs since 40G was the biggest.
> The 256G spinning rust, now 15 years old, is the only spinning rust left
> here. And I've a drawer full of Samsung 860/870 series drives that have all
> gone wonky but not RO yet. I now have a mixture of stuff from Taiwan in 2T
> and 4T sizes, all healthy. I guess I was not the only one that got
> questionable drives from Samsung. This is the first time I've seen them
> discussed in this context. Thank you for saying something out loud.

To point this out: it was only exactly this model from Samsung, the SSD 980 PRO, that wasn't working properly.
Repeat: only the Samsung SSD 980 PRO.

It can be fixed by a firmware upgrade, and more recent batches of the Samsung SSD 980 PRO are flashed/sold with good firmware out of the box.

    Best regards,
    Klaus.
    --
    Klaus Singvogel
    GnuPG-Key-ID: 1024R/5068792D 1994-06-27

  • From Michael Stone@21:1/5 to Daniel Harris on Thu Dec 5 13:40:01 2024
    On Thu, Dec 05, 2024 at 10:53:54AM +0000, Daniel Harris wrote:
> So it's not actually a crash. On the two occasions it has happened, I have
> been away from my computer for a while, and when I return and move the
> mouse, I can see messages scrolling on a black screen (no X running). I can
> move to a new VT but I cannot log in. When I try to log in I just get the
> errors repeating on the screen. After I do a hard reset, everything works
> perfectly. No errors anywhere.

    Have you tried a memory test? Those symptoms and the smart output make
    me think the problem is in hardware other than the drive itself. Memory
    is the easiest to check and the easiest to remedy.
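A boot-time memtest86+ run covers the most memory, but as a quick first check from a running system something like memtester also works:

sudo memtester 2048M 3   # lock and test 2 GiB of RAM, three passes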

  • From gene heskett@21:1/5 to Klaus Singvogel on Thu Dec 5 13:40:01 2024
    On 12/5/24 06:59, Klaus Singvogel wrote:
> Hi Gene,
>
> gene heskett wrote:
> > Interesting comments here. I've been using SSDs since 40G was the biggest.
> > The 256G spinning rust, now 15 years old, is the only spinning rust left
> > here. And I've a drawer full of Samsung 860/870 series drives that have all
> > gone wonky but not RO yet. I now have a mixture of stuff from Taiwan in 2T
> > and 4T sizes, all healthy. I guess I was not the only one that got
> > questionable drives from Samsung. This is the first time I've seen them
> > discussed in this context. Thank you for saying something out loud.
>
> To point this out: it was only exactly this model from Samsung, the SSD 980
> PRO, that wasn't working properly.
> Repeat: only the Samsung SSD 980 PRO.
>
> It can be fixed by a firmware upgrade, and more recent batches of the Samsung
> SSD 980 PRO are flashed/sold with good firmware out of the box.

While I am saying that, my results with earlier Samsungs have been less
than glorious; triple-layer NANDs turning into half capacity, for instance.

> Best regards,
>         Klaus.


    Cheers, Gene Heskett, CET.
    --
    "There are four boxes to be used in defense of liberty:
    soap, ballot, jury, and ammo. Please use in that order."
    -Ed Howdershelt (Author, 1940)
    If we desire respect for the law, we must first make the law respectable.
    - Louis D. Brandeis

  • From Michael Stone@21:1/5 to gene heskett on Thu Dec 5 14:00:01 2024
    On Thu, Dec 05, 2024 at 07:32:03AM -0500, gene heskett wrote:
> While I am saying that, my results with earlier Samsungs have been less
> than glorious; triple-layer NANDs turning into half capacity, for
> instance.

There's simply no real value in looking at historic bad models as a
guide to future performance (or the opposite). I can remember entire
lines of hard drives from reputable manufacturers which were plagued by
premature failures, to the point that I replaced some multiple times
under warranty before pulling them all (e.g., the IBM Deskstar 75GXP). I
can also remember SSDs which had problems with repeated file corruption
(OCZ Vertex, the only SSDs I ever saw reliably corrupt stored data).
Bottom line is that sometimes you'll get a dud, and it doesn't really
matter if you had a positive (or negative) experience with a
superficially similar product decades ago. The Samsung 980 Pros with the
bad firmware were a ticking time bomb, but they haven't been sold with
that version for years, and they haven't had issues since the fix. Other
Samsung SSDs have been fine. The 860s have relatively low write
endurance, but that's why they're as cheap as they are. You can either
avoid using them in write-intensive settings and get a drive advertised
for that role, or you can dramatically underprovision to lower the write
cycle of individual cells and create space for caching. That's true for
most low-cost drives, which is why they're low-cost, and why
high-write-cycle drives are fantastically expensive. The average
consumer will never write enough data to matter, but it is possible in
pathological cases if something on the system goes nuts and starts
sync-writing a really large number of small blocks.

  • From Klaus Singvogel@21:1/5 to gene heskett on Thu Dec 5 15:00:01 2024
    gene heskett wrote:
> On 12/5/24 06:59, Klaus Singvogel wrote:
> > It can be fixed by a firmware upgrade, and more recent batches of the
> > Samsung SSD 980 PRO are flashed/sold with good firmware out of the box.
>
> While I am saying that, my results with earlier Samsungs have been less than
> glorious; triple-layer NANDs turning into half capacity, for instance.

As I remember it, the PRO version of Samsung's SATA SSDs (not NVMe) survived the endurance test of a reputable website many years ago. All the other SSDs had died by the time the article was written; only the Samsung PRO SSD's test continued.

But, on the other hand, the regular version (no PRO in the name) of the Samsung SSD died first in that test.

I think the data written to the Samsung SSD in the test was several hundred TB, more than twice its rated write endurance.

But I have also noticed that the quality of Samsung SSDs has adapted to the quality of their competitors (not in a good way). I read a lot about the 980 PRO and the firmware debacle, but also heard that the 990 PRO shouldn't be any better.

    Best regards,
    Klaus.
    --
    Klaus Singvogel
    GnuPG-Key-ID: 1024R/5068792D 1994-06-27

  • From Erwan David@21:1/5 to Max Nikulin on Thu Dec 5 16:40:01 2024
    On Thu, Dec 05, 2024 at 04:26:18PM CET, Max Nikulin <manikulin@gmail.com> said:
> On 05/12/2024 16:19, Jörg-Volker Peetz wrote:
> > 1. SSDs have some self-healing capabilities (discarding defective sectors)
> > which are performed when the drive is not mounted. Therefore, enter the
> > BIOS of the computer and let it run for about an hour. Then restart
> > the computer.
>
> I am curious which way the OS notifies a drive that it is mounted. I believed
> that drivers read and write blocks, maybe switch power save states, but
> mount is performed on a higher level.


Why would the drive need to be notified?

    --
    Erwan David

  • From Daniel Harris@21:1/5 to mstone@debian.org on Thu Dec 5 16:40:01 2024
    On Thu, Dec 5, 2024 at 12:33 PM Michael Stone <mstone@debian.org> wrote:

> On Thu, Dec 05, 2024 at 10:53:54AM +0000, Daniel Harris wrote:
> > So it's not actually a crash. On the two occasions it has happened, I have
> > been away from my computer for a while, and when I return and move the
> > mouse, I can see messages scrolling on a black screen (no X running). I
> > can move to a new VT but I cannot log in. When I try to log in I just get
> > the errors repeating on the screen. After I do a hard reset, everything
> > works perfectly. No errors anywhere.
>
> Have you tried a memory test? Those symptoms and the smart output make
> me think the problem is in hardware other than the drive itself. Memory
> is the easiest to check and the easiest to remedy.

Memtest passed with no errors.

    Thanks
    Dan

    <div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Thu, Dec 5, 2024 at 12:33 PM Michael Stone &lt;<a href="mailto:mstone@debian.org">mstone@debian.org</a>&gt; wrote:<br></div><
    blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Thu, Dec 05, 2024 at 10:53:54AM +0000, Daniel Harris wrote:<br>
    &gt;So its not actually a crash.  On the 2 occasions it has happened, I have been<br>
    &gt;away from my computer for a while, and when I return and move the mouse, I can<br>
    &gt;see messages scrolling on a black screen (no X running).  I can move to a new<br>
    &gt;vt but I cannot log in.  When I try to log in I just get the errors repeating<br>
    &gt;on the screen.  After I do a hard reset everything works perfectly. No errors<br>
    &gt;anywhere.<br>

    Have you tried a memory test? Those symptoms and the smart output make <br>
    me think the problem is in hardware other than the drive itself. Memory <br>
    is the easiest to check and the easiest to remedy.<br> <br></blockquote><div>Memtest passed with no errors</div><div><br></div><div>Thanks</div><div>Dan <br></div></div></div>

  • From Andy Smith@21:1/5 to Erwan David on Thu Dec 5 17:10:01 2024
    Hi,

    On Thu, Dec 05, 2024 at 04:32:30PM +0100, Erwan David wrote:
> On Thu, Dec 05, 2024 at 04:26:18PM CET, Max Nikulin <manikulin@gmail.com> said:
> > On 05/12/2024 16:19, Jörg-Volker Peetz wrote:
> > > 1. SSDs have some self-healing capabilities (discarding defective sectors)
> > > which are performed when the drive is not mounted. Therefore, enter the
> > > BIOS of the computer and let it run for about an hour. Then restart
> > > the computer.
> >
> > I am curious which way the OS notifies a drive that it is mounted. I
> > believed that drivers read and write blocks, maybe switch power save
> > states, but mount is performed on a higher level.
>
> Why would the drive need to be notified?

    Jörg-Volker claimed that there were self-healing routines that only
    happen when an SSD is not mounted.

    I am highly skeptical.

    Thanks,
    Andy

    --
    https://bitfolk.com/ -- No-nonsense VPS hosting

  • From Michael Stone@21:1/5 to Max Nikulin on Thu Dec 5 18:50:01 2024
    On Thu, Dec 05, 2024 at 10:26:18PM +0700, Max Nikulin wrote:
> On 05/12/2024 16:19, Jörg-Volker Peetz wrote:
> > 1. SSDs have some self-healing capabilities (discarding defective
> > sectors) which are performed when the drive is not mounted.
> > Therefore, enter the BIOS of the computer and let it run for about
> > an hour. Then restart the computer.
>
> I am curious which way the OS notifies a drive that it is mounted. I
> believed that drivers read and write blocks, maybe switch power save
> states, but mount is performed on a higher level.

It doesn't: leaving the system unmounted ensures that the drive is idle,
but in general that's not necessary--just leaving the system alone will
usually have the same result unless you've got a runaway process chewing
on the disk. The SSD will do maintenance tasks when it's idle, or under
pressure (has no other choice because there are no writable blocks
available).

The relevant limitation is that an SSD physical block can only be
written once and then needs to be erased before another write. Changing
a logical block means writing the logical block to a different physical
location. Physical blocks vary in size but are many times the size of a
512 byte logically-addressable block. Many logical blocks (or versions
of the same logical block) can be written to a physical block, and
logical blocks that change leave unused older copies on the physical
block. The entire physical block must be erased to write anything to the
now-unused portions. This means copying all of the in-use logical blocks
to a different physical block before erasing the original physical
block. The drive will try to keep a pool of writable physical locations,
and has a cache of faster storage to hold data pending a write to slower
storage. Ideally your writes fit in cache, and the drive can do the
erasing and moving when the drive is idle. If you write more data than
can be cached, and there are no erased blocks to move data into, the
drive needs to relocate existing logical blocks to free up and erase
physical blocks before writing the new data. This has a significant
performance impact if you're trying to write faster than the drive can
relocate/erase.

If you use fstrim/discard you'll notify the drive that certain logical
blocks are not in use, allowing the physical block to be erased without
the need to read and relocate those logical blocks. A block is marked
unavailable/bad if it fails, and won't be used again. This will happen
transparently if a block fails on erase/write (the data will simply be
written to a different physical block and the logical block is
unaffected). The drive will also notice if a physical block is readable
but degrading, and will stop using it once any logical blocks it
contains are written to a new physical block. If a block totally fails
on read (much less common) it can't be relocated and the OS will get
very non-transparent errors every time it tries to read that logical
block. If you have a logical block that can't be read, discarding it can
effectively make it disappear (i.e., the drive marks it as unused
without needing to read it, and it will be available after it is written
to again). You may be able to revitalize a drive with a troublesome bad
block (e.g., underneath a directory entry so it can't be deleted and
trimmed) by trimming the entire drive and restoring from backup. This is
rare; in hundreds of TB of SSD I've encountered that situation exactly
once. In that case it may be just a fluke that won't reoccur, but I
probably wouldn't use that drive again (though if it was just a fluke,
the drive is likely fine and not using it is overly paranoid; the right
course of action depends on budget and risk tolerance).
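A sketch of that last-resort sequence, assuming the data is already backed up, since this discards everything on the device:

sudo blkdiscard /dev/nvme0n1   # TRIM every block on the whole drive
# then repartition, mkfs, and restore from backup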

  • From gene heskett@21:1/5 to Klaus Singvogel on Thu Dec 5 22:30:01 2024
    On 12/5/24 08:55, Klaus Singvogel wrote:
> gene heskett wrote:
> > On 12/5/24 06:59, Klaus Singvogel wrote:
> > > It can be fixed by a firmware upgrade, and more recent batches of the
> > > Samsung SSD 980 PRO are flashed/sold with good firmware out of the box.
> >
> > While I am saying that, my results with earlier Samsungs have been less
> > than glorious; triple-layer NANDs turning into half capacity, for instance.
>
> As I remember it, the PRO version of Samsung's SATA SSDs (not NVMe) survived
> the endurance test of a reputable website many years ago. All the other SSDs
> had died by the time the article was written; only the Samsung PRO SSD's
> test continued.
>
> But, on the other hand, the regular version (no PRO in the name) of the
> Samsung SSD died first in that test.
>
> I think the data written to the Samsung SSD in the test was several hundred
> TB, more than twice its rated write endurance.
>
> But I have also noticed that the quality of Samsung SSDs has adapted to the
> quality of their competitors (not in a good way). I read a lot about the 980
> PRO and the firmware debacle, but also heard that the 990 PRO shouldn't be
> any better.

The memory business is very competitive, so this doesn't surprise me.
We've come a long way since the TV station I was working for paid $400
for a 4K static memory to populate an 1802-based S100 board. We call
today's slap-it-together-and-get-it-out-the-door attitude the bblb
syndrome. So my next experiment is to destroy





> Best regards,
>         Klaus.


    Cheers, Gene Heskett, CET.
    --
    "There are four boxes to be used in defense of liberty:
    soap, ballot, jury, and ammo. Please use in that order."
    -Ed Howdershelt (Author, 1940)
    If we desire respect for the law, we must first make the law respectable.
    - Louis D. Brandeis

  • From gene heskett@21:1/5 to Erwan David on Fri Dec 6 02:00:01 2024
    On 12/5/24 10:33, Erwan David wrote:
> On Thu, Dec 05, 2024 at 04:26:18PM CET, Max Nikulin <manikulin@gmail.com> said:
> > On 05/12/2024 16:19, Jörg-Volker Peetz wrote:
> > > 1. SSDs have some self-healing capabilities (discarding defective sectors)
> > > which are performed when the drive is not mounted. Therefore, enter the
> > > BIOS of the computer and let it run for about an hour. Then restart
> > > the computer.
> >
> > I am curious which way the OS notifies a drive that it is mounted. I
> > believed that drivers read and write blocks, maybe switch power save
> > states, but mount is performed on a higher level.
>
> Why would the drive need to be notified?
Wrong question, IMO. If, as you say (and that makes perfect sense), the
drive does this housekeeping while unmounted, there should be a mechanism
to advise the user that the boot mount will be delayed until the drive
reports its validation is complete. That would serve the purpose of
advising the user that its use-by date is rapidly approaching. As it is,
we are at the mercy of the drive maker until it goes RO, and that most
certainly is not a desirable situation.

The fact that the uSD cards used for nearly everything in the ARM arena
seem to be capable of doing this "housekeeping" while mounted is a good
thing, as I only power them down to do my mods, and my failure rate on
those is actually much better. A Kill A Watt says my rebuilt printers,
standing idle between jobs, draw 14 watts. The Pi clones have had zero
uSD card failures in over a decade.

    Cheers, Gene Heskett, CET.
    --
    "There are four boxes to be used in defense of liberty:
    soap, ballot, jury, and ammo. Please use in that order."
    -Ed Howdershelt (Author, 1940)
    If we desire respect for the law, we must first make the law respectable.
    - Louis D. Brandeis

  • From Michael Stone@21:1/5 to All on Fri Dec 6 15:00:02 2024
    On Fri, Dec 06, 2024 at 02:26:23PM +0100, Jörg-Volker Peetz wrote:
> Should have been more clear. The drive should be idle for a longer
> time. This is assured by not mounting any partition of the SSD.
> I was able to "repair" unreadable sectors on a built-in SSD of an
> HP ProBook laptop. As far as I remember I also deleted files which
> could not be read any more because of defective sectors and restored
> the files from backup. Such unreadable files can be found by
> performing, e.g., a checksum calculation of all files on the SSD.
> Then, leaving the SSD alone, it was able to "replace" the defective
> sectors with spare sectors.

    Sorry, I don't buy that. Whatever happened, it wasn't the drive
    pondering unreadable sectors and then regenerating them. I can believe
    that deleting unreadable files and restoring them made them readable
    again. (Overwriting a bad sector will cause the original block to be
    freed and potentially discarded; after rewriting, the data is not in the
    same physical location it was before.) As outlined in a previous post,
    trimming unused space may also let the drive discard bad blocks. None of
    that requires the drive to be unmounted.
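(E.g., a routine

sudo fstrim -av   # trim free space on all mounted filesystems that support discard

does exactly that, with everything still mounted.)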

  • From =?UTF-8?Q?J=C3=B6rg-Volker_Peetz?=@21:1/5 to Michael Stone on Fri Dec 6 14:30:02 2024
    Hi,

    Michael Stone wrote on 05/12/2024 18:41:
> On Thu, Dec 05, 2024 at 10:26:18PM +0700, Max Nikulin wrote:
> > On 05/12/2024 16:19, Jörg-Volker Peetz wrote:
> > > 1. SSDs have some self-healing capabilities (discarding defective
> > > sectors) which are performed when the drive is not mounted. Therefore,
> > > enter the BIOS of the computer and let it run for about an hour. Then
> > > restart the computer.
> >
> > I am curious which way the OS notifies a drive that it is mounted. I
> > believed that drivers read and write blocks, maybe switch power save
> > states, but mount is performed on a higher level.
>
> It doesn't: leaving the system unmounted ensures that the drive is idle,
> but in general that's not necessary--just leaving the system alone will
> usually have the same result unless you've got a runaway process chewing
> on the disk. The SSD will do maintenance tasks when it's idle, or under
> pressure (has no other choice because there are no writable blocks
> available).

I should have been clearer. The drive should be idle for a longer time. This is
assured by not mounting any partition of the SSD.
I was able to "repair" unreadable sectors on a built-in SSD of an HP ProBook
laptop. As far as I remember, I also deleted files which could not be read any
more because of defective sectors and restored the files from backup. Such
unreadable files can be found by performing, e.g., a checksum calculation of all
files on the SSD. Then, leaving the SSD alone, it was able to "replace" the
defective sectors with spare sectors.

    Regards,
    Jörg.

  • From =?UTF-8?Q?J=C3=B6rg-Volker_Peetz?=@21:1/5 to Michael Stone on Fri Dec 6 15:10:01 2024
    Hi,

    Michael Stone wrote on 06/12/2024 14:49:
> On Fri, Dec 06, 2024 at 02:26:23PM +0100, Jörg-Volker Peetz wrote:
> > Should have been more clear. The drive should be idle for a longer time.
> > This is assured by not mounting any partition of the SSD.
> > I was able to "repair" unreadable sectors on a built-in SSD of an HP ProBook
> > laptop. As far as I remember I also deleted files which could not be read
> > any more because of defective sectors and restored the files from backup.
> > Such unreadable files can be found by performing, e.g., a checksum
> > calculation of all files on the SSD. Then, leaving the SSD alone, it was
> > able to "replace" the defective sectors with spare sectors.
>
> Sorry, I don't buy that. Whatever happened, it wasn't the drive pondering
> unreadable sectors and then regenerating them. I can believe that deleting
> unreadable files and restoring them made them readable again. (Overwriting a
> bad sector will cause the original block to be freed and potentially
> discarded; after rewriting, the data is not in the same physical location it
> was before.)

Yes, you are right. I also called 'fstrim -a' before restarting the computer
into the BIOS.

> As outlined in a previous post, trimming unused space may also let the drive
> discard bad blocks. None of that requires the drive to be unmounted.

Maybe that would have worked if I had waited long enough. But I did it by
letting the computer stay in the BIOS for a while.

As a check that the defective sectors were all mapped out, I read all sectors
of the partitions:

    sudo dd if=/dev/sdaX of=/dev/null bs=8M status=progress

    Regards,
    Jörg.

  • From Michael Stone@21:1/5 to Max Nikulin on Fri Dec 6 17:40:02 2024
    On Fri, Dec 06, 2024 at 10:51:20PM +0700, Max Nikulin wrote:
> Michael, thank you for the long message. Actually I wonder what is
> "idle" that allows the drive to perform self-maintenance. I expect that
> the device should not be in some deep power saving state (I am yet to
> discover available tunables that allow the drive to "sleep"). Should it
> be some period of time (seconds? minutes?) completely without any IO,
> or is it enough if read/write speed is below some threshold and
> from/to another chip?

Basically it means that it isn't busy doing I/O; if the host is reading or
writing, the drive can't also be using the flash for its own housekeeping.
It doesn't need to be absolutely unused.

> As to erase block size, I am aware of it. On the other hand I am
> surprised that a drive does not allow the kernel to optimize writes on a
> higher level (as uSD does):
>
> grep '' /sys/block/*/queue/discard_granularity
> ...
> /sys/block/mmcblk0/queue/discard_granularity:4194304
> /sys/block/nvme0n1/queue/discard_granularity:4096
> /sys/block/sda/queue/discard_granularity:4096   # hdd (shingled)

    The discard_granularity *limits* how the kernel can tell the drive that
    there are free blocks--a granularity of 4M means that the kernel can
    only issue a TRIM command when it has at least 4M of empty space *and*
    that empty space is aligned on a 4M boundary. (That is, you can't
    discard locations 2-5M on the drive, only 0-3M, 4-7M, etc.) It's a big
    number on the sd card because sd cards are pretty much junk. On a decent
    NVMe drive it'll typically be 512 (i.e., you can discard any logical
    block) or maybe 4096 if you're in 4k mode.
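Incidentally, lsblk can show the same limits per device without grepping sysfs:

lsblk --discard   # DISC-GRAN and DISC-MAX columns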

  • From Michael Stone@21:1/5 to Max Nikulin on Sun Dec 8 18:20:01 2024
    On Sun, Dec 08, 2024 at 11:26:51PM +0700, Max Nikulin wrote:
> I switched this NVMe drive to 4k mode. However, I took your message as a
> statement that internally drives still use a larger erase block size.

    The erase block is going to be many megabytes, it has nothing to do with
    the logical blocks. The erase block isn't erased as each logical block
    is written, it is erased when it's empty. Many logical blocks can be
    written (sequentially) over time to the same erase block. Some drives
    work better with 4k logical blocks but in general I don't recommend
    using them--having a mix of 4k and 512b blocks on a system is a bit of a
    pain, and it makes replacing a drive more complicated. Not all drives
    support 4k, and many that do get no benefit from such a configuration.
    E.g.:

    # nvme id-ns -H /dev/nvme0n1 | grep Rel
    LBA Format 0 : Metadata Size: 0 bytes - Data Size: 512 bytes - Relative Performance: 0 Best (in use)
    LBA Format 1 : Metadata Size: 0 bytes - Data Size: 4096 bytes - Relative Performance: 0 Best

    This drive supports either format, but both are "Best". Other drives
    will recommend one or the other:

    # nvme id-ns -H /dev/nvme0n1 | grep Rel
    LBA Format 0 : Metadata Size: 0 bytes - Data Size: 512 bytes - Relative Performance: 0x2 Good (in use)
    LBA Format 1 : Metadata Size: 0 bytes - Data Size: 4096 bytes - Relative Performance: 0x1 Better

    Or support only one:

    # nvme id-ns -H /dev/nvme1n1 | grep Rel
    LBA Format 0 : Metadata Size: 0 bytes - Data Size: 512 bytes - Relative Performance: 0 Best (in use)

    I'd rather just stick with 512b and not worry about it. Of the drives
    above, the one which doesn't care is the newest/fastest. It likely
    supports 4k format because it's a U.3 drive which could be used in a
    storage array already configured for 4k. The other two are in the same
    LVM VG, and if one of them were formatted 4k I wouldn't be able to
    migrate volumes between them (so any possible, likely not noticeable, performance benefit from a 4k format would be outweighed by the
    inconvenience.) The one that recommends a 4k format is the oldest,
    smallest, and slowest by far (Gen3 vs Gen4) and at best is half the
    speed of the others, regardless of format.
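For completeness, switching the LBA format is itself a (data-destroying) format operation, something like:

sudo nvme format /dev/nvme0n1 --lbaf=1   # reformat to LBA format index 1 (4096-byte blocks)

but as above, I wouldn't bother unless the drive explicitly rates 4k as better.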

  • From Daniel Harris@21:1/5 to manikulin@gmail.com on Mon Jan 13 16:30:01 2025
Just an update to this thread.

It was actually a software bug in desktop-portal (or something like that).
Once that was removed, my system has been rock solid.

    Thanks Dan


    On Tue, Dec 10, 2024 at 3:41 AM Max Nikulin <manikulin@gmail.com> wrote:

> On 09/12/2024 00:14, Michael Stone wrote:
> > Not all drives support 4k, and many that do get no benefit from such a
> > configuration.
> [...]
> > # nvme id-ns -H /dev/nvme0n1 | grep Rel
> > LBA Format 0 : Metadata Size: 0 bytes - Data Size: 512 bytes -
> > Relative Performance: 0x2 Good (in use)
> > LBA Format 1 : Metadata Size: 0 bytes - Data Size: 4096 bytes -
> > Relative Performance: 0x1 Better
>
> It is my case. I decided that ext4 uses 4k blocks anyway, so it is
> better to be consistent with hardware and firmware developers.
>
> As to erase block size, my expectation is that some drivers might
> benefit if that size is known: flushing caches (especially in laptop
> mode), allocating space for new files. I have no evidence that it is
> implemented, though. Perhaps dedicated chips and caches inside drives may
> do it more efficiently (besides dumb cheap models).
>
> mkfs.* tools might use the erase block size to align filesystem structures.
>
> It is the reason why I was surprised that the erase block size is not
> exposed to the kernel.
>
> My real curiosity was caused by "not mounting" a drive to allow self
> healing. "Idle" is imprecise from my point of view, but I think we may
> stop here. There is a chance that I will accidentally notice a detailed
> article on this topic.




    <div dir="ltr"><div>Just an Update to this thread.</div><div><br></div><div>It was actually a software bug in desktop-portal (or something like that).  Once that was removed my system has been rock solid.</div><div><br></div><div>Thanks Dan<br></div></
    <div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Dec 10, 2024 at 3:41 AM Max Nikulin &lt;<a href="mailto:manikulin@gmail.com" target="_blank">manikulin@gmail.com</a>&gt; wrote:<br></div><
    blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 09/12/2024 00:14, Michael Stone wrote:<br>
    &gt; Not all drives <br>
    &gt; support 4k, and many that do get no benefit from such a configuration. <br>
    [...]<br>
    &gt; # nvme id-ns -H /dev/nvme0n1 | grep Rel<br>
    &gt; LBA Format  0 : Metadata Size: 0   bytes - Data Size: 512 bytes - <br> &gt; Relative Performance: 0x2 Good (in use)<br>
    &gt; LBA Format  1 : Metadata Size: 0   bytes - Data Size: 4096 bytes - <br> &gt; Relative Performance: 0x1 Better<br>

    It is my case. I decided that ext4 uses 4k blocks anyway, so it is <br>
    better to be consistent with hardware&amp;firmware developers.<br>

    As to erase block size, my expectation is that some drivers might <br>
    benefit if that size is known: flushing caches (especially in laptop <br> mode), allocating space for new files. I have no evidences that it is <br> implemented though. Perhaps dedicated chips and caches inside drives may <br> do it more efficiently (besides dumb cheap models).<br>

    mkfs.* tools might use erase block size to align filesystem structures.<br>

    It is the reason why I was surprised that erase block size is not <br>
    exposed to kernel.<br>

    My real curiosity was caused by &quot;not mounting&quot; a drive to allow self <br>
    healing. &quot;Idle&quot; is imprecise from my point of view, but I think we may <br>
    stop here. There is a chance that I will accidentally notice a detailed <br> article on this topic.<br>


    </blockquote></div></div>
