Hello
I have been using the stable branch but recently it has not been so
stable. I have experienced some unexpected behavior. Not sure if it's
related to this Ubuntu bug ( https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1805816 )
Seeing the similarity, especially that we are both using similar drives
(mine is a Samsung SSD 980 PRO 1TB), makes me think it might be a
hardware instead of a software issue.
Anyway, it's interesting that it has only started recently. Different
kernel possibly?
Not sure if I can attach a picture of the error messages but some of
the errors are
ext4_find_entry
ext4_journal_check_start
ext4_setattr
mounting filesystem read-only
I have done a file system check and everything seems ok and the drive
doesn't have any error messages and journalctl doesn't seem to log any
errors (not that i can find anyway).
I will have to disable APST, but I thought this bug would have
been fixed since 2018; maybe not.
ps my system is a desktop system not a laptop if that makes a
difference.
Thanks Dan
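For anyone following along, here is a minimal sketch of how APST (Autonomous Power State Transition) can be checked on Linux. The `nvme_core.default_ps_max_latency_us` module parameter is a real kernel parameter, and setting it to 0 disables APST; the helper name `apst_setting` and its optional path argument are made up for illustration and are not from this thread.

```shell
#!/bin/sh
# Print the current APST latency limit of the nvme_core module.
# 0 means APST is disabled; any other value is the maximum allowed
# power-state transition latency in microseconds.
# The optional path argument is an illustrative assumption so the
# helper can be pointed at a different file.
apst_setting() {
    param=${1:-/sys/module/nvme_core/parameters/default_ps_max_latency_us}
    if [ -r "$param" ]; then
        value=$(cat "$param")
        if [ "$value" = "0" ]; then
            echo "APST disabled"
        else
            echo "APST enabled (max latency ${value} us)"
        fi
    else
        echo "nvme_core parameter not found"
    fi
}

# To disable APST persistently, add the parameter to the kernel command
# line, e.g. in /etc/default/grub, then run update-grub and reboot:
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet nvme_core.default_ps_max_latency_us=0"
```

Calling `apst_setting` with no argument reads the live value. Whether disabling APST helps with the errors described here is not established in this thread.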
Daniel Harris wrote:
Seeing the similarity, especially that we are both using similar drives (mine is a Samsung SSD 980 PRO 1TB), makes me think it might be a hardware instead of a software issue.
The referenced Samsung SSD 980 PRO has a critical firmware bug.
This bug hit me too. I had to replace my broken SSD with a new one through Samsung support.
I use full-disk encryption and therefore lost everything written since my
last backup, which was 5 days old.
Some more details here:
https://www.pugetsystems.com/support/guides/critical-samsung-ssd-firmware-update/
Best regards,
Klaus.
--
Klaus Singvogel
GnuPG-Key-ID: 1024R/5068792D 1994-06-27
EXT4-fs (nvme0n1p2): Remounting filesystem read-only
EXT4-fs error (device nvme0n1p2): ext4_journal_check_start:83: comm systemd-journal: Detected aborted journal
[633582.907324] EXT4-fs error (device nvme0n1p2): _ext4_find_entry:1683: inode #5898248: comm ntpd: reading directory lblock 0
[633582.908250] EXT4-fs (nvme0n1p2): Remounting filesystem read-only
EXT4-fs (nvme0n1p2): Remounting
[633583.798466] EXT4-fs (nvme0n1p2): Remounting filesystem read-only
EXT4-fs error (device nvme0n1p2): ext4_journal_check_start:83: comm systemd-journal: Detected aborted journal
EXT4-fs (nvme0n1p2): Remounting filesystem read-only
EXT4-fs error (device nvme0n1p2): _ext4_find_entry:1683: inode #1966081: comm cron: reading directory lblock 0
EXT4-fs (nvme0n1p2): Remounting filesystem read-only
EXT4-fs error (device nvme0n1p2) in ext4_setattr:5628: Journal has aborted
On Wednesday, 4 December 2024 09:29:17 -03 Daniel Harris wrote:
Hello
I have been using the stable branch but recently it has not been so
stable. I have experienced some unexpected behavior. Not sure if it's related to this Ubuntu bug ( https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1805816 )
I have done a file system check and everything seems ok and the drive doesn't have any error messages and journalctl doesn't seem to log any errors (not that i can find anyway).
I will have to disable the apst but i just thought this bug might have
been fixed from 2018 but maybe not.
Kernel 4.19. I'd be very surprised if that would be related.
Thanks Dan
I'd save my data a.s.a.p. and install a new NVME drive if I were in your shoes.
have a nice day
--
Eike Lantzsch KY4PZ / ZP5CGE
Thanks for all your replies.
As far as I can tell there are no errors reported by fsck, smartctl or nvme,
and the firmware is the correct and newest version, so no problems there.
The following are the messages that appear; they were photographed with my phone and copied from the photo (lots of scrolling errors repeating over and over).
I thought these new drives were supposed to last longer than older moving HDDs, but obviously not.
I guess it's time to buy a new drive :(
On Wed, Dec 4, 2024 at 2:47 PM Klaus Singvogel wrote:
Some more details here: https://www.pugetsystems.com/support/guides/critical-samsung-ssd-firmware-update/
That's interesting (in a morbid sort of way).
Do you know if fwupdmgr will detect out-of-date firmware on the drives?
Jeffrey Walton wrote:
On Wed, Dec 4, 2024 at 2:47 PM Klaus Singvogel wrote:
Some more details here:
https://www.pugetsystems.com/support/guides/critical-samsung-ssd-firmware-update/
That's interesting (in a morbid sort of way).
Do you know if fwupdmgr will detect out-of-date firmware on the drives?
I'm sorry, but I don't know. I only became aware of fwupdmgr afterwards.
At least the replacement Samsung SSD was detected by fwupdmgr on my last run.
Best regards,
Klaus.
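A minimal sketch of how the installed firmware revision can be read and how fwupd can be asked for updates. `nvme id-ctrl`, `fwupdmgr refresh` and `fwupdmgr get-updates` are real commands; the helper name `fw_revision` is made up for illustration. Whether an update is actually offered depends on the vendor publishing firmware through LVFS, which this thread does not confirm for the 980 PRO.

```shell
#!/bin/sh
# Extract the firmware revision ("fr" field) from `nvme id-ctrl` output
# read on standard input. The pattern matches only the "fr" field,
# not longer names such as "frmw".
fw_revision() {
    awk -F: '$1 ~ /^fr[ \t]*$/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $2)
        print $2
    }'
}

# Usage (as root):
#   nvme id-ctrl /dev/nvme0 | fw_revision
#
# Ask fwupd whether it knows of newer firmware for any device:
#   fwupdmgr refresh
#   fwupdmgr get-updates
```

The printed revision can then be compared by hand against the versions listed in the Puget Systems article linked above.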
On Wed, Dec 04, 2024 at 05:11:47PM +0000, Daniel Harris wrote:
Thanks for all your replies.
As far as I can tell there are no errors reported using fsck or smartctl or
nvme
and the firmware is the correct and newest version so no problems there.
The following are the messages that appear but only taken from my phone and
copied from the photo (lots of scrolling errors repeating over).
I thought these new drives were supposed to last longer than older moving HDD
but obviously not
Is this during boot? The messages indicate a corrupted journal, which generally means a device error, or maybe a device which lost power while writing. It should be possible to mount read-only without replaying the journal for recovery purposes, but it's basically unfixable.
I guess its time to buy a new drive : (
Did you try "nvme smart-log /dev/nvme0" to look for issues?
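A rough sketch of both suggestions in this reply. The field names (`critical_warning`, `media_errors`, `num_err_log_entries`) are the ones nvme-cli typically prints in its plain-text smart log, but they may differ between versions; the helper name `smart_problems` is hypothetical. The `noload` mount option is a documented ext4 option that skips loading the journal.

```shell
#!/bin/sh
# Read `nvme smart-log` output on standard input and print every
# health counter of interest that is not zero, as name=value.
# Returns 1 if any such counter was found, 0 otherwise.
# Note: a non-zero num_err_log_entries is not always serious; some
# drives log benign entries there.
smart_problems() {
    awk -F: '
        $1 ~ /^(critical_warning|media_errors|num_err_log_entries)[ \t]*$/ {
            name = $1; value = $2
            gsub(/[ \t]/, "", name)
            gsub(/[ \t,]/, "", value)
            if (value != "0") { print name "=" value; bad = 1 }
        }
        END { exit bad ? 1 : 0 }
    '
}

# Usage (as root):
#   nvme smart-log /dev/nvme0 | smart_problems && echo "no counters raised"
#
# Mounting an ext4 filesystem read-only without replaying its journal,
# for data recovery (device and mount point taken from this thread's logs
# as an example):
#   mount -o ro,noload /dev/nvme0n1p2 /mnt
```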
Interesting comments here. I've been using SSDs since 40G was the biggest. The 256G spinning rust, now 15 years old, is the only spinning rust left
here. And I've a drawer full of Samsung 860-870 series drives that have all gone wonky, but not RO yet. I now have a mixture of stuff from Taiwan in 2T and 4T sizes, all healthy. I guess I was not the only one that got questionable drives from Samsung. This is the first time I've seen them discussed in this context. Thank you for saying something out loud.
So it's not actually a crash. On the 2 occasions it has happened, I have been away from my computer for a while, and when I return and move the mouse, I can see messages scrolling on a black screen (no X running). I can move to a new vt but I cannot log in. When I try to log in I just get the errors repeating on the screen. After I do a hard reset everything works perfectly. No errors anywhere.
Hi Gene,
gene heskett wrote:
interesting comments here. I've been using SSD's since 40G was the biggest.
The 256G spinning rust, now 15 years old is the only spinning rust left
here. And I've drawer full of samsung 860-870 series drives that have all
gone wonky but not RO yet.. I now have a mixture of stuff from Taiwan in 2T
and 4T sizes, all healthy. I guess I was not the only one that got
questionable drives from Samsung. This is the first time I've seen them
discussed in this context. Thank you for saying something out loud.
To be clear: only this exact Samsung model, the SSD 980 PRO, isn't working properly.
Repeat: only the Samsung SSD 980 PRO.
It can be fixed by a firmware upgrade, and more recent batches of the Samsung SSD 980 PRO are flashed and sold with good firmware out of the box.
Best regards,
Klaus.
While I am saying that, my results with earlier Samsung drives have been less
than glorious: triple-layer NAND turning into half capacity, for
instance.
On 05/12/2024 16:19, Jörg-Volker Peetz wrote:
1. SSDs have some self-healing capabilities (discarding defective sectors), which run when the drive is not mounted. Therefore, enter the BIOS of the computer and leave it running for about an hour. Then restart
the computer.
I am curious how the OS would notify a drive that it is mounted. I believed that drivers read and write blocks and maybe switch power-save states, but
mounting is performed at a higher level.
On Thu, Dec 05, 2024 at 10:53:54AM +0000, Daniel Harris wrote:
So its not actually a crash. On the 2 occasions it has happened, I have been
away from my computer for a while, and when I return and move the mouse, I can
see messages scrolling on a black screen (no X running). I can move to a new
vt but I cannot log in. When I try to log in I just get the errors repeating
on the screen. After I do a hard reset everything works perfectly. No errors
anywhere.
Have you tried a memory test? Those symptoms and the smart output make
me think the problem is in hardware other than the drive itself. Memory
is the easiest to check and the easiest to remedy.
Memtest passed with no errors
On Thu, Dec 05, 2024 at 04:26:18PM CET, Max Nikulin <manikulin@gmail.com> said:
On 05/12/2024 16:19, Jörg-Volker Peetz wrote:
1. SSD's have some self healing capacities (discarding defect sectors) which are performed when the drive is not mounted. Therefore, enter the BIOS of the computer and let it running for ca. an hour. Then restart
the computer.
I am curious which way OS notifies a drive that it is mounted. I believed that drivers read and write blocks, maybe switch power save states, but mount is performed on a higher level.
Why would the drive need to be notified?
gene heskett wrote:
On 12/5/24 06:59, Klaus Singvogel wrote:
It can be fixed by a firmware upgrade, and more recent batches of the Samsung SSD 980 PRO are flashed and sold with good firmware out of the box.
While I am saying that my results with earlier Samsung have been less than glorious. triple layer nand's turning into half capacity for instance.
As I recall, the PRO version of Samsung SSDs (not NVMe) survived the endurance test of a reputable website many years ago. By the time the article was written, all other SSDs had died except the Samsung PRO, and its test continued.
On the other hand, the regular version of the Samsung SSD (no PRO in the name) was the first to die in that test.
I think the data written to the Samsung PRO in the test exceeded twice its rated write endurance, several hundred TB.
But I have also noticed that the quality of Samsung SSDs has converged on that of their competitors (and not in a good way). I read a lot about the 980 PRO and the firmware debacle, and also heard that the 990 PRO is said to be no better.
Best regards,
Klaus.
On Thu, Dec 05, 2024 at 04:26:18PM CET, Max Nikulin <manikulin@gmail.com> said:
On 05/12/2024 16:19, Jörg-Volker Peetz wrote:
1. SSD's have some self healing capacities (discarding defect sectors)
which are performed when the drive is not mounted. Therefore, enter the
BIOS of the computer and let it running for ca. an hour. Then restart
the computer.
I am curious which way OS notifies a drive that it is mounted. I believed
that drivers read and write blocks, maybe switch power save states, but
mount is performed on a higher level.
Why would the drive need to be notified?
Should have been more clear. The drive should be idle for a longer
time. This is assured by not mounting any partition of the SSD.
I was able to "repair" unreadable sectors on a built-in SSD of an
HP-Probook laptop. As far as I remember I also deleted files which
could not be read any more because of defective sectors and restored
the files from backup. Such unreadable files can be found by
performing, e.g., a checksum calculation of all files on the SSD.
Then, leaving the SSD alone, it was able to "replace" the defective
sectors by spare sectors.
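A minimal sketch of the checksum pass mentioned above, assuming `sha256sum` from GNU coreutils is installed. The helper name `list_unreadable` is made up for illustration. It reads every regular file in full and prints the ones that cannot be read to the end, which is how files sitting on unreadable sectors would show up.

```shell
#!/bin/sh
# Read every regular file under a directory (default: current directory)
# and print the path of each file that cannot be checksummed, e.g.
# because of an I/O error or missing read permission.
# -xdev keeps find on a single filesystem.
list_unreadable() {
    find "${1:-.}" -xdev -type f -exec sh -c '
        for f do
            sha256sum "$f" >/dev/null 2>&1 || printf "%s\n" "$f"
        done
    ' sh {} +
}

# Usage (as root, so permissions do not cause false positives):
#   list_unreadable / > /tmp/unreadable-files.txt
```

Any file listed would then be a candidate for deleting and restoring from backup, as described in the message.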
On Thu, Dec 05, 2024 at 10:26:18PM +0700, Max Nikulin wrote:
On 05/12/2024 16:19, Jörg-Volker Peetz wrote:
1. SSD's have some self healing capacities (discarding defect sectors) which
are performed when the drive is not mounted. Therefore, enter the BIOS of the
computer and let it running for ca. an hour. Then restart the computer.
I am curious which way OS notifies a drive that it is mounted. I believed that
drivers read and write blocks, maybe switch power save states, but mount is
performed on a higher level.
It doesn't: leaving the system unmounted ensures that the drive is idle, but in
general that's not necessary--just leaving the system alone will usually have the same result unless you've got a runaway process chewing on the disk. The SSD
will do maintenance tasks when it's idle, or under pressure (has no other choice
because there are no writable blocks available).
On Fri, Dec 06, 2024 at 02:26:23PM +0100, Jörg-Volker Peetz wrote:
Should have been more clear. The drive should be idle for a longer time. This
is assured by not mounting any partition of the SSD.
I was able to "repair" unreadable sectors on a built-in SSD of an HP-Probook
laptop. As far as I remember I also deleted files which could not be read any
more because of defective sectors and restored the files from backup. Such
unreadable files can be found by performing, e.g., a checksum calculation of
all files on the SSD. Then, leaving the SSD alone, it was able to "replace"
the defective sectors by spare sectors.
Sorry, I don't buy that. Whatever happened, it wasn't the drive pondering unreadable sectors and then regenerating them. I can believe that deleting unreadable files and restoring them made them readable again. (Overwriting a bad
sector will cause the original block to be freed and potentially discarded; after rewriting, the data is not in the same physical location it was before.)
As outlined in a previous post, trimming unused space may also let the drive discard bad blocks. None of that requires the drive to be unmounted.
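A small sketch of trimming on a mounted filesystem. `fstrim` and the `fstrim.timer` systemd unit are real and ship with util-linux; the `discard_max_bytes` sysfs attribute is a real queue attribute that reads 0 when a device does not support discard. The helper name `supports_discard` and its optional sysfs-root argument are assumptions made for illustration.

```shell
#!/bin/sh
# Succeed if the named block device (e.g. nvme0n1) advertises discard
# support, i.e. its discard_max_bytes attribute is greater than zero.
# The optional second argument overrides the sysfs root; it exists
# only for illustration and testing.
supports_discard() {
    dev=$1
    sysfs=${2:-/sys/block}
    f="$sysfs/$dev/queue/discard_max_bytes"
    [ -r "$f" ] && [ "$(cat "$f")" -gt 0 ]
}

# Usage (as root), on a mounted filesystem:
#   supports_discard nvme0n1 && fstrim -v /
#
# Or let systemd trim periodically:
#   systemctl enable --now fstrim.timer
```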
Michael, thank you for the long message. Actually I wonder what
"idle" has to mean for a drive to perform self-maintenance. I expect that
the device should not be in some deep power-saving state (I have yet to
discover the tunables that allow a drive to "sleep"). Should it
be some period of time (seconds? minutes?) completely without any IO,
or is it enough if read/write speed is below some threshold and
from/to another chip?
As to erase block size, I am aware of it. On the other hand I am
surprised that a drive does not allow kernel to optimize writes on a
higher level (as uSD does):
grep '' /sys/block/*/queue/discard_granularity
...
/sys/block/mmcblk0/queue/discard_granularity:4194304
/sys/block/nvme0n1/queue/discard_granularity:4096
/sys/block/sda/queue/discard_granularity:4096 # hdd (shingled)
I switched this NVMe drive to 4k mode. However, I took your
message to mean that drives still use a larger erase
block size internally.
On 09/12/2024 00:14, Michael Stone wrote:
Not all drives [...] support 4k, and many that do get no benefit from such a configuration.
# nvme id-ns -H /dev/nvme0n1 | grep Rel
LBA Format 0 : Metadata Size: 0 bytes - Data Size: 512 bytes -
Relative Performance: 0x2 Good (in use)
LBA Format 1 : Metadata Size: 0 bytes - Data Size: 4096 bytes -
Relative Performance: 0x1 Better
That is my case. I decided that since ext4 uses 4k blocks anyway, it is
better to be consistent with the hardware and firmware developers.
As to erase block size, my expectation is that some drivers might
benefit if that size is known: flushing caches (especially in laptop
mode), allocating space for new files. I have no evidence that this is
implemented, though. Perhaps dedicated chips and caches inside drives
do it more efficiently (except in dumb cheap models).
mkfs.* tools might use the erase block size to align filesystem structures.
That is why I was surprised that the erase block size is not
exposed to the kernel.
My real curiosity was caused by "not mounting" a drive to allow self
healing. "Idle" is imprecise from my point of view, but I think we may
stop here. There is a chance that I will accidentally notice a detailed article on this topic.
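A small sketch for reading which LBA format a namespace currently uses, based on the `nvme id-ns -H` output quoted above. The helper name `lbaf_in_use` is made up for illustration, and it assumes each "LBA Format" entry arrives on a single line (the two-line form in the message looks like mail wrapping).

```shell
#!/bin/sh
# Read `nvme id-ns -H` output on standard input and print the data size
# in bytes of the LBA format marked "(in use)".
lbaf_in_use() {
    sed -n 's/.*Data Size: *\([0-9][0-9]*\) *bytes.*(in use).*/\1/p'
}

# Usage (as root):
#   nvme id-ns -H /dev/nvme0n1 | lbaf_in_use
#
# Switching formats is done with `nvme format --lbaf=N`, which erases
# the namespace, so it only makes sense before creating filesystems.
```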