• How Do SSDs Wear Out?

    From Boris@21:1/5 to All on Fri Feb 14 05:47:40 2025
    I understand that HDDs can have mechanical failures, but when SSDs came on
    the scene, I wondered how it is that SSDs 'wear' out. I've got many
    machines with HDDs (one is a 20 year old XP box, still working fine) and
    some with SSDs, none of which have failed. I've also got many external
    HDDs, all still good.

    Anyway, I've always heard that SSDs can wear out after many writes. I
    started to read about the physical construction of SSDs, but I ended up
    going down the rabbit hole, reading about wear leveling and, of course,
    trim, but never found anything about *why* an SSD 'wears' out.

    How does an SSD wear out? And while I'm asking, does the same 'wearing
    out' occur on a USB flash drive?

    Thanks.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Paul@21:1/5 to Boris on Fri Feb 14 03:11:16 2025
    On Fri, 2/14/2025 12:47 AM, Boris wrote:
    I understand that HDDs can have mechanical failures, but when SSDs came
    on the scene, I wondered how it is that SSDs 'wear' out. I've got many
    machines with HDDs (one is a 20 year old XP box, still working fine) and
    some with SSDs, none of which have failed. I've also got many external
    HDDs, all still good.

    Anyway, I've always heard that SSDs can wear out after many writes. I
    started to read about the physical construction of SSDs, but I ended up
    going down the rabbit hole, reading about wear leveling and, of course,
    trim, but never found anything about *why* an SSD 'wears' out.

    How does an SSD wear out? And while I'm asking, does the same 'wearing
    out' occur on a USB flash drive?

    Thanks.


    The physical cells, the structure at the atomic level, are
    damaged by the writes.

    Each cell has a "voltage" stored on it, established by putting
    some electrons on a floating gate. The direct path onto the gate is
    blocked by an insulating oxide (classically disallowed), so getting
    the electrons there requires quantum tunneling. The electrons will sit
    on the gate for up to ten years (retention time estimate, info
    on this has not been updated in a long long time so we are left
    to guess whether it scales in any way with gate size).

    Imagine a capacitor, charged to any voltage between 0.000V and 1.000V.
    If we divide the cell voltage into "ranges of voltage", we can
    associate values with the voltage. 0.125V = 001, 0.250V = 010, 0.375V = 011
    and so on. This requires some fairly careful charging. By dividing
    the voltages like this, there isn't a lot of noise margin.

    The cell voltage is passed to an analog comparator. It defines a
    "window of voltages" for which 001 is the interpretation, another
    "window of voltages" for which 010 is the interpretation.

    In this way, we can store multiple bits per cell (three bits in the
    example given so far, or TLC). Notice, though, that the more "bits"
    we pretend to store in each cell, the smaller the voltage ranges
    become. Our greedy stuffing of bits like this shortens the
    estimated drive life.
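
    If you like to see the idea as code, here is a toy decode of one cell.
    The 0..1 V scale and the eight evenly spaced windows are just the
    made-up numbers from above, not real device parameters:

        # Toy model of reading a TLC cell: quantize an analog cell voltage
        # into one of eight windows, each window standing for a 3-bit value.
        BITS_PER_CELL = 3
        LEVELS = 2 ** BITS_PER_CELL          # 8 voltage windows for TLC
        STEP = 1.0 / LEVELS                  # width of each window (0.125 V)

        def read_cell(voltage: float) -> int:
            """Return the 3-bit value whose window contains this voltage."""
            level = int(voltage / STEP)      # which window the voltage falls in
            return min(level, LEVELS - 1)    # clamp the top edge (1.000 V)

        print(format(read_cell(0.375), "03b"))   # 0.375 V window -> '011'
        print(format(read_cell(0.130), "03b"))   # a slightly shifted cell still reads '001'

    More bits per cell means more windows squeezed into the same voltage
    range, so each window (and the noise margin) gets smaller.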

    If there is any threshold shift in the cell as it ages,
    then the voltages could be thrown off. This causes an
    equivalent "bit corruption", when the interpreted voltage is incorrect.

    Back when flash storage devices had one bit per cell (SLC),
    the noise margins were very good. You could write the cell
    100,000 times, and the voltage value was always interpreted
    correctly. Any voltage over 0.500V was a logic 1, any
    voltage less than 0.500V was a logic 0.

    But we could not be happy with our (mostly bulletproof) discovery.
    We insisted on density over integrity. Thus the TLC and QLC
    SSDs of today stuff more bits per cell, and the effective figure
    (taking write amplification into account) is around 600 writes per cell,
    which is a large drop compared to the SLC value of 100,000 writes per cell.

    A 1TB drive may have a rating of 600TBW. That amounts to writing
    the drive 600 times, end to end. If you buy a 2TB drive, the rating
    is 1200TBW, which is still 600 writes of 2TB each time. SSDs are
    like toilet paper, they are a consumable item, they wear out.
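
    The arithmetic behind those ratings is just capacity times effective
    write cycles. A sketch using the 600-cycle figure above; real vendor
    TBW numbers also bake in write amplification and safety margin:

        # Endurance rating as capacity x effective program/erase cycles.
        def rated_tbw(capacity_tb: float, effective_cycles: int = 600) -> float:
            return capacity_tb * effective_cycles

        print(rated_tbw(1))   # 600  -> a 1TB drive rated ~600 TBW
        print(rated_tbw(2))   # 1200 -> a 2TB drive rated ~1200 TBW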

    OK, so let's try to use our shiny new SSD. Everyone likes to write
    sector 0 (the MBR). Perhaps it receives more writes than the other
    sectors. Before I know it, the MBR has been written 600 times.
    Yet, my copy of shell32.dll has only been written 1 time. Our SSD
    got "worn out" by abusing only one of the sectors. That's not very
    good. Without some clever scheme, you can "burn a hole" in the SSD.
    We had to fix that.

    By mapping the sectors, using a mapping table, and "moving the MBR around
    each time it is written", that is wear leveling. The drive has a pool of unwritten blocks. On a write request, an unused block is written.
    Perhaps the block is at address 27, and it contained MBR sector 0.
    The map file the drive keeps then, it has to remember that aspect.
    On a read, we request sector 0, the map goes "oh, that is block 27",
    and the drive does the read at that address, and there is our MBR.
    Now, if I abuse the MBR by writing it a lot, a hole isn't burned in it.
    The sector has been "virtualized", and only the mapping table knows
    where my sector is stored :-)
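
    A toy sketch of that remapping. The data structures are hypothetical;
    a real flash translation layer works on pages and erase blocks and
    keeps its map in flash, but the principle is the same:

        # Toy flash translation layer: every write of a logical sector goes to
        # a fresh physical block from the free pool, and the map records where
        # the sector currently lives. Rewriting sector 0 a thousand times
        # therefore touches a thousand different physical blocks.
        free_pool = list(range(100))        # physical blocks not currently in use
        logical_to_physical = {}            # the "map" the drive maintains
        flash = {}                          # physical block -> stored data

        def write_sector(lba: int, data: bytes) -> None:
            old = logical_to_physical.get(lba)
            new = free_pool.pop(0)          # pick an unworn block from the pool
            flash[new] = data
            logical_to_physical[lba] = new  # remember where the sector went
            if old is not None:
                free_pool.append(old)       # old copy is recycled (after erase)

        def read_sector(lba: int) -> bytes:
            return flash[logical_to_physical[lba]]

        write_sector(0, b"MBR v1")
        write_sector(0, b"MBR v2")          # lands on a different physical block
        print(logical_to_physical[0], read_sector(0))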

    When you TRIM a drive, that exchanges usage information with the drive.
    You tell the drive, "at the current time, there is nothing at
    address 27, so you can put it in your spare pile". This can improve
    the write speed of the drive, as it has more bulk material when
    it does housekeeping inside, and rearranges your data (under the
    direction of the edited map table). If you ask for a sector (white space)
    that has been moved into the free pool pile, then zeros are substituted.
    *This has an impact on your UnErase capability with Recuva.* If
    you erase a file by mistake and then do a TRIM, the erased file's
    clusters are "gone". But other than that side effect, the TRIM
    is an attempt to give the SSD a "hint" as to which areas of the
    drive don't really need storage, because they are white space
    on the partition and no "used clusters" are stored there.
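
    And TRIM, in the same toy model, is the OS telling the drive that a
    logical sector holds nothing, so its block can go back on the spare
    pile, and a read of that sector comes back as zeros. Again just a
    sketch, not how any particular firmware does it:

        # Toy TRIM: the filesystem reports a logical sector as unused,
        # the drive drops its mapping and reclaims the physical block.
        SECTOR = 512
        logical_to_physical = {0: 27}       # sector 0 currently lives in block 27
        flash = {27: b"old MBR".ljust(SECTOR, b"\x00")}
        free_pool = []

        def trim(lba: int) -> None:
            block = logical_to_physical.pop(lba, None)
            if block is not None:
                flash.pop(block, None)
                free_pool.append(block)     # block joins the spare pile

        def read_sector(lba: int) -> bytes:
            block = logical_to_physical.get(lba)
            if block is None:
                return b"\x00" * SECTOR     # trimmed/unwritten: zeros substituted
            return flash[block]

        trim(0)
        print(read_sector(0)[:8], free_pool)   # zeros, and block 27 reclaimed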

    What is the end result of all this ? Well, at the end of life,
    you could have written the MBR *thousands* of times and it does
    not matter. The statistics of the free pool usage, and the re-circulation
    of the blocks, mean that one block is written 599 times,
    another block 600 times, a third block 601 times, but the blocks
    have been worn equally with pretty low spread between blocks.
    The "wear" on the cells, has been equalized by the wear leveling schemes.

    It also means, if you pop a flash chip out of the drive, and read
    it sequentially with your lab reader device, the data is "scrambled"
    and almost unreadable. Unless the technician can find the map file, the
    data is spread all over the place.

    A USB flash stick doesn't do this. A USB flash stick with TLC cells in
    it, wears out in no time. A USB flash stick with SLC cells, it just
    goes and goes, seemingly forever.

    Whereas, via a lot of whizzy tech, the SSD is an observably more reliable device,
    and via watching the wear life field in the SMART table, you can
    tell how many years remain on the drive. You can write the MBR a
    thousand times right now, and the predicted life of the drive does
    not change all that much. It's still "99% good". Whereas if you did
    that to the USB stick you bought from Walmart, now it is dead (because
    the MBR can't be used any more).

    There is atomic level damage to the structure of the cell, on writes.
    The level of damage is temperature dependent. The predicted
    charge retention time on a write is also temperature dependent.
    Scientists noticed this in the lab, that there was less damage
    at elevated temperature. They figured out, if we could "anneal"
    the drive after some period of usage, the cells would be almost
    brand new in terms of structural damage. But nobody has figured
    out a way to make individual cells "anneal" on command. And I think
    the temperature required for this, might be slightly out of range
    for the materials used in the drive. The annealing remains a
    lab curiosity.

    Generally speaking, all storage devices like to know their
    temperature, during a write. The controls at the point of writing,
    may need to be "temperature compensated". A hard drive makes some
    adjustment, if the housing is running at high temperature.
    An SSD may be doing the same sort of thing.

    If we were willing to accept a drop in drive capacity (storing fewer
    bits per cell, for instance), then we would no longer need to be
    staring at the SMART table all the time.

    [Picture] the SMART table of my SSD drive right now...

    https://i.postimg.cc/rsxhfq4x/crystal-daily-driver-4-TBSSD.gif

    Notice that my drive has been running for 14,000 hours, and it is
    at 99% good. That means, based on averages, it might last to 1,400,000 hours
    at the current rate of usage.
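
    The back-of-envelope behind that projection, assuming wear keeps
    accruing at the same rate:

        # 99% life remaining after 14,000 power-on hours means about 1% consumed.
        hours_so_far = 14_000
        fraction_of_life_used = 0.01
        print(hours_so_far / fraction_of_life_used)   # ~1,400,000 hours at this rate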

    If I were a video editor, editing raw video (200GB per vid), and saving
    those out multiple times a day, I would go through that drive in no
    time. One of the reasons the usage is so low on that drive, is I use
    my RAMDisk for a lot of stuff, and the SSD does not get the wear.
    Some of my VMs, the container gets transferred to the RAMDisk,
    I do some stuff, I throw the container away at the end of the session.
    Thus, my usage is not an indication of what your usage will be.

    One partition, it gets to store those pictures above. So that partition
    is contributing to the wear of the device. That, and Windows Defender
    scans (which write out some sort of status).

    In terms of backup policy then, I won't have to worry for a number
    of years, about the life of the drive. However, if the drive has
    a "heart failure", like if the map file got trashed or some other
    metadata table got trashed, maybe the next day the drive would not
    be detected and I could not boot. While that outcome is obscure, there
    have been cases of my drive taking a dump like that. And that is
    why we still need backups (preferably on a cheap and large hard drive).

    Back in the OCZ era (first generation SSDs), heart failures were more
    common, and this had to do with the quality of the firmware the drive
    runs inside. The drive has processor cores, multiple of them, and
    the firmware the drive runs, has to juggle the map file without losing it.
    On one occasion, when Intel was entering the SSD business, their
    firmware people took one look at samples of code written, and they
    were not at all happy about the firmware qualities. Intel then rewrote
    the firmware for their drive, and did not copy anyone else's firmware
    (via buying the firmware along with the controller chip used). There was
    a general industry silence after this event, but my presumption is,
    that information made the rounds in the industry, about what sort of
    tricks were needed to improve on loss of metadata and so on.

    What the drive is doing, is tricky. It must have atomic updates,
    some sort of journal inside. It must have all sorts of protections
    inside, to protect it on a power fail. Consumer drives generally don't
    use a supercap for emergency power (some enterprise drives do). Some of
    the drives don't even have DRAM for the map storage (HMB, Host Memory
    Buffer, drives).
    The ball juggling going on inside the drive is perilous. Yet,
    my drive has had a few power fails, without disappearing on me :-)
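
    Nobody publishes how this is really done, but the general shape is
    probably something like a write-ahead journal for the map. A purely
    speculative sketch, not any vendor's actual scheme:

        # Speculative sketch of crash-safe map updates: log the intended change
        # before applying it, so after a power failure the journal can be
        # replayed and the map never ends up half-updated.
        journal = []                 # persisted before the map itself is touched
        mapping = {}                 # logical sector -> physical block

        def update_map(lba: int, block: int) -> None:
            journal.append((lba, block))   # 1. record the intent durably
            mapping[lba] = block           # 2. then apply it to the live map

        def recover() -> dict:
            rebuilt = {}
            for lba, block in journal:     # replay survives an interrupted step 2
                rebuilt[lba] = block
            return rebuilt

        update_map(0, 27)
        print(recover())                   # {0: 27}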

    Enjoy!

    Paul

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Carlos E.R.@21:1/5 to Paul on Mon Feb 17 21:46:21 2025
    On 2025-02-14 09:11, Paul wrote:
    On Fri, 2/14/2025 12:47 AM, Boris wrote:

    ...

    The electrons will sit on the gate for up to ten years (retention time
    estimate, info on this has not been updated in a long long time so we
    are left to guess whether it scales in any way with gate size).

    I wonder if we can store the disk for five years, then plug it in and
    somehow refresh the charges in the cells.

    ...

    The sector has been "virtualized", and only the mapping table knows
    where my sector is stored :-)

    Where is the map stored? I always wondered about this.

    ...


    Thanks a lot for the summary :-)


    --
    Cheers, Carlos.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Boris@21:1/5 to Paul on Tue Feb 18 01:42:22 2025
    Paul <nospam@needed.invalid> wrote in news:vomtr3$3cnqe$1@dont-email.me:

    ...

    Thanks much for the education. I've read it over many times, and it's
    taken me to all sorts of articles, starting with those on floating gate
    transistors.

    https://tinyurl.com/zcj5j4d7

    I have a question:

    Does each cell have only one bit ("1" or "0") of changeable information?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Paul@21:1/5 to Carlos E.R. on Tue Feb 18 00:28:33 2025
    On Mon, 2/17/2025 3:46 PM, Carlos E.R. wrote:
    On 2025-02-14 09:11, Paul wrote:
    On Fri, 2/14/2025 12:47 AM, Boris wrote:

    ...

    The electrons will sit on the gate for up to ten years (retention time
    estimate, info on this has not been updated in a long long time so we
    are left to guess whether it scales in any way with gate size).

    I wonder if we can store the disk for five years, then plug it in and
    somehow refresh the charges in the cells.

    ...

    The sector has been "virtualized", and only the mapping table knows
    where my sector is stored :-)

    Where is the map stored? I always wondered about this.

    ...


    Thanks a lot for the summary :-)

    The issue of "recharging cells" has already come up, with respect
    to TLC. For at least Samsung, they may have a provision for doing that.

    SSDs don't have a real-time clock, so they cannot tell that five years
    have passed. All they have is power-on hours, which is a useful metric for
    an SSD that is alive and working every single day.

    The drive can tell when a sector is getting "spongy" due to the
    error count. With TLC, every sector might have a bit in error, and
    correcting all the sectors is nothing new for the drive. And since
    the syndrome is 50 bytes for a 512-byte sector, that's a *huge*
    syndrome, allowing a lot of bits in error to be corrected. The drive
    can allow the TLC cell to have more and more errors in it. Then, once
    a portion of the error capability is used up, the drive could re-write
    the sector.

    That's one way they could do it.
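
    A sketch of that refresh idea, with made-up numbers (nobody publishes
    the real correction capability or thresholds):

        # Hypothetical refresh policy: ECC can correct up to MAX_CORRECTABLE bit
        # errors per sector; once a read needs more than some fraction of that
        # budget, rewrite the sector to fresh cells before it degrades further.
        MAX_CORRECTABLE = 60        # assumed correction capability per sector
        REFRESH_FRACTION = 0.5      # rewrite once half the budget is being used

        def needs_refresh(corrected_bits: int) -> bool:
            return corrected_bits > MAX_CORRECTABLE * REFRESH_FRACTION

        for errs in (1, 20, 40):
            print(errs, "-> rewrite" if needs_refresh(errs) else "-> leave alone")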

    But since no company has enthusiast promoters like in the OCZ days,
    we cannot get information from company reps about how things work.

    *******

    The map can be stored in an "SLC-like" critical data storage
    area of the flash. But that's not the part I am particularly
    interested in. I'm more curious about how a map file can be
    maintained, without burning a hole in the SSD while doing so.
    It is the handling policy of the map file, whether it is journaled
    or protected in some way, that I am curious about.

    But don't expect to find an honest explainer page on the web.

    All we know, is the power can go off, and the SSD drive seems to survive.

    Paul

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Paul@21:1/5 to Boris on Tue Feb 18 03:03:56 2025
    On Mon, 2/17/2025 8:42 PM, Boris wrote:


    Thanks much for the education. I've read it over many times, and it's
    taken me to all sorts of articles, starting with those on floating gate
    transistors.

    https://tinyurl.com/zcj5j4d7

    I have a question:

    Does each cell have only one bit ("1" or "0") of changeable information?


    Each cell is analog. It has a voltage on it. If a cell
    is 5/8th full, that is equivalent to 101 binary. I can
    store three bits of information, if I encode as eight
    voltage levels within a single cell.

    The claimed limit for such a technique at the moment,
    is five bits per cell, or 32 voltage levels for the
    analog voltage store of a cell. This has not been
    delivered yet.
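
    Put another way, n bits per cell means 2^n voltage levels the cell
    has to distinguish:

        # n bits per cell => 2**n voltage levels the cell must hold apart.
        for name, bits in [("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4), ("PLC", 5)]:
            print(name, bits, "bits ->", 2 ** bits, "levels")

        # The "5/8th full" cell from the example: level 5 of 8 encodes binary 101.
        print(format(5, "03b"))   # '101'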

    The error correcting code, is powerful. And the size of
    the syndrome (50 bytes per 512 byte sector) is a heavy
    tax for the design to use. When the five-bit-storing cell
    comes along, how long will the syndrome need to be ? It
    will be a significant portion of a sector size. What prevents this
    game from "going to infinity" is the need for more and more powerful
    error correction.
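
    The size of that tax is easy to put a number on, using the
    50-bytes-per-512-byte-sector figure:

        # Spare-area overhead for the error-correction syndrome.
        syndrome_bytes = 50
        sector_bytes = 512
        print(f"{syndrome_bytes / sector_bytes:.1%}")   # ~9.8% of each sector's size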

    The error correction is not a hard macro (logic gates).
    This is why the SSD has ARM CPU cores and error correction
    is done in firmware. Most of the cores work on error correction.
    An SSD can have a three or four core CPU in it.

    The noise margins on cell storage are going to be so poor
    at some point, when you write a sector, reading it back
    will already have an error in it. You won't need to wait
    for the cells to become "spongy", they will already be making
    errors.

    SLC, MLC ... good storage and lots of writes; not a lot of these
                 chips are being manufactured
    TLC ... acceptable, much of mainstream SSD uses this
    QLC ... hanging on
    PLC ... practicality, to be determined (not shipping now)

    Paul

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Carlos E.R.@21:1/5 to Paul on Tue Feb 18 14:26:08 2025
    On 2025-02-18 06:28, Paul wrote:
    On Mon, 2/17/2025 3:46 PM, Carlos E.R. wrote:
    On 2025-02-14 09:11, Paul wrote:
    On Fri, 2/14/2025 12:47 AM, Boris wrote:

    ...

    The map can be stored in an "SLC-like" critical data storage
    area of the flash. But that's not the part I am particularly
    interested in. I'm more curious about how a map file can be
    maintained, without burning a hole in the SSD while doing so.

    Exactly!

    It is the handling policy of the map file, whether it is journaled
    or protected in some way, that I am curious about.

    But don't expect to find an honest explainer page on the web.

    All we know, is the power can go off, and the SSD drive seems to survive.

    Ok :-)



    --
    Cheers, Carlos.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)