• Re: HDD long-term data storage with ensured integrity

    From Jonathan Dowland@21:1/5 to David Christensen on Sat Apr 6 09:50:28 2024
    On Tue Apr 2, 2024 at 10:57 PM BST, David Christensen wrote:
    AIUI neither LVM nor ext4 has data and metadata checksum and correction features. But it should be possible to achieve such by including dm-integrity (for checksumming) and some form of RAID (for correction)
    in the storage stack. I need to explore that possibility further.

    It would be nice to have checksumming and parity stuff in the filesystem
    layer, as BTRFS and XFS offer, but failing that, you can do it above
    that layer using tried-and-tested tools such as sha1sum, par2, etc.
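
    Roughly, and only as a sketch (the paths and file names here are made up):

      cd /srv/archive
      find . -type f ! -name SHA1SUMS -print0 | xargs -0 sha1sum > SHA1SUMS
      par2 create -r10 recovery.par2 SHA1SUMS big-file-1.tar big-file-2.tar   # ~10% repair data
      # later, to check and (if needed) repair:
      sha1sum -c SHA1SUMS
      par2 verify recovery.par2
      par2 repair recovery.par2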

    I personally would not rely upon RAID for anything except availability.
    My advice: once you've detected corruption, which is exceedingly rare, restore from backup.

    --
    Please do not CC me for listmail.

    👱🏻 Jonathan Dowland
    jmtd@debian.org
    🔗 https://jmtd.net

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Christensen@21:1/5 to David Christensen on Sat Apr 6 09:50:42 2024
    On 4/2/24 14:57, David Christensen wrote:
    AIUI neither LVM nor ext4 has data and metadata checksum and correction features.  But it should be possible to achieve such by including dm-integrity (for checksumming) and some form of RAID (for correction)
    in the storage stack.  I need to explore that possibility further.


    I have RTFM'd dm-integrity before, and it is still experimental. I need
    something that is production-ready:

    https://manpages.debian.org/bookworm/cryptsetup-bin/cryptsetup.8.en.html

    Authenticated disk encryption (EXPERIMENTAL)
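
    (That is, the LUKS2 authenticated-encryption mode; roughly an invocation like

      cryptsetup luksFormat --type luks2 --integrity hmac-sha256 /dev/sdX

    with /dev/sdX standing in for the target device.)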


    David

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Christensen@21:1/5 to Stefan Monnier on Sat Apr 6 09:51:22 2024
    On 4/2/24 06:55, Stefan Monnier wrote:
    The most obvious alternative to ZFS on Debian would be Btrfs. Does anyone have any comments or suggestions regarding Btrfs and data corruption bugs, concurrency, CMM level, PSP, etc.?

    If you're worried about such things, I'd think "the most obvious
    alternative" is LVM+ext4. Both Btrfs and ZFS share the same underlying problem: more features => more code => more bugs.


    Stefan


    AIUI neither LVM nor ext4 has data and metadata checksum and correction
    features. But it should be possible to achieve such by including
    dm-integrity (for checksumming) and some form of RAID (for correction)
    in the storage stack. I need to explore that possibility further.
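
    For example, something along these lines (untested; device names made up):

      integritysetup format /dev/sdX
      integritysetup open /dev/sdX sdX-int
      integritysetup format /dev/sdY
      integritysetup open /dev/sdY sdY-int
      mdadm --create /dev/md0 --level=1 --raid-devices=2 \
          /dev/mapper/sdX-int /dev/mapper/sdY-int
      mkfs.ext4 /dev/md0

    where dm-integrity turns silent corruption into read errors and the RAID
    layer then rewrites the bad sectors from the good mirror.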


    David

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stefan Monnier@21:1/5 to All on Sat Apr 6 09:51:31 2024
    The most obvious alternative to ZFS on Debian would be Btrfs. Does anyone have any comments or suggestions regarding Btrfs and data corruption bugs, concurrency, CMM level, PSP, etc.?

    If you're worried about such things, I'd think "the most obvious
    alternative" is LVM+ext4. Both Btrfs and ZFS share the same underlying problem: more features => more code => more bugs.


    Stefan

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Christensen@21:1/5 to DdB on Sat Apr 6 09:52:33 2024
    On 3/31/24 02:18, DdB wrote:
    i intend to create a huge backup server from some oldish hardware.
    Hardware has been partly refurbished and offers 1 SSD + 8 HDD on a
    6core Intel with 64 GB RAM. ... the [Debian] installer ... aborts.

    On 4/1/24 11:35, DdB wrote:
    A friend of mine just let me use an external CD-Drive with the netboot image. ... all is well.


    Now you get to solve the same problem I have been stuck on since last
    November -- how to use those HDDs.


    ZFS has been my bulk storage solution of choice for the past ~4 years,
    but the recent data corruption bugs [1, 2] have me worried. From a
    technical perspective, it's about incorrect concurrent execution of GNU
    cp(1), Linux, and/or OpenZFS. From a management perspective, it's about Capability Maturity Model (CMM) [3] and Programming Systems Product
    (PSP) [4].


    The most obvious alternative to ZFS on Debian would be Btrfs. Does
    anyone have any comments or suggestions regarding Btrfs and data
    corruption bugs, concurrency, CMM level, PSP, etc.?


    Does anyone have any comments or suggestions regarding how to use
    magnetic hard disk drives, commodity x86 computers, and Debian for
    long-term data storage with ensured integrity?


    David


    [1] https://github.com/openzfs/zfs/issues/15526

    [2] https://github.com/openzfs/zfs/issues/15933

    [3] https://en.wikipedia.org/wiki/Capability_maturity_model

    [4] https://en.wikipedia.org/wiki/The_Mythical_Man-Month

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Marc SCHAEFER@21:1/5 to David Christensen on Mon Apr 8 11:40:01 2024
    For offline storage:

    On Tue, Apr 02, 2024 at 05:53:15AM -0700, David Christensen wrote:
    Does anyone have any comments or suggestions regarding how to use magnetic hard disk drives, commodity x86 computers, and Debian for long-term data storage with ensured integrity?

    I use ext4 on LVM, and I add an MD5SUMS file at the root.

    I then power up the drives at least once a year and check the MD5SUMS.

    A simple CRC could also work, obviously.
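
    Something like this, as a sketch (the mount point is just an example):

      cd /mnt/archive
      find . -type f ! -name MD5SUMS -print0 | xargs -0 md5sum > MD5SUMS   # when writing the drive
      md5sum -c --quiet MD5SUMS                                            # at the yearly check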

    So far, this method has not detected MORE corruption than the drive ECC
    itself catches (current drives & buses are much better than they used to
    be). When errors are detected, I replace the file with another copy (I
    usually have multiple off-site copies, and sometimes even on-site online
    copies, but not always). When the errors add up, it is time to buy
    another drive, usually after 5+ years or sometimes even 10+ years.

    So, just re-reading the content might be enough, once a year or so.

    This is for HDDs (for SSDs I have no offline storage experience; the safe
    interval could be shorter).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Christensen@21:1/5 to Marc SCHAEFER on Mon Apr 8 20:30:01 2024
    On 4/8/24 02:38, Marc SCHAEFER wrote:
    For offline storage:

    On Tue, Apr 02, 2024 at 05:53:15AM -0700, David Christensen wrote:
    Does anyone have any comments or suggestions regarding how to use magnetic
    hard disk drives, commodity x86 computers, and Debian for long-term data
    storage with ensured integrity?

    I use LVM on ext4, and I add a MD5SUMS file at the root.

    I then power up the drives at least once a year and check the MD5SUMS.

    A simple CRC could also work, obviously.

    So far, I have not detected MORE corruption with this method than the
    drive ECC itself (current drives & buses are much better than they
    used to be). When I have errors detected, I replace the file with
    another copy (I usually have multiple off-site copies, and sometimes
    even on-site online copies, but not always). When the errors add
    up, it is time to buy another drive, usually after 5+ years or
    even sometimes 10+ years.

    So, just re-reading the content might be enough, once a year or so.

    This is for HDD (for SDD I have no offline storage experience, it
    could be shorter).


    Thank you for the reply.


    So, an ext4 file system on an LVM logical volume?


    Why LVM? Are you implementing redundancy (RAID)? Is your data larger
    than a single disk (concatenation/ JBOD)? Something else?


    David

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Marc SCHAEFER@21:1/5 to David Christensen on Mon Apr 8 22:10:01 2024
    Hello,

    On Mon, Apr 08, 2024 at 11:28:04AM -0700, David Christensen wrote:
    So, an ext4 file system on an LVM logical volume?

    Why LVM? Are you implementing redundancy (RAID)? Is your data larger than
    a single disk (concatenation/ JBOD)? Something else?

    For off-site long-term offline archiving, no, I am not using RAID.

    No, it's not LVM+md, just plain LVM for flexibility.

    Typically I use 16 TB hard drives, and I tend to use one LV per data
    source, the LV name being the data source and the date of the copy.
    Or sometimes I just copy a raw volume (ext4 or something else)
    to an LV.

    With smaller drives (4 TB) I tend to not use LVM, just plain ext4 on the
    raw disk.

    I almost never use partitioning.

    However, I tend to use LUKS encryption (per ext4 filesystem) when the
    drives are stored off-site. So it's either LVM -> LV -> LUKS -> ext4
    or raw disk -> LUKS -> ext4.
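
    In rough outline (these are not my actual scripts; names and sizes are
    just examples):

      pvcreate /dev/sdX
      vgcreate archive /dev/sdX
      lvcreate -L 2T -n mydata-20240408 archive
      cryptsetup luksFormat /dev/archive/mydata-20240408
      cryptsetup open /dev/archive/mydata-20240408 mydata-20240408
      mkfs.ext4 /dev/mapper/mydata-20240408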

    You can find some of the scripts I use to automate this off-site
    long-term archiving here:

    https://git.alphanet.ch/gitweb/?p=various;a=tree;f=offsite-archival/LVM-LUKS

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Christensen@21:1/5 to Marc SCHAEFER on Tue Apr 9 00:50:01 2024
    On 4/8/24 13:04, Marc SCHAEFER wrote:
    Hello,

    On Mon, Apr 08, 2024 at 11:28:04AM -0700, David Christensen wrote:
    So, an ext4 file system on an LVM logical volume?

    Why LVM? Are you implementing redundancy (RAID)? Is your data larger than a single disk (concatenation/JBOD)? Something else?

    For off-site long-term offline archiving, no, I am not using RAID.

    No, it's not LVM+md, just plain LVM for flexibility.

    Typically I use 16 TB hard drives, and I tend to use one LV per data
    source, the LV name being the data source and the date of the copy.
    Or sometimes I just copy a raw volume (ext4 or something else)
    to a LV.

    With smaller drives (4 TB) I tend to not use LVM, just plain ext4 on the
    raw disk.

    I almost never use partitionning.

    However, I tend to use luks encryption (per ext4 filesystem) when the
    drives are stored off-site. So it's either LVM -> LV -> LUKS -> ext4
    or raw disk -> LUKS -> ext4.

    You can find some of the scripts I use to automate this off-site
    long-term archiving here:

    https://git.alphanet.ch/gitweb/?p=various;a=tree;f=offsite-archival/LVM-LUKS


    Thank you for the clarification. :-)


    David

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From piorunz@21:1/5 to David Christensen on Wed Apr 10 02:10:01 2024
    On 02/04/2024 13:53, David Christensen wrote:

    Does anyone have any comments or suggestions regarding how to use
    magnetic hard disk drives, commodity x86 computers, and Debian for
    long-term data storage with ensured integrity?

    I use Btrfs on all my systems, including some servers, with software
    RAID1 and RAID10 modes (because these modes are considered stable and
    production ready). I decided on Btrfs rather than ZFS because Btrfs lets
    me migrate drives on the fly while the partition is live and heavily
    used, replace them with drives of different sizes and types, mix
    capacities, change RAID levels, and change the number of drives too. I
    could go from a single drive to RAID10 on 4 drives and back while my data
    stays 100% available at all times.
    It has saved my bacon many times, including hard checksum corruption on
    an NVMe drive that I would otherwise never have known about. Thanks to
    Btrfs I located the corrupted files, fixed them, and got the hardware
    replaced under warranty.
    It also helped with corrupted RAM: Btrfs simply refused to save a file
    because the saved copy couldn't match the checksum read from the source,
    due to RAM bit flips. I diagnosed it, replaced the memory, and all was
    good.
    I also like that when one of the drives gets an ATA reset for whatever
    reason, all the other drives continue to read and write, and I can keep
    using the system for hours, if I even notice. That is not possible in
    normal circumstances without RAID. Once the problematic drive is back, or
    after a reboot if it's more serious, I run the "scrub" command and
    everything is resynced again. Even if I don't, Btrfs corrects checksum
    errors dynamically on the fly anyway.
    And the list goes on - I've been using Btrfs for the last 5 years, not a
    single problem to date; it has survived hard resets, power losses, drive
    failures, and countless migrations.
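
    Roughly the kind of commands I mean (mount point and device names are
    just examples):

      btrfs device add /dev/sdc /srv/pool                                # grow the pool online
      btrfs balance start -dconvert=raid10 -mconvert=raid10 /srv/pool    # change RAID level in place
      btrfs replace start /dev/sdb /dev/sdd /srv/pool                    # swap a drive while mounted
      btrfs scrub start /srv/pool                                        # verify and repair checksums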

    [1] https://github.com/openzfs/zfs/issues/15526

    [2] https://github.com/openzfs/zfs/issues/15933

    The problems reported here are from Linux kernels 6.5 and 6.7 on a Gentoo
    system. Does this even affect Debian Stable with its 6.1 LTS kernel?

    --
    With kindest regards, Piotr.

    ⢀⣴⠾⠻⢶⣦⠀
    ⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
    ⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org/
    ⠈⠳⣄⠀⠀⠀⠀

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Christensen@21:1/5 to piorunz on Wed Apr 10 13:20:01 2024
    On 4/9/24 17:08, piorunz wrote:
    On 02/04/2024 13:53, David Christensen wrote:

    Does anyone have any comments or suggestions regarding how to use
    magnetic hard disk drives, commodity x86 computers, and Debian for
    long-term data storage with ensured integrity?

    I use Btrfs, on all my systems, including some servers, with soft Raid1
    and Raid10 modes (because these modes are considered stable and
    production ready). I decided on Btrfs not ZFS, because Btrfs allows to migrate drives on the fly while partition is live and heavily used,
    replace them with different sizes and types, mixed capacities, change
    Raid levels, change amount of drives too. I could go from single drive
    to Raid10 on 4 drives and back while my data is 100% available at all
    times.
    It saved my bacon many times, including hard checksum corruption on NVMe drive which otherwise I would never know about. Thanks to Btrfs I
    located the corrupted files, fixed them, got hardware replaced under warranty.
    Also helped with corrupted RAM: Btrfs just refused to save file because
    saved copy couldn't match read checksum from the source due to RAM bit
    flips. Diagnosed, then replaced memory, all good.
    I like a lot when one of the drives get ATA reset for whatever reason,
    and all other drives continue to read and write, I can continue using
    the system for hours, if I even notice. Not possible in normal
    circumstances without Raid. Once the problematic drive is back, or after reboot if it's more serious, then I do "scrub" command and everything is resynced again. If I don't do that, then Btrfs dynamically correct
    checksum errors on the fly anyway.
    And list goes on - I've been using Btrfs for last 5 years, not a single problem to date, it survived hard resets, power losses, drive failures, countless migrations.


    Those sound like some compelling features.


    I believe the last time I tried Btrfs was Debian 9 (?). I ran into
    problems because I did not do the required manual maintenance
    (rebalancing). Does the Btrfs in Debian 11 or Debian 12 still require
    manual maintenance? If so, what and how often?


    [1] https://github.com/openzfs/zfs/issues/15526

    [2] https://github.com/openzfs/zfs/issues/15933

    Problems reported here are from Linux kernel 6.5 and 6.7 on Gentoo
    system. Does this even affects Debian Stable with 6.1 LTS?


    I do not know.


    --
    With kindest regards, Piotr.

    ⢀⣴⠾⠻⢶⣦⠀
    ⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
    ⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org/
    ⠈⠳⣄⠀⠀⠀⠀


    David

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Curt@21:1/5 to David Christensen on Wed Apr 10 17:00:01 2024
    On 2024-04-10, David Christensen <dpchrist@holgerdanske.com> wrote:

    I use Btrfs, on all my systems, including some servers, with soft Raid1
    and Raid10 modes (because these modes are considered stable and
    production ready). I decided on Btrfs not ZFS, because Btrfs allows to
    migrate drives on the fly while partition is live and heavily used,
    replace them with different sizes and types, mixed capacities, change
    Raid levels, change amount of drives too. I could go from single drive
    to Raid10 on 4 drives and back while my data is 100% available at all
    times.
    It saved my bacon many times, including hard checksum corruption on NVMe
    drive which otherwise I would never know about. Thanks to Btrfs I
    located the corrupted files, fixed them, got hardware replaced under
    warranty.
    Also helped with corrupted RAM: Btrfs just refused to save file because
    saved copy couldn't match read checksum from the source due to RAM bit
    flips. Diagnosed, then replaced memory, all good.
    I like a lot when one of the drives get ATA reset for whatever reason,
    and all other drives continue to read and write, I can continue using
    the system for hours, if I even notice. Not possible in normal
    circumstances without Raid. Once the problematic drive is back, or after
    reboot if it's more serious, then I do "scrub" command and everything is
    resynced again. If I don't do that, then Btrfs dynamically correct
    checksum errors on the fly anyway.
    And list goes on - I've been using Btrfs for last 5 years, not a single
    problem to date, it survived hard resets, power losses, drive failures,
    countless migrations.


    Those sound like some compelling features.

    I don't believe in immortality. After many a summer dies the swan.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Paul Leiber@21:1/5 to All on Wed Apr 10 18:10:01 2024
    On 10.04.2024 at 13:10, David Christensen wrote:
    On 4/9/24 17:08, piorunz wrote:
    On 02/04/2024 13:53, David Christensen wrote:

    Does anyone have any comments or suggestions regarding how to use
    magnetic hard disk drives, commodity x86 computers, and Debian for
    long-term data storage with ensured integrity?

    I use Btrfs, on all my systems, including some servers, with soft Raid1
    and Raid10 modes (because these modes are considered stable and
    production ready). I decided on Btrfs not ZFS, because Btrfs allows to
    migrate drives on the fly while partition is live and heavily used,
    replace them with different sizes and types, mixed capacities, change
    Raid levels, change amount of drives too. I could go from single drive
    to Raid10 on 4 drives and back while my data is 100% available at all
    times.
    It saved my bacon many times, including hard checksum corruption on NVMe
    drive which otherwise I would never know about. Thanks to Btrfs I
    located the corrupted files, fixed them, got hardware replaced under
    warranty.
    Also helped with corrupted RAM: Btrfs just refused to save file because
    saved copy couldn't match read checksum from the source due to RAM bit
    flips. Diagnosed, then replaced memory, all good.
    I like a lot when one of the drives get ATA reset for whatever reason,
    and all other drives continue to read and write, I can continue using
    the system for hours, if I even notice. Not possible in normal
    circumstances without Raid. Once the problematic drive is back, or after
    reboot if it's more serious, then I do "scrub" command and everything is
    resynced again. If I don't do that, then Btrfs dynamically correct
    checksum errors on the fly anyway.
    And list goes on - I've been using Btrfs for last 5 years, not a single
    problem to date, it survived hard resets, power losses, drive failures,
    countless migrations.


    Those sound like some compelling features.


    I believe the last time I tried Btrfs was Debian 9 (?).  I ran into
    problems because I did not do the required manual maintenance (rebalancing).  Does the Btrfs in Debian 11 or Debian 12 still require manual maintenance?  If so, what and how often?

    Scrub and balance are the actions that have been recommended. I am using
    the btrfsmaintenance scripts [1][2] to automate this. I do a weekly
    balance and a monthly scrub. After some reading today, I am getting
    unsure whether this approach is correct, and especially whether balance
    is even necessary anymore (it usually doesn't find anything to do
    anyway), so please treat these periods with caution. My main message is
    that such operations can be automated using the linked scripts.
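
    If you prefer to avoid the extra package, the same schedule can be
    sketched with plain cron entries (mount point illustrative; the periods
    are only what I currently use):

      # /etc/cron.d/btrfs-maintenance
      0 3 * * 1   root   /usr/bin/btrfs balance start -dusage=50 -musage=50 /srv/pool
      0 4 1 * *   root   /usr/bin/btrfs scrub start -B /srv/pool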

    Best regards,

    Paul

    [1] https://packages.debian.org/bookworm/btrfsmaintenance
    [2] https://github.com/kdave/btrfsmaintenance

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Christensen@21:1/5 to Paul Leiber on Thu Apr 11 01:20:01 2024
    On 4/10/24 08:49, Paul Leiber wrote:
    On 10.04.2024 at 13:10, David Christensen wrote:
    Does the Btrfs in Debian 11 or Debian 12 still require
    manual maintenance?  If so, what and how often?

    Scrub and balance are actions which have been recommended. I am using btrfsmaintenance scripts [1][2] to automate this. I am doing a weekly
    balance and a monthly scrub. After some reading today, I am getting
    unsure if this is approach is correct, especially if balance is
    necessary anymore (it usually doesn't find anything to do anyway), so
    please take these periods with caution. My main message is that such operations can be automated using the linked scripts.

    Best regards,

    Paul

    [1] https://packages.debian.org/bookworm/btrfsmaintenance
    [2] https://github.com/kdave/btrfsmaintenance


    Thank you. Those scripts should be useful.


    David

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From piorunz@21:1/5 to David Christensen on Fri Apr 12 17:20:01 2024
    On 10/04/2024 12:10, David Christensen wrote:
    Those sound like some compelling features.


    I believe the last time I tried Btrfs was Debian 9 (?).  I ran into
    problems because I did not do the required manual maintenance (rebalancing).  Does the Btrfs in Debian 11 or Debian 12 still require manual maintenance?  If so, what and how often?

    I don't do balance at all; it's not required.

    Scrub is recommended because it will detect any bit-rot due to hardware
    errors on the HDD media. It scans all allocated sectors on all drives. I
    usually do a scrub monthly.
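
    For example (the mount point is just an example):

      btrfs scrub start /srv/pool        # kick off the monthly scrub
      btrfs scrub status /srv/pool       # progress and error summary
      btrfs device stats /srv/pool       # per-device error counters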

    --
    With kindest regards, Piotr.

    ⢀⣴⠾⠻⢶⣦⠀
    ⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
    ⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org/
    ⠈⠳⣄⠀⠀⠀⠀

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Christensen@21:1/5 to piorunz on Sat Apr 13 05:10:01 2024
    On 4/12/24 08:14, piorunz wrote:
    On 10/04/2024 12:10, David Christensen wrote:
    Those sound like some compelling features.


    I believe the last time I tried Btrfs was Debian 9 (?).  I ran into
    problems because I did not do the required manual maintenance
    (rebalancing).  Does the Btrfs in Debian 11 or Debian 12 still require
    manual maintenance?  If so, what and how often?

    I don't do balance at all, it's not required.

    Scrub is recommended, because it will detect any bit-rot due to hardware errors on HDD media. It scans the entire surface of allocated sectors on
    all drives. I do scrub usually monthly.


    Thank you for the information.


    David

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Marc SCHAEFER@21:1/5 to Marc SCHAEFER on Fri May 3 13:30:01 2024
    On Mon, Apr 08, 2024 at 10:04:01PM +0200, Marc SCHAEFER wrote:
    For off-site long-term offline archiving, no, I am not using RAID.

    Now, as I had to think a bit about ONLINE integrity, I found this
    comparison:

    https://github.com/t13a/dm-integrity-benchmarks

    Contenders are btrfs, zfs, and notably ext4+dm-integrity+dm-raid

    I tend to have a bias favoring UNIX layered solutions over
    "all-in-one" solutions, and it seems that, performance-wise,
    the layered approach is also quite good.

    I wrote the script below to convince myself that the
    ext4+dm-integrity+dm-raid layered approach really does auto-correct errors.

    It gives:

    [ ... ]
    [ 390.249699] md/raid1:mdX: read error corrected (8 sectors at 21064 on dm-11)
    [ 390.249701] md/raid1:mdX: redirecting sector 20488 to other mirror: dm-7
    [ 390.293807] md/raid1:mdX: dm-11: rescheduling sector 262168
    [ 390.293988] md/raid1:mdX: read error corrected (8 sectors at 262320 on dm-11)
    [ 390.294040] md/raid1:mdX: read error corrected (8 sectors at 262368 on dm-11)
    [ 390.294125] md/raid1:mdX: read error corrected (8 sectors at 262456 on dm-11)
    [ 390.294209] md/raid1:mdX: read error corrected (8 sectors at 262544 on dm-11)
    [ 390.294287] md/raid1:mdX: read error corrected (8 sectors at 262624 on dm-11)
    [ 390.294586] md/raid1:mdX: read error corrected (8 sectors at 263000 on dm-11)
    [ 390.294712] md/raid1:mdX: redirecting sector 262168 to other mirror: dm-7

    Pretty much convincing.

    So after testing btrfs and not being convinced, and after doing some tests
    on a production zfs -- not convinced either -- I am going to try
    ext4+dm-integrity+dm-raid.

    #! /bin/bash

    set -e

    function create_lo {
        local f

        f=$(losetup -f)

        losetup $f $1
        echo $f
    }

    # beware of the rm -r below!
    tmp_dir=/tmp/$(basename $0)
    mnt=/mnt

    mkdir $tmp_dir

    declare -a pvs
    for p in pv1 pv2
    do
        truncate -s 250M $tmp_dir/$p

        l=$(create_lo $tmp_dir/$p)

        pvcreate $l

        pvs+=($l)
    done

    vg=$(basename $0)-test
    lv=test

    vgcreate $vg ${pvs[*]}

    vgdisplay $vg

    lvcreate --type raid1 --raidintegrity y -m 1 -L 200M -n $lv $vg

    lvdisplay $vg

    # sync/integrity complete?
    sleep 10
    cat /proc/mdstat
    echo
    lvs -a -o name,copy_percent,devices $vg
    echo
    echo -n Type ENTER
    read ignore

    mkfs.ext4 -I 256 /dev/$vg/$lv
    mount /dev/$vg/$lv $mnt

    for f in $(seq 1 10)
    do
        # ignore errors
        head -c 20M < /dev/random > $mnt/f_$f || true
    done

    (cd $mnt && find . -type f -print0 | xargs -0 md5sum > $tmp_dir/MD5SUMS)

    # corrupting some data in one PV
    count=5000
    blocks=$(blockdev --getsz ${pvs[1]})
    if [ $blocks -lt 32767 ]; then
        factor=1
    else
        factor=$(( ($blocks - 1) / 32767))
    fi

    p=1
    for i in $(seq 1 $count)
    do
        offset=$(($RANDOM * $factor))
        echo ${pvs[$p]} $offset
        dd if=/dev/random of=${pvs[$p]} bs=$(blockdev --getpbsz ${pvs[$p]}) seek=$offset count=1
        # only doing on 1, not 0, since we have no way to avoid destroying the same sector!
        #p=$((1 - p))
    done

    dd if=/dev/$vg/$lv of=/dev/null bs=32M
    dmesg | tail

    umount $mnt

    lvremove -y $vg/$lv

    vgremove -y $vg

    for p in ${pvs[*]}
    do
        pvremove $p
        losetup -d $p
    done

    rm -r $tmp_dir

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael Kjörling@21:1/5 to All on Fri May 3 14:50:01 2024
    On 3 May 2024 13:26 +0200, from schaefer@alphanet.ch (Marc SCHAEFER):
    https://github.com/t13a/dm-integrity-benchmarks

    Contenders are btrfs, zfs, and notably ext4+dm-integrity+dm-raid

    ZFS' selling point is not performance, _especially_ not on rotational
    drives. It's fairly widely accepted that ZFS is inferior in performance
    to pretty much everything else modern, even at the best of times; and
    some of its features help mitigate its lower raw disk performance.

    ZFS' value proposition lies elsewhere.

    Which is fine. It's the right choice for some people; for others,
    other alternatives provide better trade-offs.

    --
    Michael Kjörling 🔗 https://michael.kjorling.se
    “Remember when, on the Internet, nobody cared that you were a dog?”

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Christensen@21:1/5 to Marc SCHAEFER on Fri May 3 23:00:01 2024
    On 5/3/24 04:26, Marc SCHAEFER wrote:
    On Mon, Apr 08, 2024 at 10:04:01PM +0200, Marc SCHAEFER wrote:
    For off-site long-term offline archiving, no, I am not using RAID.

    Now, as I had to think a bit about ONLINE integrity, I found this
    comparison:

    https://github.com/t13a/dm-integrity-benchmarks

    Contenders are btrfs, zfs, and notably ext4+dm-integrity+dm-raid

    I tend to have a biais favoring UNIX layered solutions against
    "all-into-one" solutions, and it seems that performance-wise,
    it's also quite good.

    I wrote this script to convince myself of auto-correction
    of the ext4+dm-integrity+dm-raid layered approach.


    Thank you for devising a benchmark and posting some data. :-)


    FreeBSD also offers a layered solution. From the top down:

    * UFS2 file system, which supports snapshots (requires partitions with
    soft updates enabled).

    * gpart(8) for partitions (volumes).

    * graid(8) for redundancy and self-healing.

    * geli(8) providers with continuous integrity checking.


    AFAICT the FreeBSD stack is mature and production quality, which I find
    very appealing. But the feature set is not as sophisticated as ZFS,
    which leaves me wanting. Notably, I have not found a way to replicate
    UFS snapshots directly -- the best I can dream up is synchronizing a
    snapshot to a backup UFS2 filesystem and then taking a snapshot with the
    same name.


    I am coming to the conclusion that the long-term survivability of data
    requires several components -- a good live file system, good backups,
    good archives, continuous internal integrity checking with self-healing,
    periodic external integrity checking (e.g. mtree(1)) with some form of
    recovery (e.g. manual), etc. If I get the other pieces right, I could
    go with OpenZFS for the live and backup systems and worry less about
    data corruption bugs.
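
    For the periodic external check, something like mtree(1) might do (paths
    are illustrative):

      mtree -c -K sha256digest -p /backup/pool > /root/pool.mtree    # record the tree
      mtree -p /backup/pool < /root/pool.mtree                       # verify it later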


    David

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Marc SCHAEFER@21:1/5 to David Christensen on Sat May 4 09:50:01 2024
    On Fri, May 03, 2024 at 01:50:52PM -0700, David Christensen wrote:
    Thank you for devising a benchmark and posting some data. :-)

    I did not do the comparison hosted on GitHub. I just wrote the
    script that tests error detection and error correction with
    dm-integrity on dm-raid.

    FreeBSD also offers a layered solution. From the top down:

    I prefer this approach, indeed.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)