Last year I bought a couple of Icy Box RAID enclosures. These have given
me intermittent problems for a while, mainly bit-flip errors, invalid
hashes, and underallocation. Disk Utility sorted all of these, and I kept trouble
at bay by replacing the USB cables, connecting direct to the Mac (rather
than through a hub) etc. Problems seemed to be caused when using Carbon
Copy Cloner, but I'd guess this is because CCC a) checks the validity of
its copying (so I get to find out about any problems immediately), and b)
CCC copies terabytes of data when I use it.
Anyhow, last week there were big problems. The culmination came yesterday when I used Carbon Copy Cloner to back up my Mac's internal drive to a
folder on Icy Box. Immediately afterwards there were overlapped extent allocation errors, and the whole disk is pretty much unusable.
I've applied all my knowledge to track down the cause, but I'm a bit stuck. What's likely to be causing this, and what can I rule out?
• Hardware/firmware error in the Icy Box?
• Physical problems with the cable?
• Problems because the long (3m) USB cable snakes its way past lots of other cables, including mains cables?
• Hardware/software error at the Mac end?
Any suggestions on the cause, or advice for future testing gratefully received.
Martin S Taylor <correspondence@mraermtoivnestthaiyslor.com> wrote:
Last year I bought a couple of Icy Box RAID enclosures. These have given
me intermittent problems for a while, mainly bit-flip errors, invalid hashes, and underallocation. All sorted by Disk Utility, I kept trouble
at bay by replacing the USB cables, connecting direct to the Mac (rather than through a hub) etc. Problems seemed to be caused when using Carbon Copy Cloner, but I'd guess this is because CCC a) checks the validity of its copying (so I get to find out about any problems immediately), and b) CCC copies terabytes of data when I use it.
What kind of RAID is the Icy Box doing? 0/1/5/6/10? Hardware RAID in the
box, or software RAID (as multiple drives joined together on the Mac, like a Fusion drive)?
Where are you seeing these errors? In CCC, or from the box?
Anyhow, last week there were big problems. The culmination came yesterday when I used Carbon Copy Cloner to back up my Mac's internal drive to a folder on Icy Box. Immediately afterwards there were overlapped extent allocation errors, and the whole disk is pretty much unusable.
I've applied all my knowledge to track down the cause, but I'm a bit stuck. What's likely to be causing this, and what can I rule out?
• Hardware/firmware error in the Icy Box?
If it's hardware RAID I would suspect the Icy Box...
• Physical problems with the cable?
If there was a cable error, USB should detect it (data packets carry a CRC) and resend the affected packet. If it's
serious I'd expect discs dropping out.
Is the USB powering the Icy Box, or does it have external power?
• Problems because the long (3m) USB cable snakes its way past lots of other cables, including mains cables?
Unlikely; USB should handle that.
• Hardware/software error at the Mac end?
Unfortunately HFS+ and APFS don't checksum file data (APFS checksums only its own metadata), so it is possible for corrupt data to end up on the disc, undetected, if there's a problem with the hardware.
(ZFS checksums all on-disc data, so you can confirm everything is
consistent. This is a good way to confirm hardware is behaving itself. Unfortunately Apple skipped that for APFS, saying its flash was sufficiently reliable)
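Since the filesystem won't verify file contents for you, one workaround is to keep your own checksums and re-check them later. The following is only a minimal Python sketch of that idea, not how ZFS or any backup tool does it; the function names are invented for illustration. Note that a changed digest can also mean the file was legitimately edited, so this is most useful on archive folders that shouldn't change.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large video files need not fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            manifest[os.path.relpath(full, root)] = sha256_of_file(full)
    return manifest


def changed_files(old_manifest, root):
    """Return recorded files that are now missing or hash differently."""
    current = build_manifest(root)
    return sorted(
        path for path, digest in old_manifest.items()
        if current.get(path) != digest
    )
```

Build the manifest once while the data is known to be good (for example right after copying from the internal drive), save it somewhere other than the suspect disc, and run `changed_files` against it after each big copy.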
I have a full backup! But it's still a pain verifying the integrity of what I've got. Any suggestions on the cause, or advice for future testing gratefully received.
If it's RAID1 it would be possible to pull one of the drives out of the
array and see if it's better with a single drive, but I wouldn't advise that unless you have a full backup. Although with a dubious RAID I'd be wanting
a full backup in any case.
In article <Icq*gaHdz@news.chiark.greenend.org.uk>, Theo <theom+news@chiark.greenend.org.uk> wrote:
(ZFS checksums all on-disc data, so you can confirm everything is consistent. This is a good way to confirm hardware is behaving itself. Unfortunately Apple skipped that for APFS, saying its flash was sufficiently
reliable)
btrfs also does that, used on synology nases (and a couple of others).
synology nases can also snapshot files, much like time machine on a mac.
(ZFS checksums all on-disc data, so you can confirm everything is
consistent. This is a good way to confirm hardware is behaving itself. Unfortunately Apple skipped that for APFS, saying its flash was sufficiently reliable)
It's not the speed that's important. Backblaze charge peanuts to back up a DAS, but considerably more than peanuts to back up a NAS.
In article
<0001HW.29E8864B004A5DC87000022E738F@news.eternal-september.org>,
Martin S Taylor <correspondence@mRaErMtOiVnEsTtHaIySlor.com> wrote:
I think Synology NASes cannot be used as DASes, can they? I have to use a DAS, not a NAS.
correct. they cannot, however, some of them have 10gb-e ports so the
speeds are comparable (assuming your computer also has a 10gb-e port).
Oh, and the other clue is that the Icy Box runs at about 180MB/s normally, but sometimes slows right down to 30MB/s or even less. The simplest and most reliable way to get it up to speed is to reboot the Mac. (Ejecting it and remounting it doesn't work.)
I think Synology NASes cannot be used as DASes, can they? I have to use a DAS, not a NAS.
Last year I bought a couple of Icy Box RAID enclosures. These have given me intermittent problems for a while, mainly bit-flip errors, invalid hashes, and underallocation. All sorted by Disk Utility, I kept trouble at bay by replacing the USB cables, connecting direct to the Mac (rather than through a hub) etc. Problems seemed to be caused when using Carbon Copy Cloner, but I'd guess this is because CCC a) checks the validity of its copying (so I get to find out about any problems immediately), and b) CCC copies terabytes of data when I use it.
Anyhow, last week there were big problems. The culmination came yesterday when I used Carbon Copy Cloner to back up my Mac's internal drive to a folder on Icy Box. Immediately afterwards there were overlapped extent allocation errors, and the whole disk is pretty much unusable.
I've applied all my knowledge to track down the cause, but I'm a bit stuck. What's likely to be causing this, and what can I rule out?
• Hardware/firmware error in the Icy Box?
• Physical problems with the cable?
• Problems because the long (3m) USB cable snakes its way past lots of other cables, including mains cables?
• Hardware/software error at the Mac end?
• Bug in Carbon Copy Cloner?
Any suggestions on the cause, or advice for future testing gratefully received.
Do the Icy boxes have Ethernet? If so, using that might be more reliable.
Otherwise can you move the Icy boxes closer to the Mac and use a shorter cable for a week or two?
Is the firmware on the Icy boxes up to date (if it can be updated)?
Do you have a spare Mac?
Connect the Icy Box to the spare Mac and confirm that you still see the
same errors. If not suspect the Mac - but apart from the USB hardware
in the Mac I can't suggest a fault mechanism.
If you do see the same errors, try an alternative RAID enclosure. This
would mean wiping the disks so they can be managed by the new RAID controller, so get new disks as well.
Is the firmware on the Icy boxes up to date (if it can be updated)?
On 13 Apr 2023, Graham J wrote
(in article <u19lrh$13sjg$1@dont-email.me>):
Do you have a spare Mac?
Connect the Icy Box to the spare Mac and confirm that you still see the
same errors. If not suspect the Mac - but apart from the USB hardware
in the Mac I can't suggest a fault mechanism.
If you do see the same errors, try an alternative RAID enclosure. This
would mean wiping the disks so they can be managed by the new RAID
controller, so get new disks as well.
Well I *do* have another Mac, but the fault can't be reproduced reliably enough to see.
And yes, buying new disks is expensive, but I've tried that and it makes no difference.
Hi Theo:
Thanks for the comprehensive reply.
On 13 Apr 2023, Theo wrote
(in article <Icq*gaHdz@news.chiark.greenend.org.uk>):
Martin S Taylor <correspondence@mraermtoivnestthaiyslor.com> wrote:
Last year I bought a couple of Icy Box RAID enclosures. These have given me intermittent problems for a while, mainly bit-flip errors, invalid hashes, and underallocation. All sorted by Disk Utility, I kept trouble at bay by replacing the USB cables, connecting direct to the Mac (rather than through a hub) etc. Problems seemed to be caused when using Carbon Copy Cloner, but I'd guess this is because CCC a) checks the validity of its copying (so I get to find out about any problems immediately), and b) CCC copies terabytes of data when I use it.
What kind of RAID is the Icy Box doing? 0/1/5/6/10? Hardware RAID in the box, or software RAID (as multiple drives joined together on the Mac, like a
Fusion drive)?
Hardware RAID 5
Where are you seeing these errors? In CCC, or from the box?
Neither – it's when I check the integrity of the disk using Disk Utility.
If it's hardware RAID I would suspect the Icy Box...
That's my suspicion too, but I sent one Icy Box back because I thought it was causing the problems, and the new one is exactly the same.
Martin S Taylor <correspondence@mraermtoivnestthaiyslor.com> wrote:
Hi Theo:
Thanks for the comprehensive reply.
On 13 Apr 2023, Theo wrote
(in article <Icq*gaHdz@news.chiark.greenend.org.uk>):
Martin S Taylor <correspondence@mraermtoivnestthaiyslor.com> wrote:
Last year I bought a couple of Icy Box RAID enclosures. These have given
me intermittent problems for a while, mainly bit-flip errors, invalid hashes, and underallocation. All sorted by Disk Utility, I kept trouble at bay by replacing the USB cables, connecting direct to the Mac (rather
than through a hub) etc. Problems seemed to be caused when using Carbon Copy Cloner, but I'd guess this is because CCC a) checks the validity of
its copying (so I get to find out about any problems immediately), and b)
CCC copies terabytes of data when I use it.
What kind of RAID is the Icy Box doing? 0/1/5/6/10? Hardware RAID in the box, or software RAID (as multiple drives joined together on the Mac, like a
Fusion drive)?
Hardware RAID 5
OK. One thing you could try is a single disc, and then two discs as RAID1.
If the Icy is doing something strange then it's less likely to happen here, since it is no longer doing parity calculations.
Where are you seeing these errors? In CCC, or from the box?
Neither – it's when I check the integrity of the disk using Disk Utility.
I'm not sure what that's doing - I presume it's checking your filesystem integrity, not the integrity of the data on the disc (since APFS has no data checksums)?
In which case you'd only see problems when corruption hits
filesystem metadata rather than data blocks. Which might explain the random behaviour.
I would be tempted to look for a tool that writes data to the disc and then checks it can read it back perfectly. I'm not sure what's best for that,
but some of the fake SD card checkers like f3 might do it.
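To illustrate what such a write-then-verify test does, here is a rough Python sketch. It is not how f3 is implemented, and the function names and block sizes are made up for the example. It writes reproducible pseudo-random blocks, forces them out with `fsync`, then re-reads the file and reports which blocks come back different.

```python
import hashlib
import os


def pattern_block(seed, index, size):
    """Derive a reproducible pseudo-random block from a seed and block index."""
    out = bytearray()
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{seed}:{index}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:size])


def write_pattern(path, seed, blocks, block_size=1024 * 1024):
    """Write `blocks` reproducible blocks to `path` and flush them to the device."""
    with open(path, "wb") as handle:
        for index in range(blocks):
            handle.write(pattern_block(seed, index, block_size))
        handle.flush()
        os.fsync(handle.fileno())


def verify_pattern(path, seed, blocks, block_size=1024 * 1024):
    """Re-read the file and return the indices of blocks that do not match."""
    bad = []
    with open(path, "rb") as handle:
        for index in range(blocks):
            if handle.read(block_size) != pattern_block(seed, index, block_size):
                bad.append(index)
    return bad
```

One caveat: if you verify straight after writing, the reads may be served from the Mac's RAM cache rather than the disc. Ejecting and remounting the volume between the two steps, or writing much more data than the Mac has RAM, should make the check more meaningful.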
If it's hardware RAID I would suspect the Icy Box...
That's my suspicion too, but I sent one Icy Box back because I thought it was
causing the problems, and the new one is exactly the same.
If using the Icy as single discs is reliable, you could do software RAID on the Mac:
https://support.apple.com/en-gb/guide/disk-utility/dskufd8dce72/mac
although it's really only being a USB caddy at that point. I don't
think you would lose a lot of speed that way, as the hard drives are the bottleneck and the Mac's CPU is much faster than the weedy Icy one.
Theo
Yes, that will serve to test the Icy Boxes, but won't store enough data
for me. One Icy Box holds four 6TB drives, on which I have about 12TB of data. RAID1 won't store that, obviously, and I don't have the spare disks (or the time) to test it. I'd be more tempted to sell the Icy Boxes and
buy something else. (Recommendation?)
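The capacity arithmetic behind that can be sketched as follows. This is a simplified Python illustration using the textbook formulas for each RAID level; it ignores filesystem overhead and the TB/TiB difference, and the function name is invented for the example.

```python
def usable_tb(level, drives, drive_tb):
    """Approximate usable capacity in TB for common RAID levels."""
    if level == 0:
        return drives * drive_tb          # striping, no redundancy
    if level == 1:
        return drive_tb                   # every drive holds the same copy
    if level == 5:
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * drive_tb    # one drive's worth of parity
    if level == 6:
        if drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return (drives - 2) * drive_tb    # two drives' worth of parity
    if level == 10:
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        return (drives // 2) * drive_tb   # striped mirrors
    raise ValueError(f"unsupported RAID level: {level}")
```

With four 6TB drives, `usable_tb(5, 4, 6)` gives 18, while a two-disc mirror, `usable_tb(1, 2, 6)`, gives 6, which is why 12TB of data won't fit on the RAID1 test setup.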
Yes, that will serve to test the Icy Boxes, but won't store enough data for me. One Icy Box holds four 6TB drives, on which I have about 12TB of data. RAID1 won't store that, obviously, and I don't have the spare disks (or the time) to test it. I'd be more tempted to sell the Icy Boxes and buy something
else. (Recommendation?)
I would look for a RAID box that has a built-in console (either a screen
and keypad, or serial interface to a dumb terminal, or a web page
accessible via Ethernet). That way the health of the RAID can be proved
and monitored independently of the computer that stores data on it.
Probably this looks more like a NAS than a DAS.
No idea of available products, sorry.
Martin S Taylor <correspondence@mraermtoivnestthaiyslor.com> wrote:
Yes, that will serve to test the Icy Boxes, but won't store enough data
for me. One Icy Box holds four 6TB drives, on which I have about 12TB of data. RAID1 won't store that, obviously, and I don't have the spare disks (or the time) to test it. I'd be more tempted to sell the Icy Boxes and
buy something else. (Recommendation?)
I don't know how it works out in money terms, but how about selling the lot and getting a couple of 14TB USB HDD to put in software RAID1?
I have some of these in a NAS: https://www.amazon.co.uk/WD-Elements-Desktop-External-Drive/dp/B07Y3KDVZH/ (they were about £160 at the time ~2020: there are often deals)
You have to be a little careful not to get SMR drives (mine are all CMR, but they change the innards from time to time). On Reddit r/DataHoarder is a good place to confirm these things. Mine are helium ex-HGST mechanisms, so good quality.
Martin S Taylor wrote:
[snip]
Yes, that will serve to test the Icy Boxes, but won't store enough data for me. One Icy Box holds four 6TB drives, on which I have about 12TB of data. RAID1 won't store that, obviously, and I don't have the spare disks (or the time) to test it. I'd be more tempted to sell the Icy Boxes and buy something
else. (Recommendation?)
I would look for a RAID box that has a built-in console (either a screen
and keypad, or serial interface to a dumb terminal, or a web page
accessible via Ethernet). That way the health of the RAID can be proved
and monitored independently of the computer that stores data on it.
Probably this looks more like a NAS than a DAS.
But in your case it could be stressing the Icy box RAID - if drive A writes
a block at 200MB/s and drive B at 20MB/s, the RAID controller has to buffer
a lot of blocks for drive B. Maybe at some point writes are getting dropped because it runs out of buffer space. It should just backpressure rather
than dropping them, but maybe they never tested this extreme case?
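As a back-of-envelope illustration of that buffering argument (a Python sketch only; the 256MB buffer size used below is a made-up figure, not a known Icy Box specification):

```python
def seconds_until_buffer_full(buffer_mb, fast_mb_s, slow_mb_s):
    """Time until a write buffer fills when data arrives faster than it drains.

    Returns None when the slow drive keeps up and the backlog never grows.
    """
    backlog_rate = fast_mb_s - slow_mb_s  # MB of unwritten data added per second
    if backlog_rate <= 0:
        return None
    return buffer_mb / backlog_rate
```

With a hypothetical 256MB buffer, data arriving at 200MB/s and one drive draining at 20MB/s, the backlog grows at 180MB/s and the buffer would be full in under a second and a half. After that the controller has to stall the host or lose writes.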
On 14 Apr 2023, Theo wrote
(in article <Icq*5PKdz@news.chiark.greenend.org.uk>):
Martin S Taylor <correspondence@mraermtoivnestthaiyslor.com> wrote:
Yes, that will serve to test the Icy Boxes, but won't store enough data for me. One Icy Box holds four 6TB drives, on which I have about 12TB of data. RAID1 won't store that, obviously, and I don't have the spare disks (or the time) to test it. I'd be more tempted to sell the Icy Boxes and buy something else. (Recommendation?)
I don't know how it works out in money terms, but how about selling the lot and getting a couple of 14TB USB HDD to put in software RAID1?
I have been considering this. You mean, sit two of them side-by-side on my desk and let MacOS put them into RAID1? So I don't shuck them, and don't use an enclosure?
I have some of these in a NAS: https://www.amazon.co.uk/WD-Elements-Desktop-External-Drive/dp/B07Y3KDVZH/ (they were about £160 at the time ~2020: there are often deals)
Now £233 :(
You have to be a little careful not to get SMR drives (mine are all CMR, but
they change the innards from time to time). On Reddit r/DataHoarder is a good place to confirm these things. Mine are helium ex-HGST mechanisms, so good quality.
But this is huge!! I had no idea of the distinction. Although it doesn't explain the corruption of the disk structure (does it?) it most definitely explains why a disk running at 200MB/s will slow down to 25MB/s when it's under stress. All my HDDs appear to be SMR, and while that's fine for off-site backups and the like, as the article says it's pretty useless for a working drive streaming video footage while I edit it using Final Cut.
I would look for a RAID box that has a built-in console (either a screen
and keypad, or serial interface to a dumb terminal, or a web page
accessible via Ethernet). That way the health of the RAID can be proved
and monitored independently of the computer that stores data on it.
Probably this looks more like a NAS than a DAS.
No idea of available products, sorry.
If you were to have a box that supported iSCSI, you could then run an iSCSI initiator on the Mac and on top of that have regular HFS+/APFS formatted volumes.
I couldn't say if Backblaze would be happy with that, but they would look like regular HDD to a lot of the stack.
You have to be a little careful not to get SMR drives (mine are all CMR, but they change the innards from time to time). On Reddit r/DataHoarder is a
good place to confirm these things. Mine are helium ex-HGST mechanisms, so good quality.
But this is huge!! I had no idea of the distinction. Although it doesn't explain the corruption of the disk structure (does it?) it most definitely explains why a disk running at 200MB/s will slow down to 25MB/s when it's under stress. All my HDDs appear to be SMR, and while that's fine for off-site backups and the like, as the article says it's pretty useless for a working drive streaming video footage while I edit it using Final Cut.
While with software RAID you
just plug your drives into another Mac and keep going.
Your disk system needs to be designed for performance.
What about a PCIe controller with embedded firmware RAID, connected to several HDDs (CMR style) via SATA cables?
Your Mac ***may*** have the performance to manage a s/w RAID but a
separate RAID subsystem makes more sense to me. Also, if the Mac fails
the PCIe card can be plugged into another Mac and you can continue to
work with the storage.
In article
<0001HW.29E88EBE004C58DF70000C93238F@news.eternal-september.org>,
Martin S Taylor <correspondence@mRaErMtOiVnEsTtHaIySlor.com> wrote:
It's not the speed that's important. Backblaze charge peanuts to back up a DAS, but considerably more than peanuts to back up a NAS.
one solution is to use iscsi and it will appear as a local volume.
unfortunately, macos doesn't include an iscsi initiator so you have to
use third party options, and they have some issues.
on the other hand, synology has all sorts of cloud sync options,
however, i have not looked into pricing.
I don't think Synology offer cloud storage, though I'm happy to be corrected.
on the other hand, synology has all sorts of cloud sync options,
however, i have not looked into pricing.
I haven't checked their pricing but I seem to remember a couple of promotional
emails from them recently promoting their cloud services.
In article
<0001HW.29EA9E5600C8008B70000D2FB38F@news.eternal-september.org>,
Martin S Taylor <correspondence@mRaErMtOiVnEsTtHaIySlor.com> wrote:
I don't think Synology offer cloud storage, though I'm happy to be corrected.
be happy!
<https://c2.synology.com/>
On 15 Apr 2023, nospam wrote
(in article <150420230739452158%nospam@nospam.invalid>):
In article
<0001HW.29EA9E5600C8008B70000D2FB38F@news.eternal-september.org>,
Martin S Taylor <correspondence@mRaErMtOiVnEsTtHaIySlor.com> wrote:
I don't think Synology offer cloud storage, though I'm happy to be
corrected.
be happy!
<https://c2.synology.com/>
Thanks. Not that happy, though, since I have well over 15TB of data on my drives, and €900 a year is a bit much.
<https://c2.synology.com/>
Thanks. Not that happy, though, since I have well over 15TB of data on my drives, and €900 a year is a bit much.
Amazon S3 Glacier is $0.00099 per GB per month so in the order of £200
per year for 15TB.
You can use Filezilla Pro (£21 for 3 years) as the client and FTP your
files into it.
Withdrawing data costs extra so it really is a backup service - but I
think that will be fine for your particular use case.
Encrypting on the Mac first (at the project level perhaps, or maybe per video) is recommended. So perhaps create a temporary encrypted dmg, copy
the vids into it and then FTP that to S3.
But in your case it could be stressing the Icy box RAID - if drive A writes a block at 200MB/s and drive B at 20MB/s, the RAID controller has to buffer a lot of blocks for drive B. Maybe at some point writes are getting dropped because it runs out of buffer space. It should just backpressure rather than dropping them, but maybe they never tested this extreme case?
When the rebuild has finished on Sunday I'll open the case and see if all the drives are CMR or SMR. That *might* explain things, I guess.
I would be very tempted to sell the lot and start afresh. Although the Icy may behave itself in the presence of saner drives, I'd probably explore
other options.
On 14 Apr 2023, Martin S Taylor wrote
(in article <0001HW.29E98D6D008809BA700009CA138F@news.eternal-september.org>):
But in your case it could be stressing the Icy box RAID - if drive A writes
a block at 200MB/s and drive B at 20MB/s, the RAID controller has to buffer
a lot of blocks for drive B. Maybe at some point writes are getting dropped
because it runs out of buffer space. It should just backpressure rather than dropping them, but maybe they never tested this extreme case?
When the rebuild has finished on Sunday I'll open the case and see if all the
drives are CMR or SMR. That *might* explain things, I guess.
One CMR, three SMR.
While I was about it, I set Carbon Copy Cloner to do an MD5 checksum, re-reading the files it had replaced. Two of them were corrupt.
On 17 Apr 2023, Theo wrote
(in article <Hcq*ME1dz@news.chiark.greenend.org.uk>):
I would be very tempted to sell the lot and start afresh. Although the Icy may behave itself in the presence of saner drives, I'd probably explore
other options.
I am considering just that. Perhaps an OWC enclosure with SoftRAID.
Martin S Taylor wrote:
On 17 Apr 2023, Theo wrote
(in article <Hcq*ME1dz@news.chiark.greenend.org.uk>):
I would be very tempted to sell the lot and start afresh. Although the Icy
may behave itself in the presence of saner drives, I'd probably explore other options.
I am considering just that. Perhaps an OWC enclosure with SoftRAID.
I see that you need large storage capacity. Does it have to be a single volume?
Why do you need RAID? Is it for resilience, so you can hot-swap a
failed drive? Or simply to manage the large storage space?
On 17 Apr 2023, Bruce Horrocks wrote
(in article <eb8cd199-9edc-f2c6-3b03-c5ce7bd5181c@scorecrow.com>):
Amazon S3 Glacier is $0.00099 per GB per month so in the order of £200
per year for 15TB.
On their site they're quoting $0.00405 per GB per month.
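For comparison, the annual cost at each of the two quoted rates can be worked out with a short Python sketch. It assumes 1TB = 1000GB and covers storage only: request, retrieval and transfer charges are not included, and the function name is invented for the example.

```python
def annual_storage_cost(data_tb, price_per_gb_month, gb_per_tb=1000):
    """Yearly storage charge in the same currency as the per-GB price."""
    return data_tb * gb_per_tb * price_per_gb_month * 12


# 15TB at the two per-GB monthly prices quoted in this thread (USD).
low_rate = annual_storage_cost(15, 0.00099)
high_rate = annual_storage_cost(15, 0.00405)
print(f"${low_rate:.2f} vs ${high_rate:.2f} per year")
```

At $0.00099 per GB per month, 15TB comes to about $178 a year; at $0.00405 it is about $729 a year, roughly four times as much.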