Hi!
On a thread at another mailing list, someone mentioned that they,
each day, alternate doing backups between two external USB drives.
That got me to thinking (which is always dangerous) . . .
I have a full backup on USB external drive A, "refreshed" daily
using rsnapshot. Then, every day, I use rsync to make USB external
drive B an "exact" copy of drive A. It seemed like a good idea:
if drive A fails, I can immediately plug in drive B to replace it,
with no downtime and nothing lost.
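A minimal sketch of that mirror step (the mount points
/mnt/backup-a and /mnt/backup-b are placeholders, not necessarily
the real setup):

    # -a preserves permissions and times; -H preserves the hard
    # links rsnapshot relies on (without it, the copy on drive B
    # can balloon to many times the size of A); --delete removes
    # files from B that are gone from A, making it an exact copy
    rsync -aH --delete /mnt/backup-a/ /mnt/backup-b/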
But of course, any errors on drive A propagate daily to drive B.
So, is there a consensus on which would be better:
1) continue to "mirror" drive A to drive B?
or,
2) alternate backups daily between drives A and B?
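For concreteness, option 2 could be scripted roughly like this
(the per-drive rsnapshot config names are made up for
illustration):

    #!/bin/sh
    # alternate the backup target by day-of-year parity
    # (%-j strips the leading zero, which sh would read as octal)
    if [ $(( $(date +%-j) % 2 )) -eq 0 ]; then
        conf=/etc/rsnapshot-a.conf    # snapshot_root on drive A
    else
        conf=/etc/rsnapshot-b.conf    # snapshot_root on drive B
    fi
    rsnapshot -c "$conf" daily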
Having both drives connected and spinning simultaneously creates a
window of opportunity for some nasty ransomware (or a software bug,
a mistake, a power surge, whatever) to destroy both backups.
Of course it is safer to always have one copy offline.
True. But easier (and cheaper) said than done. [...]
That is also why I would not want all backup-storage devices
connected simultaneously. All it takes is one piece of software
going haywire, and you may end up with the original and all the
backups corrupted at once.
(...)
On Wed Oct 2, 2024 at 12:33 AM BST, Default User wrote:
May I ask why you decided to switch from rsnapshot to rdiff-backup, and
then to borg?
The main issue I hit with rdiff-backup was when I moved files, or
directories containing large files, around on my storage: the new
locations were considered "new" data, so the next backup increment
was comparatively large.
At the time I was using rsnapshot, I was subscribed to some very
high-traffic mailing lists (such as LKML), and storing the mail in
Maildir format (one file per email). rsnapshot's design of using
many hard links for files present in more than one backup increment
proved very expensive at the time (I switched to rdiff-backup
around 2006-2007).
Do you mean expensive in inodes? Which filesystem did you use?
I use rdiff to do the backups on the "server" (its job is serving
video content to the TV box over NFS) and ran into that problem, so
what I did was write a series of scripts that relink identical
files. It's not perfect; I suspect there are still bugs. It tries
to be efficient (by not comparing files that can't possibly be the
same, because they have different sizes or are already linked), but
it gets the job done. Eventually. Running it takes about as long as
running the backup in the first place. But hey, we're talking about
1 GiB of filespace which might change by 10-20 MiB between backups,
so not a big deal.
eben@gmx.us wrote:
I use rdiff to do the backups on the "server" ... and ran into that
problem, so what I did was write a series of scripts that relinked
identical files.
Possibly of interest: Debian package rdfind:
Description: find duplicate files utility
 rdfind is a program to find duplicate files and optionally list,
 delete them or replace them with symlinks or hard links. It is a
 command line program written in c++, which has proven to be pretty
 quick compared to its alternatives.
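If I read the description right, the whole relinking job could be
done with something like (path illustrative):

    # find duplicate files under the backup tree and replace them
    # with hard links
    rdfind -makehardlinks true /srv/backups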
On Mon Oct 7, 2024 at 9:37 AM BST, Michel Verdier wrote:
Do you mean expensive in inodes? Which filesystem did you use?
It was 18 years ago so I can't remember that clearly, but I think
it was a mixture of inode expense and increased CPU time from the
file churn (mails moved from new to cur, and later to a separate
archive Maildir, that sort of thing). It was probably ext3 given
the time.
I add the dateext parameter to logrotate, so old logs keep the same
name instead of being renumbered at every rotation.
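Something like this in a logrotate snippet (file names
illustrative):

    # with dateext, rotated logs get a stable date-stamped name
    # (myapp.log-20241007) instead of being renamed .1 -> .2 -> .3
    # on every rotation, which would churn the backups
    /var/log/myapp.log {
        weekly
        rotate 8
        dateext
        compress
    }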
I've used rsnapshot for several years now with no such issue. My
rsnapshot repository resides on ext4, on its own LVM logical
volume, on top of an encrypted RAID 5 array of four four-terabyte
spinning-rust drives.
    root@hawk:~# df -i /crc/rsnapshot/
    Filesystem                           Inodes IUsed IFree IUse% Mounted on
    /dev/mapper/hawk--vg--raid-rsnapshot    16M  3.2M   13M   21% /crc/rsnapshot
On Mon, Oct 07, 2024 at 08:44:44PM +0100, Jonathan Dowland wrote:
(...) It was probably ext3 given the time.
Note that the transition to ext4 must have been around 2006, making
huge directories viable (HTree). So perhaps that was a factor too.
When you have hundreds of millions of files in rsnapshot it really
starts to hurt because every backup run involves:
- Deleting the oldest tree of files;
- Walking the entire tree of the most recent backup once to cp -l
  it; and then
- Walking the whole tree again as the target of the rsync, to
  update what changed.
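Roughly the shell equivalent, for a lowest interval named "daily"
with seven slots (paths illustrative):

    rm -rf daily.6                      # walk and delete the oldest tree
    mv daily.5 daily.6                  # cheap renames for the middle ones
    mv daily.4 daily.5
    mv daily.3 daily.4
    mv daily.2 daily.3
    mv daily.1 daily.2
    cp -al daily.0 daily.1              # walk the newest tree, hard-linking every file
    rsync -a --delete /home/ daily.0/   # walk it again, updating what changed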
This rsnapshot I have is really quite slow with only two 7200rpm HDDs.
It spends way longer walking its data store than actually backing up any data. I could definitely make it speedier by switching to something
else. But I like rsnapshot for this particular case.
Although what probably matters most is how many files are in only
the most recent backup iteration, rather than in the entire
rsnapshot store. For me that is approximately 5.8 million.
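For anyone wanting the same number for their own store, counting
just the newest snapshot is a one-liner (path illustrative):

    find /srv/rsnapshot/daily.0 -type f | wc -l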
On 2024-10-08, Andy Smith wrote:
When you have hundreds of millions of files in rsnapshot it really
starts to hurt because every backup run involves:
- Deleting the oldest tree of files;
rsnapshot can rename it out of the way and delete it after the
backup is done, so the deletion involves only the backup system.
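That is the use_lazy_deletes option; in rsnapshot.conf (fields in
this file are tab-separated):

    # delete the oldest snapshot after the new backup completes,
    # instead of before it starts
    use_lazy_deletes	1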
- Walking the entire tree of the most recent backup once to cp -l
  it; and then
rsnapshot only renames directories when rotating backups, then does
an rsync that hard-links against the newest snapshot.
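With rsnapshot's link_dest option enabled, that step becomes a
single rsync call, roughly (paths illustrative):

    # unchanged files become hard links into daily.1 rather than
    # new copies
    rsync -a --delete --link-dest=/srv/rsnapshot/daily.1 \
        /home/ /srv/rsnapshot/daily.0/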
rsync works from file metadata, so it also depends on the
filesystem; some are quicker than others at metadata-heavy
workloads. I think that metadata is quite like the index used by
other backup systems.
Like I say, I like and use rsnapshot in some places, but speed and
resource efficiency are not its winning points.