How to reliably/easily backup 100Gb of FLACs?



mflint
2005-11-09, 04:44
In another thread, 'street_samurai' wrote:

Having said that, I also use FLAC as I don't want to rip my CDs ever again... and space is cheap.
True, space is cheap, and ripping is a time-consuming process. But we still put our trust in a hard-drive that probably runs 24x7 in many cases, and probably wasn't designed for that. (And that's without worrying about fire or theft!)

So how to make decent backups of all those FLACs? Rough calculations indicate that, when all my music is ripped, I'll have between 80Gb and 90Gb of FLACs.

I see two alternatives: backup to another hard-drive, or to a stack of DVDs.

Hard-disk: to do it properly, would need a large disk in another machine, with regular backups using rsync or something similar. In an ideal world I'd not like to keep another box running constantly. Alternatively, could occasionally plug in a USB external drive and manually run 'rsync'... but that's a bit flaky when you're as disorganised as I am! ;-)

DVDs: Would work, but would need somehow to keep track of which tracks have been backed up, and which haven't. Ideally, this would be semi-automatic: have something that magically recognises when there's a fresh 4Gb of unarchived FLACs, then sends an email to say "Hey! Stick a blank dvd in the drive". I'm Unix-based, so could maybe use the 'archive' bit on each file?


Any other ideas? How does everyone else do it?

Matthew

mherger
2005-11-09, 04:56
> Hard-disk: to do it properly, would need a large disk in another
> machine, with regular backups using rsync or something similar. In an
> ideal world I'd not like to keep more than one box running constantly.

Use an external HD (USB2/Firewire), copy everything over, put the HD in a
safe place.

--

Michael

-----------------------------------------------------------
Help translate SlimServer by using the
SlimString Translation Helper (http://www.herger.net/slim/)

Michaelwagner
2005-11-09, 04:59
Personally, I back up to another machine.

Not perhaps the brightest, since they live in the same room, and certain catastrophic failures (fire) would take them both out.

Most of my collection is also backed up on a USB hard disk. But I've been lazy about keeping it up to date.

I think at one point Sean (or someone from Slim) commented on putting an old computer at a friend's house with DSL or cable, running an FTP server on it, and backing up to there. Probably not a bad idea.

clumsyoik
2005-11-09, 05:07
DVDs: Would work, but would need somehow to keep track of which tracks have been backed up, and which haven't. Ideally, this would be semi-automatic: have something that magically recognises when there's a fresh 4Gb of unarchived FLACs, then sends an email to say "Hey! Stick a blank dvd in the drive". I'm Unix-based, so could maybe use the 'archive' bit on each file?


Any other ideas? How does everyone else do it?

Matthew
Here is my current system: I currently rip CDs to a 'new' folder (say ~/music/new). Whenever I have 4GB of files in there, I copy them to ~/music/nn where nn is 01, 02 etc. Then burn it as a DVD.

As another level of backup, and to ensure that any changed tags (eventually) get backed up, I re-burn each DVD periodically (these days, at <20p per disc, this can be quite often, certainly every year)

I'd like a better solution too. I'd probably have to re-tag quite a few tracks after restoring from DVD. I'm thinking of storing some kind of binary diff, but then again, it's easier to re-burn the entire DVD.
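The "rip to a 'new' folder, burn when it reaches 4GB" routine described above could be sketched roughly as follows. This is only an illustration: the 4 GB budget comes from the post, while the function names and the idea of picking the next two-digit folder automatically are assumptions, not anything the poster described using.

```python
import os

# Assumed budget per disc, taken from the "4GB" figure in the post.
DVD_BUDGET_BYTES = 4 * 1000**3


def folder_size(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def next_disc_name(music_root):
    """Return the next unused two-digit folder name ('01', '02', ...)."""
    used = [
        int(entry)
        for entry in os.listdir(music_root)
        if entry.isdigit() and os.path.isdir(os.path.join(music_root, entry))
    ]
    return "%02d" % (max(used, default=0) + 1)


def ready_to_burn(new_dir, budget=DVD_BUDGET_BYTES):
    """True once the 'new' folder holds at least one disc's worth of files."""
    return folder_size(new_dir) >= budget
```

A cron job could call ready_to_burn() on ~/music/new and send the "stick a blank DVD in the drive" email when it returns True; moving the files and burning the disc are left out here.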

mherger
2005-11-09, 05:09
> I think at one point Sean (or someone from Slim) commented on putting
> an old computer at a friend's house with DSL or cable, running an FTP
> server on it, and backing up to there. Probably not a bad idea.

If you go the internet way, use rsync instead of ftp. Can easily be
automated and won't transfer full files if you e.g. only change tags.


nmizel
2005-11-09, 05:31
If you go the internet way, use rsync instead of ftp. Can easily be
automated and won't transfer full files if you e.g. only change tags.


Exactly. You can even run rsync over ssh for better security.

Nicolas

mflint
2005-11-09, 05:31
Here is my current system: I currently rip CDs to a 'new' folder (say ~/music/new). Whenever I have 4GB of files in there, I copy them to ~/music/nn where nn is 01, 02 etc. Then burn it as a DVD.
I like that. :-)


I'd have to probably re-tag quite a few tracks after restoring from DVD. I'm thinking of storing some kind of binary diff...
I like that too! :-)

pfarrell
2005-11-09, 06:28
On Wed, 2005-11-09 at 03:44 -0800, mflint wrote:
> So how to make decent backups of all those FLACs? Rough calculations
> indicate that, when all my music is ripped, I'll have between 80Gb and
> 90Gb of FLACs.
>
> I see two alternatives: backup to another hard-drive, or to a stack of


Since 300GB disks are selling for well under $100,
it's a no-brainer decision for me. 100GB of data is going to
take 25 or so DVDs and that is a lot of disk shuffling.
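The disc count is easy to check. A minimal sketch, assuming either the nominal 4.7 GB of a single-layer DVD or the more conservative 4 GB per disc mentioned earlier in the thread (the function name is made up for illustration):

```python
import math


def discs_needed(total_gb, per_disc_gb):
    """Number of discs needed to hold total_gb, rounding up to whole discs."""
    return math.ceil(total_gb / per_disc_gb)


print(discs_needed(100, 4.7))  # 22 discs at nominal single-layer capacity
print(discs_needed(100, 4))    # 25 discs with a 4 GB budget per disc
```

Either way it is a couple of dozen discs, which matches the "25 or so" estimate.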


--
Pat
http://www.pfarrell.com/music/slimserver/slimsoftware.html

stuorguk
2005-11-09, 06:55
I figured the highest risk of data loss for me, was HD failure.

So I put all my audio files on a RAID1 Linux box, in the attic (out of sight from thieves). Doesn't cover me from a virus deleting all my files, or fire, but it's better than nothing. I figure if I have a fire, I will have more things to worry about than my music collection.

RAID1 on Linux was easy to set up - just 2 IDE drives of the same size plugged into the standard controller. If one drive fails, I get an e-mail to warn me (in theory anyway!)

Stuart.

max.spicer
2005-11-09, 11:04
I went the other way as I figured it wouldn't be too bad once I got over the pain of the initial backup. However, I've still not yet managed a full backup of my data!

I decided to use pbackup.sf.net to do incremental backups of my data to dvds. I use Nero's packet writer, InCD, to do the writing to DVDs. Sadly, InCD has crashed every time I've tried so far.

Max


Since 300GB disks are selling for well under $100
it's a no-brainer decision for me. 100GB of data is going to
take 25 or so DVDs and that is a lot of disk shuffling

street_samurai
2005-11-09, 12:13
Remember that modern hard drives are designed to be run all the time. Spinning anything mechanical up and down actually does more damage than leaving it on all the time. Thus a hard drive will be more reliable if left on all the time.

As far as storage, I too use an external hard drive enclosure and a big hard disk. As mentioned, $100 for 300GB and $25 for a decent USB 2.0 enclosure. Then I use SyncBack (but any decent backup program will do) to run weekly backups of my music to my enclosure.

If something truly terrible happens: fire, theft, "act of god" then at least you still have your original CDs which you could re-rip. They are the ultimate backup.

ss.

Jim
2005-11-09, 12:18
If something truly terrible happens: fire, theft, "act of god" then at least you still have your original CDs which you could re-rip. They are the ultimate backup.
ss.

But are they?

One of the reasons I often wonder about when to seriously start backing my stuff up is because CD rot is not just a CDR/CDRW related thing. Read up about all those 80's and early 90's produced commercial CD's that are now rotting!

EDIT: Link provided: http://news.bbc.co.uk/1/hi/entertainment/music/3940669.stm

MrC
2005-11-09, 12:28
Remember that modern hard drives are designed to be run all the time. Spinning anything mechanical up and down actually does more damage than leaving it on all the time. Thus a hard drive will be more reliable if left on all the time.

While this is completely true, it's also worth some additional perspective. Modern drives are designed for > 50,000 spin up/down cycles. That's over 27 times per day, each day, for 5 years. Not many people have to worry about this.
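The "27 times per day" figure is just the rated cycle count spread over the period. A quick check, with a hypothetical helper name and ignoring leap days:

```python
def cycles_per_day(rated_cycles, years):
    """Average spin up/down cycles per day that would use up the rating."""
    return rated_cycles / (years * 365)


print(round(cycles_per_day(50_000, 5), 1))  # 27.4
```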

pfarrell
2005-11-09, 12:54
Jim said:
> street_samurai Wrote:
>> If something truly terrible happens: fire, theft, "act of god" then at
>> least you still have your original CDs which you could re-rip. They are
>> the ultimate backup.
>> ss.
>
> But are they?
>
> One of the reasons I often wonder about when to seriously start backing
> my stuff up is because CD rot is not just a CDR/CDRW related thing.
> Read up about all those 80's and early 90's produced commercial CD's
> that are now rotting !!!

It all depends on how long a vision you have for your backup.
For a really long term view, you have to periodically put the
data into a modern format, and then write it out to whatever
looks to be forward looking.

Fifteen years ago, I backed up my critical stuff on QIC 40MB tape.
Then 1 GB DAT tapes came out, so I used them.
Then a few years later, 4GB DAT tapes, then a few years after that DLT.
You have to expect that the physical media will deteriorate over
time. More importantly, the drives become obsolete.

If you have 8 inch floppies, I am not sure that anyone can read them.
Even 5-1/4" floppies have been unreadable in my house.

Hard disks are probably useful for 3 to 5 years, if the data
is valuable, you really have to move to new drives periodically.
It's an engineering and budget question, more so than technology.

The only reliable long term storage is permanent ink on acid free paper.
That can last a thousand years or so.

Pat
http://www.pfarrell.com

Jetlag
2005-11-09, 13:35
On my PC are all of my FLAC files along with the uncompressed WAV rips (EAC=secure mode). Now I back up to an external Maxtor OneTouch II 300GB I got on sale. I used to have a server running but I am selling my house, so had to make it look a bit more like a home and less like space command.

I also keep all of my FLAC files on my laptop.

The initial rip/encode took quite a while, but now as I acquire new CDs I just rip them and add them to the collection.

Jim
2005-11-09, 13:39
On my PC are all of my FLAC files along with the uncompressed WAV rips
Now I'm going to assume you have a valid reason to have both a FLAC and a WAV of the same audio. If I'm not invading your privacy by asking, I would be very interested why?

Jetlag
2005-11-09, 13:51
I have a boatload of free space on my PC (>200GB), and that is with thousands of digital photos and all of my music files. So not really worried about it filling up anytime soon. I only use FLAC files for my SB2 and Karma.

Occasionally my girlfriend finds a couple of my songs that she wants me to put on her iPod or I swap out the ones on my 256MB 'gym' MP3 player. I encode these (MP3 & AAC) from the WAV files. Lots of encoders handle WAV, not too many handle FLAC.

kolepard
2005-11-09, 14:02
> > Since 300GB disks are selling for well under $100
> > it's a no-brainer decision for me. 100GB of data is going to
> > take 25 or so DVDs and that is a lot of disk shuffling

One interesting variation on this that I've read about is to put a
NAS or NAS RAID inside a fire-resistant gun safe. A lot of them have
electrical outlets inside, and you can actually use an ethernet
powerline adapter to access the device while it is running inside the
safe.

Granted, it's not going to survive a full-on fire torture test like
it would (hopefully) do inside a media safe (not that you couldn't
overcook one of those, too), but it's a lot more protection than most
people have short of an offsite backup. And you can access the
information live on your own LAN.

Alternatively, you could look into something like the Schwab
DataFortress or the ioSafe.

Kevin
--
Kevin O. Lepard
kolepard (AT) charter (DOT) net

Happiness is being 100% Microsoft free.

Jim
2005-11-10, 06:25
Occasionally my girlfriend finds a couple of my songs that she wants me to put on her iPod or I swap out the ones on my 256MB 'gym' MP3 player. I encode these (MP3 & AAC) from the WAV files. Lots of encoders handle WAV, not too many handle FLAC.
Do you bother with tags? If not then there's no point doing anything. If every time you convert the WAV's you are tagging stuff then you could make life easier....

I'd suggest investing a little time looking at a solution (e.g. Transcode) so you could set up a batch run to convert all of your FLAC's with tags.

Dumping the wav's you'd still save space and be able to store 4 different lossy files as well as the FLAC for whenever you wanted to transfer them.

e.g. 128kbps AAC
128kbps MP3
320kbps AAC
320kbps MP3

Once you've learned how to set it up it can do them whilst you're in bed!

Michaelwagner
2005-11-10, 07:17
put a NAS or NAS RAID inside a fire-resistant gun safe. A lot of them have electrical outlets inside
An interesting idea. Just curious, though ... why would a gun safe have an outlet inside?

jimdibb
2005-11-10, 07:24
On 11/10/05, Michaelwagner <
Michaelwagner.1yabjb (AT) no-mx (DOT) forums.slimdevices.com> wrote:
>
> kolepard Wrote:
> > put a NAS or NAS RAID inside a fire-resistant gun safe. A lot of them
> > have electrical outlets inside
>
> An interesting idea. Just curious, though ... why would a gun safe have
> an outlet inside?

How much free airspace is in there? You'd think it might catch fire on the
inside first with nowhere for the heat from the drives to go...

pfarrell
2005-11-10, 08:46
Well, you could always air condition the gun safe, but that
might be noisy, so you'd have to watercool the A/C.
Pretty soon you've spent more for the safe than you did
for your CD collection. :-)

On a slightly more serious note, the CDs are the ultimate backup, which can be a problem as many of mine are no longer in print. As a worst case, I could just buy the 700 or so CDs,
but that would cost about $10,000. So a few hundred dollars for a backup solution is probably good insurance.

Jetlag
2005-11-10, 09:07
A very good friend of mine (fellow computer and HT geek) uses the most foolproof system I know of. He uses 2 hot-swap HDDs and keeps one in his safe deposit box.

Since his Studio is directly across the street from his bank, at least once per week he simply copies all of his data to the HDD, walks across the street and swaps it out for the other drive. This way he always has all of his data on his PC and on at least one removable drive.

Since he already pays for the box for storing his important papers, etc, there was only the added cost of a hot swap bay and 2 drives. I think he even writes off the cost of the safe deposit box on his taxes. I forget which brand of backup software he uses, but it does incremental and quickly adds new data or updates revised files. He says it does not take very long.

I normally have a RAID3 server running in my house which is very safe. His method is about as safe as it gets.

Dan Goodinson
2005-11-10, 09:16
I've got the makings of something very similar at home.

Even though I have a meagre collection of music, I've invested so much
effort into getting it just right (specifically all the tags) that I've
got the makings of quite a decent backup system.

2 80GB drives in a mirrored RAID array. I currently run Windows backup
utility to copy everything on the RAID array to an 80GB drive in a
removable caddy. The backup runs every weekend. At the moment, though,
I keep everything in the same house. I perhaps should consider getting
a further 80GB HDD and keeping 1 disk at home and 1 disk at the office
:-)

The only downside right now is that the drives are not hot-swappable.
Consequently to change the drive I need to shutdown the PC. I've been a
bit slack and the drive has barely been removed from the caddy since
installation...

klausbgva
2005-11-10, 12:59
My approach to the very same problem.

After many tests and trials, I am working on a really serious solution to fix that problem.

Today I mirror the data between 2 servers within the same rack.

I occasionally copy everything to 1 USB external drive. But I will soon reach the size where disks are getting too small.

The next solution will be a SATA Raid 5 controller with up to 8 drives.

My brother is getting the same configuration. We will each buy 2x the size of what we need and copy all our critical data to the RAID.

Put the 2 servers in the same network and create an initial rsync of the folders.
After that it will be synced with rsync every day.

The homes are about 1/2 mile apart. If they ever fail at the same time, I think there will be bigger problems than getting a few music files back.

Klaus

Robin Bowes
2005-11-10, 13:35
klausbgva said the following on 10/11/2005 19:59:
>
> The next solution will be a SATA Raid 5 controller with up to 8
> drives.

I recommend either RAID6 or, at the very least, have a hot-spare in your
RAID5 array.

R.

--
http://robinbowes.com

If a man speaks in a forest,
and his wife's not there,
is he still wrong?

max.spicer
2005-11-10, 13:56
Why?? You're not very likely to lose two drives simultaneously, are you? This all sounds a bit over the top to me for a home music system (or is it not a home music system?).

Max


I recommend either RAID6 or, at the very least, have a hot-spare in your
RAID5 array.

mflint
2005-11-11, 02:17
Thanks everyone - some good suggestions there.

"Going RAID" is probably overkill for me (and it'd be hard to justify the expense to the 'Domestic Finance Director').

I did also think about buying something like a NSLU2 or Linkstation, adding a wireless card and giving it to a neighbour. Then I could do wireless rsync backups... but I would have to buy them a few beers occasionally.

But it'll probably be the "rip to a 'new' directory then burn to DVD when it contains about 4Gb" method, as suggested by "clumsyoik".

Cheers! :-)

Matthew

Dan Goodinson
2005-11-11, 04:08
Not sure if useful to you, but the RAID setup I have was very cheap indeed. I only use RAID 1 (mirroring) so I only need 1 extra disk for the array, in addition to any disks for actual backup. I used a super-cheap RAID add-in controller from eBuyer: £10 or so.

So if you can afford the extra disk for your RAID array, the additional cost of the add-in controller is pretty small. In fact, most new motherboards come with built-in RAID controllers, so you may be able to get around the cost of the RAID card. Mine's an IDE RAID controller, but I understand that SATA controllers can be had for around £15.

-----Original Message-----
From: discuss-bounces (AT) lists (DOT) slimdevices.com [mailto:discuss-bounces (AT) lists (DOT) slimdevices.com] On Behalf Of mflint
Sent: 11 November 2005 09:17
To: discuss (AT) lists (DOT) slimdevices.com
Subject: [slim] Re: How to reliably/easily backup 100Gb of FLACs?

*snip*
"Going RAID" is probably overkill for me (and it'd be hard to justify the expense to the 'Domestic Finance Director').

Robin Bowes
2005-11-11, 04:17
max.spicer said the following on 10/11/2005 20:56:
> Why?? You're not very likely to lose two drives simultaneously, are
> you?

It's more likely than you think. I've lost several over the last 18 months.

> This all sounds a bit over the top to me for a home music system
> (or is it not a home music system?).

It's not the system I'm protecting; it's the hundreds of hours spent
ripping and tagging several hundred CDs!

R.

> Robin Bowes Wrote:
>
>>I recommend either RAID6 or, at the very least, have a hot-spare in
>>your
>>RAID5 array.



max.spicer
2005-11-11, 05:22
Welcome onboard, by the way. That's 3 York users now. Won't be long before we can have Slim parties. Or maybe not. ;-)

Max

clumsyoik
2005-11-11, 05:49
My approach to the very same problem.

After many tests and trials, I am working on a really serious solution to fix that problem.
...
The next solution will be a SATA Raid 5 controller with up to 8 drives.
...
After that it will be synced with rsync every day
Klaus
If you must backup to hard drive, don't simply rsync from master->backup. You need to keep the history of any changes. What if a virus/small child/hardware fault corrupts some of your data without you realising? When it gets automatically rsync'ed to the backup, your data is gone for good.

For a really serious solution, you need to use something like rdiff-backup, which will allow you to recover back to an arbitrary point in time.
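To make the "keep the history" point concrete, here is a minimal sketch of a copy that sets aside the old backup copy of any changed file before overwriting it. It is not rdiff-backup and does not store diffs; the function name, the separate history folder and the stamp argument are all assumptions made for illustration.

```python
import filecmp
import os
import shutil


def backup_with_history(src, dst, history, stamp):
    """Copy the tree at src into dst.

    If a file already exists in dst with different content, the old copy is
    moved to history/<stamp>/ first, so a corrupted master cannot silently
    replace the only good backup. Returns the relative paths that were kept.
    """
    kept = []
    for root, _dirs, files in os.walk(src):
        rel_root = os.path.relpath(root, src)
        for name in files:
            rel = os.path.normpath(os.path.join(rel_root, name))
            source = os.path.join(src, rel)
            target = os.path.join(dst, rel)
            if os.path.exists(target):
                if filecmp.cmp(source, target, shallow=False):
                    continue  # byte-for-byte identical, nothing to do
                old = os.path.join(history, stamp, rel)
                os.makedirs(os.path.dirname(old), exist_ok=True)
                shutil.move(target, old)
                kept.append(rel)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            shutil.copy2(source, target)
    return kept
```

Calling it with a date string as the stamp, for example backup_with_history("music", "backup", "backup-history", "2005-11-11"), gives one history folder per run. Unlike rdiff-backup it keeps whole old files, so it trades disk space for simplicity.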

Jim
2005-11-11, 05:59
If backing up FLAC's it's not like another sort of backup - you only ever need to backup a FLAC once.

I actually have 30 or so DVD's full of FLAC's I actually bothered backing up.

But none of them are tagged; I didn't get round to that then. I haven't backed them up again since: just because a tag or a filename changes, you don't need to back up again. You only need to back up the FLAC's one time, and just keep a backup of all your current tags (on a spare gmail account, webpage, floppy disk or wherever). If you have the tags relating to the fingerprint of each FLAC, it is trivial to restore an old untagged version and then tag it again.

Michaelwagner
2005-11-11, 08:46
I did also think about buying something like a NSLU2 or Linkstation, adding a wireless card and giving it to a neighbour.
Or two routers that can do VPN, give one to a friend, who can then be pretty much anywhere where they have DSL or cable modems, and you could send the stuff that way. Removes the requirement that it be a close neighbour.

MrC
2005-11-11, 09:29
For a really serious solution, you need to use something like rdiff-backup, which will allow you to recover back to an arbitrary point in time.
Have you actually tried binary data diffs with large files? This would take ages for larger libraries! And the storage required is almost as large and sometimes larger than the file itself. This becomes tantamount to multiple versions of your backups, so a straight rotate and copy scheme is faster.

jimdibb
2005-11-11, 10:22
One small point about cost. RAID1 has the smallest buy in, but $/MB is more
expensive than RAID5 (as long as the R1 and R5 controllers are about the
same price.)
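The cost comparison can be worked through with round numbers. A small sketch, assuming $100 per 300GB drive (a round figure based on prices quoted earlier in the thread), a two-drive mirror for RAID1, one drive's worth of parity for RAID5, and ignoring controller cost as the post does; the function names are made up:

```python
def usable_drives(level, drives):
    """Number of drives' worth of usable space in the array."""
    if level == 1:
        if drives != 2:
            raise ValueError("this sketch only models a two-drive mirror")
        return 1
    if level == 5:
        if drives < 3:
            raise ValueError("RAID5 needs at least three drives")
        return drives - 1  # one drive's worth of capacity goes to parity
    raise ValueError("only RAID1 and RAID5 are modelled here")


def cost_per_usable_gb(level, drives, drive_gb, drive_price):
    """Total drive cost divided by usable capacity."""
    return (drives * drive_price) / (usable_drives(level, drives) * drive_gb)


raid1 = cost_per_usable_gb(1, 2, 300, 100)  # $200 buys 300GB usable
raid5 = cost_per_usable_gb(5, 4, 300, 100)  # $400 buys 900GB usable
print(round(raid1, 2), round(raid5, 2))     # 0.67 0.44
```

So the mirror has the lower entry price ($200 against $400 here), while the four-drive RAID5 is cheaper per usable gigabyte.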

Finally a subject that's in my area of professional expertise, rather than
hobbyist interest.

On 11/11/05, Dan Goodinson <Dan.Goodinson (AT) businessobjects (DOT) com> wrote:
>
> Not sure if useful to you, but the RAID setup I have was very cheap
> indeed. I only use RAID 1 (mirroring) so I only need 1 extra disk for the
> array, in addition to any disks for actual backup. I used a super-cheap RAID
> add-in controller from eBuyer: 10 or so.
>

geoffb
2005-11-11, 20:42
On 11/11/05, clumsyoik <clumsyoik.1yc21b (AT) no-mx (DOT) forums.slimdevices.com> wrote:
> What if a
> virus/small child/hardware fault corrupts some of your data without you
> realising? When it gets automatically rsync'ed to the backup, your data
> is gone for good.

I use a system someone else already mentioned; external drive, USB
enclosure, SyncBack. Cheap to set up, and very fast for backups.

To get around the above issue of overwriting good backups with corrupt
data, I only backup after I've done enough hours of changes that it
would be particularly painful to repeat them. I don't see the need to
back up daily, when I only rip new material once a month or so. When
I do the backup, SyncBack throws up a list of what's changed; I
eyeball it briefly, and if anything unexpected is in there (hasn't
happened yet), it would indicate hardware problems / small children,
and be simple to recover from.

Of course, there's still a risk in this, but if you're that paranoid,
you could simply get a second USB drive and rotate them.

Cheers
Geoff

Jim
2005-11-11, 22:23
I still don't get why everyone is discussing backup tactics that would apply to NORMAL *EVER CHANGING* data such as databases, documents etc... If the actual *audio* in your files is changing then you're doing something very strange.

Unless we've gone off topic I thought we were talking music, audio files.

In that case, once it's ripped, that's it. Back it up. The rules being: when the backup media is full, move onto new media. Keep a regular backup of your tags (just text info), having them relate to a checksum of your backed-up files. Obviously keep them away from your backup of the audio files.

If you lose your files.....restore them....run routine over the restored files to re-apply your CURRENT tags.
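One way to picture "tags related to a checksum" is a small manifest keyed by the MD5 of each file exactly as it went onto the backup media. This is only a sketch under that assumption: the manifest layout and function names are invented here, and reading or writing the actual tags in the audio files is deliberately left out.

```python
import hashlib
import json


def file_md5(path, chunk_size=1 << 20):
    """MD5 hex digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_tags(manifest, backed_up_file, tags):
    """Store the current tags under the checksum of the backed-up file."""
    manifest[file_md5(backed_up_file)] = dict(tags)


def save_manifest(manifest, path):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(manifest, f, indent=2, sort_keys=True)


def load_manifest(path):
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def tags_for_restored_file(manifest, restored_file):
    """Look up the current tags for a file restored from the backup media."""
    return manifest.get(file_md5(restored_file))
```

The manifest is plain text and tiny, so it can be backed up as often as the tags change, while the audio only has to be written to backup media once.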

mherger
2005-11-12, 00:19
> If you lose your files.....restore them....run routine over the
> restored files to re-apply your CURRENT tags.

And now tell me how you back up / restore your mp3 tags only. If this is
so obviously simple, I clearly must have missed it.

--

Michael

-----------------------------------------------------------
Help translate SlimServer by using the
StringEditor Plugin (http://www.herger.net/slim/)

Bruce Hartley
2005-11-12, 00:52
Yes please.

How do you back up tags with a checksum?
How do you restore tags with that checksum?

clumsyoik
2005-11-12, 03:31
Have you actually tried binary data diffs with large files?
...
And the storage required is almost as large and sometimes larger than the file itself.

Have *you*?

I have never used rdiff-backup, but I have used xdelta.

Here is a simple non-scientific benchmark, although still fairly representative of the kind of diffs that would be generated in this application.

This is using xdelta, which I believe to be pretty good at generating binary diffs.

% metaflac --list --block-type=VORBIS_COMMENT test1.flac
METADATA block #2
type: 4 (VORBIS_COMMENT)
is last: false
length: 300
vendor string: reference libFLAC 1.1.0 20030126
comments: 10
comment[0]: ALBUM=Dark Side Of The Moon
comment[1]: ARTIST=Pink Floyd
comment[2]: DESCRIPTION=
comment[3]: GENRE=Pop/Rock
comment[4]: TITLE=Us And Them
comment[5]: TRACKNUMBER=6
comment[6]: replaygain_track_gain=-2.13 dB
comment[7]: replaygain_track_peak=0.696075
comment[8]: replaygain_album_gain=-3.67 dB
comment[9]: replaygain_album_peak=0.968200

% metaflac --list --block-type=VORBIS_COMMENT test2.flac
METADATA block #2
type: 4 (VORBIS_COMMENT)
is last: false
length: 262
vendor string: reference libFLAC 1.1.0 20030126
comments: 6
comment[0]: ALBUM=DskjdfhskjfhOf The Moon
comment[1]: ARTIST=Pink Floydslsekjlwerjkwlkj
comment[2]: DESCRIPTION=lkasjdfajfowierskdjflskjflasd
lkjdfsklfj
sdfkslkdfjasldfjalkdf
comment[3]: GENRE=aksjdfaslkdjfla
comment[4]: TITLE=lskdjlaskdfjsaldkfj
comment[5]: TRACKNUMBER=87

% xdelta delta test1.flac test2.flac diff

% ls -l
413 2005-11-12 10:18 diff
45651843 2005-11-12 10:11 test1.flac
45651843 2005-11-12 10:16 test2.flac


This would take ages for larger libraries!

Clearly binary diffs can be very efficient indeed.

This took about 5 seconds, would probably be 20 if they weren't cached. So, yes, it would be a long time for a large library, but it would be no worse than rsync. (In practice, you could selectively ignore files based on the last-modified time and filesize)
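The "ignore files based on the last-modified time and filesize" shortcut could look something like this. It is a sketch with made-up function names: it records size and modification time per file and reports only the paths whose signature differs from the previous run, so the expensive diff is run on those alone.

```python
import os


def snapshot(root):
    """Map each file's path (relative to root) to (size, whole-second mtime)."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            state[os.path.relpath(full, root)] = (st.st_size, int(st.st_mtime))
    return state


def changed_files(previous, current):
    """Paths that are new, or whose size or mtime differs from the last run."""
    return sorted(
        path for path, signature in current.items()
        if previous.get(path) != signature
    )
```

A file edited without changing its size or modification time would be missed, which is the usual trade-off of this kind of quick check.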

Jim
2005-11-12, 08:56
And now tell me how you back up / restore your mp3 tags only. If this is
so obviously simple, I clearly must have missed it.



Yes please.

How do you back up tags with a checksum?
How do you restore tags with that checksum?

You make a md5 of the file as the last thing you do before you put it on a backup media. Then you tag the file with a "original md5" tag - in the comments tag if nowhere else.

With FLAC you don't have to do this, you ALWAYS have a reference point which never changes with tagging (the fingerprint).
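For reference, the fingerprint is the MD5 of the decoded audio, stored in the last 16 bytes of the 34-byte STREAMINFO block that the FLAC format puts first in every stream. A minimal sketch for reading it, assuming the file starts directly with the "fLaC" marker (no ID3 tag prepended); the function name is made up:

```python
def flac_audio_md5(path):
    """Return the audio MD5 stored in a FLAC file's STREAMINFO block, as hex."""
    with open(path, "rb") as f:
        if f.read(4) != b"fLaC":
            raise ValueError("file does not start with the fLaC marker")
        header = f.read(4)
        if len(header) != 4:
            raise ValueError("truncated metadata block header")
        block_type = header[0] & 0x7F            # top bit is the last-block flag
        length = int.from_bytes(header[1:4], "big")
        if block_type != 0 or length != 34:
            raise ValueError("first metadata block is not STREAMINFO")
        body = f.read(34)
        if len(body) != 34:
            raise ValueError("truncated STREAMINFO block")
    return body[18:].hex()                       # final 16 bytes are the MD5
```

Because the tags live in a separate VORBIS_COMMENT block, this value stays the same however often the file is re-tagged. An encoder that does not compute the MD5 leaves it as all zeros, in which case it is no use as a key.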

If you are using such a crappy format that there is nowhere to tag then you'd have to reference every tag change to the original's md5.

Or you could do what I do and use a proper database program to organise your music (CATraxx) and store the original md5's and write the tags out from there.

I've written up my own code to do a few things between my FLAC files & CATraxx, but you could use a spreadsheet or your own DB, or Slim's DB.

It only takes writing a few scripts/routines which I am sure most folks who are technical enough to be discussing RAID arrays, complex backup strategies etc... are able to do.

One time I even lost a HD with no backup, well not the HD but the data (repartition). I fired up a hex editor and soon discovered that it was just the file indexing that was screwed up (the NTFS backup one too) - I could see what looked like FLAC files ("fLaC" is a giveaway). Now being a bit of a FLAC-geek I wrote some code to scan the disk looking for fingerprints within the files and working out where the files started/ended. I extracted these sectors out to files and could then use FLAC's testing option to test they were 100% restored. Out of about 300 albums I had to feed only 3 into EAC again.
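The first step of that kind of recovery, finding where candidate FLAC streams begin in a raw disk image, can be sketched as below. This only locates "fLaC" markers; working out where each file ends and verifying it (as described above) is not attempted, and the function name and image-file approach are assumptions for illustration.

```python
def find_flac_offsets(image_path, chunk_size=1 << 20):
    """Return byte offsets of every 'fLaC' marker in a raw image file."""
    marker = b"fLaC"
    offsets = []
    position = 0   # absolute offset of the next chunk to be read
    tail = b""     # last few bytes of the previous buffer
    with open(image_path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Prepend the tail so a marker split across two chunks is found.
            buffer = tail + chunk
            base = position - len(tail)
            index = buffer.find(marker)
            while index != -1:
                offsets.append(base + index)
                index = buffer.find(marker, index + 1)
            tail = buffer[-(len(marker) - 1):]
            position += len(chunk)
    return offsets
```

Some hits will be false positives (the four bytes can occur by chance), so each candidate still needs to be checked, for example by testing the extracted data with flac's own test option as mentioned above.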

MrC
2005-11-12, 10:03
Have *you*?

I have never used rdiff-backup, but I have used xdelta.

Here is a simple non-scientific benchmark, although still fairly representative of the kind of diffs that would be generated in this application.

This is using xdelta, which I believe to be pretty good at generating binary diffs.
No need to get huffy.

Have I tried binary diffs? Indeed I have. I've done some pretty extensive evaluation of such, having been a consultant to a company for which I provided version control software for hardware designers (who can have VERY large binary files).

Your test case is a trivial one, and does not show the problems associated with larger changes in a file, but is probably representative of what will occur with music files. Since only metadata is being changed at the beginning of the file, binary diffs are fine. However, running such an expensive set of operations to simply obtain and backup the tags seems wasteful.


Clearly binary diffs can be very efficient indeed.
Sure, but rarely. As the change set increases, the time required to generate deltas goes up almost exponentially.


This took about 5 seconds, would probably be 20 if they weren't cached. So, yes, it would be a long time for a large library, but it would be no worse than rsync. (In practice, you could selectively ignore files based on the last-modified time and filesize)
Agreed, you could be more clever about selecting which files to backup. But your method would always be worse than rsync if you tell rsync not to bother doing a diff, and to just copy. As Jim earlier pointed out, this is not constantly changing binary data in general - only the tags change, so all you need is the tag information and one set of music files.

clumsyoik
2005-11-12, 14:40
No need to get huffy.

Apologies if it came out that way. No offense intended.



Your test case is a trivial one, and does not show the problems associated with larger changes in a file, but is probably representative of what will occur with music files. Since only metadata is being changed at the beginning of the file, binary diffs are fine. However, running such an expensive set of operations to simply obtain and backup the tags seems wasteful.

Agreed. But the alternatives seem to involve writing custom scripts to extract and store (and presumably restore) tags.

klausbgva
2005-11-12, 14:50
Yes all my raid systems do have a spare disk.

rsync has been a very good solution for me. I use it to back up my laptop every night. I keep 2 versions of the archive.

As for the data itself on the server, I am somewhat paranoid: limited permissions for all users and only 1 user with delete rights (which I try not to use).

To be really safe I will add 1 LTO drive in one of the locations and run incremental backups every day and full backups every week (keeping 4 weeks).

Music is not the only thing stored on the drives; I will put all other files there as well.
My main concern however is the increasing number of large video files. They will be excluded and burned to DVD.

Site-to-site bandwidth is another major issue. DSL is way too small when you add 1-2GB of files every few days.
We will use the unlicensed spectrum and a WiMAX connection.
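To put a number on the bandwidth concern, here is a rough calculation. The 256 kbit/s uplink is purely an assumed figure for a consumer DSL line, not something stated in the thread, and protocol overhead is ignored.

```python
def transfer_hours(gigabytes, uplink_kbit_per_s):
    """Hours needed to upload the given amount of data at a steady rate."""
    bits = gigabytes * 1e9 * 8
    seconds = bits / (uplink_kbit_per_s * 1000)
    return seconds / 3600


print(round(transfer_hours(1, 256), 1))  # 8.7 hours for 1 GB
print(round(transfer_hours(2, 256), 1))  # 17.4 hours for 2 GB
```

At that rate a 2GB batch ties up the line for most of a day, which is why a faster site-to-site link looks attractive.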