Removing duplicate tracks



CavesOfTQLT
2005-05-02, 04:35
Having encoded most of my CDs onto HDD, and after running my own Slimserver & Softsqueeze trial, I know that this whole idea of streaming music to different rooms is going to be really worthwhile and I look forward to getting my first Squeezebox2 soon.

Since starting the trial the one thing I've noticed is the amount of duplicate tracks I've got, in some cases I've got six copies of the same track due to them being on different albums, and they're all taking up space that could be used for five other different tracks. Now this ain't too bad with small file sizes like mp3, but now I'm encoding my CDs in FLAC, these dupes take up a lot of space.

Wouldn't it be great if you could have just one copy of the track stored on your server, and the Slimserver software just referenced that track in each album where it should be? I'm thinking along the lines of a scenario where you click on 'Wipe Cache', the Slimserver rebuilds the database, and you then get a tab called 'Duplicates', possibly under server settings, and this lists all the duplicate tracks.
You then play each copy (if you wanted to, that is) and choose the best one, and the other copies would be taken out of your library and put into a 'deleted' bin just in case of any problem later. The Slimserver would then put a small track descriptor reference file in the location where the track was removed so that the 'track' would appear in any album scroll, etc., and if you press play on that 'track' then it plays the track as normal. Once you're happy that everything is running okay, you'd then delete the duplicate tracks in the 'deleted' bin permanently.

I don't know whether something like this could be done, but I thought I'd mention it as a possible streamlining idea for the future.

Jim Dibb
2005-05-02, 04:46
That's a great idea!

CavesOfTQLT
2005-05-02, 05:14
Jim Dibb Wrote:
> That's a great idea!
Cheers Jim.

I just don't see the reason to have multiple copies of the same thing on your server, unless you're happy to show off to someone that you've got 10000 'tracks' when you've actually only got 8000 ;) [That's a joke BTW]

Hey maybe the Slimserver team could lead the way on this idea, because I don't think anybody else has thought about it ... yet ;)

Jim Dibb
2005-05-02, 05:27
Sorry I can't do anything to actually help the situation along. I'm
not even on 6.x.x yet because I'm just running an SB1 and the 5.4 is
working fine for me. Wish I had the time to play with this stuff more
(and slightly better Perl skills.)

DrNic
2005-05-02, 05:40
I certainly second this idea..
I have many compilation albums - especially the "chill out" variety and you can guarantee that at least 2 tracks are the same on all the albums (grrrr!)
I like your idea on how it would work, but having absolutely no idea how to program Perl, all I can do is stick my hand up in the air and say count my vote...

Nic

James Dunn
2005-05-02, 05:55
I haven't tried it but to some extent, you should be able to set the
"Multiple Items in Tags" parameter to say "/" and put several album names in
one album name field. The only (big) problem that I can see is that, unless
you are lucky, the track number will be wrong for all but one of the albums.
Compilation albums will probably be OK though - if you're not hung up about
play order.

Also, you'll probably have a big job going through the tracks looking for
subtle errors/differences in the names of the tracks - particularly if you'd
relied on automatic CDDB tagging. Now that there is an SQL database and a
published schema you can do an SQL "select distinct" to produce a list of
supposedly unique tracks. I tried it with my collection and then manually
looked through the list looking for errors/inconsistencies; it's taken me a
long time to get the naming (and years) consistent. Have a look at how many
are different in your collection.
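
A rough sketch of the "select distinct" check described above, using Python's built-in sqlite3 module. The `tracks` table and its columns are made up for illustration and are not the published SlimServer schema; the second query (grouping case-insensitively) is an extra assumption about how one might surface near-identical names.

```python
import sqlite3

# Hypothetical, simplified table; NOT the real SlimServer schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (title TEXT, artist TEXT, album TEXT)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?)",
    [
        ("Beat It", "Michael Jackson", "Thriller"),
        ("Beat It", "Michael Jackson", "Hits Of The 80s"),
        ("Beat it", "Michael Jackson", "Some Other Compilation"),
        ("Everybody Hurts", "R.E.M.", "Some Album"),
    ],
)


def distinct_tracks(conn):
    """List the supposedly unique (artist, title) pairs, exactly as tagged."""
    return conn.execute(
        "SELECT DISTINCT artist, title FROM tracks ORDER BY artist, title"
    ).fetchall()


def likely_duplicates(conn):
    """List (artist, title, count) for names that occur more than once, ignoring case."""
    return conn.execute(
        "SELECT LOWER(artist), LOWER(title), COUNT(*) FROM tracks "
        "GROUP BY LOWER(artist), LOWER(title) HAVING COUNT(*) > 1"
    ).fetchall()


for row in distinct_tracks(conn):
    print(row)
print(likely_duplicates(conn))
```

Here "Beat It" and "Beat it" show up as two separate entries in the distinct list, which is the kind of subtle tagging difference that has to be cleaned up by hand before any tag-based duplicate matching can work.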

Cheers,

James

CavesOfTQLT
2005-05-02, 06:40
Yep, it would require all the tags to be exactly the same, and the results from the online databases don't help. But I, like many Slimserver users I'm sure (especially the perfectionists), manually correct the tags after ripping. I can't think of a single CD where I've not had to redo the tags because of small discrepancies, or because they don't follow my naming convention.

But even I make mistakes. Copies of tracks that have slight discrepancies are easy to spot using the 'Browse Artists/All Songs' function of Slimserver following a 'wipe cache'. Just make a note of the incorrectly named tracks and use your tag program to make all the copies the same. Note I'm only talking about Artists and Track names for this idea. When finished do another scan, and repeat until all copies finally match.

Those that just want plug'n'play, or who aren't bothered with the copies, could leave things as is and not use the 'remove dupes' function.


Cheers

Wendell Hicken
2005-05-02, 06:58
On 5/2/05, CavesOfTQLT <CavesOfTQLT.1oek4n (AT) no-mx (DOT) forums.slimdevices.com> wrote:
> Since starting the trial the one thing I've noticed is the amount of
> duplicate tracks I've got, in some cases I've got six copies of the
> same track due to them being on different albums, and they're all
> taking up space that could be used for five other different tracks. Now
> this ain't too bad with small file sizes like mp3, but now I'm encoding
> my CDs in FLAC, these dupes take up a lot of space.

You can use MusicMagic Mixer to find/delete your duplicates. There's no support
for leaving behind reference files to the master track, but if
SlimServer were to
add support for some kind of magic files, I could change MMM to leave
them behind.
Note that MMM is finding duplicates based on audio fingerprints, not metadata.

Note that finding duplicates is a premium feature (meaning you need an active
registration key), but a trial key will work just fine, and you should
be able to clean
all your duplicates before the trial ends.

Wendell

Jim
2005-05-02, 10:52
CavesOfTQLT Wrote:
> Since starting the trial the one thing I've noticed is the amount of duplicate tracks I've got, in some cases I've got six copies of the same track due to them being on different albums, and they're all taking up space that could be used for five other different tracks. Now this ain't too bad with small file sizes like mp3, but now I'm encoding my CDs in FLAC, these dupes take up a lot of space.......

But how do you class them as duplicates? Obviously the human ear can easily tell the difference between a remix and the original, or notice that an artist's Greatest Hits CD has the single version as opposed to the slightly different album version.

But remember you are in the lossless domain now, where every bit is a difference - I have over 20,000 FLACs now and whilst I have many songs that you might class as dupes, I have no identical FLAC audio fingerprints in my entire collection. You have to remember that MMM's fingerprints are not the same as FLAC's exact fingerprints (which are in fact MD5 hashes of the header-stripped WAV file). MMM's are merely a rough representation of how it sounds, an acoustic fingerprint, so it's no wonder you have dupes appearing.

Obvious reasons for differences would be different levels of compression used for various compilations etc... but I figure having a few "dupes" scattered here and there is hardly a big issue. If you're going to start removing files you may as well get rid of any traditional concept of a compilation - just store all tracks from compilations as "singles" on your server - now you can define your own compilations, no longer have the 80-minute limit of a CD, and can at last remove those embarrassing tracks that sometimes slip out during a random shuffle.

In any case, have a real listen to the "dupes" yourself before removing - after deleting dupes without care, a badly mastered album with one perfectly mastered track is going to sound just as bad as a brilliantly mastered album with one duff track, and of course we then have the perceived volume differences....

Seems to me you are using MP3 (lossy) methods/ideas to solve a FLAC (lossless) "problem". Unless (which I doubt) these are exact duplicates then for every track you delete you have just made the entire album lossy, or worse than lossy since it'll be lossy disguised as lossless - if you decide to go this route then archive to DVDR at least.

CavesOfTQLT
2005-05-02, 11:16
Concerning duplicate tracks; if I've got R.E.M's Everybody Hurts* on one of their albums, and I've also got the same track on a compilation, and another on another compilation, that's three copies taking up space, whether that space is valuable or not. Surely it would be easy in this day and age of HDD to have just the one copy and to add a small reference 'track' file to each of the albums where the track was removed. That way Slimserver would know that a particular track should be present in an album, and the reference file would point to where the track is actually located, in order to play it, or even to reconstruct the album if this was needed.

The only other way I can think of is to have just a single copy of every track, and to use the 'Album' tag to reference every album that track should be in. But then my preferred format of saving all R.E.M's albums in one folder, all Coldplay's albums in another folder etc., wouldn't be possible. Plus it would be a tagging nightmare.

*Obviously if another artist/s did a version of R.E.M's Everybody Hurts, such as The Corrs, then that copy would remain because it is different, as would an instrumental version be if R.E.M decided to do one.

Jim
2005-05-02, 11:32
CavesOfTQLT Wrote:
> Concerning duplicate tracks; if I've got R.E.M's Everybody Hurts* on one of their albums, and I've also got the same track on a compilation, and another on another compilation, that's three copies taking up space, whether that space is valuable or not.

I fully understand your situation, and I myself was thinking this the last time it came to buy a new hard drive or delete stuff.

But you seem to be using the phrase "duplicate" without giving an exact definition of what you mean. Duplicate audio in terms of MP3 or other lossy files is "Yeah, it's the same length, it starts the same, ends the same, sounds the same" - that's because the MP3 user has already decided "almost" is as acceptable enough for him as "same as" by definition of using a lossy format.

But you my friend have already made one wiser choice than most in using FLAC and accepting the extra HD space that comes with it for the benefit of proper CD quality audio.

Mr MP3 can say "I have 20,000 tracks, and none are dupes because I deleted all those". You can say "I have 20,000 tracks, maybe 5000 of them are pretty much the same but at least I know all my albums are lossless as I have not mixed and matched tracks with different md5 fingerprints".

The FLAC format itself can easily tell you if two files are duplicates by looking at the audio md5 fingerprints. Replacing carefully ripped tracks with versions that merely sound the same (and are not exactly the same) might free up some space, but it also contaminates your FLAC collection, and you no longer have bit-for-bit copies of some albums.

And also, whilst a saving of 5 gigabytes for a user with MP3's is a hell of a lot, for a FLAC user it's just 20 albums. With the time needed and the small saving (in terms of albums, not bytes) and the fact that you will be contaminating your lossless collection I ask is it worth it?

I applaud your idea, however I think it's more of a solution to gain space from a lossy collection. If ever I came across two FLAC files that were exactly the same I of course would like to use a feature like this, but as I said earlier with a growing collection of 20,000+ I haven't yet - and I do have about 20 different versions of some songs that sound the same to me and would no doubt be picked up by your MMM software as "dupes".

JJZolx
2005-05-02, 11:40
CavesOfTQLT Wrote:
> Concerning duplicate tracks; if I've got R.E.M's Everybody Hurts* on one of their albums, and I've also got the same track on a compilation, and another on another compilation, that's three copies taking up space, whether that space is valuable or not. Surely it would be easy in this day and age of HDD to have just the one copy and to add a small reference 'track' file to each of the albums where the track was removed. That way Slimserver would know that a particular track should be present in an album, and the reference file would point to where the track is actually located, in order to play it, or even to reconstruct the album if this was needed.
>
> The only other way I can think of is to have just a single copy of every track, and to use the 'Album' tag to reference every album that track should be in. But then my preferred format of saving all R.E.M's albums in one folder, all Coldplay's albums in another folder etc., wouldn't be possible. Plus it would be a tagging nightmare.
>
> *Obviously if another artist/s did a version of R.E.M's Everybody Hurts, such as The Corrs, then that copy would remain because it is different, as would an instrumental version be if R.E.M decided to do one.

Maybe I'm missing something, but when you start talking about compilation albums and storing multiple copies of tracks on the disk - isn't that pretty much what playlists are for?

I think you're asking a _lot_ from SlimServer as a music library manager. It may get there some day, but given the current capabilities and the complexities of a web application that could do something like this, you're probably looking at the wishlist for SlimServer version 12.0 or so.

Bruce Hartley
2005-05-02, 11:47
If this feature is going to go anywhere, I would like the "duplicate avoiding" links to be in the file system.

That way if you are going to rebuild the database from scratch you wouldn't lose anything.

I guess on a Linux server, I could remove a "duplicate" file and replace it with a link (hard or symbolic, can't remember) to another copy of the file.
As said, the tag for track number etc. would be wrong, but the track number in the file name could be correct.

In windows, the link could be a windows shortcut.
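
A minimal sketch of the file-system link idea, assuming a Linux server and a symbolic (rather than hard) link. The function name and the demo file names are invented for illustration; whether SlimServer follows such links when scanning is not something this thread confirms.

```python
import os
import tempfile


def replace_with_symlink(duplicate_path, master_path):
    """Delete duplicate_path and leave a symbolic link to master_path in its place."""
    master_path = os.path.abspath(master_path)
    if not os.path.isfile(master_path):
        raise FileNotFoundError(master_path)
    if os.path.samefile(duplicate_path, master_path):
        raise ValueError("duplicate and master are already the same file")
    os.remove(duplicate_path)
    os.symlink(master_path, duplicate_path)


# Demo in a throwaway directory with pretend audio files.
with tempfile.TemporaryDirectory() as root:
    master = os.path.join(root, "01 - Beat It.flac")
    dupe = os.path.join(root, "20 - Beat It.flac")
    for path in (master, dupe):
        with open(path, "wb") as f:
            f.write(b"pretend audio")
    replace_with_symlink(dupe, master)
    print(os.path.islink(dupe), os.readlink(dupe))
```

The link keeps its own file name, so a track number in the name stays correct for that album, while the embedded tags still come from the master copy. On Windows, creating symbolic links may need extra privileges, so this sketch is best read as Linux-only.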

Just a few random thoughts...............

CavesOfTQLT
2005-05-02, 12:34
Geez, this is getting deep and it needn't be.

Take Michael Jackson's Beat It song. As far as I know he's only ever done one version, and that's the one that appears on his Thriller album.

Now if that same track appears on a compilation, say Hits Of The 80s, and I happen to have both those albums in my collection, then I've got two copies of the same track (and these should be correctly tagged with the track title Beat It). The size of the file, length it is, recorded at 89dB or 93dB, etc. makes no difference - it's still the same Beat It track. So why have two copies of it on the HDD. Pick the best one and delete the other.

But doing this would mean one of the albums now has the Beat It track missing. So why not put a small reference file in that album telling Slimserver where the track is actually located, and what the track number was in that album, etc. Slimserver is already doing a lot of the work when it rebuilds the database, and I'm sure it would take just a small amount of additional programming to accomplish a system like this.
Further, if the reference file also included a text descriptor in it, say C/SlimServer/MultipleAlbumTrack, then it would be just as easy to use Windows Explorer to identify where the missing track is should you decide to manually copy/encode 'albums' later using that method.
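
To make the reference-file idea concrete, here is a rough sketch of what such a descriptor might hold. Everything in it is hypothetical: SlimServer has no such feature, and the JSON layout, the field names and the ".ref" extension are invented here. Only the marker text is taken from the post above.

```python
import json


def write_reference(ref_path, master_path, track_number, album):
    """Write a small descriptor that stands in for a removed duplicate track."""
    descriptor = {
        "marker": "C/SlimServer/MultipleAlbumTrack",  # text descriptor from the post
        "master": master_path,         # where the kept copy actually lives
        "track_number": track_number,  # position in the album the dupe came from
        "album": album,
    }
    with open(ref_path, "w", encoding="utf-8") as f:
        json.dump(descriptor, f, indent=2)


def resolve_reference(ref_path):
    """Return (master_path, track_number, album) read back from a descriptor file."""
    with open(ref_path, encoding="utf-8") as f:
        d = json.load(f)
    return d["master"], d["track_number"], d["album"]
```

For example, `write_reference("20 - Beat It.ref", "/music/Michael Jackson/Thriller/01 - Beat It.flac", 20, "Hits Of The 80s")` would leave a plain-text file that can be opened in any editor to see where the real track is. A scanner would still need to be taught to recognise such files, which is the part that needs changes inside the server.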

Anyway it was just an idea. I'm off now to drown my sorrows with some music. And no, it's not Beat It though I guess some of you are thinking that ;)

kdf
2005-05-02, 13:26
Quoting CavesOfTQLT <CavesOfTQLT.1of64b (AT) no-mx (DOT) forums.slimdevices.com>:

>
> and I'm sure it would take just a
> small amount of additional programming to accomplish a system like
> this.

Everyone says this, but it is never about how much code it may or may not take.
It's a small amount of code to make slimserver send a signal to start your
coffee in the morning too, but that's not the purpose of slimserver. If you
want to avoid duplicates, there is a vast array of other file managing tools
already out there that do a fine job. It's a fine idea, but likely not a
priority.

-kdf

Jim
2005-05-02, 13:36
CavesOfTQLT Wrote:
> The size of the file, length it is, recorded at 89dB or 93dB, etc. makes no difference - it's still the same Beat It track.

If it doesn't make a difference then why are you using FLAC?

Why not re-encode all your FLAC's to 320kbps MP3's. And you could additionally do this dupe-removing too to save a ton of space.

Don't just use FLAC because the assumption bigger=better or because these "audiophile" people all do it. If the statement you wrote above is correct then clearly FLAC is not for you, and you'd be wasting disk space using it.

Michaelwagner
2005-05-02, 15:43
Yeah, this is a tricky idea.

Perhaps your example of Beat It is correct, but I have three copies of "Billie Jean" - 2 by him - one was the original demo and one was the finished copy - and one by someone else completely.

If you ask the software to consider as duplicates all copies of Billie Jean by Michael Jackson, then you might end up with the demo version you don't like (or maybe you do). The point is, ID3 tags alone won't detect duplicates well, especially since the definition of duplicate is so subjective. I have several copies of Road Runner, by Bo Diddley, live, studio, recorded with someone else, etc. Lots of Clapton like that too. It's hard to assume they'll be tagged differently.

It's true there are some tracks, especially on compilation albums, that are pretty much the same to my ears, but I think it would be non-trivial to reliably make the same determination by computer that our ears would make.

At a guess, I think this would be the job for a program outside of SlimServer, one that would perhaps leave a file system in place as you described, that SlimServer would navigate properly. But the computation involved, if you could come up with an algorithm, would be massive and not appropriate for a server that also has to service hardware in real time.

IMHO.

Michael

CavesOfTQLT
2005-05-02, 15:50
I give up. I just don't know where this reference to file formats has come from because I'm on about songs and their repetitions in the music library, and not whether they're in FLAC, mp3, OGG or MA formats.

Anyway, just forget I brought up the subject.



Edit: This was posted without me seeing the reply ABOVE which came in just as I was typing this.

CavesOfTQLT
2005-05-02, 15:57
Michael, a quick reply to your post before I go. In cases like this where you've got slightly different versions of a song but they're all tagged the same, then I did mention in one of my posts above that you could listen to each 'duplicate' and decide which to keep and which to 'delete'.

Anyway, it's getting to the point now where I wish I'd never brought up the suggestion.

Jim Dibb
2005-05-03, 04:54
Caves, I'm with you. I can't see why there's any desire to have 3
FLAC copies of the same song off 3 different compilations either. If
you chose 1 as 'best' of the 3, why would you care if you had the
other two as long as the albums they came from played correctly
including that song? An interface of choosing 1 or more of the
duplicates to remain, and replacing all the others with a pointer file
which includes the tagging from the original song, and a pointer to
the copy of the file you wish to keep seems like a completely valid
thing to do.

"Jim" above is way off saying
>Don't just use FLAC because the assumption bigger=better or because
>these "audiophile" people all do it. If the statement you wrote above
>is correct then clearly FLAC is not for you, and you'd be wasting disk
>space using it.

FLAC is a compressed but lossless codec. That's it. Why would
someone want to give up FLAC just because they have the same studio
version of "Manic Monday" (or whatever) on the original album and
again on a "Best of the 80s" disk and don't care to have both taking
up space?

I'd like to hear why it's a waste of disk space to use it and then
delete all the true dups (not different artists, not different
arrangements. Just different CDs?). There's a few reasons why 'not'
to use FLAC. What is the reason to use FLAC?

I understand some of the comments above about how this might not be a
high priority thing for slimserver developers to work on. It might
not even be that useful for many people as they wouldn't have many
duplicates. The idea itself is not flawed though.


Bennett, Gavin (LDN Int)
2005-05-03, 05:06
My twopenn'orth......

I have spent a long time thinking about this for my own collection (MP3) and
found there are two ways of dealing with it:

1. Remove the dupes - as you guys have been discussing - and create
links.
Under Windows there are "hard links", which are close to Unix
hard links but only available through an API.
Hard links unfortunately do not work across partitions or disks.

2. Remove the duplicates from a "play session", i.e. when I select play
all "Rock", remove any duplicate tracks from the resulting virtual play
list. This would have to rely on Track Name and Artist matching and could
be optional.

....and of course....

3. Do nothing.

My preference would be (2) because:
a) It means when I play an album I will always get the tracks
from that album in the correct order.
b) Disk space is cheap compared with the hassle of running and
checking a de-dupe process.
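
Option (2) can be sketched in a few lines. This assumes each track in the play session is a dict with "artist" and "title" keys, which is an invented structure rather than anything SlimServer exposes. Matching is on lower-cased, whitespace-trimmed names only, so differently tagged copies will not be caught.

```python
def dedupe_session(tracks):
    """Keep the first track for each (artist, title) pair, preserving play order."""
    seen = set()
    result = []
    for track in tracks:
        key = (track["artist"].strip().lower(), track["title"].strip().lower())
        if key not in seen:
            seen.add(key)
            result.append(track)
    return result


session = [
    {"artist": "Michael Jackson", "title": "Beat It", "album": "Thriller"},
    {"artist": "R.E.M.", "title": "Everybody Hurts", "album": "Some Album"},
    {"artist": "Michael Jackson", "title": "beat it ", "album": "Hits Of The 80s"},
]
for track in dedupe_session(session):
    print(track["artist"], "-", track["title"], "(", track["album"], ")")
```

Nothing on disk is touched: the files and albums stay intact, and only the generated play list drops the repeats.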


Gavin

Jim
2005-05-03, 10:40
Jim Dibb Wrote:
> I'd like to hear why it's a waste of disk space to use it and then
> delete all the true dups (not different artists, not different
> arrangements. Just different CDs?).
I'd again like to hear why phrases like "dupe" and "the same" are being used so liberally with lossless audio.

Quick "is it the same?" audio test (for anyone, regardless of listening equipment - even my deaf granny can do this):



metaflac --show-md5sum "D:\Michael Jackson\Thriller\01 - Beat It.flac"

metaflac --show-md5sum "D:\Various\80's Crap Vol 7\20 - Beat It.flac"


Do the MD5's match? If yes then congratulations - you have found 2 tracks the same; I'm still yet to see this in my 20,000+ FLAC collection. Delete this track, it's the same and is wasting hard disk space. If the MD5's didn't match then it is not the same.
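
The same comparison can be scripted without metaflac. The sketch below reads the audio MD5 straight out of the STREAMINFO metadata block, which the FLAC format places first in the file: a 4-byte "fLaC" marker, a 4-byte block header, then 34 bytes of stream info whose last 16 bytes are the MD5 of the unencoded audio. The function names are made up for this example, and files with an ID3v2 tag prepended are not handled.

```python
UNSET_MD5 = "00" * 16  # an all-zero signature means the encoder did not store one


def read_streaminfo_md5(stream):
    """Return the audio MD5 (hex) from a binary stream positioned at the start of a FLAC file."""
    if stream.read(4) != b"fLaC":
        raise ValueError("not a FLAC stream (missing fLaC marker)")
    header = stream.read(4)
    if len(header) < 4:
        raise ValueError("truncated metadata block header")
    block_type = header[0] & 0x7F               # top bit is the last-block flag
    length = int.from_bytes(header[1:4], "big")
    if block_type != 0 or length < 34:
        raise ValueError("first metadata block is not a valid STREAMINFO")
    streaminfo = stream.read(length)
    if len(streaminfo) < 34:
        raise ValueError("truncated STREAMINFO block")
    return streaminfo[18:34].hex()              # MD5 occupies bytes 18..33


def flac_audio_md5(path):
    """Return the stored audio MD5 of the FLAC file at path."""
    with open(path, "rb") as f:
        return read_streaminfo_md5(f)


def same_audio(path_a, path_b):
    """True only if both files store the same, non-empty audio MD5."""
    a, b = flac_audio_md5(path_a), flac_audio_md5(path_b)
    return a != UNSET_MD5 and a == b
```

Calling `same_audio()` on the two Beat It files answers the "is it the same?" question in the strict bit-for-bit sense only. Two rips that merely sound alike will compare as different.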

Now if it's not the same and you delete/replace it then the album you have is no longer lossless, and to make matters worse and generate confusion you are storing it in a lossless format.

Time to choose how important lossless is to you. If it's quite acceptable to lose entire tracks of a CD as "the other one sounds the same", or "ReplayGain will sort out the volume difference", then by all means do it. But then ask yourself why you chose lossless, and if you are quite happy to listen to albums with different volumes or digitally applied amplitude adjustments then maybe you should do another FLAC<>320kbpsMP3 blind listening test, as you could save a lot of cash if you switch over to MP3 - you did do one in the first place, didn't you? You didn't just choose FLAC because "they" told you to, did you?



Jim Dibb Wrote:
> There's a few reasons why 'not'
> to use FLAC. What is the reason to use FLAC?

Your ears tell you to, you care about having a 1:1 backup of your entire album-for-album CD collection for when CD-rot kicks in.



Jim Dibb Wrote:
> The idea itself is not flawed though.


I know:


Jim Wrote:
> I applaud your idea, however I think it's more of a solution to gain space from a lossy collection. If ever I came across two FLAC files that were exactly the same I of course would like to use a feature like this, but as I said earlier with a growing collection of 20,000+ I haven't yet - and I do have about 20 different versions of some songs that sound the same to me and would no doubt be picked up by your MMM software as "dupes".

Jim Dibb
2005-05-03, 11:33
On 5/3/05, Jim <Jim.1ogvoz (AT) no-mx (DOT) forums.slimdevices.com> wrote:

> Time to choose how important lossless is to you, if it's quite
> acceptable to lose entire tracks of a CD as "the other one sounds the
> same", or "ReplayGain will sort out the volume difference" then by all
> means do it, but then ask yourself why you chose lossless and if you
> are quite happy to listen to tracks with different volumes or digitally
> applied amplitude adjustments then maybe you should do another
> FLAC<>320kbpsMP3 blind listening test as you could save a lot of cash
> if you switch over to MP3 - you did do one in the first place didn't
> you? You didn't just choose FLAC because "they" told you to did you?
>
Thanks for clarifying, Jim. I personally ripped all my CD's at 192k
MP3 rate and am considering doing it again differently. The volume
difference between different CDs was not something I had realized.

I felt sympathy to Caves of TQLT and as a supporter of the idea (and
how it borrows from other reference counted strategies) am glad to now
have more insight.

Sorry I missed this part.
> Jim Wrote:
> > *I applaud your idea*, however I think it's more of a solution to gain
> > space from a lossy collection.

Regards,
Another Jim (Dibb)

Jim
2005-05-03, 13:28
Jim Dibb Wrote:
> Thanks for clarifying, Jim. I personally ripped all my CD's at 192k
> MP3 rate and am considering doing it again differently. The volume
> difference between different CDs was not something I had realized.
>
> I felt sympathy to Caves of TQLT and as a supporter of the idea (and
> how it borrows from other reference counted strategies) am glad to now
> have more insight.


No problem, in later posts I might have come over as slightly patronising but that was because I was getting annoyed that nobody understood my "it's either lossless or it isn't" point of view. I never faulted the idea of the original poster, just how he was considering implementing it blindly on his (for now) lossless collection.

If you decide in your re-ripping to go the lossless route be careful, over time you might become as fanatical about every single Hz of audio as me :P

CavesOfTQLT
2005-05-03, 14:04
I see we're back onto comparing files together, rather than comparing the actual songs. Whether those songs are encoded as FLAC, OGG or any other format, and whether they have slight volume differences between the CDs they're ripped from, is totally moot. Yes one track may sound better for whatever reason and this would be the one to keep, but there would be no need to keep the others. That is, unless they're different versions of the same song; say Michael Jackson decided to re-release Beat It by using trumpets rather than drums.

I still can't see where all these references to comparing, say FLAC files together, to see if they're the same has come in. A checksum of a copy of a song off one album, is very unlikely to match a copy of the same song off another one, due to volume diffs, length of entry and exit silences, etc. But the songs are the same. So why have all these duplicate songs taking up valuable HDD space? Pick the best & bin the rest leaving a marker where the duplicates used to be to keep the database integrity intact. But hey, if people can afford to have lots and lots of HDD space then they wouldn't be bothered by an idea like this ;)

Anyway I'll leave it there.
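
To illustrate the checksum-versus-song point, here is a minimal Python sketch. It is not anything Slimserver does: the helper names and the tag fields (`artist`, `title`, `path`) are made up for illustration, and real tags would need to be read from the files by some tagging library. A byte-level checksum only matches when two files are bit-identical, so grouping on normalised tags is one possible way to find candidate dupes, which you would then still compare by ear.

```python
import hashlib
from collections import defaultdict


def file_checksum(data: bytes) -> str:
    """MD5 of the raw bytes; equal only when the bytes are identical."""
    return hashlib.md5(data).hexdigest()


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so trivial tag differences are ignored."""
    return " ".join(text.lower().split())


def find_duplicate_songs(tracks):
    """Group tracks by (artist, title) and return only groups with 2+ paths.

    `tracks` is a list of dicts with hypothetical keys: artist, title, path.
    """
    groups = defaultdict(list)
    for track in tracks:
        key = (normalise(track["artist"]), normalise(track["title"]))
        groups[key].append(track["path"])
    return {key: paths for key, paths in groups.items() if len(paths) > 1}


if __name__ == "__main__":
    # Two rips of the "same" song that differ by a little leading silence
    # are different byte streams, so their checksums differ.
    print(file_checksum(b"audio") == file_checksum(b"\x00audio"))

    library = [
        {"artist": "Michael Jackson", "title": "Beat It", "path": "Thriller/05.flac"},
        {"artist": "michael jackson", "title": "Beat It ", "path": "Hits/02.flac"},
    ]
    print(find_duplicate_songs(library))
```

Tag matching will also lump together genuinely different versions (live takes, remixes, remasters) if they share a title, which is why the sketch only lists candidates and deletes nothing.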

Ben Sandee
2005-05-03, 14:12
> I still can't see where all these references to comparing, say FLAC
> files together, to see if they're the same has come in. A checksum of a
> copy of a song off one album, is very unlikely to match a copy of the
> _same_ song off another one, due to volume diffs, length of entry and
> exit silences, etc. But the songs are the same. So why have all these
> duplicate songs taking up valuable HDD space? Pick the best & bin the

"valuable HDD space" ?? The storage for a 30 MB FLAC encoded song
costs WELL under a nickel these days.

I'm beginning to think maybe you are just a very persistent and well
disguised troll because you seem so naively well-intentioned....

Ben

Jim
2005-05-03, 14:26
I see we're back onto comparing files together, rather than comparing the actual songs.

Yeah, best stop doing that.

Well, I'm sorry CavesOfTQLT you were right. I've seen the light and I now have over 700 gigabytes of hard disk space, and a few thousand bucks in my pocket too.

I've just sold my hi-fi system, dumped my FLAC files and will just be using my iPod with its 128kbps AACs.

The songs are the same.

Anyone wanna buy a SB1?

A great compliment to FLAC though, and nice to see it has matured to a point now where people who don't even know why they are using it are using it :D

Ben
2005-05-03, 23:06
No problem, in later posts I might have come over as slightly patronising but that was because I was getting annoyed that nobody understood my "it's either lossless or it isn't" point of view. I never faulted the idea of the original poster, just how he was considering implementing it blindly on his (for now) lossless collection.

If you decide in your re-ripping to go the lossless route be careful, over time you might become as fanatical about every single hz of audio as me :P

I almost hate jumping into this thread, but I guess it depends on your reason for using FLAC (or any lossless codec). If it's to archive your album collection, then you're right, It's either lossless or it isn't (Why do I have a sneaking suspicion you're a cuesheet guy? ;) ). However, if it's just to have the best sounding individual tracks you can, then having the whole album doesn't matter.

I rip to FLAC because I can hear a difference between it and MP3. I'm not trying to make backup copies of my albums, though. Heck, I delete tracks from the ripped albums that I know I don't like... For the tracks I do like, though, I want them to be lossless.

I can't remember the last time I played through an album anyway. I always play through on shuffle by song. So, I've manually deleted 'dupes' (whether exact or not, the same studio recording of a song) when I have them on both a compilation album and the original album. Since my Slimserver is also my main desktop machine, I would like to save drive space if possible. Maybe the original poster and I are in the minority there, I guess.

Ben

CavesOfTQLT
2005-05-04, 01:57
... So, I've manually deleted 'dupes' (whether exact or not, the same studio recording of a song) ...

Ben
And in my original post the removing of these 'exact' copies was all I was on about, with Slimserver (or even some other 3rd party program) just putting a reference file into the album where the dupe got deleted. Shame the thread got spoilt by sarcasm being brought into it.
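
For what it's worth, the "reference file" idea could be sketched roughly like this in Python. Everything here is an assumption for illustration: the `.ref` suffix, the JSON layout and the function name are invented, and Slimserver would not understand such a marker without a plugin or scanner change. The sketch just moves the duplicate into a 'deleted' bin and leaves a small marker pointing at the copy that was kept.

```python
import json
import shutil
from pathlib import Path


def replace_with_reference(duplicate: Path, keeper: Path, bin_dir: Path) -> Path:
    """Move `duplicate` into `bin_dir` and leave a marker file in its place.

    The marker is a tiny JSON file (hypothetical format) naming the kept copy
    and where the removed file went, so the move can be undone later.
    """
    bin_dir.mkdir(parents=True, exist_ok=True)
    binned = bin_dir / duplicate.name
    # Note: a file with the same name already in the bin may be overwritten;
    # a real tool would need unique names here.
    shutil.move(str(duplicate), str(binned))

    marker = duplicate.with_name(duplicate.name + ".ref")
    marker.write_text(json.dumps({"plays": str(keeper), "binned": str(binned)}))
    return marker
```

Nothing is deleted for good: emptying the bin stays a separate, manual step, as in the original proposal.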

Jim
2005-05-04, 05:44
I almost hate jumping into this thread, but I guess it depends on your reason for using FLAC (or any lossless codec). If it's to archive your album collection, then you're right, It's either lossless or it isn't (Why do I have a sneaking suspicion you're a cuesheet guy? ;) ).

I love drooling over cuesheets almost as much as EAC log files :D



However, if it's just to have the best sounding individual tracks you can, then having the whole album doesn't matter.


Nothing wrong with that. Ass covered in post #9:


If you're going to start removing files you may as well get rid of any traditional concept of a compilation - just store all tracks from compilations as "singles" on your server - now you can define your compilations and no longer have the 80-minute limit of a CD and can at last remove those embarrassing tracks that sometimes slip out during a random shuffle.


But I still believe it's either delete or keep. Don't try to replace.

John Hernandez
2005-05-05, 09:05
Jim wrote:
>
> Time to choose how important lossless is to you, if it's quite
> acceptable to lose entire tracks of a CD as "the other one sounds the
> same", or "ReplayGain will sort out the volume difference" then by all
> means do it, but then ask yourself why you chose lossless and if you
> are quite happy to listen to tracks with different volumes or digitally
> applied amplitude adjustments then maybe you should do another
> FLAC<>320kbpsMP3 blind listening test as you could save a lot of cash
> if you switch over to MP3 - you did do one in the first place didn't
> you? You didn't just choose FLAC because "they" told you to did you?
>

I didn't do any blind listening tests. I chose FLAC primarily because
it's future-proof in the sense that I can liberally transcode to other
formats. I also chose it because it's OSS and not patent-encumbered.
And because it enables gapless playback. I'm not ashamed to admit I
chose FLAC based on information "they" told me. Come to think of it, I
haven't even read the license or validated an uncompressed FLAC against
the original wav file.

On the other hand, I agree 100% with the many potential pitfalls of
duplicate removal you point out, including problems with gaps, volume
differences, different sound due to digital mastering & transfer
quality, tainting of a lossless collection, etc. Most of these pitfalls
apply to MP3 collections, too. That said, it may work for those who
don't tend to listen in "album" mode.

oreillymj
2005-05-30, 05:32
Okay I'm not a big fan of the software, but for tag management and the "Show Duplicates" option, iTunes 4.8 is great, and free.

But I've stuck with the SQLite Db for track tag info and disabled the iTunes plugin as it was hammering my machine with constant unneeded refreshes.