Data Duplication



scalesr1
2005-08-13, 00:54
Hello.



My music library has grown hideously over the years and I am starting to
sort it out - Currently standing at around 250GB, 47K songs etc.



One problem I have is that on occasion I find that I have an album in two
places - when I play the album, it plays each track twice. If I check the UI
- I can see that each track is listed twice, I can then easily find out
where each one is hiding, decide which one to remove and that problem is
sorted.



My question is - is there a relatively easy way / tool / trick to seek out
all such duplications and if so how / where should I look.



I am running SlimServer 6.1 - would I be able to use my favourite database
environment Filemaker pro to hook into the SQL database in some way? - If so
I could then do it in filemaker I would guess.



Kind regards



Richard Scales

pfarrell
2005-08-13, 09:13
On Sat, 2005-08-13 at 08:54 +0100, Richard Scales wrote:
> My question is - is there a relatively easy way / tool / trick to seek
> out all such duplications and if so how / where should I look.

Define "easy".
But first, define "duplicate". If the album has the same name,
then it is very easy in SQL.
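
For the exact-match case, a rough sketch of the kind of query involved. It runs against a throwaway in-memory SQLite table; the table name `tracks` and its columns are assumptions made up for illustration, not the real SlimServer schema, so check the actual tables before adapting it.

```python
import sqlite3

# Hypothetical schema for illustration only; the real SlimServer tables
# and column names may differ, so inspect the actual database first.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (artist TEXT, album TEXT, title TEXT, path TEXT)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?, ?)",
    [
        ("Artist A", "Album X", "Song 1", "/music/a/x/01.mp3"),
        ("Artist A", "Album X", "Song 1", "/music/old/x/01.mp3"),
        ("Artist A", "Album Y", "Song 1", "/music/a/y/01.mp3"),
    ],
)


def duplicate_tracks(conn):
    """Return (artist, album, title, copies) for tracks listed more than once."""
    query = """
        SELECT artist, album, title, COUNT(*) AS copies
        FROM tracks
        GROUP BY artist, album, title
        HAVING COUNT(*) > 1
        ORDER BY artist, album, title
    """
    return conn.execute(query).fetchall()


dupes = duplicate_tracks(conn)
for row in dupes:
    print(row)
```

Adding `path` to a follow-up query for each reported album would show where the two copies live.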

If the names vary slightly, it is much harder, although by
no means impossible.
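
One crude way to handle slightly varying names is to normalize them before comparing. The sketch below lowercases each name, strips punctuation, and sorts the words so that word order does not matter. The helper names `normalize` and `group_near_duplicates` are made up for this example, and the heuristic will miss misspellings and can lump together names that merely share the same words.

```python
import re
from collections import defaultdict


def normalize(name):
    """Lowercase, drop punctuation, and sort the words so word order is ignored."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(sorted(words))


def group_near_duplicates(names):
    """Return lists of names that normalize to the same key (two or more each)."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


albums = ["Abbey Road", "abbey road!", "Road, Abbey", "Let It Be"]
for group in group_near_duplicates(albums):
    print(group)
```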

Or do you mean some are .mp3 and some .flac and some .wma?

> I am running SlimServer 6.1 - would I be able to use my favourite
> database environment Filemaker pro to hook into the SQL database in
> some way? - If so I could then do it in filemaker I would guess.

You need the ODBC driver to talk to the SlimServer database.
Once that is set up, do whatever SQL you want. Or even
write Java/Perl/PHP to talk to the database.

You may want to look at the calculated hash for each song, if you
are interested in song-level duplication.
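
As an illustration of the hash idea, here is a sketch that computes its own SHA-256 digest of each file's contents rather than reading any hash stored in the SlimServer database. That is a swapped-in technique, and it only catches byte-identical files: two rips of the same song with different tags or encodings will hash differently. The function names and the temporary demo files are assumptions for the example.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical_files(root):
    """Return sorted lists of paths under root whose contents are byte-identical."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_digest[file_digest(path)].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


# Small demo on throwaway files instead of a real music library.
with tempfile.TemporaryDirectory() as root:
    samples = [
        (("a", "01.mp3"), b"same bytes"),
        (("b", "01 copy.mp3"), b"same bytes"),
        (("a", "02.mp3"), b"other bytes"),
    ]
    for parts, data in samples:
        path = os.path.join(root, *parts)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as handle:
            handle.write(data)
    demo = [[os.path.basename(p) for p in group]
            for group in find_identical_files(root)]

print(demo)
```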

--
Pat
http://www.pfarrell.com/music/slimserver/slimsoftware.html

scalesr1
2005-08-14, 00:56
By 'easy' I mean something that I can do in a couple of hours - the
alternative being to go through every artist (2500 of them) and look for
duplicate album entries.

By 'duplicate' I mean that I simply want to locate multiple instances of the
same album for a given artist.

Can you give me any pointers as to where I might look to configure ODBC
drivers - mysql.com have an odbc driver download which I will look at - from
memory I believe that once the odbc driver is configured and a data source
added, any odbc compliant app can access the data - does this sound right to
you?

Kind regards

Richard



pfarrell
2005-08-14, 18:21
On Sun, 2005-08-14 at 08:56 +0100, Richard Scales wrote:
> By 'easy' I mean something that I can do in a couple of hours - the
> alternative being to go through every artist (2500 of them) and look for
> duplicate album entries.

Ok, then there are lots of easy approaches.


> By 'duplicate' I mean that I simply want to locate multiple instances of the
> same album for a given artist.

So you mean albums that have the same tag data, an exact match.

Not things like the same album in both MP3 and FLAC, or
"Muti, Beethoven No. 9, Philadelphia Symphonic Orchestra" versus
"Muti, Philadelphia Symphonic Orchestra, Beethoven No. 9".

If you have consistent file structures, you can even just force
a 'dir' command out to a file, sort, and then look for dups.
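
The same directory-listing idea can be sketched in a script instead of by eye. This assumes each album sits in its own folder and that a duplicate shows up as the same folder name in more than one place; the function name and the demo folder layout are invented for the example.

```python
import os
import tempfile
from collections import defaultdict


def duplicate_album_dirs(root):
    """Map lowercased folder name -> sorted relative paths, for folder names
    that hold files in more than one place under root."""
    seen = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        if filenames:
            key = os.path.basename(dirpath).lower()
            seen[key].append(os.path.relpath(dirpath, root))
    return {name: sorted(paths) for name, paths in seen.items() if len(paths) > 1}


# Demo layout: "Album X" exists under two different parent folders.
with tempfile.TemporaryDirectory() as root:
    for parts in [("Artist A", "Album X"), ("Unsorted", "Album X"), ("Artist A", "Album Y")]:
        folder = os.path.join(root, *parts)
        os.makedirs(folder)
        with open(os.path.join(folder, "01.mp3"), "wb") as handle:
            handle.write(b"demo")
    dup_dirs = duplicate_album_dirs(root)

for name, paths in dup_dirs.items():
    print(name, paths)
```

Matching on the folder name alone will also flag unrelated albums that share a title (for example "Greatest Hits"), so treat the output as a list of candidates to check.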


> Can you give me any pointers as to where I might look to configure ODBC
> drivers - mysql.com have an odbc driver download which I will look at - from
> memory I believe that once the odbc driver is configured and a data source
> added, any odbc compliant app can access the data - does this sound right to
> you?

Yes, once you have ODBC drivers, you are golden and can use
nearly anything, even Excel. But I'm not current on Windows platforms
with assorted drivers, so I am sorry, I can't help on that part.

Last time I looked, there was an ODBC driver applet in the Windows
Control Panel. You specify the basics there.

It is probably faster to get someone more facile with Windows than me,
but if you are stuck, email me off-list and I'll try to help.

--
Pat
http://www.pfarrell.com/music/slimserver/slimsoftware.html

mac
2005-08-14, 20:32
Assuming windows:

1. Install a SQLite ODBC driver. This one seems to work well: http://www.ch-werner.de (http://www.ch-werner.de/sqliteodbc/sqliteodbc.exe)

2. Install a query tool that allows you to use an ODBC source. This one seems to work well: http://www.unitysolutions.com/utbdirect (http://www.download.com/Universal-Table-Browser/3000-2065-10286008.html?part=dl-Universal&subj=dl&tag=button)

Configure the ODBC driver as shown here (http://lowfat.sytes.net/~mike/slimserver/odbc.jpg). When you open the query tool, select the data source that you just configured. You can now issue whatever SQL statements (http://lowfat.sytes.net/~mike/slimserver/utb.jpg) you want against your SlimServer database.
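
Before writing queries it helps to see what tables exist. The sketch below lists them from SQLite's own `sqlite_master` catalog. It uses Python's built-in `sqlite3` module on an in-memory demo database rather than ODBC, which is a substitution for the setup above; pointing it at the real database file assumes that file is in a format `sqlite3` can open, and the `demo_tracks` table is invented for the example.

```python
import sqlite3


def list_tables(conn):
    """Return (name, create_statement) for every table in the database."""
    return conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()


# Demo database; replace ":memory:" with the path to a real SQLite file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_tracks (id INTEGER PRIMARY KEY, title TEXT)")

tables = list_tables(conn)
for name, create_sql in tables:
    print(name)
    print(create_sql)
```

The same `SELECT ... FROM sqlite_master` statement can be pasted into any query tool connected to a SQLite data source.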

Hope this helps.