Tag Database Evolution + Streaming formats other than mp3 (bug 131)



Richard Purdie
2004-04-12, 13:38
I've been thinking about the tag database. The next logical step in its
evolution would be to merge infoCache and infoCacheDB. The main motivation
would be code simplification and reduced memory overhead. Is that something
worth doing? If so, should I have a look into it? It will mean fairly big
changes to Info.pm.

Secondly, bug 131 is still in the back of my mind. stream.mp3 returns an MP3
stream, which is fine. How do you stream WAV data, though? Remove the headers
or not? What should ogg, flac, aiff etc. stream? The simplest solution is just
to stream the raw files spliced together. I guess we could make a special case
for .pcm, which would get the same type of stream as the Squeezebox? That does
make the mp3 stream slightly different from the rest, however.

Most of the above is in the patch I've previously submitted. The only piece
missing is dealing with silence. For that we probably need some silence files
in the different formats...

RP

dean
2004-04-12, 14:11
Hi Richard,

On Apr 12, 2004, at 1:38 PM, Richard Purdie wrote:
> I've been thinking about the tag database. The next logical step in its
> evolution would be to merge infoCache and infoCacheDB. The main motivation
> would be code simplification and reduced memory overhead. Is that
> something worth doing? If so, should I have a look into it? It will mean
> fairly big changes to Info.pm.
I'd like to see somebody take a crack at fixing up Info.pm to use a
more generic database API. Then fold infoCache and infoCache.db into a
simple persistent hash backend.
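For illustration only (not code from the thread): a minimal sketch of what such
a layer might look like, with a hypothetical package name and a plain hash
standing in for the pluggable backend.

  # Hypothetical Slim::DataStore API; package and method names are
  # illustrative, not actual SlimServer code.
  package Slim::DataStore;
  use strict;

  sub new {
      my ($class, %args) = @_;
      # The backend could be an in-memory hash, a tied Berkeley DB hash,
      # or an RDBMS adapter - callers never see the difference.
      return bless { backend => $args{backend} || {} }, $class;
  }

  # Look up the cached attribute hash for one track, keyed by URL.
  sub get_track {
      my ($self, $url) = @_;
      return $self->{backend}{$url};
  }

  # Store or update a track's attribute hash.
  sub set_track {
      my ($self, $url, $attributes) = @_;
      return $self->{backend}{$url} = $attributes;
  }

  1;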

And yes, it's a fairly substantial effort. :)

> Secondly, bug 131 is still in the back of my mind. stream.mp3 returns an
> MP3 stream, which is fine. How do you stream WAV data, though? Remove the
> headers or not? What should ogg, flac, aiff etc. stream? The simplest
> solution is just to stream the raw files spliced together. I guess we
> could make a special case for .pcm, which would get the same type of
> stream as the Squeezebox? That does make the mp3 stream slightly
> different from the rest, however.
It probably makes sense to enable:

http://server:9000/stream.wav
http://server:9000/stream.aif
http://server:9000/stream.flc
http://server:9000/stream.ogg

And use the Source.pm framework to support the transcoding as necessary.
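As a rough sketch (not the actual server code), the HTTP side could map the
requested stream.* extension to a content type before handing the connection
to the Source.pm conversion machinery; the MIME types below are the commonly
used ones, and the helper name is invented.

  # Hypothetical mapping from the stream.* extension to a Content-Type.
  my %stream_types = (
      mp3 => 'audio/mpeg',
      wav => 'audio/x-wav',
      aif => 'audio/x-aiff',
      flc => 'audio/x-flac',
      ogg => 'application/ogg',
  );

  # Return the Content-Type for a request path such as "/stream.flc",
  # or undef if the extension isn't one we stream.
  sub content_type_for {
      my ($path) = @_;
      my ($ext) = $path =~ /\.(\w+)$/;
      return unless defined $ext;
      return $stream_types{ lc $ext };
  }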

> Most of the above is in the patch I've previously submitted. The only
> piece missing is dealing with silence. For that we probably need some
> silence files in the different formats...
Absolutely...

-dean

Dan Sully
2004-04-12, 14:21
* dean blackketter <dean (AT) slimdevices (DOT) com> shaped the electrons to say...

>I'd like to see somebody take a crack at fixing up Info.pm to use a
>more generic database API. Then fold infoCache and infoCache.db into a
>simple persistent hash backend.
>
>And yes, it's a fairly substantial effort. :)

On this note - here are some items I've thought up along the way:

* Abstract "filesystem" - ask a "provider" for its list of entries through a
clearly defined API (see the sketch after this list).
- pluggable backends: files, RDBMS.

* Make more things objects. Lose a lot of the wacky public data structures;
make them a black box to the callers.

* Optimize performance (dprof/strace) - we are doing a *lot* of system calls.

* Data-driven where possible - e.g. web setup.
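A hypothetical sketch of such a provider interface, with a filesystem backend
(all names invented for illustration):

  # Hypothetical Slim::Provider::Filesystem: any backend (filesystem,
  # RDBMS, ...) only has to answer "what entries live under this point?".
  package Slim::Provider::Filesystem;
  use strict;
  use File::Spec;

  sub new {
      my ($class, $root) = @_;
      return bless { root => $root }, $class;
  }

  # Return the entries directly under $dir, relative to the root.
  sub entries {
      my ($self, $dir) = @_;
      my $path = File::Spec->catdir($self->{root}, defined $dir ? $dir : '');
      opendir(my $dh, $path) or return;
      my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
      closedir($dh);
      return @entries;
  }

  1;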

I've been slowly working on some other things such as Ogg streaming (anyone
know how to deal with XS + MSVC?) and unifying the Formats.

-D
--
Sir, are you classified as human?
Uhh, negative, I am a meat popsicle.

Richard Purdie
2004-04-12, 14:38
Dean:
> I'd like to see somebody take a crack at fixing up Info.pm to use a
> more generic database API. Then fold infoCache and infoCache.db into a
> simple persistent hash backend.

So when you say "generic database API" you mean between Info.pm and the rest
of the SlimServer code?

If so, we're thinking along the same lines.

> And yes, it's a fairly substantial effort. :)

:)

re [streaming formats other than mp3 (bug 131)]
> It probably makes sense to enable:
>
> http://server:9000/stream.wav
> http://server:9000/stream.aif
> http://server:9000/stream.flc
> http://server:9000/stream.ogg
>
> And use the Source.pm framework to support the transcoding as necessary.

The patch lying around does that. My question was: should the files just be
joined one after another into a stream, headers intact? What should happen
when you fast forward, rewind, etc.? There's a big can of worms here - I've
stopped working on that patch until I work out what the correct way forward
is...

RP

dean
2004-04-12, 16:07
On Apr 12, 2004, at 2:38 PM, Richard Purdie wrote:

> Dean:
>> I'd like to see somebody take a crack at fixing up Info.pm to use a
>> more generic database API. Then fold infoCache and infoCache.db into a
>> simple persistent hash backend.
>
> So when you say "generic database API" you mean between Info.pm and the
> rest of the SlimServer code?
>
> If so, we're thinking along the same lines.
Actually, what I was thinking was to make Info.pm a layer that talks to a
generic database API and does whatever music-specific work is needed.

> re [streaming formats other than mp3 (bug 131)]
>> It probably makes sense to enable:
>>
>> http://server:9000/stream.wav
>> http://server:9000/stream.aif
>> http://server:9000/stream.flc
>> http://server:9000/stream.ogg
>>
>> And use the Source.pm framework to support the transcoding as
>> necessary.
>
> The patch lying around does that. My question was: should the files just
> be joined one after another into a stream, headers intact? What should
> happen when you fast forward, rewind, etc.? There's a big can of worms
> here - I've stopped working on that patch until I work out what the
> correct way forward is...
We do much of this already for Squeezebox so it should be applicable to
an HTTP client as well.

Richard Purdie
2004-04-13, 03:39
dean blackketter:
> Actually, what I was thinking was to make Info.pm a layer that talks to a
> generic database API and does whatever music-specific work is needed.

I wondered if that was what you meant.

My concern is that at the moment we access data in memory. It may not be many
things, but it is fast. The code isn't optimised to get the data it wants in
one go - instead we have lots of accesses to the memory hash as and when
individual elements of data are needed. Databases are slow compared to memory,
and I think we'd see one heck of a performance hit by having all the data only
accessible via SQL (for example).

How do you see this API working? Are you still thinking of letting the memory
hash continue to exist as a buffer between the storage medium (database) and
the SlimServer? Only writes would then have to be passed through to the slow
layer. Or do you think the database will be fast enough that we can lose the
memory hash? Or do we look at just caching recently accessed data in Info.pm?
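To make the "memory hash as a buffer" option concrete, one possible shape is a
write-through cache in front of the slower store - reads come from memory once
fetched, and only writes touch the slow layer. This is only a sketch; the
package name and the backend's get/set interface are assumptions.

  # Hypothetical write-through cache in front of a slow backend.
  package Slim::CachedStore;
  use strict;

  sub new {
      my ($class, $backend) = @_;
      return bless { backend => $backend, cache => {} }, $class;
  }

  sub get {
      my ($self, $key) = @_;
      # Fast path: serve from the in-memory hash if we have it.
      return $self->{cache}{$key} if exists $self->{cache}{$key};
      # Slow path: fetch from the backend once, then remember it.
      return $self->{cache}{$key} = $self->{backend}->get($key);
  }

  sub set {
      my ($self, $key, $value) = @_;
      $self->{cache}{$key} = $value;        # keep the fast copy current
      $self->{backend}->set($key, $value);  # only writes hit the slow layer
      return $value;
  }

  1;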

Also, are you thinking of allowing more than one storage medium to be in use
at once?

Half the battle is understanding what everyone wants from the changes...

> > re [streaming formats other than mp3 (bug 131)]
> >> It probably makes sense to enable:
> >>
> >> http://server:9000/stream.wav
> >> http://server:9000/stream.aif
> >> http://server:9000/stream.flc
> >> http://server:9000/stream.ogg
> >>
> >> And use the Source.pm framework to support the transcoding as
> >> necessary.
> >
> We do much of this already for Squeezebox so it should be applicable to
> an HTTP client as well.

The main problem is that there is no out-of-band method to pass the stream
format with AIFF and WAV.
RP

Steve Baumgarten
2004-04-13, 08:34
> My concern is that at the moment we access data in memory. It may not be
> many things, but it is fast. The code isn't optimised to get the data it
> wants in one go - instead we have lots of accesses to the memory hash as
> and when individual elements of data are needed. Databases are slow
> compared to memory, and I think we'd see one heck of a performance hit by
> having all the data only accessible via SQL (for example).

A generic database API layer is nice, but it might also be worth considering
something like tied hashes with a Berkeley DB holding the tag database. This
gives you the best of many worlds (a minimal sketch follows the list):

o Simple to implement; barely more code than what exists right now

o Fast access to data in hashes; Perl's tie interface handles fetches
and writes to the Berkeley DB automatically

o Existing code doesn't change

o Scales well; memory footprint doesn't increase with size of tags DB
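The tie in question, roughly (file name and keys are just examples):

  # Tie the tag cache to an on-disk Berkeley DB file: reads and writes on
  # %infoCache then go through DB_File instead of living only in memory.
  use strict;
  use DB_File;
  use Fcntl qw(O_CREAT O_RDWR);

  our %infoCache;
  tie %infoCache, 'DB_File', 'infocache.db', O_CREAT | O_RDWR, 0644, $DB_HASH
      or die "Cannot tie infocache.db: $!";

  $infoCache{'/music/example.mp3'} = 'Example Title';   # written to disk
  print $infoCache{'/music/example.mp3'}, "\n";         # read back
  untie %infoCache;

One caveat: plain DB_File stores flat scalar values, and the existing cache
keeps a nested hash of tags per track, so something like MLDBM (or explicit
serialization) would be needed on top of the tie.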

On the other hand, it wouldn't be a general solution that would handle any
and all database needs for slimserver.pl. Simpler and quicker to
implement; a reasonable amount of bang for the buck; but not what you'd
get from an overall rewrite that would handle all data storage/retrieval
needs.

SBB

Frédéric Miserey
2004-04-14, 08:07
Berkeley DB now has a native XML product. As lots of Slim's info is imported
from XML, this might be a good lead.

Is there any relational data in the SlimServer? If not, there is no need to go
for an RDBMS. The best example is LDAP servers: Berkeley DB (the non-XML
version) code is usually in there, and it handles hundreds of millions of
records.

Frédéric

On 13 Apr 2004, at 17:34, Steve Baumgarten wrote:

> A generic database API layer is nice, but it might also be worth
> considering something like tied hashes with a Berkeley DB holding the tag
> database.

Caleb Epstein
2004-04-14, 08:12
On Wed, Apr 14, 2004 at 05:07:01PM +0200, Frédéric Miserey wrote:

> Berkeley DB now has a native XML product. As lots of Slim's info is
> imported from XML, this might be a good lead.

It also adds a whole bunch of dependencies (Xerces, Pathan)
which might be overkill. There also seem to be no Perl
bindings (yet).

I'm not sure if I'm the only one, but I've found dealing with
XML in Perl (or really getting all the right modules installed
to do same) is a MAJOR PITA. Why is this so?

--
Caleb Epstein | bklyn . org | It's not whether you win or lose but how you
cae at | Brooklyn Dust | played the game.
bklyn dot org | Bunny Mfg. | -- Grantland Rice

Frédéric Miserey
2004-04-14, 08:18
Isn't Xerces-Perl doing the right thing for you? It seems a little behind
compared to its C++/Java twins, but does it hurt?

On 14 Apr 2004, at 17:12, Caleb Epstein wrote:

> I'm not sure if I'm the only one, but I've found dealing with
> XML in Perl (or really getting all the right modules installed
> to do same) is a MAJOR PITA. Why is this so?

Caleb Epstein
2004-04-14, 08:38
On Wed, Apr 14, 2004 at 05:18:26PM +0200, Frédéric Miserey wrote:

> Isn't Xerces-Perl doing the right thing for you? It seems a little behind
> compared to its C++/Java twins, but does it hurt?

Only when it core dumps (!)

I have tended to use the "simpler" XML parsers like
XML::Parser::EasyTree which are based on libxml IIRC.

--
Caleb Epstein | bklyn . org | Ben, why didn't you tell me?
cae at | Brooklyn Dust | -- Luke Skywalker
bklyn dot org | Bunny Mfg. |

Steve Baumgarten
2004-04-14, 08:43
> I'm not sure if I'm the only one, but I've found dealing with
> XML in Perl (or really getting all the right modules installed
> to do same) is a MAJOR PITA. Why is this so?

Because you're not using XML::Simple, perhaps? With XML::Simple it's literally
one line of code to slurp an XML file into native Perl hashes/arrays. (For
extra-large files, where it's not appropriate to slurp the whole thing in,
there are other ways to go about it that require a little more effort.)
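The one-liner in question, for reference (the file name is just an example):

  use strict;
  use XML::Simple;

  # Slurp an XML file into nested Perl hashes/arrays.
  my $data = XMLin('playlist.xml');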

Between that and XML::Mini::Document (to write XML), I've found XML parsing
and generation in Perl to be a breeze.

Back on topic for a moment, I agree that database access should be simple
and quick; involving XML is not generally the way to achieve that.

SBB