Home of the Squeezebox™ & Transporter® network music players.
Page 1 of 14 12311 ... LastLast
Results 1 to 10 of 134
  1. #1
    Senior Member erland's Avatar
    Join Date
    Dec 2005
    Location
    Sweden
    Posts
    11,039

    Thoughts regarding new database schema ?

    Has anyone thought about the principles in the new database schema proposed in:
    http://bugs.slimdevices.com/show_bug.cgi?id=8303

    As I understand it needs to support:
    - Storage of custom tags configured by a user
    - Possibility to assign a track to one or serveral user profiles

    I was thinking about making some experiments related to this enhancement request based on the work I've done in Custom Scan and Multi Library plugin, but I don't want to spend the time doing it if there already is a plan/design available in someones head.

    If no one have thought about this new database schema and no one has the time to think about it at the moment, I would prefer if you just say so instead of just being quiet.

    Details about Custom Scan and Multi Library implementation:

    1.
    Custom Scan plugin stores all custom tags in a single table that contains the track identifier (from the tracks table) and the custom tag. So each track will typically be represented by several rows in this table. The disadvantage with this solution is that this table can get pretty large if a lot of custom tags are used, in a 50 000 track library with 10 custom tags per track it results in a 500 000 rows table. The advantage is that you can basically support any categorization information, with support for external third party scanning modules the information doesn't even have to be custom tags.

    I've seen some performance problem in Custom Scan with a solution like this, but I suspect some of them is based on the fact that strings are sometimes used to join things instead of creating unique integer ids.

    So, do you think a single table like this will be a good solution for this enhancement request or are you thinking something totally different ?


    2.
    Multi Library plugin has a similar approach where there is a single table that contains the relation between a track identifier (from the tracks table) and a
    library identifier. So a single track can be assigned to one or several libraries.

    Is it something like this you have in mind for solving the "per user" part of this enhancement request ?
    Erland Isaksson (My homepage)
    Developer of many plugins/applets

  2. #2
    Senior Member gharris999's Avatar
    Join Date
    Apr 2005
    Location
    Santa Fe, NM
    Posts
    3,509
    It's true that I haven't really thought this through, and also, I'm functioning without the benefit of caffeine at the moment. But that said, as a classical music enthusiast, the present state of the SC db allows me to only minimally characterize and classify my library. There currently is no sense of "richness" in terms of SC's browsing options. It has always felt to me to be very one dimensional and inflexible.

    The mere fact that the UI doesn't allow one to set a default to browse by genre, composer, album has forced me to tag my entire collection (~40k tracks) with ARTIST==composer rather than ARTIST==performer. I've attempted to get around that limitation by including performer names in the album titles so I can at least search for them. But that still seems like a horrible kludge.

    This is a serious, almost laughable hole in SC's capabilities, in my opinion. I imagine that for many classical music enthusiasts, SC's UI continues to strike them as "toy like."

    Even the "various artists" support has never worked for me. It's always returned unpredictable or just plain wrong results. For every major SlimServer/SqueezeCenter release, I've tried it, kicked the tires and ended up turning it off.

    All that bashing aside, the SC UI does work for me, in a minimal sense, and I acknowledge that the developers have been very thoughtful in their adding of features over time. I don't want to come off as unduly negative here.

    But it's been a frustrating experience for me for the past 4 years to have to consistently and awkwardly shoehorn my tagging scheme into what seems to be an obviously pop-music-centric db design.

    I do have every intention of giving your custom browse plugin a serious try-out later this summer. Perhaps it will provide me with the browsing options I've been missing and inspire me to re-think my tagging scheme.

    But I do think that the time is ripe for a fundamental re-think of SC's db design, and I applaud you for your willingness to devote some gray-matter to this.
    Last edited by gharris999; 2008-06-06 at 10:07.

  3. #3
    Senior Member vrobin's Avatar
    Join Date
    May 2007
    Posts
    460
    If things are going to move on this front, I'll be glade to take out the generic XML Schema (XSD) I worked on and expose it to your judgement. Not for transforming it in a relationnal schema, but it could bring ideas (like the generic tagname/tagvalue and creditname/creditrole paradigms).

    In reaction to Erland post, I could say that I'm not a big fan of the "one big table" design at first sight (I remember the thread about table optimization and the crazy size needed by the join when the collection is big)

    Robin
    PS: for those that showed interest about it, the development of the tools I already talked a lot in the past in this forums is frozen since more than six months, but it will resurrect sooner or later when my personnal situation change)

  4. #4
    Robin Bowes
    Guest

    Thoughts regarding new database schema ?

    vrobin wrote:
    > If things are going to move on this front, I'll be glade to take out the
    > generic XML Schema (XSD) I worked on and expose it to your judgement.
    > Not for transforming it in a relationnal schema, but it could bring
    > ideas (like the generic tagname/tagvalue and creditname/creditrole
    > paradigms).
    >
    > In reaction to Erland post, I could say that I'm not a big fan of the
    > "one big table" design at first sight (I remember the thread about
    > table optimization and the crazy size needed by the join when the
    > collection is big)
    >
    > Robin
    > PS: for those that showed interest about it, the development of the
    > tools I already talked a lot in the past in this forums is frozen since
    > more than six months, but it will resurrect sooner or later when my
    > personnal situation change)


    I did some work (way back) in looking into an alternative schema for
    Slimserver.

    It wasn't complete, or even correct - just proof of concept - but it
    seemed to work well.

    Let me see if I can find it ...

    Ah, there we go:

    http://forums.slimdevices.com/showthread.php?t=1141#4

    The basic concept is to have tracks, attributes, and a table linking
    tracks to attributes. One of the reasons my example is "wrong" is that
    it won't support multiple tags of the same type, but that could easily
    be fixed.

    (the other) Robin.


  5. #5
    Senior Member erland's Avatar
    Join Date
    Dec 2005
    Location
    Sweden
    Posts
    11,039
    I started to experiment a bit with this yesterdag and I've attached a first patch in the bug report regarding the new database schema:
    http://bugs.slimdevices.com/show_bug.cgi?id=8303

    I'm not sure this is will be the final solution, but if anyone is willing to help there is a number of things to do:

    1.
    If you are a database export, have a look at the table layout in the attached patch an suggest which indexes that should be created for optimal performance.

    2.
    Try the performance of the patch in a larger library, it works pretty well in my small library of 3000 tracks, but it would be interesting to know how it works on a larger library. Check both how the scanning time is affected and the browsing performance.

    3.
    Have a look at the new database tables and comment on anything you feel is relevant. As an example, would it be better to have a single table with duplicated information on several rows than the current solution with a tags table and a tag_track relational table ?

    The patch doesn't include any browsing code, so there is a "simple" BrowseByTags plugin attached to the bug report which will create a new "Browse By Tags" menu. The BrowseByTags plugin requires you to install the CustomBrowse plugin, although you won't need to change any Custom Browse configuration it should work with the default installation.

    The browsing speed might be affected by the Custom Browse plugin which is known to be a bit slower than the standard SqueezeCenter browsing code, this will probably improve when we have some real browsing code integrated with SqueezeCenter.
    Erland Isaksson (My homepage)
    Developer of many plugins/applets

  6. #6
    Senior Member
    Join Date
    Apr 2005
    Posts
    8,410

    Thoughts regarding new database schema ?

    Erland,

    Not a direct comment on your patches, but I thought I'd raise some of the
    discussion that happened on campfire a few months ago if this is the start
    of some real work on the db schema.

    There are two complexities with the current database scanning/browsing model
    which we may want to address:

    1) The current code attempts to implement all db related preferences at run
    time rather than scan time. This has made the browsing code somewhat slower
    and more complex, especially due to the multiple contributor options which
    are possible [and indeed these are not always implemented in all jive
    queries...].

    2) The current code support incremental scanning - this is definately a user
    requirement, but it leads to complexity to get a consitent database and has
    been the cause of a number of bugs - we ended up resolving an inconsitent
    contributor_album tables by recreating it after each scan for instance.

    Additionally users would prefer to avoid the time taken for a new scan when
    they upgrade or to resolve inconsitent databases...


    So the high level proposal that was discussed (based on a suggestion by
    Fred) is that we separate the schema into two sets of tables:

    1) which contains the raw tags from the users's files - this is
    incrementally created as the user adds more files and importantly mirrors
    the tags without any additional processing (so minimising the scope for

    2) the working set of tables used by the server - this can be optimised to
    speed up queries as appropriate and contains processed data.

    The idea was that the second set of tables is recreated from scratch after
    each full or partial scan which updates the raw tags tables - and it is
    created solely from processing the raw tags tables. Also if possible its
    created using sql scripts rather than perl/dbi itteration through the tables
    as it is much faster...

    The benefits of this approach is that for a given set of tags we always
    create the same working tables as we do it in one go. It is also possible
    to implement preferences which change the working tables without the cost of
    needing to rescan the who library. So options which define which
    contributors used etc need not be processed at query time - they can be
    implemented at the time of creating the working set. If we can keep the
    version of the raw tags tables as constant as possible it also minimise the
    number of new library scans required when upgrading versions...

    The downside is that for incremental updates to the library the full set of
    working tables is recreated. Hence the idea of doing this as sql scripts to
    make this as fast as possible. There may also need to be some place for
    persitent data which is maintained across this.

    Anyway thought I'd raise it - this approach does seem appropriate to me, I
    would suggest one of the goals of any new schema would be to improve
    browsing on slow servers which does mean optimising some of the querying we
    do.

    Adrian



  7. #7
    Senior Member erland's Avatar
    Join Date
    Dec 2005
    Location
    Sweden
    Posts
    11,039
    Quote Originally Posted by Triode View Post
    The downside is that for incremental updates to the library the full set of
    working tables is recreated. Hence the idea of doing this as sql scripts to
    make this as fast as possible.
    I wonder if it's possible to handle everything through SQL scripts ?
    As an example I'm thinking of the split character that makes it possible to enter multiple artists or genres in the same tag.

    Quote Originally Posted by Triode View Post
    Anyway thought I'd raise it - this approach does seem appropriate to me, I
    would suggest one of the goals of any new schema would be to improve
    browsing on slow servers which does mean optimising some of the querying we
    do.
    I like the idea, the patch is currently not in this direction, it is more focused on getting custom tags into the system without a major re-design of existing code/tables.

    Do you have any idea what the bottleneck is for slow servers ?
    I assume low memory is the biggest problem ?
    I assume this might cause performance issues in other places than on a fast machine ?
    Do you or someone else has any idea if it is preferable to have a big table or several small tables joined together on a slow machine ?
    Erland Isaksson (My homepage)
    Developer of many plugins/applets

  8. #8
    Former Squeezebox Guy andyg's Avatar
    Join Date
    Jan 2006
    Location
    Pittsburgh, PA
    Posts
    7,396

    Thoughts regarding new database schema ?

    We are planning on looking at this for 7.3. We don't really want any
    patches yet. I think we need to redesign the schema first that takes
    into account all of the bugs that have been marked related to this
    bug. Certainly everyone is welcome to contribute to the design. How
    about starting a wiki page?

  9. #9
    Senior Member Philip Meyer's Avatar
    Join Date
    Apr 2005
    Location
    UK
    Posts
    5,596

    Thoughts regarding new database schema ?

    Sounds like a good plan to me.

    Are you planning to copy the data into the second set of tables, or create link tables that reference the first set of tables? Two distinct databases may be a good idea (or use two schemas) so that there are no mistakes understanding the different sets of tables.

    A second copy sounds easier to maintain, but scanning may double the size of the DB. Would this affect any target platforms?

    I have experience of optimising SQL Queries (for MS SQL Server 2005) - not too much experience of MySQL, but I may be able to help write queries, analyse performance and suggest indexes.

    It would be nice if all tags could be extracted in the first phase, even if they are not part of SC's supported 2nd phase DB. Plugins could read the data out without needing to worry about file formats.

    This would be great, as it could be easier to swap in different 1st-pass databases (eg. different sets of test libraries), so the tricky bits of scanning can be tested faster without thrashing disks through file access.

  10. #10
    Senior Member erland's Avatar
    Join Date
    Dec 2005
    Location
    Sweden
    Posts
    11,039
    Quote Originally Posted by andyg View Post
    We are planning on looking at this for 7.3. We don't really want any
    patches yet. I think we need to redesign the schema first that takes
    into account all of the bugs that have been marked related to this
    bug.
    So, are we talking about a total redesign with the only requirement that the browse/search possible today should also be possible in the new version together with the browse/search issues mentioned in the related bug reports ?

    Or are we talking about a solution that just solves the issues described in the related bug reports but keeps some backward compatibility to avoid changing all existing browse and search code ?
    Erland Isaksson (My homepage)
    Developer of many plugins/applets

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •