Working on a plugin - Scanning question

  • mamema
    Senior Member
    • Mar 2011
    • 284

    #31
    Why is a duplicate musicbrainz_id a problem at all?
    In my case, would this lead to more than one track being "measured" even though it was not played?
    As they are the same track, just on a different release, I could live with that.

    Or have I not understood this? Quite possible. :-)

    • erland
      Senior Member
      • Jan 2006
      • 11322

      #32
      Originally posted by mamema
      Why is a duplicate musicbrainz_id a problem at all?
      In my case, would this lead to more than one track being "measured" even though it was not played?
      As they are the same track, just on a different release, I could live with that.

      Or have I not understood this? Quite possible. :-)
      The following data exists in TrackStat:
      - added: The file modification time when a file was first added to LMS
      - playCount: The number of times a significant part of a track has been played, ignoring the cases when a track was skipped early
      - lastPlayed: The time a significant part of a track was last played, ignoring the cases when a track was skipped early
      - rating: The user rating that indicates how good a track is

      When a track is moved/renamed, the only way to recover the above data is to use the musicbrainz id to try to find the old track entry.
      However, the question is which entry to take the data from if there are multiple tracks with a matching musicbrainz id?

      - I think the best effort could be to ensure the above data is always the same for all tracks with a specific musicbrainz id. The question is whether it would be bad to not be able to have different data for the individual releases. It shouldn’t be a problem for the individual tracks, but it could result in unexpected behavior for album statistics, as it means that an album you have never played might end up in “Most played album” because one of its tracks also exists on your favorite album.

      - Alternatively you can just randomly pick one of the entries, which is basically what the current implementation unintentionally results in.

      - A third alternative is to not recover any data if there are multiple matching occurrences.
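
To make the three alternatives above concrete, here is a minimal Python sketch. It is not TrackStat's actual code: the `Stats` record and the three function names are hypothetical, and the merge rule used for the first alternative (earliest added, highest play count, latest lastPlayed, highest rating) is just one assumed way to make the data "the same" for every track with a given musicbrainz id.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stats:
    """Hypothetical stand-in for one old statistics entry."""
    added: int                  # file modification time when first added
    play_count: int
    last_played: Optional[int]  # None if never played
    rating: Optional[int]       # None if unrated


def recover_merged(candidates: list[Stats]) -> Optional[Stats]:
    """Alternative 1: treat all entries with the same id as one shared record."""
    if not candidates:
        return None
    played = [c.last_played for c in candidates if c.last_played is not None]
    rated = [c.rating for c in candidates if c.rating is not None]
    return Stats(
        added=min(c.added for c in candidates),
        play_count=max(c.play_count for c in candidates),
        last_played=max(played) if played else None,
        rating=max(rated) if rated else None,
    )


def recover_random(candidates: list[Stats], rng=random) -> Optional[Stats]:
    """Alternative 2: pick any one of the matching entries."""
    return rng.choice(candidates) if candidates else None


def recover_if_unambiguous(candidates: list[Stats]) -> Optional[Stats]:
    """Alternative 3: recover only when exactly one entry matches."""
    return candidates[0] if len(candidates) == 1 else None
```

The album-statistics side effect described for the first alternative shows up directly here: a never-played track inherits the play count of its twin on another release.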
      Erland Lindmark (My homepage)
      Developer of many plugins/applets
      Starting with LMS 8.0 I no longer support my plugins/applets (see here for more information )

      • mamema
        Senior Member
        • Mar 2011
        • 284

        #33
        Originally posted by erland
        When a track is moved/renamed, the only way to recover the above data is to use the musicbrainz id to try to find the old track entry.
        However, the question is which entry to take the data from if there are multiple tracks with a matching musicbrainz id?
        This still bugs me, as nothing has been moved. You've said that duplicate ids would lead to this, but why?

        song 1 - id 123
        song 2 - id 456
        song 3 - id 123

        trackstat does find
        song 1 and song 3 to be equal

        But the point is: why is TrackStat working on duplicate ids even though nothing has changed?

        I would start from the beginning. Why is TrackStat working on a problem (rename/move) which doesn't exist?

        Would this be a solution: song 1 with id 123 AND another ID (needs to be defined, row count...)?
        Then TrackStat would find song 3 with id 123 AND a different second ID; the track ID is unique, for example.

        trackstat matches
        Last edited by mamema; 2021-02-26, 19:06.

        • erland
          Senior Member
          • Jan 2006
          • 11322

          #34
          Originally posted by mamema
          This still bugs me, as nothing has been moved. You've said that duplicate ids would lead to this, but why?

          song 1 - id 123
          song 2 - id 456
          song 3 - id 123

          trackstat does find
          song 1 and song 3 to be equal

          But the point is: why is TrackStat working on duplicate ids even though nothing has changed?

          I would start from the beginning. Why is TrackStat working on a problem (rename/move) which doesn't exist?
          I guess I’m only human and didn’t think about all corner cases when implementing the current logic.

          When TrackStat was implemented there was no easy way to know if something had changed during scanning; the only possibility that existed was to trigger a post scan operation when a rescan done event was triggered. Today there is an API and an importer concept, which means that you can implement an importer that’s only triggered if a track is deleted/added/changed, so it’s possible to implement it a lot better today than it was 15 years ago when I implemented it.

          In addition to this, there might be improvement potential in the current post scan action to ensure it only acts on track_statistics entries which don’t have a matching url in the tracks table. Tracks that have a matching url in tracks and track_statistics are unlikely to be moved/renamed tracks. This could be implemented by extending the select that fills the temp table, but I’m not sure this fixes the complete problem; I suspect it just limits the cases in which it causes problems a bit.
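
A rough sketch of what such a restricted select could look like, using Python's `sqlite3` module and an in-memory database. The table layouts are heavily simplified assumptions (only `url`, `musicbrainz_id` and `playCount` are modelled) and this is not the select TrackStat actually uses; the sample rows follow the song 1/2/3 example from earlier in the thread, with song 2 moved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed, simplified schemas (not the real LMS/TrackStat ones).
    CREATE TABLE tracks (url TEXT PRIMARY KEY, musicbrainz_id TEXT);
    CREATE TABLE track_statistics (
        url TEXT PRIMARY KEY, musicbrainz_id TEXT, playCount INTEGER
    );
    INSERT INTO tracks VALUES
        ('file:///album1/song1.flac', '123'),
        ('file:///moved/song2.flac',  '456'),
        ('file:///album2/song3.flac', '123');
    INSERT INTO track_statistics VALUES
        ('file:///album1/song1.flac', '123', 7),
        ('file:///old/song2.flac',    '456', 3),
        ('file:///album2/song3.flac', '123', 1);
""")

# Only statistics rows whose url no longer exists in tracks are
# candidates for a moved/renamed file. Rows that still join on url
# (song 1 and song 3, the duplicate id 123) are left alone.
orphans = conn.execute("""
    SELECT s.url, s.musicbrainz_id, s.playCount
    FROM track_statistics AS s
    LEFT JOIN tracks AS t ON t.url = s.url
    WHERE t.url IS NULL
      AND s.musicbrainz_id IS NOT NULL
    ORDER BY s.url
""").fetchall()

print(orphans)
```

In this sample data the unchanged tracks sharing id 123 never reach the recovery logic, although a moved track with a duplicated id would still be ambiguous.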

          • erland
            Senior Member
            • Jan 2006
            • 11322

            #35
            Originally posted by mamema
            Would this be a solution: song 1 with id 123 AND another ID (needs to be defined, row count...)?
            Then TrackStat would find song 3 with id 123 AND a different second ID; the track ID is unique, for example.

            trackstat matches
            Not sure I understand what you mean by “another ID” and “different other ID” and where they would be stored in the database, but maybe my post a few minutes ago explained it so you can figure out a solution based on that.

            Generally I still feel the proper solution is to implement an importer that’s triggered by a track being updated/changed during scanning and a post scan importer that fills the TrackStat tables initially when TrackStat is installed. You would still have to decide how to handle duplicate musicbrainz ids, but it will ensure TrackStat only does something on tracks where something has actually changed. Improving the current refresh operation as it is will just be a quick fix that solves some scenarios but not others. On slower hardware than yours there are also other parts of the refresh operation that cause issues; as an example, it takes at least a minute in my 4000 track library and I don’t have duplicated musicbrainz ids.

            • mamema
              Senior Member
              • Mar 2011
              • 284

              #36
              Originally posted by erland
              Not sure I understand what you mean by “another ID” and “different other ID” and where they would be stored in the database, but maybe my post a few minutes ago explained it so you can figure out a solution based on that.
              My idea was: musicbrainz_id is used to establish the relation, but since this id may not be unique, just glue another unique id onto that relation, for example track_id, so we get a truly unique value and the problem of duplicate musicbrainz_id is gone.
              Sure, there may be better mid- and long-term solutions, but as you know, I'm still in learning mode.

              • erland
                Senior Member
                • Jan 2006
                • 11322

                #37
                Originally posted by mamema
                My idea was: musicbrainz_id is used to establish the relation, but since this id may not be unique, just glue another unique id onto that relation, for example track_id, so we get a truly unique value and the problem of duplicate musicbrainz_id is gone.
                Sure, there may be better mid- and long-term solutions, but as you know, I'm still in learning mode.
                The problem is where to store such an id: the contents of the tracks table are deleted on every rescan, and the id needs to stay the same if you rename or retag a music file. LMS has no idea if a file has been renamed/moved or is just a newly added file. The only way I can see is to store it inside the music file and write it to the db during scanning, but I wonder if people would really spend time manually entering id tags into their music files if the only purpose is to be able to rename/move files without losing rating/statistic data. For FLAC files there is a checksum of the audio data inside the file that might be usable, but that would only cover that file format, and I’m not sure the audio data really has to be different between two releases of the same track recording.

                I think we would have to be satisfied with a solution that’s good enough, like the three I mentioned earlier, even if it isn’t 100% correct in all situations. I don’t remember if the musicbrainz tools also add an album/release identity to the tags; if they do, it might be possible to use that as the other id, but it will still be a significant development effort just to cover a corner case, so I doubt it’s worth the effort.
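
For reference, the FLAC checksum mentioned above is the MD5 of the unencoded audio, stored in the last 16 bytes of the STREAMINFO metadata block at the start of the file. A minimal Python sketch of reading it follows; the function name `flac_audio_md5` is made up for illustration, and the sketch assumes the file begins directly with the `fLaC` marker (no ID3 tag prepended). LMS itself does not necessarily expose this value.

```python
def flac_audio_md5(data: bytes) -> str:
    """Return the STREAMINFO audio MD5 of a FLAC stream as a hex string.

    `data` must hold at least the first 42 bytes of the file.
    """
    if data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream (or it starts with another tag)")
    header = data[4:8]
    if len(header) < 4:
        raise ValueError("truncated metadata block header")
    # Header byte 0: 1 bit "last block" flag + 7 bit block type.
    block_type = header[0] & 0x7F
    # Header bytes 1-3: block length, big-endian.
    length = int.from_bytes(header[1:4], "big")
    if block_type != 0 or length != 34:
        raise ValueError("first metadata block is not STREAMINFO")
    streaminfo = data[8:8 + length]
    if len(streaminfo) != 34:
        raise ValueError("truncated STREAMINFO block")
    # Bytes 0-17 hold block/frame sizes, sample rate, channels,
    # bits per sample and total samples; bytes 18-33 are the MD5.
    return streaminfo[18:34].hex()
```

Usage would be along the lines of `flac_audio_md5(open(path, "rb").read(42))`. Since the checksum covers only the audio, it survives retagging and renaming, but the caveat above still applies: two releases with identical audio data would get the same value.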

                • mamema
                  Senior Member
                  • Mar 2011
                  • 284

                  #38
                  Originally posted by erland
                  I think we would have to be satisfied with a solution that’s good enough, like the three I mentioned earlier, even if it isn’t 100% correct in all situations. I don’t remember if the musicbrainz tools also add an album/release identity to the tags; if they do, it might be possible to use that as the other id, but it will still be a significant development effort just to cover a corner case, so I doubt it’s worth the effort.
                  At the moment, the main goal for me is to learn, so it doesn't matter which corner case I get solved. I hope I learn enough that I can proceed with other stuff afterwards. I'm not ready yet to decide which case I will be working on....

                  • mamema
                    Senior Member
                    • Mar 2011
                    • 284

                    #39
                    This would be an example of a very well tagged file from my library.
                    In my case, I would be willing to tag the file with some TrackStat id if no other id is suitable.
                    Attached Files

                    • erland
                      Senior Member
                      • Jan 2006
                      • 11322

                      #40
                      Originally posted by mamema
                      This would be an example of a very well tagged file from my library.
                      In my case, I would be willing to tag the file with some TrackStat id if no other id is suitable.
                      I wonder if musicbrainz_releasetrackid is unique?
                      I don’t remember which tag LMS writes to the tracks table but that should be easy for you to see by looking at the tracks table for this music file.

                      • mamema
                        Senior Member
                        • Mar 2011
                        • 284

                        #41
                        Originally posted by erland
                        I wonder if musicbrainz_releasetrackid is unique?
                        I don’t remember which tag LMS writes to the tracks table but that should be easy for you to see by looking at the tracks table for this music file.
                        Yes, will do, also with tracks which have several release versions.
                        At the moment my whole library is being scanned with an index on the TrackStat temp table. Will see if this helps.
                        I will report my findings here.
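
Both steps mentioned above, adding an index to the temp table and checking whether an id is really duplicated, can be tried out in isolation. The following Python `sqlite3` sketch uses an in-memory database; the two-column layout of `temp_track_statistics` and the index name `temp_stats_mbid_idx` are assumptions for illustration, not the real TrackStat definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed, simplified layout of the temp table.
    CREATE TABLE temp_track_statistics (url TEXT, musicbrainz_id TEXT);
    INSERT INTO temp_track_statistics VALUES
        ('file:///album1/song1.flac', '123'),
        ('file:///album1/song2.flac', '456'),
        ('file:///album2/song3.flac', '123');

    -- Index on the column used for matching, so lookups by id do not
    -- have to scan the whole table.
    CREATE INDEX IF NOT EXISTS temp_stats_mbid_idx
        ON temp_track_statistics (musicbrainz_id);
""")

# List every musicbrainz_id that occurs more than once.
duplicates = conn.execute("""
    SELECT musicbrainz_id, COUNT(*) AS n
    FROM temp_track_statistics
    WHERE musicbrainz_id IS NOT NULL
    GROUP BY musicbrainz_id
    HAVING COUNT(*) > 1
    ORDER BY musicbrainz_id
""").fetchall()

print(duplicates)
```

The same `GROUP BY ... HAVING` query can be pointed at any other candidate column in the tracks table to see whether it would be unique enough.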

                        • mamema
                          Senior Member
                          • Mar 2011
                          • 284

                          #42
                          Originally posted by mamema
                          Yes, will do, also with tracks which have several release versions.
                          At the moment my whole library is being scanned with an index on the TrackStat temp table. Will see if this helps.
                          I will report my findings here.
                          So, confirmed: I have duplicates in temp_track_statistics. In library.db there is no additional musicbrainz track id, but there is another interesting value which I'm now thinking of using to tackle this duplicate issue: "filesize". The combination of musicbrainz_id and filesize shouldn't be duplicated that easily, and it should work with several file formats.
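
A small sketch of the musicbrainz_id plus filesize idea, again with Python's `sqlite3` and an in-memory database. The schemas are simplified assumptions, and storing `filesize` in `track_statistics` is part of the proposal rather than something the existing table is known to contain. The sample data has two releases sharing id 123, one of which has been renamed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed, simplified schemas for illustration only.
    CREATE TABLE tracks (
        url TEXT PRIMARY KEY, musicbrainz_id TEXT, filesize INTEGER
    );
    CREATE TABLE track_statistics (
        url TEXT PRIMARY KEY, musicbrainz_id TEXT, filesize INTEGER,
        playCount INTEGER
    );
    INSERT INTO tracks VALUES
        ('file:///album1/song1.flac',  '123', 31000000),
        ('file:///renamed/song3.flac', '123', 29500000);
    INSERT INTO track_statistics VALUES
        ('file:///album1/song1.flac', '123', 31000000, 12),
        ('file:///album2/song3.flac', '123', 29500000, 4);
""")

# Matching on musicbrainz_id alone: the renamed file's old entry
# matches both current tracks with id 123.
id_only = conn.execute("""
    SELECT COUNT(*)
    FROM track_statistics AS s
    JOIN tracks AS t ON t.musicbrainz_id = s.musicbrainz_id
    WHERE s.url NOT IN (SELECT url FROM tracks)
""").fetchone()[0]

# Matching on (musicbrainz_id, filesize): only the renamed file matches.
matches = conn.execute("""
    SELECT s.url AS old_url, t.url AS new_url, s.playCount
    FROM track_statistics AS s
    JOIN tracks AS t
      ON t.musicbrainz_id = s.musicbrainz_id
     AND t.filesize = s.filesize
    WHERE s.url NOT IN (SELECT url FROM tracks)
""").fetchall()

print(id_only, matches)
```

The composite key only helps while the file size is unchanged, so a file that is both retagged and moved would no longer match.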

                          BTW: my 3 hour run was interrupted by a watchtower (docker) refresh of the LMS container. :-) So Michael is very active.
                          Last edited by mamema; 2021-02-27, 17:37.

                          • erland
                            Senior Member
                            • Jan 2006
                            • 11322

                            #43
                            Originally posted by mamema
                            So, confirmed: I have duplicates in temp_track_statistics. In library.db there is no additional musicbrainz track id, but there is another interesting value which I'm now thinking of using to tackle this duplicate issue: "filesize". The combination of musicbrainz_id and filesize shouldn't be duplicated that easily, and it should work with several file formats.
                            I’m guessing the file size might change when the tags are changed?
                            However, it’s certainly better than the current solution; as long as the logic only operates on tracks which can’t be joined using url you are probably fine. The corner case in which it would potentially lose data is when the user both changes tags and moves/renames a file, assuming the file size changes if you edit tags of course.

                            • mamema
                              Senior Member
                              • Mar 2011
                              • 284

                              #44
                              Originally posted by erland
                              I’m guessing the file size might change when the tags are changed?
                              However, it’s certainly better than the current solution; as long as the logic only operates on tracks which can’t be joined using url you are probably fine. The corner case in which it would potentially lose data is when the user both changes tags and moves/renames a file, assuming the file size changes if you edit tags of course.
                              Yes, the file size changes, so the whole file is different. In the end that could always be the case if someone tags files, even if I introduced a special hash value for TrackStat.
                              Last edited by mamema; 2021-02-28, 07:23.

                              • mamema
                                Senior Member
                                • Mar 2011
                                • 284

                                #45
                                Originally posted by mamema
                                Yes, the file size changes, so the whole file is different. In the end that could always be the case if someone tags files, even if I introduced a special hash value for TrackStat.
                                Added the filesize to the dbcreate.sql


                                and to the query
