PDA

View Full Version : New (improved?) infoFormat



Robert Moser
2005-06-24, 22:02
I've been playing around with a revised, hopefully more understandable
infoFormat function. I left all the old code in on this patch, I just
renamed infoFormat to infoFormat_old and moved the old infoFormat
specific initialization to an init_old function.

What I've tried to do with this is take a format string and parse it
once, creating a subroutine. That subroutine is called from infoFormat
and passed the track object as its only parameter. It then returns the
formatted string. The single parse is done as needed, with the results
stored in the %parsedFormats hash for later reuse.

The parsedFormats hash is preloaded on init() with all the elements that
were defined in the infoHash() and addToinfoHash() subroutines.

Eventually, I'll make an accessor function to allow custom elements to
be added to that hash, thus making them available for titleformats.

Simple, single attribute only formats will only need one subroutine call
from infoFormat. Formats with more than one element will call a
subroutine which will in turn call a subroutine per element.

No changes to either how infoFormat is called, or to how title formats
are written. Well, except if you were using the undocumented =>=>,
<=<=, or <##> bits in your formats, these were removed.

Profiling under Windows doesn't work very well. It seems to be a bit
faster from the data I did get, but I'd love it if someone could profile
this against the current infoFormat to see if there is a difference.

One possible optimization would be to build the subroutine such that
minimal $track->get() calls are made. This would be a bit more complex
than what I have currently though, so I'd rather see if the profiling
shows that it is necessary. Even with the somewhat complex format I
have below it is only doing 6-9 get()'s anyway.

Collapsing will work like this:
* Any non-element data starting the string will be eliminated if the
first element doesn't exist (null or empty string)
* Non-element data between elements will be eliminated if the following
element doesn't exist, or if none of the previous elements existed
* Non-element data at the end of the string will be eliminated if the
element preceding it doesn't exist
* Inside a bracket, all elements and non-elements are grouped into a
single element for the purposes of collapsing
* The brackets are eliminated if the single logical element inside
doesn't exist
* Brackets for this are () {} [] "" and ''

Basically, take this format:
DISC-TRACKNUM. TITLE (ARTIST - ALBUM[YEAR])
AAAAWBBBBBBBBXXCCCCCYZDDDDDDDDDDDDDDDDDDDDZ

If A and B exist, W appears
If C and either A or B exist, X appears
If D and A, B or C exist, Y appears
If D exists, Z appears

D gets parsed as
ARTIST - ALBUM[YEAR]
EEEEEEUUUFFFFFVGGGGV
which is also added to the parsedFormats hash
If E and F exist, U appears
If G exists, V appears
If any of E, F, or G exists Z (from above) appears

So, if DISC doesn't exist, but everything else does, it will appear like
this:
TRACKNUM. TITLE (ARTIST - ALBUM[YEAR])

If ARTIST doesn't exist either it will appear like this:
TRACKNUM. TITLE (ALBUM[YEAR])

No ALBUM or DISC, like this:
TRACKNUM. TITLE (ARTIST[YEAR])

No YEAR, like this:
DISC-TRACKNUM. TITLE (ARTIST - ALBUM)

Index: /slimsvn/trunk/server/Slim/Music/Info.pm
================================================== =================
--- D:/Sound/slimsvn/trunk/server/Slim/Music/Info.pm (revision 3550)
+++ D:/Sound/slimsvn/trunk/server/Slim/Music/Info.pm (working copy)
@@ -51,6 +51,8 @@

my ($currentDB, $localDB, $ncElemstring, $ncElems, $elemstring, $elems, $validTypeRegex);

+my (@elements, $elemRegex, %parsedFormats);
+
# Save our stats.
tie our %isFile, 'Tie::Cache::LRU', 16;

@@ -126,7 +128,7 @@
},
);

-sub init {
+sub init_old {

# non-cached elements
$ncElemstring = "VOLUME|PATH|FILE|EXT|DURATION|LONGDATE|SHORTDATE|C URRTIME|FROM|BY";
@@ -142,7 +144,12 @@
#. "|DURATION" # SECS expressed as mm:ss (not in infoCache)
#. "|LONGDATE|SHORTDATE|CURRTIME" #current date/time (not in infoCache)
$elems = qr/$elemstring/;
+}

+sub init {
+
+ initParsedFormats();
+
loadTypesConfig();

# precompute the valid extensions
@@ -487,7 +494,7 @@
}

# formats information about a file using a provided format string
-sub infoFormat {
+sub infoFormat_old {
no warnings; # this is to allow using null values with string concatenation, it only effects this procedure
my $fileOrObj = shift; # item whose information will be formatted
my $str = shift; # format string to use
@@ -637,7 +644,352 @@

return $str;
}
+sub initParsedFormats {
+ %parsedFormats = ();
+ my @trackAttrs; # for relating track attributes to album/artist attributes
+ # Subs for all regular track attributes
+ for my $attr (keys %{Slim::DataStores::DBI::Track->attributes}) {
+ $parsedFormats{uc $attr} =
+ sub {
+ my $output = $_[0]->get($attr);
+ return (defined $output ? $output : '');
+ };
+ }

+ # Override album
+ $parsedFormats{'ALBUM'} =
+ sub {
+ my $output = '';
+ my $album = $_[0]->album();
+ if ($album) {
+ $output = $album->title();
+ $output = '' if $output eq string('NO_ALBUM');
+ }
+ return (defined $output ? $output : '');
+ };
+
+ # add album related
+ @trackAttrs = qw(ALBUMSORT DISC DISCC);
+ for my $attr (qw(namesort disc discc)) {
+ $parsedFormats{shift @trackAttrs} =
+ sub {
+ my $output = '';
+ my $album = $_[0]->album();
+ if ($album) {
+ $output = $album->get($attr);
+ }
+ return (defined $output ? $output : '');
+ };
+ }
+
+ # add artist related
+ $parsedFormats{'ARTIST'} =
+ sub {
+ my $output = '';
+ my $artist = $_[0]->artist();
+ if ($artist) {
+ $output = $artist->get('name');
+ $output = '' if $output eq string('NO_ARTIST');
+ }
+ return (defined $output ? $output : '');
+ };
+ $parsedFormats{'ARTISTSORT'} =
+ sub {
+ my $output = '';
+ my $artist = $_[0]->artist();
+ if ($artist) {
+ $output = $artist->get('namesort');
+ }
+ return (defined $output ? $output : '');
+ };
+
+ # add other contributors
+ for my $attr (qw(composer conductor band genre)) {
+ $parsedFormats{uc($attr)} =
+ sub {
+ my $output = '';
+ my ($item) = $_[0]->$attr();
+ if ($item) {
+ $output = $item->name();
+ }
+ return (defined $output ? $output : '');
+ };
+ }
+
+ # add genre
+ $parsedFormats{'GENRE'} =
+ sub {
+ my $output = '';
+ my ($item) = $_[0]->genre();
+ if ($item) {
+ $output = $item->name();
+ $output = '' if $output eq string('NO_GENRE');
+ }
+ return (defined $output ? $output : '');
+ };
+
+ # add comment and duration
+ for my $attr (qw(comment duration)) {
+ $parsedFormats{uc($attr)} =
+ sub {
+ my $output = $_[0]->$attr();
+ return (defined $output ? $output : '');
+ };
+ }
+
+ # add file info
+ $parsedFormats{'VOLUME'} =
+ sub {
+ my $output = '';
+ my $url = $_[0]->get('url');
+ my $filepath;
+ if ($url) {
+ if (isFileURL($url)) { $url=Slim::Utils::Misc::pathFromFileURL($url); }
+ $output = (splitpath($url))[0];
+ }
+ return (defined $output ? $output : '');
+ };
+ $parsedFormats{'PATH'} =
+ sub {
+ my $output = '';
+ my $url = $_[0]->get('url');
+ my $filepath;
+ if ($url) {
+ if (isFileURL($url)) { $url=Slim::Utils::Misc::pathFromFileURL($url); }
+ $output = (splitpath($url))[1];
+ }
+ return (defined $output ? $output : '');
+ };
+ $parsedFormats{'FILE'} =
+ sub {
+ my $output = '';
+ my $url = $_[0]->get('url');
+ my $filepath;
+ if ($url) {
+ if (isFileURL($url)) { $url=Slim::Utils::Misc::pathFromFileURL($url); }
+ $output = (splitpath($url))[2];
+ $output =~ s/\.[^\.]*?$//;
+ }
+ return (defined $output ? $output : '');
+ };
+ $parsedFormats{'EXT'} =
+ sub {
+ my $output = '';
+ my $url = $_[0]->get('url');
+ my $filepath;
+ if ($url) {
+ if (isFileURL($url)) { $url=Slim::Utils::Misc::pathFromFileURL($url); }
+ my $file = (splitpath($url))[2];
+ ($output) = $file =~ /\.([^\.]*?)$/;
+ }
+ return (defined $output ? $output : '');
+ };
+
+ # Add date/time elements
+ $parsedFormats{'LONGDATE'} = \&Slim::Utils::Misc::longDateF;
+ $parsedFormats{'SHORTDATE'} = \&Slim::Utils::Misc::shortDateF;
+ $parsedFormats{'CURRTIME'} = \&Slim::Utils::Misc::timeF;
+
+ # Add localized from/by
+ $parsedFormats{'FROM'} = sub { return string('FROM'); };
+ $parsedFormats{'BY'} = sub { return string('BY'); };
+
+ # fill element related variables
+ @elements = keys %parsedFormats;
+
+ # add placeholder element for bracketed items
+ push @elements, '_PLACEHOLDER_';
+
+ $elemstring = join "|", @elements;
+ $elemRegex = qr/$elemstring/;
+
+ # Add lightweight FILE.EXT format
+ $parsedFormats{'FILE.EXT'} =
+ sub {
+ my $output = '';
+ my $url = $_[0]->get('url');
+ my $filepath;
+ if ($url) {
+ if (isFileURL($url)) { $url=Slim::Utils::Misc::pathFromFileURL($url); }
+ $output = (splitpath($url))[2];
+ }
+ return (defined $output ? $output : '');
+ };
+
+}
+
+my %endbrackets = (
+ '(' => qr/(.+?)(\))/,
+ '[' => qr/(.+?)(\])/,
+ '{' => qr/(.+?)(\})/,
+ '"' => qr/(.+?)(")/, # " # syntax highlighters are easily confused
+ "'" => qr/(.+?)(')/, # ' # syntax highlighters are easily confused
+ );
+
+my $bracketstart = qr/(.*?)([{[("'])/; # '" # syntax highlighters are easily confused
+
+# The fillFormat routine takes a track and references to parsed data arrays describing
+# a desired information format and returns a string containing the formatted data.
+# The prefix array contains separator elements that should only be included in the output
+# if the corresponding element contains data, and any element preceding it contained data.
+# The indprefix array is like the prefix array, but it only requires the corresponding
+# element to contain data.
+# The elemlookup array contains code references which are passed the track object and return
+# a string if that track has data for that element.
+# The suffix array contains separator elements that should only be included if the corresponding
+# element contains data.
+# The data for each item is placed in the string in the order prefix + indprefix + element + suffix.
+
+sub fillFormat {
+ my ($track, $prefix, $indprefix, $elemlookup, $suffix) = @_;
+ my $output = '';
+ my $hasPrev;
+ my $index = 0;
+ for my $elemref (@{$elemlookup}) {
+ my $elementtext = $elemref->($track);
+ if (defined($elementtext) && $elementtext gt '') {
+ # The element had a value, so build this portion of the output.
+ # Add in the prefix only if some previous element also had a value
+ $output .= join('', ($hasPrev ? $prefix->[$index] : ''),
+ $indprefix->[$index],
+ $elementtext,
+ $suffix->[$index]);
+ $hasPrev ||= 1;
+ }
+ $index++;
+ }
+ return $output;
+}
+
+sub parseFormat {
+ my $format = shift;
+ my $formatparsed = $format; # $format will be modified, so stash the original value
+ my $newstr = '';
+ my (@parsed, @placeholders, @prefixes, @indprefixes, @elemlookups, @suffixes);
+
+ # don't rebuild formats
+ return $parsedFormats{$format} if exists $parsedFormats{$format};
+
+ # find bracketed items so that we can collapse them correctly
+ while ($format =~ s/$bracketstart//) {
+ $newstr .= $1 . $2;
+ my $endbracketRegex = $endbrackets{$2};
+ if ($format =~ s/$endbracketRegex//) {
+ push @placeholders, $1;
+ $newstr .= '_PLACEHOLDER_' . $2;
+ }
+ }
+ $format = $newstr . $format;
+
+ # break up format string into separators and elements
+ # elements must be separated by non-word characters
+ @parsed = ($format =~ m/(.*?)\b($elemRegex)\b/gc);
+ push @parsed, substr($format,pos($format) || 0);
+
+ if (scalar(@parsed) < 2) {
+ # pure text, just return that text as the function
+ my $output = shift(@parsed);
+ $parsedFormats{$formatparsed} = sub { return $output; };
+ return $parsedFormats{$formatparsed};
+ }
+
+ # Every other item in the parsed array is an element, which will be replaced later
+ # by a code reference which will return a string to replace the element
+ while (scalar(@parsed) > 1) {
+ push @prefixes, shift(@parsed);
+ push @indprefixes, '';
+ push @elemlookups, shift(@parsed);
+ push @suffixes, '';
+ }
+
+ # the first item will never have anything before it, so move it from the prefixes array
+ # to the independent prefixes array
+ $indprefixes[0] = $prefixes[0];
+ $prefixes[0] = '';
+
+ # if anything is left in the parsed array (there were an odd number of items, put it in
+ # as the last item in the suffixes array
+ if (@parsed) {
+ $suffixes[-1] = $parsed[0];
+ }
+
+ # replace placeholders with their original values, and replace the element text with the
+ # code references to look up the value for the element.
+ my $index = 0;
+ for my $elem (@elemlookups) {
+ if ($elem eq '_PLACEHOLDER_') {
+ $elemlookups[$index] = shift @placeholders;
+ if ($index < $#prefixes) {
+ # move closing bracket from the prefix of the element following
+ # to the suffix of the current element
+ $suffixes[$index] = substr($prefixes[$index + 1],0,1,'');
+ }
+ if ($index) {
+ # move opening bracket from the prefix dependent on previous content
+ # to the independent prefix for this element, but only attempt this
+ # when this isn't the first element, since that has already had the
+ # prefix moved to the independent prefix
+ $indprefixes[$index] = substr($prefixes[$index],length($prefixes[$index]) - 1,1,'');
+ }
+ }
+ # replace element with code ref from parsed formats. If the element does not exist in
+ # the hash, it needs to be parsed and created.
+ $elemlookups[$index] = $parsedFormats{$elem} || parseFormat($elem);
+ $index++;
+ }
+
+ $parsedFormats{$formatparsed} =
+ sub {
+ my $track = shift;
+ return fillFormat($track, \@prefixes, \@indprefixes, \@elemlookups, \@suffixes);
+ };
+
+ return $parsedFormats{$formatparsed};
+}
+
+sub infoFormat {
+ my $fileOrObj = shift; # item whose information will be formatted
+ my $str = shift; # format string to use
+ my $safestr = shift; # format string to use in the event that after filling the first string, there is nothing left
+ my $output = '';
+ my $format;
+
+ my $track = ref $fileOrObj ? $fileOrObj : $currentDB->objectForUrl($fileOrObj, 1);
+
+ return '' unless defined $track;
+
+ # use a safe format string if none specified
+ # Users can input strings in any locale - we need to convert that to
+ # UTF-8 first, otherwise perl will segfault in the nasty regex below.
+ if ($str && $] > 5.007) {
+
+ eval {
+ Encode::from_to($str, $Slim::Utils::Misc::locale, 'utf8');
+ Encode::_utf8_on($str);
+ };
+
+ } elsif (!defined $str) {
+
+ $str = 'TITLE';
+ }
+
+ # Get the formatting function from the hash, or parse it
+ $format = $parsedFormats{$str} || parseFormat($str);
+
+ $output = $format->($track) if ref($format) eq 'CODE';
+
+ if ($output eq "" && defined($safestr)) {
+
+ # if there isn't any output, use the safe string, if supplied
+ return infoFormat($track,$safestr);
+
+ } else {
+ $output =~ s/%([0-9a-fA-F][0-9a-fA-F])%/chr(hex($1))/eg;
+ }
+
+ return $output;
+}
+
#
# if no ID3 information is available,
# use this to get a title, which is derived from the file path or URL.

Robert Moser
2005-06-24, 22:12
In case anyone was wondering what I was complaining about as far as
profiling is concerned, this is what I get using -d:DProf and dprofpp.
This was with a playlist of 205 items with items per page set to 200,
and webtitleformat set to DISC-TRACKNUM. TITLE (ARTIST - ALBUM[YEAR])

I started slimserver, then opened the EN skin's status.html after my
squeeze connected, then Ctrl-C'd the server.


The problem:
---------------------------------------------------
D:\Sound\slimsvn\trunk\server>dprofpp
Garbled profile is missing some exit time stamps:
main::main
main::idle
Slim::Networking::Select::select
Slim::Web::HTTP::processHTTP
Slim::Web::HTTP::processURL
Slim::Web::HTTP::generateHTTPResponse
Slim::Web::Pages::status
Slim::Web::HTTP::filltemplatefile
Slim::Web::HTTP::_generateContentFromFile
Template::process
Template::Service::process
Template::Context::process
Template::Document::process
Template::Provider::__ANON__
Try rerunning dprofpp with -F.

Standard infoFormat
-------------------------------------------
D:\Sound\slimsvn\trunk\server>dprofpp -F -I
Faking 14 exit timestamp(s).
Total Elapsed Time = -1.13090 Seconds
User+System Time = 0 Seconds
Inclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
0.00 0.000 13.257 1 0.0000 13.257 main::main
0.00 0.000 7.783 176 0.0000 0.0442
Slim::Networking::Select::select
0.00 0.000 7.731 19 0.0000 0.4069 main::idle
0.00 0.010 7.488 1 0.0099 7.4876 Slim::Web::HTTP::processHTTP
0.00 0.000 7.468 1 0.0000 7.4681
Slim::Web::HTTP::generateHTTPRespo
nse
0.00 0.000 7.468 1 0.0000 7.4681 Slim::Web::HTTP::processURL
0.00 0.000 7.429 1 0.0000 7.4288 Slim::Web::Pages::status
0.00 0.088 7.239 1 0.0883 7.2394 Slim::Web::Pages::playlist
0.00 0.020 6.476 201 0.0001 0.0322
Slim::Music::Info::standardTitle
0.00 0.299 6.014 68 0.0044 0.0884 main::BEGIN
0.00 0.141 5.610 200 0.0007 0.0281 Slim::Music::Info::infoFormat
0.00 0.110 5.299 200 0.0005 0.0265 Slim::Music::Info::infoHash
0.00 0.099 5.091 804 0.0001 0.0063
Class::DBI::Relationship::HasMany:
:__ANON__
0.00 0.480 3.335 1 0.4800 3.3347
Slim::Display::Graphics::loadFonts
0.00 0.000 3.335 1 0.0000 3.3347 Slim::Display::Graphics::init

New infoFormat:
---------------------------------------------------------
D:\Sound\slimsvn\trunk\server>dprofpp -F -I
Faking 8 exit timestamp(s).
Total Elapsed Time = -0.62970 Seconds
User+System Time = 0 Seconds
Inclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
0.00 0.000 9.644 1 0.0000 9.6437 main::main
0.00 0.309 6.234 68 0.0045 0.0917 main::BEGIN
0.00 0.000 3.875 219 0.0000 0.0177
Slim::Networking::Select::select
0.00 0.000 3.858 62 0.0000 0.0622 main::idle
0.00 0.030 3.754 1600 0.0000 0.0023 Slim::Music::Info::__ANON__
0.00 0.010 3.582 2 0.0050 1.7909 Slim::Web::HTTP::processHTTP
0.00 0.000 3.573 2 0.0000 1.7863
Slim::Web::HTTP::generateHTTPRespo
nse
0.00 0.000 3.573 2 0.0000 1.7863 Slim::Web::HTTP::processURL
0.00 0.000 3.524 1 0.0000 3.5245 Slim::Web::Pages::status
0.00 0.421 3.455 1 0.4210 3.4547
Slim::Display::Graphics::loadFonts
0.00 0.000 3.455 1 0.0000 3.4547 Slim::Display::Graphics::init
0.00 0.118 3.162 1 0.1183 3.1622 Slim::Web::Pages::playlist
0.00 0.010 2.466 400 0.0000 0.0062 Slim::Music::Info::fillFormat
0.00 0.010 2.376 201 0.0000 0.0118
Slim::Music::Info::standardTitle
0.00 1.793 1.793 13 0.1379 0.1379
Slim::Display::Graphics::parseBMP

Dan Sully
2005-06-24, 22:21
* Robert Moser shaped the electrons to say...

>In case anyone was wondering what I was complaining about as far as
>profiling is concerned, this is what I get using -d:DProf and dprofpp.
>This was with a playlist of 205 items with items per page set to 200,
>and webtitleformat set to DISC-TRACKNUM. TITLE (ARTIST - ALBUM[YEAR])
>
>I started slimserver, then opened the EN skin's status.html after my
>squeeze connected, then Ctrl-C'd the server.

Try Devel::Profiler

-D
--
You have the puzzle pieces? Good, then turn off the damn walls.

Robert Moser
2005-06-24, 22:50
Dan Sully blurted out:
> Try Devel::Profiler
>
> -D

That doesn't work very well either. It adds a ton of overhead (takes a
couple of minutes to open the status page, whereas with Dprof it took
ten seconds or so), plus it still leaves off some exit timestamps, and
occasionally misses some enter timestamps. When the enter timestamp is
missing, dprofpp won't show anything even with -F.

I blame windows. Hmm, maybe I'll try to get it running on my
girlfriend's OS X iBook. That's for tomorrow though.

Grotus
2005-06-27, 10:15
Has anyone had a chance to look at this? I'd like to know if this is easier to understand than the previous method.

Dan Sully
2005-06-27, 10:25
* Grotus shaped the electrons to say...

>Has anyone had a chance to look at this? I'd like to know if this is
>easier to understand than the previous method.

It's definitely much cleaner. Not sure yet if this should go into 6.1 or wait
until it's been branched. What are your thoughts on the compatability?

-D
--
<iNoah> all your base class are belong to us

Grotus
2005-06-27, 10:41
Dan Sully wrote:
> * Grotus shaped the electrons to say...
>
>> Has anyone had a chance to look at this? I'd like to know if this is
>> easier to understand than the previous method.
>
>
> It's definitely much cleaner. Not sure yet if this should go into 6.1 or
> wait
> until it's been branched. What are your thoughts on the compatability?
>
> -D

It is a drop-in replacement for infoFormat. The interface remains the
same and the title format stays the same. There shouldn't be any
compatability problems.

There are only two scenarios for it to behave differently than the
previous method.

1) if the user knew about and used the advanced =>=> <=<= and <##> brackets
2) if the user was using elements without a non-word character between them

I find the second scenario to be more likely than the first, but still
not very likely.