Utf-8 problems upgrading from 5.4 to 6.1.1



nico
2005-09-20, 07:25
Hi,

In upgrading to 6.1.1 I think some issues with the handling of multi-byte characters have crept into the system.

I'm hosting SlimServer on a Solaris 10 x86 box with LANG=en_US.UTF-8. The filenames of my FLAC files are UTF-8 encoded, and the ALBUM/ARTIST/TITLE metadata in the embedded Vorbis tags is also UTF-8. The files are stored on a network file server; when listed via ls(1) on Solaris and when viewed from Windows, all the non-ASCII characters in the filenames appear correctly.

When I view strings that contain non-ASCII characters via the browser UI or on the SqueezeBox, I see an escape such as \xE9 displayed instead of the character itself. I also see lots of errors on slimserver.pl's stderr, such as:
utf8 "\xC9" does not map to Unicode at /home/slimserver/SlimServer_v6.1.1/Slim/Formats/Parse.pm line 153, <GEN539> line 60670.

As a particular example, a Vorbis TITLE tag might contain the raw bytes of "J'Y Suis Jamais Allé". The final two bytes are C3 A9, which I understand is the correct UTF-8 sequence for an accented e (é).
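
For anyone wanting to check the byte values, here is a minimal Python sketch (Python is used only for illustration; SlimServer itself is Perl). It shows that é is C3 A9 in UTF-8 but the single byte E9 in ISO-8859-1, and that a lone Latin-1 byte such as E9 or C9 is not valid UTF-8. That is consistent with the "does not map to Unicode" error, on the assumption that Latin-1 bytes are being decoded as UTF-8 somewhere; the thread does not confirm where. The helper name `decodes_as_utf8` is made up for this example.

```python
# The example title, with é written as an escape so the source stays ASCII.
title = "J'Y Suis Jamais All\u00e9"

utf8_bytes = title.encode("utf-8")         # é becomes the two bytes C3 A9
latin1_bytes = title.encode("iso-8859-1")  # é becomes the single byte E9

print(utf8_bytes[-2:].hex())    # c3a9
print(latin1_bytes[-1:].hex())  # e9


def decodes_as_utf8(raw: bytes) -> bool:
    """Hypothetical helper: True if raw is a valid UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as_utf8(utf8_bytes))    # True
print(decodes_as_utf8(latin1_bytes))  # False: E9 starts a sequence that never completes
print(decodes_as_utf8(b"\xc9"))       # False: C9 is Latin-1 for É, not valid UTF-8 alone
```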

I am sure this used to work with 5.X, can anyone please help me restore the proper behaviour with 6.1.1?

Many thanks,
Nico

Slimserver: Version: 6.1.1 - 3774 - solaris - EN - iso-8859-1
OS: Solaris 10 x86
Browser: FireFox 1.0.6

PS: I wonder if a clue is in the slimserver version signature above "EN - iso-8859-1". I notice in other people's posts the signature has "utf8" instead. Could it be that slimserver is mistakenly running in Latin-1 mode instead of utf8? If so, how could I change this? Here is the output of locale:
% locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE=C
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_ALL=
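
To read that output: under the usual POSIX precedence, LC_ALL wins if set, then the specific category (LC_CTYPE for character handling), then LANG. The Python sketch below applies that rule to the values above. It assumes that the quoted entries in the locale output are implied by LANG rather than set explicitly, so only LANG and LC_COLLATE are placed in the environment; the function names are invented for this example and say nothing about how SlimServer itself picks its charset.

```python
def effective_ctype(env):
    """Return the locale name that governs character handling.

    Precedence: LC_ALL, then LC_CTYPE, then LANG; empty values count as unset.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name, "")
        if value:
            return value
    return "C"  # default when nothing is set


def codeset(locale_name):
    """Extract the codeset from a name like en_US.UTF-8, or None if absent."""
    base = locale_name.split("@", 1)[0]  # drop any @modifier
    if "." in base:
        return base.split(".", 1)[1]
    return None


# Assumed environment behind the locale output above.
posted = {"LANG": "en_US.UTF-8", "LC_ALL": "", "LC_COLLATE": "C"}

print(effective_ctype(posted))           # en_US.UTF-8
print(codeset(effective_ctype(posted)))  # UTF-8
```

By this reading the shell's locale asks for UTF-8, so the iso-8859-1 in the signature would have to come from somewhere else, for example a different environment for the server process. That is only a guess.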

Dan Sully
2005-09-20, 10:21
* nico shaped the electrons to say...

>I am sure this used to work with 5.X, can anyone please help me restore
>the proper behaviour with 6.1.1?

Try the 6.2 nightlies - a _lot_ has been done on the Unicode support there.

-D
--
This knob controls the thing that changes when you turn it. - noah

nico
2005-09-20, 19:29
Hi Dan,

Thanks for the response. The nightly [SlimServer Version: 6.2b1 - 4372 - solaris - EN - iso-8859-1] certainly appears to have fixed the problem, unicode chars are all displayed properly again.

Sadly, there are a slew of new bugs I discovered with this release, so I'm afraid I will be rolling back to 6.1.1. Do you encourage us to post bugs for issues we find on these nightlies?

Thanks again for your help.
Nico

Dan Sully
2005-09-20, 20:27
* nico shaped the electrons to say...

>Thanks for the response. The nightly [SlimServer Version: 6.2b1 - 4372
>- solaris - EN - iso-8859-1] certainly appears to have fixed the
>problem, unicode chars are all displayed properly again.
>
>Sadly, there are a slew of new bugs I discovered with this release, so
>I'm afraid I will be rolling back to 6.1.1. Do you encourage us to post
>bugs for issues we find on these nightlies?

Yes - please file bugs. If we don't already know about them, they won't
get fixed. You should also consider joining the beta list/forum.

-D
--
"My pockets hurt." - Homer Simpson