[Koha] Koha 2.2.9, Unicode (UTF-8), Latin-1 (ISO-8859-1) and migration to Koha 3

Ricardo Dias Marques lists at ricmarques.net
Fri Apr 25 11:50:47 NZST 2008


Hi Galen,

On Thu, Apr 24, 2008 at 8:45 PM, Galen Charlton
<galen.charlton at liblime.com> wrote:

>  Very briefly, Koha 3's C4::Charset module's MarcToUTF8Record routine
>  should give you some ideas.  You can use that as the core of a routine
>  to convert a file that contains mixed Latin-1 and UTF-8 records to
>  UTF-8.  However, it will not correctly handle a MARC record that has
>  *both* Latin-1 and UTF-8, but could be modified to test each field and
>  subfield to see if it contains UTF-8 or Latin-1.

Thanks, Galen! I have read the code of the MarcToUTF8Record routine,
as you suggested, and it does seem to be a very good starting point.
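
To make the per-field / per-subfield idea a bit more concrete, here is a
rough sketch in Python. It is NOT the Koha code (MarcToUTF8Record is Perl
and works on MARC::Record objects), and everything in it is an assumption
made for illustration: the function names to_utf8 and convert_record are
hypothetical, and a record is modelled simply as a list of
(tag, [(subfield_code, raw_bytes), ...]) pairs. The test is a heuristic:
bytes that decode as strict UTF-8 are kept as they are, and anything else
is treated as Latin-1.

```python
def to_utf8(raw: bytes) -> bytes:
    """Return raw as UTF-8 bytes, guessing between UTF-8 and Latin-1.

    Heuristic only: pure ASCII passes through unchanged, and a Latin-1
    string whose bytes happen to form valid UTF-8 would be misread.
    """
    try:
        raw.decode("utf-8", errors="strict")  # already valid UTF-8?
        return raw
    except UnicodeDecodeError:
        # Not valid UTF-8, so assume Latin-1 (every byte is decodable).
        return raw.decode("latin-1").encode("utf-8")


def convert_record(fields):
    """Convert every subfield value of a record independently.

    `fields` is a hypothetical structure:
    [(tag, [(subfield_code, raw_bytes), ...]), ...]
    """
    return [
        (tag, [(code, to_utf8(value)) for code, value in subfields])
        for tag, subfields in fields
    ]


# A record that mixes both encodings, one per field.
record = [
    ("100", [("a", "Camões".encode("latin-1"))]),
    ("245", [("a", "História".encode("utf-8"))]),
]
converted = convert_record(record)
```

Because each subfield is tested on its own, a record that has *both*
Latin-1 and UTF-8 values is handled, which is the case Galen mentions
above. A real migration would still need to read and write actual MARC
records and fix up the record's declared encoding, which this sketch
does not attempt.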

If anyone else is curious about the MarcToUTF8Record routine, you can
read its source code in the Charset.pm file of Koha 3 (Beta 2). The
current version of that file is also available on the Koha git web
site (http://git.koha.org/), specifically at:

http://git.koha.org/cgi-bin/gitweb.cgi?p=Koha;a=blob;f=C4/Charset.pm


I'll try to start experimenting with this when I get back to work on
Monday (Friday, 25th of April is a National Holiday in Portugal).

Best wishes,
Ricardo Dias Marques
lists AT ricmarques DOT net

