[Koha] Koha 2.2.9, Unicode (UTF-8), Latin-1 (ISO-8859-1) and migration to Koha 3
Ricardo Dias Marques
lists at ricmarques.net
Tue Apr 22 23:32:22 NZST 2008
Hi list,
I have a Koha 2.2.9 system running on a machine with SLES 9 (SUSE
Linux Enterprise Server) with SP3 (Service Pack 3), running Apache
2.2.0 and MySQL 4.0.18.
I'm now considering migrating it to another machine that will run Koha
3 Beta 2: a server with SLES 10 with SP1 (Service Pack 1), running
Apache 2.2.4 and MySQL 5.0.26.
I have done a mysqldump of the koha database in the Koha 2.2.9 system.
Unfortunately, I found out that the dump has mixed character
encodings: some characters are in Unicode ("UTF-8") and
others are in Latin-1 ("ISO-8859-1") or possibly Latin-9 ("ISO-8859-15").
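One way to get a feel for how mixed such a dump is would be to classify each line by whether its bytes decode as UTF-8. The following is only a rough sketch: the function names are made up for illustration, the dump path is whatever file mysqldump produced, and a line that mixes both encodings is simply reported as "non-utf8".

```python
from collections import Counter


def classify_line(line: bytes) -> str:
    """Return 'ascii', 'utf-8' or 'non-utf8' for one line of raw bytes."""
    if line.isascii():
        return "ascii"
    try:
        line.decode("utf-8")
    except UnicodeDecodeError:
        # At least one byte sequence is not valid UTF-8 (likely Latin-1).
        return "non-utf8"
    return "utf-8"


def summarize(path: str) -> Counter:
    """Count how many lines of the dump fall into each class."""
    counts = Counter()
    with open(path, "rb") as dump:  # binary mode: no decoding is attempted
        for line in dump:
            counts[classify_line(line)] += 1
    return counts
```

Calling `summarize()` on the dump file returns a counter with up to three keys. If both "utf-8" and "non-utf8" have non-zero counts, the dump really does contain both encodings.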
I am Portuguese (living in Portugal), so the "problematic" characters
are the Portuguese accented characters ("ã" - a with tilde; "ç" - c with
cedilla; "é" - e with acute; and other characters with accents).
This leads to my first question:
1 - Should a Koha 2.2.9 system preferably be set up for Unicode
("UTF-8") or for Latin-1 ("ISO-8859-1") / Latin-9 ("ISO-8859-15")?
By reading the following page:
Encoding and Character Sets in Koha
http://wiki.koha.org/doku.php?id=encodingscratchpad
... and in particular the first version of that page -
http://wiki.koha.org/doku.php?id=encodingscratchpad&rev=1152103445 -
it seems that for versions of Koha >= 2.2.6, I should set up the
"locale", Apache and MySQL for Unicode ("UTF-8").
Is this correct?
My next question is this one:
2 - What is the "best" way to convert this "mixed" mysqldump (UTF-8 /
ISO-8859-1) file to a "pure" UTF-8 one (or to a "pure" Latin-1 one)?
I have already found these pages, but would appreciate feedback
from fellow Koha users who have already had this problem:
How to sanitize a string with mixed encodings - UTF-8 and Latin1
http://www.fischerlaender.net/php/sanitize-utf8-latin1
Encoding issues MySql Latin / UTF-8
http://www.vlugge.eu/blog/algemeen/encoding-issues-mysql-latin-utf-8/
Mixed ISO-8859/UTF-8 conversion
http://www.perlmonks.org/?node_id=642617
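The general idea discussed in pages like these can be sketched in a few lines of Python: walk through the raw bytes, keep every well-formed multi-byte UTF-8 sequence as it is, and treat any other byte above 0x7F as a Latin-1 character to be re-encoded as UTF-8. This is a minimal sketch and not a tested migration tool. The function name is made up, it assumes the non-UTF-8 bytes are ISO-8859-1 (ISO-8859-15 differs in a few code points, such as the euro sign), and it cannot distinguish genuine UTF-8 from Latin-1 text that happens to look like UTF-8 (for example "Ã£").

```python
import re

# One well-formed multi-byte UTF-8 sequence (2 to 4 bytes), or, failing
# that, a single stray byte above 0x7F (assumed here to be Latin-1).
_HIGH_BYTES = re.compile(
    rb"([\xC2-\xDF][\x80-\xBF]"
    rb"|\xE0[\xA0-\xBF][\x80-\xBF]"
    rb"|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}"
    rb"|\xED[\x80-\x9F][\x80-\xBF]"
    rb"|\xF0[\x90-\xBF][\x80-\xBF]{2}"
    rb"|[\xF1-\xF3][\x80-\xBF]{3}"
    rb"|\xF4[\x80-\x8F][\x80-\xBF]{2})"
    rb"|([\x80-\xFF])"
)


def _repair(match):
    if match.group(1) is not None:
        return match.group(1)  # already valid UTF-8: keep unchanged
    # Stray high byte: interpret as Latin-1, re-encode as UTF-8.
    return match.group(2).decode("latin-1").encode("utf-8")


def to_pure_utf8(data: bytes) -> bytes:
    """Return data with Latin-1 bytes re-encoded; ASCII passes through."""
    return _HIGH_BYTES.sub(_repair, data)
```

For a dump, this could be applied to the file contents read in binary mode, with the result written to a new file so that the original dump stays untouched for comparison.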
And now, the "main question":
3 - Is the following sequence the best migration strategy?
3.1 - "Transform" the mysqldump to a "pure" UTF-8 file
3.2 - Install Koha 2.2.9 on the "second" machine (running SLES 10)
3.3 - Import the mysqldump on the "second" machine
3.4 - Install Koha 3 Beta 2 on the "second" machine
3.5 - Follow the steps described at:
Upgrading from Koha 2.2 to Koha 3.0
http://wiki.koha.org/doku.php?id=22_to_30
... or is there an easier / better way to do this?
Thanks for taking the time to read this!
ANY help / information / feedback would be much appreciated! :)
Best wishes,
Ricardo Dias Marques
lists AT ricmarques DOT net