[Koha] Bulkmarcimport and special characters

Joshua Ferraro jmf at liblime.com
Fri Nov 10 01:46:26 NZDT 2006


On Wed, Nov 08, 2006 at 06:53:12PM +0000, MJ Ray wrote:
> > From: "José Lagoas" <jlagoas at lnec.pt>
> > Second Question: The Portuguese Language uses some special characters
> > like "ã". We have used bulkmarcimport for importing our database. This
> > program changes "ão" to "ô", "ãe" to "ê " and "áu" to "ù". Have you any
> > idea how to solve this problem?
> 
> Sounds like a character encoding problem.  If bulkmarcimport is using 
> UTF-8 at last, try running:
>   iconv -f iso-8859-15 -t utf-8 -o outfile.marc infile.marc
> and then feed outfile.marc to bulkmarcimport.

Actually, this may work for UNIMARC, but it will completely corrupt a
MARC21 file, because MARC21 normally uses MARC-8 encoding (and I doubt
iconv understands MARC-8). Also, iconv will not update the leader
(position 09) to specify the encoding.
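
As a rough illustration of the leader point, here is a minimal Python
sketch that reports which encoding each record in a binary MARC file
declares. It assumes MARC21 conventions (leader position 09 is blank for
MARC-8 and "a" for Unicode/UTF-8), so it does not apply to UNIMARC. The
function names are made up for this example and are not part of Koha or
of any MARC toolkit. It only reads the flag; it does not convert anything.

```python
# Sketch: report the declared character encoding of each record in an
# ISO 2709 (binary MARC) file by looking at leader position 09.
# Assumption: MARC21 conventions, where a blank means MARC-8 and "a"
# means UCS/Unicode (UTF-8). Names below are illustrative only.

RECORD_TERMINATOR = b"\x1d"  # ISO 2709 end-of-record byte


def declared_encoding(record: bytes) -> str:
    """Return 'MARC-8', 'UTF-8', or 'unknown' from leader byte 09."""
    if len(record) < 24:  # a MARC leader is always 24 bytes long
        return "unknown"
    flag = record[9:10]
    if flag == b" ":
        return "MARC-8"
    if flag == b"a":
        return "UTF-8"
    return "unknown"


def split_records(data: bytes):
    """Yield each raw record, including its terminator byte."""
    for chunk in data.split(RECORD_TERMINATOR):
        if chunk.strip():  # skip the empty tail after the last record
            yield chunk + RECORD_TERMINATOR


def report(path: str) -> None:
    """Print the record number and declared encoding for each record."""
    with open(path, "rb") as fh:
        data = fh.read()
    for number, record in enumerate(split_records(data), start=1):
        print(number, declared_encoding(record))
```

Running report("infile.marc") prints one line per record. If the records
declare MARC-8, a plain iconv pass is the wrong tool, and even where the
bytes are converted some other way, the flag still has to be updated.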

To properly convert from MARC-8 to UTF-8, you'll need to use a MARC
editor (I think MarcEdit can do it), or you'll need to write a script to
do the conversion using one of the MARC toolkits out there.

Hope that helps,

-- 
Joshua Ferraro                       SUPPORT FOR OPEN-SOURCE SOFTWARE
President, Technology       migration, training, maintenance, support
LibLime                                Featuring Koha Open-Source ILS
jmf at liblime.com |Full Demos at http://liblime.com/koha |1(888)KohaILS

