[Koha] Import MARC record error: utf8 "\xE0"

Paul paul.a at aandc.org
Mon Sep 2 02:52:38 NZST 2013

At 01:17 PM 8/31/2013 +0700, Pongtawat Chippimolchai wrote:
>Hello all,
>I got the following error while importing MARC records into Koha 3.12.3 (on
>Ubuntu 12.04 amd64 via Debian package):
>utf8 "\xE0" does not map to Unicode at /usr/lib/perl/5.14/Encode.pm line
>The record staged fine, but it couldn't be imported. When I click to view
>it in the staging area, the same error appears.
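
As background on the error text quoted above: 0xE0 is the lead byte of a three-byte UTF-8 sequence (Thai characters, for example, are encoded this way), so a strict decoder rejects it when the two continuation bytes are missing or are not continuation bytes. The sketch below is in Python rather than Perl and is not Koha's code. The helper name `try_decode` and the sample byte strings are invented for illustration, and it does not claim to identify the actual cause in this record.

```python
def try_decode(raw):
    """Strictly decode raw bytes as UTF-8; return None if they are invalid.

    Hypothetical helper, used only to show when a strict decoder gives up.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    samples = {
        # U+0E01 (THAI CHARACTER KO KAI) as a complete three-byte sequence.
        "complete": b"\xe0\xb8\x81",
        # The same character with its last byte cut off.
        "truncated": b"\xe0\xb8",
        # 0xE0 followed by ASCII, as it would appear in Latin-1 text.
        "not utf-8": b"\xe0 la carte",
    }
    for label, raw in samples.items():
        print(label, repr(try_decode(raw)))
```

Only the complete sequence decodes; the other two samples make the strict decoder fail on the 0xE0 byte.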

This may be a long shot, but I remember a similar problem that turned out 
to have nothing to do with Unicode: the cause was simply hyphens in ISBNs. 
You have put what looks like an ISSN, with a hyphen, into 020$a, and my 
verification script reports: "020 - Invalid data found: 8057-7730.  Only 
[^(97(8|9))?\d{9}(\d|X)$] is valid within this field." (I realize that MARC 
purists will object that more, e.g. "(pbk)", can be added, but bulk 
imports seem to be touchy on this subject, so I added the "$" at the end of 
the regex.)
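
For illustration, the check quoted above can be sketched in Python. This is not the verification script itself: the function name `is_bare_isbn` is made up for the example, and the square brackets around the pattern in the script's message are assumed to be delimiters rather than part of the regex.

```python
import re

# Pattern from the message above, assuming the enclosing [ ] are only
# delimiters in the script's output: an optional 978/979 prefix, nine
# digits, then a final digit or X.
ISBN_RE = re.compile(r"^(97(8|9))?\d{9}(\d|X)$")


def is_bare_isbn(value):
    """Return True if value has the shape of a bare 10- or 13-character ISBN.

    Hypothetical helper: it checks the shape only, not the check digit.
    """
    # fullmatch rejects anything before or after the pattern, including
    # hyphens, qualifiers such as "(pbk)", and a trailing newline.
    return ISBN_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("8057-7730", "0306406152", "9780306406157",
                      "0306406152 (pbk)"):
        print(candidate, is_bare_isbn(candidate))
```

The hyphenated value from the record and the one with a trailing qualifier are both rejected, which is the strictness the trailing "$" was added for.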

Best - Paul

>I checked the imported file, and it seems fine to me. The MARCXML in
>the database looks OK too. But if I save the MARC BLOB and run marcdump on
>it, I get the same error.
>The imported file is here:
>The saved MARC BLOB is here:
>Running perl -d commit_file.pl shows something like this:
>utf8 "\xE0" does not map to Unicode at /usr/lib/perl/5.14/Encode.pm line
>  at /usr/lib/perl/5.14/Encode.pm line 174
>    Encode::decode('UTF-8', 'ha\x{1e}0
>1) called at /usr/share/perl5/MARC/File/Encode.pm line 35
>    MARC::File::Encode::marc_to_utf8('ha\x{1e}0
>called at /usr/share/perl5/MARC/File/USMARC.pm line 172
>called at /usr/share/perl5/MARC/Record.pm line 81
>    MARC::Record::new_from_usmarc('MARC::Record',
>'00932nas0a220021700045000050017000000080043000170200014000600...') called
>at /usr/share/koha/lib/C4/ImportBatch.pm line 585
>    C4::ImportBatch::BatchCommitRecords(13, '', 100, 'CODE(0x5227d58)')
>called at /usr/share/koha/bin/commit_file.pl line 85
>    main::process_batch(13) called at /usr/share/koha/bin/commit_file.pl line 57
>I really don't understand how Perl (and Koha) handle UTF-8. If I print
>that escaped string out, some characters are readable but the others
>appear as ?. For the records that could be imported, if I print out their
>data from Perl I see the same pattern (readable characters mixed with ?),
>but they somehow end up correct in the database.
>I have no idea what's wrong here. If you have any clue, please help.
>Thank you very much,

Maritime heritage and history, preservation and conservation,
research and education through the written word and the arts.
<http://NavalMarineArchive.com> and <http://UltraMarine.ca>
