[Koha] Import MARC record error: utf8 "\xE0"

Bernardo Gonzalez Kriegel bgkriegel at gmail.com
Mon Sep 2 08:47:19 NZST 2013


Hi,
I get the same error with the stage tool,
but the record imports successfully with the command-line tool bulkmarcimport.pl [1].

Perhaps it is a bug in the encoding/decoding done by the stage tool.
As a workaround, try the command-line import tool.
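
If it helps to narrow this down, here is a minimal sketch (Python, not part of Koha; the function name find_invalid_utf8 is made up for illustration) that checks whether a byte string is valid UTF-8 and reports where decoding first fails. The sample bytes are copied from the start of the string in the quoted trace. Cutting off the final byte is only an assumption to reproduce a dangling \xe0\xb8 like the one at the end of that string, which may simply be the debugger truncating its output.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, reason) for the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        # exc.start is the byte offset where the bad sequence begins
        return exc.start, exc.reason
    return None


# Five complete 3-byte Thai characters, copied from the quoted trace
complete = b"\xe0\xb9\x84\xe0\xb8\x9e\xe0\xb8\xa8\xe0\xb8\xb2\xe0\xb8\xa5"

# Assumption for illustration: drop the last byte, leaving an incomplete
# sequence (\xe0\xb8) at the end of the data
truncated = complete[:-1]

print(find_invalid_utf8(complete))
print(find_invalid_utf8(truncated))
```

To check a real file, read it in binary mode (for example open("error.mrc", "rb").read()) and pass the bytes to the function. This only tests UTF-8 validity; it does not parse the MARC structure.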

Regards,
Bernardo

[1] http://snag.gy/T3X02.jpg






On Sat, Aug 31, 2013 at 3:17 AM, Pongtawat Chippimolchai <pongtawat.c at gmail.com> wrote:

> Hello all,
>
> I got the following error while importing MARC records into Koha 3.12.3 (on
> Ubuntu 12.04 amd64 via Debian package):
>
> utf8 "\xE0" does not map to Unicode at /usr/lib/perl/5.14/Encode.pm line 174.
>
> The record staged fine, but it could not be imported. When I click to view
> it in the staging area, the same error appears.
>
> I checked the imported file, and it looks fine to me. The MARCXML in
> the database looks OK too. But if I save the MARC BLOB and run marcdump on
> it, I get the same error.
>
> The imported file is here:
> https://www.dropbox.com/s/u25mf4dqk69wtgx/error.mrc
> The saved MARC BLOB is here:
> https://www.dropbox.com/s/soawpzmnuy5j7zr/import_records-marc.bin
>
> Running perl -d commit_file.pl shows something like this:
>
> -------------
> utf8 "\xE0" does not map to Unicode at /usr/lib/perl/5.14/Encode.pm line 174.
>  at /usr/lib/perl/5.14/Encode.pm line 174
>    Encode::decode('UTF-8', 'ha\x{1e}0 \x{1f}a\x{e0}\x{b9}\x{84}\x{e0}\x{b8}\x{9e}\x{e0}\x{b8}\x{a8}\x{e0}\x{b8}\x{b2}\x{e0}\x{b8}\x{a5} \x{e0}\x{b8}\x{81}\x{e0}\x{b8}\x{a4}\x{e0}\x{b8}\x{a9}\x{e0}\x{b8}\x{8e}\x{e0}\x{b8}\x{b2}\x{e0}\x{b8}\x{98}\x{e0}\x{b8}\x{b4}\x{e0}\x{b8}\x{a7}\x{e0}\x{b8}\x{b8}\x{e0}\x{b8}\x{92}\x{e0}\x{b8}', 1) called at /usr/share/perl5/MARC/File/Encode.pm line 35
>    MARC::File::Encode::marc_to_utf8('ha\x{1e}0 \x{1f}a\x{e0}\x{b9}\x{84}\x{e0}\x{b8}\x{9e}\x{e0}\x{b8}\x{a8}\x{e0}\x{b8}\x{b2}\x{e0}\x{b8}\x{a5} \x{e0}\x{b8}\x{81}\x{e0}\x{b8}\x{a4}\x{e0}\x{b8}\x{a9}\x{e0}\x{b8}\x{8e}\x{e0}\x{b8}\x{b2}\x{e0}\x{b8}\x{98}\x{e0}\x{b8}\x{b4}\x{e0}\x{b8}\x{a7}\x{e0}\x{b8}\x{b8}\x{e0}\x{b8}\x{92}\x{e0}\x{b8}') called at /usr/share/perl5/MARC/File/USMARC.pm line 172
>    MARC::File::USMARC::decode('00932nas0a220021700045000050017000000080043000170200014000600...') called at /usr/share/perl5/MARC/Record.pm line 81
>    MARC::Record::new_from_usmarc('MARC::Record', '00932nas0a220021700045000050017000000080043000170200014000600...') called at /usr/share/koha/lib/C4/ImportBatch.pm line 585
>    C4::ImportBatch::BatchCommitRecords(13, '', 100, 'CODE(0x5227d58)') called at /usr/share/koha/bin/commit_file.pl line 85
>    main::process_batch(13) called at /usr/share/koha/bin/commit_file.pl line 57
> -------------
>
> I really don't understand how Perl (and Koha) handle UTF-8. If I print
> that escaped string, some characters are readable but the others
> appear as ?. For the records that could be imported, printing their
> data from Perl shows the same pattern (readable characters mixed with ?),
> yet they somehow end up correct in the database.
>
> I have no idea what's wrong here. If you have any clue, please help.
>
> Thank you very much,
> Pongtawat

