[Koha] rebuild_nozebra.pl failures: wide characters and missing end tags in MARC XML

Jeffrey LePage jeffrey_lepage at yahoo.com
Sat Feb 21 06:07:40 NZDT 2009


Greetings,

We have a small library, and for the sake of simplicity we run Koha without Zebra.

When I run rebuild_nozebra.pl interactively, I get errors on 15 of our 8000+ records. Judging by the biblioitem numbers, I think all or most of these records entered the system when we imported MARC records from Sagebrush Athena.

The errors fall into 2 categories:

Error 1
----------------
Cannot decode string with wide characters at /usr/lib/perl/5.8/Encode.pm line 166

I can see the wide characters when I query biblioitems.marcxml. For example, there's an accent mark over a letter in an illustrator's name in one of the Harry Potter books.

How do I fix these records? I note that the biblioitems table contains:
   a) marcxml longtext
   b) marc longblob
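
One way to narrow down which characters trigger Error 1 is to scan the marcxml text for code points above 255, which is what Perl means by a "wide" character. The following is only a minimal sketch in Python, not part of Koha: the function name wide_characters and the sample string are hypothetical, and it assumes the marcxml value has already been fetched from the database as a Unicode string.

```python
import unicodedata

def wide_characters(text):
    """Return (offset, char, unicode_name) for each character above U+00FF."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 0xFF
    ]

# Hypothetical subfield value: a plain "e" followed by a combining acute accent.
sample = '<subfield code="a">Jose\u0301</subfield>'

for offset, ch, name in wide_characters(sample):
    print(offset, hex(ord(ch)), name)
```

Note that a precomposed accented letter such as U+00E9 is below 256 and would not be reported, whereas a base letter followed by a combining accent would be.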

Error 2
---------------
No close tag marker

The MARC XML is indeed missing close tags. In each case, subfield 6 of datafield 952 is not closed, and the record is cut off at that point. The record ends like this:

<datafield tag="952" ind1=" " ind2=" ">
  <subfield code="7">0</subfield>
  <subfield code="p">5595</subfield>
  <subfield code="4">0</subfield>
  <subfield code="0">0</subfield>
  <subfield code="6">SPA_635_900000000000000_S
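
Records truncated like the one above can be detected without running the full rebuild, by checking whether each marcxml value parses as well-formed XML. This is a minimal sketch in Python using only the standard library, not a Koha tool: the function name xml_error and both sample strings are hypothetical, and fetching the rows from biblioitems is left out.

```python
import xml.etree.ElementTree as ET

def xml_error(xml_text):
    """Return None if xml_text is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None

# Hypothetical samples: one complete record, one cut off inside a subfield.
complete = (
    '<record><datafield tag="952" ind1=" " ind2=" ">'
    '<subfield code="p">5595</subfield></datafield></record>'
)
truncated = (
    '<record><datafield tag="952" ind1=" " ind2=" ">'
    '<subfield code="6">SPA_635'
)

for label, text in (("complete", complete), ("truncated", truncated)):
    print(label, "->", xml_error(text) or "well-formed")
```

A check like this only reports that a record is broken; it does not say what the missing content should have been.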


*******************
How should I fix these errors, and how can I prevent them in the future? If I manually repair biblioitems.marcxml, do I also need to repair biblioitems.marc (which is a blob)?


Thanks,
Jeff LePage




-- 
Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html


      

