[Koha] Dealing with bad MARC records

Steven F. Baljkas baljkas at mts.net
Tue Jun 19 10:38:39 NZST 2007


Monday, June 18, 2007      17:30 CDT

Hi, Kevin,

Well, it's good to be able to wake up after a whole day sick and be able to do something useful for someone ...

I've actually encountered the problem you're having, too.

What I did as a quick fix was to convert the file into tagged MARC -- MarcEdit allowed me to do this; if it doesn't work for you, let me know -- and then open the tagged file in MS Word. In Word, I could search for the weird non-numeric field designator and replace it with something distinctively numeric.

I had to do that because MarcEdit's global search and replace functions wouldn't work on non-numeric fields either (at least not for me).

Also note: in this case, you wouldn't want to change the 5|| into 500, because you may already have genuine 500 (General Note) fields that you'd want to keep distinct. Use something like 582 (which doesn't exist formally) or 59x (whichever of the 590s you haven't designated or planned any other use for).
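
For anyone who would rather script the search and replace than do it in a word processor, here is a minimal Python sketch of the same idea. It assumes the tagged file uses the usual MarcEdit mnemonic layout, where every field starts a line with "=" followed by the tag (so the bad fields begin with "=5||"), and that 590 is a tag you are not using for anything else. The function names and the UTF-8 default are illustrative choices only; check a sample of your own file before trusting it.

```python
import re

# Assumed layout: each field starts a line as "=TAG", e.g. "=5||  \\$a..."
BAD_TAG_LINE = re.compile(r"^=5\|\|", re.MULTILINE)


def retag(text: str, new_tag: str = "590") -> str:
    """Replace the tag on every line that begins with =5||; leave all else alone."""
    return BAD_TAG_LINE.sub("=" + new_tag, text)


def retag_file(src: str, dst: str, new_tag: str = "590",
               encoding: str = "utf-8") -> None:
    """Read a tagged MARC text file, fix the bad tags, write a new file."""
    # newline="" keeps the file's original line endings untouched.
    with open(src, encoding=encoding, newline="") as f:
        text = f.read()
    with open(dst, "w", encoding=encoding, newline="") as f:
        f.write(retag(text, new_tag))
```

Because the pattern is anchored to the start of a line, genuine 500 fields and any "5||" that happens to appear inside field text are left untouched. The corrected file can then be turned back into MARC in MarcEdit as usual.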

If you don't have access to MS Word or you have other problems, give me a shout back off listserv. Despite my health, I should have a few free hours this week and as long as the programs work to do things automatically, I could convert the files for you easily enough.

Do let the listserv know how things worked out, Kevin.

In the meantime, hope this helps (at least as a work-around).

Cheers,
Steven F. Baljkas
library tech at large
Koha neophyte
volunteer cataloguer
Winnipeg, MB, Canada

P.S. Sorry that LAC's records are fouling things up for you. I've encountered those 5|| fields, too. I think they were meant to be normative 500s and something 'exotic' happened in someone's cataloguing editor.
If you don't mind taking the time once you have your solution worked out, you can report those kinds of boo-boos to LAC (using the AMICUS no. for reference) and they will try to correct them. That way, everything improves for everyone.
 -- SFB

============================================================
From: Kevin O'Rourke <lists at caboose.org.uk>
Date: 2007/06/18 Mon AM 03:58:20 CDT
To: Koha Mailing List <koha at lists.katipo.co.nz>
Subject: [Koha] Dealing with bad MARC records

This question is not directly related to Koha itself, but to preparing 
MARC records for import.

Some of the records we've been downloading from Library and Archives 
Canada contain a "5||" (two pipe characters) tag with information about 
translation.  As far as I can tell this is not valid MARC.

I developed a little Java program to pre-process records before 
importing into Koha, using the marc4j library to read MARC.  This 
library refuses to read records containing non-numeric tags, causing us 
problems.

Can anyone recommend any tools for ensuring that a MARC record contains 
only valid MARC?  I could use MarcEdit to 'break' the files, edit out 
the bad tags and then re'make' them but this is a bit complicated, 
time-consuming and error-prone.
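
One way to skip the break/edit/make round trip might be to patch the tag directly in the binary file. The following is only a sketch in Python, under these assumptions: the files are ordinary ISO 2709 / MARC 21 records (24-byte leader, 12-byte directory entries, records ending in the 0x1D terminator with nothing between them), the bad tag is literally the three bytes "5||", and 590 is a free local tag. The function names are made up for illustration. Since a three-byte tag is swapped for another three-byte tag, no lengths or offsets in the record need recalculating.

```python
RECORD_TERMINATOR = b"\x1d"
LEADER_LEN = 24
ENTRY_LEN = 12  # directory entry: 3-byte tag, 4-byte length, 5-byte offset


def retag_record(record: bytes, bad_tag: bytes = b"5||",
                 new_tag: bytes = b"590") -> bytes:
    """Return one record with bad_tag replaced in its directory only."""
    if len(bad_tag) != 3 or len(new_tag) != 3:
        raise ValueError("MARC tags must be exactly three bytes")
    # Leader positions 12-16 hold the base address of the field data.
    base_address = int(record[12:17].decode("ascii"))
    directory_end = base_address - 1  # last directory byte is a field terminator
    patched = bytearray(record)
    for pos in range(LEADER_LEN, directory_end, ENTRY_LEN):
        if record[pos:pos + 3] == bad_tag:
            patched[pos:pos + 3] = new_tag
    return bytes(patched)


def retag_marc(data: bytes, bad_tag: bytes = b"5||",
               new_tag: bytes = b"590") -> bytes:
    """Apply retag_record to every record in a concatenated MARC file."""
    records = [r + RECORD_TERMINATOR
               for r in data.split(RECORD_TERMINATOR) if r.strip()]
    return b"".join(retag_record(r, bad_tag, new_tag) for r in records)
```

Only the directory is touched, so field data that happens to contain "5||" is left alone. The output should then be readable by tools that insist on numeric tags, though that is worth confirming on a small sample first.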

-- 
Kevin O'Rourke
ICT Coordinator, National Teachers' Institute, Kaduna, Nigeria
062 316972

_______________________________________________
Koha mailing list
Koha at lists.katipo.co.nz
http://lists.katipo.co.nz/mailman/listinfo/koha
============================================================


