[Koha] How to remove unwanted characters when importing MARC data?

Michael Kuhn mik at adminkuhn.ch
Thu Jun 22 07:05:19 NZST 2017


Hi Pedro

> I don't think you'll find an easy way within Koha to do that, maybe bulk 
> edit but I don't know, haven't used it.
> 
> From our experience - we've had all sorts of unwanted data like the one 
> you're experiencing, and worse - MARCXML is the way to go.
> Assuming the MARC file is well-formed, convert MARC -> MARCXML (see 
> MARC4J <https://github.com/marc4j/marc4j>, there are others) and apply a 
> custom made XSL (take your pick: xmlstarlet 
> <http://xmlstar.sourceforge.net/docs.php>, xmllint 
> <http://xmlsoft.org/xmllint.html>, xsltproc 
> <http://xmlsoft.org/XSLT/xsltproc.html>, whatever) after.

Which of these would you actually recommend? (Which is the "best" one?)

> Switch fields, 
> remove unwanted characters, field joining, field splitting, whatever, it 
> can be done with XSL. MarcEdit wouldn't respond to all our needs.
> 
> Yes, you'll have to learn XSL if you don't know already and yes it will 
> require time to figure it all out but if you're working with Koha for 
> the long run, you'll virtually be equipped with a tool that'll solve all 
> your data problems in the future.

Until now I have just written my own MARCXML when migrating data, but 
there was no need to change MARCXML.

I already know a bit of XSL, but this means someone will have to download 
the data, then edit it in a way that has yet to be worked out, and only 
then can it be imported into Koha. Since the library acquires such data 
several times a year, the editing process has to be as easy as possible, 
because it will be done by an unsuspecting librarian with no shell 
experience or even shell access.
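
To illustrate how small such an editing step could be once the data is 
in MARCXML, here is a minimal sketch in Python (standard library only) 
rather than XSL. The function name strip_unwanted, the UNWANTED character 
set ("¬") and the sample record are assumptions made up for illustration; 
the actual unwanted characters in the acquired data would have to be 
substituted. Only the MARCXML namespace URI is standard.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace.
MARC_NS = "http://www.loc.gov/MARC21/slim"
# Serialize with a default namespace instead of generated "ns0:" prefixes.
ET.register_namespace("", MARC_NS)

# Assumption: the characters to remove. Replace with the real ones.
UNWANTED = "¬"


def strip_unwanted(marcxml: str, unwanted: str = UNWANTED) -> str:
    """Return the MARCXML with the unwanted characters removed from
    the text of every subfield (hypothetical helper, not part of Koha)."""
    root = ET.fromstring(marcxml)
    table = str.maketrans("", "", unwanted)
    for subfield in root.iter(f"{{{MARC_NS}}}subfield"):
        if subfield.text:
            subfield.text = subfield.text.translate(table)
    return ET.tostring(root, encoding="unicode")


# Made-up sample record for demonstration only.
SAMPLE = (
    '<record xmlns="http://www.loc.gov/MARC21/slim">'
    '<datafield tag="245" ind1="1" ind2="0">'
    '<subfield code="a">¬The¬ example title</subfield>'
    '</datafield></record>'
)

cleaned = strip_unwanted(SAMPLE)
print(cleaned)
```

Something like this could be wrapped in a double-clickable script or a 
small web form so that no shell access is needed, but that part is not 
shown here.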

Thanks for the hints on XSL - I will put that on my personal todo list, 
but this gets longer and longer, being filled up with acronyms and no end...

Best wishes: Michael
-- 
Geschäftsführer · Diplombibliothekar BBS, Informatiker eidg. Fachausweis
Admin Kuhn GmbH · Pappelstrasse 20 · 4123 Allschwil · Schweiz
T 0041 (0)61 261 55 61 · E mik at adminkuhn.ch · W www.adminkuhn.ch

