[Koha] How to remove unwanted characters when importing MARC data?
Michael Kuhn
mik at adminkuhn.ch
Thu Jun 22 04:55:38 NZST 2017
Hi
Our library receives MARC data from EKZ (a German cataloging data
provider) which includes two unwanted characters:
* a beginning "non-sorting character"
* an ending "non-sorting character"
These characters can't be seen in the OPAC or in the hit list of the
staff client, but they do appear in the framework and also in the top
line of the web browser. Here is an example of a file containing such
characters: http://adminkuhn.ch/download/kuhn0000000
When opening the original .mrc file with vi these characters show as:
<98>The<9c> obsession
With "od -c" they show as:
302 230 T h e 302 234 o b s e s s i o n
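For reference, the two views above describe the same bytes: octal 302 230 and 302 234 are the UTF-8 encodings of the code points U+0098 and U+009C. A minimal Python sketch to check this (the helper name decode_od_tokens is just made up for illustration):

```python
def decode_od_tokens(dump: str) -> str:
    """Turn whitespace-separated `od -c` style tokens back into text.

    Three-digit octal tokens are treated as raw byte values, anything
    else as a literal ASCII character.
    """
    raw = bytearray()
    for token in dump.split():
        if len(token) == 3 and all(ch in "01234567" for ch in token):
            raw.append(int(token, 8))
        else:
            raw += token.encode("ascii")
    return raw.decode("utf-8")


text = decode_od_tokens("302 230 T h e 302 234")
# Print the code point of every character in the decoded string.
print([f"U+{ord(ch):04X}" for ch in text])
```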
Of course these characters could be removed, e.g. with sed, but this
results in a wrong record length in MARC leader positions 0-4, and it
has to be done separately on the shell, outside and before the regular
import process. Another option would be external software like MarcEdit.
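To illustrate why a plain sed is not enough: in an ISO 2709 (.mrc) record, removing bytes also changes the field lengths and start offsets in the directory, not only the record length in the leader. The following is only a rough Python sketch of an external clean-up step, not a Koha feature. It assumes UTF-8 encoded records with the usual 12-byte directory entries (entry map 4500), and the function names strip_non_sort and strip_file are invented for this example.

```python
FT = b"\x1e"  # field terminator
RT = b"\x1d"  # record terminator
# UTF-8 byte sequences of U+0098 and U+009C, as seen in the od output.
NON_SORT = (b"\xc2\x98", b"\xc2\x9c")


def strip_non_sort(record: bytes) -> bytes:
    """Remove the two characters from one record and rebuild its offsets.

    Assumption: `record` is a single complete ISO 2709 record that ends
    with the record terminator.
    """
    leader = record[:24]
    base = int(leader[12:17])          # base address of data
    directory = record[24:base - 1]    # without its field terminator
    data = record[base:]

    new_directory = bytearray()
    new_data = bytearray()
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[:3]
        length = int(entry[3:7])
        start = int(entry[7:12])
        field = data[start:start + length]
        for seq in NON_SORT:
            field = field.replace(seq, b"")
        # Recompute the field length and start for the cleaned field.
        new_directory += tag + b"%04d%05d" % (len(field), len(new_data))
        new_data += field
    new_data += RT

    # The directory size is unchanged, so the base address stays the same;
    # only the record length in leader positions 0-4 has to be rewritten.
    total = base + len(new_data)
    return b"%05d" % total + leader[5:] + bytes(new_directory) + FT + bytes(new_data)


def strip_file(raw: bytes) -> bytes:
    """Apply strip_non_sort to every record of a .mrc file's content."""
    return b"".join(strip_non_sort(r + RT) for r in raw.split(RT) if r.strip())
```

It could be used by reading the .mrc file in binary mode, passing the bytes to strip_file and writing the result to a new file, which is then staged in Koha as usual.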
Now the question is whether there is an EASY way to delete these unwanted
characters within Koha, for example by using the MARC modification
templates, which are used anyway when loading such data?
Best wishes: Michael
--
Geschäftsführer · Diplombibliothekar BBS, Informatiker eidg. Fachausweis
Admin Kuhn GmbH · Pappelstrasse 20 · 4123 Allschwil · Schweiz
T 0041 (0)61 261 55 61 · E mik at adminkuhn.ch · W www.adminkuhn.ch