[Koha] Bulkmarcimport and special characters

Cindy Murdock cmurdock at ccfls.org
Fri Nov 10 10:26:46 NZDT 2006


Joshua Ferraro wrote:
> To properly convert from MARC8 to UTF-8, you'll need to use a MARC
> editor (I think MARCEdit can do it), or you'll need to write a script to
> do the conversion using one of the MARC toolkits out there.

I've been using marc2xml and xml2marc to convert from MARC-8 to XML and
then back to (presumably) UTF-8 MARC.  Example:

marc2xml mymarc8file.mrc > newxmlfile.xml

then when it's finished:
xml2marc newxmlfile.xml > newmarcfile.mrc

At least, this is useful for finding which records it's choking on.  For 
example, if xml2marc reports errors and stops prematurely, run 
'tail --bytes=500 newmarcfile.mrc' to see the last record it managed to 
write.  Find that record in newxmlfile.xml, then examine the record right 
after it to find the encoding problem.
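
The tail trick can also be scripted.  Here's a rough sketch in Python, 
assuming the output file is plain ISO 2709 MARC, where every record ends 
with a 0x1D byte and the leader and directory are laid out as usual.  The 
function names are made up for illustration; they don't come from Koha or 
from any MARC toolkit.  It reports how many complete records were written 
and the 001 control number of the last one, which is what you'd then 
search for in newxmlfile.xml.

```python
RECORD_TERMINATOR = b"\x1d"  # ends every ISO 2709 (MARC) record
FIELD_TERMINATOR = b"\x1e"   # ends the directory and every field


def control_number(record: bytes):
    """Return the 001 field of one raw MARC record, or None if it has none."""
    base = int(record[12:17])        # leader bytes 12-16: base address of data
    directory = record[24:base - 1]  # directory sits between leader and data
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]  # 3-byte tag, 4-byte length, 5-byte offset
        if entry[0:3] == b"001":
            length = int(entry[3:7])
            start = int(entry[7:12])
            field = record[base + start:base + start + length]
            return field.rstrip(FIELD_TERMINATOR).decode("ascii", "replace")
    return None


def last_complete_record(data: bytes):
    """Return (number of complete records, 001 of the last complete one)."""
    chunks = data.split(RECORD_TERMINATOR)
    # Anything after the final terminator is a partially written record.
    complete = [chunk for chunk in chunks[:-1] if chunk]
    if not complete:
        return 0, None
    return len(complete), control_number(complete[-1] + RECORD_TERMINATOR)
```

To use it, read newmarcfile.mrc in binary mode and pass the bytes to 
last_complete_record().  The record count tells you how far into 
newxmlfile.xml to look, and the control number tells you which record 
comes just before the one that caused the problem.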

Of course, I'm no utf8 expert, so take this with a grain of salt!

hth,
c.

