[Koha] BOM in MARC 21 data files

Michael Kuhn mik at adminkuhn.ch
Mon Dec 25 08:37:22 NZDT 2017


Hi

Recently we have run into the problem that some vendors deliver their 
UTF-8-encoded MARC 21 data files (ISO 2709) with a leading byte order 
mark / BOM (see https://en.wikipedia.org/wiki/Byte_order_mark). Most 
vendors I know (e.g. Overdrive) don't include such a BOM, but in 
Germany some do. When asked to change this, they say "every ILS 
should be able to handle this".

However, when such a file is imported into Koha, the first record is 
simply ignored. Of course, this is especially bad when the data file 
contains only one record.

1. As far as I know, the BOM is not part of the MARC 21 (ISO 2709) 
format, so there shouldn't be a leading BOM. Am I correct?

2. Is there a way within Koha to remove that unwanted BOM or to teach 
Koha to ignore it and import the data anyway?

Of course such a BOM could be removed manually using sed, but I feel 
that either a) the vendor shouldn't include the BOM at all or b) Koha 
"should be able to handle this".
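
For illustration, the manual workaround could also be scripted. The 
following is only a minimal Python sketch, assuming the file fits into 
memory and that the BOM is the three-byte UTF-8 one (EF BB BF) at the 
very start of the file. The function names are hypothetical and not part 
of Koha or any MARC library.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_from_file(src_path: str, dst_path: str) -> bool:
    """Copy src_path to dst_path without a leading UTF-8 BOM.

    Returns True if a BOM was found and removed, False otherwise.
    The file is handled as raw bytes, so the MARC records themselves
    are not decoded or altered.
    """
    with open(src_path, "rb") as src:
        data = src.read()
    cleaned = strip_utf8_bom(data)
    with open(dst_path, "wb") as dst:
        dst.write(cleaned)
    return len(cleaned) != len(data)
```

Calling strip_bom_from_file("vendor.mrc", "vendor-clean.mrc") (both file 
names are placeholders) would write a copy that starts directly with the 
first record's leader, which could then be imported as usual.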

What do you think?

Best wishes and happy holidays! Michael
-- 
Geschäftsführer · Diplombibliothekar BBS, Informatiker eidg. Fachausweis
Admin Kuhn GmbH · Pappelstrasse 20 · 4123 Allschwil · Schweiz
T 0041 (0)61 261 55 61 · E mik at adminkuhn.ch · W www.adminkuhn.ch

