[Koha] Bulkmarcimport and special characters

Zeno Tajoli tajoli at cilea.it
Sat Nov 11 00:30:23 NZDT 2006


Hi all,

At 13.46 09/11/2006, Joshua Ferraro wrote:
>On Wed, Nov 08, 2006 at 06:53:12PM +0000, MJ Ray wrote:
> > > From: "José Lagoas" <jlagoas at lnec.pt>
> > > Second Question: The Portuguese Language uses some special characters
> > > like "ã". We have used bulkmarcimport for importing our database. This
> > > program changes "ão" to "ô", "ãe" to "ê " and "áu" to "ù". Have you any
> > > idea how to solve this problem?
> >
> > Sounds like a character encoding problem.  If bulkmarcimport is using
> > UTF-8 at last, try running:
> >   iconv -f iso-8859-15 -t utf-8 -o outfile.marc infile.marc
> > and then feed outfile.marc to bulkmarcimport.
>Actually, this may work for UNIMARC, but it will completely corrupt a
>MARC21 file because MARC21 uses MARC8 encoding (and I doubt iconv
>understands MARC8). Also, iconv will not update the leader to specify
>the encoding.
>
>To properly convert from MARC8 to UTF-8, you'll need to use a MARC
>editor (I think MARCEdit can do it), or you'll need to write a script to
>do the conversion using one of the MARC toolkits out there.

If Jose's characters are encoded in MARC8, the best tool to use is MARC::Charset,
http://search.cpan.org/~esummers/MARC-Charset-0.95/.

If you have data in iso-8859-x you can use iconv.

To find out which encoding your data is in, ask the software vendor.
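
If the vendor cannot say, a rough check is possible by hand. The sketch
below is only an illustration, not part of Koha or bulkmarcimport: the
function name read_as_marc8 and the four-entry table are my own, and the
table covers just a few MARC8 (ANSEL) combining diacritics. It assumes the
file is really ISO-8859-1 and shows what happens when those bytes are read
as MARC8, where the diacritic byte comes before the base letter.

```python
import unicodedata

# Partial, illustrative table of MARC8 (ANSEL) combining diacritics.
# In MARC8 the diacritic byte PRECEDES the letter it modifies.
MARC8_COMBINING = {
    0xE1: "\u0300",  # grave accent
    0xE2: "\u0301",  # acute accent
    0xE3: "\u0302",  # circumflex
    0xE4: "\u0303",  # tilde
}


def read_as_marc8(data: bytes) -> str:
    """Decode bytes the way a MARC8 reader would (tiny subset only)."""
    out = []
    pending = []  # diacritics waiting for their base letter
    for b in data:
        if b in MARC8_COMBINING:
            pending.append(MARC8_COMBINING[b])
        elif b < 0x80:
            out.append(chr(b))   # base letter first in Unicode order
            out.extend(pending)  # then the combining marks
            pending = []
        else:
            raise ValueError(f"byte 0x{b:02X} is not in this small table")
    out.extend(pending)  # keep any diacritic left without a base letter
    return unicodedata.normalize("NFC", "".join(out))


if __name__ == "__main__":
    for word in ("ão", "ãe", "áu"):
        latin1_bytes = word.encode("iso-8859-1")
        print(word, "->", read_as_marc8(latin1_bytes))
```

If the output matches the corruption you see after the import, the data is
probably in ISO-8859-1 and is being read as MARC8.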

All IMHO

Bye, (:->>


Zeno Tajoli
CILEA - Segrate (MI)
tajoliAT_SPAM_no_prendiATcilea.it
(Address masked against spam; replace everything between the two ATs with @)


