Hi all,

At 13.46 09/11/2006, Joshua Ferraro wrote:
From: "José Lagoas" <jlagoas@lnec.pt>
Second Question: The Portuguese language uses some special characters like "ã". We have used bulkmarcimport for importing our database. This program changes "ão" to "ô", "ãe" to "ê" and "áu" to "ù". Have you any idea how to solve this problem?
On Wed, Nov 08, 2006 at 06:53:12PM +0000, MJ Ray wrote:
Sounds like a character encoding problem. If bulkmarcimport is using UTF-8 at last, try running:

iconv -f iso-8859-15 -t utf-8 -o outfile.marc infile.marc

and then feed outfile.marc to bulkmarcimport.

Actually, this may work for UNIMARC, but it will completely corrupt a MARC21 file because MARC21 uses MARC8 encoding (and I doubt iconv understands MARC8). Also, iconv will not update the leader to specify the encoding.
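For reference, a MARC21 record declares its character encoding in leader position 09: a blank means MARC-8 and "a" means UCS/Unicode. The following is only a minimal Python sketch for reading that flag, assuming MARC21 records; the function name and the two sample leaders are made up for illustration and do not come from bulkmarcimport or any MARC toolkit.

```python
def declared_encoding(record: bytes) -> str:
    """Return the encoding a MARC21 record declares in leader position 09."""
    if len(record) < 24:
        raise ValueError("a MARC leader is 24 bytes long")
    flag = record[9:10]
    if flag == b"a":
        return "UCS/Unicode"
    if flag == b" ":
        return "MARC-8"
    return "unknown (%r)" % flag.decode("ascii", "replace")


# Hypothetical leaders, invented for this example only.
marc8_leader = b"00714cam  2200205 a 4500"
utf8_leader = b"00714cam a2200205 a 4500"

print(declared_encoding(marc8_leader))  # MARC-8
print(declared_encoding(utf8_leader))   # UCS/Unicode
```

Note that the flag only says what the record claims. If the bytes were re-encoded without touching the leader, as described above, the claim and the actual data will disagree.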
To properly convert from MARC8 to UTF-8, you'll need to use a MARC editor (I think MarcEdit can do it), or you'll need to write a script to do the conversion using one of the MARC toolkits out there.
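The substitutions reported in the original question look consistent with ISO-8859-1 bytes being read as MARC-8. In MARC-8, bytes 0xE1 and 0xE3 are the combining grave and circumflex, and a combining mark precedes the letter it modifies; in ISO-8859-1 the same bytes are "á" and "ã". The Python sketch below reproduces the symptom under that assumption. The function name and the four-entry table are illustrative only (a small excerpt of the MARC-8 combining marks, not a full decoder).

```python
import unicodedata

# Excerpt of the MARC-8 (ANSEL) combining marks; not the complete table.
MARC8_COMBINING = {
    0xE1: "\u0300",  # combining grave
    0xE2: "\u0301",  # combining acute
    0xE3: "\u0302",  # combining circumflex
    0xE4: "\u0303",  # combining tilde
}


def misread_as_marc8(text: str) -> str:
    """Encode text as ISO-8859-1, then interpret the bytes as MARC-8."""
    raw = unicodedata.normalize("NFC", text).encode("iso-8859-1")
    out = []
    pending = ""  # combining marks waiting for their base letter
    for byte in raw:
        if byte in MARC8_COMBINING:
            pending += MARC8_COMBINING[byte]
        elif byte < 0x80:
            # In Unicode the mark follows the base letter, so swap the order.
            out.append(chr(byte) + pending)
            pending = ""
        else:
            out.append("\ufffd")  # byte not covered by this excerpt
            pending = ""
    return unicodedata.normalize("NFC", "".join(out))


print(misread_as_marc8("ão"))  # ô
print(misread_as_marc8("ãe"))  # ê
print(misread_as_marc8("áu"))  # ù
```

If this is what happened, the source file is in an ISO-8859 encoding while the import treats it as MARC-8, which would explain why the accented vowel disappears and the following vowel picks up a different accent.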
If José has characters encoded in MARC8, the best tool to use is MARC::Charset, http://search.cpan.org/~esummers/MARC-Charset-0.95/. If you have data in iso-8859-x, you can use iconv. To find out which encoding you have, ask the software vendor.

All IMHO.

Bye, (:->>
Zeno Tajoli
CILEA - Segrate (MI)
tajoliAT_SPAM_no_prendiATcilea.it
(Address masked against spam; replace everything between the ATs with @)