[Koha] Encode 'ç' to import authority .marcxml file with authority

Javi Legido javi at legido.com
Tue Jul 20 04:12:57 NZST 2021


Hi Harald.

Many thanks for your quick reply.

Changing encoding:

-        return string.strip().encode("ascii", "xmlcharrefreplace").decode("ascii")
+        return string.strip().encode("utf8", "xmlcharrefreplace").decode("utf8")

This produces a MARCXML file for which the import reports "0 records in
file", so I can't import it. The string was:

França
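
For reference, a minimal sketch of what the two variants in the diff above
return for this string. The function names escape_ascii and escape_utf8 are
made up for the illustration; only the encode/decode calls come from my
script.

```python
def escape_ascii(value: str) -> str:
    # Original variant: characters outside ASCII cannot be encoded, so the
    # "xmlcharrefreplace" handler turns them into numeric character references.
    return value.strip().encode("ascii", "xmlcharrefreplace").decode("ascii")


def escape_utf8(value: str) -> str:
    # Changed variant: UTF-8 can encode every character, so the error handler
    # never fires and the round trip returns the stripped string unchanged.
    return value.strip().encode("utf8", "xmlcharrefreplace").decode("utf8")


name = " França "
print(escape_ascii(name))  # Fran&#231;a
print(escape_utf8(name))   # França
```

So with "utf8" the error handler is effectively a no-op and the 'ç' is
written to the file as a literal character.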

I have attached the MARCXML records for the authority and the bibliographic
record. Both work (meaning they can be imported), but only the authority ends
up with the wrong encoding.
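
To make the quoted advice below concrete (UTF-8 declared on the first line of
the file, text written as UTF-8), here is a minimal sketch using only the
Python standard library. It is not my actual script: build_record is a
hypothetical helper, and the leader and control fields are left out, so the
output is a fragment for looking at the encoding and not a complete
importable record.

```python
import io
import xml.etree.ElementTree as ET

# Namespace used by MARCXML ("MARC21 slim") documents.
MARC_NS = "http://www.loc.gov/MARC21/slim"


def build_record(name: str) -> bytes:
    """Serialize a one-field MARCXML fragment as UTF-8 bytes."""
    collection = ET.Element("collection", {"xmlns": MARC_NS})
    record = ET.SubElement(collection, "record")
    # Assumption: leader and control fields are omitted in this sketch.
    field = ET.SubElement(
        record, "datafield", {"tag": "151", "ind1": " ", "ind2": " "}
    )
    subfield = ET.SubElement(field, "subfield", {"code": "a"})
    # Assign the plain string; the serializer handles XML escaping itself.
    subfield.text = name.strip()

    buffer = io.BytesIO()
    ET.ElementTree(collection).write(
        buffer, encoding="UTF-8", xml_declaration=True
    )
    return buffer.getvalue()


data = build_record("França")
# First line is the XML declaration carrying the encoding.
print(data.decode("utf-8").splitlines()[0])
```

Letting the serializer do the escaping also avoids escaping the text twice,
which can happen when an already escaped string is handed to an XML library.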

Thanks.

Javier

On Mon, 19 Jul 2021 at 17:26, Harald Schaefer <fechsaer at gmail.com> wrote:

> Hi,
>
> you should use the utf8 encoding, when creating a python file.
>
> The marcxml file should have in the first line encoding='UTF-8'
>
> In python you should use encode('utf8')
>
> Regards, Harald
>
> Am 19.07.21 um 16:10 schrieb Javi Legido:
> > Hi there.
> >
> > I'm trying to import an authority type 'GEOGR_NAME' with 'ç' in its name
> > (field '151 a'):
> >
> > França
> >
> > So far:
> >
> > 1. If I manually add it from GUI (I want to import it from .marcxml file)
> > it works typing 'ç' character. If I save the record as MARCXML I get below
> > encoding:
> >
> >      <subfield code="a">Fran&#xE7;a</subfield>
> >
> > 2. If I use python to encode it:
> >
> >          return string.strip().encode("ascii", "xmlcharrefreplace").decode("ascii")
> >
> > The generated MARCXML line looks like:
> >
> >      <subfield code="a">França</subfield>
> >
> > In the GUI looks like 'Franȧ', and if I save it as MARCXML looks like:
> >
> >      <subfield code="a">Fran&#x227;</subfield>
> >
> > Worth mentioning that the bibliographic bit referencing this authority
> > looks perfect, and it was created exactly the same as for authority, so the
> > only problem is with authority.
> >
> > Has anybody faced a similar problem before? In other words, I need to
> > programmatically generate a MARCXML file to later import it into Koha
> > (21.x), and some of the records (authorities) contain 'ç' and are not
> > being encoded correctly.
> > _______________________________________________
> >
> > Koha mailing list  http://koha-community.org
> > Koha at lists.katipo.co.nz
> > Unsubscribe: https://lists.katipo.co.nz/mailman/listinfo/koha

