[Koha] non-latin script / unicode problem
LAURENT Henri-Damien
henridamien.laurent at biblibre.com
Tue Jan 11 22:53:05 NZDT 2011
On 11/01/2011 10:42, Irakli Garibashvili wrote:
> Hi!
Hi Irakli,
Welcome to the list.
Happy New Year.
If I remember correctly, we met in Yerevan some years ago.
>
> We are using Koha 3.0 (Marc21, Nozebra) and it looks like fields for items
> have some problems with Georgian script:
I thought you would be using UNIMARC rather than MARC21. My illusions
about the use of UNIMARC outside France have taken a hard knock :D
>
> I am not sure what is the reason for this problem but
>
> data entry (in Georgian script - in UTF-8) for the fields like CallNumb,
> copynumber, (notes??),.. in some cases result in corrupted text.
It is hard to say precisely. One possible cause is that your biblio
records do not have a correct leader: every leader should have "a" in
position 9, which declares the record as Unicode (UTF-8). Without it,
the data can end up double-encoded by MARC::Record when the record is
decoded.
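
To make the leader check concrete, here is a minimal sketch in Python (not Koha's own Perl code). It only inspects the 24-character leader string; the function names and the two sample leaders are hypothetical and chosen for illustration.

```python
# Sketch only: check MARC21 leader position 09 (character coding scheme).
# "a" declares UCS/Unicode; a blank declares MARC-8.
# Function names and sample leaders are hypothetical, not part of Koha.

LEADER_LENGTH = 24


def leader_declares_unicode(leader: str) -> bool:
    """Return True if the leader has 'a' at position 09 (0-based index 9)."""
    if len(leader) != LEADER_LENGTH:
        raise ValueError("a MARC21 leader is exactly 24 characters long")
    return leader[9] == "a"


def with_unicode_flag(leader: str) -> str:
    """Return a copy of the leader with position 09 set to 'a'."""
    if len(leader) != LEADER_LENGTH:
        raise ValueError("a MARC21 leader is exactly 24 characters long")
    return leader[:9] + "a" + leader[10:]


# Made-up sample leaders, differing only at position 09.
good_leader = "00714cam a2200205 a 4500"
bad_leader = "00714cam  2200205 a 4500"

if __name__ == "__main__":
    for leader in (good_leader, bad_leader):
        print(repr(leader), leader_declares_unicode(leader))
```

Setting the flag only makes sense when the record data really is UTF-8; changing the leader on records stored in another encoding would mislabel them.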
I would also guess that encoding is handled better in 3.2 (which adds
some data normalization).
My 2 cents.
>
> I have found that some records in nozebra index and XML also contain
> similar corrupted texts.
>
> I am not quite sure how to explain what exactly is corrupted in this text,
> but it looks like I see each bit of UTF-8 separately, which must be an
> indication, that character conversion goes wrong for these fields/tables.
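
Seeing each byte of a UTF-8 character as a separate character is what double encoding typically looks like. The sketch below, in Python rather than Perl, reproduces the effect under the assumption that the UTF-8 bytes are misread as Latin-1 somewhere along the way; the function names and the sample word are illustrative, and the actual code path in Koha may differ.

```python
# Sketch only: reproduce double encoding, assuming the UTF-8 bytes are
# misread as Latin-1 at some step. Names here are hypothetical.


def double_encode(text: str) -> str:
    """Misread the UTF-8 bytes of `text` as Latin-1, one character per byte."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Undo double_encode; works only if no bytes were lost in between."""
    return garbled.encode("latin-1").decode("utf-8")


sample = "წიგნი"  # Georgian word for "book", 5 characters
garbled = double_encode(sample)

if __name__ == "__main__":
    # Each Georgian letter takes 3 bytes in UTF-8, so 5 letters become
    # 15 separate characters once the bytes are misread.
    print(len(sample), len(garbled))
    print(repair(garbled) == sample)
```

If the corrupted values in the item fields can be turned back into readable Georgian this way, that would suggest the data was double-encoded rather than truncated or lost.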
>
> I have checked MySQL structure - all appropriate fields have UTF8_general
> collation... (so this must be correct)
>
> Problem with perl codes? Where?
>
> Could someone help me?
>
>
> Thanks in advance,
> Irakli
--
Henri-Damien LAURENT
BibLibre