Re: [Koha] bulkmarcimport and unicode data
Tuesday, March 20, 2007 17:35 CDT

Hari,

Wasn't there something in the past few months about an incompatibility in bulkmarcimport.pl with certain (?) parts of Unicode?

Would it be possible for you to do a global search and replace for those 3 problem characters (with whatever tools you have on hand) and recode them into MARC8? That might be a workaround for now.

Just a thought. When you find your solution, do let the list know.

Cheers,
Steven F. Baljkas
library tech at large
Koha neophyte
Winnipeg, MB, Canada

============================================================
From: "R Hariram Aatreya" <rhariram@gmail.com>
Date: 2007/03/20 Tue AM 08:09:44 CDT
To: koha <koha@lists.katipo.co.nz>, koha-devel <koha-devel@nongnu.org>
Subject: [Koha] bulkmarcimport and unicode data

Hi all,

I am trying to import MARC data (in Unicode Tamil) into Koha using bulkmarcimport.pl.

However, bulkmarcimport.pl does not like 3 characters (U+0B88, U+0B89, U+0BC8). Whenever one of these 3 characters is encountered, the rest of the string (the subfield value) is gobbled and is missing from the database. bulkmarcimport.pl imports correctly if these 3 characters do not occur in the MARC data to be imported.

Any idea what's going on?

Thanks,
hari.

_______________________________________________
Koha mailing list
Koha@lists.katipo.co.nz
http://lists.katipo.co.nz/mailman/listinfo/koha
============================================================
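For anyone trying to reproduce the report, it may help to see how the three code points look once encoded. The Python sketch below is only an inspection aid, not part of Koha or bulkmarcimport.pl; the helper name `utf8_hex` is made up for illustration. It prints the UTF-8 byte sequence of each character. All three sequences end in the byte 0x88 or 0x89, which as single-byte values fall in the C1 control range (0x80-0x9F). Whether that has anything to do with the truncation is not established in this thread.

```python
import unicodedata

# The three code points reported as problematic in the message above.
PROBLEM_CODEPOINTS = (0x0B88, 0x0B89, 0x0BC8)


def utf8_hex(codepoint):
    """Return the UTF-8 encoding of a code point as space-separated hex bytes."""
    return " ".join(f"{byte:02X}" for byte in chr(codepoint).encode("utf-8"))


if __name__ == "__main__":
    for cp in PROBLEM_CODEPOINTS:
        # unicodedata.name() looks up the official Unicode character name.
        print(f"U+{cp:04X}  {utf8_hex(cp)}  {unicodedata.name(chr(cp))}")
```

Comparing these byte sequences with what actually ends up in the database could show at which byte the subfield value is cut off.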
I missed cc'ing it to the lists.

hari.

======
"Steven F. Baljkas" <baljkas@mts.net>
date: Mar 21, 2007 3:01 PM
subject: Re: [Koha] bulkmarcimport and unicode data
mailed-by: gmail.com

Steven,

Thanks for your reply.
> Wasn't there something in the past few months about an incompatibility in bulkmarcimport.pl with certain (?) parts of unicode?
I did search the archives. Could not find anything relevant.
> Would it be possible for you to do a global search and replace for those 3 problem characters (with whatever tools you have on hand) and recode them into MARC8? That might be a work-around for now.
From what I understand, MARC8 does not support Tamil.
Right now, the solution I have in mind (as a workaround) is:

1. Replace each of the 3 problem characters with a control character in the MARC data, then import it.
2. mysqldump the Koha database.
3. Replace each control character in the dump with the corresponding problem character.
4. Restore the dump into MySQL.

Is there a better workaround?

Unless someone is already at it, I would like to look at the Unicode incompatibility issue in bulkmarcimport. Would you know what the problem is? Any pointers would be useful.

Thanks,
hari.
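The character swap in steps 1 and 3 of the workaround above could be sketched as follows. This is a minimal Python illustration, not something taken from Koha: the names `mask` and `unmask` and the choice of U+0001 to U+0003 as placeholders are assumptions made for the example. It has not been checked whether bulkmarcimport.pl or MySQL pass those control characters through unchanged. The MARC delimiters 0x1D, 0x1E and 0x1F are deliberately avoided.

```python
# The three characters reported as problematic in this thread.
PROBLEM_CHARS = "\u0b88\u0b89\u0bc8"

# Placeholder control characters: an assumption for this sketch.
# They must not occur in the data and must not be MARC delimiters
# (0x1D, 0x1E, 0x1F).
PLACEHOLDERS = "\x01\x02\x03"

TO_PLACEHOLDER = str.maketrans(PROBLEM_CHARS, PLACEHOLDERS)
FROM_PLACEHOLDER = str.maketrans(PLACEHOLDERS, PROBLEM_CHARS)


def mask(text):
    """Step 1: swap each problem character for its placeholder."""
    if any(ch in text for ch in PLACEHOLDERS):
        # Refuse to continue, otherwise unmask() could not be a true inverse.
        raise ValueError("text already contains a placeholder character")
    return text.translate(TO_PLACEHOLDER)


def unmask(text):
    """Step 3: swap each placeholder back to the original character."""
    return text.translate(FROM_PLACEHOLDER)
```

Each of the three characters takes 3 bytes in UTF-8 while a placeholder takes 1, so running `mask` over a raw MARC file would invalidate the lengths stored in the leader and directory. It would be safer to apply it to the field values before the records are serialized, and to apply `unmask` to the text of the SQL dump afterwards.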