<div id="RTEContent">When I imported a MARC record with Chinese characters, using MARCEdit to convert the file to UTF-8, several of the characters did not display properly (see attached file). I have already set the web browser encoding to match MARCEdit, but the problem remains. It looks like some characters were not read correctly by Koha during the import process?<br><br>Has anyone encountered a similar problem?<br><br><b><i>Thomas D <koha@alinto.com></i></b> wrote:<blockquote class="replbq" style="border-left: 2px solid rgb(16, 16, 255); margin-left: 5px; padding-left: 5px;"> Carol,<br><br>I believe that Firefox is up to version 1.5 but 1.5 may be only a beta<br>release currently. I suspect the configuration of Firefox is more likely to<br>be the problem. Try changing the font used within Firefox to the same one<br>used for MARCEdit. Find edit preferences in the drop down menus for Firefox<br>or the MS Windows version equivalent. Go to Edit
Preferences : General :<br>Choose Fonts and Colors.<br><br>I suspect the reason the problem is manifest even while the font that you<br>have currently selected in Firefox may display the same character correctly<br>when you type it is because two almost identical glyphs are used in Unicode<br>for the same basic character sometimes. Whatever font you have currently<br>configured for Firefox may be unable to display the glyph that MARCEdit<br>assigned in the conversion of the record to UTF-8. The optimal font should<br>display the different glyphs correctly for the same basic character. The<br>actual keystrokes required to generate the different glyphs will be<br>different even if the basic character is the same.<br><br><br>MAILING LISTS<br><br>The links on koha.org seem to have been fixed now. Try<br>http://www.koha.org/community/mailing-lists.html .<br><br><br>Thomas D<br><br><br>Quoting Carol Ku <carolcool01@yahoo.com> :<br>> ---------------- Beginning of the original
message ------------------<br>> <br>> Hi Thomas:<br>> I am using Mozilla Firefox, i thought there is only one<br>> version? 1.0? Koha<br>> 2.2.4. Yes I used the bulkmarcimport... Yes, the problem<br>> remains in Chinese...<br>> only for one or two characters though.... If I were to enter<br>> the data myself, it<br>> wouldn't cause a problem...<br>> <br>> Oh, the window mailing list... i sent emails to savannah<br>> before, but all were<br>> bounced back, so I resubscribe it again through koha home<br>> page, and i was given<br>> the old mailing list.... Yes you are right, i have not been<br>> getting any response<br>> from windows mailing list.<br>> <br>> <br>> Thomas D <koha@alinto.com> wrote:<br>> MULTITUDE OF CHARACTER SETS AND ENCODINGS<br>> <br>> There are variant encodings of Unicode. UTF-8 is but one of<br>> them. There<br>> are also other encodings such as UTF-16, UTF-32, and
UCS-2.<br>> Conversion<br>> applications can convert between different encodings.<br>> <br>> MS Windows can use UTF-16 directly for keyboard output but not<br>> UTF-8. To my<br>> knowledge there are no keyboard generation applications that<br>> work around<br>> this problem directly.<br>> <br>> Unix can use UTF-8 directly for keyboard output so encoding<br>> conversion<br>> issues are less problematic.<br>> <br>> MARC records have more usually used other older character sets<br>> to represent<br>> similar sets of characters to Unicode. These library character<br>> set<br>> standards were developed before Unicode existed. One such<br>> standard that is<br>> prevalent in MARC-21 records is the MARC-8 character set.<br>> MARC-8 should not<br>> be confused with UTF-8. They are not compatible but character<br>> set<br>> conversion applications can convert between them.<br>> <br>> <br>> CHARACTER SETS AND
ENCODINGS IN KOHA<br>> <br>> Koha 3.0 should convert between MARC-8 and UTF-8 for at least<br>> major Western<br>> European languages. Chinese may have to wait for Koha 3.0.X.<br>> especially as<br>> I do not know how to identify which Chinese glyphs are which.<br>> At least with<br>> Western European languages, I know how to read the alphabets<br>> even when I do<br>> not know how to read the language.<br>> <br>> Previously you have changed the Koha SQL columns from ISO 8859<br>> to UTF-8 if<br>> necessary and the charset headers for the web pages that Koha<br>> sends to the<br>> webserver from ISO 8859 to UTF-8. The web browser then would<br>> seem to have<br>> done a certain degree of conversion work automatically that I<br>> had not<br>> expected would happen as well for characters that you typed as<br>> opposed to<br>> characters that were merely displayed within the web browser.<br>> However, this<br>>
seems to have worked for you so far on MS Windows. I would<br>> presume that the<br>> web browser itself would then be converting between UTF-16<br>> from MS Windows<br>> and UTF-8 inside the web browser before posting back to Koha.<br>> <br>> If the issue that you have now is only for one or two<br>> characters after a<br>> conversion, that seems like the converting application had a<br>> partial failure.<br>> I would suggest that the conversion inside MARCEdit was<br>> successful but that<br>> the conversion inside your web browser for Koha was less<br>> successful. What<br>> web browser and version are you using with Koha?<br>> <br>> I am assuming that almost all of the characters in your<br>> problematic<br>> records are in Chinese. I am also assuming that you have used<br>> bulkmarcimport.pl to import these records. Please let me know<br>> if either is<br>> not the case.<br>> <br>> <br>> KOHA WINDOWS LIST
CHANGE<br>> <br>> Do you have any responses from the Savannah list any longer?<br>> The address<br>> for the Koha Windows list is now koha-win32@nongnu.org . This<br>> change is<br>> part of a move of the Koha project from Sourceforge to<br>> Savannah. The<br>> Sourceforge site had become much too unresponsive with the<br>> volume of users<br>> relative to the provision of servers. You should have a better<br>> response<br>> about MS Windows issues on the MS Windows list. Unfortunately,<br>> mailing lists<br>> on Savannah do seem to suffer from delays in the mail queue.<br>> <br>> <br>> Thomas D<br>> <br>> <br>> Quoting Carol Ku :<br>> > ---------------- Beginning of the original message<br>> ------------------<br>> > <br>> > I imported two book records with Chinese characters.<br>> However,<br>> > there are about<br>> > one or two characters that show up wacky. I used MARCEdit to<br>> 
> convert the text<br>> > file into a MARC UTF file. When I open the file using<br>> MARCEdit,<br>> > all the<br>> > characters look fine.<br>> > <br>> > I was told that MARCEdit uses Arial Unicode MS, is it the<br>> > same code as UTF8? <br>> > If not, how can I overcome this problem?<br>> > <br>> > __________________________________________________<br>> > Do You Yahoo!?<br>> > Tired of spam? Yahoo! Mail has the best spam protection<br>> > around <br>> > http://mail.yahoo.com<br>> > <br>> <br>> ---------------------------------<br>> _______________________________________________<br>> > Koha mailing list<br>> > Koha@lists.katipo.co.nz<br>> > http://lists.katipo.co.nz/mailman/listinfo/koha<br>> > <br>> > ------------------- End of the original message<br>> ---------------------<br>> <br>> <br>> <br>> <br>> 
---------------------------------------------<br>> Protect your mails from viruses thanks to Alinto Premium<br>> services<br>> http://www.alinto.com<br>> <br>> ------------------- End of the original message ---------------------<br></koha@alinto.com></carolcool01@yahoo.com></blockquote><br></div>