Friday, February 24, 2006 19:21 CST

Hi, Carol, Joshua, et al.,

Just catching up with the e-mail a little late today. Carol had posed a question a while back on getting actual Chinese characters both into MARC records and out in the Koha OPAC display. I had suggested in the past that MARCEdit might be able to help with some things, but if memory serves -- I'm not 100% on this, and you would have to check the MARC standards online or in print to confirm it -- the $6 linking fields have to be generated by the ILS itself to provide such linkages. Most of the ILSs I have experience with simply cannot do this. (I assume that Voyager and Aleph can, because they are known for their multiscript abilities.) I don't think Koha can -- not that that is a bad thing at this stage in its development, considering the complexity and rarity of the need for $6 linkages.

The purpose of the 880, Joshua, according to my erstwhile go-to guy on the history and evolution of MARC coding, was to hold non-Roman script characters. I gather that, although computers may have been able to code for them for a long while now (since the 1960s? via hexadecimal and ASCII and Lord knows probably other codes), there weren't any ILSs at the beginning that could make use of them. Hence, the 880 and associated fields were contrived so that data could be preserved against loss (recall our discussion of backing things up: I tell you, we cataloguers are THE pre-eminent packrats of humanity -- no iota to be lost! ;-D).

If I recall my original advice to you, Carol, it was to ***try to see if Koha could display the original Chinese script correctly in the 100 and 245 fields*** (and, for that matter, the other core descriptive fields, although I know I didn't say that before). I know that there have been questions -- some of them initiated by you, IIRC -- on UTF-8 etc. that are relevant to this. I just don't know what Koha's status is on using non-Roman scripts. Sorry. It would be really good to know how that works, though.
From what I am used to seeing in other ILSs, it usually requires that the cataloguers enter special codes -- either hexadecimal or system-proprietary ones -- to ensure that special characters will display properly for patrons (usually, the cataloguers end up having to look at codes or messy weird stuff in place of real letters or logograms).

Continuing the reiteration of my original advice to you, Carol: if you are able to put the actual Chinese characters into the 100 and 245, you could use a 700$a or 900$a to provide access to the transcribed author entry, and perhaps a 242 (technically a TRANSLATION of the title, which is never a bad thing), 246, 700$t, 730, and/or 740 field to provide transcribed title access. Although anyone who knows me knows I actually like the cataloguing rules, what I am proposing (other than the 900$a, which is free for you to do with as you please) bends, if not breaks, the cataloguing rules. However, it should work, given that Joshua has assured us that Koha can access 700 fields (at least from 2.2.5 on) and 246s.

A quick word about transcriptions, because I know this topic has caused headaches for Chinese-language cataloguers. In 2000, LC prescribed that the pinyin system of romanization was to supersede the previously used Wade-Giles system. Carol, ***if you are retrieving older MARC21 (really USMARC, etc.) records, you may very well have to move the transcribed fields and create the now REQUIRED pinyin transcriptions.*** (There is not, so far as I know, a requirement to DELETE the older Wade-Giles transcriptions, but someone working directly in the field may well correct me on that point.) Now, Carol, please also note: you can take all this with a grain of salt (ignore it) if you don't care about having your library comply with the ALA-LC practices mandated in this case.
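For readers trying to picture Steven's suggested layout, it could be mocked up like this (a plain-Python sketch; the tags follow his suggestion, but the bibliographic data and indicator values are invented for illustration):

```python
# Sketch of the suggested field layout: original script in the core fields,
# romanized access points in added and local-use entries. Data is invented.

fields = [
    ("100", "1#", "$a 老舍, 1899-1966."),          # main entry in Chinese script
    ("245", "10", "$a 茶馆"),                       # title proper in Chinese script
    ("246", "3#", "$a Chaguan"),                    # transcribed title access
    ("700", "1#", "$a Lao, She, $d 1899-1966."),    # transcribed author access
    ("900", "##", "$a Lao She (pinyin)"),           # local-use field, free to define
]

# Print the record in a rough MARC display order.
for tag, indicators, data in fields:
    print(tag, indicators, data)
```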
I have seen libraries request and rely on their own transliteration systems for Russian and Ukrainian, so I know that ALA-LC practices are not always heeded when it comes to non-Roman script cataloguing matters.

I hope that this reiteration and amplification will help somewhat. I am very interested to learn how this resolves for you, and I do hope it will work out well. Best wishes with this tricky problem!

Cheers,

Steven F. Baljkas
library tech at large
Koha neophyte
Winnipeg, MB, Canada

============================================================

From: Carol Ku <carolcool01@yahoo.com>
Date: 2006/02/24 Fri PM 03:41:35 CST
To: Joshua Ferraro <jmf@liblime.com>
CC: Koha Mailing List <koha@lists.katipo.co.nz>, Koha Windows <koha-win32@nongnu.org>
Subject: Re: [Koha-win32] Re: [Koha] Tag 880

You are perfectly right. I noticed that most libraries, such as the Library of Congress, represent Chinese pinyin in tag 100, and that piece of info is linked to a tag 880 $6 100 (and likewise for the other related fields). We would like to do the following:

1) Save the pinyin and the Chinese in Koha.
2) Display the Chinese info in the OPAC.
3) Allow users to search using pinyin if they don't have Chinese input.

Since Koha displays only one line item for title, author, etc., we are thinking maybe we need to use MARCEdit to tweak tag 100 into an 880 $6 100. I was advised by others that I should instead designate new tags, e.g., 900 for tag 880 $6 100, 901 for tag 880 $6 245, etc. To make things more complicated, MARCEdit does not seem to recognize the $6 link either, so it will treat all 880 tags as one tag.

Joshua Ferraro <jmf@liblime.com> wrote: On Fri, Feb 24, 2006 at 12:26:29PM -0800, Carol Ku wrote:
I think the $6 linking field is different from a regular subfield a, b, or c, etc.
In MARC, all the information on the book will be stored in the native language in tag 880. Then the $6 linking field is used to tie the 880 to tag 100 for the name, etc. So, e.g., in 880 $6 100 $a ..., the tag means that the information stored here is the author name (designated by the code $6 100) in, e.g., Chinese. $6 is not a regular subfield.

OK ... first let's discuss what you're trying to do. I have had two years of Chinese language classes, so I know that there are several ways to represent Chinese. Are you attempting to put pinyin in the 100 $a and then link to the actual characters in the 880? What is your goal in using the 880?
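The $6 pairing Carol describes can be sketched in a simplified way (a plain-Python illustration, not real MARC tooling; fields are modeled as tag/subfield pairs, and the bibliographic data is invented for the example):

```python
# Simplified sketch of MARC 880 linkage via subfield $6.
# Fields are (tag, {subfield: value}); $6 holds "tag-occurrence[/script]".
# The name used here is invented for illustration.

record = [
    ("100", {"6": "880-01", "a": "Mao, Dun, 1896-1981."}),   # romanized (pinyin)
    ("880", {"6": "100-01/$1", "a": "茅盾, 1896-1981."}),     # original script ($1 = CJK)
]

def parse_linkage(value):
    """Split a $6 value into (linked tag, occurrence, script code or None)."""
    link, _, script = value.partition("/")
    tag, _, occurrence = link.partition("-")
    return tag, occurrence, script or None

def paired_fields(record):
    """Pair each regular field with the 880 that shares its occurrence number."""
    pairs = []
    for tag, subs in record:
        if tag == "880":
            continue
        _, occ, _ = parse_linkage(subs["6"])
        for tag2, subs2 in record:
            if tag2 != "880":
                continue
            linked_tag, occ2, script = parse_linkage(subs2["6"])
            if linked_tag == tag and occ2 == occ:
                pairs.append((tag, subs["a"], subs2["a"], script))
    return pairs

print(paired_fields(record))
```

The point of the sketch is that the linkage runs on the occurrence number shared by both $6 values, which is exactly the parsing step MARCEdit (at the time) did not do.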
Cheers,

--
Joshua Ferraro               VENDOR SERVICES FOR OPEN-SOURCE SOFTWARE
President, Technology        migration, training, maintenance, support
LibLime                      Featuring Koha Open-Source ILS
jmf@liblime.com | Full Demos at http://liblime.com/koha | 1(888)KohaILS

_______________________________________________
Koha mailing list
Koha@lists.katipo.co.nz
http://lists.katipo.co.nz/mailman/listinfo/koha
OK ... sorry it's taken so long for a response on this, I'm currently involved in a migration for a client ... here goes:

First off, thanks for asking this question; in the process of answering it I discovered and fixed two bugs in the Koha MARC editor (so before you try this I'd suggest updating Biblio.pm and addbiblio.pl to the latest CVS versions, ask me for details if you need to).

So, using the Koha MARC editor, I did a bit of original MARC cataloging for a Chinese language book. koha.liblime.com, like Carol's Koha, runs on UTF-8, so it can easily store and display any UTF-8 characters. Here is the record:

http://opac.liblime.com/cgi-bin/koha/opac-MARCdetail.pl?bib=23717

You'll notice that I used the 880 Linkage fields to add the pinyin as specified in the MARC standard. The interesting bit is that although Koha does not yet understand how to treat the 880 $6 (which, as far as I can tell, is a true exception to the rule), a keyword search for the pinyin does in fact bring up the record (author and title searches won't work, however). So that's good -- not great, but good.

Notice that there are also Linkage entries in the 100 and 245 tags: it goes both ways. I understand how this could be used by the system not only to link the two for searching, but also to generate the proper rules for the associated 880 tag. Of course, understanding how it SHOULD work doesn't mean it does yet ... but keep reading, it gets better, I promise.

As I understand it, one of the ways 880 can be used is for transliteration, that is, storing different ways to represent the same language. Now, here's the problem with 880 in MARC: it's far too limited for what I'd like you to be able to do. First, it doesn't allow any fine distinctions for different 'scripts'.
You can, in fact, specify the kind of script you're linking, but you only have the following choices:

  (3 - Arabic
  (B - Latin
  $1 - Chinese, Japanese, Korean
  (N - Cyrillic
  (2 - Hebrew

However, at least in Standard Mandarin, which I studied, there are no fewer than five ways to represent the language: traditional hanzi, simplified hanzi, pinyin, Yale, and Wade-Giles (well, there's also Zhuyin fuhao, but I assume you are not catering to youngsters). MARC is sadly lacking in that you can only provide a one-to-one mapping and thus only include two representation variations.

But let's not stop there. In addition to there being lots of different ways to represent the Chinese language, there are also many ways to _encode_ _each_ representation. UTF-8 and Big-5 are two that come to mind. I suspect this is where most of the problem comes from in the first place: your students being at keyboards without the ability to encode in the proper way to search the traditional catalog.

Here comes Koha to the rescue, and here's what I would suggest you start doing. First, have a look at what it looks like:

http://opac.liblime.com/cgi-bin/koha/opac-MARCdetail.pl?bib=23719

What you are looking at is a record for a Chinese language book that I cataloged using Koha's MARC editor after making several minor adjustments to the Koha MARC Framework. Without breaking any MARC rules, using local use fields and Koha's 'search also' feature, you can find that record using a keyword, author, or title search in ANY of UTF-8, Big5, pinyin, Yale, or Wade-Giles. But don't stop there: you can add as many transliterations as you like; there is literally no limit. Oh ... and feel free to leave those 880s in there; some day Koha will be able to handle them as well. Eat your heart out, Voyager :-).
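The many-representations idea could be sketched like this (a plain-Python illustration; the 945 tag, its subfields, and the romanized forms are invented example choices, not Koha's actual framework or verified romanizations):

```python
# Sketch: one title stored in several representations using a repeatable
# local-use field. Tag 945 and its layout are invented for this example;
# the romanized strings are placeholders, not checked romanizations.

record = {
    "245": [{"a": "茶馆"}],                        # original script
    "945": [                                       # repeatable local-use field
        {"a": "Chaguan", "2": "pinyin"},
        {"a": "Cha-kuan", "2": "wade-giles"},      # placeholder form
        {"a": "Chaguan (Yale)", "2": "yale"},      # placeholder form
    ],
}

def all_title_forms(record):
    """Collect every stored representation of the title."""
    forms = [f["a"] for f in record["245"]]
    forms += [f["a"] for f in record["945"]]
    return forms

print(all_title_forms(record))
```

Because 945 is repeatable, there is no ceiling on how many transliterations one record can carry, which is the advantage Joshua is pointing at over the two-way 880 linkage.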
Joshua: You are wonderful!!! You mention you used the Koha MARC editor to catalogue; does that mean you manually entered the Chinese books into the Koha catalogue, and did not download the records from some other libraries? Pardon me if this question appears to be silly, but your input is very helpful still.

Carol
On Sat, Feb 25, 2006 at 09:39:43AM -0800, Carol Ku wrote:
Joshua: You are wonderful!!! You mention you used the Koha MARC editor to catalogue; does that mean you manually entered the Chinese books into the Koha catalogue, and did not download the records from some other libraries?

Yep, that's right. So long as you have UTF-8 set up correctly, you can input as well as display any UTF-8 you want in Koha.
Now, let me warn you that the Koha MARC editor is still not perfect, and I would still recommend you use an external MARC editor if that has been your cataloging practice thus far. You should be able to set it up to handle a repeatable 900, 945, etc. tag, with the appropriate subfields to handle the transcriptions (don't limit yourself to just the title/author; feel free to include any others you want). So long as you set up your MARC Framework to handle the 9XX fields and link to them via the search points (i.e., 245a will need a 'search also' entry for '945a','945c', and you might as well put '245c' in there while you're at it), it will work without having to use Koha's editor. If you need more specific implementation details, let me know.
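A rough model of how that 'search also' mapping behaves (a plain-Python sketch; the mapping itself follows Joshua's example above, but the indexing logic and the record data are invented for illustration, not Koha's actual code):

```python
# Sketch: a title search on 245a transparently also searches the fields
# listed in its 'search also' mapping, so a pinyin query finds a record
# whose 245a holds the original script. Data is invented.

search_also = {"245a": ["945a", "945c", "245c"]}   # per Joshua's example

record = {
    "245a": "茶馆",            # title in original script
    "245c": "老舍",            # statement of responsibility
    "945a": "Chaguan",         # transliterated title (local-use field)
    "945c": "Lao She",         # transliterated author (local-use field)
}

def title_search(query, record):
    """Match the query against 245a plus every 'search also' field."""
    fields = ["245a"] + search_also["245a"]
    return any(query in record.get(field, "") for field in fields)

print(title_search("Chaguan", record))   # pinyin query still finds the record
print(title_search("茶馆", record))       # original-script query works too
```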
Pardon me if this question appears to be silly, but your input is very helpful still.

No question is silly :-). I'm glad I could be of some help.
Sincerely,

Joshua Ferraro
jmf@liblime.com
So if we were to download books directly from other libraries ... it may involve lots of record editing, as all the Chinese info in other libraries' records will be entered in tag 880, e.g.:

100   Author name in pinyin
245   Title name in pinyin
880 $6 100   Author name in Chinese
880 $6 245   Title name in Chinese

With the MARC record downloaded, I wonder if I can swap the information in tag 880 $6 100 with that of tag 100, so that when I upload the record into Koha, the OPAC will display the author name in Chinese instead of pinyin. I suspect MARCEdit will not be able to recognize the $6 link, so I cannot designate the info at tag 100 to be swapped with the info at tag 880 $6 100 and not, say, at tag 880 $6 245.

Anyway, Joshua, you have been a great help. Thank you.

Carol
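Carol's proposed swap can be sketched in plain Python (a simplified illustration, not MARCEdit or any real MARC library; fields are modeled as tag/subfield pairs, the bibliographic data is invented, and real records would also need indicators and the other subfields handled):

```python
# Sketch: swap each 880's $a into the field its $6 points at (and vice
# versa), so the original script becomes the displayed main entry.
# Fields are (tag, {subfield: value}); data is invented for the example.

record = [
    ("100", {"6": "880-01", "a": "Lao, She, 1899-1966."}),   # pinyin
    ("245", {"6": "880-02", "a": "Chaguan"}),                # pinyin
    ("880", {"6": "100-01/$1", "a": "老舍, 1899-1966."}),     # Chinese
    ("880", {"6": "245-02/$1", "a": "茶馆"}),                 # Chinese
]

def parse6(subfields):
    """Split $6 into (linked tag, occurrence), ignoring any script code."""
    tag, occurrence = subfields["6"].split("/")[0].split("-")
    return tag, occurrence

def swap_vernacular(record):
    """Swap each 880's $a with the $a of the field its $6 links to.

    The occurrence number in $6 disambiguates repeated tags, so a
    880 $6 100-01 only ever swaps with the 100 carrying $6 880-01.
    """
    swapped = [(tag, dict(subs)) for tag, subs in record]  # work on a copy
    for tag, subs in swapped:
        if tag != "880":
            continue
        target, occ = parse6(subs)
        for tag2, subs2 in swapped:
            if tag2 == target and parse6(subs2) == ("880", occ):
                subs["a"], subs2["a"] = subs2["a"], subs["a"]
    return swapped

for tag, subs in swap_vernacular(record):
    print(tag, subs["a"])
```

The occurrence-number matching in `parse6` is precisely the "$6 link" recognition Carol finds missing in MARCEdit: without it, there is no way to tell which 880 belongs to the 100 rather than the 245.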
participants (3)
- Carol Ku
- Joshua Ferraro
- Steven F. Baljkas