[Koha] Koha migration & subfield problem

Walls, Ian Ian.Walls at med.nyu.edu
Wed Oct 28 11:13:16 NZDT 2009


Thomas,


Yes, that is more in line with what I was thinking. I'm not even sure  
you have to adjust the MARC framework to get the " | " effect I  
experienced; I remember importing on the assumption that a second  
952$o would be ignored.

A Perl script with a well-crafted regex might be a simpler way to  
perform the concatenation.  Fewer import/export steps, certainly.
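
A minimal sketch of that concatenation, written in Python rather than Perl purely for illustration. It assumes the records are in MarcEdit mnemonic form (one field per line, as in the sample record quoted below), that "$" never occurs inside subfield data, and that the call number parts sit in $h, $i, $o and $m of a 952. The function name merge_call_number and its parameters are hypothetical, not part of Koha or MarcEdit.

```python
def merge_call_number(line, part_codes="hiom", target="o", tag="952"):
    """Fold call number part subfields of one mnemonic line into a single $o.

    Assumption: `line` is a MarcEdit mnemonic field line such as
    =952  \\\\$p10020$oHC 110 .E5 E35$mv.1  and "$" only marks subfields.
    """
    if not line.startswith("=" + tag):
        return line  # leave every other field untouched

    head, *subfields = line.split("$")
    parts = []  # call number pieces, in order of appearance
    kept = []   # other subfields; None marks where the merged $o goes
    for sf in subfields:
        if not sf:
            continue  # assumption: an empty subfield can be dropped
        code, value = sf[0], sf[1:]
        if code in part_codes:
            if not parts:
                kept.append(None)
            parts.append(value.strip())
        else:
            kept.append(sf)

    if not parts:
        return line  # no call number parts found

    merged = target + " ".join(parts)
    return head + "".join("$" + (merged if sf is None else sf) for sf in kept)


if __name__ == "__main__":
    sample = r"=952  \\$p10020$oHC 110 .E5 E35$mv.1$d20030108"
    print(merge_call_number(sample))
```

Because a field with only some of the parts is handled by the same pass, no second find & replace is needed and no second 952$o is created. The merged value would still be subject to the itemcallnumber length limit mentioned earlier in the thread.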

Thanks for the heads up about the Near East University fork. I'll look  
at the code, and see if we can utilize any of the logic at my  
institution.

Cheers,


Ian Walls


On Oct 27, 2009, at 17:57, "Thomas Dukleth" <kohalist at agogme.com> wrote:

> [Correction]
>
> On looking at Ian Walls' idea again, with a better understanding of what
> I imagine his intent had been, I see that it might work if the
> repeatability of the Koha call number subfield were not maintained
> afterwards.  However, it would be a rather convoluted method of
> accomplishing the concatenation.
>
>
> Thomas Dukleth
> Agogme
> 109 E 9th Street, 3D
> New York, NY  10003
> USA
> http://www.agogme.com
> +1 212-674-3783
>
>
> On Tue, October 27, 2009 21:42, Thomas Dukleth wrote:
>> Reply inline:
>>
>>
>> 1.  INNOVATIVE BUT INSUFFICIENT WORKAROUND.
>>
>> On Tue, October 27, 2009 14:34, Walls, Ian wrote:
>>> Amy,
>>>
>>>
>>> Koha is MARC agnostic on its own; you can adjust your MARC  
>>> Frameworks to
>>> accept repeated subfields as you like.  So, as far as MARC record
>>> validity
>>> goes, you should be just fine with multiple 952$o.
>>>
>>> In my experience, if you supply multiple subfields within a 952, they
>>> all get loaded into the mapped database field (952$o -> itemcallnumber
>>> in the default framework), separated by " | ", up to the character
>>> limit of the field (30 in the case of itemcallnumber).  This would let
>>> you cram multiple bits of information into the same field; if you
>>> don't like the " | " (and you didn't lose too much info), you could
>>> export back to MarcEdit, change the " | " to a more pleasant
>>> separator, and re-import.
>>
>> Ian Walls' idea about making the Koha call number subfield  
>> repeatable in
>> the MARC framework is very innovative but is unlikely to help with  
>> how the
>> Koha program actually uses the call number.  Even if your MARC may
>> validate, there is always a danger of breaking Koha by tinkering in  
>> the
>> wrong manner.  Having access to the source code means that you can  
>> always
>> fix it if necessary but some caution should be taken over fields  
>> linked to
>> database columns.
>>
>> Be careful to always keep a backup of your data for a long time in  
>> case
>> you may later discover problems from an experimental change.
>>
>>
>> 2.  OTHER DATA POPULATION METHODS.
>>
>>>
>>> Another option, depending on your comfort with XSLT, would be to  
>>> convert
>>> your old MARC into MARCXML, run it through a stylesheet that  
>>> merges the
>>> subfields into 952$o, then convert back to MARC.  You can get a  
>>> little
>>> more fine grained control with the logic in MARCXML (not a lot,  but
>>> enough).
>>
>> There are many ways to programmatically concatenate the call number  
>> parts
>> to form a complete call number.  A single regular expression may be
>> possible but perhaps what you are using to control the regular  
>> expression
>> or the particular regular expression language limits its  
>> functionality.
>>
>>
>> 3.  HISTORICAL PROBLEM.
>>
>> MARC Koha was designed to use a single holdings field for item  
>> information
>> which is the same model used by the French interlibrary loan standard
>> Recommandation 995.  This is a severe limitation for holdings which  
>> had
>> been due for correction in Koha 3.2.
>>
>> I have edited the Koha English MARC 21 frameworks with the goal of  
>> making
>> the special Koha holdings field as close to the MARC 21 holdings as
>> possible within the severe constraints given.  In 2006, for Koha  
>> 2.2, I
>> started editing the Koha English MARC 21 frameworks.  I provided a  
>> place
>> in the Koha MARC 21 items field, 952, for the various parts of the call
>> number to have their own separate subfields.  No Koha code used the
>> separate
>> call number parts to form a concatenated call number as it should  
>> have.
>> The provision had merely been a capacity waiting to be implemented.
>>
>> An early intention of Koha 3.0 development had been to index all item
>> information in Zebra as part of a MARC record.  Subfield codes within a
>> single MARC field are limited to the digits 0-9 and the letters a-z,
>> and some programmers have feared using punctuation symbols to designate
>> subfields.
>> Consequently, as more subfields were added to the Koha MARC 21  
>> items field
>> for the items table in the database, some underused subfields such  
>> as call
>> number parts were sadly removed, and placed in the non-repeatable  
>> Koha
>> MARC 21 biblioitems field, 942.
>>
>> Making 942 repeatable might break Koha on some unexpected  
>> triggering use
>> of some 942 content by Koha matched to the database.  The call number
>> parts might be safely removed from 942 and added to some new local  
>> use
>> repeatable field.  The safety of such a move should be investigated  
>> first.
>> If you merely add a new repeatable field without linking to the  
>> database,
>> then you could preserve the information for the call number parts  
>> without
>> risk of breaking Koha but Koha would also not manage the information
>> automatically in conjunction with the items without some additional  
>> code
>> for the purpose.
>>
>>
>> 4.  MY SUGGESTION FOR NOW.
>>
>> Managing call number information within the single items field of  
>> Koha is
>> probably the best solution for now unless you really want to be
>> adventurous.
>>
>> Find some suitable method of concatenating the complete call number  
>> from
>> your original records for the Koha call number subfield.  A Perl  
>> script
>> using MARC::Record could accomplish the task.  Koha support companies
>> exist to provide that level of assistance, although such  
>> assistance may
>> be available to some degree via the mailing list.
>>
>> In future, regular expressions together with lists of known values  
>> may be
>> able to break apart the Koha concatenated call number subfield into  
>> its
>> constituent parts provided a particular library's call number usage  
>> is
>> sufficiently standardised.  Such a step would be needed for many Koha
>> libraries to migrate fully to a future more standards compliant  
>> holdings
>> model for Koha.
>>
>> If you worry about your concatenated call numbers not following a
>> sufficiently regular pattern for future decomposition into their  
>> correct
>> constituent parts, then post your irregularities and we can examine  
>> them.
>> There may be some alternatives with modifying the bibliographic  
>> frameworks
>> with available items subfield places which are not quite sufficient  
>> for
>> storing all the possible standard call number parts within 0-9 and  
>> a-z.
>>
>>
>> 5.  REAL FIXES.
>>
>> There is a fork of Koha used at Near East University in Cyprus in  
>> which
>> Tümer Garip had fixed the problem for his library in 2006.  That code is
>> available but already in 2006 it was very divergent from the main  
>> branch
>> of Koha.  Tümer's fork is missing many improvements in the main branch of
>> Koha and Koha is missing many of his improvements which were  
>> implemented
>> too rapidly for the main branch to keep pace and also maintain  
>> support for
>> the needs of all the libraries using Koha.  It is very unfortunate  
>> that
>> the different needs and pacing of work had not allowed better
>> coordination.
>>
>> LibLime had been developing better holdings support for Koha 3.2 to  
>> start
>> solving many problems with holdings in Koha.  Better support for  
>> serials
>> had been one of the goals.  That work had been based in part on Tümer's
>> work.  However, LibLime's withdrawal from participation with the  
>> rest of
>> the Koha community has meant that the work is sadly not being shared.
>>
>> The problem will be addressed in a future version of Koha but  
>> without the
>> participation of LibLime it will not be addressed properly for  
>> 3.2.  We
>> could test some framework changes for Koha 3.2.  I have an  
>> uncommitted
>> update to the MARC 21 frameworks which I had not quite finished  
>> over a
>> year ago.  An update from that point is also needed which requires  
>> a few
>> days' work.
>>
>> I consider the issue a bug but it does not prevent libraries  
>> interested in
>> Koha from using Koha.
>>
>>
>> Thomas Dukleth
>>
>>
>>>
>>> Hope this helps.
>>>
>>>
>>> Ian Walls
>>> Systems Integration Librarian
>>> NYU Health Sciences Libraries
>>> 550 First Ave., New York, NY 10016
>>> (212) 263-8687
>>>
>>>
>>>
>>>
>>>
>>> From: koha-bounces at lists.katipo.co.nz
>>> [mailto:koha-bounces at lists.katipo.co.nz] On Behalf Of Amy Schuler
>>> Sent: Tuesday, October 27, 2009 10:25 AM
>>> To: koha at lists.katipo.co.nz
>>> Subject: [Koha] Koha migration & subfield problem
>>>
>>> Hello,
>>>
>>> Does anyone have any ideas regarding my problem with 852/952 $m?   
>>> The
>>> problem is that, in editing all of my holdings data to migrate  
>>> nicely
>>> into
>>> Koha, I cannot figure out how to migrate my old 852 $m data (call  
>>> number
>>> suffix, a standard MARC subfield, I have publication years and  
>>> volume #s
>>> stored in this field). When I use MarcEdit to swap fields & edit  
>>> data, I
>>> can write a regular expression that will find all 852 $h, i, and m  
>>> and
>>> move them to 952 $o.  However, this only finds & replaces those 852s
>>> that
>>> have all 3 of those subfields.  Most of my records do not have all  
>>> 3,
>>> they
>>> only have $h and i.  So, when I run a second find & replace to  
>>> pick up
>>> the
>>> 852s that only have those two subfields, then a second 952 $o is
>>> created.
>>>
>>> In short, it seems that my choices in regard to $m data are: (1)  
>>> move $m
>>> to some completely different subfield ($z public note, $t copy  
>>> number?
>>> Little misleading), or (2) have 2 instances of $o in some of my 952
>>> fields, although I am not sure how this will render, or even whether
>>> Koha
>>> will accept this (is $o repeatable??)
>>>
>>> I hope I have been clear.  Please let me know if you have ideas.   
>>> I am
>>> running Koha v. 3.0001005 and I attach a sample record (with most  
>>> edits
>>> made) below.  Note that this record contains two 952s because it  
>>> is a 2
>>> volume set, as well as the $m's.
>>>
>>>
>>> =LDR  01105cam  2200301 i 4500
>>> =001  \\\74075531\//r952
>>> =003  DLC
>>> =005  19950918112125.9
>>> =008  750501s1975\\\\nyua\\\\\b\\\\00010\eng\\
>>> =010  \\$a   74075531 //r952
>>> =020  \\$a091384800X
>>> =040  \\$aDLC$cDLC$dDLC
>>> =043  \\$an-us---
>>> =050  \\$aQH76$b.E36
>>> =082  00$a333.7/2
>>> =100  10$aEgler, Frank Edwin,$d1911-
>>> =245  14$aThe plight of the rightofway domain :$bvictim of vandalism
>>> /$cby
>>> Frank E. Egler, consultant, and Stan R. Foote.
>>> =260  0\$aMt. Kisco, N.Y. :$bFutura Media Services,$c[1975]
>>> =300  \\$a2 v. :$bill. ;$c23 cm.
>>> =500  \\$aVol. 2 has also special title: Personalized documentation.
>>> =504  \\$aIncludes bibliographical references.
>>> =650  \0$aLandscape protection$zUnited States.
>>> =650  \0$aRight of way$zUnited States.
>>> =650  \0$aBrush$xcontrol$xEnvironmental aspects$zUnited States.
>>> =650  \0$aPlant ecology$zUnited States.
>>> =700  10$aFoote, Stan R.,$ejoint author.
>>> =952  \\$p10020$oHC 110 .E5 E35$mv.1$d20030108
>>> =952  \\$p10021$oHC 110 .E5 E35$mv.2$d20030108
>>>
>>>
>>>
>>> Amy C. Schuler
>>> Manager of Information Services
>>> Cary Institute of Ecosystem Studies
>>> PO Box AB
>>> Millbrook, NY  12545
>>> (845) 677-7600 x164
>>> schulera at caryinstitute.org
>>
>> [...]
>>
>>
>
>
