[Koha] Koha migration & subfield problem

Thomas Dukleth kohalist at agogme.com
Wed Oct 28 10:42:41 NZDT 2009


Reply inline:


1.  INNOVATIVE BUT INSUFFICIENT WORKAROUND.

On Tue, October 27, 2009 14:34, Walls, Ian wrote:
> Amy,
>
>
> Koha is MARC agnostic on its own; you can adjust your MARC Frameworks to
> accept repeated subfields as you like.  So, as far as MARC record validity
> goes, you should be just fine with multiple 952$o.
>
> In my experience, if you supply multiple subfields within a 952, they all
> get loaded into mapped database field (952$o ->  itemcallnumber in the
> default framework), separated by " | ", up to the character limit of the
> field (30 in the case of itemcallnumber).  This would let you cram
> multiple bits of information into the same field; if you don't like the "
> | " (and you didn't lose too much info), you could export back to
> MarcEdit, change the " | " to a more pleasant separator, and re-import.

Ian Walls's idea of making the Koha call number subfield repeatable in
the MARC framework is very innovative but is unlikely to help with how the
Koha program actually uses the call number.  Even if your MARC records
validate, there is always a danger of breaking Koha by tinkering in the
wrong manner.  Having access to the source code means that you can always
fix it if necessary, but some caution should be taken over fields linked
to database columns.

Always keep a backup of your data for a long time in case you later
discover problems from an experimental change.


2.  OTHER DATA POPULATION METHODS.

>
> Another option, depending on your comfort with XSLT, would be to convert
> your old MARC into MARCXML, run it through a stylesheet that merges the
> subfields into 952$o, then convert back to MARC.  You can get a little
> more fine grained control with the logic in MARCXML (not a lot,  but
> enough).

There are many ways to programmatically concatenate the call number parts
to form a complete call number.  A single regular expression may be
possible, but perhaps the tool you are using to run the regular expression,
or its particular regular expression dialect, limits what you can do.
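
As an illustration of the single regular expression idea, here is a
minimal sketch in Python (not MarcEdit's own regular expression dialect)
which treats $i and $m as optional groups, so that one pass covers 852
fields with or without a suffix.  It assumes records exported in MarcEdit
mnemonic text form, that the 852 holds only $h, $i and $m in that order,
and that the parts should be joined with a single space.  The function
name to_952 is hypothetical.

```python
import re

# One pattern for 852 lines in MarcEdit mnemonic form.  $h is required;
# $i and $m are each optional, so a single pass handles both cases.
# Assumption: the field contains only $h, $i, $m, in that order.
PATTERN = re.compile(
    r"^=852  (?P<ind>..)"        # tag, two spaces, two indicator characters
    r"\$h(?P<h>[^$]*)"           # classification part
    r"(?:\$i(?P<i>[^$]*))?"      # item part, if present
    r"(?:\$m(?P<m>[^$]*))?$"     # call number suffix, if present
)


def to_952(line):
    """Return a 952 line with one $o, or the line unchanged if no match."""
    match = PATTERN.match(line)
    if match is None:
        return line
    # Keep only the parts that were present and non-empty.
    parts = [p.strip() for p in match.group("h", "i", "m") if p]
    return "=952  " + match.group("ind") + "$o" + " ".join(parts)


if __name__ == "__main__":
    for sample in (r"=852  \\$hHC 110$i.E5 E35$mv.1",
                   r"=852  \\$hHC 110$i.E5 E35"):
        print(to_952(sample))
```

Lines which do not fit the assumed shape are returned untouched, so they
can be reviewed by hand rather than silently altered.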


3.  HISTORICAL PROBLEM.

MARC Koha was designed to use a single holdings field for item
information, which is the same model used by the French interlibrary loan
standard Recommandation 995.  This is a severe limitation for holdings,
and one which had been due for correction in Koha 3.2.

In 2006, for Koha 2.2, I started editing the Koha English MARC 21
frameworks with the goal of making the special Koha holdings field as
close to MARC 21 holdings as possible within the severe constraints
given.  I provided a place in the Koha MARC 21 items field, 952, for the
various parts of the call number to have their own separate subfields.
No Koha code used the separate call number parts to form a concatenated
call number as it should have.  The provision had merely been a capacity
waiting to be implemented.

An early intention of Koha 3.0 development had been to index all item
information in Zebra as part of a MARC record.  Subfield codes within a
single MARC field are limited to digits and lowercase letters of the
English alphabet (0-9 and a-z), and some programmers have feared using
punctuation symbols to designate subfields.  Consequently, as more
subfields were added to the Koha MARC 21 items field for the items table
in the database, some underused subfields such as the call number parts
were sadly removed and placed in the non-repeatable Koha MARC 21
biblioitems field, 942.

Making 942 repeatable might break Koha wherever Koha unexpectedly uses
942 content which is matched to the database.  The call number parts
might be safely removed from 942 and added to some new local use
repeatable field.  The safety of such a move should be investigated
first.  If you merely add a new repeatable field without linking it to
the database, then you could preserve the information for the call number
parts without risk of breaking Koha, but Koha would also not manage the
information automatically in conjunction with the items without some
additional code for the purpose.


4.  MY SUGGESTION FOR NOW.

Managing call number information within the single items field of Koha is
probably the best solution for now unless you really want to be
adventurous.

Find some suitable method of concatenating the complete call number from
your original records for the Koha call number subfield.  A Perl script
using MARC::Record could accomplish the task.  Koha support companies
exist to provide that level of assistance, although such assistance may
also be available to some degree via the mailing list.
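
To show the shape of such a script, here is a minimal sketch applied to
the 952 lines of the sample record quoted below.  It is written in Python
over MarcEdit mnemonic text rather than in Perl with MARC::Record, purely
so that it is self-contained.  The function name merge_suffix, the space
used as a separator, and the assumption that no subfield value contains a
literal "$" are all mine.

```python
def merge_suffix(line, target="o", suffix="m"):
    """Fold every $m of a 952 mnemonic line into its first $o.

    Lines which are not 952 fields, or which lack either subfield,
    are returned unchanged.
    """
    if not line.startswith("=952  "):
        return line
    # head is the tag plus indicators; rest holds the subfields.
    head, _, rest = line.partition("$")
    subfields = [(chunk[0], chunk[1:]) for chunk in rest.split("$") if chunk]
    suffixes = [value.strip() for code, value in subfields if code == suffix]
    if not suffixes:
        return line
    merged = False
    kept = []
    for code, value in subfields:
        if code == suffix:
            continue  # dropped here, appended to the target instead
        if code == target and not merged:
            value = " ".join([value.strip()] + suffixes)
            merged = True
        kept.append((code, value))
    if not merged:
        return line  # no $o to merge into, so leave the field alone
    return head + "".join("$" + code + value for code, value in kept)


if __name__ == "__main__":
    print(merge_suffix(r"=952  \\$p10020$oHC 110 .E5 E35$mv.1$d20030108"))
```

Ian mentions a 30 character limit on itemcallnumber, so a length check on
the merged value may be worth adding before loading the result.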

In future, regular expressions together with lists of known values may be
able to break apart the Koha concatenated call number subfield into its
constituent parts, provided a particular library's call number usage is
sufficiently standardised.  Such a step would be needed for many Koha
libraries to migrate fully to a future, more standards-compliant holdings
model for Koha.
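
A rough sketch of what such decomposition might look like follows, in
Python.  The pattern assumes Library of Congress style call numbers of the
kind in the sample record, and the list of known suffix prefixes is a
hypothetical example for a single library; real usage would need its own
pattern and lists.  The keys h, i and m stand for the MARC 21 holdings 852
subfields for classification part, item part and call number suffix.

```python
import re

# Hypothetical list of known suffix prefixes for one library.
KNOWN_SUFFIX_PREFIXES = ("v.", "no.", "pt.", "suppl.")

# Assumed shape: classification, one or two Cutter-like item parts,
# then an optional free-text suffix.
CALL_NUMBER = re.compile(
    r"^(?P<h>[A-Z]{1,3} ?\d+(?:\.\d+)?)"    # e.g. "HC 110" or "QH76"
    r"\s+(?P<i>\.[A-Z]\d+(?: [A-Z]\d+)?)"   # e.g. ".E5 E35"
    r"(?:\s+(?P<m>.+))?$"                   # e.g. "v.1" or "1975"
)


def split_call_number(text):
    """Return a dict of parts, or None if the pattern is not followed."""
    match = CALL_NUMBER.match(text.strip())
    if match is None:
        return None
    suffix = match.group("m")
    if suffix is not None:
        known = suffix.startswith(KNOWN_SUFFIX_PREFIXES)
        year = re.fullmatch(r"\d{4}", suffix) is not None
        if not (known or year):
            return None  # irregular suffix: set aside for manual review
    return {"h": match.group("h"), "i": match.group("i"), "m": suffix}


if __name__ == "__main__":
    print(split_call_number("HC 110 .E5 E35 v.1"))
```

Returning None for anything irregular keeps the doubtful cases visible
instead of guessing at their parts.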

If you worry about your concatenated call numbers not following a
sufficiently regular pattern for future decomposition into their correct
constituent parts, then post your irregularities and we can examine them.
There may be some alternatives involving modifying the bibliographic
frameworks to use the available items subfield places, although the places
available within 0-9 and a-z are not quite sufficient for storing all the
possible standard call number parts.


5.  REAL FIXES.

There is a fork of Koha used at Near East University in Cyprus in which
Tümer Garip fixed the problem for his library in 2006.  That code is
available, but already in 2006 it was very divergent from the main branch
of Koha.  Tümer's fork is missing many improvements in the main branch of
Koha, and Koha is missing many of his improvements, which were implemented
too rapidly for the main branch to keep pace while also maintaining
support for the needs of all the libraries using Koha.  It is very
unfortunate that the different needs and pacing of work did not allow
better coordination.

LibLime had been developing better holdings support for Koha 3.2 to start
solving many problems with holdings in Koha.  Better support for serials
had been one of the goals.  That work had been based in part on Tümer's
work.  However, LibLime's withdrawal from participation with the rest of
the Koha community has meant that the work is sadly not being shared.

The problem will be addressed in a future version of Koha, but without the
participation of LibLime it will not be addressed properly for 3.2.  We
could test some framework changes for Koha 3.2.  I have an uncommitted
update to the MARC 21 frameworks which I had not quite finished over a
year ago.  It also needs bringing up to date from that point, which
requires a few days' work.

I consider the issue a bug but it does not prevent libraries interested in
Koha from using Koha.


Thomas Dukleth
Agogme
109 E 9th Street, 3D
New York, NY  10003
USA
http://www.agogme.com
+1 212-674-3783


>
> Hope this helps.
>
>
> Ian Walls
> Systems Integration Librarian
> NYU Health Sciences Libraries
> 550 First Ave., New York, NY 10016
> (212) 263-8687
>
>
>
>
>
> From: koha-bounces at lists.katipo.co.nz
> [mailto:koha-bounces at lists.katipo.co.nz] On Behalf Of Amy Schuler
> Sent: Tuesday, October 27, 2009 10:25 AM
> To: koha at lists.katipo.co.nz
> Subject: [Koha] Koha migration & subfield problem
>
> Hello,
>
> Does anyone have any ideas regarding my problem with 852/952 $m.  The
> problem is that, in editing all of my holdings data to migrate nicely into
> Koha, I cannot figure out how to migrate my old 852 $m data (call number
> suffix, a standard MARC subfield, I have publication years and volume #s
> stored in this field). When I use MarcEdit to swap fields & edit data, I
> can write a regular expression that will find all 852 $h, i, and m and
> move them to 952 $o.  However, this only finds & replaces those 852s that
> have all 3 of those subfields.  Most of my records do not have all 3, they
> only have $h and i.  So, when I run a second find & replace to pick up the
> 852s that only have those two subfields, then a second 952 $o is created.
>
> In short, it seems that my choices in regard to $m data are: (1) move $m
> to some completely different subfield ($z public note, $t copy number?
> Little misleading), or (2) have 2 instances of $o in some of my 952
> fields, although I am not sure how this will render, or even whether Koha
> will accept this (is $o repeatable??)
>
> I hope I have been clear.  Please let me know if you have ideas.  I am
> running Koha v. 3.0001005 and I attach a sample record (with most edits
> made) below.  Note that this record contains two 952s because it is a 2
> volume set, as well as the $m's.
>
>
> =LDR  01105cam  2200301 i 4500
> =001  \\\74075531\//r952
> =003  DLC
> =005  19950918112125.9
> =008  750501s1975\\\\nyua\\\\\b\\\\00010\eng\\
> =010  \\$a   74075531 //r952
> =020  \\$a091384800X
> =040  \\$aDLC$cDLC$dDLC
> =043  \\$an-us---
> =050  \\$aQH76$b.E36
> =082  00$a333.7/2
> =100  10$aEgler, Frank Edwin,$d1911-
> =245  14$aThe plight of the rightofway domain :$bvictim of vandalism /$cby
> Frank E. Egler, consultant, and Stan R. Foote.
> =260  0\$aMt. Kisco, N.Y. :$bFutura Media Services,$c[1975]
> =300  \\$a2 v. :$bill. ;$c23 cm.
> =500  \\$aVol. 2 has also special title: Personalized documentation.
> =504  \\$aIncludes bibliographical references.
> =650  \0$aLandscape protection$zUnited States.
> =650  \0$aRight of way$zUnited States.
> =650  \0$aBrush$xcontrol$xEnvironmental aspects$zUnited States.
> =650  \0$aPlant ecology$zUnited States.
> =700  10$aFoote, Stan R.,$ejoint author.
> =952  \\$p10020$oHC 110 .E5 E35$mv.1$d20030108
> =952  \\$p10021$oHC 110 .E5 E35$mv.2$d20030108
>
>
>
> Amy C. Schuler
> Manager of Information Services
> Cary Institute of Ecosystem Studies
> PO Box AB
> Millbrook, NY  12545
> (845) 677-7600 x164
> schulera at caryinstitute.org<mailto:schulera at ecostudies.org>

[...]


