[Koha] data import question - a few more thoughts

baljkas at mb.sympatico.ca baljkas at mb.sympatico.ca
Fri Jul 25 12:50:50 NZST 2003


Thursday, July 24, 2003   18:46 CDT

Hi, Larry,

Glad that some of what I wrote was useful. I know that the MARC stuff isn't
easy for those who haven't been 'initiated' into its mysteries.

"Larry Stamm" <larry at larrystamm.com> wrote:
>baljkas  <baljkas at mb.sympatico.ca> writes:
>
>    > Just a quick addendum re: Larry Stamm's response -- [snip]
>    > viz. what is arbitrary in the '852'.
>
>    > Of course, all of the structure of tags and subfields is
>    > arbitrary. Remember to check LC's MARC documentation to see what
>    > fields and subfields within fields are defined and how they are
>    > used. The concise MARC info online is kept up-to-date and it does
>    > contain examples to help clarify proper usage. For 852 surf to
>    > URL <http://www.loc.gov/marc/bibliographic/ecbdhold.html#mrcb852>
>
>    > From the prescribed definition and usage examples given, you will
>    > note that [blah, blah, blah, snip]
>
>    > Mapping for 852a, 852b, and 852f should not be treated
>    > arbitrarily.
>
>My understanding is that the 852 field is all for "local" data, and is
>mainly for use within an organization. Is that true?  It seems that our
>library staff has manually added all the 852 data since we automated,
>and this is where the majority of irregularities occurs in our MARC
>records.

I snipped back some of my verbiage to make the more important parts a
little more identifiable.

Yes, your understanding is correct, Larry, in that 852 is "mainly" for use
within an organization. However (you knew this was coming, didn't you? ;-)
) ...

Part of the whole purpose of creating MARC records was the sharing of
cataloguing data. LC proves this daily by giving us all free access to its
records (well, one has to have the Internet, etc., etc., but that's not
their fault).

Part of what we're supposed to be doing, as cataloguers, is obey the
conventions (AACR2R, ISBD, etc.) and the standards for use of the MARC21
system (this is why we are given examples, and for those lucky enough to
work in libraries that can afford such, access to the Library of Congress
Rule Interpretations, LCRIs).

If sharing is going to be possible, we do have to look at how things are
set up. Yes, it was arbitrary (small groups of people working mainly at
LC), but it is thorough. It isn't really a good idea to start switching
things around.

I remember Athena very well. Especially the arguments I had with my
so-called assistant over the useless entry of data like source of
acquisition date, fund, price, all of which can be entered in other MARC
fields properly kept out of patrons' prying eyes anyway (no, I know that
data isn't useless, but it can be very frustrating to try explaining to a
patron why you're charging them $25 to replace a book they lost when your
own catalogue says it only cost $5 [way back in 1979, of course] and I
would still rather spend cataloguing time on Authority Records or at least
on augmenting the collection with good 520s, genre labels and keywords to
improve search results).

While I admire the ingenuity of the computer-savvy in creating these
wonderful systems -- and Athena is a very neat system that I quite like --
the problems it runs into are largely of its creators' own making in
deliberately deciding they know better than the sages of LC.

>The reason I chose the arbitrary mappings I did was because our current
>MARC records have no data in those subfields, so there would be no
>conflict in importation into Koha. [snip]

When I was in grad school for history, my advisor constantly warned us
students of the dangers of present-ism: 'If it's this way now, that's how
it will always be, how it always was.' So, while I completely sympathise,
Larry, LC-NLC have determined the usage of the field and subfields therein
and it really isn't a good idea not to follow form.

Who knows what developments in the future will bring? It makes it easier if
we're all playing the game with the same rule book, even just for the
present, let alone the future.

The idea that all cataloguing is isolated from affecting others really
belongs to the pre-MARC/online era. If we all follow the rules, we can all
help each other out now and in the future.

In the alternate mapping I suggested (and keep in mind, that's only my
off-the-cuff attempt; it has no particular authority), I assumed you wanted
to use the one field 852, so I just chose subfields that would allow you to
do what you wanted within that one field. (BTW if the tab thing you were
writing about to Derek would allow keeping the data in allowed subfields
only, that would be preferable to what I suggested viz. a $d or $w.)

>The manual for our current software (Sagebrush's Athena) suggests this
>mapping:
>
>    852 6 --> Format (eg, book, paperback)
>    852 7 --> Acquisition fund
>    852 8 --> Acquisition date [well, you know that ain't allowed!!
		  in fact, if you followed convention and inputted
		  the date in this field, e.g. as 20030724, depending
		  on what system you migrated to later, you could have
		  real fun with the computer trying to find something
		  to link to]
>    852 9 --> Acquisition price

When I was first confronted with this, I was mighty perturbed. I understand
that Athena's programmers wanted to collect the data in one field: all they
needed to do though was use multiple 852 $x fields! There is no reason at
all for their creativity!! Following the rules would actually simplify
matters.
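
A minimal sketch of that idea, assuming each field is modelled as a plain list of (code, value) pairs rather than through any real MARC library: the data Athena keeps in 852 $6 through $9 is folded into repeated $x subfields (nonpublic notes in LC's 852 definition). The labels, the sample call number, and the function names are hypothetical.

```python
# Hypothetical sketch: move Athena's local 852 $6-$9 data into repeated
# $x (nonpublic note) subfields. Fields are plain lists of (code, value)
# pairs; this is not the API of any real MARC library.

# Labels for the locally used codes, taken from the Athena mapping quoted above.
ATHENA_LOCAL = {
    "6": "Format",
    "7": "Acquisition fund",
    "8": "Acquisition date",
    "9": "Acquisition price",
}


def remap_852(subfields):
    """Return a new subfield list with Athena's local codes rewritten as $x notes."""
    remapped = []
    for code, value in subfields:
        if code in ATHENA_LOCAL:
            # Keep the data, but label it so the note is self-explanatory.
            remapped.append(("x", f"{ATHENA_LOCAL[code]}: {value}"))
        else:
            remapped.append((code, value))
    return remapped


def render(tag, subfields):
    """Render a field in the '$' + code + value style used in this thread."""
    return tag + " " + " ".join(f"${code}{value}" for code, value in subfields)


# Made-up sample data for illustration only.
athena_852 = [
    ("h", "FIC"),
    ("6", "paperback"),
    ("7", "GenBkFund"),
    ("8", "20030724"),
    ("9", "22.50"),
]
print(render("852", remap_852(athena_852)))
```

The point of the design is only that nothing is lost: every local value survives, but $6 and $8 are left free for their defined linkage roles.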

Personally, I would have preferred the data to be placed in a good old 037
tag. To my mind, its subfields provide for all of the above, plus it can be
entered at the time one creates a temporary minimal record (to indicate to
patrons searching that an item is on-order for example) without the need to
muck about with the 852 or other call number field until the work is
received in cataloguing.

The map for the 037 is simple:
   $a Stock number - possibly useful for re-ordering/replacement
   $b Source of stock number/acquisition **
   $c Terms of availability = (in almost all cases) price **
   $f Form of issue - easily adaptable to an institution's needs,
      with a limited institution-specific set of terms or ones
      more generally recognised
   $n Note - where you could easily enter acquisition fund (usually
      coded in most institutions anyway), order date and received-in
      date

ex. 037 _ _ $bBook Barn $c$22.50 $c$5.00 (sale price) $fTrade pbk.
	    $nGenBkFund; ordered 20021209; received 20030108.
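
The 037 map above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, the field is a plain list of (code, value) pairs, and no real MARC library is involved.

```python
# Hypothetical sketch: assemble an 037 field from acquisition data,
# following the subfield map above ($a, $b, $c, $f, $n).


def build_037(source, prices, form=None, fund=None, ordered=None,
              received=None, stock_number=None):
    """Return the 037 subfields as a list of (code, value) pairs."""
    subfields = []
    if stock_number:
        subfields.append(("a", stock_number))
    subfields.append(("b", source))
    # $c is repeated once per price, as in the example above.
    subfields.extend(("c", price) for price in prices)
    if form:
        subfields.append(("f", form))
    # Fund, order date and received date share a single $n note.
    note_parts = []
    if fund:
        note_parts.append(fund)
    if ordered:
        note_parts.append(f"ordered {ordered}")
    if received:
        note_parts.append(f"received {received}")
    if note_parts:
        subfields.append(("n", "; ".join(note_parts) + "."))
    return subfields


def render_037(subfields):
    """Render with both indicators blank, shown as '_ _' as in the example."""
    return "037 _ _ " + " ".join(f"${code}{value}" for code, value in subfields)


field = build_037(
    source="Book Barn",
    prices=["$22.50", "$5.00 (sale price)"],
    form="Trade pbk.",
    fund="GenBkFund",
    ordered="20021209",
    received="20030108",
)
print(render_037(field))
```

With the values shown, the printed line reproduces the 037 example above on a single line.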
	   
>Given the discrepancy between this usage and that laid out in the Library
>of Congress website, I'm wondering just how standardized actual
>MARC implementations are across all libraries?

Not very standardised, as you rightly hint. Or at least, not so much in the
details. Core fields like the 008, 100, 245, 250, 260, 300 tend to be
pretty good. But that doesn't mean we should stop trying. If it makes you
feel any the better, I keep running across records in the NLC database
where contributing cataloguers still aren't even following ISBD description
rules for newer records (ca. 1985) and the ISBD rules, IIRC, are older than
me!

As a cautionary tale, let me warn you about what happened to one school
division in Winnipeg. For years that division (and the one I reside and
have worked in) used TKM Microcat, a decent system for when it was first
created. They were told not to bother coding the 008 -- a core required
field -- because that system didn't really use it (one can't sort results
by date, for example, because of this).

That minor reduction in labour over the years came as little comfort to the
division's cheque book a few years back when all the schools had to send
ALL their records for RECON because the new system they migrated to was
MARC-compliant and viewed the records it was fed as the garbage they were!
I kinda think LC should start gathering up a page of horror stories like
this to remind folks why it is important to follow those nagging standards,
even when it doesn't seem to have any purpose at the time. ;-)
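
To make the 008 point concrete, here is a small sketch of why an uncoded 008 breaks date sorting. It assumes the standard 40-character 008 with Date 1 in character positions 07-10; the record dictionaries, titles, and helper names are made up for illustration.

```python
# Hypothetical sketch: pull Date 1 out of the 008 and sort records by it.
# Assumes a 40-character 008 with Date 1 in positions 07-10.


def make_008(year):
    """Build a made-up, correctly sized 008 string for the examples below."""
    return "030724" + "s" + year + " " * 4 + "mbc" + " " * 17 + "eng" + " " + "d"


def date1_from_008(field_008):
    """Return Date 1 as an int, or None if the 008 is missing or unusable."""
    if field_008 is None or len(field_008) != 40:
        return None
    date1 = field_008[7:11]
    # Partly unknown dates such as '19uu' are treated as unusable here.
    if date1.isascii() and date1.isdigit():
        return int(date1)
    return None


def sort_key(record):
    year = date1_from_008(record["008"])
    # Records without a usable date sort after all dated records.
    return (year is None, year or 0)


records = [
    {"title": "Uncoded record", "008": None},
    {"title": "Newer book", "008": make_008("2003")},
    {"title": "Older book", "008": make_008("1979")},
]

for record in sorted(records, key=sort_key):
    print(date1_from_008(record["008"]), record["title"])
```

A system that relies on the 008 can do nothing better with the uncoded record than push it to the end, which is roughly what the migrating division discovered across its whole database.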

In any case, LC sets the structure (and it isn't completely autocratic BTW:
someone at your level of authority in a library can make suggestions; my
former supervisor did) and if you think it's difficult now, think what
might happen if they designate something your system was using for a
different purpose and many others go along with the change.
Either 'welcome back to the age of splendid isolation' or see above for
'The moral of the story is ...'

>I went ahead and made some small changes, like changing "fic", "Fic",
>"FiC", all to "FIC" and limiting further entries into the Athena
>database to "FIC" for fictional books.  That amounted to a criminal
>offence (almost), according to the staff. :)

I would've brought you a thank-you card and cake! ;-)
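
For anyone wanting to repeat that clean-up in bulk, a minimal sketch follows. It assumes the prefix has already been pulled out of each record as a plain string; the authorized list and the function name are hypothetical and would need to match local practice.

```python
# Hypothetical sketch: normalize call-number prefixes such as "fic", "Fic"
# and "FiC" to one authorized form, and refuse anything unrecognised.

AUTHORIZED_PREFIXES = {"FIC"}  # assumption: extend with local values as needed


def normalize_prefix(raw):
    """Return the authorized form of a prefix, or raise ValueError."""
    cleaned = raw.strip().upper()
    if cleaned not in AUTHORIZED_PREFIXES:
        raise ValueError(f"unauthorized prefix: {raw!r}")
    return cleaned


for variant in ["fic", "Fic", "FiC", " FIC "]:
    print(repr(variant), "->", normalize_prefix(variant))
```

Raising an error for unknown values, rather than guessing, leaves the decision about oddities with a cataloguer.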

>From a management point of view, I think it will be easier to just get
>the data correctly imported into Koha's database tables and do the
>correction and standardization in that database. Then limiting manual
>biblio entries to only authorized standard values and/or formats can be 
>hidden within the general chaos of implementing new software.

I agree. Plus, it makes the patrons feel more at ease and part of the
positive change, and that is of no inconsiderable value!

Anyway, I hope some of what I've said this time makes sense, too, and that
it may be useful to you and others.

Cheers from Winnipeg,

Steven F. Baljkas
library tech at large
Koha neophyte




