[Koha] Data Conversion
baljkas at mb.sympatico.ca
Wed Oct 1 21:04:24 UTC 2003
Wednesday, October 1, 2003 15:32-16:04 CDT
Hi, Luke,
Regarding your data conversion problem, you DON'T need to figure out the
PERL stuff right now if you don't want to (I mean, knock yourself out, if
you do ... ;-)).
As I have posted to the listserv several times, there is a VERY NEAT and
FREE program available off of the Library of Congress MARC ... Tools page
at
URL <http://www.loc.gov/marc/marctools.html#recordtools>
called MarcEdit. You will also need to download the free tools MARCMaker
(and MARCBreaker) from the same page.
The part that addresses your specific situation reads: "[...] the MarcEdit
Delimited Text Translator provides users with a simple method for mapping
delimited files into MARC."
Here is all the info about MarcEdit directly from that page:
===============================
MarcEdit - Free
MarcEdit is a free Windows-based utility that runs on any PC (486+) using
Windows 9x/ME/NT/2000/XP.
MarcEdit is made up of two core components: the MarcEngine and the
MarcEditor, in addition to a number of built-in MARC utilities. Currently,
MarcEdit's MarcEngine supports: 1) MARC->mnemonic plain text 2) mnemonic
plain text->MARC 3) MARC->XML - Function uses LC's new MARCXML Schema
4) MARC->Dublin Core (unqualified 1.1). The MARC editor is equipped with
numerous global editing options, like the ability to add or delete fields
or subfields and the ability to edit or change indicators. The MarcEditor
provides MarcEdit with a robust MARC editor, making record maintenance and
record creation a snap.
MarcEdit provides an integrated interface for utilizing the Library of
Congress's web Z39.50 client to import MARC records directly into the
MarcEditor. In addition to MarcEdit's core functions, the current version
of MarcEdit (version 4.1), includes a built-in Script Wizard and Delimited
Text Translator utility. The MarcEdit Script Wizard has been designed to
help users generate simple scripting solutions for common maintenance
problems, while the MarcEdit Delimited Text Translator provides users with
a simple method for mapping delimited files into MARC. MarcEdit has also
been designed using Microsoft's COM architecture, allowing users to access
the functionality of the MarcEngine through numerous scripting or
programming languages like vbscript, jscript, Visual Basic, C++ and PERL.
The help file includes a number of simple programming examples in both
vbscript and PERL for most of the application's exposed functions, making
your only limitation your imagination.
URL <http://www.onid.orst.edu/~reeset/marcedit/html/>
Contact:
Terry Reese
Oregon State University
121 Valley Library
Corvallis, OR 97331-4501
U.S.A.
Internet e-mail address: terry.reese at orst.edu
Telephone: +1-541-343-2397
FAX: +1-541-737-8267
===================================================
If you require some additional help with it, please write back to me
off-listserv and I will be glad to do what I can to assist. If your system
isn't Windows-based and you have no friendly helper nearby with one, I'd be
glad to give the conversion a go for you.
From the sample you provided, Luke, the mapping should be fairly
straightforward. For the information in the following fields, I'd suggest
the following mapping to the 541 field (both indicators blank; below $ =
subfield delimiter) which you are free to use or not:
Type ---------> 541 $o Type of Unit
Accession ----> 541 $e Accession Number (I wasn't clear on what kind of
                       info your library was storing here but assumed
                       it was an accession number of some kind)
Acq. Year ----> 541 $d Date of Acquisition
Cost ---------> 541 $h Purchase Price
I realise that the 541 field is normally used for archival materials, but
its usage in that regard is not prescriptive and it seems a shame to waste
a very well-defined field.
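
To make the 541 mapping concrete, here is a minimal Python sketch that turns
one delimited record into a mnemonic-format line of the kind MARCMaker and
MarcEdit read. The column names (Type, Accession, Acq. Year, Cost) are
assumed from the sample, the function name build_541 is hypothetical, and
the backslash-for-blank-indicator convention is the mnemonic format as I
understand it, so check it against your own MarcEdit output before relying
on it.

```python
# Sketch only: column names and build_541 are assumptions, not part of MarcEdit.
SUBFIELD_MAP = [
    ("o", "Type"),       # 541 $o Type of Unit
    ("e", "Accession"),  # 541 $e Accession Number
    ("d", "Acq. Year"),  # 541 $d Date of Acquisition
    ("h", "Cost"),       # 541 $h Purchase Price
]


def build_541(row):
    """Return a mnemonic 541 line for one record, or None if nothing maps."""
    parts = []
    for code, column in SUBFIELD_MAP:
        value = row.get(column, "").strip()
        if value:  # skip empty columns rather than emit empty subfields
            parts.append("$" + code + value)
    if not parts:
        return None
    # Two backslashes stand for the two blank indicators.
    return "=541  \\\\" + "".join(parts)


record = {"Type": "Book", "Accession": "1234", "Acq. Year": "1998", "Cost": "12.95"}
print(build_541(record))
```

The same thing can of course be done entirely inside the Delimited Text
Translator; the sketch is only meant to show what ends up where.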
I know the issue of the call numbers has been addressed on listserv before.
I've ignored all of the discussion because I need LC numbers and Koha isn't
quite there yet. Hopefully someone else will be able to redirect you to
that info and where and how it should be mapped.
IIRC someone wrote to the listserv a while back about fixing the ISBNs; in
any case, you will need to remove the hyphens in mapping it over to the 020
ISBN field.
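
If you would rather clean the ISBNs before running the translator, a small
Python sketch along these lines would do it. The function names are my own,
and the check-digit test assumes 10-digit ISBNs, which is what a catalogue
of this vintage will hold.

```python
def normalize_isbn(raw):
    """Strip hyphens and spaces; uppercase a trailing 'x' check digit."""
    return raw.replace("-", "").replace(" ", "").upper()


def is_valid_isbn10(isbn):
    """Check the ISBN-10 check digit (weights 10 down to 1, modulo 11)."""
    if len(isbn) != 10:
        return False
    total = 0
    for position, char in enumerate(isbn):
        if char == "X" and position == 9:
            value = 10  # 'X' is only legal as the final check digit
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


cleaned = normalize_isbn("0-306-40615-2")
print(cleaned, is_valid_isbn10(cleaned))
```

Anything that fails the check is worth setting aside for a human to look at
rather than loading it into the 020 as is.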
Luke, I can also tell you from the small sample provided that there may be
some inconsistencies in your library's catalogue with respect to current
ISBD (International Standard Bibliographic Description) and AACR2R
(Anglo-American Cataloguing Rules, 2nd edition, revised - most up-to-date
being the 2002 revision, 2003 update) standards.
If your library is following different standards, that's fine; you'll just
need to adjust some of the MARC coding and so it helps to know that from
the start. Otherwise, someone will need to go into the records and tweak
them manually (e.g. author's surname is not to be all caps in proper ISBD),
which, depending on your student and/or adult library volunteers, is an
excellent little project, in my experience at least.
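
To give the volunteers a worklist, something like the following Python
sketch could flag the records whose author surname is in all caps. It
assumes the author column is stored as "SURNAME, Forename" (a guess from the
sample) and the function name is hypothetical. It only flags; it does not
try to recase names, since something like McDONALD needs a human eye.

```python
def surname_is_all_caps(author):
    """True if the part before the first comma is entirely upper case."""
    surname = author.split(",", 1)[0].strip()
    letters = [c for c in surname if c.isalpha()]
    # Require at least two letters so a bare initial is not flagged.
    return len(letters) > 1 and all(c.isupper() for c in letters)


authors = ["ASIMOV, Isaac", "Clarke, Arthur C.", "O'BRIEN, Flann"]
to_review = [name for name in authors if surname_is_all_caps(name)]
print(to_review)
```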
You should also find out from your librarian before you start which
thesaurus the subject terms are from (I'd guess Sears from what you gave)
as this is also directly coded into the MARC fields. In the 650 field,
indicator 2 gives the thesaurus source: for Sears it would be 7, with the
source named in subfield $2 as $2sears<version no.>
(e.g. 650 _7 $aSpace flight. $2sears19).
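
As a rough illustration of that coding, here is a Python sketch that builds
a mnemonic-format 650 line with a blank first indicator and second indicator
7. The function name build_650 is hypothetical, the source string is passed
through exactly as given (use whatever your librarian confirms), and the
backslash for the blank indicator is the mnemonic convention as I understand
it.

```python
def build_650(term, source):
    """Return a mnemonic 650 line: first indicator blank, second indicator 7."""
    heading = term.strip()
    if not heading.endswith("."):
        heading += "."  # subject headings conventionally end with a full stop
    # One backslash = blank first indicator; 7 = source specified in $2.
    return "=650  \\7$a" + heading + "$2" + source


print(build_650("Space flight", "sears19"))
```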
Hope this helps, Luke. Do feel free to give me a shout off-listserv if you
want any aid. And please do let us know on listserv how you did it and how
it went when you're done.
Cheers,
Steven F. Baljkas
library tech at large
Koha neophyte
Winnipeg, MB, CANADA
<baljkas at mb.sympatico.ca>