[Koha] Batch import-updating of existing biblios

hansbkk at gmail.com
Thu Feb 24 16:57:14 NZDT 2011


On Thu, Feb 24, 2011 at 4:09 AM, M. Brooke Helman
<abesottedphoenix at yahoo.com> wrote:
> Salve!
>>>
> And I would like to "add value" to them later on - say I find a MARC
> source that assigns subject headings in a way that I like, or adds
> links to cover images, etc. looking to selectively overlay/insert a
> subset of the new records into my existing ones.
>
> How does Koha handle this?
>>>
>
>      Okay, that's a good question :)
>
>      Koha pretty much doesn't right now, but folks are taking some hammers to
> the code to make it try and manage that safely. PTFS kind of doesn't get that
> you need a scalpel and not a caber for this job. (What else is new?)
>
>>>
> Or would I be better off exporting my current records to MARC, using
> an external MARC editing tool to handle the merge and then re-import?
> Can anyone suggest tools for me to look at? I'm aware of biblios.net
> and MarcEdit so far.
>>>
>
>       Currently yes, and I'd use the latter not the former, since the former
> sucks.
>
>
>>>
> Or if it's really better to do this one record at a time, does that
> change the recommendation as to using Koha for the process?
>>>
>
>       Mebbe. Sometimes Koha can put weird stuff into a MARC record, though
> that's mostly over with by now. I used it as an editor since I was in the middle
> of nowhere and no one would ever notice the aforementioned weird stuff anyhow.
>
>>>
> And finally, what if there are certain field values that I do want
> overwritten in addition to the new ones being brought in, but some
> fields shouldn't be touched at all?
>>>
>
>       Hmph. I'd wait for Nicole's word on that.
>
> Cheers,
> Brooke

Thanks Brooke.

I'm thinking I'll use Koha for one-at-a-time editing, and try to
develop a sound workflow for bulk updating using external tools. Here
are my current ideas - a sanity check (from anyone) would be
appreciated. I realize it's a kludge, and I wish I were up to speed on
text-processing tools like Perl 8-(

Unfortunately, it seems that MarcEdit is good at automating merges
when you're merging in specific fields from consistent sources,
matching on control numbers etc., but not at the kind of ad-hoc
updating I'm doing.

Note that the steps below assume I'm collecting a bunch of MARC
records from a variety of sources, the goal being to cherry-pick the
best-quality data from each - some sources have good subject headings,
others more accurate imprint/collation data etc.

1 Export MARC records from Koha with a solid local control number (001)

2 Split these out into one biblio record per file, each file named
"koha.mrc" and placed in a folder named after its control number;
drop the item-level fields.

3 Bring in external MARC records, downloading each file into the
matching per-biblio folder and naming it after its source.

4 Use MarcEdit to join all the files in each folder into one working
multiple-record MARC file per title, then export as CSV and bring into
a spreadsheet.

One worksheet per title, one row per source MARC file. Column headers
show the MARC fields used, so everything is in one place and easy to
scan.

5 Copy and paste to consolidate everything into a single row (ensuring
the Koha control number remains), and delete the fields not used in
Koha. Save/export the new master record back as MARC into the same
folder, named "new.mrc".

6 Join all the new.mrc files into a single file and stage it into the
reservoir.

7 Use (one of the) Koha merge functions to update the existing biblios
based on the control number matching - all existing data in each
record overwritten by the incoming data.
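
The mechanical parts of the steps above (the split in step 2 and the
join in step 6) could probably be scripted rather than done by hand.
Below is a rough sketch in Python using only the standard library. It
assumes the export is plain ISO 2709 MARC, that every record has a 001
that is safe to use as a folder name, and that item data lives in 952
(my understanding of Koha's MARC21 setup; adjust drop_tags if yours
differs). The function names are made up for illustration, and a
dedicated library such as pymarc would likely be more robust.

```python
import os

RT = b"\x1d"  # ISO 2709 record terminator
FT = b"\x1e"  # ISO 2709 field terminator


def parse_record(raw):
    """Return (leader, [(tag, field_bytes_without_terminator), ...])."""
    leader = raw[:24]
    base = int(leader[12:17])          # base address of data
    directory = raw[24:base - 1]       # directory ends with FT at base-1
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]    # tag(3) + length(4) + start(5)
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])
        start = int(entry[7:12])
        fields.append((tag, raw[base + start:base + start + length - 1]))
    return leader, fields


def build_record(leader, fields):
    """Rebuild a record, recomputing the directory and leader lengths."""
    directory = b""
    body = b""
    for tag, data in fields:
        chunk = data + FT
        directory += tag.encode("ascii") + b"%04d%05d" % (len(chunk), len(body))
        body += chunk
    base = 24 + len(directory) + 1
    total = base + len(body) + 1
    new_leader = (b"%05d" % total) + leader[5:12] + (b"%05d" % base) + leader[17:24]
    return new_leader + directory + FT + body + RT


def split_by_control_number(path, out_dir, drop_tags=("952",)):
    """Step 2: write <out_dir>/<001>/koha.mrc per record, minus drop_tags."""
    with open(path, "rb") as fh:
        blob = fh.read()
    written = []
    for raw in blob.split(RT):
        if not raw.strip():
            continue
        leader, fields = parse_record(raw + RT)
        control = next(
            (d.decode("utf-8").strip() for t, d in fields if t == "001"), None
        )
        if not control:
            continue  # no usable control number, so skip the record
        kept = [(t, d) for t, d in fields if t not in drop_tags]
        folder = os.path.join(out_dir, control)
        os.makedirs(folder, exist_ok=True)
        target = os.path.join(folder, "koha.mrc")
        with open(target, "wb") as out:
            out.write(build_record(leader, kept))
        written.append(target)
    return written


def join_new_records(out_dir, combined_path, name="new.mrc"):
    """Step 6: concatenate every <out_dir>/<folder>/new.mrc into one file."""
    count = 0
    with open(combined_path, "wb") as out:
        for folder in sorted(os.listdir(out_dir)):
            candidate = os.path.join(out_dir, folder, name)
            if os.path.isfile(candidate):
                with open(candidate, "rb") as fh:
                    out.write(fh.read())
                count += 1
    return count
```

Something like split_by_control_number("export.mrc", "work") would
produce the per-biblio folders, and join_new_records("work",
"combined.mrc") would give a single file to stage once the new.mrc
files exist. Worth trying on a copy of the data first.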


Obviously it would be great if either Koha or some other external tool
allowed one to display multiple MARC records on screen in a GUI,
allowing one to tick off the fields containing desired data, copying
and pasting as needed and then one-button "merge" as specified to
produce a new record.
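
To make the "tick off the fields, then merge" idea concrete, here is a
minimal sketch of the selection logic in Python. It is not anything
Koha or MarcEdit provides. It assumes each source record has already
been reduced to a list of (tag, value) pairs, that the ticked boxes
amount to a mapping from tag to the source that should supply it, and
that the local record is filed under the hypothetical name "koha". The
001 always comes from the local record so the control number survives.

```python
def merge_records(records, picks, base="koha"):
    """Build one record from several sources.

    records: {source_name: [(tag, value), ...]}
    picks:   {tag: source_name} for the fields ticked off
    Tags that were not picked fall back to the base record.
    """
    merged = []
    tags = sorted({tag for fields in records.values() for tag, _ in fields})
    for tag in tags:
        # Never let an external source replace the local control number.
        source = base if tag == "001" else picks.get(tag, base)
        merged.extend((t, v) for t, v in records.get(source, []) if t == tag)
    return merged


records = {
    "koha": [("001", "123"), ("245", "Title"), ("650", "Old subject")],
    "loc": [("001", "loc999"), ("300", "xii, 200 p."),
            ("650", "Better subject"), ("650", "Second subject")],
}
picks = {"650": "loc", "300": "loc"}
new_record = merge_records(records, picks)
```

Here new_record keeps the local 001 and 245, and takes the 300 and
both 650s from the external source. Repeatable fields are replaced as
a group, which may or may not be what one wants for every tag.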

But it's a lot to ask, I know; maybe one day Koha will have some of
this. If I can help the developers at any stage from a motivated
user's POV, someone please let me know how.

And thanks in advance for any feedback on the above, especially
suggestions on how to simplify a rather convoluted process; this just
doesn't seem to be an area where one can KISS. I wish I could just say
"eh, good enough" and move on, but my OCD perfectionism won't let go
8-)

