Thanks Ian, very helpful. A few more questions, though. On Tue, Sep 28, 2010 at 5:51 PM, Ian Bays <ian.bays@ptfs-europe.com> wrote:
Hi Elaine,
You will need a field to match on. If ISBN does not work for you, then some sort of control number should do. I often use a control number in the 001 and set up a matching rule on that field. If you have no match field, it becomes a manual job, and others might have a better idea of how to merge two bib records while keeping the items from both records, along with any other associated loan data, etc.
That's what I was afraid of. It may have to be a manual job. There are no ISBNs, and the 001 control numbers will be different because they're coming from two different databases. The way we find duplicates is a combination of author, title, and publication information. A big chunk of the first collection was published before 1900, and even the dates are estimates a lot of the time. Luckily, the first collection is reference only, so there is no loan data to worry about there. Only a very small percentage of items from the second collection are available for loan.

Would this work? For the second collection, the one which will have overlapping items, I could combine it with the first file and de-duplicate offline. If I transfer all the duplicate item information from the second batch to the record with the control number from the first batch, would it overwrite the biblio records in Koha? Or would I have to clear out the Koha database and import the new combined file? The first batch of material does not circulate, so there won't be any problems with items out on loan. After that, the subsequent collections won't have overlapping items, as they're different formats. Phew.
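(For what it's worth, the offline de-duplication step above can be roughed out with a small script. This is only a sketch under simplifying assumptions: records are plain Python dicts rather than MARC, the field names `author`, `title`, and `pubinfo` are made up for illustration, and the normalisation is deliberately crude — with pre-1900 material and estimated dates, any candidate groups it finds would still need manual review.)

```python
import re

def match_key(record):
    """Build a composite de-duplication key from author, title, and
    publication info (the fields described above), since there is no
    ISBN or shared control number to match on.
    Normalisation here is illustrative: lowercase, strip punctuation,
    collapse whitespace."""
    def norm(value):
        value = (value or "").lower()
        value = re.sub(r"[^\w\s]", "", value)   # drop punctuation
        return re.sub(r"\s+", " ", value).strip()
    return (norm(record.get("author")),
            norm(record.get("title")),
            norm(record.get("pubinfo")))

def find_duplicates(records):
    """Group records by composite key; any group with more than one
    record is a candidate duplicate pair for manual review."""
    groups = {}
    for rec in records:
        groups.setdefault(match_key(rec), []).append(rec)
    return {key: recs for key, recs in groups.items() if len(recs) > 1}
```

Because the key is exact-match after normalisation, variant spellings or differing date estimates will not collapse together; that conservatism is probably a feature here, since a false merge is worse than a missed one.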
Remember that the biblio matching uses the zebra search indexes so the first batch must be loaded and indexed before loading the second batch.
There will probably be a gap of a few months between imports.
Also (I recall someone had this issue): if you have a batch which itself contains duplicates by (say) ISBN, and you try to load it into an empty database expecting it to de-duplicate the bibs and add the items, that won't work. This is because the matching looks in the zebra indexes, which are not built at that point. There are ways around this: extract a unique list of ISBNs from the data, make minimal bib records with just the ISBN, build the zebra indexes, then load the real data matching on ISBN.
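(The first step of that two-pass workaround — extracting the unique ISBNs and generating stub records — might look roughly like this. A sketch only: records are plain dicts rather than MARC, and in practice the stubs would have to be written out as real MARC records and loaded and indexed with Koha's usual import tools before the second pass.)

```python
def unique_isbns(records):
    """Pass 1: collect the unique ISBNs from a batch that contains
    internal duplicates, in first-seen order. Hyphens and surrounding
    whitespace are stripped so formatting variants compare equal."""
    seen = []
    for rec in records:
        isbn = (rec.get("isbn") or "").replace("-", "").strip()
        if isbn and isbn not in seen:
            seen.append(isbn)
    return seen

def stub_records(isbns):
    """Minimal bib records carrying only an ISBN (020$a in MARC
    terms). Once these are loaded and zebra-indexed, the real batch
    can be loaded with a matching rule on ISBN, so each duplicate
    attaches its items to the existing stub bib."""
    return [{"isbn": isbn} for isbn in isbns]
```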
Hope that makes sense...
Ian On 28/09/2010 17:19, Elaine Bradtke wrote:
Has anyone done their bulk imports in batches? We've got different collections that are in separate databases. I was planning to import each collection separately, because slightly different things have to be done to the data in each collection. But there's a catch (isn't there always): there are duplicate copies of some of the same items across the collections. It seems logical that they should share the same biblio, with the location and collection information at the item level. How does this work if you are importing the different collections in sequence? A lot of our stuff pre-dates ISBN, so even identifying the duplicates may be a trick. I was wondering how, or if, anyone has dealt with this problem.
-- Ian Bays Director of Projects PTFS Europe.com mobile: +44 (0) 7774995297 phone: +44 (0) 800 756 6803 skype: ian.bays email: ian.bays@ptfs-europe.com
_______________________________________________ Koha mailing list Koha@lists.katipo.co.nz http://lists.katipo.co.nz/mailman/listinfo/koha
-- Elaine Bradtke Data Wrangler VWML English Folk Dance and Song Society | http://www.efdss.org Cecil Sharp House, 2 Regent's Park Road, London NW1 7AY Tel +44 (0) 20 7485 2206 ext 36 Mob +44 (0) 7789 373982 -------------------------------------------------------------------------- Registered Company No. 297142 Charity Registered in England and Wales No. 305999 --------------------------------------------------------------------------- "Writing about music is like dancing about architecture" --Elvis Costello (Musician magazine No. 60 (October 1983), p. 52)