How are authority records connected with biblio records?
Hi,

I am migrating several million records from an existing VTLS system to Koha. We are using MariaDB, Koha 17.05 (pulled from master yesterday), Elasticsearch 5.4 and Plack on Debian systems. As I wanted to preserve all the authority-to-biblio record links, I wrote my own migration script based on the relevant Koha source.

Searching in the Koha intranet authorities search finds the authority record, but it usually (though not always) reports the number of biblio records using the authority as zero. If I check the biblio index in ES, the items have the correct auth id in the "an" field. I can find no difference between the biblio records in the ES index that are reported as linked to the auth record and the ones that are not. I connect the biblio records to the authorities in the MARC21 structure by putting the Koha auth id in field 100, subfield 9 (if it is a name authority). These MARC records are now in the new biblio_metadata table.

How does Koha find the biblio records connected to a particular auth record? Is this done dynamically by searching the ES indexes?

Is this an issue with the Zebra database? Is it still required? I haven't been loading data into it since we switched to ES.

Best Regards,
Dave
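The check described in the message above (looking for an authid in the "an" field of the biblio index) can be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions: the field name `an` comes from the message, but the Elasticsearch URL, the index name, and the choice of a `match` query are assumptions that depend on the local Koha and Elasticsearch mappings. It is not how Koha itself computes the count.

```python
import json
import urllib.request


def build_linked_biblio_query(authid):
    """Build a query body matching biblio documents whose `an` field holds authid."""
    # Assumption: a plain match query on the authid string is sufficient
    # for the way `an` is mapped in the local index.
    return {"query": {"match": {"an": str(authid)}}}


def count_linked_biblios(es_url, index, authid):
    """Send the query to the Elasticsearch _count endpoint and return the hit count."""
    body = json.dumps(build_linked_biblio_query(authid)).encode("utf-8")
    request = urllib.request.Request(
        f"{es_url}/{index}/_count",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["count"]


# Hypothetical usage; the URL and index name are placeholders:
# count_linked_biblios("http://localhost:9200", "koha_mylibrary_biblios", 1234)
```

Comparing this count with the number shown in the authorities search for the same authid might help narrow down whether the difference lies in the index contents or in the query Koha issues.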
Hi David,

Did you really mean master, or did you pull from the 17.05.x branch? If you meant master, you aren't running 17.05 but the unreleased code that will eventually be 17.11.0.

The ES authority support is still a work in progress.

Chris
_______________________________________________ Koha mailing list http://koha-community.org Koha@lists.katipo.co.nz https://lists.katipo.co.nz/mailman/listinfo/koha
-- Chris Cormack Catalyst IT Ltd. +64 4 803 2238 PO Box 11-053, Manners St, Wellington 6142, New Zealand
Hi,

Thanks for your answer, Chris. Is there some kind of plan or road map to get it working, or should I just go ahead and try to fix it in a side branch of my own? Is anyone coordinating the work on this?

Best Regards,
Dave

On 08/03/2017 10:42 PM, Chris Cormack wrote:
The ES authority support is still a work in progress.
Chris
Hi David
I don't really know about the differences between Zebra and Elasticsearch. Also, I don't know exactly how you're massaging your data, of course.

However, with Zebra it is necessary to use the Perl script "link_bibs_to_authorities.pl", which checks each bibliographic record in the Koha database and attempts to link each of its headings to the matching authority record. The script is also able to do a test run.

Also note the system preferences in the Koha menu "Administration > authorities" as well as "IncludeSeeFromInSearches".

Hope this helps. Someone may correct me if I'm wrong.

Best wishes: Michael

--
Geschäftsführer · Diplombibliothekar BBS, Informatiker eidg. Fachausweis
Admin Kuhn GmbH · Pappelstrasse 20 · 4123 Allschwil · Schweiz
T 0041 (0)61 261 55 61 · E mik@adminkuhn.ch · W www.adminkuhn.ch
Hi Michael,

I originally tried the script link_bibs_to_authorities.pl, but there were two issues with it:

1. After running for days it would use up all available memory and stop working.
2. It does not preserve the links we have today between auths and biblios, but does its best to connect records based on various pattern-matching algorithms.

Best Regards,
Dave

On 08/09/2017 02:59 PM, Michael Kuhn wrote:
However, with Zebra it is necessary to use the Perl script "link_bibs_to_authorities.pl", which checks each bibliographic record in the Koha database and attempts to link each of its headings to the matching authority record. The script is also able to do a test run.
Hi David
I originally tried the script link_bibs_to_authorities.pl but there were two issues with it:
1. After running for days it would use up all available memory and stop working
In my case I used this script for a maximum of around 200,000 bibliographic records. I see in your original e-mail you wrote of "several million records"... Did you see that the script offers the following options? Maybe using them can change the observed behaviour:

--auth-limit=S ... Only process those headings which match an authority record that matches the user-specified WHERE clause.
--bib-limit=S ... Only process those bib records that match the user-specified WHERE clause.
--commit=N ... Commit the results to the database after every N records are processed.
2. It does not preserve the links we have today between auths and biblios but does its best to connect records based on various pattern matching algorithms.
Unfortunately, the man pages of this script (and of other scripts...) are not very explicit about what the scripts are really doing. Thanks for pointing this out!

Best wishes: Michael

--
Geschäftsführer · Diplombibliothekar BBS, Informatiker eidg. Fachausweis
Admin Kuhn GmbH · Pappelstrasse 20 · 4123 Allschwil · Schweiz
T 0041 (0)61 261 55 61 · E mik@adminkuhn.ch · W www.adminkuhn.ch
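As a rough illustration of how the --bib-limit and --commit options mentioned above might be combined to process a large catalogue in smaller runs, here is a minimal Python sketch that only builds the command lines. The script path, the batch sizes, and the idea of limiting by a biblionumber range in the WHERE clause are assumptions, not something confirmed in this thread. Whether separate runs avoid the memory problem would need testing.

```python
def batch_commands(max_biblionumber, batch_size, commit_every=1000,
                   script="misc/link_bibs_to_authorities.pl"):
    """Return one argument list per biblionumber range.

    Assumption: --bib-limit accepts a WHERE clause on biblionumber,
    and `script` is the path to the script in the local Koha checkout.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    commands = []
    for start in range(1, max_biblionumber + 1, batch_size):
        # The last batch may be shorter than batch_size.
        end = min(start + batch_size - 1, max_biblionumber)
        where = f"biblionumber BETWEEN {start} AND {end}"
        commands.append([
            "perl", script,
            f"--bib-limit={where}",
            f"--commit={commit_every}",
        ])
    return commands


# Hypothetical usage: run each batch as its own process.
# import subprocess
# for cmd in batch_commands(3_000_000, 50_000):
#     subprocess.run(cmd, check=True)
```

Each batch would run as a separate process, so memory held by one run is released before the next starts. This does not address the second issue raised in the thread: the script still matches headings itself rather than preserving existing links.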
Hi Dave,

I think for migrations something like this would work:

- Import your authorities. Koha will force 001 to match auth_header.authid.
- Create a file with the auth_header.authid and your old authority number, if there is one.
- Add $9<authid> to the authority-controlled fields in your MARC records.
- Import the bibliographic records.

AFAIK auth_header.authid must be the same as 001 in the authority record. Koha will enforce this on import, so if your records already carry an authority ID from your old data, you might want to store it in 035 with a unique prefix to differentiate between IDs.

Hope this makes sense,
Katrin

On 09.08.2017 16:23, David Holoshka wrote:
Hi Michael,
I originally tried the script link_bibs_to_authorities.pl but there were two issues with it:
1. After running for days it would use up all available memory and stop working
2. It does not preserve the links we have today between auths and biblios but does its best to connect records based on various pattern matching algorithms.
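The migration steps Katrin outlines above (map old authority numbers to the new authid, then add $9 to the authority-controlled fields before importing the biblios) can be sketched in Python as follows. This is a simplified sketch, not Koha code: the record is modelled as a plain list of (tag, subfields) pairs rather than a real MARC object, the controlled tags are only examples, and the assumption that the old authority number sits in subfield $0 of the heading is hypothetical and would need to be adapted to the source data. The sample values in the usage note are made up.

```python
def build_authid_map(imported_authorities):
    """Map old authority numbers to new Koha authids.

    imported_authorities: iterable of (authid, old_number) pairs, e.g. read
    from the file created after the authority import.
    """
    return {old_number: authid for authid, old_number in imported_authorities}


def add_authority_links(record, authid_map,
                        controlled_tags=("100", "700"), old_id_code="0"):
    """Append $9<authid> to controlled fields; return the number of links added.

    record: list of (tag, subfields), where subfields is a list of
    (code, value) tuples. Assumption: the old authority number is stored
    in subfield `old_id_code` of each controlled field.
    """
    linked = 0
    for tag, subfields in record:
        if tag not in controlled_tags:
            continue
        codes = dict(subfields)
        if "9" in codes:
            continue  # already linked, leave the field untouched
        authid = authid_map.get(codes.get(old_id_code))
        if authid is not None:
            subfields.append(("9", str(authid)))
            linked += 1
    return linked
```

For example, with a map built from `[(17, "vtls000123")]`, a 100 field carrying `("0", "vtls000123")` would gain the subfield `("9", "17")`, and a second pass over the same record would add nothing.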
Participants (4): Chris Cormack, David Holoshka, Katrin, Michael Kuhn