[Koha] Zebra not updating biblios automatically in koha 3.8

Ian Bays ian.bays at ptfs-europe.com
Sun Sep 2 10:32:57 NZST 2012


Hi.
The 3.8 upgrade offers DOM indexing by default. If you have taken that 
option (as seen in $KOHA_CONF), the XSL used instead of record.abs 
(~/koha-dev/etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl) 
uses a construct (z:id) that takes the 001, if it exists, as the Zebra 
unique id.  This means that if more than one bib record has the same 001 
(as happens when you duplicate a bib, for instance), Zebra will only index 
the last one, and it won't complain about it at all.
I'm not sure whether it's a hangover from the XML used by authorities, 
which stores the auth_id in the 001, or from UNIMARC, which might use the 
001 as the bib number.  Either way, I bet that if you remove the 001 or 
make it unique, the record will index OK.
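
If you want to check whether duplicate 001s are what is biting you, here is 
a rough sketch in Python (standard library only). It assumes a text dump in 
the same layout as the marcprint.py output quoted below (=LDR, =001, =999 
lines). The function names parse_records, subfield and duplicate_001s are 
made up for illustration, and the second and third sample records are 
invented so there is a duplicate to find.

```python
from collections import defaultdict

# First record's 001 and 999$c come from the thread; the other two are invented.
SAMPLE = r"""=LDR  00871nam a22002417a 4500
=001  201112071555.ls
=999  \\$c14879$d14879
=LDR  00871nam a22002417a 4500
=001  201112071555.ls
=999  \\$c14880$d14880
=LDR  00500nam a22001457a 4500
=001  201112081200.ls
=999  \\$c14881$d14881
"""


def parse_records(text):
    """Split a text dump into records; each record is a list of (tag, value)."""
    records, current = [], []
    for line in text.splitlines():
        line = line.rstrip()
        if line.startswith("=LDR") and current:
            # A new leader starts a new record.
            records.append(current)
            current = []
        if line.startswith("=") and len(line) >= 4:
            current.append((line[1:4], line[4:].strip()))
        # Lines not starting with "=" (e.g. wrapped 952s) are ignored.
    if current:
        records.append(current)
    return records


def subfield(value, code):
    """Return the first subfield with the given code from a '$'-delimited field."""
    for part in value.split("$")[1:]:
        if part[:1] == code:
            return part[1:]
    return None


def duplicate_001s(records):
    """Map each 001 shared by more than one record to the 999$c values using it."""
    by_control_number = defaultdict(list)
    for fields in records:
        control = next((v for t, v in fields if t == "001"), None)
        biblionumber = next(
            (subfield(v, "c") for t, v in fields if t == "999"), None
        )
        if control is not None:
            by_control_number[control].append(biblionumber)
    return {k: v for k, v in by_control_number.items() if len(v) > 1}


if __name__ == "__main__":
    for control, biblionumbers in duplicate_001s(parse_records(SAMPLE)).items():
        print(control, "is shared by biblionumbers", ", ".join(biblionumbers))
```

Any 001 it reports is one where, if the above is right, only the last 
record indexed would be searchable.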
The better solution is probably to fix the XSL so that it does not use 
z:id for biblios, or maybe to have it use the 999$c, but the Zebra config 
scares me.
It took ages to find the cause, so I hope this helps someone.
Ian
On 01/09/2012 18:11, Doug Kingston wrote:
> On 1 September 2012 09:46, Jared Camins-Esakov
> <jcamins at cpbibliography.com>wrote:
>
>> Doug,
>>
>> So environment variables are not the issue.  We are carefully managing
>>> those.
>>>
>> Make sure when you are using cron jobs that you set the environment
>> variables IN YOUR CRONTAB. Setting environment variables elsewhere is a
>> recipe for confusion and misery down the road. However, this is -- as you
>> say -- not the problem.
>>
>>
>>> I have tried using the new tool checkNonIndexedBiblios.pl (from patch
>>> 6566)
>>> and it indeed finds a few recent biblios that are not indexed.  Using the
>>> -z option to mark them for indexing followed by a manual run of
>>> rebuild_zebra -b -v -z did not get the biblios indexed.  I cranked up the
>>> debugging on zebraidx (by modifying rebuild_zebra.pl and using -v -v) and
>>> did not see any obvious errors in the output that would suggest why
>>> indexing was failing.
>>>
>> Did you change your bibliographic frameworks? It could be a matter of the
>> biblionumber not being stored properly. The other thing to do is to confirm
>> that the non-indexed biblios are *actually* getting added to the zebraqueue
>> by the 6566 script. It's kind of a long shot, but it could be an issue with
>> the zebraqueue table getting corrupted. I've seen this happen when the
>> zebraqueue table got too large, and disk space was low.
>>
> So I think this is working as expected.  Disk space is ample on the system
> in question, and the catalogue is small by most standards (about 2500
> biblios).  I ran rebuild_zebra.pl with the -k flag so it left the exported
> records and here's the tree I got.
>
> library:/tmp# ls -altR p6tjtKrrK3/
> p6tjtKrrK3/:
> total 0
> drwxrwxrwt 6 root root 1040 Sep  1 17:50 ..
> drwx------ 5 koha koha  100 Sep  1 06:36 .
> drwxr-xr-x 2 koha koha   60 Sep  1 06:36 upd_biblio
> drwxr-xr-x 2 koha koha   60 Sep  1 06:36 del_biblio
> drwxr-xr-x 2 koha koha   40 Sep  1 06:36 biblio
>
> p6tjtKrrK3/upd_biblio:
> total 16
> -rw-r--r-- 1 koha koha 12670 Sep  1 06:36 exported_records
> drwxr-xr-x 2 koha koha    60 Sep  1 06:36 .
> drwx------ 5 koha koha   100 Sep  1 06:36 ..
>
> p6tjtKrrK3/del_biblio:
> total 0
> drwx------ 5 koha koha 100 Sep  1 06:36 ..
> drwxr-xr-x 2 koha koha  60 Sep  1 06:36 .
> -rw-r--r-- 1 koha koha   0 Sep  1 06:36 exported_records
>
> p6tjtKrrK3/biblio:
> total 0
> drwx------ 5 koha koha 100 Sep  1 06:36 ..
> drwxr-xr-x 2 koha koha  40 Sep  1 06:36 .
>
> Using marcprint.py, a small Python program built around the pymarc package, I
> decoded this file and found 13 MARC records, as expected.
> Example:
> =LDR  00871nam a22002417a 4500
> =001  201112071555.ls
> =003  UkLoVW
> =005  20111209110116.0
> =008  111207t1982\\\\enkg\\\\r\\\\\001\0\eng\d
> =040  \\$aUkLoVW$cUkLoVW
> =099  \\$aQS 40
> =100  1\$aSheffield, Ken$92330
> =245  \0$aTen country dances :$bmainly from Thompson, Wright & Wilson.
> =260  \\$aOxford :$b[The Author],$c1982.
> =300  \\$a12 p. :$bmusic ;$c30 cm.
> =490  1\$aFrom two barns ;$vv. 1
> =650  \\$9117$aCountry dances
> =650  \\$9127$aDance music
> =830  \5$aFrom two barns$92331
> =942  \\$2VWML$cBK$hQS 40$n0$6QS_00040
> =999  \\$c14879$d14879
=952  \\$w2011-12-07$p10914$r2011-12-07$40$00$6QS_00040$915083$bVWML$10$oQS 40$d2011-12-07$70$cBOX$2VWML$yBK$aVWML
=952  \\$w2011-12-07$p11121$r2011-12-07$40$00$6QS_00040$915084$bVWML$10$oQS 40$d2011-12-07$71$cBOX$2VWML$yBK$aVWML
>
> I have attached an ascii printout of all 13 records in case someone wants
> to look for a pattern in these records.
>
> The problem is either in the format/contents of those records, or in
> zebraidx/zebrasrv or their config files.  My suspicion is with the latter,
> since we have already had to fix one problem there for bug 6566.
>
> -Doug-
>
>> Regards,
>> Jared
>>
>> --
>> Jared Camins-Esakov
>> Bibliographer, C & P Bibliography Services, LLC
>> (phone) +1 (917) 727-3445
>> (e-mail) jcamins at cpbibliography.com
>> (web) http://www.cpbibliography.com/
>>
>>
>>
>>
>> _______________________________________________
>> Koha mailing list  http://koha-community.org
>> Koha at lists.katipo.co.nz
>> http://lists.katipo.co.nz/mailman/listinfo/koha

-- 
Ian Bays
Director of Projects, PTFS Europe Limited
Content Management and Library Solutions
+44 (0) 800 756 6803 (phone)
+44 (0) 7774 995297 (mobile)
+44 (0) 800 756 6384 (fax)
skype: ian.bays
email: ian.bays at ptfs-europe.com


