[Koha] Elasticsearch reindex failing for authorities
Ere Maijala
ere.maijala at helsinki.fi
Mon May 3 19:21:40 NZST 2021
Awesome! :) I'd appreciate it if you'd be able to sign off the patch in
Bugzilla.
--Ere
Aleisha Amohia wrote on 3.5.2021 at 0.10:
> Thank you for turning this around so quickly Ere, I backported the fix
> to our 20.05.x site and it's working well!
>
> Aleisha
>
> On 30/04/21 7:02 pm, Ere Maijala wrote:
>> Hi,
>>
>> I think we're just doing it wrong. The problem is that with
>> authorities there's a single SQL query that fetches a record set that
>> includes the marcxml records. Do that for 600 000 records and it uses
>> quite a bit of memory. For biblios the record set only contains
>> biblionumbers, and the actual metadata is fetched one-by-one. That's
>> better but still not great.
>>
>> What we should be doing is fetch n records (e.g. with n=1000) at a
>> time, ordered by id. Then each round fetch the next set starting from
>> id > [last fetched id]. This is basically what I did in bug 27584 to
>> improve the OAI-PMH provider performance.
>>
>> I've created bug 28268 about this. I'll post a patch that you could
>> try soon.
>>
>> Best,
>> Ere
>>
>> Aleisha Amohia wrote on 30.4.2021 at 0.13:
>>> Hi Alvaro
>>>
>>> Thank you for responding! I did try lowering the batch commit but it
>>> doesn't appear to even get to the part where it starts processing
>>> records. I think it gets stuck on fetching the records.
>>>
>>> Aleisha
>>>
>>> On 30/04/21 3:39 am, Alvaro Cornejo wrote:
>>>> Hi Aleisha
>>>>
>>>> Have you tried to lower the batch commit?
>>>>
>>>> see https://perldoc.koha-community.org/misc/search_tools/rebuild_elasticsearch.html
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Alvaro
>>>>
>>>>
>>>> On Wed 28 Apr 2021 at 23:52, Aleisha Amohia <aleisha at catalyst.net.nz> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> We have a site with around 600,000 authority records and have enabled
>>>> Elasticsearch. Reindexing and searching have worked easily for biblios
>>>> (there are just over 10,000 biblios), but reindexing fails for authorities.
>>>>
>>>> $ sudo koha-elasticsearch --rebuild -d -a -v <instance>
>>>> [19024] Checking state of authorities index
>>>> [19024] Dropping and recreating authorities index
>>>> [19024] Indexing authorities
>>>>
>>>> And then it just hangs there for ages until the server runs out of
>>>> memory and dies.
>>>>
>>>> We're running ES 6.x on Koha 20.05.x.
>>>>
>>>> I tried running the script directly and specifying an authid, and that
>>>> worked as expected.
>>>>
>>>> $ perl /usr/share/koha/bin/search_tools/rebuild_elasticsearch.pl -a -d -v -ai 2135575
>>>> [17684] Checking state of authorities index
>>>> [17684] Dropping and recreating authorities index
>>>> [17684] Indexing authorities
>>>> [17684] Committing final records...
>>>> [17684] Total 1 records indexed
>>>>
>>>> It feels like this is happening because we have too many authority
>>>> records. Is there a way to fix this? Has anyone come across this
>>>> before?
>>>>
>>>> Thanks!
>>>>
>>>> --
>>>> *Aleisha Amohia* (she/her)
>>>> _______________________________________________
>>>>
>>>> Koha mailing list http://koha-community.org
>>>> Koha at lists.katipo.co.nz
>>>> Unsubscribe: https://lists.katipo.co.nz/mailman/listinfo/koha
>>>>
>>
--
Ere Maijala
Kansalliskirjasto / The National Library of Finland
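
The batching idea from the 30/04/21 reply quoted above (fetch n records at a time ordered by id, then start the next round from id > last fetched id) can be sketched roughly as follows. This is only an illustration in Python against an in-memory SQLite table, not the patch from bug 28268 and not Koha's actual code. The helper names fetch_in_batches and build_demo_db are hypothetical, the table and column names (auth_header, authid, marcxml) are assumed for the demo, and ids are assumed to be positive integers.

```python
import sqlite3
from typing import Iterator, List, Tuple


def fetch_in_batches(
    conn: sqlite3.Connection, batch_size: int = 1000
) -> Iterator[List[Tuple[int, str]]]:
    """Yield lists of (authid, marcxml) rows, at most batch_size rows each.

    Only one batch is held in memory at a time. Each round continues from
    the highest id seen so far instead of using OFFSET.
    """
    last_id = 0  # assumption: all ids are positive integers
    while True:
        rows = conn.execute(
            "SELECT authid, marcxml FROM auth_header "
            "WHERE authid > ? ORDER BY authid LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        yield rows
        last_id = rows[-1][0]  # id of the last row in this batch


def build_demo_db(n: int) -> sqlite3.Connection:
    """Create a throwaway in-memory table with n fake records (demo only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE auth_header (authid INTEGER PRIMARY KEY, marcxml TEXT)"
    )
    conn.executemany(
        "INSERT INTO auth_header (authid, marcxml) VALUES (?, ?)",
        ((i, f"<record>{i}</record>") for i in range(1, n + 1)),
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db(2500)
    total = 0
    for batch in fetch_in_batches(demo, batch_size=1000):
        total += len(batch)  # a real indexer would process the batch here
    print(f"Processed {total} records")
```

Because each query is bounded by LIMIT and keyed on the id, memory use depends on the batch size rather than on the total number of records, which is the point of the suggestion.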