[Koha] Elasticsearch reindex failing for authorities

Aleisha Amohia aleisha at catalyst.net.nz
Mon May 3 09:10:37 NZST 2021


Thank you for turning this around so quickly Ere, I backported the fix
to our 20.05.x site and it's working well!

Aleisha

On 30/04/21 7:02 pm, Ere Maijala wrote:
> Hi,
>
> I think we're just doing it wrong. The problem is that with
> authorities there's a single SQL query that fetches a record set that
> includes the marcxml records. Do that for 600 000 records and it uses
> quite a bit of memory. For biblios the record set only contains
> biblionumbers, and the actual metadata is fetched one-by-one. That's
> better but still not great.
>
> What we should be doing is fetch n records (e.g. with n=1000) at a
> time, ordered by id. Then each round fetch the next set starting from
> id > [last fetched id]. This is basically what I did in bug 27584 to
> improve the OAI-PMH provider performance.
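
[Editor's illustration, not part of Ere's message.] The batching scheme described above (often called keyset pagination) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patch from bug 28268: it uses Python and an in-memory SQLite table instead of Koha's Perl and MySQL code, the function name iter_authorities is hypothetical, and the table and column names (auth_header, authid, marcxml) are assumed here for illustration. It also assumes ids are positive integers.

```python
import sqlite3


def iter_authorities(conn, batch_size=1000):
    """Yield (authid, marcxml) rows in id order, one bounded batch per query.

    Hypothetical helper: only batch_size rows are held in memory at a time,
    instead of one result set containing every record.
    """
    last_id = 0  # assumption: all ids are positive integers
    while True:
        rows = conn.execute(
            "SELECT authid, marcxml FROM auth_header "
            "WHERE authid > ? ORDER BY authid LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break  # no rows left beyond last_id
        yield from rows
        last_id = rows[-1][0]  # next round starts after the last fetched id


# Demo data: an in-memory table standing in for the real authority table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE auth_header (authid INTEGER PRIMARY KEY, marcxml TEXT)"
)
conn.executemany(
    "INSERT INTO auth_header VALUES (?, ?)",
    [(i, f"<record>{i}</record>") for i in range(1, 2501)],
)

# Count the records as an indexer would consume them, 1000 per query.
indexed = sum(1 for _ in iter_authorities(conn, batch_size=1000))
print(indexed)
```

Because each query filters on the id and orders by it, the database can use the primary key index to find the start of each batch, so later batches should cost about the same as earlier ones, unlike an OFFSET-based approach.
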
>
> I've created bug 28268 about this. I'll post a patch that you could
> try soon.
>
> Best,
> Ere
>
> Aleisha Amohia wrote on 30.4.2021 at 0.13:
>> Hi Alvaro
>>
>> Thank you for responding! I did try lowering the batch commit but it
>> doesn't appear to even get to the part where it starts processing
>> records. I think it gets stuck on fetching the records.
>>
>> Aleisha
>>
>> On 30/04/21 3:39 am, Alvaro Cornejo wrote:
>>> Hi Aleisha
>>>
>>> Have you tried to lower the batch commit?
>>>
>>> see https://perldoc.koha-community.org/misc/search_tools/rebuild_elasticsearch.html
>>>
>>>
>>> Regards,
>>>
>>> Alvaro
>>>
>>>
>>>
>>>
>>> On Wed, 28 Apr 2021 at 23:52, Aleisha Amohia
>>> <aleisha at catalyst.net.nz> wrote:
>>>
>>>      Hi all,
>>>
>>>      We have a site with around 600,000 authority records and have
>>>      enabled Elasticsearch. Reindexing and searching have worked easily
>>>      for biblios (there are just over 10,000 biblios) but fail for
>>>      authorities.
>>>
>>>      $ sudo koha-elasticsearch --rebuild -d -a -v <instance>
>>>      [19024] Checking state of authorities index
>>>      [19024] Dropping and recreating authorities index
>>>      [19024] Indexing authorities
>>>
>>>      And then it just hangs there for ages until the server runs out of
>>>      memory and dies.
>>>
>>>      We're running ES 6.x on Koha 20.05.x.
>>>
>>>      I tried running the script directly and specifying an authid, and
>>>      that worked as expected.
>>>
>>>      $ perl /usr/share/koha/bin/search_tools/rebuild_elasticsearch.pl -a -d -v -ai 2135575
>>>      [17684] Checking state of authorities index
>>>      [17684] Dropping and recreating authorities index
>>>      [17684] Indexing authorities
>>>      [17684] Committing final records...
>>>      [17684] Total 1 records indexed
>>>
>>>      It feels like this is happening because we have too many authority
>>>      records. Is there a way to fix this? Has anyone come across this
>>>      before?
>>>
>>>      Thanks!
>>>
>>>      --
>>>      *Aleisha Amohia*(she/her)
>>>
>
-- 
*Aleisha Amohia*(she/her)
Koha Developer

Catalyst IT - Expert Open Source Solutions
Mob: +64 21 024 04004 | Tel: +64 4 499 2267 | www.catalyst.net.nz