[Koha] Elasticsearch reindex failing for authorities

Ere Maijala ere.maijala at helsinki.fi
Fri Apr 30 19:02:41 NZST 2021


Hi,

I think we're just doing it wrong. The problem is that with authorities 
there's a single SQL query that fetches a record set that includes the 
marcxml records. Do that for 600 000 records and it uses quite a bit of 
memory. For biblios the record set only contains biblionumbers, and the 
actual metadata is fetched one-by-one. That's better but still not great.

What we should be doing is fetch n records (e.g. with n=1000) at a time, 
ordered by id. Then each round fetch the next set starting from id > 
[last fetched id]. This is basically what I did in bug 27584 to improve 
the OAI-PMH provider performance.
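
To make the idea concrete, here is a minimal sketch of that batching pattern in Python against an in-memory SQLite database. It is not the Koha code and not the patch for bug 28268. The function name fetch_in_batches is made up for illustration, the table and column names (auth_header, authid, marcxml) are only assumed to resemble the real schema, and it assumes ids are positive integers.

```python
import sqlite3


def fetch_in_batches(conn, batch_size=1000):
    """Yield lists of (authid, marcxml) rows, at most batch_size rows each.

    Hypothetical helper: each round asks only for rows with
    authid > last fetched id, so memory use is bounded by batch_size.
    """
    last_id = 0  # assumption: ids are positive integers
    while True:
        rows = conn.execute(
            "SELECT authid, marcxml FROM auth_header "
            "WHERE authid > ? ORDER BY authid LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        yield rows
        # Remember where this batch ended; the next query starts after it.
        last_id = rows[-1][0]


# Small demonstration with 2500 fake records.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE auth_header (authid INTEGER PRIMARY KEY, marcxml TEXT)"
)
conn.executemany(
    "INSERT INTO auth_header VALUES (?, ?)",
    [(i, f"<record>{i}</record>") for i in range(1, 2501)],
)
batch_sizes = [len(batch) for batch in fetch_in_batches(conn, 1000)]
print(batch_sizes)
```

Because each query filters on the id instead of using an offset, the database can use the primary key index to find the start of every batch, and gaps in the id sequence do no harm.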

I've created bug 28268 about this. I'll post a patch that you could try 
soon.

Best,
Ere

Aleisha Amohia wrote on 30.4.2021 at 0.13:
> Hi Alvaro
> 
> Thank you for responding! I did try lowering the batch commit but it
> doesn't appear to even get to the part where it starts processing
> records. I think it gets stuck on fetching the records.
> 
> Aleisha
> 
> On 30/04/21 3:39 am, Alvaro Cornejo wrote:
>> Hi Aleisha
>>
>> Have you tried to lower the batch commit?
>>
>> see https://perldoc.koha-community.org/misc/search_tools/rebuild_elasticsearch.html
>>
>> Regards,
>>
>> Alvaro
>>
>>
>> On Wed, 28 Apr 2021 at 23:52, Aleisha Amohia <aleisha at catalyst.net.nz>
>> wrote:
>>
>>      Hi all,
>>
>>      We have a site with around 600,000 authority records and have enabled
>>      elasticsearch. Reindexing and searching have worked easily for biblios
>>      (there are just over 10,000 biblios) but fails for authorities.
>>
>>      $ sudo koha-elasticsearch --rebuild -d -a -v <instance>
>>      [19024] Checking state of authorities index
>>      [19024] Dropping and recreating authorities index
>>      [19024] Indexing authorities
>>
>>      And then it just hangs there for ages until the server runs out of
>>      memory and dies.
>>
>>      We're running ES 6.x on Koha 20.05.x.
>>
>>      I tried running the script directly and specifying an authid and that
>>      worked as expected.
>>
>>      $ perl /usr/share/koha/bin/search_tools/rebuild_elasticsearch.pl -a -d -v -ai 2135575
>>      [17684] Checking state of authorities index
>>      [17684] Dropping and recreating authorities index
>>      [17684] Indexing authorities
>>      [17684] Committing final records...
>>      [17684] Total 1 records indexed
>>
>>      It feels like this is happening because we have too many authority
>>      records. Is there a way to fix this? Has anyone come across this
>>      before?
>>
>>      Thanks!
>>
>>      --
>>      *Aleisha Amohia*(she/her)
>>      _______________________________________________
>>
>>      Koha mailing list  http://koha-community.org
>>      Koha at lists.katipo.co.nz
>>      Unsubscribe: https://lists.katipo.co.nz/mailman/listinfo/koha
>>

-- 
Ere Maijala
Kansalliskirjasto / The National Library of Finland

