[Koha] Elasticsearch reindex failing for authorities
Ere Maijala
ere.maijala at helsinki.fi
Fri Apr 30 19:02:41 NZST 2021
Hi,
I think we're just doing it wrong. The problem is that with authorities
there's a single SQL query that fetches a record set that includes the
marcxml records. Do that for 600 000 records and it uses quite a bit of
memory. For biblios the record set only contains biblionumbers, and the
actual metadata is fetched one-by-one. That's better but still not great.
What we should be doing is fetch n records (e.g. with n=1000) at a time,
ordered by id. Then each round fetch the next set starting from id >
[last fetched id]. This is basically what I did in bug 27584 to improve
the OAI-PMH provider performance.
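To illustrate the batching idea, here is a rough sketch in Python against an in-memory SQLite database. It is not the patch for bug 28268 and not Koha code: the iter_batches helper is hypothetical, the table and column names (auth_header, authid, marcxml) are assumed for the demo, and it assumes ids are positive integers.

```python
import sqlite3


def iter_batches(conn, batch_size=1000):
    """Yield lists of (authid, marcxml) rows, at most batch_size per list.

    Hypothetical helper: each round fetches the next set of rows whose id
    is greater than the last id seen, ordered by id, so only one batch is
    held in memory at a time.
    """
    last_id = 0  # assumption: ids are positive integers
    while True:
        rows = conn.execute(
            "SELECT authid, marcxml FROM auth_header "
            "WHERE authid > ? ORDER BY authid LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        yield rows
        last_id = rows[-1][0]  # continue after the last fetched id


# Demo data: 2500 fake records in an assumed auth_header table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE auth_header (authid INTEGER PRIMARY KEY, marcxml TEXT)"
)
conn.executemany(
    "INSERT INTO auth_header VALUES (?, ?)",
    [(i, f"<record>{i}</record>") for i in range(1, 2501)],
)

batch_sizes = []
seen_ids = []
for batch in iter_batches(conn, batch_size=1000):
    batch_sizes.append(len(batch))
    seen_ids.extend(row[0] for row in batch)

print(batch_sizes)
```

Because each query filters on id > last fetched id rather than using an OFFSET, the database can seek straight to the start of the next batch via the primary key, and the client never holds more than n records at once.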
I've created bug 28268 about this. I'll post a patch that you could try
soon.
Best,
Ere
Aleisha Amohia wrote on 30.4.2021 at 0.13:
> Hi Alvaro
>
> Thank you for responding! I did try lowering the batch commit but it
> doesn't appear to even get to the part where it starts processing
> records. I think it gets stuck on fetching the records.
>
> Aleisha
>
> On 30/04/21 3:39 am, Alvaro Cornejo wrote:
>> Hi Aleisha
>>
>> Have you tried to lower the batch commit?
>>
>> see https://perldoc.koha-community.org/misc/search_tools/rebuild_elasticsearch.html
>>
>> Regards,
>>
>> Alvaro
>>
>>
>>
>> On Wed, 28 Apr 2021 at 23:52, Aleisha Amohia <aleisha at catalyst.net.nz>
>> wrote:
>>
>> Hi all,
>>
>> We have a site with around 600,000 authority records and have enabled
>> Elasticsearch. Reindexing and searching have worked easily for biblios
>> (there are just over 10,000 biblios) but fails for authorities.
>>
>> $ sudo koha-elasticsearch --rebuild -d -a -v <instance>
>> [19024] Checking state of authorities index
>> [19024] Dropping and recreating authorities index
>> [19024] Indexing authorities
>>
>> And then it just hangs there for ages until the server runs out of
>> memory and dies.
>>
>> We're running ES 6.x on Koha 20.05.x.
>>
>> I tried running the script directly and specifying an authid and that
>> worked as expected.
>>
>> $ perl /usr/share/koha/bin/search_tools/rebuild_elasticsearch.pl -a -d -v -ai 2135575
>> [17684] Checking state of authorities index
>> [17684] Dropping and recreating authorities index
>> [17684] Indexing authorities
>> [17684] Committing final records...
>> [17684] Total 1 records indexed
>>
>> It feels like this is happening because we have too many authority
>> records. Is there a way to fix this? Has anyone come across this
>> before?
>>
>> Thanks!
>>
>> --
>> *Aleisha Amohia* (she/her)
>>
--
Ere Maijala
Kansalliskirjasto / The National Library of Finland