Awesome! :) I'd appreciate it if you'd be able to sign off the patch in Bugzilla.
--Ere
Aleisha Amohia wrote on 3.5.2021 at 0.10:
Thank you for turning this around so quickly, Ere. I backported the fix to our 20.05.x site and it's working well!
Aleisha
On 30/04/21 7:02 pm, Ere Maijala wrote:
Hi,
I think we're just doing it wrong. The problem is that with authorities there's a single SQL query that fetches a record set that includes the marcxml records. Do that for 600 000 records and it uses quite a bit of memory. For biblios the record set only contains biblionumbers, and the actual metadata is fetched one-by-one. That's better but still not great.
What we should be doing is fetching n records (e.g. with n=1000) at a time, ordered by id. Then each round fetches the next set starting from id > [last fetched id]. This is basically what I did in bug 27584 to improve the OAI-PMH provider performance.
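A minimal sketch of the paging idea described above, written in Python against an in-memory SQLite table so that it runs on its own. It is not the Koha patch: the table name `records`, its columns, and the helper names `fetch_in_batches` and `make_demo_db` are all assumptions made up for illustration.

```python
import sqlite3
from typing import Iterator, List, Tuple


def fetch_in_batches(
    conn: sqlite3.Connection, batch_size: int = 1000
) -> Iterator[List[Tuple[int, str]]]:
    """Yield lists of (id, marcxml) rows, at most batch_size rows at a time.

    Each round asks only for rows with id > last fetched id, ordered by id,
    so the full record set is never held in memory at once.
    """
    last_id = 0  # assumption: ids are positive integers
    while True:
        rows = conn.execute(
            # "records" is a hypothetical stand-in table, not Koha's schema
            "SELECT id, marcxml FROM records WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        last_id = rows[-1][0]  # remember where the next round starts
        yield rows


def make_demo_db(n: int) -> sqlite3.Connection:
    """Build a throwaway in-memory table with n fake records (demo only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, marcxml TEXT)")
    conn.executemany(
        "INSERT INTO records (id, marcxml) VALUES (?, ?)",
        [(i, f"<record>{i}</record>") for i in range(1, n + 1)],
    )
    return conn


if __name__ == "__main__":
    demo = make_demo_db(2500)
    for batch in fetch_in_batches(demo, batch_size=1000):
        print(f"got {len(batch)} records, last id {batch[-1][0]}")
```

Because each query is bounded by the last id seen rather than by an OFFSET, the cost of each round should stay roughly constant however far into the table it is, and gaps in the id sequence do not matter.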
I've created bug 28268 about this. I'll post a patch that you could try soon.
Best, Ere
Aleisha Amohia wrote on 30.4.2021 at 0.13:
Hi Alvaro
Thank you for responding! I did try lowering the batch commit but it doesn't appear to even get to the part where it starts processing records. I think it gets stuck on fetching the records.
Aleisha
On 30/04/21 3:39 am, Alvaro Cornejo wrote:
Hi Aleisha
Have you tried to lower the batch commit?
See https://perldoc.koha-community.org/misc/search_tools/rebuild_elasticsearch.html
Regards,
Alvaro
On Wed, 28 Apr 2021 at 23:52, Aleisha Amohia <aleisha@catalyst.net.nz> wrote:
Hi all,
We have a site with around 600,000 authority records and have enabled Elasticsearch. Reindexing and searching have worked easily for biblios (there are just over 10,000 biblios) but reindexing fails for authorities.
$ sudo koha-elasticsearch --rebuild -d -a -v <instance>
[19024] Checking state of authorities index
[19024] Dropping and recreating authorities index
[19024] Indexing authorities
And then it just hangs there for ages until the server runs out of memory and dies.
We're running ES 6.x on Koha 20.05.x.
I tried running the script directly and specifying an authid and that worked as expected.
$ perl /usr/share/koha/bin/search_tools/rebuild_elasticsearch.pl -a -d -v -ai 2135575
[17684] Checking state of authorities index
[17684] Dropping and recreating authorities index
[17684] Indexing authorities
[17684] Committing final records...
[17684] Total 1 records indexed
It feels like this is happening because we have too many authority records. Is there a way to fix this? Has anyone come across this before?
Thanks!
-- Aleisha Amohia (she/her)
_______________________________________________
Koha mailing list http://koha-community.org
Koha@lists.katipo.co.nz
Unsubscribe: https://lists.katipo.co.nz/mailman/listinfo/koha
-- Ere Maijala Kansalliskirjasto / The National Library of Finland