Koha performance and 6 million records
Hi

We are currently running Koha 18.11 on Debian GNU/Linux 9 (a virtual machine with 2 processors and 4 GB RAM) with 50'000 bibliographic records in a MariaDB database, using Zebra, and of course Plack and Memcached. There are only about 160 users. We would like to add 6 million bibliographic records (metadata of articles) that were previously indexed only in the Solr index of a proprietary discovery system, but not in Koha itself.

Since we lack experience with such large amounts of data, we would like to investigate the consequences:

* Will the overall performance, and especially the retrieval experience, suffer a lot? (Note there are only 160 users.) We're afraid it will...

* If yes, is there a way to improve the retrieval experience? For example, changing from a virtual machine to a dedicated physical host? Adding more or faster processors, or more RAM? Using SSD disks instead of SAS? Switching from Zebra to Elasticsearch? Or will we need to implement another discovery system, such as VuFind (which uses Solr)?

Any tips or hints are very much appreciated!

Best wishes: Michael

--
Geschäftsführer · Diplombibliothekar BBS, Informatiker eidg. Fachausweis
Admin Kuhn GmbH · Pappelstrasse 20 · 4123 Allschwil · Schweiz
T 0041 (0)61 261 55 61 · E mik@adminkuhn.ch · W www.adminkuhn.ch
Hi,

In my experience such a fairly large setup is not worth attempting with Zebra, but it should be no problem with Elasticsearch. However, whenever there are questions about performance, most answers would be guesswork apart from "try it and see how it goes". For what it's worth, here are my thoughts:

- Use Elasticsearch for a large index. It's faster and uses a lot less disk space. It can also be moved to a separate host if necessary.

- A faster disk always helps, and even if performance would be adequate otherwise, a fast disk makes many tasks more comfortable to execute.

- With Elasticsearch (and MariaDB to some extent), you need to make sure there's enough memory allocated for them. With 6 million records you may need to increase the heap reserved for ES, but more important is to have enough free memory on the server so that most or all of the index can be cached in memory.

- Having a separate host for e.g. Elasticsearch, but maybe also MariaDB, makes it easier to manage their resources and to make sure that one of them doesn't hog resources from the others.

- A separate discovery service allows you to split the search load between staff use and patron use, but you may end up needing double the resources to run the complete system. Obviously a separate system allows other flexibility too, but it comes with a more complex setup.

- Note that while Elasticsearch now works pretty well, there are still some open issues, and every new Koha release brings improvements. I'd run tests with the latest master version if possible. You may also be interested in enhancements still in the pipeline, such as parallelizing the indexing process: https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=21872

As you can see from the above points, I'd consider the search index the critical part for good performance. Obviously it's also necessary to make sure MariaDB, Plack etc. get enough resources, but I'd still say that the search index is the one that makes or breaks it.
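To make the heap advice above concrete: Elasticsearch's heap is set in its jvm.options file. A minimal sketch follows; the 4g figure is purely illustrative, not a sizing recommendation for this dataset, and the file path may differ depending on how ES was installed.

```
# /etc/elasticsearch/jvm.options -- illustrative values only
# Set minimum and maximum heap to the same value to avoid resize pauses.
-Xms4g
-Xmx4g
```

The usual guidance is to give the ES heap no more than about half of the machine's RAM, leaving the remainder to the operating system's page cache so the Lucene index files can be kept in memory, which is exactly the "index cached in memory" effect described above.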
Regards,
Ere

Michael Kuhn wrote on 3.3.2019 at 13:15:
-- Ere Maijala Kansalliskirjasto / The National Library of Finland
Hi Ere

Many thanks for your thoughts and technical tips!

For the moment the library has decided not to load the 6 million records into Koha (with Zebra) but to index them separately, either in VuFind or in ALBERT (a discovery system developed by the KOBV, a German union catalog), both of which, like Elasticsearch, are built on Lucene (via Solr). The main reason for the decision is the current development status of Koha's Elasticsearch support. However, this will probably be reconsidered one day in the future, when Elasticsearch is fully integrated into Koha, all or most bugs are fixed, and Zebra is definitely gone.

Best wishes: Michael

--
Geschäftsführer · Diplombibliothekar BBS, Informatiker eidg. Fachausweis
Admin Kuhn GmbH · Pappelstrasse 20 · 4123 Allschwil · Schweiz
T 0041 (0)61 261 55 61 · E mik@adminkuhn.ch · W www.adminkuhn.ch
participants (2)
- Ere Maijala
- Michael Kuhn