Joshua Ferraro wrote:
On our 2 GHz Intel dual-core Debian Linux install, it imports about 250 MARC records per minute, FWIW.
You must have meant per second, right?
On Mon, Apr 14, 2008 at 8:46 PM, Rick Welykochy <rick@praxis.com.au> wrote:
Nope. That figure was back-of-the-envelope, from memory. Here is the correct figure. The server was doing nothing else at the time, on a Sunday arvo.
It took 101 minutes for 32605 records = 322 records per minute.
Hmmm, that seems unusually slow to me, by an order of magnitude or so. Can you run the following commands to try to figure out what the bottleneck is:
$ perl -d:DProf -I /path/to/koha/modules bulkmarcimport.pl -file /path/to/file.mrc
$ dprofpp -v tmon.out > dprof.txt
Then share the output of dprof.txt with us?
Too late for that server. It is now in production. I might try the same thing on our test box when time permits, with perhaps 1000 records, and get a profile that way.

It does seem to be taking a very long time. But consider that the import process is parsing all the records and also deconstructing them and shoveling them word by word into MySQL. While I was monitoring the processes, MySQL seemed to be the chief task running. I would imagine that the words are stored to marc_word one row at a time. This can slow things down; aggregating the writes to the database would be more efficient. Another indicator of a single task dominating is that the second CPU on the box was basically idle.

We have a script that pre-processes the MARC data, and it also uses the MARC::* classes. The preprocessing involves reading in biblio records (sans items), grabbing the items from a MySQL staging table, adding them to the MARC records, and then outputting a new set of MARC records. That task processed all 32605 records in 24 seconds (!)

Conclusion: there is something seriously inefficient in the bulk MARC importer script.

cheers
rickw

--
________________________________________________________________
Rick Welykochy || Praxis Services || Internet Driving Instructor

We like to think of ourselves as the Microsoft of the energy world.
     -- Kenneth Lay, former CEO of Enron
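
The suggestion above, that marc_word may be written one row at a time and that aggregating the writes would help, can be illustrated with a small sketch. This is not the Koha importer's code: it is written in Python against an in-memory SQLite database standing in for MySQL, and the two-column layout of `marc_word` (`bibid`, `word`) and all function names are assumptions made purely for illustration. It only contrasts the two write patterns (one statement and one commit per row versus one batched statement and one commit per chunk); it makes no claim about how large the difference would be on a real server.

```python
import sqlite3


def new_db():
    """Create an in-memory database with a hypothetical marc_word table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE marc_word (bibid INTEGER, word TEXT)")
    return conn


def make_rows(n_records, words_per_record):
    """Build (bibid, word) pairs as a stand-in for words extracted from records."""
    return [
        (bibid, f"word{w}")
        for bibid in range(n_records)
        for w in range(words_per_record)
    ]


def insert_row_at_a_time(conn, rows):
    """One INSERT and one commit per word: the pattern suspected of being slow."""
    for row in rows:
        conn.execute("INSERT INTO marc_word (bibid, word) VALUES (?, ?)", row)
        conn.commit()


def insert_batched(conn, rows, batch_size=1000):
    """Aggregate the writes: one executemany call and one commit per chunk."""
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        conn.executemany(
            "INSERT INTO marc_word (bibid, word) VALUES (?, ?)", chunk
        )
        conn.commit()


if __name__ == "__main__":
    rows = make_rows(n_records=1000, words_per_record=20)
    slow, fast = new_db(), new_db()
    insert_row_at_a_time(slow, rows)
    insert_batched(fast, rows)
    # Both approaches store the same data; only the number of round trips
    # and commits differs.
    print(slow.execute("SELECT COUNT(*) FROM marc_word").fetchone()[0])
    print(fast.execute("SELECT COUNT(*) FROM marc_word").fetchone()[0])
```

Against a networked MySQL server, each per-row statement and commit typically costs a round trip and a log flush, which is why batching is the usual first thing to try. Whether that is the actual bottleneck here would still need the profile to confirm.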