Incomplete imports
Greetings all,

We are looking into using Koha in our school system to replace the aging Winnebago Spectrum. I've installed 3.0.1 on a test server to play with. I've become hung up on the imports, however. Taking an export of ~15000 records from Spectrum, I swapped the 852 fields to 952 in MarcEdit according to the wiki instructions, and staged the import without problem. However, when I started importing the staged records, it made it through 160 records and just stopped, with no error messages or any kind of feedback from the import page. Now I have a partially complete import that I can't resume or roll back. The same thing happened when I tried breaking the import into batches of 1000: the script just stopped processing after an arbitrary number of records.

Can anyone offer some insight into what's causing these frozen, incomplete imports? Since there are no errors, I'm at a loss. Also, how can I cleanly remove these frozen batches from the staging area?

Many thanks,
Justin.

P.S. I apologize in advance if this message was accidentally dispatched twice.
One issue that can result in unsuccessful imports is records coming in with MARC-8 rather than UTF-8 encoding. MarcBreaker will let you translate the records from the former to the latter. We learned the hard way that the records coming from our state's union catalog are in the wrong encoding.

Hope this helps,

Cab Vinton, Director
Sanbornton Public Library
Sanbornton, NH

On Sat, Jun 6, 2009 at 11:24 PM, Justin Aquadro <jaquadro@gmail.com> wrote:
Can anyone offer some insight into what's causing these frozen, incomplete imports?
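For anyone wanting to check a file for the MARC-8 versus UTF-8 mismatch described above before staging it, here is a minimal Python sketch. It is not part of Koha or MarcEdit. It assumes a binary MARC21 (ISO 2709) file in which each record ends with byte 0x1D and leader position 09 is `a` for UTF-8 and blank for MARC-8; the function names are made up for illustration.

```python
RECORD_TERMINATOR = b"\x1d"  # ISO 2709 end-of-record byte


def split_records(data: bytes) -> list[bytes]:
    """Split a binary MARC file into raw records, restoring each terminator."""
    return [
        chunk + RECORD_TERMINATOR
        for chunk in data.split(RECORD_TERMINATOR)
        if chunk.strip()  # skip an empty tail or trailing newline
    ]


def encoding_problems(data: bytes) -> list[tuple[int, str]]:
    """Return (record number, description) for records with suspect encoding."""
    problems = []
    for number, record in enumerate(split_records(data), start=1):
        claims_utf8 = record[9:10] == b"a"  # MARC21 leader position 09
        try:
            record.decode("utf-8")
            valid_utf8 = True
        except UnicodeDecodeError:
            valid_utf8 = False
        if claims_utf8 and not valid_utf8:
            problems.append((number, "leader says UTF-8, bytes are not valid UTF-8"))
        elif not claims_utf8 and not record.isascii():
            problems.append((number, "leader says MARC-8, record has non-ASCII bytes"))
    return problems
```

To use it, read the export in binary mode (for example `open("export.mrc", "rb").read()`, where the filename is a placeholder) and pass the bytes to `encoding_problems`. A record that passes this check can still cause trouble, since the sketch only tests whether the bytes are well-formed UTF-8, not whether the characters are the intended ones.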
2009/6/7 Justin Aquadro <jaquadro@gmail.com>:
Can anyone offer some insight into what's causing these frozen, incomplete imports?

Hi Justin,

This sounds a lot like bug 2926:
http://bugs.koha.org/cgi-bin/bugzilla3/show_bug.cgi?id=2926

As it happens, the awesome Galen Charlton (our fearless release manager) has just fixed it:
http://git.koha.org/cgi-bin/gitweb.cgi?p=Koha;a=commit;h=da51de184c1179fd725...

Chris
Thanks for the suggestion, Chris (and sorry for the rogue reply earlier), but it does not appear to have helped. The problem was not with staging, but with importing the already staged records. It does sound like the same kind of problem, though.

It hung on the same record, #160 of my batch. I've isolated the exact record causing the problem, which hung the import even in a batch all by itself:

=LDR 00691nam a2200229 a 4500
=001 \\\59009955\//r84
=003 DLC
=005 20050719144244.0
=008 050719s1959\\\\nyu\\\\\\b\\\\000\0\eng\\
=010 \\$a 59009955 //r84
=040 \\$aDLC$cDLC
=050 00$aB4372.E5$bB7 1959
=082 0\$a198.9
=100 1\$aKierkegaard, Sr̜en,$d1813-1855.
=240 10$aSelections.$lEnglish
=245 12$aA Kierkegaard anthology.$cEdited by Robert Bretall.
=260 \\$aNew York,$bModern Library$c[1959]
=300 \\$axxv, 494 p.$c19 cm.
=490 0\$aThe Modern library of the world's best books$v[no. 303]
=504 \\$aBibliography: p. [483]-488.
=952 \\$o198 KIE $a $c $p10197$bNHS
=961 wl$t7

I didn't even notice until sending this reply, but there is a tiny Unicode character buried in the author field (100). I've checked the MARC record in a hex editor, and both the Unicode encoding and the leader look correct, but the record imported only after I removed that character. Not sure what's up, but it's a start.

Thanks,
Justin.
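As a cross-check on the hex editor, a few lines of Python can report exactly which non-ASCII characters a field contains and whether the raw bytes are well-formed UTF-8. This is only a sketch: the function names are invented, and the sample string assumes the stray character in the 100 field as displayed above is U+031C, which may not match the bytes in the original file.

```python
import unicodedata


def first_invalid_utf8(raw: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


def non_ascii_report(text: str):
    """List (index, code point, Unicode name) for every non-ASCII character."""
    return [
        (index, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED"))
        for index, ch in enumerate(text)
        if ord(ch) > 127
    ]


# Assumed reconstruction of the author subfield as it appears in the message.
author = "Kierkegaard, Sr\u031cen,"
for entry in non_ascii_report(author):
    print(entry)
```

If `first_invalid_utf8` returns `None` for the record but the report shows a combining mark where a letter such as "ø" was expected, the bytes are valid UTF-8 yet the character itself was probably damaged by an earlier conversion step.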
Hi,

On Mon, Jun 8, 2009 at 11:29 AM, Agnes Rivers-Moore <arm@hanover.ca> wrote:
It would be very helpful if you could tell us a little about what the patch for this bug does. Does it correct the MARC encoding, translating MARC-8 encoding to UTF-8? Very cool if it does. Or maybe it reports an error and passes on to the next record?
Well, the patch description in the link that Chris sent says it all:

"Fixes a hang of the staging import tool when it attempts to process a MARC21 record that claims that it's UTF-8 when it is not. The staging import will now attempt to fix the character encoding of such records."

Specifically, for those records it will assume that the actual character encoding is MARC-8 and try to convert it to UTF-8.

Regards,

Galen
--
Galen Charlton
VP, Research & Development, LibLime
galen.charlton@liblime.com
p: 1-888-564-2457 x709
skype: gmcharlt
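The behaviour described in the patch note can be sketched roughly as follows. This is not the Koha code (Koha is written in Perl), just an illustration of the fallback idea in Python. The `marc8_to_utf8` argument is a hypothetical converter supplied by the caller, for example one taken from a MARC library; no real conversion table is included here.

```python
from typing import Callable


def ensure_utf8(record: bytes, marc8_to_utf8: Callable[[bytes], bytes]) -> tuple[bytes, str]:
    """Return (record bytes, note) following the fallback described above.

    A record whose leader position 09 is 'a' claims to be UTF-8. If its bytes
    do not decode as UTF-8, assume it is really MARC-8 and convert it.
    """
    claims_utf8 = record[9:10] == b"a"
    if not claims_utf8:
        # Labelled MARC-8: convert as usual.
        return marc8_to_utf8(record), "converted from MARC-8"
    try:
        record.decode("utf-8")
    except UnicodeDecodeError:
        # Mislabelled record: treat it as MARC-8 instead of hanging on it.
        return marc8_to_utf8(record), "claimed UTF-8, converted as MARC-8"
    return record, "already UTF-8"
```

A complete implementation would also need to set leader position 09 to `a` and recompute the record length and directory after conversion, which this sketch leaves out.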
participants (5)
- Agnes Rivers-Moore
- Cab Vinton
- Chris Cormack
- Galen Charlton
- Justin Aquadro