[Koha] Problems with Zebra indexing

Coehoorn, Joel jcoehoorn at york.edu
Tue Jul 23 17:22:45 NZST 2013


First off: we're on Ubuntu 12.04, using Debian packages, now running koha
3.10. This is my first experience with koha, and from what I can tell I
must have done some things during the original setup as root instead of as
the koha user. It's far enough in the past now that I'm not sure exactly
what.

In spite of this, search has been fully working until this week, with the
caveat that I would have to manually run the rebuild_zebra.pl script. This
had always involved using sudo to rebuild the indexes (one part of why I
think I must have done something as root).

That was fine while the only changes were bulk imports from our old system
(one import per item type), done by me. It won't fly long term, when my
librarians need to be able to add records on their own. So this week I dug
into why index updates were not happening automatically, and I found a
simple permissions error. I gave the koha user write access to the zebra
export folders, and now nothing shows up in search.
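
One way to check whether a permissions change like the one described above
reached every file is to walk the folder tree and list anything not owned by
the expected user. The following is only a minimal sketch: the function name
is hypothetical, and the folder to scan and the account name depend on the
installation.

```python
import os


def find_foreign_owned(root, expected_uid):
    """Return every path under root (root included) not owned by expected_uid."""
    mismatched = []
    # Check the top-level folder itself first.
    if os.lstat(root).st_uid != expected_uid:
        mismatched.append(root)
    # Then check every sub-folder and file below it.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return sorted(mismatched)
```

The uid could be looked up with `pwd.getpwnam(name).pw_uid`, where `name` is
whatever account the indexer runs as on the system in question. An empty
result means ownership is consistent; it does not rule out problems with the
permission bits themselves.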

When I watch the re-index process, I see a few additional folders that
cause permissions issues under /var/lock. I can update permissions on these
as well, but they are removed and re-created with the bad permissions every
time the server reboots. Rare enough, but still a problem.

*How can I fix this so that the lock folders get the correct permissions
when they are recreated after reboot? Is this something I'll need to script?*
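
If scripting does turn out to be necessary, a small script run at boot could
recreate the lock folders with the intended owner and mode. On Ubuntu,
/var/lock is typically kept on a temporary filesystem, which would explain
why its contents vanish at reboot. The sketch below is an assumption-laden
illustration, not a Koha-supplied tool: the function name, the sub-folder
names, and the mode are all hypothetical and would need to match what the
re-index process actually complains about.

```python
import os


def ensure_lock_dirs(base, subdirs, uid, gid, mode=0o755):
    """Create each sub-folder under base and set its owner and mode.

    Only the final folder of each path is chowned/chmodded; any
    intermediate folders created along the way keep default ownership.
    """
    prepared = []
    for sub in subdirs:
        path = os.path.join(base, sub)
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        os.chown(path, uid, gid)          # needs root unless uid/gid are our own
        os.chmod(path, mode)              # explicit chmod, so the umask is irrelevant
        prepared.append(path)
    return prepared
```

Run as root early in the boot sequence (for example from an init script or
an `@reboot` cron entry), this would put the folders back before the indexer
first needs them.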

Also, watching the re-index process, I see four marc records that are too
large for the spec but somehow made it into the system. I found and removed
two of those, but I'm having a hard time finding the other two. I only know
they're somewhere near the end of our biblios (they should have larger
biblio numbers), because they come up past 92000 out of 99769 during the
import, but I can't see how to pin them down exactly. This brings up two
questions:

*How can I find the over-sized marc records?*
and
*How can I get koha to handle these large items?* I know it's possible,
because until this week I was able to search on those records.
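
For the first question, one possible approach is to scan an exported
ISO 2709 file and measure each record directly. The format stores the record
length in a five-digit leader field, hence the 99999 byte ceiling, and each
record ends with the byte 0x1D. The sketch below assumes the exported
records are available as one binary file; the function name is hypothetical,
and it reports positions in the file rather than biblio numbers.

```python
MAX_ISO2709_BYTES = 99999    # largest length a five-digit leader field can hold
RECORD_TERMINATOR = b"\x1d"  # ISO 2709 end-of-record byte


def find_oversized_records(data, limit=MAX_ISO2709_BYTES):
    """Return (position, length) for every record longer than limit.

    position is the 1-based ordinal of the record in the data;
    length includes the terminator byte.
    """
    oversized = []
    chunks = data.split(RECORD_TERMINATOR)
    for position, chunk in enumerate(chunks, start=1):
        if not chunk.strip():
            # Skip the empty remainder after the final terminator.
            continue
        length = len(chunk) + 1  # add the terminator back
        if length > limit:
            oversized.append((position, length))
    return oversized
```

Usage would look like `find_oversized_records(open(path, "rb").read())`,
with `path` pointing at the export. Splitting on the terminator rather than
trusting the leader is deliberate, since an over-sized record cannot state
its own length correctly in five digits.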

With just the two bad records remaining, I can watch zebra index those
first 92000 or so records. After the exporting biblios phase, during the
reindex phase, with extra verbosity set, I do see an error about bad marc
records. I don't have the exact message handy now, but I remember it was
explicit about skipping everything after the error event. Everything else
looks to have committed. In spite of this, all searches still come up empty.

I tried re-indexing with the -x switch, in an attempt to use the xml format
to get around the marc 99999 byte limit. This completes with no errors.
However, I still don't see any search results. I know the zebra server is
running.


  Joel Coehoorn
Director of Information Technology
York College, Nebraska
402.363.5603
jcoehoorn at york.edu



 *The mission of York College is to transform lives through
Christ-centered education and to equip students for lifelong service to
God, family, and society*

