[Koha] OPAC searches fail with ICU indexing enabled

Andreas Roussos arouss1980 at gmail.com
Wed May 18 19:52:01 NZST 2016


Hi David,

On Wed, May 18, 2016 at 4:46 AM, David Cook <dcook at prosentient.com.au>
wrote:

> Hi Andreas,
>
>
>
> Thanks for your response. It’s very helpful!
>

Thank _you_ for your time.


> As you say, the problem probably is in individual records in the results
> for "[χ.ό.]". Do you see the error in the OPAC immediately after searching,
> or is it on a certain page of results? If it’s on a certain page of
> results, we might be able to narrow down the problem further.
>

I see the problem in the OPAC immediately after searching; no results are
displayed at all.


> When using “show” in yaz-client, are you able to view every single record?
> You might try using “format xml” in yaz-client if you’re not already doing
> so, as that might help problems to surface.
>

Yes, I enabled "format xml" and was able to view all 335 records returned
for my search by typing "show" repeatedly.


> How many records are in your Koha database overall? Depending on how
> technical you are, you might consider trying the MARC checker plugin (
> https://github.com/bywatersolutions/koha-plugin-marc-checker), or writing
> your own script to iterate through all the MARCXML records in your database
> and try to create MARC::Record objects from them. If the problem is with
> the record itself, that’s a good way of discovering which one(s) are at
> fault.
>
>
>
> If you’re having issues with results for "[χ.ό.]", it’s probably a safe
> assumption that there’s other problems with records in the database, so
> scanning all the records is probably a good idea. If you have a very large
> database, you can break it up into chunks using biblionumber.
>

We have approx. 22k records, but we're using UNIMARC (apologies for not
mentioning this earlier). For what it's worth, I enabled the MARC checker
plugin and ran a report on biblionumbers 1 to 200 (biblionumber 147
contained the string [χ.ό.]). This resulted in a lot of output like
"245: No 245 tag." because we store the Title in field 200a. So, as I
understand it, this particular plugin is tailored towards MARC21
installations.

I'm not well versed in Perl, so writing my own MARC checker script would be
difficult. However, I do know a little bit of C, so I've written a small
program that connects to our MySQL DB and fetches the 'marcxml' field of a
particular biblionumber. I then redirect the output of this program to a
file and (based on http://stackoverflow.com/questions/115210/utf-8-validation)
run `iconv` on the file to see if it contains any invalid UTF-8 data. No
records with UTF-8 "oddities" have been found using this method :-(
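
For anyone wanting to repeat the UTF-8 check without a C program and
`iconv`, here is a minimal Python sketch of the same idea. It assumes the
'marcxml' field has already been fetched as raw bytes for each biblionumber;
the function names and the second sample record are made up for
illustration and are not part of Koha.

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


def find_bad_records(records):
    """Check (biblionumber, marcxml_bytes) pairs; return the failing ones.

    The result is a list of (biblionumber, offset) tuples, one per record
    whose bytes do not decode as strict UTF-8.
    """
    bad = []
    for biblionumber, marcxml in records:
        offset = first_invalid_utf8_offset(marcxml)
        if offset is not None:
            bad.append((biblionumber, offset))
    return bad


if __name__ == "__main__":
    # Hypothetical sample data: the first record is valid UTF-8, the second
    # is deliberately broken (a lone lead byte with no continuation byte).
    sample = [
        (147, '<subfield code="a">[χ.ό.]</subfield>'.encode("utf-8")),
        (148, b'<subfield code="a">\xce</subfield>'),
    ]
    for biblionumber, offset in find_bad_records(sample):
        print(f"biblionumber {biblionumber}: invalid UTF-8 at byte {offset}")
```

Like the `iconv` check, this only confirms that the bytes decode as UTF-8;
it says nothing about whether the record is otherwise well formed.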

BTW, will you be attending KohaCon'16 by any chance?

Regards,
Andreas


> David Cook
>
> Systems Librarian
>
>
>
> Prosentient Systems
>
> 72/330 Wattle St
>
> Ultimo, NSW 2007
>
>
>
> Office: 02 9212 0899
>
> Direct: 02 8005 0595
>

