Re: [Koha] OPAC searches fail with ICU indexing enabled

17 May 2016

Hi David,

Thank you for your reply. Please see my answers inline below:

On Tue, May 17, 2016 at 4:51 AM, David Cook <dcook@prosentient.com.au>
wrote:
...
Hi Andreas,
This problem looks a little familiar. I have a few questions.
You find 335 records using yaz-client. Are you able to view those records
using "show" in yaz-client?
Yes, I can view the records using "show", or "show 1", "show 42" etc.
...
Also where are you seeing the following error:
...
Error:
:8: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xCF 0x3C 0x2F 0x74 Î¿Ï‚ Î¤Î¹Î¼ÏŒÎ¸ÎµÎ¿Î½ Î‘Î„-Ï€Ï Î¿Ï‚
Î¤Î¹Î¼ÏŒÎ¸ÎµÎ¿Î½ Î’Î„-Ï€Ï Î¿Ï‚ Î¤Î¯Ï„Î¿Î½-Ï€ ^
Is that in a file in your /var/log/koha/imp directory?
No, this is actually displayed in my web browser when searching in OPAC.
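
As an aside, here is a minimal Python sketch of how a byte sequence like the one in the error above could be located in a chunk of exported data. It scans a byte string and reports every position where UTF-8 decoding fails. The function name find_invalid_utf8 and the sample data are illustrative assumptions; they are not part of Koha, Zebra, or libxml2.

```python
def find_invalid_utf8(data: bytes):
    """Return a list of (offset, offending_bytes) for each undecodable run."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # Try to decode everything from the current position onward.
            data[pos:].decode("utf-8")
            break  # the rest of the data is valid UTF-8
        except UnicodeDecodeError as err:
            # err.start / err.end are relative to the slice, so add pos.
            start = pos + err.start
            end = pos + err.end
            problems.append((start, data[start:end]))
            pos = end  # skip the bad bytes and keep scanning
    return problems


# Hypothetical sample: a Greek pi (0xCF 0x80) cut in half just before a
# closing tag, which yields the same 0xCF 0x3C 0x2F 0x74 pattern as the error.
sample = b"<title>" + "\u03c0".encode("utf-8")[:1] + b"</title>"

for offset, bad in find_invalid_utf8(sample):
    print(f"invalid UTF-8 at offset {offset}: {bad.hex()}")
```

Run against a record dump, something along these lines might help narrow down which record carries the offending bytes, assuming the problem is in the stored data rather than introduced later in the pipeline.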
...
Also, those instructions at
https://wiki.koha-community.org/wiki/Correcting_Search_of_Arabic_records
look a bit suboptimal...
Are you using packages? Did you run the following?
sudo koha-restart-zebra {yourinstance}
sudo koha-rebuild-zebra -f {yourinstance}
Yes, I'm using packages and I've run both zebra commands.
...
That parser error doesn't look super helpful... In Windows-1252, 0xCF is
Ï, 0x3C is <, and 0x2F is /. In UTF-8, uppercase Χ is 0xCE 0xA7 and Ό is
0xCE 0x8C. So there isn't a clear relation there. If I had to guess, I'd say
that Zebra thinks it's using ICU and UTF-8 but the data is still stored as
Latin-1.
What I find odd is that other searches in OPAC for Greek characters work
fine and return records (for example searching for "[α.β.]", "[α.ό.]"). It
looks as if there's something contained in the results for "[χ.ό.]" that
causes the failure.
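
The byte values discussed above can be checked with a short Python sketch. It prints the UTF-8 encodings of the upper- and lowercase forms of the two Greek letters from the failing search, and shows how the first byte from the error (0xCF) reads under a few single-byte code pages. The variable and helper names are illustrative assumptions only.

```python
# Bytes reported by the parser error: 0xCF 0x3C 0x2F 0x74
error_bytes = bytes([0xCF, 0x3C, 0x2F, 0x74])


def hex_utf8(ch):
    """Return the UTF-8 encoding of ch as space-separated hex bytes."""
    return " ".join(f"0x{b:02X}" for b in ch.encode("utf-8"))


# UTF-8 forms of the letters in the failing search, in both cases.
utf8_forms = {
    "lowercase chi": hex_utf8("\u03c7"),
    "lowercase omicron with tonos": hex_utf8("\u03cc"),
    "uppercase chi": hex_utf8("\u03a7"),
    "uppercase omicron with tonos": hex_utf8("\u038c"),
}

# How the lone 0xCF byte reads in some single-byte encodings.
single_byte = {
    codec: error_bytes[:1].decode(codec)
    for codec in ("cp1252", "latin-1", "cp1251")
}

# The remaining three bytes are plain ASCII.
tail = error_bytes[1:].decode("ascii")

for name, value in utf8_forms.items():
    print(f"{name}: {value}")
for codec, char in single_byte.items():
    print(f"0xCF as {codec}: {char}")
print(f"remaining bytes: {tail}")
```

One possible reading of the result: the lowercase forms of both letters begin with 0xCF in UTF-8, so a lone 0xCF followed directly by "</t" might be a two-byte Greek character cut in half before a closing tag, rather than Latin-1 data. That is only a guess based on the four bytes shown in the error.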

Failing that... I have some other more in-depth troubleshooting ideas.
...
I'd be more than happy to hear those :-)
...
David Cook
Systems Librarian
Prosentient Systems
72/330 Wattle St
Ultimo, NSW 2007
Office: 02 9212 0899
Direct: 02 8005 0595
Regards,
Andreas

Andreas Roussos