[Koha] Title search works, but Library catalog search fails in OPAC
Bales (US), Tasha R
tasha.r.bales at boeing.com
Fri Jun 25 09:55:46 NZST 2021
Follow-up to my problem searching the OPAC for phrases containing punctuation (e.g., Electroactive polymer (EAP) actuators). This Bywater Solutions<https://bywatersolutions.com/education/elastic-searching> article suggests that the problem is a feature of Elasticsearch.
FYI, we are using Elasticsearch 6.1.1 on its own dedicated server, and I don’t believe we’ve installed the ICU Analysis plug-in (looks like it’s required for Zebra, but I can’t tell if it’s required for Elasticsearch), which could be a factor. I couldn’t replicate all aspects of my experience in a sandbox, although searches for phrases containing punctuation still failed in an OPAC “Library Catalog” sandbox search. I concluded that I needed to review our configuration.
I’ve been reviewing the Koha Wiki and elastic.co documentation, and comparing to our index_config.yaml.
By chance, does anyone know how to interpret the syntax below? The documentation describes the parameters below, but I don’t see any usage of “-” before options. Does the option “- punctuation” mean “yes, remove punctuation”, or “no, don’t remove punctuation”, or does the phrase refer to some additional configuration file, or perhaps it’s commented out?
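On the syntax question above: in YAML, a leading "- " introduces one entry of a list, so a line such as "- punctuation" is an item in a list of names rather than a yes/no flag or a comment (YAML comments begin with "#"). Whether that name points at a filter defined elsewhere in the same index_config.yaml is something to confirm in your own file; the sketch below only assumes it does. The Python structure, the filter definition, and the regular expression are hypothetical stand-ins for illustration, not Koha's shipped configuration.

```python
import re

# Hypothetical Python equivalent of a YAML fragment such as:
#   char_filter:
#     - punctuation
# The "-" makes "punctuation" one element of a list.
analyzer = {"char_filter": ["punctuation"]}

# Assumed definition of the named filter: replace ASCII punctuation with nothing.
# The pattern is illustrative only.
char_filters = {
    "punctuation": {"pattern": r"[!-/:-@\[-`{-~]", "replacement": ""},
}

def apply_char_filters(text, analyzer, char_filters):
    """Apply each named filter in list order as a pattern-replace step."""
    for name in analyzer["char_filter"]:
        spec = char_filters[name]
        text = re.sub(spec["pattern"], spec["replacement"], text)
    return text

filtered = apply_char_filters(
    "Electroactive polymer (EAP) actuators", analyzer, char_filters
)
print(filtered)  # Electroactive polymer EAP actuators
```

Under these assumptions, listing the name applies the filter, so the parentheses are stripped before the text is indexed or searched by any analyzer that lists it.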
Thanks for your time and consideration,
Business Support Team | Information Services
Enterprise Services | Enterprise Operations, Finance and Sustainability
From: Bales (US), Tasha R
Sent: Tuesday, June 22, 2021 7:54 AM
To: 'Jonathan Druart' <jonathan.druart at bugs.koha-community.org>
Cc: Discussion Group Koha <koha at lists.katipo.co.nz>
Subject: RE: [EXTERNAL] Re: [Koha] Title search works, but Library catalog search fails in OPAC
Jonathan, thank you!
It does work without the parentheses.
I would suspect an encoding problem, except that the problem only manifests in the OPAC, and not the intranet.
I came across this issue while testing after migrating from MariaDB to Percona MySQL. Your reply prompted me to check the encoding of the new database, and it's unfortunately Latin-1. Since these are parentheses and not diacritics, I’m not sure what my expectations should be, but changing to UTF-8 is a place to start. httpd.conf does have UTF-8 set as the default.
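As a quick check of that expectation: parentheses are plain ASCII, and ASCII characters are encoded to the same bytes in Latin-1 and UTF-8, so the two encodings only diverge for characters such as diacritics. A minimal Python sketch (the sample strings are for illustration only, and this says nothing about how the database's character set interacts with Koha):

```python
title = "Electroactive polymer (EAP) actuators"

# ASCII-only text produces identical bytes in both encodings.
same_for_ascii = title.encode("latin-1") == title.encode("utf-8")

# A character with a diacritic does not: one byte in Latin-1, two in UTF-8.
accented = "\u00e9"  # é
latin1_bytes = accented.encode("latin-1")  # b'\xe9'
utf8_bytes = accented.encode("utf-8")      # b'\xc3\xa9'

print(same_for_ascii, latin1_bytes, utf8_bytes)
```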
FWIW, my source records were encoded in MARC-8. I used MarcEdit to convert them to UTF-8, and it appears that Koha automatically converts anyway on import. When I loaded these records into Koha, I used bulkmarcimport.pl on the command line.
I'll ask that the default character set of the database be changed, and see if that helps. Thanks again. I'm embarrassed that I didn't think to omit the parentheses, or rather was belligerently insisting to myself that they should not have been a problem.
Business Support Team | Information Services Enterprise Services | Enterprise Operations, Finance and Sustainability
From: Jonathan Druart [mailto:jonathan.druart at bugs.koha-community.org]
Sent: Tuesday, June 22, 2021 12:01 AM
To: Bales (US), Tasha R <tasha.r.bales at boeing.com>
Cc: Discussion Group Koha <koha at lists.katipo.co.nz>
Subject: [EXTERNAL] Re: [Koha] Title search works, but Library catalog search fails in OPAC
I've created 2 records with
245$a Electroactive polymer (EAP) actuators as artificial muscles and the following query returns the 2 results.
Tried on master and 20.11.06.
Maybe a silly idea: does it work without the parentheses?
Could you try and recreate it on a sandbox
(https://wiki.koha-community.org/wiki/Sandboxes) and provide us a step by step plan to reproduce the problem?
On Tue, 22 June 2021 at 00:46, Bales (US), Tasha R <tasha.r.bales at boeing.com> wrote:
> Good afternoon,
> I’m having trouble with Title vs. Library catalog keyword searching with several example titles. Searching the same phrase with either method yields different results. This problem occurs only in the OPAC. I hope to confirm whether the behavior I’m seeing is intended (i.e., the problem is me) or not. Thanks in advance.
> For example, given the ebook title Electroactive polymer (EAP) actuators as artificial muscles, a Title keyword search in the OPAC is successful, but a plain Library catalog keyword search (i.e., no index specified) fails.
> For reference, the title is recorded in the MARC record as:
> 245 00 - TITLE STATEMENT
> a Title Electroactive polymer (EAP) actuators as artificial muscles :
> Below I’ve copied in my search history as well as the tail of the search URL that shows the search parameters.
> · Library catalog keyword search with 0 results
> o 2021-06-21 02:34 PM Electroactive polymer (EAP) actuators, suppress:false 0
> o …opac-search.pl?idx=&q=Electroactive%20polymer%20%28EAP%29%20actuators&weight_search=1
> · Title keyword search with 2 results
> o 2021-06-21 02:34 PM Electroactive polymer (EAP) actuators, suppress:false 2
> o …opac-search.pl?idx=ti&q=Electroactive+polymer+%28EAP%29+actuators&weight_search=1
> As a test, I decided to enclose my Library catalog search terms in quotes, which yielded the desired results. However, I did not at all anticipate that quotes would be required to get hits:
> · Library catalog quoted keyword search with 2 results
> o 2021-06-21 02:46 PM "Electroactive polymer (EAP) actuators", suppress:false 2
> o … opac-search.pl?idx=&q=%22Electroactive+polymer+%28EAP%29+actuators%22&weight_search=1
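In the query strings above, the spaces between words appear as %20 in one URL and as + in the others. In a URL query string both forms stand for a space, so they decode to the same search text. A small Python sketch using the standard library, with the query-string tails copied from the search history above:

```python
from urllib.parse import parse_qs

catalog = "idx=&q=Electroactive%20polymer%20%28EAP%29%20actuators&weight_search=1"
title = "idx=ti&q=Electroactive+polymer+%28EAP%29+actuators&weight_search=1"

# keep_blank_values=True keeps the empty idx= parameter in the result.
catalog_params = parse_qs(catalog, keep_blank_values=True)
title_params = parse_qs(title, keep_blank_values=True)

print(catalog_params["q"], catalog_params["idx"])
print(title_params["q"], title_params["idx"])
```

The only difference that survives decoding is idx (empty versus ti), which suggests the differing results come from how each index handles the text rather than from the URL encoding.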
> On comparing the above URL query strings, it appears that the unquoted terms in the Library catalog keyword search aren’t “anded” together with a “+” the way other searches are, but I’m not sure what the implications are, if any. Also, the Koha manual indicates the following, which suggests to me that I ought to get hits on the unquoted string:
> When you have more than one word in the search box, Koha will still do a keyword search, but a bit differently. Each word will be searched on its own, then the Boolean connector ‘and’ will narrow your search to those items with all words contained in matching records.
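The behaviour the manual describes can be sketched as an intersection of per-word matches. The toy index below is an assumption made for illustration (it is not how Koha or Elasticsearch is implemented, and the second record is invented); it shows one way an AND search can return nothing if a query word such as "(EAP)" keeps its parentheses while the indexed word does not.

```python
import re

records = {
    1: "Electroactive polymer (EAP) actuators as artificial muscles",
    2: "Polymer chemistry handbook",  # invented sample record
}

def index_tokens(text):
    """Index side (assumed): lowercase and keep only alphanumeric runs."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

# Build a word -> set of record ids mapping.
index = {}
for rec_id, rec_title in records.items():
    for tok in index_tokens(rec_title):
        index.setdefault(tok, set()).add(rec_id)

def and_search(words):
    """Search each word on its own, then intersect the result sets."""
    hits = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*hits) if hits else set()

# Query words used as typed: "(EAP)" matches nothing, so the AND is empty.
raw = and_search("Electroactive polymer (EAP) actuators".split())
# Same query with the parentheses removed before lookup.
cleaned = and_search("Electroactive polymer EAP actuators".split())
print(raw, cleaned)
```

In this sketch the unmodified query finds no records and the cleaned one finds record 1, which matches the observation that the search works without the parentheses.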
> I understand and can predict pretty well the way our old ILS (Millennium, if context helps) will perform a keyword search, but I’m a little confused here. My expectation for this particular case is that all of the above methods would yield results. If there are any pointers to be had, I would be grateful if you could point me to them so that I may be better poised to help users.
> I’m using Elasticsearch with Koha 20.11.06. I reindexed both authorities and biblios today, but that didn’t impact my experience. The records are not newly added.
> Tasha Bales
> Business Support Team | Information Services Enterprise Services |
> Enterprise Operations, Finance and Sustainability
> Koha mailing list http://koha-community.org Koha at lists.katipo.co.nz
> Unsubscribe: https://lists.katipo.co.nz/mailman/listinfo/koha