[Koha] How or maybe where are relevance rankings established?

Doug Dearden dearden at sarsf.org
Thu Feb 2 07:02:42 NZDT 2012


Hi Ian,

Thanks for the info.  Your comment that “exact matches in the title field are most relevant (in keyword searches, at least), then partial title matches, then regular matches” describes the behavior I expected, but it doesn’t always seem to hold true.  If you go to our public catalog at http://library.sarsf.org and do a Title search on “coming of age” (without the quotes), four hits are returned, all of which have the phrase in either the Title or the Remainder of Title field.  If you run the same search as a Keyword search, the returned list is topped by an item that has the phrase “coming of age” in the Summary, Etc. field (520a).  The four hits returned by the Title search appear in positions 2, 4, 5, and 9.  We would like the books that have the exact phrase in the title to appear at the top of the list.

Looking at the code in Search.pm, it does look like it is trying to weight for title, but I also traced the system preferences it calls at the beginning of the subroutine, and they are all marked “REQUIRES ZEBRA”.  So I assume any final tweaking would involve both the Zebra config and perhaps refining what the subroutine in Search.pm does.

I am not up to any of the above right now. ;)  We will live with things as they are.

Again thanks for the response.

Best,

Doug



From: Ian Walls [mailto:koha.sekjal at gmail.com]
Sent: Wednesday, February 01, 2012 4:29 AM
To: Doug Dearden
Subject: Re: [Koha] How or maybe where are relevance rankings established?

Doug,


Relevancy weighting is determined in C4/Search.pm.  It's pretty complex, and not as rich as one might think.  The basic truth is that exact matches in the title field are most relevant (in keyword searches, at least), then partial title matches, then regular matches.  I'm simplifying quite a bit, but things like author matches or "newness" of publication year don't factor in.

The exact code currently starts around line 757, in the internal subroutine _build_weighted_query().
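
As a rough illustration of the ordering described above (exact title match, then partial title match, then any other match), here is a minimal Python sketch. It is not the Koha or Zebra code: the Record class, the tier numbers, the choice of a summary field as the "regular match", and the sample titles are all assumptions made up for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical stand-in for a bibliographic record; not a Koha structure.
    title: str
    summary: str = ""


def words(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def contains_phrase(haystack, phrase):
    """True if the word list `phrase` occurs contiguously in `haystack`."""
    n = len(phrase)
    return n > 0 and any(
        haystack[i:i + n] == phrase for i in range(len(haystack) - n + 1)
    )


def relevance_tier(record, query):
    """Return 0 for an exact title match, 1 for a phrase inside the title,
    2 for a phrase found only in the summary, or None for no match."""
    q = words(query)
    title = words(record.title)
    if q and title == q:
        return 0
    if contains_phrase(title, q):
        return 1
    if contains_phrase(words(record.summary), q):
        return 2
    return None


def rank(records, query):
    """Return matching records, best tier first, keeping input order within a tier."""
    scored = [(relevance_tier(r, query), i, r) for i, r in enumerate(records)]
    matched = [item for item in scored if item[0] is not None]
    matched.sort(key=lambda item: (item[0], item[1]))
    return [r for _, _, r in matched]


if __name__ == "__main__":
    # Illustrative sample data only.
    catalog = [
        Record("A Summer Story", "A coming of age tale set in a small town."),
        Record("Coming of Age in Samoa"),
        Record("Coming of Age"),
    ]
    for record in rank(catalog, "coming of age"):
        print(record.title)
```

With a strict tiering like this, a record that matches only in its summary can never outrank a title match, which is the behavior Doug was expecting from a keyword search.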

Cheers,


-Ian

On Tue, Jan 31, 2012 at 11:57, Doug Dearden <dearden at sarsf.org> wrote:
Thanks Mason, you've put me on the yellow brick road.

I've looked at several config files in the ..../zebradb subdirectory, and am now surrounded by the flying monkeys. :)

Best,

Doug

-----Original Message-----
From: Mason James [mailto:mtj at kohaaloha.com]
Sent: Monday, January 30, 2012 6:50 PM
To: Doug Dearden
Cc: koha at lists.katipo.co.nz
Subject: Re: [Koha] How or maybe where are relevance rankings established?


On 2012-01-31, at 12:44 PM, Doug Dearden wrote:

> Hello All,
>
> I am trying to get a grip on how a search is ordered when "relevance" is specified.  Is there a place in the staff client where these results are controlled - i.e. which fields are given weight?

afaik, the relevance is determined in zebra, internally


here's some doc..

http://www.indexdata.com/zebra/doc/administration-ranking.html
http://www.indexdata.com/zebra/doc/querymodel-zebra.html

>
> Or perhaps the question is where would I find the algorithm that is being used - assuming it is hard coded somewhere?


>
> The "about" info pasted below my name in the event an update will make a difference here.
yes indeed, good thinking :)



_______________________________________________
Koha mailing list  http://koha-community.org
Koha at lists.katipo.co.nz
http://lists.katipo.co.nz/mailman/listinfo/koha


