[Koha] Elasticsearch sorting broken

Thomas Klausner domm at plix.at
Thu Feb 10 03:52:41 NZDT 2022


Hi!

(I did not want to file a bug yet, because maybe we're doing something 
stupid on our side...)

We're using Elasticsearch for searching, but when using a non-default 
sort, the results are not really sorted:

https://katalog.landesbibliothek.steiermark.at/cgi-bin/koha/opac-search.pl?idx=&q=aristophanes&sort_by=author_az

This should sort by author, and it seems to be partly sorted, but 
several results are completely out of place. Here's another example, 
sorted by publication date:

https://katalog.landesbibliothek.steiermark.at/cgi-bin/koha/opac-search.pl?idx=&q=budapest+f%C3%BChrer&sort_by=pubdate_dsc&addto=Add+to...

We get the weirdest results when sorting by author:

https://libelle.stmk.gv.at/cgi-bin/koha/opac-search.pl?idx=&q=wald&sort_by=author_az&addto=Hinzuf%C3%BCgen+zu...


I dumped the query from Koha and sent it directly to Elasticsearch, and 
found some weird data in the result:

curl -X GET 'http://es:9200/index/_search?pretty' \
  -H 'Content-Type: application/json' \
  -d '{"sort":[{"issues__sort":{"order":"asc"}}],"query":{"query_string":{"default_operator":"AND","query":"(arisophanes)","lenient":true,"type":"cross_fields","fields":["title"],"fuzziness":"auto","analyze_wildcard":true}}}'

 "hits" : {
    "hits" : [
    {
      "_source" : {
        ....
        "sort" : [
          "ށA†倀\u0001"
        ]
      },


Now I'm not sure whether the weird binary data in sort ("ށA†倀\u0001") is 
some ES-internal representation, or whether there's something wrong with 
our index or our data.
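
For anyone who wants to reproduce this, here is a minimal Python sketch 
of the check described above. It only builds the same request body as 
the curl call and scans a response for sort values that contain 
non-printable characters; it does not talk to Elasticsearch itself. The 
function names build_sorted_query and opaque_sort_values are made up 
for this illustration, and the sketch assumes that "sort" sits at the 
top level of each hit, as in a standard Elasticsearch response.

```python
import json


def build_sorted_query(term, sort_field, order="asc", fields=("title",)):
    """Build a search body shaped like the query dumped from Koha."""
    return {
        "sort": [{sort_field: {"order": order}}],
        "query": {
            "query_string": {
                "default_operator": "AND",
                "query": f"({term})",
                "lenient": True,
                "type": "cross_fields",
                "fields": list(fields),
                "fuzziness": "auto",
                "analyze_wildcard": True,
            }
        },
    }


def opaque_sort_values(response):
    """Return (hit id, sort value) pairs whose sort value is a string
    containing at least one non-printable character."""
    flagged = []
    for hit in response.get("hits", {}).get("hits", []):
        # Assumption: "sort" is a sibling of "_source" in each hit.
        for value in hit.get("sort", []):
            if isinstance(value, str) and not value.isprintable():
                flagged.append((hit.get("_id"), value))
    return flagged


if __name__ == "__main__":
    # Print a body that can be passed to curl with -d.
    print(json.dumps(build_sorted_query("aristophanes", "issues__sort")))
```

Running opaque_sort_values over the parsed JSON response shows how many 
hits carry such opaque sort values, which should make it easier to 
compare different sort fields or indexes.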

Does anybody else have any experience with using the non-default sort 
methods and Elasticsearch? I doubt that this is a bug in Koha, because 
it's a very obvious and often-used feature, so I fear we might have a 
problem with some data / ES mappings / etc. Any ideas / input?

Greetings,
domm



-- 
#!/usr/bin/perl                             https://domm.plix.at
for(ref bless{},just'another'perl'hacker){s-:+-$"-g&&print$_.$/}

