[Koha] Elasticsearch sorting broken
Thomas Klausner
domm at plix.at
Thu Feb 10 03:52:41 NZDT 2022
Hi!
(I did not want to file a bug yet, because maybe we're doing something
stupid on our side...)
We're using Elasticsearch for searching, but when using a non-default
sort, the results are not really sorted:
https://katalog.landesbibliothek.steiermark.at/cgi-bin/koha/opac-search.pl?idx=&q=aristophanes&sort_by=author_az
This should sort by author, and it seems to be partly sorted, but
several results are completely out of place. Here's another example,
sorted by publication date:
https://katalog.landesbibliothek.steiermark.at/cgi-bin/koha/opac-search.pl?idx=&q=budapest+f%C3%BChrer&sort_by=pubdate_dsc&addto=Add+to...
We get the weirdest results when sorting by author:
https://libelle.stmk.gv.at/cgi-bin/koha/opac-search.pl?idx=&q=wald&sort_by=author_az&addto=Hinzuf%C3%BCgen+zu...
I dumped the query from Koha and sent it directly to Elasticsearch, and
found some weird data in the result:
curl -X GET 'http://es:9200/index/_search?pretty' \
  -H 'Content-Type: application/json' \
  -d '{"sort":[{"issues__sort":{"order":"asc"}}],"query":{"query_string":{"default_operator":"AND","query":"(arisophanes)","lenient":true,"type":"cross_fields","fields":["title"],"fuzziness":"auto","analyze_wildcard":true}}}'
"hits" : {
  "hits" : [
    {
      "_source" : {
      ....
      "sort" : [
        "ށA†倀\u0001"
      ]
    },
Now I'm not sure if the weird binary data in sort ("ށA†倀\u0001") is
some ES-internal representation, or if there's something wrong with our
index or our data?
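To look at those sort values more closely, the search body can be rebuilt and the "sort" entry of each hit printed as code points rather than raw characters. The following is only a minimal Python sketch, not anything Koha itself does: the function names build_query and describe_sort_values are made up for illustration, and it assumes the response has already been parsed from JSON into a dict with the usual hits.hits[].sort layout.

```python
import json


def build_query(term, sort_field, order="asc"):
    """Build a search body shaped like the one in the curl call above."""
    return {
        "sort": [{sort_field: {"order": order}}],
        "query": {
            "query_string": {
                "default_operator": "AND",
                "query": f"({term})",
                "lenient": True,
                "type": "cross_fields",
                "fields": ["title"],
                "fuzziness": "auto",
                "analyze_wildcard": True,
            }
        },
    }


def describe_sort_values(response):
    """Return (document id, sort value as U+XXXX code points) for each hit."""
    rows = []
    for hit in response.get("hits", {}).get("hits", []):
        for value in hit.get("sort", []):
            codepoints = " ".join(f"U+{ord(ch):04X}" for ch in str(value))
            rows.append((hit.get("_id"), codepoints))
    return rows


if __name__ == "__main__":
    # Print the request body so it can be passed to curl -d.
    print(json.dumps(build_query("arisophanes", "issues__sort")))
    # Hypothetical, hand-made response fragment (not real index data).
    sample = {"hits": {"hits": [{"_id": "1", "sort": ["A\u0001"]}]}}
    print(describe_sort_values(sample))
```

Printing code points makes it easier to compare the sort values of neighbouring hits and to see whether they follow the order in which the hits are returned, even when the values themselves are not readable text.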
Does anybody else have any experience with using the non-default sort
methods and Elasticsearch? I doubt that this is a bug in Koha, because
it's a very obvious and often-used feature, so I fear we might have a
problem with some data / ES mappings / etc. Any ideas / input?
Greetings,
domm
--
#!/usr/bin/perl https://domm.plix.at
for(ref bless{},just'another'perl'hacker){s-:+-$"-g&&print$_.$/}