wrong sorting order of results beginning with accented letters (č, š, ř, ž,...)
Hi to all!

I'm trying to achieve the correct sorting of search results in Koha 3.12 installed on Ubuntu 12.04. I followed the instructions at
http://wiki.koha-community.org/wiki/Encoding_and_Character_Sets_in_Koha
Despite this, the sorting of authority and title results is wrong. You can see it at http://koha.doxos.eu:8080/ The "locale" variable is set to cs_CZ.UTF-8, and a simple sample sorting program written in Perl works correctly.
Does anyone know which programs in Koha control the collation order of results in the OPAC, and how to fix it? Thank you very much for the advice.

Bohdan Smilauer
Librarian of Economic Library
Letenska 15
Prague 1
Czechia
mail: b.smilauer@post.cz
phone +420736120563
Hi there!

You can adjust the mapping used for sorting in the sort-string-utf.chr file. Here is a common line as an example:

    map êèéëÊÈÉË e

Make sure to use an encoding-aware text editor (e.g. vi) and restart Zebra when you're done.

Hope this helps,
Fabio
_______________________________________________ Koha mailing list http://koha-community.org Koha@lists.katipo.co.nz http://lists.katipo.co.nz/mailman/listinfo/koha
Hi Fabio,

many thanks for your advice. It helped me. I examined the /etc/koha/zebradb/zebra-authorities.cfg file, specifically the entry

    # Where are the config files located?
    profilePath:/etc/koha/zebradb/authorities/etc:/etc/koha/zebradb/etc:/etc/koha/zebradb/marc_defs/marc21/authorities:/etc/koha/zebradb/lang_defs/en

It points to /etc/koha/zebradb/lang_defs/en, which contains the file sort-string-utf.chr. I updated it as you wrote: "map ěêèéëÊÈÉË e", etc. I have found that other syntaxes are also in use, "map eěêèéëÊÈÉË" or "map ěêèéëÊÈÉË(e)". Which one is correct? Then I ran koha-rebuild-zebra -a -v -f.

As a result, all the accented letters are treated as the same "e" and the accent is ignored in collation. But the grammatically correct order is that "e" precedes "é", "ě", ... How can I control this sequence? You can see the result at http://koha.doxos.eu:8080, Authority search, "Submit" (leave the other fields empty).

I want to use the Authority search as a replacement for the "Browse authors" feature that is missing in Koha. The number of authors is typically more than one thousand. I noticed that authors beyond No. 1000 are not sorted, even though I added "sortmax 11000" at the end of the /etc/koha/zebradb/zebra-authorities.cfg file. Selecting authors starting with, e.g., the letter "H" is not a simple task: you have to skip many screens, and there is no direct jump to the letter "H" or to a given page, e.g. 345. Do you have any experience with this problem?

Many thanks
Bohdan Smilauer
Hi to all,

On 20/01/2014 11:24, Bohdan Šmilauer wrote:
> "map ěêèéëÊÈÉË e", etc. I have found that other syntax can be used "map eěêèéëÊÈÉË", or "map ěêèéëÊÈÉË(e)", what is correct? Then I ran koha-rebuild-zebra -a -v -f.

As far as I know, the correct one is "map ěêèéëÊÈÉË e".

> It caused the accented letters are assumed to be all the same "e" and the accent is ignored in collation.

And in fact this is correct: with "map ěêèéëÊÈÉË e" you obtain exactly this result. So your request is more complex. Look at the other directories alongside /etc/koha/zebradb/lang_defs/en, for example

    /etc/koha/zebradb/lang_defs/uk [Ukrainian]
    /etc/koha/zebradb/lang_defs/nb [Norwegian]

and read http://www.indexdata.com/zebra/doc/character-map-files.html, the official documentation on setting up sort-string-utf.chr [not easy]. I also suggest reading http://www.indexdata.com/zebra/doc/fields-and-charsets.html and the linked pages.

Cheers
Zeno Tajoli
--
Dr. Zeno Tajoli
Dipartimento Gestione delle Informazioni e della Conoscenza
z.tajoli@cineca.it
fax +39 02 2135520
CINECA - Sede operativa di Segrate
On Mon, Jan 20, 2014 at 7:24 AM, Bohdan Šmilauer <b.smilauer@post.cz> wrote:
> It points to /etc/koha/zebradb/lang_defs/en , where is the file sort-string-utf.chr, which I updated, as you wrote:
> "map ěêèéëÊÈÉË e", etc. I have found that other syntax can be used "map eěêèéëÊÈÉË", or "map ěêèéëÊÈÉË(e)", what is correct? Then I ran koha-rebuild-zebra -a -v -f.
> It caused the accented letters are assumed to be all the same "e" and the accent is ignored in collation.
That's correct. Mapping all variants of e with diacritics to e means they all "weigh" the same (as e) in the ordering.
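To make the effect of such a map line concrete, here is a minimal Python sketch. It is only a model of the behaviour described above, not Zebra's actual code; the names FOLD_TO_E and sort_key are made up for the illustration.

```python
# Model of a line such as "map ěêèéëÊÈÉË e": every listed character is
# replaced by plain "e" before sort keys are compared.
# FOLD_TO_E and sort_key are hypothetical names, not part of Zebra or Koha.
FOLD_TO_E = str.maketrans({ch: "e" for ch in "ěêèéëÊÈÉË"})


def sort_key(term):
    """Fold the mapped characters to "e", then lowercase the result."""
    return term.translate(FOLD_TO_E).lower()


# All three spellings end up with the same key, so they tie in the ordering.
keys = [sort_key(word) for word in ["Ema", "Éma", "ěma"]]
print(keys)

# Because "é" weighs the same as "e", "éa" sorts before "eb".
print(sorted(["eb", "éa"], key=sort_key))
```

Since the accent is discarded before comparison, no ordering between "e", "é" and "ě" can come out of a map line alone.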
> But grammatically correct is, that the "e" precedes "é", "ě",..... How can I control this succession?
Take the 'es' example (zebradb/lang_defs/es/sort-string-utf.chr) and look for the lines:

    lowercase {0-9}{a-y}zæøå
    uppercase {0-9}{A-Y}ZÆØÅ

Those are the lines you need to adjust. To accomplish your goal, remove from the mappings those letters with diacritics that you want to sort differently (i.e. so that they no longer weigh the same as 'e'). The next step is to put them in the lowercase and uppercase lines, in (increasing) sort order. For example:

    lowercase {0-9}abcdeěêèéëfghijklmnopqrstuvwxyz

Regards
To+
--
Tomás Cohen Arazi
Prosecretaría de Informática
Universidad Nacional de Córdoba
✆ +54 351 4333190 ext 13168
GPG: B76C 6E7C 2D80 551A C765 E225 0A27 2EA1 B2F3 C15F
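The idea that position in the lowercase line decides the sort order can be sketched in Python as follows. This is an illustrative model under simplifying assumptions, not how Zebra is implemented: the {0-9} range is written out in full, characters missing from the list are simply skipped, and the names LOWERCASE, RANK and sort_key are hypothetical.

```python
# The example line "lowercase {0-9}abcdeěêèéëfghijklmnopqrstuvwxyz",
# with the {0-9} range expanded by hand.
LOWERCASE = "0123456789abcdeěêèéëfghijklmnopqrstuvwxyz"

# Each character's weight is its position in the list.
RANK = {ch: i for i, ch in enumerate(LOWERCASE)}


def sort_key(term):
    """Turn a term into a list of weights; unknown characters are skipped."""
    return [RANK[ch] for ch in term.lower() if ch in RANK]


words = ["ěd", "fa", "ez", "éa", "ea"]
ordered = sorted(words, key=sort_key)
print(ordered)
```

With this model, every word starting with "e" comes before every word starting with "ě", which in turn comes before "é" and then "f", exactly in the order the letters were listed.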
Hi Tomas!

Wonderful! Many thanks! It really does work! See http://koha.doxos.eu:8080/

The collation order of all letters is determined by their order in the lists

    # basic character set
    lowercase {0-9}aábcčdďeéěfghiíjklmnňoópqrřsštťuúůvwxyýzž
    uppercase {0-9}AÁBCČDĎEÉĚFGHIÍJKLMNŇOÓPQRŘSŠTŤUÚŮVWXYÝZŽ

stored in the active sort-string-utf.chr file. The accented letters must not be included in any other map or equivalence statements. How simple, but how difficult to discover!

I would like to express my thanks to all who helped me.

Yours sincerely
Bohdan Smilauer
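The rule that an accented letter must not appear both in the lowercase/uppercase lists and in a map or equivalence statement can be checked mechanically. Below is a rough Python sketch of such a check. It assumes a very simplified view of the .chr file (one directive per line, the character set as the first argument) and only understands line shapes like those quoted in this thread; the function name conflicting_letters and the SAMPLE text are made up for the illustration.

```python
def conflicting_letters(chr_text):
    """Return non-ASCII letters that appear both in a lowercase/uppercase
    line and in a map/equivalent line (simplified parsing, see above)."""
    ordered, mapped = set(), set()
    for line in chr_text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        directive, args = parts[0], parts[1:]
        if not args:
            continue
        letters = {ch for ch in args[0] if not ch.isascii()}
        if directive in ("lowercase", "uppercase"):
            ordered |= letters
        elif directive in ("map", "equivalent"):
            mapped |= letters
    return ordered & mapped


# Hypothetical file content: the Czech lists from the message above plus
# one leftover map line that still folds "é" and "É" to "e".
SAMPLE = """\
# basic character set
lowercase {0-9}aábcčdďeéěfghiíjklmnňoópqrřsštťuúůvwxyýzž
uppercase {0-9}AÁBCČDĎEÉĚFGHIÍJKLMNŇOÓPQRŘSŠTŤUÚŮVWXYÝZŽ
map éÉ e
"""

conflicts = conflicting_letters(SAMPLE)
print(sorted(conflicts))
```

An empty result would suggest that no listed letter is still being folded elsewhere in the file. A real file may contain constructs this sketch does not understand, so treat it as a hint rather than a validator.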
participants (4):
- Bohdan Šmilauer
- Fabio Tiana
- Tomas Cohen Arazi
- Zeno Tajoli