[Koha] wrong sorting order of results beginning with accented letters (č, š, ř, ž,...)

Tomas Cohen Arazi tomascohen at gmail.com
Tue Jan 21 02:13:15 NZDT 2014


On Mon, Jan 20, 2014 at 7:24 AM, Bohdan Šmilauer <b.smilauer at post.cz> wrote:

>
> It points to /etc/koha/zebradb/lang_defs/en, which contains the file
> sort-string-utf.chr, which I updated as you wrote:
>
> "map ěêèéëÊÈÉË e", etc. I have found that other syntaxes can be used: "map
> eěêèéëÊÈÉË" or "map ěêèéëÊÈÉË(e)". Which is correct? Then I ran
> koha-rebuild-zebra -a -v -f.
>
> As a result, the accented letters are all treated as the same "e" and the
> accent is ignored in collation.


That's correct. Mapping all variants of e+diacritics means they all
"weight" the same (as e) in an ordering.


> But the grammatically correct order is that "e"
> precedes "é", "ě", ... How can I control this ordering?


Take the 'es' example (zebradb/lang_defs/es/sort-string-utf.chr) and look
for the lines:

lowercase {0-9}{a-y}zæøå
uppercase {0-9}{A-Y}ZÆØÅ

^^^^^^^^^^ those are the lines you need to adjust. To accomplish your goal,
you should remove from the mappings those letters with diacritics you want
to give a different sorting order (i.e. make them not weight the same as
'e'). The next step is to put them in the lowercase and uppercase
lines, in the desired (increasing) order.

For example:

lowercase {0-9}abcdeěêèéëfghijklmnopqrstuvwxyz
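
As a rough sketch of the ordering that such a lowercase line describes, the
Python below ranks each character by its position in the alphabet string
from the example. It only models the idea (position in the line = sort
weight) and is not Zebra code; ALPHABET, RANK and alphabet_key are
hypothetical names, and sending unknown characters to the end is an
assumption of this sketch.

```python
# Illustration only: position in the alphabet string acts as the sort weight.
# {0-9} from the example line is expanded to the digits 0..9.
ALPHABET = "0123456789abcdeěêèéëfghijklmnopqrstuvwxyz"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}

def alphabet_key(word):
    """Sort key: one rank per character; unknown characters sort last (assumption)."""
    return [RANK.get(ch, len(ALPHABET)) for ch in word.lower()]

words = ["éa", "ez", "ěa", "fa", "ea"]
# Plain "e" words come first, then "ě", then "é", and only then "f".
print(sorted(words, key=alphabet_key))
```

Here "ez" sorts before "ěa" because plain "e" now has its own, lower
position than "ě", which is the behaviour asked for above.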


Regards
To+

-- 
Tomás Cohen Arazi
Prosecretaría de Informática
Universidad Nacional de Córdoba
✆ +54 351 4333190 ext 13168
GPG: B76C 6E7C 2D80 551A C765  E225 0A27 2EA1 B2F3 C15F

