local language searching problem
Dear all,

I plan to install KOHA in my office, and I am testing the KOHA configuration in a development environment. I have books in two languages: English and Dzongkha (the official language of Bhutan). I registered some test books in KOHA and checked the search function in both the staff client and the OPAC. When I search in English, I get results that match the title I entered. However, when I search in Dzongkha, the results show all the books I registered in Dzongkha, regardless of the search terms I entered. When I ran a SELECT query directly in MySQL, I got the correct results in Dzongkha, so I suppose the problem is related to KOHA's configuration. If anyone knows a solution for this local-language searching problem, could you please tell me in detail how to modify the configuration?

OS: Linux Ubuntu 14.04 64-bit
KOHA version: KOHA 3.22.11
Zebra version: Zebra 2.0.44
Record field: MARC21

Best regards,
Fatone

-- View this message in context: http://koha.1045719.n5.nabble.com/local-language-searching-problem-tp5906148... Sent from the Koha-general mailing list archive at Nabble.com.
Fatone,

If you do a keyword search in Dzongkha, do you get results? (i.e. the "Search the catalog" masthead search in the staff client.)

How is the Dzongkha text cataloged? Are you using MARC 880? It's possible that you are running into Bug 17407 - Fields cataloged using MARC21 880 are only searchable using keyword search:
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=17407

--Barton
Hi Barton,

On Sun, Oct 9, 2016 at 10:11 PM, Barton Chittenden <barton@bywatersolutions.com> wrote:
Fatone,
If you do a keyword search in Dzongkha, do you get results? (i.e. the "Search the catalog" masthead search in the staff client.)
How is the Dzongkha text cataloged? Are you using MARC 880?

Bhutanese users that I work with usually enter Dzongkha as any other UTF-8 text. I suspect the case is a simple one: that of setting up Zebra to use icuchain.

Fatone: The name is "Koha" and never "KOHA". It is a proper name, not an acronym or abbreviation :-)

cheers
Indranil Das Gupta
L2C2 Technologies
Phone : +91-98300-20971
WWW : http://www.l2c2.co.in
Blog : http://blog.l2c2.co.in
IRC : indradg on irc://irc.freenode.net
Twitter : indradg
Excellent point, Indranil! And for completeness, here's a wiki page showing how to set up ICU chains:
https://wiki.koha-community.org/wiki/Correcting_Search_of_Arabic_records

I think that the changes to /etc/koha/zebradb/etc/default.idx are all that are *required*: as long as words-icu.xml exists, making the changes to default.idx will enable searching.
Dear Barton,

Thank you for replying.

Barton, I do get results when I enter a Dzongkha keyword in "Search the catalog", but the results are all the books registered in Koha. I type Dzongkha text using the input support I installed from Ubuntu Language Support, and I use the MARC21 942 fields of the default framework.

I have no idea how to set up ICU chains. The link you attached is for Arabic, not Dzongkha. If I change "words-icu.xml" from <icu_chain locale="ar"> to <icu_chain locale="dz">, will searching be enabled?

I'm sorry that I don't understand what you say, because I don't have enough knowledge about Koha yet. Would you mind giving me more detailed advice about what I should do?
Hi Fatone and all,

On 10/10/2016 14:45, Fatone wrote:
The link you attached is for Arabic, not Dzongkha. If I change "words-icu.xml" from <icu_chain locale="ar"> to <icu_chain locale="dz">, will searching be enabled?

In fact, this change is the first thing to do. I suggest you also read the page below, which is clearer if you know the Latin alphabet better than the Arabic one:
https://wiki.koha-community.org/wiki/Correcting_Search_of_Polish_records

The icu_chain could be:

    <icu_chain locale="dz">
      <transliterate rule="\'>\ "/>
      <transliterate rule="[:Number:] { '-' > "/>
      <transform rule="[:Control:] Any-Remove"/>
      <tokenize rule="l"/>
      <transform rule="[[:WhiteSpace:][:Punctuation:]] Remove"/>
      <transform rule="NFD"/>
      <transform rule="[:Nonspacing Mark:] Remove"/>
      <transform rule="NFC"/>
      <display/>
      <casemap rule="l"/>
    </icu_chain>

This is a basic chain for all languages; read the ICU site to better understand the available options. You can start here:
http://userguide.icu-project.org/transforms/general/rules

Zebra (the indexing system of Koha) uses ICU quite directly. You can run some tests with yaz-icu:
http://www.indexdata.com/yaz/doc/yaz-icu.html

Bye
Zeno Tajoli

--
Zeno Tajoli
/SVILUPPO PRODOTTI CINECA/ - Automazione Biblioteche
Email: z.tajoli@cineca.it Fax: 051/6132198
*CINECA* Consorzio Interuniversitario - Sede operativa di Segrate (MI)
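To get a feel for what the NFD / "[:Nonspacing Mark:] Remove" / NFC / casemap steps of such a chain do, here is a rough Python sketch. It is only an approximation built on Python's standard unicodedata module, not ICU or Zebra itself, and the function name fold_marks is made up for illustration. One thing it makes visible: in the Tibetan script used for Dzongkha, vowel signs and subjoined letters are nonspacing marks (category Mn), so this step removes them as well.

```python
import unicodedata


def fold_marks(text: str) -> str:
    """Approximate the NFD / remove-Mn / NFC / lowercase steps of the chain.

    Hypothetical helper for illustration; Zebra applies these rules via ICU.
    """
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every character whose general category is Mn (nonspacing mark).
    kept = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return unicodedata.normalize("NFC", kept).lower()


if __name__ == "__main__":
    # Latin example: the accent is folded away and the text is lowercased.
    print(fold_marks("Café"))
    # Tibetan-script example: RA + subjoined DZA + vowel sign O + NGA.
    syllable = "\u0f62\u0fab\u0f7c\u0f44"
    # Print the names of the characters that survive the folding.
    print([unicodedata.name(ch) for ch in fold_marks(syllable)])
```

If this folding turns out to make distinct Dzongkha words match each other, one option to try is leaving the NFD / "[:Nonspacing Mark:] Remove" / NFC transforms out of the chain; whether that is needed depends on the language and is not something this sketch can decide.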
Hi Fatone,

On Mon, Oct 10, 2016 at 6:15 PM, Fatone <zfr83329@xzsok.com> wrote:
<snipped>
I'm sorry that I don't understand what you say, because I don't have enough knowledge about Koha yet. Would you mind giving me more detailed advice about what I should do?

For now do this:

1. Using sudo permissions, edit this file: /etc/koha/zebradb/etc/default.idx
2. Find the line "charmap word-phrase-utf.chr" in it (there should be two instances).
3. Replace that line with "icuchain words-icu.xml" in both cases.
4. Rebuild Zebra with the command:
   sudo koha-rebuild-zebra -v -f your_koha_instance_name
   where your_koha_instance_name is the name you gave while running the koha-create command.
5. Set the following system preferences in your Koha administration:
   (a) enable UseICU
   (b) set QueryFuzzy and QueryStemming to "do not try".

Long explanation: The reason why Barton and Karam are pointing to the Arabic reference is that the example shows how to generate language-specific transliteration and collation rules (in that case for Arabic). In the case of Dzongkha you *may* (or may not) need to define your own rules to fine-tune the searches. I do not understand Dzongkha grammar well enough to say whether you need to.

cheers
Indranil Das Gupta
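Steps 2 and 3 above amount to a line-for-line substitution, which can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper name switch_to_icu and the sample text are made up, and the sample is a stand-in rather than the real contents of default.idx. To apply it to the real file you would still need sudo rights, and keeping a backup copy first is advisable.

```python
OLD_LINE = "charmap word-phrase-utf.chr"
NEW_LINE = "icuchain words-icu.xml"


def switch_to_icu(idx_text: str) -> tuple[str, int]:
    """Replace each charmap line with the icuchain line.

    Returns the rewritten text and the number of lines replaced.
    Hypothetical helper; it works on text only and never touches files.
    """
    out = []
    replaced = 0
    for line in idx_text.splitlines(keepends=True):
        if line.strip() == OLD_LINE:
            # Keep the original indentation and line ending.
            line = line.replace(OLD_LINE, NEW_LINE)
            replaced += 1
        out.append(line)
    return "".join(out), replaced


# Made-up stand-in for default.idx, only to exercise the function.
SAMPLE = (
    "index w\n"
    "charmap word-phrase-utf.chr\n"
    "\n"
    "index p\n"
    "charmap word-phrase-utf.chr\n"
)

if __name__ == "__main__":
    new_text, count = switch_to_icu(SAMPLE)
    print(f"replaced {count} line(s)")
    print(new_text)
```

If the count is not 2 on the real file, that is a hint to inspect default.idx by hand before rebuilding Zebra, since the steps above expect two instances.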
Dear Zeno, Indranil,

Thank you for your advice on the setup. I will try it that way while studying icu_chain and the Koha configuration. If the setting doesn't work, please let me ask you again. Thank you for everything.

Fatone
Thank you for the advice, Indranil. I will be careful about the name from now on. It was good to be reminded before I embarrassed myself.

Fatone
Participants (4): Barton Chittenden, Fatone, Indranil Das Gupta, Tajoli Zeno