A good day, אהלן, こんにちは,

On Thu, 10 Sept 2020 at 03:16, Charles Kelley <cmkelleymls@gmail.com> wrote:
Hi, all!
My library has an extensive catalog in CJK, Russian, and a few other languages that write in non-Roman writing systems.
Oh, mine too :).
We can import such records into Koha and export the records from Koha. Provided we make sure the applications (BBEdit, EndNote, Excel, Word, to name a few), browsers (Chrome, Firefox, IE, Safari, etc.), and OSs (Linux, Mac OS, and Windows) can handle UTF-8, all is well -- for importing and exporting. But getting Koha to search CJK has been fruitless, and we are terribly frustrated.
Been there.
How does one get Koha to search CJK (and Arabic, Cyrillic, Hebrew, and other non-Roman writing systems, for that matter)?
Koha 20.05 running on Debian 9.4 "Stretch".
Tuning Zebra to search CJK or languages written in Arabic script was a real pain; with Elasticsearch it is a no-brainer. With the ICU module enabled, it works very well for CJK and handles glyph similarities.

For example, the Library of Congress catalogues Farsi with alef maksura (U+0649) instead of yeh (U+06CC). We imported the Farsi records from the Library of Congress and were unable to find the documents when searching with a Farsi keyboard, which produces the letter yeh. You can configure Zebra to handle this by telling it that U+0649 = U+06CC. With Elasticsearch and ICU you don't have to; it just works.

At some point we lost the ability to search CJK with Zebra and never understood how to get it back. Zebra is certainly a very good search engine, but it's weird and hard to tune. With Elasticsearch we don't even have to ask ourselves how to tweak it. It just works.

Note that for Chinese, enabling QueryAutoTruncate with Elasticsearch may lead to weird results when you type a full Chinese name or title. This was the case as of 18.05; I haven't checked whether it has improved since then. We enabled truncation only when “*” is added at the end of a word.

Best regards, יאללה ביי, それでは、また,

--
Nicolas Legrand
Technical administration and development of the library management system
Bibliothèque universitaire des langues et civilisations
65 rue des Grands Moulins F-75013 PARIS
T +33 1 81 69 18 22
www.bulac.fr
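
P.S. A minimal sketch of the U+0649 = U+06CC equivalence mentioned above, in plain Python. This is only an illustration of the idea of folding both the indexed text and the query to the same character; it is not Zebra or Elasticsearch configuration, and the names `FOLD` and `fold` are hypothetical.

```python
# Hypothetical folding table: treat alef maksura (U+0649) as Farsi yeh (U+06CC).
FOLD = str.maketrans({"\u0649": "\u06cc"})


def fold(text: str) -> str:
    """Apply the same character folding to indexed text and to queries."""
    return text.translate(FOLD)


# The word "Farsi" as it might appear in an imported record (final U+0649) ...
catalogued = "\u0641\u0627\u0631\u0633\u0649"
# ... and the same word typed on a Farsi keyboard (final U+06CC).
typed = "\u0641\u0627\u0631\u0633\u06cc"

print(catalogued == typed)              # the raw strings differ
print(fold(catalogued) == fold(typed))  # they match once both are folded
```

The point is that the folding has to be applied on both sides, at index time and at query time, otherwise the two spellings still do not meet.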