[Koha] Elastic search for Arabic

Mohamad F Barham mbarham at birzeit.edu
Tue Sep 3 17:42:11 NZST 2024


Hi,

Our catalog is still in development; it contains both Arabic and English records.

You can visit it at https://koha.birzeit.edu/





Mohamad Barham

System Engineer | Information Technology Department

Birzeit University

P.O.Box. 14, Birzeit, Palestine

Tel: + 970 22982012 | Mob: +970 597 861929 | Ext: 5616

mbarham at birzeit.edu | www.birzeit.edu<http://www.birzeit.edu/>




________________________________
From: David Cook <dcook at prosentient.com.au>
Sent: Friday, August 30, 2024 3:40 AM
To: Mohamad F Barham <mbarham at birzeit.edu>
Cc: koha at lists.katipo.co.nz <koha at lists.katipo.co.nz>
Subject: Re: [Koha] Elastic search for Arabic




Hi Mohamad,

Does your collection only contain Arabic or does it contain multiple languages?

I've been considering moving an Arabic/French/English collection to Elasticsearch, so I'd love to hear more about your experience.

David Cook
Senior Software Engineer
Prosentient Systems
Suite 7.03
6a Glen St
Milsons Point NSW 2061
Australia

Office: 02 9212 0899
Online: 02 8005 0595

-----Original Message-----
Date: Thu, 29 Aug 2024 06:27:57 +0000
From: Mohamad F Barham <mbarham at birzeit.edu>
To: Fridolin SOMERS <fridolin.somers at biblibre.com>,
        "koha at lists.katipo.co.nz" <koha at lists.katipo.co.nz>
Subject: Re: [Koha] Elastic search for Arabic
Message-ID:
        <TYZPR01MB38233C5A86A7B129CF7D4ECFA3962 at TYZPR01MB3823.apcprd01.prod.exchangelabs.com>

Content-Type: text/plain; charset="utf-8"

Dear all,


An update on Elasticsearch for Arabic:

SOLVED

The solution was simple: use the filters from Elasticsearch's built-in Arabic analyzer (REF https://www.elastic.co/guide/en/elasticsearch/reference/7.17/analysis-lang-analyzer.html#arabic-analyzer )

In Kibana, I opened the biblio index settings and added:

# this protects the listed words from being stemmed

  "index.analysis.filter.arabic_keywords.keywords": [
    "الله"
  ],
  "index.analysis.filter.arabic_keywords.type": "keyword_marker",

# this defines the Arabic stemmer filter
  "index.analysis.filter.arabic_stemmer.type": "stemmer",
  "index.analysis.filter.arabic_stemmer.language": "arabic",


-------------

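Then add the filters to the existing analyzer (order is important: arabic_keywords must come before arabic_stemmer so that the protected words are marked before stemming runs)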


  "index.analysis.analyzer.analyzer_standard.filter": [
    "icu_folding",
    "arabic_keywords",
    "arabic_stemmer"
  ],
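
For anyone who prefers to script this instead of editing the settings in Kibana, here is a minimal sketch of the same settings as a nested request body. It only builds and prints the JSON; it does not talk to Elasticsearch. The function name and the base_filters parameter are hypothetical, and the body is assumed to be sent with PUT <index>/_settings. Note that Elasticsearch normally accepts analysis changes only while the index is closed, so the index would need to be closed first and reopened afterwards.

```python
import json


def build_arabic_analysis_settings(protected_words, base_filters=("icu_folding",)):
    """Return the analysis settings from this thread as a nested dict.

    Hypothetical helper: 'base_filters' is assumed to be whatever the
    analyzer already used before the Arabic filters were appended.
    """
    filters = list(base_filters) + ["arabic_keywords", "arabic_stemmer"]
    # The keyword_marker filter has to run before the stemmer,
    # otherwise the protected words get stemmed anyway.
    if filters.index("arabic_keywords") > filters.index("arabic_stemmer"):
        raise ValueError("arabic_keywords must come before arabic_stemmer")
    return {
        "analysis": {
            "filter": {
                "arabic_keywords": {
                    "type": "keyword_marker",
                    "keywords": list(protected_words),
                },
                "arabic_stemmer": {"type": "stemmer", "language": "arabic"},
            },
            "analyzer": {
                # Only the filter list is set here; the tokenizer is left as is.
                "analyzer_standard": {"filter": filters},
            },
        }
    }


settings = build_arabic_analysis_settings(["الله"])
print(json.dumps(settings, ensure_ascii=False, indent=2))
```

Passing the protected words as a list makes it easy to add more terms that should never be stemmed.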
----------------
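Then rebuild the index from the terminal (-b indexes bibliographic records, -c sets how many records are batched per commit, -p sets the number of parallel processes; the last argument is the instance name):

koha-elasticsearch --rebuild -b -c 2000 -p 8 koha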
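
To check that the analyzer behaves as intended, the _analyze API can be asked how it tokenizes a given word. The following is a rough sketch only: the URL http://localhost:9200 and the index name koha_koha_biblios are assumptions (adjust them to your setup), and the helper names are hypothetical. The request is built and printed but not sent; call analyze(req) against a running cluster to see the resulting tokens.

```python
import json
import urllib.request


def build_analyze_request(base_url, index, text, analyzer="analyzer_standard"):
    """Build (but do not send) a request to the index's _analyze endpoint."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(req):
    """Send the request and return the list of token strings."""
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    return [t["token"] for t in payload["tokens"]]


# Assumed host and index name; neither comes from the original post.
req = build_analyze_request("http://localhost:9200", "koha_koha_biblios", "الله")
print(req.get_method(), req.full_url)
```

If the keyword filter is working, a protected word should come back unstemmed, while other Arabic words should come back reduced to their stems.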










