[Koha] Slowness & outages

Coehoorn, Joel jcoehoorn at york.edu
Tue Jun 24 01:42:52 NZST 2025


We have a similar issue with misbehaving AI crawlers, especially ones trying
to crawl the opac-search.pl page, which should be restricted by robots.txt.
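For reference, a robots.txt rule covering that page might look like the
following (the path assumes Koha's standard /cgi-bin/koha/ layout; note that
well-behaved crawlers honour this, and the misbehaving ones are exactly the
ones that don't):

```
User-agent: *
Disallow: /cgi-bin/koha/opac-search.pl
```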

fail2ban was only a partial success, a stopgap that didn't really fix
things.

What is working MUCH better is setting the Apache MaxRequestWorkers option
(called MaxClients on Apache versions before 2.4) in the Apache
configuration. I haven't needed to manually intervene with our server for
this issue since setting the option. We still have some outages, but they
are significantly less frequent (maybe one or two every other week), and the
server typically recovers on its own by the time UptimeRobot detects it and
sends the notification to me. Further tuning of the value could probably
eliminate the problem for us entirely. As is often the case, the best value
here will depend on your specific server, database size, configuration, and
load, but 15 can be a nice starting place.
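As a sketch, assuming Apache 2.4 with the prefork MPM (the file path varies
by distribution), the setting looks like:

```
# e.g. /etc/apache2/mods-available/mpm_prefork.conf on Debian/Ubuntu
<IfModule mpm_prefork_module>
    # Cap the number of simultaneous request-handling workers so a
    # crawler burst queues instead of exhausting RAM; 15 is only the
    # starting value suggested above -- tune it for your own server.
    MaxRequestWorkers 15
</IfModule>
```

On Apache before 2.4 the same directive is spelled MaxClients.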

*Joel Coehoorn*
Director of Information Technology
*York University*
Office: 402-363-5603 | jcoehoorn at york.edu | york.edu



On Mon, Jun 23, 2025 at 1:49 AM Wagner, Alexander <alexander.wagner at desy.de>
wrote:

> Hi!
>
> > My library consortium is hosted by ByWater Solutions and during the last
> > seven days, several times Koha has been slow and then there's an outage
> > for thirty to sixty minutes then things are fine for a while, then
> > there's more slowness and outage. ByWater is addressing these issues,
> > but I want to know if anyone else's sites or clients are having issues
> > like these?
>
> I am no expert on the issue, but we see this a lot on our open access
> repositories. There we were able to get it under control for the time being
> using `fail2ban`. As robots.txt tends to get ignored, especially by the AI
> (I am sure of the meaning of `A` but I doubt the common translation of
> `I`...), the setup on our end is quite customized to the URLs exposed by the
> repos. For the jail config we currently use
>
> ```
> [apache-proxy]
> enabled = true
> filter = apache-proxy
> action = hostsdeny-proxy[name = apache-proxy]
> logpath = /opt/invenio/var/log/apache-ssl.log
> maxretry = 12
> findtime = 5
> bantime = 6000
> ```
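> The jail above refers to a matching filter in filter.d/apache-proxy.conf.
> Since our real filter is customized to our repositories' URLs, the regex
> below is only a sketch (`/search` is a placeholder; match it to your own
> log format and whichever URLs are being hammered):
>
> ```
> [Definition]
> # <HOST> is fail2ban's token for the client address in the log line.
> failregex = ^<HOST> .*"GET /search
> ignoreregex =
> ```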
>
> However, for this to work it's imperative that your proxy pass on the
> origin of the request (`X-Forwarded-For`).
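> One way to do that on the backend, as a sketch assuming Apache 2.4 with
> mod_remoteip enabled (the proxy address 192.0.2.10 is an example), is to
> trust the proxy's header and log the restored client address:
>
> ```
> # Backend Apache: restore the real client IP from X-Forwarded-For so
> # fail2ban bans the origin of the request, not the proxy itself.
> RemoteIPHeader X-Forwarded-For
> RemoteIPTrustedProxy 192.0.2.10
> # %a is the client IP after mod_remoteip has rewritten it.
> LogFormat "%a %l %u %t \"%r\" %>s %b" combined-xff
> ```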
>
> > I am also trying to get educated on modern internet threats and causes
> > for issues like this.
>
> It might be worthwhile to check with the repository community as well.
> With the exposed full texts, repositories provide much more valuable
> content for the so-called AI than just the bibliographic description.
> Additionally, in the past we saw quite aggressive bots harvesting the full
> texts for non-AI-related uses, so they have been nagging us for quite some
> time. But unfortunately, there is no simple solution.
>
> --
> Kind regards,
>
> Alexander Wagner
>
> Deutsches Elektronen-Synchrotron DESY
> Library and Documentation
>
> Building 01d Room OG1.444
> Notkestr. 85
> 22607 Hamburg
>
> phone:  +49-40-8998-1758
> e-mail: alexander.wagner at desy.de
> _______________________________________________
>
> Koha mailing list  http://koha-community.org
> Koha at lists.katipo.co.nz
> Unsubscribe: https://lists.katipo.co.nz/mailman/listinfo/koha
>