Hi!
My library consortium is hosted by ByWater Solutions. Several times during the last seven days, Koha has been slow, followed by an outage of thirty to sixty minutes. Then things are fine for a while, until the slowness and outage recur. ByWater is addressing these issues, but I would like to know whether anyone else's sites or clients are having issues like these.
I am no expert on the issue, but we see this a lot on our open access repositories. There we were able to get it under control for the time being using `fail2ban`. As robots.txt tends to get ignored, especially by the AI (I am sure of the meaning of `A`, but I doubt the common translation of `I`...), the setup on our end is quite customized to the URLs exposed by the repos. For the jail config we currently use

```
[apache-proxy]
enabled  = true
filter   = apache-proxy
action   = hostsdeny-proxy[name = apache-proxy]
logpath  = /opt/invenio/var/log/apache-ssl.log
# ban a client after maxretry matching requests within findtime seconds
maxretry = 12
findtime = 5
# ban duration in seconds (6000 s = 100 minutes)
bantime  = 6000
```

However, for this to work it's imperative that your proxy pass on the origin of the request (`X-Forwarded-For`).
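To make the effect of `maxretry`, `findtime` and `bantime` a bit more tangible, here is a minimal Python sketch of the counting rule they describe. It is a simplified model, not fail2ban's actual implementation. The function name `find_bans` and the sample addresses are made up for illustration, and it assumes the client IP has already been taken from `X-Forwarded-For`.

```python
from collections import defaultdict, deque

MAXRETRY = 12   # matching requests allowed inside the window before a ban
FINDTIME = 5    # window length in seconds
BANTIME = 6000  # ban duration in seconds


def find_bans(hits, maxretry=MAXRETRY, findtime=FINDTIME, bantime=BANTIME):
    """Return {ip: ban_expiry} for clients that exceed the rate limit.

    `hits` is an iterable of (timestamp_seconds, client_ip) pairs sorted by
    time. This is a hypothetical helper, not part of fail2ban.
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    banned = {}                  # ip -> time at which the ban expires
    for ts, ip in hits:
        if banned.get(ip, 0) > ts:
            continue  # client is currently banned, ignore the request
        window = recent[ip]
        window.append(ts)
        # Forget requests that are older than the findtime window.
        while window and ts - window[0] > findtime:
            window.popleft()
        if len(window) >= maxretry:
            banned[ip] = ts + bantime
            window.clear()
    return banned


# Made-up traffic: a crawler at 4 requests/second, a reader at 1 per 10 s.
crawler = [(i * 0.25, "203.0.113.7") for i in range(20)]
reader = [(i * 10.0, "198.51.100.23") for i in range(20)]
bans = find_bans(sorted(crawler + reader))
print(bans)  # only the crawler's address appears
```

With the values from the jail above, the crawler is banned on its twelfth request (2.75 s in), while the slow reader never has more than one request inside the window.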
I am also trying to get educated on modern internet threats and causes for issues like this.
It might also be worthwhile to check with the repository community. With the exposed full texts, repositories provide much more valuable content for the so-called AI than just the bibliographic description. Additionally, in the past we saw quite aggressive bots harvesting the full texts for non-AI-related uses, so they have been nagging us for quite some time. But unfortunately, there is no simple solution.

--
Kind regards,
Alexander Wagner
Deutsches Elektronen-Synchrotron DESY
Library and Documentation
Building 01d Room OG1.444
Notkestr. 85
22607 Hamburg
phone: +49-40-8998-1758
e-mail: alexander.wagner@desy.de