[Koha] Block web crawlers Bots
Michael Kuhn
mik at adminkuhn.ch
Fri Jan 28 23:40:52 NZDT 2022
Hi
Instead of blocking web crawlers and bots that want to index your
records (which is generally a good thing), I recommend that you
implement a sitemap:
* https://koha-community.org/manual/21.11/en/html/cron_jobs.html#sitemap
* https://lists.katipo.co.nz/public/koha/2020-November/055401.html
* https://github.com/Koha-Community/Koha/blob/master/misc/cronjobs/sitemap.pl
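For illustration, a cron entry along these lines can regenerate the
sitemap every night. This is only a sketch: the script path, the --dir
option and the instance name "library" are assumptions based on a
Debian package install, so check "sitemap.pl --help" and your own setup
first.

  # /etc/cron.d/koha-sitemap
  # Rebuild the sitemap nightly at 03:00 and write it into the OPAC
  # htdocs, running inside the instance's environment via koha-shell:
  0 3 * * * root koha-shell -c "perl /usr/share/koha/bin/cronjobs/sitemap.pl --dir /usr/share/koha/opac/htdocs" library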
This way the web crawlers will just fetch your sitemap instead of every
single record in your database, which will decrease memory usage immensely.
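You can also point the crawlers at the generated file from your
robots.txt with the standard Sitemap directive. Again only a sketch:
the host name is a placeholder, and the index file name depends on what
sitemap.pl actually wrote into your htdocs directory.

  # robots.txt in the OPAC document root:
  # let well-behaved crawlers find the sitemap index instead of
  # crawling every single record
  Sitemap: https://opac.example.org/sitemapindex.xml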
Blocking web crawlers with a firewall can get very hairy because many
bots do not come from just one IP address.
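If you still need to keep individual bots out, matching the User-Agent
in the OPAC's Apache configuration is usually more practical than
firewall rules. A rough sketch for Apache 2.4, using two of the bot
names from your log (I left Googlebot out on purpose, since being
indexed is generally a good thing); put it inside the OPAC virtual host
and adjust the list to taste:

  # flag requests whose User-Agent matches one of the listed bots (mod_setenvif)
  BrowserMatchNoCase "PetalBot|AhrefsBot" bad_bot
  # and deny those requests (mod_authz_core)
  <Location "/">
      <RequireAll>
          Require all granted
          Require not env bad_bot
      </RequireAll>
  </Location>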
Best wishes: Michael
--
Managing Director · Certified Librarian BBS, IT Specialist with Swiss Federal Certificate
Admin Kuhn GmbH · Pappelstrasse 20 · 4123 Allschwil · Switzerland
T 0041 (0)61 261 55 61 · E mik at adminkuhn.ch · W www.adminkuhn.ch
On 28.01.22 at 06:04, JITHIN N wrote:
> My server memory usage is increasing because some web search engine bots
> are accessing the OPAC. I tried putting a robots.txt file in
> /usr/share/koha/opac/htdocs containing:
>
> User-agent: *
> Disallow: /
>
> But still, my server is getting loaded by bots. The Apache log file shows
> bot names like PetalBot, Googlebot, AhrefsBot, etc. How can I block all
> these bots' access to my OPAC pages?
>
> With Regards
>
> Jithin N
>
> *System Information*
>
> Koha 21.05
>
> Debian 10