[Koha] Problems with the facebook web crawler
Nigel Titley
nigel at titley.com
Fri Jul 26 01:04:26 NZST 2024
On 25/07/2024 13:55, Jason Boyer wrote:
> While they do ignore robots.txt they do at least supply a recognizable
> user agent that you can just block:
>
> RewriteEngine on
> RewriteCond %{HTTP_USER_AGENT} "facebookexternalhit|other|bots|here"
> RewriteCond %{REQUEST_URI} "!403\.pl" [NC]
> RewriteRule "^.*" "-" [F]
>
> Note that the second RewriteCond is required or you'll end up in a
> redirect loop: without it, the rules would also match Apache's own
> internal redirect to the 403 error document. They will still be sending
> you requests, but at least they won't tie up a Plack backend doing
> useless work. I haven't tried returning 5xx errors to see whether that
> makes them back off, but I doubt they would take much notice.
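For context, the directives above might sit in a VirtualHost along these lines. This is a hedged sketch, not Koha's shipped configuration: the ServerName and the ErrorDocument path are assumptions, inferred only from the "403.pl" exemption in the RewriteCond.

```apache
# Sketch only: assumes mod_rewrite is enabled and a VirtualHost
# serving the Koha OPAC. ServerName and ErrorDocument path are
# hypothetical.
<VirtualHost *:80>
    ServerName opac.example.org

    # Hypothetical error document; the "!403\.pl" condition below
    # exempts it so the forbidden response can actually be served.
    ErrorDocument 403 /cgi-bin/koha/errors/403.pl

    RewriteEngine on
    # Match known crawler user agents (extend the alternation as needed).
    RewriteCond %{HTTP_USER_AGENT} "facebookexternalhit|other|bots|here"
    # Skip the error document itself, or Apache's internal redirect to
    # it would be forbidden too, causing a loop.
    RewriteCond %{REQUEST_URI} "!403\.pl" [NC]
    # Forbid every request matching the conditions above.
    RewriteRule "^.*" "-" [F]
</VirtualHost>
```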
Brilliant... that should help a lot. I'll also try Michael's approach
for comparison.
Nigel