2009/11/2 Sébastien Hinderer <Sebastien.Hinderer@snv.jussieu.fr>:
> That's what I don't know -- how I want them to behave. My guess is that everything should be disallowed because no page has a meaning without arguments... What I would like to know is whether this guess is correct or not.
I think it's not, actually. I have set up Koha for my customer #1 at sksk.bibkat.no, and without doing anything I now get 7,000+ hits in Google when I search for site:sksk.bibkat.no:

http://www.google.com/search?q=site%3Asksk.bibkat.no

The very first hit is a page of search results for the Norwegian word for birds. How they figured that out, I have no idea! The strange thing is that this catalogue is hardly linked to from anywhere, so they must have some way to index the catalogue other than just following links.

I notice that on the second page of search results there are several MARC views, hardly what you want patrons to find first. So perhaps there should be some way to tell bots to index only the "ordinary" views, not things like MARC?

Also, having a robots.txt just to say "index everything" sounds like a good idea, to avoid the "robots.txt not found" messages in the error log.

Regards,
Magnus
libriotech.no
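
To make the two robots.txt ideas above concrete, here is a rough sketch, not a tested Koha configuration. It assumes the MARC view is served from a script path like /cgi-bin/koha/opac-MARCdetail.pl and the ordinary view from /cgi-bin/koha/opac-detail.pl; adjust the paths to whatever your OPAC actually uses. The helper name is_allowed is made up for illustration. It uses Python's standard urllib.robotparser to check what a rule-abiding bot would be allowed to fetch.

```python
from urllib.robotparser import RobotFileParser

# Variant 1: "index everything". An empty Disallow permits all URLs,
# and the file's mere existence avoids "robots.txt not found" log noise.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

# Variant 2: keep bots out of the MARC view only.
# The script path is an assumption; check it against your own OPAC URLs.
NO_MARC = """\
User-agent: *
Disallow: /cgi-bin/koha/opac-MARCdetail.pl
"""


def is_allowed(robots_txt, url_path, agent="Googlebot"):
    """Return True if robots_txt lets the given agent fetch url_path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url_path)


if __name__ == "__main__":
    ordinary = "/cgi-bin/koha/opac-detail.pl?biblionumber=1"
    marc = "/cgi-bin/koha/opac-MARCdetail.pl?biblionumber=1"
    for name, rules in (("allow all", ALLOW_ALL), ("no MARC", NO_MARC)):
        print(name, is_allowed(rules, ordinary), is_allowed(rules, marc))
```

Note that robots.txt is only advisory: well-behaved crawlers honour it, but it does not actually block access to the MARC pages.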