add some additional URL paths to robots.txt
My Apache logs show that robots are heavily indexing some Redmine URLs, and it seems best to have them blocked by robots.txt:
/projects/N/wiki/* (where N is the numeric project id)
#2 Updated by mark burdett over 8 years ago
Or I can easily block these via my Apache config. But I do think they should be added to robots.txt by default. I also wonder how Googlebot and others are even finding some of these non-canonical paths. It could point to a bug elsewhere that is generating links to these paths.
#5 Updated by mark burdett about 8 years ago
- File robots.txt.patch added
The wiki pages that this patch blocks are not the canonical paths; they use the numeric project id rather than the project name.
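To make the distinction concrete, here is a rough Python sketch of how the non-canonical wiki paths could be told apart from the canonical ones, for example when scanning an access log. The pattern and the function name are my own illustration and are not taken from the patch; it assumes the non-canonical form is exactly /projects/N/wiki/... with an all-digit N.

```python
import re

# Hypothetical pattern (not from the patch): a wiki path whose project
# segment consists only of digits, i.e. the numeric project id form.
NUMERIC_WIKI_PATH = re.compile(r"^/projects/\d+/wiki(/|$)")


def is_numeric_id_wiki_path(path: str) -> bool:
    """Return True if the path looks like /projects/N/wiki/... with numeric N."""
    return NUMERIC_WIKI_PATH.match(path) is not None
```

Under that assumption, a path such as /projects/12/wiki/Start would be flagged, while /projects/myproject/wiki/Start would not.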
I now realize that the initial version of this patch blocked the individual issue pages; I intended to only block /issues? -- i.e. the global issue search page.
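The difference between the two rules comes down to prefix matching. The sketch below is a deliberately simplified model of how a crawler might apply a Disallow rule (plain prefix match on the path plus query string, no wildcard handling). The function name is hypothetical, and real crawlers may differ in the details.

```python
def blocked_by(rule: str, url_path: str) -> bool:
    """Simplified robots.txt matching: the rule blocks any path it prefixes.

    This ignores wildcards, Allow rules, and percent-encoding; it only
    illustrates why "/issues" and "/issues?" behave differently.
    """
    return url_path.startswith(rule)


# "/issues" is a prefix of individual issue pages, so it blocks them too.
too_broad = blocked_by("/issues", "/issues/123")

# "/issues?" only prefixes URLs that carry a query string on /issues.
search_blocked = blocked_by("/issues?", "/issues?set_filter=1")
issue_blocked = blocked_by("/issues?", "/issues/123")
```

In this model, the rule /issues? leaves /issues/123 crawlable while still covering the filtered search URLs. It also leaves a bare /issues with no query string unblocked.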