Doc: robots.txt: lame attempt at preventing AI robots from scraping us
rouault committed Jan 27, 2025
1 parent fcafee9 commit fd34c6c
Showing 1 changed file with 81 additions and 0 deletions.
81 changes: 81 additions & 0 deletions doc/source/extra_path/robots.txt
@@ -2,3 +2,84 @@ User-agent: *
Allow: /en/stable/
Disallow: /en/
Sitemap: https://gdal.org/sitemap.xml

# Prevent AI scraping
# Source: https://neil-clarke.com/block-the-bots-that-feed-ai-models-by-scraping-your-website/

User-agent: CCBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Google-CloudVertexBot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Omgilibot
Disallow: /

User-agent: Omgili
Disallow: /

User-agent: FacebookBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: DuckAssistBot
Disallow: /

User-agent: AI2Bot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: Kangaroo Bot
Disallow: /

User-agent: PanguBot
Disallow: /

User-agent: ImagesiftBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: cohere-training-data-crawler
Disallow: /

User-agent: Meta-ExternalAgent
Disallow: /

User-agent: Meta-ExternalFetcher
Disallow: /

User-agent: Timpibot
Disallow: /

User-agent: Webzio-Extended
Disallow: /

User-agent: YouBot
Disallow: /
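
One way to sanity-check rules like these is to feed them to Python's standard `urllib.robotparser` and ask which agents may fetch which URLs. The sketch below is an illustration only and is not part of the commit: it reproduces just two of the blocked agents, and `SomeOtherBot` and the `/en/latest/` URL are hypothetical names chosen for the example. Keep in mind that robots.txt is advisory, so this only shows what a compliant crawler would conclude.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated copy of the rules from the diff above; only two of the
# blocked agents are reproduced to keep the sketch short.
ROBOTS_TXT = """\
User-agent: *
Allow: /en/stable/
Disallow: /en/
Sitemap: https://gdal.org/sitemap.xml

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

stable_url = "https://gdal.org/en/stable/index.html"
latest_url = "https://gdal.org/en/latest/index.html"  # hypothetical path

# Agents with their own group are denied everywhere; any other agent
# (here the made-up "SomeOtherBot") falls back to the "*" group.
for agent in ("GPTBot", "ClaudeBot", "SomeOtherBot"):
    for url in (stable_url, latest_url):
        print(agent, url, parser.can_fetch(agent, url))
```

`urllib.robotparser` applies the first matching rule within a group, whereas RFC 9309 specifies longest-match precedence. For the rules shown here both approaches agree, because `Allow: /en/stable/` is listed before the shorter `Disallow: /en/`.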
