# Legitbot

Ruby gem to make sure that an IP really belongs to a bot, typically a search engine.
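To try it, installation is the standard gem setup; a minimal sketch (the gem is published as `legitbot`):

```ruby
# Gemfile
gem "legitbot"
```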

## Usage

Suppose you have a web request and you would like to check that it is not disguised:

```ruby
bot = Legitbot.bot(user_agent, ip)
```

`bot` will be `nil` if no bot signature was found in the `User-Agent`. Otherwise, it will be an object with the following methods:

```ruby
bot.detected_as # => :google
bot.valid? # => true
bot.fake? # => false
```
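Putting the pieces together, a minimal sketch of request handling (the helpers `reject_request` and `log_crawler_visit` are hypothetical placeholders for your own logic):

```ruby
require "legitbot"

bot = Legitbot.bot(user_agent, ip)
if bot.nil?
  # No known bot signature in the User-Agent: treat as a regular client.
elsif bot.fake?
  # Claims to be a bot, but the IP does not belong to it: block or flag.
  reject_request
else
  # A verified bot, e.g. :google.
  log_crawler_visit(bot.detected_as)
end
```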

Sometimes you already know which search engine to expect. For example, you might be using rack-attack:

Rack::Attack.blocklist("fake Googlebot") do |req|
  req.user_agent =~ %r(Googlebot) && Legitbot::Google.fake?(req.ip)
end

Or if you do not like all those ghoulish crawlers stealing your content, evaluating it and getting ready to invade your site with spammers, then block them all:

```ruby
Rack::Attack.blocklist 'fake search engines' do |request|
  Legitbot.bot(request.user_agent, request.ip)&.fake?
end
```
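The inverse also works: if you rate-limit aggressively elsewhere, a rack-attack safelist can wave verified crawlers through (a sketch, assuming you want to admit every bot Legitbot can verify):

```ruby
Rack::Attack.safelist 'verified search engines' do |request|
  Legitbot.bot(request.user_agent, request.ip)&.valid?
end
```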

## Versioning

Semantic versioning with the following clarifications:

- MINOR version is incremented when support for new bots is added.
- PATCH version is incremented when validation logic for a bot changes (for example, when an IP list is updated).
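In practice this means even PATCH releases can change what gets blocked, so you usually want to pick them up quickly. One way to do that (an assumption about your setup, with an illustrative version number) is a pessimistic pin in the Gemfile:

```ruby
# Allow MINOR and PATCH updates (new bots, refreshed IP lists),
# but not a breaking MAJOR release.
gem "legitbot", "~> 1.0"
```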

## Supported

## License

Apache 2.0

## Other projects

- Play Framework variant in Scala: play-legitbot.
- Article *When (Fake) Googlebots Attack Your Rails App*.
- Voight-Kampff is a Ruby gem that detects bots by User-Agent.
- crawler_detect is a Ruby gem and Rack middleware to detect crawlers by a few different request headers, including User-Agent.
- Project Honeypot's http:BL can not only classify an IP as a search engine, but also label it as suspicious and report the number of days since its last activity. My implementation of the protocol in Scala is here.
- CIDRAM is a PHP routing manager with built-in support for validating bots.