domain reputation DB #696

Closed
msimerson opened this issue Oct 6, 2014 · 11 comments

Comments

@msimerson
Member

Along with URIBL, WoT would make another excellent check against incoming email.

@smfreegard
Collaborator

It would - except I can't see that the data is publicly available?

@msimerson
Member Author

Scroll down to the bottom of that page and click the Developers link to find the API info page.

@smfreegard
Collaborator

Forgot to mention that I looked at this after you posted the above. Personally, I think the T&Cs are a bit of a showstopper:

In addition to the WOT Terms of Service, the following restrictions apply to using the API:

The API is free for individuals and non-commercial use.
You must not make more than 25000 API requests during any 24 hour period.
You must limit your request rate to at most 10 requests per second.
If you are unable to comply with these restrictions, please contact us regarding partnership or commercial offering.

Furthermore, if you use the API in your application, we recommend the following:
You should credit WOT in your application. You can use our badges, for example.
You should not request the same information more than once during any 30 minute period, but should use a local cache for repeated requests instead.

I manually sampled some data and I don't really think the work involved would be worth it; for example, it didn't find anything for the URIs that I tried.
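
For reference, the limits quoted above (25000 requests per 24 hours, 10 requests per second, no repeat request for the same item within 30 minutes) could be enforced client-side roughly as follows. This is only a sketch: `GuardedLookup`, `QuotaExceeded`, and the `fetch` callable are hypothetical names, and `fetch` stands in for whatever function actually calls the remote API, which is not shown here.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, Tuple


class QuotaExceeded(Exception):
    """Raised when one more request would break a rate limit."""


class GuardedLookup:
    """Wraps a lookup function with a TTL cache and two rate limits.

    `fetch` is a caller-supplied function (an assumption of this sketch);
    the defaults mirror the limits quoted in the terms above.
    """

    def __init__(
        self,
        fetch: Callable[[str], object],
        clock: Callable[[], float] = time.monotonic,
        daily_limit: int = 25000,
        per_second: int = 10,
        cache_ttl: float = 30 * 60,
    ) -> None:
        self.fetch = fetch
        self.clock = clock
        self.daily_limit = daily_limit
        self.per_second = per_second
        self.cache_ttl = cache_ttl
        self.cache: Dict[str, Tuple[float, object]] = {}
        self.sent: Deque[float] = deque()  # timestamps of real requests

    def lookup(self, domain: str) -> object:
        now = self.clock()

        # Serve repeats from the local cache for 30 minutes.
        hit = self.cache.get(domain)
        if hit is not None and now - hit[0] < self.cache_ttl:
            return hit[1]

        # Forget requests that are more than 24 hours old.
        while self.sent and now - self.sent[0] >= 24 * 60 * 60:
            self.sent.popleft()
        if len(self.sent) >= self.daily_limit:
            raise QuotaExceeded("daily limit reached")

        # Count requests made during the last second (newest first).
        recent = 0
        for stamp in reversed(self.sent):
            if now - stamp >= 1.0:
                break
            recent += 1
        if recent >= self.per_second:
            raise QuotaExceeded("per-second limit reached")

        self.sent.append(now)
        result = self.fetch(domain)
        self.cache[domain] = (now, result)
        return result
```

A caller would catch `QuotaExceeded` and skip the check for that message rather than block delivery. On a busy mail server the daily cap is the limit most likely to be hit first.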

@msimerson
Member Author

Maybe WoT isn't quite the right tool. In much of the current spam, the domains are disposable and populated with "all the right DNS" (SPF, FCrDNS, HELO hostname, etc.). Some amount of time then elapses, so the domains fall off the "newly observed domains" lists (nod, sem-fresh, etc.), and then the campaigns begin. Another way to detect these disposable domains is to perform a Google search: in nearly every case, they have zero matches. I can't think of many real-world cases where anyone would want to receive email from a domain with no Google visibility. Thoughts?
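
To make the idea concrete, here is a rough sketch of the "no search visibility" check. Everything in it is hypothetical: `search_visibility` is an illustrative name, and `search_hit_count` is a caller-supplied function that would wrap whichever search service is used, since no particular search API is assumed here.

```python
from typing import Callable


def search_visibility(domain: str, search_hit_count: Callable[[str], int]) -> str:
    """Classify a sender domain by how many search results mention it.

    Returns "none" when the search reports zero matches, "visible" when it
    reports at least one, and "unknown" when the lookup itself fails, so that
    a search outage is never mistaken for a disposable domain.
    """
    name = domain.strip().lower().rstrip(".")
    try:
        hits = search_hit_count(name)
    except Exception:
        return "unknown"
    return "none" if hits == 0 else "visible"
```

Only the "none" result would count against a message, and probably as one signal among several rather than as grounds for outright rejection.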

@msimerson changed the title from "WoT domain reputation DB" to "domain reputation DB" on Nov 19, 2014
@msimerson
Member Author

PhishTank is focused exclusively on phishing, but it has a downloadable database, making it a viable source of data to check URLs against.
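
A minimal sketch of checking URLs against a locally stored copy of such a database. It assumes the download is a JSON array of records that each carry a `url` field; that layout is an assumption to verify against the real dump, and the function names are made up for illustration.

```python
import json
from typing import Set
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Lower-case the scheme and host, drop the fragment, default the path to "/"."""
    parts = urlsplit(url.strip())
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def load_listed_urls(json_text: str) -> Set[str]:
    """Build a lookup set from the dump (assumed: a list of {"url": ...} records)."""
    records = json.loads(json_text)
    return {normalize(record["url"]) for record in records if "url" in record}


def is_listed(url: str, listed: Set[str]) -> bool:
    """True when the normalized URL appears in the local copy."""
    return normalize(url) in listed
```

The set would be rebuilt whenever a fresh copy is downloaded, so each message only costs an in-memory lookup per URL.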

@msimerson
Member Author

Another one is Spam404. I don't see a data URL, but it appears it'd be pretty easy to scrape the web pages and maintain a local copy of the domain list.

@msimerson
Member Author

Artists Against 419 provides their DB via SOAP, so it could be pulled into a local DB and queried as well.

@smfreegard
Collaborator

PhishTank data is included in SURBL IIRC.

@msimerson
Member Author

PhishTank data is included in SURBL IIRC

And ClamAV UNOFFICIAL.

@msimerson
Member Author

for future reference: https://github.com/jpf/domain-profiler

@baudehlo
Collaborator

This ticket seems redundant. I'm going to close - re-open if you think it's important.


3 participants