A bot that scans HackerNews' top articles for links to sites that require a subscription, then finds archived versions of those articles and posts links to them.
Enjoy a cute robot I made with DALL-E 2.
The program follows 4 steps:
1. It collects all of HackerNews' top, new, and best posts.
2. Identifies which posts link to articles on websites with subscription blocks
(i.e. sites where you must pay to keep reading after a certain number of articles).
3. Locates the articles' archived snapshots.
- The Wayback Machine is mostly used because archive.today (archive.ph) sits behind Cloudflare, which has a very robust bot detector.
4. Leaves a comment with a link to the archived snapshot.
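
As a rough illustration of steps 1 to 3, here is a minimal sketch rather than the bot's actual code. It assumes the public HackerNews Firebase API and the Wayback Machine availability endpoint, and it uses only the standard library so it runs without installing anything (the project itself lists requests). The domain list and all function names are hypothetical. Step 4 is left out because the public HackerNews API is read-only.

```python
import json
from urllib.parse import urlencode, urlparse
from urllib.request import urlopen

HN_API = "https://hacker-news.firebaseio.com/v0"
WAYBACK_API = "https://archive.org/wayback/available"

# Hypothetical sample list; the project's real list is not shown in this README.
PAYWALLED_DOMAINS = {"nytimes.com", "wsj.com", "economist.com"}


def fetch_json(url):
    """Download a URL and decode its JSON body."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def story_ids():
    """Step 1: collect the ids of the top, new, and best stories."""
    ids = set()
    for feed in ("topstories", "newstories", "beststories"):
        ids.update(fetch_json(f"{HN_API}/{feed}.json"))
    return ids


def is_paywalled(url, domains=PAYWALLED_DOMAINS):
    """Step 2: check whether the link's host is a listed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in domains)


def closest_snapshot(availability):
    """Pull the snapshot URL out of a Wayback availability response, if any."""
    closest = availability.get("archived_snapshots", {}).get("closest", {})
    return closest.get("url") if closest.get("available") else None


def find_snapshot(article_url):
    """Step 3: ask the Wayback Machine for the closest archived snapshot."""
    query = urlencode({"url": article_url})
    return closest_snapshot(fetch_json(f"{WAYBACK_API}?{query}"))


def archived_links(limit=20):
    """Yield (story id, snapshot URL) pairs for the newest `limit` stories."""
    for item_id in sorted(story_ids(), reverse=True)[:limit]:
        item = fetch_json(f"{HN_API}/item/{item_id}.json") or {}  # deleted items are null
        url = item.get("url")
        if url and is_paywalled(url):
            snapshot = find_snapshot(url)
            if snapshot:
                yield item_id, snapshot
```

Calling `list(archived_links())` would make live network requests. Matching on the hostname rather than the whole URL avoids false positives when a domain name merely appears in a path.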
Unfortunately, the bot was blocked by the moderator of HackerNews, since HackerNews doesn't allow bots. The software, however, performs flawlessly (P.S. I'm joking; I'm sure there are 100+ bugs).
I enjoyed learning about building a bot and using APIs while also discovering a whole new subfield of computer science.
- I hope someone finds this project entertaining or useful
- Feel free to expand on this and build something better
Contents of requirements.txt
beautifulsoup4==4.11.1
regex==2022.7.9
requests==2.28.1
urllib3==1.26.11