Scrapes the given website for internal links and saves the pages it finds to web.archive.org.
I assume you have already installed Go. (Go installation manual)
Download the dependencies by executing the following two go get commands:
go get -u github.com/simonfrey/proxyfy
go get -u github.com/PuerkitoBio/goquery
Just clone the git repo:
git clone https://github.com/simonfrey/save_to_web.archive.org.git
Navigate into the directory of the git repo.
Execute with:
go run main.go http[s]://[yourwebsite.com]
Replace http[s]://[yourwebsite.com] with the URL of the website you want to scrape and save.
Additional command line arguments:
-p
for proxying the requests
-i
for also crawling internal URLs (e.g. /test/foo)
So if you want to crawl internal links as well and route the requests through a proxy, the command would be:
go run main.go -p -i http[s]://[yourwebsite.com]