This is a script that crawls a website and saves everything to either the data or the download folder, depending on where the data comes from.
HTML pages and their resources go to /data.
Everything else goes to /download.
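
For example, with "path" set to /home/foo/test (see the config below), the result would roughly look like the layout sketched here; this is only an illustration based on the description above, the actual file names depend on the crawled site:

/home/foo/test/data/index.html          (HTML pages and their resources)
/home/foo/test/data/css/style.css
/home/foo/test/download/whitepaper.pdf  (everything else, e.g. PDFs or archives)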
Install the script:
npm install -g nightmare-spider
Run the script with a config JSON:
nightmare-spider /path/to/config.json
The config.json should look like this:
{
  "ssl": true,               // use http or https? - default: http
  "domain": "ethereum.org",  // only crawl links on this domain
  "start": "ethereum.org",   // start crawling here
  "path": "/home/foo/test",  // files are saved here - must be an absolute path
  "maxConnections": 10       // number of simultaneous connections - default: 10
}
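
Note that standard JSON does not allow comments, so unless the parser strips them, a real config file should omit them. A minimal comment-free example with the same placeholder values:

{
  "ssl": true,
  "domain": "ethereum.org",
  "start": "ethereum.org",
  "path": "/home/foo/test",
  "maxConnections": 10
}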