- Add the ability to switch the crawling engine: headless Chrome or Node.js requests
- Migrate to ES6 or TypeScript and modern tooling
-- Should be possible to run crawler end-to-end tests - OK
-- Should be possible to run crawler unit tests, both on the console and in the browser - OK
-- Should be possible to run crawler examples - OK
-- Should be possible to publish an NPM package and install it - OK
-- Should be possible to lint TypeScript code - OK
-- Refactor the crawler, extract more classes, add types
--- Extract separate tests for Executor and Response - OK
-- Use Mocha and Chai instead of Jasmine for unit tests? Use headless Chrome or Firefox when running tests.
-- Test a failing request: verify that the number of concurrent requests is decreased as expected
-- Replace PhantomJS with headless Chrome when running unit tests
-- Fix linting errors
-- Source map support
-- Line coverage
- Add more e2e tests, switch from jasmine-node to a standard Node.js test runner?
- Add support for other request engines such as headless Chrome, make the engine configurable (see the request engine sketch at the end of this file)
- Provide an additional reactive-style API for the crawler (RxJS); the urls being crawled naturally form a stream of data (see the RxJS sketch at the end of this file)?
- Enable debugging: when the crawler is passed the debug: true option, log information about crawled urls, content, etc.?
- It should be possible to limit the number of requests not only by the number of requests per second but also
  by the number of concurrent active requests, e.g. maxConcurrentRequests: 5 (see the concurrency sketch at the end of this file).
  This should provide yet another sensible approach to limiting the bandwidth being used.
- Add the ability to avoid crawling non-text urls such as audio and video files, which can take a lot of bandwidth (see the content-type sketch at the end of this file)
- In case there is an error when crawling a url, the url can be queued again and only fully abandoned after another 2 repeated failures (see the retry sketch at the end of this file)
- Limit the section of the page that should be crawled (https://github.com/antivanov/js-crawler/issues/15)
- Normalize urls, so that https://github.com and https://github.com/ are considered to be the same url (see the normalization sketch at the end of this file)
- Refactor the crawler, add unit tests and the build infrastructure
- Add end-to-end tests for the following cases:
-- Redirects
-- Binary content
-- Several pages referencing each other (tree-like structure)
-- Handle cycles: page 1 references page 2 and page 2 references page 1
- Review and test more thoroughly the API for forgetting crawled urls
- Provide a more intuitive API for crawling several urls
- Move sources under the src directory, then bundle during the build
- Code coverage
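
Request engine sketch: a rough idea of what a configurable request engine could look like. The RequestEngine interface, the FetchEngine class and the engine option are assumptions for illustration, not part of the current crawler.

```typescript
// A minimal sketch of a pluggable request engine; names are hypothetical.
interface CrawlResponse {
  url: string;
  status: number;
  body: string;
}

interface RequestEngine {
  get(url: string): Promise<CrawlResponse>;
}

// Plain Node.js engine based on the built-in fetch (Node 18+).
class FetchEngine implements RequestEngine {
  async get(url: string): Promise<CrawlResponse> {
    const response = await fetch(url);
    return { url, status: response.status, body: await response.text() };
  }
}

// A headless Chrome engine could implement the same interface with puppeteer
// (page.goto(url) followed by page.content()), and the crawler could then be
// configured with something like configure({ engine: new FetchEngine() }).
```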
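
RxJS sketch: a rough idea of a reactive wrapper. It assumes the existing callback-based crawl({url, success, failure, finished}) API and simply adapts it to an RxJS Observable; crawl$ and CrawledPage are hypothetical names.

```typescript
import { Observable } from 'rxjs';
// js-crawler ships no typings, so this import is illustrative only.
import Crawler from 'js-crawler';

interface CrawledPage {
  url: string;
  status: number;
  content: string;
}

function crawl$(startUrl: string, depth: number): Observable<CrawledPage> {
  return new Observable<CrawledPage>(subscriber => {
    new Crawler().configure({ depth }).crawl({
      url: startUrl,
      success: (page: CrawledPage) => subscriber.next(page), // each crawled url is an event
      failure: (page: CrawledPage) => subscriber.next(page), // failed urls are events too
      finished: () => subscriber.complete()                  // the whole crawl is done
    });
  });
}

// Usage: the stream of pages can be filtered, throttled, merged with other streams, etc.
crawl$('https://github.com', 2).subscribe(page => console.log(page.status, page.url));
```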
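
Concurrency sketch: a rough idea of limiting the number of concurrent active requests in addition to the requests-per-second limit. ConcurrencyLimiter is a hypothetical helper showing the idea behind a maxConcurrentRequests option, not the actual implementation.

```typescript
type Task<T> = () => Promise<T>;

class ConcurrencyLimiter {
  private active = 0;
  private waiting: Array<() => void> = [];

  constructor(private maxConcurrentRequests: number) {}

  async run<T>(task: Task<T>): Promise<T> {
    while (this.active >= this.maxConcurrentRequests) {
      // Wait until one of the active requests finishes and frees a slot.
      await new Promise<void>(resolve => this.waiting.push(resolve));
    }
    this.active++;
    try {
      return await task();
    } finally {
      this.active--;
      this.waiting.shift()?.(); // wake up the next queued request, if any
    }
  }
}

// Usage: at most 5 urls are fetched at the same time (Node 18+ for built-in fetch).
const limiter = new ConcurrencyLimiter(5);
['https://github.com', 'https://github.com/antivanov/js-crawler']
  .forEach(url => limiter.run(() => fetch(url)));
```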
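
Content-type sketch: a rough idea of skipping non-text urls before downloading them. The extension list and the HEAD-based Content-Type check are illustrative assumptions.

```typescript
const BINARY_EXTENSIONS = ['.mp3', '.mp4', '.avi', '.mkv', '.wav', '.flac', '.mov'];

function looksLikeBinaryUrl(url: string): boolean {
  const path = new URL(url).pathname.toLowerCase();
  return BINARY_EXTENSIONS.some(extension => path.endsWith(extension));
}

// For urls without a telltale extension, a HEAD request can check the Content-Type
// without downloading the body (Node 18+ for built-in fetch).
async function isTextUrl(url: string): Promise<boolean> {
  if (looksLikeBinaryUrl(url)) {
    return false;
  }
  const response = await fetch(url, { method: 'HEAD' });
  const contentType = response.headers.get('content-type') ?? '';
  return contentType.startsWith('text/') || contentType.includes('json');
}
```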
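
Retry sketch: a rough idea of re-queueing failed urls. The queue shape and the MAX_ATTEMPTS name are assumptions, chosen so that a url is abandoned only after the initial failure plus 2 repeated failures.

```typescript
const MAX_ATTEMPTS = 3;

interface QueuedUrl {
  url: string;
  attempts: number;
}

function onCrawlFailure(entry: QueuedUrl, queue: QueuedUrl[]): void {
  if (entry.attempts + 1 < MAX_ATTEMPTS) {
    // Put the url back at the end of the queue to try again later.
    queue.push({ url: entry.url, attempts: entry.attempts + 1 });
  } else {
    console.log(`Giving up on ${entry.url} after ${MAX_ATTEMPTS} failed attempts`);
  }
}
```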
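
Normalization sketch: a rough idea of url normalization so that variants like https://github.com and https://github.com/ map to the same url; the exact rules are an assumption.

```typescript
function normalizeUrl(rawUrl: string): string {
  const url = new URL(rawUrl);
  url.hash = ''; // fragments do not change the fetched page
  // Drop a single trailing slash on non-root paths: /docs/ -> /docs
  if (url.pathname.length > 1 && url.pathname.endsWith('/')) {
    url.pathname = url.pathname.slice(0, -1);
  }
  return url.toString();
}

// normalizeUrl('https://github.com') === normalizeUrl('https://github.com/')
```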