submits GET requests to the Hacker News API endpoints and appends the data to a text file, which we will later parse and enter into our database.
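
The fetch-and-append flow described above can be sketched roughly as follows. This is a minimal illustration, not the contents of scraper.js: the helper names (`itemUrl`, `appendItem`, `scrapeRange`) are hypothetical, the one-JSON-object-per-line output format is an assumption, and the built-in `fetch` requires Node 18 or later. The item endpoint is the public Hacker News API.

```javascript
const fs = require("node:fs");

// Public Hacker News API base URL (item endpoint: /item/<id>.json).
const HN_BASE = "https://hacker-news.firebaseio.com/v0";

// Build the URL for a single item id.
function itemUrl(id) {
  return `${HN_BASE}/item/${id}.json`;
}

// Append one item to the output file as a single line of JSON.
// Assumption: one item per line; the real scraper.js may use another format.
function appendItem(file, item) {
  fs.appendFileSync(file, JSON.stringify(item) + "\n");
}

// Fetch items start..end one at a time and append each to `file`.
async function scrapeRange(start, end, file) {
  for (let id = start; id <= end; id++) {
    const res = await fetch(itemUrl(id));
    if (!res.ok) continue; // skip failed requests instead of aborting the run
    const item = await res.json();
    if (item !== null) appendItem(file, item); // skip empty responses
  }
}
```

Calling `scrapeRange(1, 2001, "items-1-n.txt")` would then request each item in turn and grow the text file one line per item.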
- Modify scraper.js
- Change the filename on line 37.
  * The 'items' numbers should be the same as your beginning and end values from index.html.
  * For example: 'items-1-n.txt' for a start of 1.
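
The naming convention can be captured in a small helper. This is only a sketch: `dataFileName` is a hypothetical name, not something defined in scraper.js, and the default of `"n"` mirrors the placeholder used in the example above.

```javascript
// Build the output filename for a scrape that begins at `start`.
// `end` defaults to the literal placeholder "n", which is replaced
// with the last downloaded item once the run is finished.
function dataFileName(start, end = "n") {
  return `items-${start}-${end}.txt`;
}

console.log(dataFileName(1));       // items-1-n.txt
console.log(dataFileName(1, 5555)); // items-1-5555.txt
```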
- Start a Node server
- Open your terminal.
- Navigate to your root folder for Scraper.
- Enter 'node scraper.js' in the command line to start a local Node server that will write data to your file.
- Disable power conservation settings on your Mac.
- Plug your computer into a charger.
- Click the Apple menu in the upper-left corner of your screen.
- Select System Preferences.
- Select Energy Saver.
- Check
  * Prevent computer from sleeping automatically when the display is off
  * Wake for Wi-Fi network access
  * Kindly close pop-ups that warn you you're going to waste power
- Uncheck
  * Put hard disks to sleep when possible
  * Enable Power Nap while plugged into a power adapter
- Open index.html from the scraper folder in your browser.
- Enter start and end values that are 2000 apart in the boxes.
- For example, 1-2001.
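
If you plan several runs, the start and end values can be worked out ahead of time. The sketch below uses a hypothetical `batches` helper and assumes each run starts one item past the end of the previous run; adjust that if you prefer the ranges to overlap at the boundary.

```javascript
// Return `count` consecutive {start, end} pairs, each `span` apart.
// Assumption: each batch starts one past the previous batch's end.
function batches(first, count, span = 2000) {
  const out = [];
  let start = first;
  for (let i = 0; i < count; i++) {
    out.push({ start, end: start + span });
    start += span + 1;
  }
  return out;
}

// First three batches starting from item 1: 1-2001, 2002-4002, 4003-6003.
console.log(batches(1, 3));
```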
- When you're done downloading:
- Close the browser.
- Navigate to the scraper_data folder.
- Open the text file to find the last item downloaded.
- Rename the file by replacing n with the last item downloaded.
  * If item 5555 was the last item you downloaded, the file would be renamed items-1-5555.txt.
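
The last two steps could also be scripted. The following is a rough sketch under stated assumptions: the data file holds one JSON object per line, each with a numeric `id` field, and its name ends in `-n.txt`. Neither `lastItemId` nor `finalizeFile` exists in the scraper; both names are illustrative.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Return the id found on the last non-empty line of the data file.
// Assumes one JSON object per line with a numeric `id` field.
function lastItemId(file) {
  const lines = fs
    .readFileSync(file, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "");
  if (lines.length === 0) throw new Error(`${file} is empty`);
  return JSON.parse(lines[lines.length - 1]).id;
}

// Rename items-<start>-n.txt to items-<start>-<lastId>.txt.
// Returns the new path.
function finalizeFile(file) {
  const lastId = lastItemId(file);
  const newName = path.basename(file).replace(/-n\.txt$/, `-${lastId}.txt`);
  const target = path.join(path.dirname(file), newName);
  fs.renameSync(file, target);
  return target;
}
```

This reads the whole file into memory, which is fine for a quick check but may be too heavy for very large data files; in that case, reading only the tail of the file would be the safer choice.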
- 1 - 2,832,730 items completed (Justin)
- 4,879,476 - 8,474,817 items completed (Adam)
- Total on Oct. 23: 6,428,071