A utility program that compiles to Google Apps Script to scrape job postings and record them into a spreadsheet file.

justkash/job_scraper

Goals

The goal of this project is to help select the best job postings to apply for and to identify the skill areas that employers demand most. Some of the questions that this project attempts to answer are the following:

  • Which jobs offer the most competitive compensation?
  • Which qualifications have the highest demand?
  • Where are companies looking for remote employees located?

The resulting data from this project can be aggregated into a dashboard as shown in the image below to answer the outlined questions.

Dashboard built using the scraped data

Changing Default Settings

For now, the changeable settings for the scripts are hard-coded as constants at the beginning of the main.ts file. The FOLDER_ID is of particular importance, since it identifies the Google Drive folder in which the created spreadsheet file will be placed. Change this value to the ID of the folder of your choice before deploying the script. The FILE_NAME and SHEET_NAME values can also be changed if you want different default file and sheet names.
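A folder's ID is the last path segment of its Google Drive URL (`https://drive.google.com/drive/folders/<id>`). The following is a minimal sketch of pulling that ID out of a copied URL before pasting it into FOLDER_ID. The helper name `folderIdFromUrl` and the example URL are hypothetical and are not part of this project.

```typescript
// Hypothetical helper, not part of main.ts: extracts the folder ID from a
// Google Drive folder URL, or returns null when the URL has no
// "/folders/<id>" segment.
function folderIdFromUrl(url: string): string | null {
  const match = url.match(/\/folders\/([A-Za-z0-9_-]+)/);
  return match?.[1] ?? null;
}

// Hypothetical URL for illustration; use the address of your own Drive folder.
const exampleUrl = "https://drive.google.com/drive/folders/1AbC_dEf-123?usp=sharing";
console.log(folderIdFromUrl(exampleUrl)); // prints "1AbC_dEf-123"
```

The printed value is what would be assigned to FOLDER_ID in main.ts.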

Starting the Docker Image

Create the Docker image using the following command.

docker image build -t clasp:1.0 .

Run the container using the following command. This will put you into the shell prompt within the container.

docker run -it --rm --name job_scraper --mount type=bind,source="$(pwd)",target=/usr/src/app clasp:1.0

Using Clasp

Clasp is a command-line utility from Google for developing Google Apps Script projects locally. Once inside the container, run the following command to log in, then follow the prompts.

clasp login --no-localhost

You can check the login status of clasp using the following command.

clasp login --status

Upload to Google G Suite Developer Hub

Before uploading to the developer hub, run the following command to install all the required npm modules.

npm install

Run the following command to compile the TypeScript files and upload them to Google's developer hub. Note that this step requires you to be authenticated and the Google Apps Script API to be turned on. To turn it on, go to the settings tab in the hub and toggle the switch on.

npm run deploy

Importing Libraries

Finally, before running the script, you will need to add the Cheerio library. With the uploaded script file open, add it under Resources > Libraries.

Running Scripts

To run the script, select the function you'd like to run from the dropdown menu and click the triangular play button. Note, however, that a script like this is far more useful when combined with triggers, which run it automatically.
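As an illustration of the trigger idea, here is a minimal sketch that schedules a function to run once a day using the Apps Script `ScriptApp` trigger builder. The helper name `installDailyTrigger`, the function name passed to it, and the hour are assumptions for illustration. The small interfaces only describe the parts of `ScriptApp` the sketch uses, so that it can also be type-checked outside the Apps Script editor.

```typescript
// Structural types covering only the ScriptApp methods used below.
interface ClockTriggerBuilder {
  everyDays(n: number): ClockTriggerBuilder;
  atHour(hour: number): ClockTriggerBuilder;
  create(): unknown;
}
interface TriggerBuilder {
  timeBased(): ClockTriggerBuilder;
}
interface ScriptAppLike {
  newTrigger(functionName: string): TriggerBuilder;
}

// Hypothetical helper: creates a time-based trigger that runs the named
// function once per day at roughly the given hour (0-23).
function installDailyTrigger(
  scriptApp: ScriptAppLike,
  functionName: string,
  hour: number = 6
): void {
  scriptApp
    .newTrigger(functionName)
    .timeBased()
    .everyDays(1)
    .atHour(hour)
    .create();
}
```

Inside the Apps Script editor this would be called once, for example as `installDailyTrigger(ScriptApp, "main")`, where `"main"` stands in for whichever function from the dropdown you want scheduled. Triggers can also be created by hand from the editor's trigger settings.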
