Blurbs are still being retrieved for filtered out jobs #83

Closed
bunsenmurder opened this issue Jun 30, 2020 · 4 comments · Fixed by #90

Comments

@bunsenmurder
Collaborator

bunsenmurder commented Jun 30, 2020

Description

Currently the scraper is still retrieving blurbs for jobs that have been filtered out by the pre_filter method.


Steps to Reproduce

  1. Run JobFunnel under any query and make sure the results are saved to a directory without a master_list.csv or duplicate_list.csv file.
  2. Run the scraper again and take note of the number of unique jobs found by the pre_filter, then count the number of individual jobs that are being scraped. You should notice that they don't match.

Expected behavior

The scraper should remove jobs identified by the pre_filter, and only obtain blurbs for the remaining jobs.

Actual behavior

The scraper retrieves blurbs for all jobs whether they were filtered out or not.

To fix the issue, the order of the creation of the scrape_list and the call to the pre_filter method would have to be switched. The screenshot below highlights the issue within the code and the debugger output:
image

Although this could have been fixed in a pull request, making this fix would break date_filter, which is called by the pre_filter method in the main JobFunnel class.
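To illustrate the ordering problem described above, here is a minimal sketch. It is not JobFunnel's actual code: only the names pre_filter and scrape_list come from this issue, and everything else (the job dictionaries, seen_ids, fetch_blurb, and the two scrape_* functions) is a hypothetical stand-in. It also ignores the date_filter interaction mentioned above.

```python
from typing import Callable, Dict, List, Set

Job = Dict[str, str]


def pre_filter(jobs: List[Job], seen_ids: Set[str]) -> List[Job]:
    """Hypothetical stand-in: keep only jobs whose id is not already known."""
    return [job for job in jobs if job["id"] not in seen_ids]


def scrape_buggy(jobs: List[Job], seen_ids: Set[str],
                 fetch_blurb: Callable[[Job], str]) -> List[Job]:
    # scrape_list is built BEFORE filtering, so it still holds every job
    # and a blurb is fetched for each one, filtered out or not.
    scrape_list = list(jobs)
    remaining = pre_filter(jobs, seen_ids)
    for job in scrape_list:
        job["blurb"] = fetch_blurb(job)
    return remaining


def scrape_fixed(jobs: List[Job], seen_ids: Set[str],
                 fetch_blurb: Callable[[Job], str]) -> List[Job]:
    # Filter first, then build scrape_list from what is left, so blurbs
    # are only fetched for jobs that survived the filter.
    remaining = pre_filter(jobs, seen_ids)
    scrape_list = list(remaining)
    for job in scrape_list:
        job["blurb"] = fetch_blurb(job)
    return remaining
```

Passing a fetch_blurb that records each call makes the difference visible: with three jobs and one known id, the first version requests three blurbs, while the second requests two.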

Environment

  • Build: Master 0a246cb
  • Operating system and version: Arch Linux
  • [Linux] Desktop Environment and/or Window Manager: Gnome
@PaulMcInnis
Owner

PaulMcInnis commented Jul 6, 2020

Thank you for the detailed write-up!

(looks like it's time to do some more thorough code review in the codebase)

@PaulMcInnis
Owner

ah oops should have done this before I drafted a release just now. Need to fix this and some other behaviour issues and up the sub-rev.

@bunsenmurder
Collaborator Author

Perfect timing actually, I was gonna make a pull request with some fixes I made.

@PaulMcInnis
Owner

ah nice! glad to hear it!

Feel free to up the rev to 2.1.9
