periodically scrape the website for new data and store to database #94

Open
duttashi opened this issue May 13, 2020 · 1 comment
Labels: learn_stuff (everything related to learning), Python-scraper (all Python scripts or notes related to web-data scraping are grouped in this tag), TODO

@duttashi
Owner

  1. Scrape the website (https://news.ycombinator.com/jobs) for new data
  2. Save the scraped data to a MySQL database
  3. Periodically scrape the website for new information
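A minimal sketch of steps 1 and 2, using only the Python standard library. Several things here are assumptions and not part of the issue: the job titles are assumed to sit in anchors inside `<span class="titleline">` (check the live page markup), SQLite stands in for MySQL so the example runs without a database server, and the names `JobLinkParser`, `parse_jobs`, `fetch_html`, `save_jobs`, and the `jobs` table are hypothetical.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.request import Request, urlopen

JOBS_URL = "https://news.ycombinator.com/jobs"


class JobLinkParser(HTMLParser):
    """Collect (title, url) pairs from the first anchor in each title span.

    ASSUMPTION: job titles live inside <span class="titleline">; adjust the
    class name if the page markup differs.
    """

    def __init__(self):
        super().__init__()
        self.jobs = []
        self._in_titleline = False
        self._href = None  # set while we are inside a job-title anchor
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and "titleline" in (attrs.get("class") or "").split():
            self._in_titleline = True
        elif tag == "a" and self._in_titleline and self._href is None:
            self._href = attrs.get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.jobs.append(("".join(self._text).strip(), self._href))
            self._href = None
            self._in_titleline = False  # ignore the trailing site link


def parse_jobs(html):
    """Return a list of (title, url) tuples found in the given HTML."""
    parser = JobLinkParser()
    parser.feed(html)
    return parser.jobs


def fetch_html(url=JOBS_URL):
    """Download the page and return its body as text."""
    request = Request(url, headers={"User-Agent": "jobs-scraper-sketch"})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def save_jobs(conn, jobs):
    """Insert jobs, skipping URLs already stored; return the number of new rows.

    ASSUMPTION: SQLite is a stand-in for MySQL; the table layout is hypothetical.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs (url TEXT PRIMARY KEY, title TEXT NOT NULL)"
    )
    before = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
    conn.executemany(
        "INSERT OR IGNORE INTO jobs (url, title) VALUES (?, ?)",
        [(url, title) for title, url in jobs],
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
    return after - before
```

A single scrape would then be `save_jobs(sqlite3.connect("jobs.db"), parse_jobs(fetch_html()))`. Keying the table on the URL means repeated runs only add rows for postings not seen before, which is what step 3 needs. Moving to MySQL would require a third-party driver and MySQL's own syntax for ignoring duplicate keys.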

Reference

@duttashi added the TODO, learn_stuff, and Python-scraper labels May 13, 2020
@duttashi added this to the data pipeline milestone May 13, 2020
@duttashi self-assigned this May 13, 2020
@duttashi
Owner Author

Note: To run a script periodically, you'll have to set up a scheduled job: a cron job on Unix-like systems, or a Task Scheduler task on Windows.

  1. How to set it up on Windows 7?

Answer: See these SO posts, 1, 2

  2. How to schedule a Python script using Windows Task Scheduler?

Answer: See this tutorial and this SO post, 1
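If setting up an OS-level scheduler is inconvenient, one alternative is to keep a small Python process running and let it call the scraper on a fixed interval. This is only a sketch of that idea, not what the linked posts describe; `run_periodically` and its parameters are hypothetical names, and `task` stands for whatever function performs one scrape.

```python
import time


def run_periodically(task, interval_seconds, max_runs=None, sleep=time.sleep):
    """Call task() repeatedly, waiting interval_seconds between calls.

    max_runs=None loops forever; a number stops after that many calls.
    The sleep function is injectable so the loop can be tested without waiting.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            task()
        except Exception as exc:
            # One failed scrape (e.g. a network error) should not stop the loop.
            print(f"scheduled task failed: {exc}")
        runs += 1
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs
```

For example, `run_periodically(scrape, 3600)` would run a hypothetical `scrape` function once an hour until the process is stopped. Unlike a cron job or a Task Scheduler task, this loop does not survive a reboot or a crash of the process, so an OS scheduler is the more robust choice for long-running use.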
