mrsahabu/python_scrapyy

python_scrapyy

My working environment is Ubuntu 18.04.

url.txt contains the URLs that the scraper will scrape. The scraper extracts only three fields: company name, job title, and location.
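As a minimal sketch of how the input file could be read, the helper below assumes url.txt holds one URL per line. The function name `load_urls` and the one-per-line layout are assumptions for illustration, not taken from htmlparser.py.

```python
from pathlib import Path


def load_urls(path="url.txt"):
    """Return the URLs listed in the file, skipping blank lines.

    Assumes one URL per line (an assumption, not confirmed by the repo).
    """
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line:  # ignore empty lines and trailing newlines
            urls.append(line)
    return urls
```

Calling `load_urls()` with no argument reads url.txt from the current working directory.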

Job titles are matched against a file (titles_combined.txt) that contains almost 77k job titles. Install the find-job-titles package from https://pypi.org/project/find-job-titles/
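To illustrate the idea of matching text against a list of known titles, here is a simplified stand-in written with the standard library only. It is not the API of the find-job-titles package; the function name `find_titles` and the in-memory list of titles are hypothetical. A linear scan like this would be slow against ~77k titles and is only meant to show the principle.

```python
import re


def find_titles(text, titles):
    """Return the known titles that occur in text, longest titles first.

    Simplified stand-in for dictionary-based title lookup; matching is
    case-insensitive and respects word boundaries.
    """
    lowered = text.lower()
    found = []
    # Check longer titles first so the most specific match is listed first.
    for title in sorted(titles, key=len, reverse=True):
        pattern = r"\b" + re.escape(title.lower()) + r"\b"
        if re.search(pattern, lowered):
            found.append(title)
    return found
```

With the titles `["Software Engineer", "Senior Software Engineer", "Nurse"]`, a sentence mentioning a "Senior Software Engineer" yields both engineering titles, with the longer one first.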

Locations are extracted from the text with the Python GeoText package. Installation instructions: https://geotext.readthedocs.io/en/latest/installation.html
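A short sketch of city extraction with GeoText follows. It assumes the geotext package is installed; the wrapper name `extract_cities` is hypothetical and not taken from htmlparser.py.

```python
def extract_cities(text):
    """Return the city names GeoText recognises in text.

    The import lives inside the function so this module can still be
    loaded on a machine where geotext is not installed.
    """
    from geotext import GeoText  # third-party package, assumed installed

    return list(GeoText(text).cities)
```

GeoText also exposes a `countries` attribute, which could be used the same way if country names are needed.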

Install requirements

To run the script, place all files in your root directory and run "python htmlparser.py".
