A Python Scrapy project that parses people's profiles from LinkedIn search results and exports the content to Excel and JSON files.

khaleddallah/LinkedinScraper

LinkedIn Scraper using Scrapy

  • Scrape a chosen number of the profiles that appear in the results of a LinkedIn search URL.
  • Export the content of the profiles to Excel and JSON files.
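
The two Excel layouts offered by the -m option (see Usage) can be pictured with a small sketch. This is not the project's actual code: the profile dictionary, its "name" and "skills" keys, and the helper names one_row, multi_row, save_json and save_excel are all assumptions made for illustration. Only the openpyxl calls (Workbook, append, save) are real library API.

```python
import json


def one_row(profile, multi_key):
    """Mode '1' sketch: one row per profile, list values joined into one cell."""
    return [[profile["name"], "; ".join(profile[multi_key])]]


def multi_row(profile, multi_key):
    """Mode 'm' sketch: one row per list item, name shown on the first row only."""
    items = profile[multi_key] or [""]
    return [[profile["name"] if i == 0 else "", item]
            for i, item in enumerate(items)]


def save_json(profiles, path):
    """Write the list of profile dictionaries to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(profiles, handle, ensure_ascii=False, indent=2)


def save_excel(rows, path):
    """Write prepared rows to an .xlsx file (requires openpyxl)."""
    from openpyxl import Workbook  # imported lazily so the sketch loads without it

    workbook = Workbook()
    sheet = workbook.active
    for row in rows:
        sheet.append(row)  # each list becomes one spreadsheet row
    workbook.save(path)


# Hypothetical profile, purely for illustration.
profile = {"name": "Example Person", "skills": ["Python", "Scrapy"]}
single = one_row(profile, "skills")
spread = multi_row(profile, "skills")
```

Here `single` holds one row with both skills in a single cell, while `spread` holds two rows, one per skill.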

Installation

  • Clone the project:
git clone https://github.com/khaleddallah/LinkedinScraper.git
  • Use the package manager pip to install Scrapy and the other requirements
    (Anaconda recommended):
cd LinkedinScraperProject
pip install -r requirements.txt

Usage

  • Change into the project directory:
cd LinkedinScraperProject
  • Show the help message:
python LinkedinScraper -h
usage: 
python LinkedinScraper [-h] [-n NUM] [-o OUTPUT] [-p] [-f format] [-m excelMode] (searchUrl or profilesUrl)

positional arguments:
  searchUrl     LinkedIn search URL, or one or more profile URLs

optional arguments:
  -h, --help    show this help message and exit
  -n NUM        number of profiles to scrape
                ** must be less than or equal to the number of search results
                'page' parses only the profiles on the given results page (10 profiles) (Default)
  -o OUTPUT     Output file
  -p            Enable Parse Profiles
  -f FORMAT     json    Json output file
                excel    Excel file output
                all    Json and Excel output files
  -m EXCELMODE  1    each profile occupies one row in the Excel file
                m    each profile spans multiple rows in the Excel file
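
For readers who want to see how such a command line could be declared, here is a minimal argparse sketch that mirrors the options listed above. It is an illustration, not the project's real parser: the dest names, the defaults chosen for -f and -m, and the helpers num_or_page and pages_needed are assumptions. The figure of 10 profiles per results page comes from the help text.

```python
import argparse
import math

PROFILES_PER_PAGE = 10  # from the help text: one results page holds 10 profiles


def num_or_page(value):
    """Accept the literal 'page' or a positive integer for -n."""
    if value == "page":
        return value
    number = int(value)  # a ValueError here is reported by argparse as a usage error
    if number < 1:
        raise argparse.ArgumentTypeError("NUM must be a positive integer or 'page'")
    return number


def build_parser():
    parser = argparse.ArgumentParser(prog="LinkedinScraper")
    parser.add_argument("-n", dest="num", type=num_or_page, default="page",
                        help="number of profiles, or 'page' for one results page")
    parser.add_argument("-o", dest="output", default=None, help="output file")
    parser.add_argument("-p", dest="parse_profiles", action="store_true",
                        help="treat the URLs as profile URLs")
    # The defaults for -f and -m below are assumptions, not taken from the project.
    parser.add_argument("-f", dest="format", choices=["json", "excel", "all"],
                        default="all")
    parser.add_argument("-m", dest="excel_mode", choices=["1", "m"], default="1")
    parser.add_argument("urls", nargs="+", metavar="searchUrl",
                        help="search URL or one or more profile URLs")
    return parser


def pages_needed(num):
    """How many results pages must be visited to collect num profiles."""
    return 1 if num == "page" else math.ceil(num / PROFILES_PER_PAGE)


args = build_parser().parse_args(
    ["-n", "23", "https://www.linkedin.com/search/results/all/?keywords=Robotic"]
)
pages = pages_needed(args.num)
```

With -n 23 and 10 profiles per page, three results pages would have to be visited.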


Examples

python LinkedinScraper -p -o 'ABC' 'https://www.linkedin.com/in/khaled-dallah/' 'https://www.linkedin.com/in/linustorvalds/'
python LinkedinScraper -n 23 'https://www.linkedin.com/search/results/all/?keywords=Robotic&origin=GLOBAL_SEARCH_HEADER'
python LinkedinScraper -n 17 -f excel -m 1 'https://www.linkedin.com/search/results/all/?keywords=Robotic&origin=GLOBAL_SEARCH_HEADER'
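
The examples pass two kinds of URL: profile URLs (path starting with /in/) and search URLs (path starting with /search/results/). A small sketch of how the two could be told apart with the standard library follows. The functions classify_url and search_keywords are hypothetical helpers, not part of the project, and the path prefixes are inferred only from the example URLs above.

```python
from urllib.parse import urlparse, parse_qs


def classify_url(url):
    """Return 'profile', 'search' or 'unknown' based on the URL path."""
    path = urlparse(url).path
    if path.startswith("/in/"):
        return "profile"
    if path.startswith("/search/results/"):
        return "search"
    return "unknown"


def search_keywords(url):
    """Return the value of the 'keywords' query parameter, or None if absent."""
    return parse_qs(urlparse(url).query).get("keywords", [None])[0]


profile_kind = classify_url("https://www.linkedin.com/in/linustorvalds/")
search_url = ("https://www.linkedin.com/search/results/all/"
              "?keywords=Robotic&origin=GLOBAL_SEARCH_HEADER")
search_kind = classify_url(search_url)
keywords = search_keywords(search_url)
```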

Built with

  • Python 3.7
  • Scrapy
  • openpyxl

Author

Issues:

Report bugs and feature requests here.

Contribute

Contributions are always welcome!

License

This project is licensed under the LGPL-3.0 License. See the LICENSE.md file for details.
