crawl stylish images from design community Behance.net, field of textile design as example

jingliao132/Behance-spider

Behance-spider

Crawl images from Behance.net, using the field of textile design as an example.
Retrieve project URLs and save them to an .xls file.

Pre-requirements

  1. Install BeautifulSoup4 and selenium
    pip install BeautifulSoup4
    pip install selenium
  2. Install the Excel support package (the re and socket modules ship with the Python standard library and need no installation)
    pip install xlwt
  3. Install the browser WebDriver
    Download and install it from your browser's support page
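
Before running the scripts, it can help to confirm that each module is importable. The following is a minimal sketch, not part of the repository: the helper name `check_modules` is hypothetical, and it assumes the packages are imported as `bs4`, `selenium`, and `xlwt`, with `re` and `socket` coming from the standard library.

```python
import importlib.util

# Import names assumed for the packages listed above; bs4 is the import
# name of BeautifulSoup4. re and socket are standard-library modules.
REQUIRED = ["bs4", "selenium", "xlwt", "re", "socket"]


def check_modules(names):
    """Return a dict mapping each module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(REQUIRED).items():
        print(f"{name}: {'ok' if found else 'missing'}")
```

Any module reported as missing can then be installed with pip before moving on to the steps.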

Steps

  1. Run RetrieveProject.py
    This script collects project URLs from Behance.net and saves them in the file ProjectURL.xls.
    A pre-generated ProjectURL.xls is provided.

  2. Run RetrieveImages.py
    This script downloads the images of each project listed in ProjectURL.xls and saves them in the folder 'pic1' under the repository root.
    Download progress and information are printed as the script runs.
    If an image fails to download from its URL, 0 is written to the corresponding row in ProjectURL.xls. Otherwise, 1 is written.

  3. Run TransformImages.py
    This script converts images in other formats to JPEG files in the RGB color space.
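
The flow of steps 1 and 2 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in RetrieveProject.py or RetrieveImages.py: the helper names `extract_project_urls`, `fetch_bytes`, and `download_with_status` are hypothetical, project links are assumed to follow the `https://www.behance.net/gallery/<id>/<name>` pattern, and the `fetch` callable stands in for whatever actually downloads an image. Only the 1/0 status convention comes from the description above.

```python
import re
import urllib.request

# Assumed shape of a Behance project link: /gallery/<numeric id>/<slug>.
PROJECT_RE = re.compile(r"https://www\.behance\.net/gallery/\d+/[\w\-%.]+")


def extract_project_urls(page_source):
    """Return unique project URLs found in the HTML, in first-seen order."""
    seen = {}
    for url in PROJECT_RE.findall(page_source):
        seen.setdefault(url, None)
    return list(seen)


def fetch_bytes(url, timeout=10):
    """Default fetcher (hypothetical): download url and return its bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_with_status(urls, fetch=fetch_bytes):
    """Try each URL; return (url, status) pairs, 1 on success and 0 on failure."""
    results = []
    for url in urls:
        try:
            fetch(url)
            status = 1
        except (OSError, ValueError):
            # URLError and socket timeouts are both OSError subclasses;
            # ValueError covers malformed URLs.
            status = 0
        print(f"{url}: {'downloaded' if status else 'failed'}")
        results.append((url, status))
    return results
```

Passing a custom `fetch` keeps the sketch usable without network access, for example to try the status bookkeeping on a handful of URLs before a full run.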
