GitHub - luo4neck/WebSpider: Python Web Spider

This is a simple Python web spider that collects all the movies a specified douban.com user has watched and stores them in a CSV file. The input is the URL of the user's douban movie home page.
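The parse-and-store step described above can be sketched as follows. This is a minimal illustration, not the code in store_csv.py: it uses the standard-library html.parser in place of bs4 so that it runs without third-party packages, it targets Python 3, and the `movie-title` class, the sample page, and the names `TitleParser` and `movies_to_csv` are all hypothetical. Real douban.com markup differs and would need its own selectors.

```python
import csv
import io
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect (title, link) pairs from <a class="movie-title"> elements.

    The class name "movie-title" is hypothetical, chosen for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.rows = []      # finished (title, link) pairs
        self._href = None   # link of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "movie-title":
            self._href = attrs.get("href", "")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.rows.append(("".join(self._text).strip(), self._href))
            self._href = None


def movies_to_csv(html, out):
    """Parse one page of HTML and write its movies to the file object `out`."""
    parser = TitleParser()
    parser.feed(html)
    writer = csv.writer(out)
    writer.writerow(["title", "link"])
    writer.writerows(parser.rows)
    return len(parser.rows)


# Made-up sample page standing in for a downloaded movie list.
SAMPLE_PAGE = """
<ul>
  <li><a class="movie-title" href="https://example.com/m/1">First Movie</a></li>
  <li><a class="movie-title" href="https://example.com/m/2">Second Movie</a></li>
</ul>
"""

buffer = io.StringIO()
count = movies_to_csv(SAMPLE_PAGE, buffer)
print(count)
print(buffer.getvalue())
```

Writing to a file object rather than a path keeps the sketch easy to try in memory; passing `open("movies.csv", "w", newline="")` instead of the `StringIO` buffer would produce a file on disk.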

Test: $ make test

The test takes about 40 minutes to finish because the spider runs slowly on purpose to avoid an IP ban.
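One common way to stay under a site's rate limits is to pause between requests. The sketch below shows that idea under stated assumptions: the function name `crawl_politely`, the `fetch` callback, and the delay value are hypothetical and are not taken from store_csv.py, whose actual pacing may differ.

```python
import time


def crawl_politely(urls, fetch, delay_seconds=2.0, sleep=time.sleep):
    """Fetch each URL in order, pausing between consecutive requests.

    `fetch` is any callable that takes a URL and returns the page content.
    `sleep` is injectable so the pacing can be checked without waiting.
    """
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        pages.append(fetch(url))
    return pages


# Usage with stand-ins: a fake fetcher and a recorder instead of real sleeping.
recorded_delays = []
result = crawl_politely(
    ["page-1", "page-2", "page-3"],
    fetch=lambda url: "<html>" + url + "</html>",
    delay_seconds=2.0,
    sleep=recorded_delays.append,
)
print(result)
print(recorded_delays)
```

With a real fetcher and the default `time.sleep`, the total run time grows with the number of pages, which is the trade-off the long test duration reflects.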

Written entirely in Python. Libraries used: urllib2, bs4, time, re, csv, sys.
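urllib2 exists only in Python 2; in Python 3 its request functions live in urllib.request. The following is a small sketch of an import that works on either version. It is an illustration only: the helper `build_request` and the User-Agent value are hypothetical and do not come from store_csv.py.

```python
try:
    from urllib2 import Request, urlopen  # Python 2
except ImportError:
    from urllib.request import Request, urlopen  # Python 3


def build_request(url, user_agent="Mozilla/5.0"):
    """Create a request object carrying a browser-like User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


# Building the request does not touch the network; urlopen(request) would.
request = build_request("https://example.com/")
print(request.get_full_url())
```

Passing the returned object to `urlopen` performs the actual download, so the network call stays a separate, explicit step.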

The code was written in March 2015 in Dublin, Ireland.

Files (4 commits):
Makefile
README.md
store_csv.py
