ScrapyWeiboRelate

Crawl Weibo data with Scrapy and map the resulting network of users with AntV

Project description

Crawler functions

crawl Weibo user information
crawl users' recent posts
crawl users' social connections (followers/fans)

How to use

set user_id = int('user_id') in spiders/weibo.py to choose the user at the center of the relationship graph
set relate_deep = 2 and deepth_fans = 2 to control the crawl depth (a depth of 2 includes the followers/fans of my followers/fans; the number of users grows exponentially with depth)
rewrite proxy_handle and get_cookies so that the middlewares receive valid cookies and proxy IPs
run run.py
open Draw/index.html; the important parameters are linkDistance: 50 (edge length), endArrow: true (whether edges have arrows), and lineWidth: 0.65 (edge thickness), which can be changed as needed
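
For the proxy_handle and get_cookies step, the following is a minimal sketch of how the two functions could feed a Scrapy downloader middleware. It is not the project's actual code: the function signatures, the PROXIES and COOKIE_POOL lists, and the ProxyCookieMiddleware class name are all assumptions made for illustration, and the placeholder values must be replaced with real ones.

```python
import random

# Hypothetical stand-ins: the project's real proxy_handle and get_cookies
# may live elsewhere and take different arguments.
PROXIES = ["http://127.0.0.1:8888"]            # placeholder proxy address
COOKIE_POOL = [{"cookie_name": "cookie_value"}]  # placeholder cookie


def proxy_handle():
    """Return one proxy URL, or None to connect directly."""
    return random.choice(PROXIES) if PROXIES else None


def get_cookies():
    """Return a copy of one cookie dict from the pool."""
    return dict(random.choice(COOKIE_POOL))


class ProxyCookieMiddleware:
    """Downloader middleware that attaches a proxy and cookies to each request."""

    def process_request(self, request, spider):
        proxy = proxy_handle()
        if proxy:
            # Scrapy's built-in HttpProxyMiddleware reads this meta key.
            request.meta["proxy"] = proxy
        request.cookies = get_cookies()
        return None  # let Scrapy continue processing the request
```

A middleware like this would be enabled through DOWNLOADER_MIDDLEWARES in settings.py, probably with a priority below 700 so that it runs before Scrapy's built-in cookie and proxy middlewares.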

Special note

because my cookie pool is very small, time.sleep calls were added in spiders/weibo.py at lines 73, 107, 132, 166, and 258; this is the only reason the crawl runs slowly
the crawl filters out big V (verified) accounts and users with more than 10000 followers
the NLP accuracy is 89% because of the limited training corpus, so the training corpus is included with this project (its original source is unknown; it was downloaded from CSDN)
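
The user filter, combined with the depth limit from How to use, can be illustrated on toy data. This is only a sketch under stated assumptions: the field names verified, followers_count, and links, and the functions should_keep and crawl, are made up for this example and are not taken from spiders/weibo.py.

```python
from collections import deque

MAX_FOLLOWERS = 10000  # threshold stated in the note above


def should_keep(user):
    """Drop big V (verified) accounts and accounts with more than 10000 followers."""
    return not user["verified"] and user["followers_count"] <= MAX_FOLLOWERS


def crawl(users, root_id, max_depth):
    """Breadth-first walk from root_id, reaching users at most max_depth hops away."""
    seen = {root_id}
    queue = deque([(root_id, 0)])
    while queue:
        uid, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for nid in users[uid]["links"]:
            if nid not in seen and should_keep(users[nid]):
                seen.add(nid)
                queue.append((nid, depth + 1))
    return seen


# Toy data; all field names here are hypothetical.
users = {
    1: {"verified": False, "followers_count": 120, "links": [2, 3, 4]},
    2: {"verified": False, "followers_count": 80, "links": [5, 6]},
    3: {"verified": True, "followers_count": 500, "links": [7]},      # big V, dropped
    4: {"verified": False, "followers_count": 250000, "links": [8]},  # too many followers
    5: {"verified": False, "followers_count": 10, "links": []},
    6: {"verified": False, "followers_count": 10000, "links": []},
    7: {"verified": False, "followers_count": 5, "links": []},
    8: {"verified": False, "followers_count": 5, "links": []},
}

print(sorted(crawl(users, root_id=1, max_depth=1)))  # [1, 2]
print(sorted(crawl(users, root_id=1, max_depth=2)))  # [1, 2, 5, 6]
```

Users 7 and 8 are never reached because the accounts that link to them were filtered out, which shows how the filter also prunes everything behind a dropped account.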

Final results

1000 nodes or so
5000 nodes or so
10000+ nodes
Fans and followers are distinguished by different colored lines

Completion progress

The main features are now complete. We are improving the readability of the images and other widgets, and working on correlating friendliness with social connections.

Thanks

Thanks for your help and guidance

Description

Because Weibo has strengthened its anti-crawling measures, the mock login function in this project no longer works, and only the first 20 pages of followers/fans can be crawled. This part of the project is no longer updated, since the focus is on showing the relationships between users; this note is here for reference.
