SocialCrawler is a Python package that helps collect data from Twitter and Foursquare.
It was created to facilitate data mining from these two services. (Linux only)
$ python3 -m pip install SocialCrawler
- Python >= 3
- setuptools
- Foursquare developer credentials (if you want to work with Foursquare)
- Twitter developer credentials (if you want to work with Twitter)
- geckodriver installed and in $PATH (we ran into this problem when running on Linux Mint and Kali)
- Download from https://github.com/mozilla/geckodriver/releases
$ export PATH=$PATH:<geckodriver-path>
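To confirm the driver is reachable after updating your $PATH, you can run:

$ geckodriver --version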
- As the package uses tweepy as the framework to connect to Twitter, it can use the Twitter Stream API. Therefore you can filter the stream by the following parameters (see the sketch after this list):
- delimited
- stall_warnings
- filter_level
- language
- follow
- track
- locations
- count
- with
- replies
  - stringify_friend_ids
as described in Twitter's Stream API overview.
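For example, here is a minimal sketch of connecting to the stream with tweepy 3.x; the credentials, the listener class, and the tracked keyword are placeholders for illustration, not part of this package:

```python
import tweepy

# Placeholder Twitter developer credentials -- substitute your own.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")

class CheckinListener(tweepy.StreamListener):
    """Prints each status pushed by the Stream API."""

    def on_status(self, status):
        print(status.text)

    def on_error(self, status_code):
        # HTTP 420 means we are being rate limited; returning False disconnects.
        if status_code == 420:
            return False

stream = tweepy.Stream(auth=auth, listener=CheckinListener())
# track, follow, locations, languages, etc. map to the parameters listed above.
stream.filter(track=["swarmapp.com"], languages=["en"])
```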
- Get check-ins shared on Twitter, or the check-ins from the last week.
- If you have Foursquare credentials, you will be able to track data from specific locations and more (see the sketch below).
See Wiki!
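For illustration, here is a minimal sketch of querying the Foursquare v2 venue endpoint directly with the requests library; the client ID, client secret, and venue ID are placeholders, and this is not this package's own API:

```python
import requests

# Placeholder credentials and venue ID -- substitute your own.
CLIENT_ID = "YOUR_CLIENT_ID"
CLIENT_SECRET = "YOUR_CLIENT_SECRET"
VENUE_ID = "412d2800f964a520df0c1fe3"  # illustrative venue ID

resp = requests.get(
    "https://api.foursquare.com/v2/venues/{}".format(VENUE_ID),
    params={
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "v": "20180323",  # version date required by the Foursquare v2 API
    },
)
resp.raise_for_status()
venue = resp.json()["response"]["venue"]
print(venue["name"], venue["location"].get("city"))
```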
-
v 0.1.0
- Fixed module class declaration
-
v 0.0.9
- Fixed a syntax error and hacked the method dir output
-
v 0.0.8
- Added selenium as a requirement, used to make Foursquare requests through the browser (to avoid the rate limit); this may not work
- Updated ExtractorData to a full version (NewExtractorData) that can get (almost) full VENUE info
- Removed urllib2 from the requirements
- Updated the run flow: now there is always a return value; just check whether a field is NULL, which means the data is missing
-
v 0.0.7
- When a VENUE or FOURSQUARE GET request fails, the program thread waits 15 minutes before requesting again
- Added new exception handling
- Separated the Foursquare request and the venue request into two try-except blocks
- Fixed a bug when writing categorie_id: a missing int-to-str conversion
- ExtractorData still has the option of using another file (not one created by Collector or CollectorV2) to query Foursquare (not available yet)
-
v 0.0.6
- Formatted to PEP 257 and PEP 8 (almost)
- Implemented ExtractorData: a simple way to get data from Foursquare using the Swarm URL code
- Added HistoricalCollector.CollectorV2, which gets all data from the tweet JSON and saves it as a TSV file
- Added to ExtractorData the option of using another file (not one created by Collector or CollectorV2) to query Foursquare (not available yet)
- Added urllib2 to the requirements
-
v 0.0.5
- Fixed a bug in the getStoredData function that allowed some parameters to be None
- Updated the format of the generated file name
- Increased the request wait time from 15 minutes to 16. (Sometimes when a request was retried after 15 minutes, the server responded that the 15 minutes had not yet elapsed.)
- Updated the saved fields. Now all fields are saved to a file in tab-separated format, as shown in the Wiki.