Scrapy for Craigslist Apartments

A project by: Tsung Hung @masterfung

Scrapy is an open-source web scraping framework written in Python, and it powers this project. Please visit this link for more information if you are not familiar with it.

Purpose:

The purpose of Scrapy Craigslist (CL) is to obtain useful CL apartment data for application-specific data analysis and representation.

A model example that uses the data from this project is located here. It is a Django-powered project.

The code uses San Francisco as a model city for CL searches, but you can easily change the link to a city of your interest.

Requirements:

You will need:

Python 2.x or 3.x
pip
Scrapy (pip install scrapy)

Project:

Clone the project with git clone git@github.com:masterfung/scrapy-craigslist.git

Craigslist City Codes:

If you want to harvest data from a different city, please reference this link for the Craigslist city codes. Once you change the city code in the project, the spider will collect data for the city of interest.
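As a rough illustration of swapping the city, the sketch below replaces the city subdomain in a Craigslist search link. The URL shape (`https://<city>.craigslist.org/search/apa`) and the helper name `swap_city` are assumptions made for this example; check them against the links actually used in the spider before relying on them.

```python
from urllib.parse import urlparse, urlunparse


def swap_city(url, city_code):
    """Return `url` with its first host label replaced by `city_code`.

    Hypothetical helper, not part of this project. It assumes the city
    code is the leftmost label of the host name.
    """
    parts = urlparse(url)
    labels = parts.netloc.split(".")
    labels[0] = city_code
    return urlunparse(parts._replace(netloc=".".join(labels)))


# Assumed shape of the San Francisco apartments search link.
SF_URL = "https://sfbay.craigslist.org/search/apa"

print(swap_city(SF_URL, "newyork"))
```

The same replacement would need to be applied everywhere the link appears in the spider.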

Running the Project

Here are the steps to change which city the data is returned for:

Look into the spiders folder and open craigslist_scrapy.py.
Find the city codes and replace the link with the one for your city of interest. The link appears in three places.
Run the code with this command: scrapy crawl craigslist -o FILENAME.json

NOTE: This may take a while, depending on how many listings are on Craigslist and how many parameters you selected.

Output JSON files are saved at the application level of this project.
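Once a crawl has finished, the output file can be inspected from Python. The sketch below is only an example: it assumes the file is a JSON array of item objects (which is what Scrapy's JSON feed export writes) and makes no assumption about the field names the spider produces. The helper names are made up for this illustration.

```python
import json
from pathlib import Path


def load_listings(path):
    """Load the items written by `scrapy crawl craigslist -o FILENAME.json`."""
    with Path(path).open(encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of items")
    return data


def field_coverage(listings):
    """Count, per field, how many items carry a non-empty value."""
    counts = {}
    for item in listings:
        for key, value in item.items():
            if value not in (None, "", []):
                counts[key] = counts.get(key, 0) + 1
    return counts
```

For example, `field_coverage(load_listings("FILENAME.json"))` gives a quick view of which fields were actually filled in before the data is passed on to further analysis.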

If there are any issues or requests, please let me know. I am happy to help. All feedback is welcome. Let us learn and build things together!

Happy Scraping!

User Agent

Visit [this link](http://api.useragent.io/) to obtain a newly generated user agent for running this Scrapy project. You can change the User Agent in the settings.py file.

The link is a project by the wonderful [Randall Degges](https://github.com/rdegges). He is awesome!!
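For reference, setting the user agent in settings.py might look like the fragment below. `USER_AGENT` is a standard Scrapy setting; the string itself is only a placeholder assumed for this example and should be replaced with the one you generated.

```python
# settings.py (fragment)

# Placeholder value for illustration only; paste your generated user agent here.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko)"
```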

Others

Beware of IP bans from Craigslist.
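One way to lower the risk of a ban is to slow the crawl down. The fragment below is a sketch using standard Scrapy settings; the values are illustrative assumptions, not settings taken from this project, and they do not guarantee that Craigslist will not block you.

```python
# settings.py (fragment) - illustrative values, tune for your own crawl

# Wait between requests to the same site (seconds).
DOWNLOAD_DELAY = 3

# Vary the actual wait around DOWNLOAD_DELAY so requests are less regular.
RANDOMIZE_DOWNLOAD_DELAY = True

# Keep parallel requests to a single domain low.
CONCURRENT_REQUESTS_PER_DOMAIN = 1

# Let Scrapy adapt the delay to how quickly the server responds.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 5
AUTOTHROTTLE_MAX_DELAY = 60
```

A slower crawl takes longer to finish, so there is a trade-off between run time and the chance of being blocked.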
