eliseobao/carggregator


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


Find your second-hand car more easily than ever

Carggregator is a second-hand car ads aggregator.


Tech

Carggregator uses a number of open source projects to work properly:

  • Black - An uncompromising Python code formatter.
  • Scrapy - A fast, high-level web crawling and web scraping framework.
  • user_agent - A module for generating random, valid web user agents.
  • Elasticsearch - Distributed, RESTful search and analytics engine at the heart of the Elastic Stack.
  • dejavu - The Missing Web UI for Elasticsearch: Import, browse and edit data with rich filters and query views, create search UIs visually.
  • ScrapyElasticSearch - A Scrapy pipeline that lets you store Scrapy items in Elasticsearch.
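
As a rough illustration of how these pieces fit together, a Scrapy project can forward scraped items to Elasticsearch through the ScrapyElasticSearch pipeline with settings along these lines. This is a sketch, not the project's actual configuration: the host, index name, and unique-key field below are assumptions.

```python
# Sketch of a Scrapy settings.py wiring in the ScrapyElasticSearch pipeline.
# The pipeline path is the one shipped by the scrapy-elasticsearch package;
# the host, index, and unique-key values are illustrative assumptions.

ITEM_PIPELINES = {
    "scrapyelasticsearch.scrapyelasticsearch.ElasticSearchPipeline": 500,
}

ELASTICSEARCH_SERVERS = ["http://localhost:9200"]  # assumed local deployment
ELASTICSEARCH_INDEX = "carggregator"               # hypothetical index name
ELASTICSEARCH_UNIQ_KEY = "url"                     # de-duplicate ads by URL
```

With this in place, every item yielded by a spider is indexed automatically, keyed by its URL so re-crawling the same ad updates the existing document instead of duplicating it.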

And of course Carggregator itself is open source with a public repository on GitHub.

Development

Want to contribute? Great!

Git-Flow

Carggregator uses git-flow to structure its repository. Open your favorite terminal and run these commands.

Initialize git-flow:

bash .bin/git_gitflow.sh init

Start a new feature:

git flow feature start Issue-X

Finish a feature:

git flow feature finish Issue-X

Docker

Build the development image:

make build

Connect to the development image:

make shell

Format source code:

make black

Usage

Docker

Deploy Elasticsearch and dejavu services:

make up

Deploy only Elasticsearch service:

make up/minimal

Stop deployed services preserving volumes:

make down

Stop deployed services and remove volumes:

make down/remove
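
Once the Elasticsearch service is up, indexed ads can be inspected directly over its REST API. A minimal sketch, assuming the default local port and a hypothetical carggregator index with a make field; adjust both to the actual mapping:

```python
import json
import urllib.request

def build_search_request(host="http://localhost:9200", index="carggregator"):
    """Build a simple match query against the (assumed) ads index."""
    body = {"query": {"match": {"make": "Toyota"}}, "size": 5}
    return urllib.request.Request(
        f"{host}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    # Requires a running Elasticsearch instance (e.g. after `make up/minimal`).
    with urllib.request.urlopen(build_search_request()) as resp:
        print(json.dumps(json.load(resp), indent=2))
```

dejavu offers the same kind of browsing through a web UI, so this is mainly useful for scripting checks against the index.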

Crawl and index ~n items from motor.es. If items is not specified, crawl the entire site:

make crawl-motor.es [items=n]

Crawl and index ~n items from autoscout24. If items is not specified, crawl the entire site:

make crawl-autoscout24 [items=n]

Crawl and index ~n items from autocasion. If items is not specified, crawl the entire site:

make crawl-autocasion [items=n]

Run an end-to-end demo of the complete project functionality:

make demo

Update submodules recursively:

make update

Crawl and index ~n items from motor.es, autoscout24, and autocasion. If items is not specified, crawl all three sites in full:

make crawl-all [items=n]
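
Under the hood, the items limit maps naturally onto Scrapy's built-in CLOSESPIDER_ITEMCOUNT setting, which closes a spider after roughly n scraped items. A sketch of the kind of command such a make target might expand to; the spider name motor_es is a hypothetical, not necessarily the project's actual spider name:

```shell
# Hypothetical expansion of `make crawl-motor.es items=100`.
# CLOSESPIDER_ITEMCOUNT is a standard Scrapy setting that stops the
# spider after approximately the given number of items.
items=100
cmd="scrapy crawl motor_es -s CLOSESPIDER_ITEMCOUNT=${items}"
echo "${cmd}"
```

The limit is approximate because Scrapy finishes in-flight requests before shutting the spider down, which is why the targets above advertise "~n items".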

Demo

A demo video is available in the repository: carggregator-demo.mp4

License

GNU General Public License v3.0