
Web Scraping Builders Data

This repository contains Python scripts that scrape data about real-estate builders and their projects from 99acres.com and magicbricks.com. The scripts use the BeautifulSoup and Selenium libraries to parse the websites' HTML and extract the desired data.
Explore the docs »

View Demo · Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Results
  5. Contributing

About The Project

This repository contains Python scripts that scrape data about real-estate builders and their projects from 99acres.com and magicbricks.com. The scripts use the BeautifulSoup and Selenium libraries to parse the websites' HTML and extract the desired data.
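To illustrate the parsing step, here is a minimal sketch: given the HTML of a builders listing page, BeautifulSoup pulls out the builder names. In the real scripts the HTML comes from requests or a Selenium-driven browser, and the `.builder-name` selector is a placeholder, not the sites' actual markup.

```python
from bs4 import BeautifulSoup

def extract_builder_names(html: str) -> list:
    """Return the text of every element matching the (hypothetical) builder-name selector."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.select(".builder-name")]

# Tiny inline sample standing in for a fetched listing page.
sample = (
    '<div class="card"><span class="builder-name">Kanha Group</span></div>'
    '<div class="card"><span class="builder-name">Nilamber Group</span></div>'
)
```

Inspect the live pages in a browser's developer tools to find the real selectors before adapting this.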

Disclaimer

This project is for educational purposes only. The data that is scraped from the websites 99acres.com and magicbricks.com is not intended to be used for commercial purposes. The websites 99acres.com and magicbricks.com are not affiliated with this project in any way.

The developer(s) of this project make no guarantees about the accuracy or completeness of the data that is scraped. The data may be changed or removed at any time by the websites 99acres.com and magicbricks.com.

Results

The data scraped from 99acres.com and magicbricks.com is saved in CSV format. Sample outputs and screenshots are shown below.

  • CSV Output - [99acres.com]

    | #  | builder_name        | projects_total | projects_completed |
    |----|---------------------|----------------|--------------------|
    | 0  | Darshanam Group     | 50             | 34                 |
    | 1  | Nyalkaran Group     | 30             | 13                 |
    | 2  | Kanha Group         | 23             | 14                 |
    | 3  | Nilamber Group      | 21             | 15                 |
    | 4  | Narayan Realty      | 20             | 16                 |
    | 5  | Ananta Group        | 18             | 14                 |
    | 6  | Earth Group Gujarat | 18             | 17                 |
    | 7  | Pawan Group         | 19             | 16                 |
    | 8  | Raama Group         | 18             | 12                 |
    | 9  | Samanvay Realty     | 20             | 14                 |
    | 10 | Taksh Group         | 15             | 13                 |


  • CSV Output - [magicbricks.com]

    | #  | name                                   | projects_total          | projects_completed |
    |----|----------------------------------------|-------------------------|--------------------|
    | 0  | Pacifica Companies                     | 8 Projects in Vadodara  | 6 Completed        |
    | 1  | Akshar Group                           | 14 Projects in Vadodara | 9 Completed        |
    | 2  | Sangani Infrastructure India Pvt. Ltd. | 3 Projects in Vadodara  | 1 Completed        |
    | 3  | Shreenath Group                        | 12 Projects in Vadodara | 10 Completed       |
    | 4  | Nilamber Group                         | 23 Projects in Vadodara | 18 Completed       |
    | 5  | Pratham Enterprises                    | 14 Projects in Vadodara | 12 Completed       |
    | 6  | J P Iscon                              | 2 Projects in Vadodara  | 2 Completed        |
    | 7  | Narayan Realty Ltd.                    | 11 Projects in Vadodara | 8 Completed        |
    | 8  | Alembic Group Alchemy Real Estate      | 2 Projects in Vadodara  | 2 Completed        |
    | 9  | Shreeji Infrastucture                  | 2 Projects in Vadodara  | 1 Completed        |
    | 10 | Kanha Group                            | 22 Projects in Vadodara | 11 Completed       |

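Note that the magicbricks output stores counts as raw strings like "14 Projects in Vadodara" and "9 Completed". A small helper, shown here as a post-processing sketch rather than part of the original scripts, can pull the leading integer out of such fields for numeric analysis:

```python
import re

def leading_int(text: str):
    """Return the integer at the start of a field like '14 Projects in Vadodara', or None."""
    match = re.match(r"\s*(\d+)", text)
    return int(match.group(1)) if match else None
```

Applying this to the `projects_total` and `projects_completed` columns yields plain integers suitable for sorting or charting.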

(back to top)

Built With

Python

(back to top)

Getting Started

Prerequisites

This is a list of things you need to use the software and how to install them.

  • Python3

The libraries used for this project are:

  • BeautifulSoup
  • Selenium
  • Requests
  • Pandas
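Based on the list above, the repository's requirements.txt would contain roughly the following (version pins omitted; python-dotenv is an assumption, included here because the setup reads URLs from a .env file):

```text
beautifulsoup4
selenium
requests
pandas
python-dotenv
```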

Installation

  1. Clone the repository.

    git clone https://github.com/sagarparmar881/web-scrapping-builders-data.git
    
  2. Install requirements.

    pip install -r requirements.txt
    
  3. Set the URLs in the .env file.

    • For magicbricks.com
    • Search on Google for the builders page of the city whose data you want to scrape. Use the URL below as a reference; the URL must follow the same pattern or the program may produce unexpected results.

    BASE_URL_MAGICBRICKS="https://www.magicbricks.com/mbutility/builders-in-Vadodara"
    
    • More example URLs for magicbricks.com:

      https://www.magicbricks.com/mbutility/builders-in--ahmedabad
      
      https://www.magicbricks.com/mbutility/builders-in--Surat
      
    • For www.99acres.com
    • Search on Google for the builders page of the city whose data you want to scrape. Use the URL below as a reference; the URL must follow the same pattern or the program may produce unexpected results.

    BASE_URL_99_ACERS="https://www.99acres.com/builders-in-vadodara-bffid"
    
    • More example URLs for 99acres.com:
      https://www.99acres.com/builders-in-ahmedabad-bffid
      
      https://www.99acres.com/builders-in-surat-bffid
      

(back to top)

Usage

For magicbricks.com
  1. Run the file named data_scrap_magicbricks.py

    python3 data_scrap_magicbricks.py
For www.99acres.com
  1. Run the file named data_scrap_99acers.py

    python3 data_scrap_99acers.py

(back to top)

Results

For magicbricks.com & www.99acres.com
  • The final CSV file will be saved in the data folder of the root directory of the project.
  • The respective file names will be:
    • 99acers_builders_in_vadodara_26082023_154912.csv
    • magicbricks_builders_in_vadodara_26082023_152856.csv
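As a quick post-processing sketch, the output CSVs can be loaded back with pandas and ranked by completed projects. The inline sample mirrors the 99acres output columns shown earlier; for a real run you would point `read_csv` at one of the files in data/.

```python
import io
import pandas as pd

# Inline stand-in for an output file such as data/99acers_builders_in_vadodara_<timestamp>.csv
sample_csv = io.StringIO(
    "builder_name,projects_total,projects_completed\n"
    "Darshanam Group,50,34\n"
    "Nyalkaran Group,30,13\n"
    "Earth Group Gujarat,18,17\n"
)
df = pd.read_csv(sample_csv)
# Rank builders by number of completed projects, highest first.
top = df.sort_values("projects_completed", ascending=False)
```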

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)
