This repository contains multiple web scraping tools for gathering player statistics and game logs from various professional sports reference websites, including:
- 🏀 Basketball (NBA) - Basketball Reference
- 🏈 Football (NFL) - Pro Football Reference
- ⚾ Baseball (MLB) - Baseball Reference
- 🏒 Hockey (NHL) - Hockey Reference
- ⚽ Soccer (MLS) - FBref (Major League Soccer and other leagues)
These scrapers extract detailed statistics, player data, and game logs for each sport. Scraped data can be persisted in a MongoDB database, or the scrapers can fetch it from the source sites on demand.
- Scrapes player data and game logs from sports reference websites.
- Supports multiple leagues: NBA, NFL, MLB, NHL, and MLS.
- Allows users to persist the scraped data into MongoDB for future queries.
- Each scraper is tailored to its respective sport's data structure and website.
The scrapers can use MongoDB for caching and persistent storage: scraped data is stored there and can be queried later. If MongoDB is not used, the scrapers fetch data directly from the sports reference sites.
- Use MongoDB for faster access to already-scraped data.
- Perform on-demand scraping from sports reference websites when MongoDB is not available.
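
The two modes above amount to a cache-aside lookup: read from MongoDB first and scrape only on a miss, or scrape every time when no database is configured. The following is a minimal sketch of that idea, not the repository's actual code. The function name `get_player_stats`, the `scrape` callable, and the use of the player ID as `_id` are all assumptions made for illustration; `collection` is assumed to behave like a `pymongo` collection (`find_one`, `replace_one`).

```python
from typing import Any, Callable, Optional


def get_player_stats(
    player_id: str,
    scrape: Callable[[str], dict],
    collection: Optional[Any] = None,
) -> dict:
    """Return stats for player_id, using MongoDB as a cache when available.

    `scrape` is a hypothetical callable that fetches stats from a
    reference site. `collection` is assumed to expose the pymongo
    Collection methods find_one and replace_one; pass None to always
    scrape on demand.
    """
    if collection is None:
        # No MongoDB configured: go to the reference site every time.
        return scrape(player_id)

    # Cache hit: return the stored document without scraping.
    cached = collection.find_one({"_id": player_id})
    if cached is not None:
        return cached

    # Cache miss: scrape once, then store the result for later queries.
    doc = dict(scrape(player_id))
    doc["_id"] = player_id
    collection.replace_one({"_id": player_id}, doc, upsert=True)
    return doc
```

With `pymongo` installed, a real collection could be passed in as `MongoClient("mongodb://localhost:27017")["sports"]["players"]`, where the URI, database name, and collection name are placeholders. Each scraper's README remains the reference for how that scraper actually connects to MongoDB.
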
- Basketball (NBA): Scrape player and game statistics from Basketball Reference.
Each sport-specific scraper contains its own README file, detailing how to:
- Install the required dependencies.
- Set up MongoDB (if used).
- Run the scraper with appropriate flags and inputs for fetching and storing data.
See the README in each scraper folder for instructions specific to the sport you're interested in.
Stay tuned as we continue to expand support for more sports and enhance the scraping features! 😎⚽🏀⚾🏒🏈