The hydro data scraper is a tool to collect operational hydrological data from the Linked Data Services platform (LINDAS). This repository contains the code that can be deployed on a server to periodically read the latest observations from LINDAS.
We assume you have an Ubuntu server available with Docker Engine (installation instructions) and git (sudo apt-get install git) installed.
- Clone the git repository:
git clone https://github.com/hydrosolutions/hydro_data_scraper.git
(in this example we clone it to /data, so the repository ends up in /data/hydro_data_scraper)
- Edit the gauge station IDs in the .env file of the repository (a geospatial layer with station IDs is available on map.geo.admin.ch).
- Pull the docker image that is created from the main branch of this repository:
docker pull mabesa/hydro-scraper
- Edit your crontab to run the hydro data scraper periodically (in this example at every ninth minute of the hour):
- Open the crontab file for editing:
crontab -e
- Add the following line:
*/9 * * * * docker run --rm -v /data/hydro_data_scraper:/app mabesa/hydro-scraper:latest >> /data/hydro_data_scraper/logfile.log 2>&1
- Save your edits and exit the editor
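One detail of the schedule is worth checking: in the minute field of a crontab entry, `*/9` matches every ninth minute counted from 0 within each hour, so the runs are not spaced at a uniform 9-minute interval. The sketch below lists the minutes matched by such a step expression. The function name `cron_step_minutes` is hypothetical and written only for this illustration; it is not part of the repository or of cron.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): list the minutes of the hour
# that a cron minute field of the form "*/STEP" matches.
cron_step_minutes() {
    step="$1"
    minute=0
    out=""
    # The minute field covers 0..59; a step starts at 0 and adds STEP each time.
    while [ "$minute" -le 59 ]; do
        out="${out:+$out }$minute"
        minute=$((minute + step))
    done
    printf '%s\n' "$out"
}

cron_step_minutes 9
```

For a step of 9 this prints `0 9 18 27 36 45 54`. The scraper therefore runs at these minutes of every hour, and the gap between the run at minute 54 and the next run at minute 0 is 6 minutes rather than 9.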
Note that the scraper writes a file called lindas_hydro_data.csv to /data/hydro_data_scraper/data, and that this file will grow quickly over time.
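Because the output file keeps growing, it may be worth archiving it from time to time. The following is a minimal sketch and not part of the repository: the function name `rotate_if_large`, the size threshold, and the timestamped archive name are assumptions chosen for illustration. It also assumes that the scraper creates a fresh lindas_hydro_data.csv when the old one has been moved away, which should be verified before relying on this.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): move FILE aside once it is
# larger than MAX_BYTES.
# Usage: rotate_if_large FILE MAX_BYTES
rotate_if_large() {
    file="$1"
    max_bytes="$2"
    # Nothing to do if the file does not exist yet.
    [ -f "$file" ] || return 0
    # wc -c reports the size in bytes; tr strips padding some systems add.
    size=$(wc -c < "$file" | tr -d ' ')
    if [ "$size" -gt "$max_bytes" ]; then
        # Timestamped archive name, e.g. lindas_hydro_data.csv.20240101T120000
        mv "$file" "$file.$(date +%Y%m%dT%H%M%S)"
    fi
}

# Example call with the path from this README and an assumed 100 MB limit:
# rotate_if_large /data/hydro_data_scraper/data/lindas_hydro_data.csv 104857600
```

If the files written by the container are owned by root, the call may need to run from root's crontab. Scheduling it at a minute that the scraper schedule does not match reduces the chance of moving the file while it is being written.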