- List only required/directly used dependencies in the `dev` branch without versions, unless a specific version is required. Split `requirements.txt` and list only the directly used dependencies per package:
  - EnvironmentalData
  - Harvester
  - EnvDataServer
- Solve the context path problem (via configuration or similar)
- Code cleaning:
  - Remove unused imports
  - [ ] Remove unused functions and variables
- Move every config variable to `config.py`
- Establish a central location for variable definitions: keep them in `weather.py`; `EnvDataServer/app.py` accesses them via Python features; `page.js` accesses them via a lightweight endpoint implemented in `app.py`. Convert them to a dictionary and provide helper functions to retrieve the required representation (see the sketch after this list).
- Establish a `dev` branch
- Add a Package/Component Overview to the root `README.md` documenting who is using which part of the utilities package.
- Comment the wind variable in the WebUI (and API): comment `page.js:95` and `EnvDataServer/app.py:159`; remove Wind from the list of variables if it is not configurable.
- WebUI: set the time bbox to the current day (limit input to hours only, no minutes, if easily possible)
- Store the last request in a cookie or browser storage
- History Back does not work
- Replace MOTU by OPeNDAP
- `Dockerfile` of the harvester:
  - `requirements.txt` -> new name
  - copy the package and the dependency packages (utilities, EnvironmentalData)
  - adjust `CMD` to match the new Python package structure
  - add a `PYTHONPATH` environment variable
  - change `CMD` to the exec form (with `[`s)
- Identify a useful k8s/proxy timeout or switch to an asynchronous communication pattern with a managed request queue
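The central variable definition mentioned above could look roughly like the sketch below. All names (the dictionary keys, the helper function and the `/variables` route) are illustrative assumptions, not the current contents of `weather.py` or `EnvDataServer/app.py`.

```python
# Sketch of a single source of truth for the environmental variables.
from flask import Flask, jsonify

# weather.py (assumed): one dictionary instead of scattered constants.
VARIABLES = {
    "wind": {"label": "Wind speed", "unit": "m/s"},
    "temperature": {"label": "Temperature", "unit": "K"},
}

def variable_names():
    """Helper: plain list of identifiers, e.g. for the harvester."""
    return list(VARIABLES)

# EnvDataServer/app.py (assumed): a lightweight endpoint so page.js can fetch
# the same definitions instead of hard-coding them in JavaScript.
app = Flask(__name__)

@app.route("/variables")
def variables():
    return jsonify(VARIABLES)

if __name__ == "__main__":
    app.run()
```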
MariDataHarvest is a tool for scraping and harvesting Automatic Identification System (AIS) data provided by MarineCadastre and appending the weather and environmental conditions provided by CMEMS and RDA at each geographical position and UTC timestamp. A description of the datasets used is given below.
This tool is developed within the MariData project.
MariDataHarvest requires Python 3 and pip. You can install all Python requirements with the following command:

```sh
pip install -r requirements.txt
```

For a detailed list, see the License section below.
The script requires accounts for the following web services: CMEMS and RDA.

The credentials of these services MUST be entered into a file called `.env.secret`, as outlined here:

```
UN_CMEMS=
PW_CMEMS=
UN_RDA=
PW_RDA=
```
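Since `python-dotenv` is listed in the dependencies, the credentials are presumably read from `.env.secret` at runtime. The following is only a minimal sketch of such loading; the file location and the loading code are assumptions, not the project's actual code.

```python
from pathlib import Path

from dotenv import dotenv_values  # python-dotenv, see the dependency list below

# Read the CMEMS and RDA credentials from .env.secret; the path relative to
# this script is an assumption and may need adjusting.
secrets = dotenv_values(Path(__file__).parent / ".env.secret")

UN_CMEMS, PW_CMEMS = secrets["UN_CMEMS"], secrets["PW_CMEMS"]
UN_RDA, PW_RDA = secrets["UN_RDA"], secrets["PW_RDA"]
```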
Start harvesting with the following command:

```sh
python main.py --year=2019 --minutes=30 --dir=C:\..
```
- `year`: the year(s) for which to download AIS data. Expected input: a single year 'YYYY', a range of years 'YYYY-YYYY' or multiple years 'YYYY,YYYY,YYYY'.
- `minutes`: the sub-sampling interval in minutes.
- `dir`: the absolute path where the data is kept. If empty, the directory is the same as the project directory.
- optional arguments:
  - `step`: starts the script at a specific step:
    1. Download,
    2. Subsample,
    3. Appending weather data.

    If `step` equals `0` (the default value), the script runs all steps starting from step 1.
  - `clear`: clears the files of `year` ONLY after step 2 is done.
  - `depth_first`: runs all steps for each file, which automatically deactivates the `step` argument.
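For illustration only, the documented options map onto a standard `argparse` interface roughly as follows; this sketch mirrors the description above and is not the actual contents of `main.py`.

```python
import argparse

# Illustrative parser mirroring the CLI documented above; main.py may differ in detail.
parser = argparse.ArgumentParser(description="MariDataHarvest")
parser.add_argument("--year", required=True,
                    help="a year 'YYYY', a range 'YYYY-YYYY' or multiple years 'YYYY,YYYY,YYYY'")
parser.add_argument("--minutes", type=int, required=True,
                    help="sub-sampling interval in minutes")
parser.add_argument("--dir", default="",
                    help="absolute path where the data is kept (empty: project directory)")
parser.add_argument("--step", type=int, default=0, choices=[0, 1, 2, 3],
                    help="0 runs all steps; 1 download, 2 subsample, 3 append weather data")
parser.add_argument("--clear", action="store_true",
                    help="clear the files of the year once step 2 is done")
parser.add_argument("--depth-first", action="store_true",
                    help="run all steps per file; deactivates --step")

args = parser.parse_args()
print(args)
```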
Start the EnvDataAPI service locally for testing using the following commands in the root directory of the repository:

```sh
export PYTHONPATH="$PYTHONPATH:.:EnvironmentalData:utilities"
python ./EnvDataServer/app.py
```
You can use the Dockerfile to build a docker image and run the script in its own isolated environment. It is recommended to provide a volume to persist the data between runs. You can specify all arguments, including the optional ones, as environment variables when creating/starting the container, as outlined below. The labels used follow the Image and Container Label Specification of 52°North.
- Build:

  ```sh
  docker build \
      --build-arg GIT_COMMIT=$(git rev-parse -q --verify HEAD) \
      --build-arg BUILD_DATE=$(date -u +"%Y-%m-%dT%H:%M:%SZ") \
      --file Dockerfile.harvester \
      -t 52north/mari-data_harvester:1.0.0 .
  ```

  Ensure that the version of the image tag matches the version in the Dockerfile, here: `1.0.0`.
- Create named volume:

  ```sh
  docker volume create \
      --label [email protected] \
      --label org.52north.context="MariData Project: Data Harvesting Script" \
      --label org.52north.end-of-life=$(date -d "+365 days" -u +"%Y-%m-%dT%H:%M:%SZ") \
      mari-data-harvester_data
  ```
- Run:

  ```sh
  docker run \
      --label [email protected] \
      --label org.52north.context="MariData Project: Data Harvesting Script" \
      --label org.52north.end-of-life=$(date -d "+365 days" -u +"%Y-%m-%dT%H:%M:%SZ") \
      --label org.52north.created=$(date -u +"%Y-%m-%dT%H:%M:%SZ") \
      --volume mari-data-harvester_data:/mari-data/data \
      --volume .env.secret:/mari-data/EnvironmentalData/.env.secret:ro \
      --env-file docker.env \
      --name=mari-data_harvester \
      --detach \
      52north/mari-data_harvester:1.0.0 \
      && docker logs --follow mari-data_harvester
  ```

  with `docker.env` containing the following information:

  ```
  YEAR=2015-2021
  MINUTES=5
  DATA_DIR=/mari-data/data
  STEP=0
  DEPTH_FIRST=--depth-first
  CLEAR=--clear
  ```
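How these environment variables reach `main.py` inside the container depends on the harvester's Dockerfile/entrypoint; the following is only a sketch of the kind of mapping involved, using the variable names from `docker.env` above, with everything else assumed.

```python
import os
import shlex
import subprocess

# Illustrative only: translate the docker.env variables back into the documented
# CLI of main.py. The actual container entrypoint may do this differently.
cmd = (
    f"python main.py --year={os.environ['YEAR']} "
    f"--minutes={os.environ['MINUTES']} "
    f"--dir={os.environ['DATA_DIR']} "
    f"--step={os.environ.get('STEP', '0')} "
    f"{os.environ.get('DEPTH_FIRST', '')} {os.environ.get('CLEAR', '')}"
)
subprocess.run(shlex.split(cmd), check=True)
```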
Use the following command to send the code to any server for building the image (or clone this repository using `git clone ...`) and run it:

```sh
rsync --recursive --verbose --times --rsh ssh \
    --exclude='AIS-DATA' --exclude='*.tmp' \
    --exclude='*.swp' --exclude='.vscode' \
    --exclude='__pycache__' --delete . \
    mari-data-harvester.example.org:/home/user/mari-data-harvester
```
We are using an nginx container to provide web access to the generated data. It requires an external service to maintain the SSL certificates. The corresponding data volume is provided externally and mounted read-only, hence docker-compose does not create it with a project prefix.
Just execute the following command in the root folder of the repository to start the service:

```sh
docker-compose up -d --build && docker-compose logs --follow
```

The data is available directly at the server root via HTTPS. All HTTP requests are redirected to HTTPS by default.
This application is licensed under the GPLv2 (see LICENSE).
Name | Version | License |
---|---|---|
Deprecated | 1.2.13 | MIT License |
Flask | 1.1.4 | BSD License |
Flask-Limiter | 1.5 | MIT License |
Jinja2 | 2.11.3 | BSD License |
MarkupSafe | 1.1.1 | BSD License |
Paste | 3.5.2 | MIT License |
PyYAML | 5.4.1 | MIT License |
Werkzeug | 1.0.1 | BSD License |
beautifulsoup4 | 4.11.1 | MIT License |
certifi | 2022.12.7 | Mozilla Public License 2.0 (MPL 2.0) |
cftime | 1.6.2 | MIT License |
charset-normalizer | 3.0.1 | MIT License |
click | 7.1.2 | BSD License |
idna | 3.4 | BSD License |
itsdangerous | 1.1.0 | BSD License |
joblib | 1.2.0 | BSD License |
limits | 3.2.0 | MIT License |
motuclient | 1.8.8 | GNU Lesser General Public License v3 (LGPLv3) |
netCDF4 | 1.6.2 | MIT License |
numpy | 1.24.1 | BSD License |
packaging | 23.0 | Apache Software License; BSD License |
pandas | 1.5.3 | BSD License |
protobuf | 4.21.12 | 3-Clause BSD License |
python-dateutil | 2.8.2 | Apache Software License; BSD License |
python-dotenv | 0.17.0 | BSD License |
pytz | 2021.1 | MIT License |
requests | 2.28.2 | Apache Software License |
scikit-learn | 0.24.0 | new BSD |
scipy | 1.10.0 | BSD License |
siphon | 0.9 | BSD License |
six | 1.16.0 | MIT License |
soupsieve | 2.3.2.post1 | MIT License |
threadpoolctl | 3.1.0 | BSD License |
typing_extensions | 4.4.0 | Python Software Foundation License |
urllib3 | 1.26.14 | MIT License |
waitress | 2.1.2 | Zope Public License |
wrapt | 1.14.1 | BSD License |
xarray | 0.17.0 | Apache Software License |
To generate the license list, run:

```sh
docker compose run --interactive --rm api /bin/bash \
    -c "pip install --no-warn-script-location --no-cache-dir pip-licenses > /dev/null && .local/bin/pip-licenses -f markdown"
```
Project/Logo | Description |
---|---|
MariGeoRoute | MariGeoRoute is funded by the German Federal Ministry of Economic Affairs and Energy (BMWi). |