From 14dd491efa55d975b04961baf6ac2688b10afbc9 Mon Sep 17 00:00:00 2001 From: ghiggi Date: Wed, 1 Nov 2023 15:03:45 +0100 Subject: [PATCH] Improve documentation and README --- .github/PULL_REQUEST_TEMPLATE.md | 2 +- AUTHORS.md | 2 +- CONTRIBUTING.rst | 2 +- README.md | 136 +++++++++++++----- docs/source/{data.rst => contribute_data.rst} | 104 ++++++-------- docs/source/data_download.rst | 84 +++++++++++ docs/source/index.rst | 11 +- docs/source/maintainers_guidelines.rst | 2 +- docs/source/metadata_archive.rst | 73 ++++++++++ .../{overview.rst => software_structure.rst} | 0 10 files changed, 314 insertions(+), 102 deletions(-) rename docs/source/{data.rst => contribute_data.rst} (57%) create mode 100644 docs/source/data_download.rst create mode 100644 docs/source/metadata_archive.rst rename docs/source/{overview.rst => software_structure.rst} (100%) diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index 4502d502..22a885ed 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -49,7 +49,7 @@ If adding a **new feature**, the PR's description includes: **Other information:** -# Related GitHub issues and pull requests +# Related GitHub issues and Pull Requests - Ref: # diff --git a/AUTHORS.md b/AUTHORS.md index e54121a0..326af451 100644 --- a/AUTHORS.md +++ b/AUTHORS.md @@ -1,4 +1,4 @@ -# Project contributors +# Project Contributors The following people have made contributions to this project: diff --git a/CONTRIBUTING.rst b/CONTRIBUTING.rst index ed5a8d37..1bd1f108 100644 --- a/CONTRIBUTING.rst +++ b/CONTRIBUTING.rst @@ -1,4 +1,4 @@ -Contributing guide +Contributors Guidelines =========================== Hi! Thanks for taking the time to contribute to DISDRODB. diff --git a/README.md b/README.md index 8a22e9d2..df015ce2 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -# DISDRODB - A package to standardize, process and analyze global disdrometer data. 
+# 📦 DISDRODB - A package to standardize, process and analyze global disdrometer data. .. |pypi| image:: https://badge.fury.io/py/disdrodb.svg @@ -7,22 +7,19 @@ .. |conda| image:: https://img.shields.io/conda/vn/conda-forge/disdrodb.svg?logo=conda-forge&logoColor=white :target: https://anaconda.org/conda-forge/disdrodb - .. |pypi_downloads| image:: https://img.shields.io/pypi/dm/disdrodb.svg?label=PyPI%20downloads :target: https://pypi.org/project/disdrodb/ .. |conda_downloads| image:: https://img.shields.io/conda/dn/conda-forge/disdrodb.svg?label=Conda%20downloads :target: https://anaconda.org/conda-forge/disdrodb -# TODO PYTHON VERSIONS -[![image](https://img.shields.io/pypi/pyversions/ruff.svg)](https://pypi.python.org/pypi/ruff) -.. |python| image:: https://img.shields.io/badge/python-3.8+-blue.svg +.. |versions| image:: https://img.shields.io/badge/Python-3.8%20|%203.9%20|%203.10%20|%203.11|%203.12-blue :target: https://www.python.org/downloads/ + :alt: Supported Python Versions .. |status| image:: https://www.repostatus.org/badges/latest/active.svg :target: https://www.repostatus.org/#active - .. |tests| image:: https://github.com/ltelab/disdrodb/actions/workflows/tests.yml/badge.svg :target: https://github.com/ltelab/disdrodb/actions/workflows/tests.yml @@ -53,26 +50,32 @@ .. |licence| image:: https://img.shields.io/github/license/ltelab/disdrodb :target: https://github.com/ltelab/disdrodb/blob/main/LICENSE -# TODO slack link and badge .. |slack| image:: https://img.shields.io/badge/Slack-disdrodb-green.svg?logo=slack - :target: http://slack.disdrodb.org + :target: https://disdrodbworkspace.slack.com/ .. |discussion| image:: https://img.shields.io/badge/GitHub-Discussions-green?logo=github :target: https://github.com/ltelab/disdrodb/discussions -.. |ruff| image:: https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json) - :target: https://github.com/astral-sh/ruff - :alt: ruff +.. 
|ruff| image:: https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json + :target: https://github.com/astral-sh/ruff + :alt: ruff .. |black| image:: https://img.shields.io/badge/code%20style-black-000000.svg?style=flat - :target: https://github.com/psf/black - :alt: black + :target: https://github.com/psf/black + :alt: black -# TODO badge -.. |codespell| image:: https://img.shields.io/badge/code%20style-black-000000.svg?style=flat +.. |codespell| image:: https://img.shields.io/badge/Codespell-enabled-brightgreen :target: https://github.com/codespell-project/codespell :alt: codespell +.. |openssf| image:: https://www.bestpractices.dev/projects/XXXX/badge + :target: https://www.bestpractices.dev/projects/XXXX + :alt: OpenSSF Best Practices + +.. |pyopensci| image:: https://tinyurl.com/XXXX + :target: https://github.com/pyOpenSci/software-review/issues/XXX + :alt: pyOpenSci + .. |joss| image:: http://joss.theoj.org/papers//joss./status.svg :target: https://doi.org/ @@ -91,13 +94,15 @@ +----------------------+---------------------------------------------+ | Build Status | |tests| |lint| |docs| | +----------------------+---------------------------------------------+ +| Linting | |black| |ruff| |codespell| | ++----------------------+---------------------------------------------+ | Code Coverage | |coverall| |codecov| | +----------------------+---------------------------------------------+ -| Code Quality |codefactor| |codebeat| | +| Code Quality | |codefactor| |codebeat| | | +---------------------------------------------+ | | |codacy| |codescene| | +----------------------+---------------------------------------------+ -| Linting | |black| |ruff| |codespell| | +| Code Review | |pyopensci| |openssf| | +----------------------+---------------------------------------------+ | License | |licence| | +----------------------+------------------------+--------------------+ @@ -108,45 +113,104 @@ 
[**Slack**](http://slack.disdrodb.org) | [**Docs**](https://disdrodb.readthedocs.io/en/latest/) - DISDRODB is part of an initial effort to index, collect and homogenize drop size distribution (DSD) data sets across the globe, as well as to establish a global standard for disdrometers observations data sharing. -DISDRODB standards are being established following FAIR data best practices and Climate & Forecast (CF) conventions, and will facilitate the preprocessing, analysis and visualization of disdrometer data. +DISDRODB standards are being established following FAIR data best practices and Climate & Forecast (CF) conventions, and will facilitate +the preprocessing, analysis and visualization of disdrometer data. + +## ℹ️ Software Overview + +The software currently enable to: +- download the raw disdrometer data from all stations included in the DISDRODB Decentralized Data Archive +- upload raw disdrometer data from the user to the DISDRODB Decentralized Data Archive +- process more than 400 disdrometer stations into a standard NetCDF format (DISDRODB L0 product) + +Currently, the DISDRODB Working Group is discussing the development of various scientific products. : +If you have ideas, algorithms, data or expertise to share, or you want to contribute to the future DISDRODB products, do not hesitate to **GET IN TOUCH** !!! + +Join the [**DISDRODB Slack Workspace**](http://slack.disdrodb.org) to meet the DISDRODB Community ! + + +## 🚀 Quick Start + +You're about to create your very own DISDRODB Data Archive. All it takes is a simple command-line journey to your chosen directory. + +#### 📚 Set up the DISDRODB Metadata And Local Data Archive + +Let's start by travel to the directory where you want to store the DISDRODB Data Archive with :code:`cd `. -The DISDRODB archive is composed of 3 product levels: -- L0 provides the raw sensors measurements converted into a standardized netCDF4 format. -- L1 provides L0 homogenized and quality-checked data. 
-- L2 provides scientific products derived from the L1 data. +Then clone the DISDRODB Metadata Archive repository with: -The code required to the generate the DISDRODB archive is enclosed in the `production` directory of the repository. +.. code:: bash -The code facilitating the analysis and visualization of the DISDRODB archive is available in the `api` directory. + git clone https://github.com/ltelab/disdrodb-data.git +This will create a directory called ``disdrodb-data``, which is ready to be filled with data from the DISDRODB Decentralized Data Archive. -The software documentation is available at [https://disdrodb.readthedocs.io/en/latest/](https://disdrodb.readthedocs.io/en/latest/). +But before starting to download some data, we need to specify the location of the DISDRODB Local Archive. -Currently: -- only the DISDRODB L0 product generation has been implemented; -- the pipeline for DISDRODB L1 and L2 product generation is in development; -- the DISDRODB API is in development; -- more than 300 sensors have been already processed to DISDRODB L0; -- tens of institutions have manifested their interest in adopting the DISDRODB tools and standards. +You can specify once for ever the default DISDRODB Local Archive directory by running in python: -Consequently **IT IS TIME TO GET INVOLVED**. If you have ideas, algorithms, data or expertise to share, do not hesitate to **GET IN TOUCH** !!! 
+```python + import disdrodb + disdrodb_dir = "/disdrodb-data/DISDRODB>" + disdrodb.define_configs(disdrodb_dir=disdrodb_dir) +``` +or set up a (temporary) environment variable `DISDRODB_DIR` in your terminal with: +```bash + export DISDRODB_DIR="/disdrodb-data/DISDRODB>" +``` +#### 📥 Download the raw data of the DISDRODB stations -## Installation +To download all data stored into the DISDRODB Decentralized Data Archive, you just have to run the following command: + +```bash + download_disdrodb_archive +``` + +#### 💫 Transform the raw data to standardized netCDF files (DISDRODB L0 product). + +Then, if you want to convert all stations raw data into standardized netCDF4 files, run the following command in the terminal: + +```bash + + run_disdrodb_l0 + +``` + +#### 📖 Explore the DISDRODB documentation + +To discover all download and processing options, or how to contribute your own data to DISDRODB, +please read the software documentation available at [https://disdrodb.readthedocs.io/en/latest/](https://disdrodb.readthedocs.io/en/latest/). + +If you want to improve to the DISDRODB Metadata Archive repository, you can explore the repository +at [https://github.com/ltelab/disdrodb-data](https://github.com/ltelab/disdrodb-data) + + +## 🛠️ Installation DISDRODB can be installed from PyPI with pip: - ```sh + ```bash + pip install disdrodb + ``` -## Contributors +## 💭 Feedback and Contributing Guidelines + +If you aim to contribute your data or discuss the future development of DISDRODB, +we highly suggest to join the [**DISDRODB Slack Workspace**](http://slack.disdrodb.org) + +Feel free to also open a [GitHub Issue](https://github.com/ltelab/disdrodb/issues) or a +[GitHub Discussion](https://github.com/ltelab/disdrodb/discussions) specific to your questions or ideas. 
+ + +## โœ๏ธ Contributors * [Gionata Ghiggi](https://people.epfl.ch/gionata.ghiggi) * [Kim Candolfi](https://github.com/KimCandolfi) diff --git a/docs/source/data.rst b/docs/source/contribute_data.rst similarity index 57% rename from docs/source/data.rst rename to docs/source/contribute_data.rst index 02f40807..0d741792 100644 --- a/docs/source/data.rst +++ b/docs/source/contribute_data.rst @@ -1,7 +1,6 @@ -===== -Data -===== - +============================== +Contribute Data to DISDRODB +============================== Users can make their own data accessible to the community. DISDRODB provides a central storage for code (readers), issues and metadata. @@ -29,20 +28,6 @@ Data transfer upload and download schema: .. image:: /static/transfer.png -Download the DISDRODB metadata archive ------------------------------------------ - -First travel to the directory where you want to store the data. -Then clone the disdrodb-data repository with: - -.. code:: bash - - git clone https://github.com/ltelab/disdrodb-data.git - -However, if you plan to add new data or metadata to the archive, first -fork the repository on your GitHub account and then clone the forked -repository. - Update the DISDRODB metadata archive ---------------------------------------- @@ -72,47 +57,18 @@ follow these steps: 7. Test that the integration of your new dataset functions by deleting your data locally and re-fetching it through the process detailed above. -8. `Create a pull - request `__, - and wait for a maintainer to accept it! +8. Go to the `Github DISDRODB Metadata Repository `__, open the Pull Request and wait for a maintainer to accept it! + For more information on GitHub Pull Requests, read the + `"Create a pull request documentation" `__. -9. If you struggle with this process, donโ€™t hesitate to raise an `issue `__ so we can help! +9. If you struggle with this process, do not hesitate to raise an `issue `__ so we can help! 
-Download the DISDRODB raw data archive ---------------------------------------- -Prerequisite: First clone the disdrodb-data repository as described above to get the DISDRODB directory structure. -Objective: You would like to download the raw data referenced in some metadata ``.yml`` file. -In order to download the data, you should be in a virtual environment with the disdrodb package installed! +Upload your stations data on Zenodo and link it to the DISDRODB Decentralized Data Archive +---------------------------------------------------------------------------------------------- -To download all data, just run: - -.. code:: bash - - download_disdrodb_archive --data_sources --campaign_names --station_names --force true - -The ``disdrodb_dir`` parameter is compulsory and must include the path -of the root folder, ending with ``DISDRODB``. The other parameters are -optional and are meant to restrict the download processing to a specific -data source, campaign, or station. - -Parameters: - -- ``data_sources`` (optional): Station data source. -- ``campaign_names`` (optional): Station campaign name. -- ``station_names`` (optional): Name of the stations. -- ``force`` (optional, default = ``False``): a boolean value indicating - whether existing files should be overwritten. - -To download data from multiple data sources or campaigns, please provide a space-separated string of -the data sources or campaigns you require. For example, ``"EPFL NASA"``. - - -Add new stations raw data to the DISDRODB archive (using Zenodo) ------------------------------------------------------------------ - -We provide users with a code to upload their station’s raw data to `Zenodo `_. +We provide users with a code to easily upload their stations raw data to `Zenodo `_. .. code:: bash @@ -129,6 +85,7 @@ Parameters: - ``campaign_names`` (optional): the name of the campaign. - ``station_names`` (optional): the name of the station. - ``platform`` (optional, default is Zenodo). 
+ Currently, only Zenodo is supported. - ``force`` (optional, default = ``False``): a boolean value indicating whether files already uploaded somewhere else should still be included. @@ -136,9 +93,6 @@ Parameters: To upload data from multiple data sources or campaigns, please provide a space-separated string of the data sources or campaigns you require. For example, ``"EPFL NASA"``. - -Currently, only Zenodo is supported. - After running this command, the user will be prompted to insert a Zenodo token. Once the data is uploaded, a link will be displayed that the user must use to go to the Zenodo web interface and manually publish the @@ -151,3 +105,39 @@ To get a Zenodo token, go to .. image:: /static/zenodo.png + + + +Test the download and DISDRODB L0 processing of the stations you contributed +------------------------------------------------------------------------------ + +To test that the data upload has been successfuland you specified correctly all the required metadata, let's first try to download +the data you just uploaded from the DISDRODB Decentralized Data Archive. + +To do so, first make a copy of the DISDRODB metadata archive you just edited, in order to inadvertently delete the data you just uploaded. + +Then, run the following command to download the data you just uploaded: + +.. code:: bash + export DISDRODB_DIR=" --campaign_names --force true + +::note + Be sure to specify a ``DISDRODB_DIR`` environment variable that points to a copy of the metadata archive you edited + otherwise you risk to overwrite the data you just uploaded! + +If the download is successful, and you also already implemented the DISDRODB reader for your data, you can now try to process the data you just downloaded. + +To do so, run the following command: + +.. 
code:: bash + export DISDRODB_DIR=" --campaign_names + + ::note + If the correctness of the reader has already been tested, you can add the ``--debugging_mode True`` parameter to just run the processing + on a small subset of the data. This will speed up the processing and will allow you to check that the processing is working correctly. + + +If the processing is successful, you can now open a Pull Request to merge your changes to the DISDRODB metadata archive. +Congratulations !!! Your data are now available to the community !!! diff --git a/docs/source/data_download.rst b/docs/source/data_download.rst new file mode 100644 index 00000000..7ec2aadf --- /dev/null +++ b/docs/source/data_download.rst @@ -0,0 +1,84 @@ +========================= +DISDRODB Data Download +========================= + +In this section, we describe how to download disdrometer data from the DISDRODB Decentralized Data Archive to your local machine. +First, is however necessary to download on your local machine the DISDRODB Metadata Archive, which contains the pointers +to the remote data repositiores where the DISDRODB stations are stored. + +.. note:: The DISDRODB Metadata Archive is often updated with new stations or metadata. + Therefore, we recommend to update your local DISDRODB Metadata Archive regularly (see below). + +Download the official DISDRODB Metadata Archive +----------------------------------------------- + +First travel to the directory where you want to store the DISDRODB Data Archive with :code:`cd ` + +Then clone the DISDRODB Metadata Archive repository with: + +.. code:: bash + + git clone https://github.com/ltelab/disdrodb-data.git + +This will create a directory called ``disdrodb-data``. + +.. note:: Remember that the DISDRODB Metadata Archive is often updated with new stations or metadata. + To update your local DISDRODB Metada Archive (and therefore download recently added new stations), run: + + .. 
code:: bash + git pull + + +Define the DISDRODB root directory +------------------------------------------ + +The DISDRODB root directory is the directory ``DISDRODB`` inside ``disdrodb-data`` (i.e. ``/disdrodb-data/DISDRODB>``) . + +You can set the default DISDRODB root directory by running in python: + +.. code:: python + + import disdrodb + + disdrodb_dir = "/disdrodb-data/DISDRODB>" + disdrodb.define_configs(disdrodb_dir=disdrodb_dir) + +By running this command, the disdrodb software will write a ``.config_disdrodb.yml`` file into your home directory (i.e. ``~/.config_disdrodb.yml``) +that will be used as default configuration file when running the disdrodb software. + + +Alternatively, you can also define the DISDRODB root directory as an environment variable ``DISDRODB_DIR``. +In the terminal, you must type the following command: +.. code:: bash + export DISDRODB_DIR="/disdrodb-data/DISDRODB" + +.. note:: It's important to remember that the environment variable ``DISDRODB_DIR`` (if defined) will take priority over the default path + defined in the ``.config_disdrodb.yml`` file. + + +Download the DISDRODB Data Archive +--------------------------------------- + +In order to download the data, you should be in a virtual environment with the disdrodb package installed! +Refers to the installation section for more details on how to set-up and activate the virtual environment. + +To download all data stored into the DISDRODB Decentralized Data Archive, you just have to run the following command: + +.. code:: bash + + download_disdrodb_archive --data_sources --campaign_names --station_names --force true + +The ``data_sources``, ``campaign_names`` and ``station_names`` parameters are optional and are meant to restrict the download processing to a specific +data source, campaign, or station. + +Parameters: + +- ``data_sources`` (optional): Station data source. +- ``campaign_names`` (optional): Station campaign name. 
+- ``station_names`` (optional): Name of the stations. +- ``force`` (optional, default = ``False``): a boolean value indicating + whether existing files should be overwritten. + +To download data from multiple data sources, campaigns, or stations, please provide a space-separated string of +the data sources, campaigns or stations you require. For example, to download all EPFL and NASA data use ``--data_sources "EPFL NASA"``, +while if you want to download only stations named in a specific way, use ``--station_names "station1 station2"``. diff --git a/docs/source/index.rst b/docs/source/index.rst index 8136b0cc..48c865a8 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -42,22 +42,23 @@ Documentation :maxdepth: 2 installation - overview - data + data_download + l0_processing + metadata_archive + contribute_data metadata readers sensor_configs - l0_processing contributors_guidelines maintainers_guidelines authors - + software_structure .. toctree:: :maxdepth: 1 - disdrodb API + DISDRODB API diff --git a/docs/source/maintainers_guidelines.rst b/docs/source/maintainers_guidelines.rst index 91372136..c6c66b76 100644 --- a/docs/source/maintainers_guidelines.rst +++ b/docs/source/maintainers_guidelines.rst @@ -1,5 +1,5 @@ ======================== -Maintainers guidelines +Maintainers Guidelines ======================== diff --git a/docs/source/metadata_archive.rst b/docs/source/metadata_archive.rst new file mode 100644 index 00000000..13bfafe7 --- /dev/null +++ b/docs/source/metadata_archive.rst @@ -0,0 +1,73 @@ +========================== +DISDRODB Metadata Archive +========================== + +The DISDRODB metadata repository is hosted on GitHub and serves as a central hub for tracking available stations, +the potential malfunctioning of the sensors, and to list the URLs of the remote data repositories where the raw disdrometer data are stored. 
+The GitHub platform facilitates community collaboration to continuously enhance station metadata using best open-source practices. +This approach also enables recursive data quality improvements, while keeping the DISDRODB product chain transparent and fully reproducible. + +To ensure quality and consistency of metadata, a comprehensive standard set of metadata keys has been established. +The DISDRODB community is empowered to pinpoint specific timestamps or periods when sensors might have malfunctioned or + generated erroneous data logs through specific issues YAML files. + +The DISDRODB Metadata Repository is therefore updated on a regular basis to reflect the latest status of the stations and the data availability. + +Here below we detail the necessary step to add/update the information of the DISDRODB Metadata Archive. + + +Fork and download the DISDRODB Metadata Archive +--------------------------------------------------- + +If you plan to add new data to the DISDRODB Decentralized Data Archive or you want to just update +some station metadata/issues information, go to the +`DISDRODB metadata repository `__, +fork the repository on your GitHub account and then clone the forked repository: + +.. code:: bash + + git clone https://github.com//disdrodb-data.git + + +Update the DISDRODB Metadata Archive +---------------------------------------- + +To update the DISDRODB Metadata Archive follow these steps: + +1. Go inside the ``disdrodb-data`` directory where you have cloned the repository: + +2. Create a new branch. + + .. code:: bash + + git checkout -b "reader--" + + .. note:: + If you are adding information regarding a new station, please name the branch as follows: ``reader--``. + If you are just improving some specific information of an existing station, please name the branch as follows: ``update---``. + +3. Edit or add the metadata files that you are interested in. + +4. 
When you are done, please run the following command to check that the metadata files are valid: + + .. code:: bash + + export DISDRODB_DIR="/disdrodb-data/DISDRODB" + disdrodb_check_metadata_compliance + + .. note:: + The ``DISDRODB_DIR`` environment variable has to be specified only if the DISDRODB root directory had not been specified before. + +5. Commit your changes and push your branch to GitHub: + + .. code:: bash + + git add * + git commit -m "Add/update metadata for " + git push origin + +6. Go to the `Github DISDRODB Metadata Repository `__, open the Pull Request and wait for a maintainer to accept it! + For more information on GitHub Pull Requests, read the + `"Create a pull request documentation" `__. + +7. If you struggle with this process, do not hesitate to raise an `issue `__ so we can help! diff --git a/docs/source/overview.rst b/docs/source/software_structure.rst similarity index 100% rename from docs/source/overview.rst rename to docs/source/software_structure.rst