Commit

Improve doc
ghiggi committed Nov 7, 2023
1 parent 7687259 commit a18c580
Showing 9 changed files with 211 additions and 190 deletions.
66 changes: 31 additions & 35 deletions CONTRIBUTING.rst
@@ -5,11 +5,20 @@ Hi! Thanks for taking the time to contribute to DISDRODB.

You can contribute in many ways :

- Join the
`discussion <https://github.com/ltelab/disdrodb/discussions>`__
- Report `issues <#issue-reporting-guidelines>`__
- Join the
`discussions <https://github.com/ltelab/disdrodb/discussions>`__
- Report software `issues <#issue-reporting-guidelines>`__
- Help us developing new readers
- Any others code improvements are welcome !
- Add new data to the DISDRODB Decentralized Data Archive
- Implement new products (e.g. L1, L2, L3)
- ...
- Any code improvements are welcome !

**We Develop with GitHub !**

We use GitHub to host code, to track issues and feature requests, as well as accept Pull Requests.
We use `GitHub flow <https://docs.github.com/en/get-started/quickstart/github-flow>`__.
So all code changes happen through Pull Requests (PRs).


Before adding your contribution, please make sure to take a moment and read through the following documnents :
@@ -20,38 +29,25 @@ Before adding your contribution, please make sure to take a moment and read thro
- `Code review checklist <#code-review-checklist>`__


Issue Reporting
-----------------

Issue Reporting Guidelines
--------------------------
- Always use one of the available `GitHub Issue
Templates <https://github.com/ltelab/disdrodb/issues/new/choose>`__
- If you do not find the required GitHub Issue Template, please ask for a new template.

- Always use one available `issue
templates <https://github.com/ltelab/disdrodb/issues/new/choose>`__
- If you do not find the required GitHub issue template, please ask for a new template.


GitHub
-----------------------

**We Develop with GitHub !**

We use GitHub to host code, to track issues and feature requests, as well as accept Pull Requests.
We use `GitHub flow <https://docs.github.com/en/get-started/quickstart/github-flow>`__.
So all code changes happen through Pull Requests (PRs).




Contributing environment setup
Setup the contributor environment
-----------------------------------

**First Time Contributors ?**

Please follow the following steps to install your developing environment :

- Setting up the development environment
- Set up the development environment
- Install pre-commit hooks

Setting up the development environment
Set up the development environment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You will need python to set up the development environment.
@@ -111,7 +107,7 @@ Here is a brief overview of the steps that each DISDRODB developer must follow t



Fork the repository
1. Fork the repository
~~~~~~~~~~~~~~~~~~~

Once you have set the development environment (see `Setting up the development environment`_), the next step is creating
@@ -132,8 +128,8 @@ modifications. The steps to follow are:

Done! Now you have a local copy of the disdrodb repository.

Create a new branch
~~~~~~~~~~~~~~~~~~~
2. Create a new branch
~~~~~~~~~~~~~~~~~~~~~~~

Each contribution should be made in a separate new branch of your forked repository.
For example, if you plan to contribute with new readers, please create a branch for every single reader.
@@ -162,8 +158,8 @@ Please define the name of your branch based on the scope of the contribution. Tr



Work on your changes
~~~~~~~~~~~~~~~~~~~~
3. Work on your changes
~~~~~~~~~~~~~~~~~~~~~~~~~~


We follow the `PEP 8 <https://pep8.org/>`__ style guide for python code.
@@ -332,7 +328,7 @@ They are automatically run when you push your changes to the main repository via



Code testing
4. Code testing
~~~~~~~~~~~~~~~~


@@ -402,8 +398,8 @@ The Continuous Integration (CI) on GitHub runs tests and analyzes code coverage.



Push your changes to your fork repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5. Push your changes to your fork repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

During this process, pre-commit hooks will be run. Your commit will be
allowed only if quality requirements are fulfilled.
@@ -419,8 +415,8 @@ The goal is to increase readability and ease of contribution.



Create a new Pull Request in GitHub.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
6. Create a new Pull Request in GitHub.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Once your code has been uploaded into your DISDRODB fork, you can create
a Pull Request (PR) to the DISDRODB main branch.
6 changes: 3 additions & 3 deletions disdrodb/tests/test_metadata/test_metadata_io.py
@@ -156,9 +156,9 @@ def test__get_list_metadata_with_data(tmp_path):


def test_get_list_metadata_file(tmp_path):
from pathlib import Path

tmp_path = Path("/tmp/test_test")
# from pathlib import Path
# tmp_path = Path("/tmp/test_test")

base_dir = tmp_path / "DISDRODB"

data_source = "data_source"
4 changes: 2 additions & 2 deletions disdrodb/utils/netcdf.py
@@ -212,10 +212,10 @@ def ensure_unique_dimension_values(list_ds: list, fpaths: str, dim: str = "time"
List of netCDFs file paths.
"""
# Reorder the files and filepaths by the starting dimension value (time)
sorted_list_ds, sorted_fpaths = _sort_datasets_by_dim(list_ds=list_ds, fpaths=fpaths, dim=dim)
list_ds, fpaths = _sort_datasets_by_dim(list_ds=list_ds, fpaths=fpaths, dim=dim)

# Get the datasets dimension values array (and associated list_ds/xr.Dataset indices)
dim_values, list_index, ds_index = _get_dim_values_index(sorted_list_ds, dim=dim)
dim_values, list_index, ds_index = _get_dim_values_index(list_ds, dim=dim)

# Get duplicated indices
idx_duplicated = _get_duplicated_indices(dim_values, keep="first")
30 changes: 17 additions & 13 deletions docs/source/contribute_data.rst
@@ -2,25 +2,29 @@
How to Contribute New Data
==============================

Users can make their own data accessible to the community. DISDRODB
provides a central storage for code (readers), issues and metadata.
However, the raw data itself must be stored by the data provider due to
size limitations.
Users can make their own data accessible to the community.
DISDRODB provides a central storage for code (readers), issues and metadata.
However, the raw data itself must be stored by the data provider on a remote data
repository (e.g., Zenodo, Figshare, etc.).


Two types of data must be distinguished:

- Station Raw Data:
- DISDRODB Raw Data:

- Stores disdrometer measurements for days, weeks, and years.
- This dataset can be very heavy.
- No central storage is provided.
- Contain disdrometer measurements for days, weeks, and years.
- This data can be very large. No central storage is provided.
- DISDRODB provides utility functions to easily upload the raw data on remote data
repositories (i.e. Zenodo)
- DISDRODB provides utility functions to download the raw data from the remote data repositories.

- Station Metadata and Issues YAML files:
- DISDRODB Metadata and Issues YAML files:

- Stores a standard set of metadata and measurement issues of each disdrometer.
- Central storage is provided in the ``disdro-data`` Git repository.
- The ``/metadata`` folder contains a YAML metadata file called
``<station_name>.yml``. It has a ``disdrodb_data_url`` key that references to the remote/online repository where station's raw data are stored. At this URL, a single zip file provides all data available for a given station.
- Each disdrometer station has a standardized metadata and issue YAML file.
- The ``disdrodb_data_url`` metadata key references to the remote/online repository where
- station's raw data are stored. At this URL, a single zip file provides all data available for a given station.
- The DISDRODB Metadata Archive, hosted on the ``disdro-data`` GitHub repository, acts as a centralized storage
for the metadata and issue YAML files of all DISDRODB stations.


Data transfer upload and download schema:
33 changes: 19 additions & 14 deletions docs/source/data_download.rst
@@ -23,18 +23,16 @@ Then clone the DISDRODB Metadata Archive repository with:
This will create a directory called ``disdrodb-data``.

.. note:: Remember that the DISDRODB Metadata Archive is often updated with new stations or metadata.
To update your local DISDRODB Metada Archive (and therefore download recently added new stations), run:
To update your local DISDRODB Metada Archive (and therefore download recently added new stations),
run :code:`git pull` inside the ``disdrodb-data`` directory.

.. code:: bash
git pull

Define the DISDRODB root directory
Define the DISDRODB Base Directory
------------------------------------------

The DISDRODB root directory is the directory ``DISDRODB`` inside ``disdrodb-data`` (i.e. ``<the_root_folder>/disdrodb-data/DISDRODB>``) .
The DISDRODB base directory is the directory ``DISDRODB`` inside ``disdrodb-data`` (i.e. ``<the_root_folder>/disdrodb-data/DISDRODB>``) .

You can set the default DISDRODB root directory by running in python:
You can set the default DISDRODB base directory by running in python:

.. code:: python
@@ -47,9 +45,11 @@ By running this command, the disdrodb software will write a ``.config_disdrodb.y
that will be used as default configuration file when running the disdrodb software.


Alternatively, you can also define the DISDRODB root directory as an environment variable ``DISDRODB_BASE_DIR``.
Alternatively, you can also define the DISDRODB base directory as an environment variable ``DISDRODB_BASE_DIR``.
In the terminal, you must type the following command:

.. code:: bash
export DISDRODB_BASE_DIR="<the_root_folder>/disdrodb-data/DISDRODB"
.. note:: It's important to remember that the environment variable ``DISDRODB_BASE_DIR`` (if defined) will take priority over the default path
@@ -68,17 +68,22 @@ To download all data stored into the DISDRODB Decentralized Data Archive, you ju
disdrodb_download_archive --data_sources <data_source> --campaign_names <campaign_name> --station_names <station_name> --force true
The ``data_sources``, ``campaign_names`` and ``station_names`` parameters are optional and are meant to restrict the download processing to a specific
data source, campaign, or station.
The ``data_sources``, ``campaign_names`` and ``station_names`` parameters are optional and are meant to restrict the download to a specific set of
data sources, campaigns, and/or stations.

Parameters:

- ``data_sources`` (optional): Station data source.
- ``campaign_names`` (optional): Station campaign name.
- ``data_sources`` (optional): Station data sources.
- ``campaign_names`` (optional): Station campaign names.
- ``station_names`` (optional): Name of the stations.
- ``force`` (optional, default = ``False``): a boolean value indicating
whether existing files should be overwritten.

To download data from multiple data sources, campaigns, or stations, please provide a space-separated string of
the data sources, campaigns or stations you require. For example, to download all EPFL and NASA data use ``--data_sources "EPFL NASA"``,
while if you want to download only stations named in a specific way, use ``--station_names "station1 station2"``.
the data sources, campaigns or stations you require.

For example:

- if you want to download all EPFL and NASA data use ``--data_sources "EPFL NASA"``,
- if you want to download only stations of specific campaigns, use ``--campaign_names "HYMEX_LTE_SOP3 HYMEX_LTE_SOP4"``.
- if you want to download only stations named in a specific way, use ``--station_names "station1 station2"``.
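
The space-separated convention described above can be sketched in Python. This is only an illustration, not part of the disdrodb package: the helper name ``build_download_command`` is hypothetical, and the only things taken from the documentation are the ``disdrodb_download_archive`` command, its ``--data_sources``, ``--campaign_names``, ``--station_names`` and ``--force`` options, and the rule that multiple values are passed as one space-separated string. The sketch only assembles and prints the command; it does not run it.

```python
import shlex


def build_download_command(data_sources=None, campaign_names=None, station_names=None, force=False):
    """Assemble the argument list for ``disdrodb_download_archive``.

    Hypothetical helper for illustration: each list of names is joined
    into a single space-separated string, as the documentation requires.
    """
    cmd = ["disdrodb_download_archive"]
    options = {
        "--data_sources": data_sources,
        "--campaign_names": campaign_names,
        "--station_names": station_names,
    }
    for flag, values in options.items():
        if values:  # skip options that were not requested
            cmd += [flag, " ".join(values)]
    if force:
        # The documented command line passes the value explicitly.
        cmd += ["--force", "true"]
    return cmd


if __name__ == "__main__":
    cmd = build_download_command(
        data_sources=["EPFL", "NASA"],
        campaign_names=["HYMEX_LTE_SOP3", "HYMEX_LTE_SOP4"],
    )
    # shlex.join quotes the multi-word values so the line can be pasted in a shell.
    print(shlex.join(cmd))
```

Keeping the command as a list of arguments (rather than one long string) means each space-separated group stays a single argument if the list is later handed to ``subprocess.run``.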