diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/0_Upload_datafiles.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/0_Upload_datafiles.png new file mode 100644 index 00000000000000..ba73f86a953a2c Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/0_Upload_datafiles.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/10_MetaShARK_personal.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/10_MetaShARK_personal.png new file mode 100644 index 00000000000000..0514c295ac1f08 Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/10_MetaShARK_personal.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/11_MetaShARK_abstract.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/11_MetaShARK_abstract.png new file mode 100644 index 00000000000000..359bf725ca64a8 Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/11_MetaShARK_abstract.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/12_MetaShARK_methods.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/12_MetaShARK_methods.png new file mode 100644 index 00000000000000..f98b70c29372ad Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/12_MetaShARK_methods.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/13_MetaShARK_keywords.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/13_MetaShARK_keywords.png new file mode 100644 index 00000000000000..9338dbf844060c Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/13_MetaShARK_keywords.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/14_MetaShARK_makeeml.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/14_MetaShARK_makeeml.png new file mode 100644 index 00000000000000..0e4ac535952e0a Binary files /dev/null and 
b/topics/ecology/tutorials/MetaShARK_tutorial/Images/14_MetaShARK_makeeml.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/15_MetaShRIMPS.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/15_MetaShRIMPS.png new file mode 100644 index 00000000000000..4d1c4c6b29111b Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/15_MetaShRIMPS.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/17_MetaShARK_annotations_predicate.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/17_MetaShARK_annotations_predicate.png new file mode 100644 index 00000000000000..9a9fbcb33a91e2 Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/17_MetaShARK_annotations_predicate.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/18_MetaShARK_annotations_object.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/18_MetaShARK_annotations_object.png new file mode 100644 index 00000000000000..c6009111364303 Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/18_MetaShARK_annotations_object.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/1_upload_shapefile.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/1_upload_shapefile.png new file mode 100644 index 00000000000000..2e5fae5ec49f9d Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/1_upload_shapefile.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/2_MetaShARK.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/2_MetaShARK.png new file mode 100644 index 00000000000000..f8716ddbab304f Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/2_MetaShARK.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/3_MetaShARK_create.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/3_MetaShARK_create.png new file mode 100644 index 
00000000000000..45a9958e55a318 Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/3_MetaShARK_create.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/4_MetaShARK_upload.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/4_MetaShARK_upload.png new file mode 100644 index 00000000000000..4ac020bd689821 Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/4_MetaShARK_upload.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/5_MetaShARK_geotiffattributes.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/5_MetaShARK_geotiffattributes.png new file mode 100644 index 00000000000000..56d0d57f6a1bec Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/5_MetaShARK_geotiffattributes.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/6_MetaShARK_catvars.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/6_MetaShARK_catvars.png new file mode 100644 index 00000000000000..055e1733a45222 Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/6_MetaShARK_catvars.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/7_MetaShARK_spatialinfo.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/7_MetaShARK_spatialinfo.png new file mode 100644 index 00000000000000..6e953e4d93a71c Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/7_MetaShARK_spatialinfo.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/8_MetaShARK_geocov.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/8_MetaShARK_geocov.png new file mode 100644 index 00000000000000..a8cbb7052c653d Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/8_MetaShARK_geocov.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/9_MetaShARK_taxcov.png 
b/topics/ecology/tutorials/MetaShARK_tutorial/Images/9_MetaShARK_taxcov.png new file mode 100644 index 00000000000000..303baed06d1389 Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/9_MetaShARK_taxcov.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/Download_HTML.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/Download_HTML.png new file mode 100644 index 00000000000000..f203fe65d382d8 Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/Download_HTML.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/Download_docx.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/Download_docx.png new file mode 100644 index 00000000000000..62ccb4201eb8d4 Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/Download_docx.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/FAIR_data_principles.jpg b/topics/ecology/tutorials/MetaShARK_tutorial/Images/FAIR_data_principles.jpg new file mode 100644 index 00000000000000..bce54a64c4df47 Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/FAIR_data_principles.jpg differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/Fairscore_tab.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/Fairscore_tab.png new file mode 100644 index 00000000000000..63558f25fc04d7 Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/Fairscore_tab.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/Fairscore_tab2.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/Fairscore_tab2.png new file mode 100644 index 00000000000000..66f5726920b430 Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/Fairscore_tab2.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/WARNING.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/WARNING.png new file 
mode 100644 index 00000000000000..6ffc747e483d5a Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/WARNING.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/readme.md b/topics/ecology/tutorials/MetaShARK_tutorial/Images/readme.md new file mode 100644 index 00000000000000..0ad7138c196132 --- /dev/null +++ b/topics/ecology/tutorials/MetaShARK_tutorial/Images/readme.md @@ -0,0 +1 @@ +cecc diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/upload_1.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/upload_1.png new file mode 100644 index 00000000000000..b146db4cfdf651 Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/upload_1.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/Images/upload_2.png b/topics/ecology/tutorials/MetaShARK_tutorial/Images/upload_2.png new file mode 100644 index 00000000000000..9b47b335948c0e Binary files /dev/null and b/topics/ecology/tutorials/MetaShARK_tutorial/Images/upload_2.png differ diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/data-library.yaml b/topics/ecology/tutorials/MetaShARK_tutorial/data-library.yaml new file mode 100644 index 00000000000000..55e8b25cb7ae71 --- /dev/null +++ b/topics/ecology/tutorials/MetaShARK_tutorial/data-library.yaml @@ -0,0 +1,13 @@ +--- +destination: + type: library + name: GTN - Material + description: Galaxy Training Network Material + synopsis: Galaxy Training Network Material. 
See https://training.galaxyproject.org +items: +- name: New topic + description: Topic summary + items: + - name: Creating metadata using Ecological Metadata Language (EML) standard with EML Assembly Line functionalities + items: [] + diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/tutorial.bib b/topics/ecology/tutorials/MetaShARK_tutorial/tutorial.bib new file mode 100644 index 00000000000000..9206b0b6e4cae4 --- /dev/null +++ b/topics/ecology/tutorials/MetaShARK_tutorial/tutorial.bib @@ -0,0 +1,42 @@ + +# This is the bibliography file for your tutorial. +# +# To add bibliography (bibtex) entries here, follow these steps: +# 1) Find the DOI for the article you want to cite +# 2) Go to https://doi2bib.org and fill in the DOI +# 3) Copy the resulting bibtex entry into this file +# +# To cite the example below, in your tutorial.md file +# use {% cite Batut2018 %} +# +# If you want to cite an online resourse (website etc) +# you can use the 'online' format (see below) +# +# You can remove the examples below + +@article{Batut2018, + doi = {10.1016/j.cels.2018.05.012}, + url = {https://doi.org/10.1016/j.cels.2018.05.012}, + year = {2018}, + month = jun, + publisher = {Elsevier {BV}}, + volume = {6}, + number = {6}, + pages = {752--758.e1}, + author = {B{\'{e}}r{\'{e}}nice Batut and Saskia Hiltemann and Andrea Bagnacani and Dannon Baker and Vivek Bhardwaj and + Clemens Blank and Anthony Bretaudeau and Loraine Brillet-Gu{\'{e}}guen and Martin {\v{C}}ech and John Chilton + and Dave Clements and Olivia Doppelt-Azeroual and Anika Erxleben and Mallory Ann Freeberg and Simon Gladman and + Youri Hoogstrate and Hans-Rudolf Hotz and Torsten Houwaart and Pratik Jagtap and Delphine Larivi{\`{e}}re and + Gildas Le Corguill{\'{e}} and Thomas Manke and Fabien Mareuil and Fidel Ram{\'{i}}rez and Devon Ryan and + Florian Christoph Sigloch and Nicola Soranzo and Joachim Wolff and Pavankumar Videm and Markus Wolfien and + Aisanjiang Wubuli and Dilmurat Yusuf and James Taylor and 
Rolf Backofen and Anton Nekrutenko and Bj\"{o}rn Gr\"{u}ning}, + title = {Community-Driven Data Analysis Training for Biology}, + journal = {Cell Systems} +} + +@online{gtn-website, + author = {GTN community}, + title = {GTN Training Materials: Collection of tutorials developed and maintained by the worldwide Galaxy community}, + url = {https://training.galaxyproject.org}, + urldate = {2021-03-24} +} diff --git a/topics/ecology/tutorials/MetaShARK_tutorial/tutorial.md b/topics/ecology/tutorials/MetaShARK_tutorial/tutorial.md new file mode 100644 index 00000000000000..c7bf348854303e --- /dev/null +++ b/topics/ecology/tutorials/MetaShARK_tutorial/tutorial.md @@ -0,0 +1,307 @@ +--- +layout: tutorial_hands_on + +title: Creating metadata using Ecological Metadata Language (EML) standard + with EML Assembly Line functionalities +zenodo_link: https://zenodo.org/records/10663465 +questions: +- How to generate detailed metadata easily from biodiversity datasets? +- How to use an international metadata standard? +- How to update metadata information? +objectives: +- Explain the necessity of using such tools when producing ecological metadata +- Learn how to use the interactive tool MetaShARK +- Understand the challenges MetaShARK is trying to respond to +- Learn how to create rich metadata using Ecological Metadata Language (EML) standard +- Learn how to update EML metadata +time_estimation: 30M +key_points: +- This tool aims to improve the FAIR quality of metadata, focusing on user experience and automatic inferences +- Creating metadata as FAIR as possible is a must +- Be careful about the metadata format and standard used, as only EML metadata will work +tags: + - Metadata + - EML + - Ecology + - Biodiversity + - FAIR + - Data Paper +draft: true +contributions: + authorship: + - yvanlebras + - ThibaudGlinez + editing: + - yvanlebras + - ThibaudGlinez + - hexylena + funding: + - fnso2019 + - pndb + +--- + + +
This tutorial teaches how to use the functionalities of the EML Assembly Line R package to produce rich metadata using the Ecological Metadata Language (EML) international metadata standard. We will work through a concrete example of how to use Galaxy Ecology tools to create, evaluate and modify EML metadata content using both EML Assembly Line metadata template files, which are tabular and easily readable and editable by humans, and XML files, intended for machines.
+ +> A major obstacle when researchers write metadata documents is that international metadata standards often use formats, such as XML or JSON, that are not easily read or edited by humans. To address this issue, the [Environmental Data Initiative](https://edirepository.org/) (EDI), through the EML Assembly Line R package, proposes generating intermediate metadata template files in a classical tabular text format.
+Another major issue when filling in metadata is the time needed to write, and often rewrite, metadata elements that could be filled in automatically through inference or web services. Here again, the Environmental Data Initiative (EDI), through the EML Assembly Line R package, proposes to automatically generate information on data attributes, geographic coverage and taxonomic coverage from the content of the provided data files.
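
To make the idea of automatic inference concrete, here is a minimal Python sketch that drafts a tab-separated attributes template from a small data file. It is an illustration only, not the EML Assembly Line implementation (which is an R package): the function names, the heuristic for spotting categorical columns and the sample data are all assumptions, and the column names are modelled on the attribute templates described in this tutorial.

```python
import csv
import io


def infer_class(values):
    """Guess a coarse attribute class from the non-empty values of one column."""
    filled = [v for v in values if v.strip() != ""]
    if not filled:
        return "character"
    try:
        for v in filled:
            float(v)
        return "numeric"
    except ValueError:
        pass
    # Assumed heuristic: few distinct values relative to the row count
    # suggests a categorical variable.
    if len(set(filled)) <= max(1, len(filled) // 2):
        return "categorical"
    return "character"


def draft_attributes_template(data_text, delimiter=","):
    """Return a tab-separated template with one row per column of the data file."""
    rows = list(csv.reader(io.StringIO(data_text), delimiter=delimiter))
    header, body = rows[0], rows[1:]
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # Column names are an assumption for this sketch.
    writer.writerow(["attributeName", "attributeDefinition", "class", "unit"])
    for i, name in enumerate(header):
        column = [r[i] for r in body if i < len(r)]
        # Definition and unit are left empty for a human to complete.
        writer.writerow([name, "", infer_class(column), ""])
    return out.getvalue()


# Hypothetical sample data, not taken from the tutorial dataset.
sample = "site,species,ph\nA,Fucus,8.1\nB,Fucus,8.0\nA,Ulva,7.9\nB,Ulva,8.2\n"
template = draft_attributes_template(sample)
print(template)
```

The point of the intermediate file is that the inferred columns are only a starting point: the definitions and units stay empty so that a person can complete them in a spreadsheet or text editor before the XML is generated.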
+
+Finally, the MetaShARK R Shiny app, created by the French biodiversity data hub research infrastructure (Pôle national de données de Biodiversité (PNDB)), provides a graphical user interface to apply the EML Assembly Line workflow and benefit from some additional functionalities such as:
According to the [GBIF](https://www.gbif.org/data-papers) (Global Biodiversity Information Facility), +> A data paper is a peer reviewed document describing a dataset, published in a peer reviewed journal. It takes effort to prepare, curate and describe data. +> Data papers provide recognition for this effort by means of a scholarly article.
+{: .comment} + + + +# 2] Get data to describe 💾📂 + +>MetaShARK should recognize that the three `02_Ref` files represent a single shapefile. It should also guess each data type and infer the list of attributes for each file except the geotiff `Present.Surface.pH.tif`. You therefore need to select this data file and upload the `attributes_Present.Surface.pH.txt` metadata template file so that MetaShARK can fill in its attributes (here the attribute is named "Present.Surface.pH").
+ + + +You can then provide a description for this attribute, for example "Present surface pH", and review the attribute information of each data file. Then click on the "Next" button to go to the next step and provide information on categorical variables!
+ + + +Clicking the "Next" button then allows you to fill in spatial information about all data files recognized as GIS files, here the `Present.Surface.pH.tif` geotiff raster file and the `02_Ref` shapefile vector file. The geotiff is in pixel with unknown accuracy, the shapefile is in Point, and both use the `GCS_WGS_1984` spatial reference.
+ + + +The next step is devoted to specifying geographic coverage.
You can choose between two methods, "columns" or "custom". "Custom" allows you to create one or several geographical sites using a map widget where you can draw the limits of each site or directly enter latitude and longitude coordinates. The "columns" method, used here, allows you to specify an attribute containing site names and then the associated latitude and longitude attributes.
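
As a rough illustration of what the "columns" method implies, the Python sketch below groups rows by a site-name column and derives a bounding box per site from the latitude and longitude columns. This is an assumption about the general idea, not MetaShARK's actual code: the function name, column names and coordinates are hypothetical.

```python
import csv
import io
from collections import defaultdict


def site_bounding_boxes(data_text, site_col, lat_col, lon_col):
    """Group rows by site name and return the bounding box of each site."""
    boxes = defaultdict(
        lambda: {"north": -90.0, "south": 90.0, "east": -180.0, "west": 180.0}
    )
    for row in csv.DictReader(io.StringIO(data_text)):
        lat, lon = float(row[lat_col]), float(row[lon_col])
        box = boxes[row[site_col]]
        box["north"] = max(box["north"], lat)
        box["south"] = min(box["south"], lat)
        # Naive min/max: a site crossing the 180° meridian is not handled here.
        box["east"] = max(box["east"], lon)
        box["west"] = min(box["west"], lon)
    return dict(boxes)


# Hypothetical sample data with decimal-degree coordinates.
sample = (
    "site,lat,lon\n"
    "Roscoff,48.72,-3.98\n"
    "Roscoff,48.75,-3.95\n"
    "Brest,48.39,-4.49\n"
)
boxes = site_bounding_boxes(sample, "site", "lat", "lon")
for name, box in boxes.items():
    print(name, box)
```

A site described by a single row collapses to a point, with identical north/south and east/west values, which is a legitimate geographic coverage for a point sample.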
Now that geographic coverage is set, you can specify taxonomic coverage.
To do so, select a data attribute containing taxonomic information, then the kind of notation you want, and finally the taxonomic authority (or authorities) against which the information will be compared. Note that this can take a while if you have many taxa, and the processing time is repeated for each additional authority selected.
Now we can fill in personal information.
To do so, the easiest way is to provide ORCID identifiers for each person involved as creator, contact and/or PI. Depending on the information each person has entered in ORCID and on its level of accessibility, all fields can be filled in automatically. If "PI" is selected, you can specify a project name, funder name and related funding number.
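
Since a mistyped ORCID iD would prevent any automatic lookup, it can help to check identifiers before entering them. The Python sketch below applies the checksum that ORCID documents for its identifiers (ISO 7064 MOD 11-2). Whether MetaShARK performs such a check itself is not stated in this tutorial, and the function names are hypothetical.

```python
import re


def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Check the 0000-0000-0000-000X layout and the final check character."""
    if not re.fullmatch(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]", orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


# 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))
print(is_valid_orcid("0000-0002-1825-0098"))
```

A valid checksum only shows that the identifier is well formed. It does not confirm that the record exists or that its fields are publicly accessible.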
To do so, open the MetaShARK parameters (upper right icon), then enter your CEDAR token. To create a CEDAR account, 1/ register at http://cedar.metadatacenter.org/, then 2/ go to the "profile" page on http://cedar.metadatacenter.org/, where 3/ you can find the API key.
+> +> The API key to enter looks something like: +> ``` +> api 205b1e521f2eaf0ad4a361c438b63205b1e521f2eaf0ad4a361c438b63c438b63 +> ``` +> +>You can then use the `+` button in the keyword area to **Add keyword with dataset annotation**. You will have to choose a "predicate" from the IAO ontology, then an "object" from the ontologies available through Bioportal, to add information about a "subject", the ‘thing’ being annotated. Here, for keywords, the subject is "dataset", but you can apply the same approach to data file "attributes".
+> +> +> +> +> +{: .comment} + +Finally, you can specify a temporal coverage and go to the last step of this MetaShARK workflow: generate an EML metadata file! If everything is OK, an EML metadata file will be created.
+ + + +Once the EML is written, you can download the data package with the "Download Data Package" button. This gives you a zip archive that you can unzip on your local computer. The resulting files are organized in 2 main folders:
+ +- **A main folder with data_objects** + - `all datafiles ` you uploaded into MetaShARK + - `eml ` which is the EML metadata file written in XML format + - `metadata_templates ` with all metadata files written in text format, with tab-separated columns + +- A **second folder called "emldown"** where a **draft of data paper** written in HTML format can be accessed + +👏 Congratulations! You've just produced your first EML yourself!👏
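
To check the content of the downloaded data package without unzipping it by hand, a short Python sketch such as the one below can list the files under each top-level folder. The archive built here is a stand-in: every folder and file name in it is hypothetical, chosen only to resemble the structure described above, and a real package may be laid out differently.

```python
import io
import zipfile
from collections import defaultdict


def summarize_package(archive):
    """Map each top-level folder of a zip archive to the files it contains."""
    layout = defaultdict(list)
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            top, sep, rest = name.partition("/")
            if sep:
                layout[top].append(rest)
            else:
                layout["."].append(top)  # file stored at the archive root
    return {folder: sorted(files) for folder, files in layout.items()}


# Build a tiny in-memory archive with a hypothetical layout for demonstration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("my_package/data_objects/observations.csv", "site,ph\nA,8.1\n")
    zf.writestr("my_package/eml/my_package.xml", "<eml/>")
    zf.writestr("my_package/metadata_templates/attributes_observations.txt", "")
    zf.writestr("emldown/index.html", "<html></html>")
buffer.seek(0)

layout = summarize_package(buffer)
for folder, files in layout.items():
    print(folder)
    for path in files:
        print("  ", path)
```

With a real download, pass the path of the zip file to `summarize_package` instead of the in-memory buffer.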
+By clicking on the "Draft of Data Paper" tab, you will have access to the draft of Data Paper presented in an HTML format. You can either navigate through the Data Paper with the tabs or with the scrollbar on the right and access different elements.
+ +- At the top of the page, you can **download the draft** in either an _HTML format_ .... : + + + +- .... or an editable _docx format_ : + + + +These principles aim to improve the access and usability of data by machines and to help make data reusable and shareable for users. They underpin why it is necessary to produce rich, well-described metadata so that external users can understand and reuse data for their own studies. + +There are several ways of computing the FAIR index, as each letter of the word is associated with a degree of FAIRness of the data.
+ +>