Juranic/music dir schema (#1324)
* Create music-v2.0.yaml

Create next-gen MUSIC directory schema (non-specified files only accepted in extras directory, per update at 4/15/24 DCWG Meeting).

* Update CHANGELOG.md

Add next-gen MUSIC directory schema

* Docs: Update MUSIC docs

* Docs: Update YAML/XLSX

---------

Co-authored-by: Juan Puerto <=>
j-uranic authored Apr 19, 2024
1 parent bf9e76d commit ac2651a
Showing 9 changed files with 81 additions and 0 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -7,6 +7,7 @@
- Allow multiple comma-separated parent_sample_id values
- Accommodate dir schema minor versions
- Fix ORCID URL checking
- Add MUSIC next-gen directory schema

## v0.0.18

3 changes: 3 additions & 0 deletions docs/field-assays.yaml
@@ -223,6 +223,7 @@ assay_category:
- MS (shotgun lipidomics)
- MS Bottom-Up
- MS Top-Down
- MUSIC
- Micro CT
- Molecular Cartography
- Multiplex Ion Beam Imaging
@@ -306,6 +307,7 @@ assay_type:
- MS (shotgun lipidomics)
- MS Bottom-Up
- MS Top-Down
- MUSIC
- Micro CT
- Molecular Cartography
- Multiplex Ion Beam Imaging
@@ -825,6 +827,7 @@ is_cedar:
- MALDI
- MERFISH
- MRI
- MUSIC
- Micro CT
- Molecular Cartography
- Multiplex Ion Beam Imaging
Binary file modified docs/field-schemas.xlsx
Binary file not shown.
3 changes: 3 additions & 0 deletions docs/field-schemas.yaml
@@ -156,6 +156,7 @@ assay_category:
- mibi
- microct
- mri
- music
- mxif
- nano
- nano-splits
@@ -216,6 +217,7 @@ assay_type:
- mibi
- microct
- mri
- music
- mxif
- nano
- nano-splits
@@ -586,6 +588,7 @@ is_cedar:
- mibi
- microct
- mri
- music
- nano-splits
- oct
- phenocycler
1 change: 1 addition & 0 deletions docs/music/current/README.md
@@ -0,0 +1 @@
Moved to [github pages](https://hubmapconsortium.github.io/ingest-validation-tools/music/).
39 changes: 39 additions & 0 deletions docs/music/current/index.md
@@ -0,0 +1,39 @@
---
title: MUSIC
schema_name: music
category: Sequence Assays
all_versions_deprecated: False
exclude_from_index: False
layout: default

---
Prepare your metadata based on the latest metadata schema using one of the template files below. See the instructions in the [Metadata Validation Workflow](https://docs.google.com/document/d/1lfgiDGbyO4K4Hz1FMsJjmJd9RdwjShtJqFYNwKpbcZY) document for more information on preparing and validating your metadata.tsv file prior to submission.

Related files:


- [📝 Excel template](https://raw.githubusercontent.com/hubmapconsortium/dataset-metadata-spreadsheet/main/music/latest/music.xlsx): For metadata entry.
- [📝 TSV template](https://raw.githubusercontent.com/hubmapconsortium/dataset-metadata-spreadsheet/main/music/latest/music.tsv): Alternative for metadata entry.




## Metadata schema


<summary><a href="https://openview.metadatacenter.org/templates/https:%2F%2Frepo.metadatacenter.org%2Ftemplates%2F5efe0d51-828c-457a-9b94-2ac8090fe14f"><b>Version 2 (use this one)</b></a></summary>



<br>

## Directory schemas
<summary><b>Version 2.0 (use this one)</b></summary>

| pattern | required? | description |
| --- | --- | --- |
| <code>extras\/.*</code> | ✓ | Folder for general lab-specific files related to the dataset. |
| <code>raw\/fastq\/[^\/]+_R[^\/]+\.fastq\.gz</code> | ✓ | The raw un-multiplexed fastq files. |
| <code>lab_processed\/fastq\/DNA\/[^\/]+_R[^\/]+\.fastq\.gz</code> | ✓ | This is a GZip'd version of the fastq files from whole genome sequencing. |
| <code>lab_processed\/fastq\/RNA\/[^\/]+_R[^\/]+\.fastq\.gz</code> | ✓ | This is a GZip'd version of the forward and reverse fastq files from RNAseq sequencing (R1 and R2). |

21 changes: 21 additions & 0 deletions src/ingest_validation_tools/directory-schemas/music-v2.0.yaml
@@ -0,0 +1,21 @@
files:
  -
    pattern: extras\/.*
    required: True
    description: Folder for general lab-specific files related to the dataset.
  -
    pattern: raw\/fastq\/[^\/]+_R[^\/]+\.fastq\.gz
    required: True
    description: The raw un-multiplexed fastq files.
    is_qa_qc: False
  -
    pattern: lab_processed\/fastq\/DNA\/[^\/]+_R[^\/]+\.fastq\.gz
    required: True
    description: This is a GZip'd version of the fastq files from whole genome sequencing.
    is_qa_qc: False
  -
    pattern: lab_processed\/fastq\/RNA\/[^\/]+_R[^\/]+\.fastq\.gz
    required: True
    description: This is a GZip'd version of the forward and reverse fastq files from RNAseq sequencing (R1 and R2).
    is_qa_qc: False
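
Reviewer note: a minimal sketch of how the four patterns above could be applied to an upload's relative file paths. It is not the validator's actual code. The helper name `check_paths`, the use of `re.fullmatch`, and the rule that every `required: True` pattern needs at least one matching file are assumptions made for illustration. It shows the behaviour described in the commit message: files not covered by a pattern are only accepted under `extras/`.

```python
import re

# Patterns copied verbatim from music-v2.0.yaml.
PATTERNS = [
    r"extras\/.*",
    r"raw\/fastq\/[^\/]+_R[^\/]+\.fastq\.gz",
    r"lab_processed\/fastq\/DNA\/[^\/]+_R[^\/]+\.fastq\.gz",
    r"lab_processed\/fastq\/RNA\/[^\/]+_R[^\/]+\.fastq\.gz",
]


def check_paths(paths):
    """Hypothetical helper, not part of ingest-validation-tools.

    Returns (unmatched_paths, unsatisfied_patterns):
    - unmatched_paths: files that match none of the schema patterns
    - unsatisfied_patterns: patterns that no file matched
    """
    compiled = [re.compile(p) for p in PATTERNS]
    unmatched = [
        path for path in paths
        if not any(c.fullmatch(path) for c in compiled)
    ]
    unsatisfied = [
        c.pattern for c in compiled
        if not any(c.fullmatch(path) for path in paths)
    ]
    return unmatched, unsatisfied


if __name__ == "__main__":
    example = [
        "extras/notes.txt",
        "raw/fastq/s1_R1.fastq.gz",
        "lab_processed/fastq/DNA/s1_R1.fastq.gz",
        "lab_processed/fastq/RNA/s1_R2.fastq.gz",
        "raw/readme.txt",  # not under extras/, so no pattern covers it
    ]
    print(check_paths(example))
```

Under these assumptions, a stray file such as `raw/readme.txt` is reported as unmatched, while the same file placed under `extras/` would pass.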

1 change: 1 addition & 0 deletions src/ingest_validation_tools/enums.py
@@ -67,6 +67,7 @@
"MERFISH",
"MS (shotgun lipidomics)",
"MIBI",
"MUSIC",
"Multiplex Ion Beam Imaging",
"Molecular Cartography",
"NanoDESI",
12 changes: 12 additions & 0 deletions src/ingest_validation_tools/table-schemas/assays/music-v2.yaml
@@ -0,0 +1,12 @@
fields:
  - name: is_cedar
    description: 'Identifies whether the version is hosted by CEDAR'
    example: 'https://openview.metadatacenter.org/templates/https:%2F%2Frepo.metadatacenter.org%2Ftemplates%2F5efe0d51-828c-457a-9b94-2ac8090fe14f'
  - name: assay_category
    constraints:
      enum:
        - sequence
  - name: assay_type
    constraints:
      enum:
        - MUSIC
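
Reviewer note: the `enum` constraints above restrict `assay_category` and `assay_type` to a single value each. The following is an illustrative sketch of that check on one metadata row. The `ENUMS` dict layout and the `enum_errors` helper are hypothetical and do not reflect how the tool loads or applies table schemas.

```python
# Enum constraints copied from music-v2.yaml; the dict layout is illustrative.
ENUMS = {
    "assay_category": ["sequence"],
    "assay_type": ["MUSIC"],
}


def enum_errors(row):
    """Hypothetical helper: list the fields whose value is outside the enum."""
    errors = []
    for field, allowed in ENUMS.items():
        value = row.get(field)
        if value not in allowed:
            errors.append(f"{field}: {value!r} not in {allowed}")
    return errors


if __name__ == "__main__":
    print(enum_errors({"assay_category": "sequence", "assay_type": "MUSIC"}))
    print(enum_errors({"assay_category": "imaging", "assay_type": "MUSIC"}))
```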
