update dev #94

Merged
merged 40 commits into from
Aug 11, 2021
41dc530
Minor typo correction
kee007ney Aug 23, 2019
575c4fa
Update provenance-domain.md
jpat1546 Dec 2, 2019
f06ebdd
test
HadleyKing Jan 16, 2020
bbd7715
tsting
HadleyKing Jan 16, 2020
732272d
Merge branch 'dev' into 1.4.0
HadleyKing Jan 16, 2020
3787b47
Add CHANGELOG.md and log.sh
HadleyKing Jan 19, 2020
0aae6a8
Remove IEEE_Docs
HadleyKing Jan 19, 2020
a147cc6
Deleted IEEE files
HadleyKing Jan 21, 2020
d9ac543
[WIP] release_protocol.md
HadleyKing Jan 21, 2020
cd653fd
Fix small typo
ktaletsk Jan 21, 2020
803f274
Merge pull request #87 from ktaletsk/patch-1
jpat1546 Jan 22, 2020
6c6ce8c
stash
HadleyKing Feb 20, 2020
1ee53d0
Add ieee-2791-schema
HadleyKing Mar 5, 2020
0f6f863
Add 2791 language
jpat1546 Mar 26, 2020
e9bb143
Add 2791 language
jpat1546 Mar 26, 2020
5c7f83d
Create Best_Practices
jpat1546 Apr 28, 2020
ee84a10
Create best_practices
jpat1546 Apr 29, 2020
82c0093
Rename best_practices to best_practices.md
jpat1546 Apr 29, 2020
c2e056f
Update best_practices.md
jpat1546 Apr 30, 2020
ec4070c
Update best_practices.md
kee007ney May 1, 2020
be65ecb
Updates to schema files based on IEEE 2791-202 publication.
HadleyKing May 19, 2020
88b9426
Update README.md
HadleyKing Jul 16, 2020
6e9e18d
Merge branch 'main' into 1.4.0
HadleyKing Jul 16, 2020
1007e5a
Merge pull request #89 from biocompute-objects/1.4.0
HadleyKing Jul 16, 2020
4e19388
Update parametric-domain.md
jpat1546 Oct 1, 2020
a81c2c9
Update io-domain.md
jpat1546 Oct 1, 2020
dff716e
Update error-domain.md
jpat1546 Oct 1, 2020
f2cd663
Add Hugo site files
HadleyKing Nov 14, 2020
550443f
Rename files for Hugo build
HadleyKing Nov 14, 2020
93758d3
Add theme
HadleyKing Nov 14, 2020
ec1048a
Add build commands
HadleyKing Nov 14, 2020
4fa6e03
Update build configs
HadleyKing Nov 14, 2020
8ea7e4c
Major overhaul. Fix #77
HadleyKing Nov 14, 2020
2ad94ac
Add favicon
HadleyKing Nov 14, 2020
7b57500
try to fix the code highlight issue
nanxstats Nov 19, 2020
75d2c56
Update example BCO files
HadleyKing Feb 18, 2021
bed2662
Merge branch 'main' into 2.0.0
HadleyKing Aug 10, 2021
16d239f
Merge branch '2.0.0' into main
HadleyKing Aug 10, 2021
392af21
Refactoring for Portal linking
HadleyKing Aug 11, 2021
0d061c5
Merge pull request #93 from biocompute-objects/local
HadleyKing Aug 11, 2021
Binary file removed IEEE_Docs/EMB MWG P&P_P2791.docx
Binary file removed IEEE_Docs/IEEE Entity CLA_BSD-3_081717.pdf
Binary file removed IEEE_Docs/IEEE Individual CLA_BSD-3_081717.pdf
Binary file removed IEEE_Docs/LC_Sponsor_Ballot_Overview_1July2016.ppt
Binary file removed IEEE_Docs/Meetings/2018August29/08292018_Agenda.doc
Binary file removed IEEE_Docs/Meetings/2018August29/08292018_Minutes.doc
Binary file removed IEEE_Docs/Meetings/2019May09/05092019_Agenda.doc
Binary file removed IEEE_Docs/Meetings/2019May09/05092019_Minutes.doc
Binary file removed IEEE_Docs/Meetings/Individual_Roster_Public.xlsx
Binary file removed IEEE_Docs/P2971_D3_Dec2018.doc
Binary file removed IEEE_Docs/P2971_D3_Dec2018_JGK.doc
Binary file removed IEEE_Docs/P2971_D3_Dec2018_JGK_Revised.doc
11 changes: 0 additions & 11 deletions IEEE_Docs/README.md

This file was deleted.

337 changes: 0 additions & 337 deletions IEEE_Docs/standard.md

This file was deleted.

32 changes: 16 additions & 16 deletions README.md
@@ -1,7 +1,7 @@
BioCompute
==========

-This version: [draft-1.4.0](https://github.com/biocompute-objects/BCO_Specification/tree/dev)
+This version: [1.4.0](https://github.com/biocompute-objects/BCO_Specification/tree/1.4.0)

Previous version: [v1.3.1](https://github.com/biocompute-objects/BCO_Specification/releases/tag/1.3.1)

@@ -33,27 +33,27 @@ A functional example of a BCO with associated input and output files, and includ

## User Guide

-The [BioCompute Objects user guide](/docs/user_guide.md) provides an introduction to implementing/writing a BCO for a pipeline and/or a workflow, and is taken from the [BioCompute Objects Specification Document](/IEEE_Docs/standard.md).
+The [BioCompute Objects user guide](/content/user_guide.md) provides an introduction to implementing/writing a BCO for a pipeline and/or a workflow, and is taken from the [BioCompute Objects Specification Document](/IEEE_Docs/standard.md).

### Repository

Note that unless you are viewing a [release](https://github.com/biocompute-objects/BCO_Specification/releases) this is a draft subject to change.

Table of content:

-* [BioCompute Object (BCO) User Guide](/docs/user_guide.md)
-* [Introduction to BioCompute Objects](/docs/introduction.md)
-* [BCO domains](/docs/bco-domains.md)
-* [Top level fields](/docs/top-level.md)
-* [Provenance domain](/docs/provenance-domain.md)
-* [Usability domain](/docs/usability-domain.md)
-* [FHIR extension](/docs/extension-fhir.md)
-* [SCM extension](/docs/extension-scm.md)
-* [Description domain](/docs/description-domain.md)
-* [Execution domain](/docs/execution-domain.md)
-* [Parametric domain](/docs/parametric-domain.md)
-* [Input and Output domain](/docs/io-domain.md)
-* [Error domain](/docs/error-domain.md)
+* [BioCompute Object (BCO) User Guide](/content/user_guide.md)
+* [Introduction to BioCompute Objects](/content/introduction.md)
+* [BCO domains](/content/bco-domains.md)
+* [Top level fields](/content/top-level.md)
+* [Provenance domain](/content/provenance-domain.md)
+* [Usability domain](/content/usability-domain.md)
+* [FHIR extension](/content/extension-fhir.md)
+* [SCM extension](/content/extension-scm.md)
+* [Description domain](/content/description-domain.md)
+* [Execution domain](/content/execution-domain.md)
+* [Parametric domain](/content/parametric-domain.md)
+* [Input and Output domain](/content/io-domain.md)
+* [Error domain](/content/error-domain.md)
* [BCO expanded view example HCV1a.json](HCV1a.json)

## Specification
@@ -90,4 +90,4 @@ As a subscriber to the BCO mailing list, you can post to it by sending a message

To subscribe or unsubscribe, please visit https://hermes.gwu.edu/cgi-bin/wa?A0=BIOCOMPUTELS and click `Subscribe` or `Unsubscribe` on the lower right. You can also unsubscribe from the list at any time by sending an email to [email protected], in which the body says: `unsubscribe biocomputels`

-Please also see our [OSF page](https://osf.io/h59uh/) or our [main page](https://biocomputeobject.org/)
+This repository is in support of [2791-2020](https://standards.ieee.org/standard/2791-2020.html) - IEEE Approved Draft Standard for Bioinformatics Computations and Analyses Generated by High-Throughput Sequencing (HTS) to Facilitate Communication. Please also see our [OSF page](https://osf.io/h59uh/) or our [main page](https://biocomputeobject.org/)
3 changes: 3 additions & 0 deletions VERSION.py
@@ -0,0 +1,3 @@
VERSION_MAJOR = "1.4"
VERSION_MINOR = "0"
VERSION = VERSION_MAJOR + ('.' + VERSION_MINOR if VERSION_MINOR else '')
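Review note: the `VERSION` expression supplies the `.` separator itself, so the minor component should not carry its own leading dot, or the result contains a doubled dot. A minimal sketch of the same composition follows. The helper name `compose_version` is hypothetical and not part of this repository; it tolerates either form of the minor component.

```python
def compose_version(major, minor=""):
    """Join a major and an optional minor component with exactly one dot.

    Hypothetical helper for illustration; VERSION.py itself uses module-level
    constants rather than a function.
    """
    # Drop a leading dot if the caller included one, so it is not doubled.
    minor = minor.lstrip(".")
    return major + ("." + minor if minor else "")


print(compose_version("1.4", "0"))   # minor given without a dot
print(compose_version("1.4", ".0"))  # minor given with a leading dot
print(compose_version("1.4"))        # no minor component
```

The first two calls yield the same string, and the third returns the major component unchanged.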
71 changes: 71 additions & 0 deletions config.toml
@@ -0,0 +1,71 @@
baseURL = "/"
languageCode = "en-us"
title = "BioCompute Object Documentation"
theme = "hugo-biocompute"

enableEmoji = true
hasCJKLanguage = true

pygmentsstyle = "github"

[markup]
[markup.goldmark]
[markup.goldmark.renderer]
unsafe = true

[params]
copyright = "© 2018 - 2020 BioCompute. All rights reserved."
faviconfile = "images/favicon.png"
posts_navigation = true
uselatex = false
highlightjs = false
highlightjslanguages = ["r"]
progressively = false
#google_tag_manager = "UA-79911051-3"

contact = "[email protected]"
github = "biocompute-objects"
twitter = "BioComputeObj"
# linkedin = "example"

[[menu.primary]]
identifier = "home"
name = "Documentation Home"
url = "/"
weight = 1

[[menu.primary]]
identifier = "about"
name = "About"
url = "/about"
weight = 2

[[menu.primary]]
identifier = "user_guide"
name = "User Guide"
url = "/user_guide"
weight = 3

[[menu.primary]]
identifier = "best_practices"
name = "Best Practices"
url = "/best_practices"
weight = 4

[[menu.primary]]
identifier = "sop"
name = "Curation SOP"
url = "/sop"
weight = 5

[[menu.primary]]
identifier = "events"
name = "News & Events"
url = "/events"
weight = 6

[[menu.primary]]
identifier = "biocomputeobject.org"
name = "BioCompute Portal"
url = "http://portal.biochemistry.gwu.edu/"
weight = 7
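Review note: the `weight` values above control where each entry appears in the navigation bar. The sketch below assumes Hugo's usual behavior of ordering menu entries by ascending weight. It uses an abbreviated, deliberately shuffled copy of the `[[menu.primary]]` entries, and the function name `display_order` is hypothetical.

```python
# Abbreviated copy of the [[menu.primary]] entries above, in shuffled order.
menu_primary = [
    {"identifier": "sop", "url": "/sop", "weight": 5},
    {"identifier": "home", "url": "/", "weight": 1},
    {"identifier": "user_guide", "url": "/user_guide", "weight": 3},
    {"identifier": "about", "url": "/about", "weight": 2},
    {"identifier": "best_practices", "url": "/best_practices", "weight": 4},
]


def display_order(entries):
    """Return identifiers sorted by ascending weight (lower weight shows first)."""
    return [entry["identifier"] for entry in sorted(entries, key=lambda e: e["weight"])]


print(display_order(menu_primary))
```

Under that assumption, the order of the entries in the file does not matter; only the weights do.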
52 changes: 52 additions & 0 deletions content/_index.md
@@ -0,0 +1,52 @@
---
title: "Home"
menu: "main"
---

<script>
((window.gitter = {}).chat = {}).options = {
room: 'biocompute-objects/BCO_Specification'
};
</script>
<script src="https://sidecar.gitter.im/dist/sidecar.v1.js" async defer></script>

<div class="col-lg-6 offset-lg-3 text-center">
<img src="/images/logo.about.png" class="img-fluid mx-auto d-block" width="75%" alt="BioCompute Logo">
</div>

<br>

### The BioCompute Standard

Because of the many different ways to organize data, a major goal of the BioCompute project is to build and maintain a formal standard through recognized, accredited standards-setting organizations like the Institute of Electrical and Electronics Engineers (IEEE) and the International Organization for Standardization (ISO). A formal, consensus-based standard builds predictability and even more stability into the way in which bioinformatic methods are communicated.

The standard, officially known as 2791-2020, has two parts: the standards document and the schema, which is maintained in an open source repository:

- **The current version of the standard can be found [here](https://standards.ieee.org/standard/2791-2020.html)**.
- **The schema can be found [here](https://opensource.ieee.org/2791-object/ieee-2791-schema)**.

Since the base BioCompute schema is maintained as an open source repository, it can be cloned and integrated into an organization in unique ways, which allows organizations to build on this schema to create dependent standards for specific applications. This is similar to the different versions of WiFi based on usage, such as the 802.11a standard for fast speed but high cost and shorter range, or the 802.11b standard for slower top speed but lower cost, all of which are built on the 802.11 base standard. The schema can also be extended, for example to handle proprietary, internal content, while still being compatible with the base standard. The open source schema also enables individuals or organizations to suggest changes to be incorporated into future versions of the standard.

### Citation
This standard was originally prepared by [The BioCompute Object working group](/BCO_Spec_V1.2.md#biocompute-object-consortium-members-bcoc) during preparation for the [2017 HTS Computational Standards for Regulatory Sciences Workshop](https://hive.biochemistry.gwu.edu/htscsrs/workshop_2017).

To reference the BCO standards, please use the following
citation inclusive of the DOI:

Simonyan, V., Goecks, J., & Mazumder, R. (2017). ***Biocompute Objects — A Step towards Evaluation and Validation of Biomedical Scientific Computations.*** PDA Journal of Pharmaceutical Science and Technology, 71(2), 136–146. doi: [10.5731/pdajpst.2016.006734](http://doi.org/10.5731/pdajpst.2016.006734)

## Support, Community and Contributing

To suggest changes to [this repository](#Repository) we welcome contributions as a [pull request](https://github.com/biocompute-objects/BCO_Specification/pulls) or [issue](https://github.com/biocompute-objects/BCO_Specification/issues) submission.

BCO_Specification is licensed under the [BSD 3-Clause "New" or "Revised" License](./LICENSE)

>A permissive license similar to the BSD 2-Clause License, but with a 3rd clause that prohibits others from using the name of the project or its contributors to promote derived products without written consent.

## Mailing List

As a subscriber to the BCO mailing list, you can post to it by sending a message to [email protected] (using the email address that is subscribed). This list is semi-automated and will send your message for review.

To subscribe or unsubscribe, please visit https://hermes.gwu.edu/cgi-bin/wa?A0=BIOCOMPUTELS and click `Subscribe` or `Unsubscribe` on the lower right. You can also unsubscribe from the list at any time by sending an email to [email protected], in which the body says: `unsubscribe biocomputels`

This repository is in support of [2791-2020](https://standards.ieee.org/standard/2791-2020.html) - IEEE Approved Draft Standard for Bioinformatics Computations and Analyses Generated by High-Throughput Sequencing (HTS) to Facilitate Communication. Please also see our [OSF page](https://osf.io/h59uh/) or our [main page](https://biocomputeobject.org/)
73 changes: 73 additions & 0 deletions content/about.md
@@ -0,0 +1,73 @@
---
title: "About"
menu: "main"
---

<div class="col-lg-6 offset-lg-3 text-center">
<img src="/images/logo.about.png" class="img-fluid mx-auto d-block" width="75%" alt="BioCompute Logo">
</div>

<br>

### What is BioCompute?

Tremendous insights can be found in genome data, and many of these insights are being used to drive personalized medicine. But the hundreds of millions of reads that come from a gene sequencer represent small, nearly random fragments of the genome that's being sequenced, and there are countless ways in which that data can be transformed to yield insights into cancer, ancestry, microbiome dynamics, metagenomics, and many other areas of interest.

Because there are so many different platforms and so many different scripts and tools to analyze genome data, there is a great need to standardize the way in which these steps are communicated. The more analysis steps and the more complicated a pipeline, the greater the need for a standardized mechanism of communication. The BioCompute standard documents an analysis so that it is transparent and reproducible.

<div class="col-lg-10 offset-lg-1 text-center">
<img src="/images/about.3.png" class="img-fluid mx-auto d-block" alt="">
</div>

<br>

A BioCompute Object (BCO) is an instance of the BioCompute standard, and is a computational record of a bioinformatics pipeline. A BCO is not an analysis, but is a record of which analyses were executed and in exactly which ways. In this way, a BCO acts as an interface for existing standards. A BCO contains all of the necessary information to repeat an entire pipeline from FASTQ to result, and includes additional metadata to identify provenance and usage.

### WiFi Analogy

The [802.11 standard](https://en.wikipedia.org/wiki/IEEE_802.11) (more commonly called "WiFi") is a way of standardizing communication between vastly different products on a wireless network. If a product manufacturer wants a product to be able to communicate on a wireless internet network, they can configure the device to use the WiFi standard and it will be able to communicate with most commercial routers, regardless of whether the product is a Mac, a PC, a cell phone, or a smart toaster.

<div class="col-lg-8 offset-lg-2 text-center">
<img src="/images/about.4.png" class="img-fluid mx-auto d-block" alt="">
</div>

<br>

BioCompute fills a similar need. BioCompute is not an automation or a new programming language; it is a way of collecting and communicating information between two entities. Rather than a laptop and a router, it may be between a pharmaceutical company and the FDA, or between two clinicians, or between a clinician and a researcher. In much the same way that WiFi does not standardize the data that's being transmitted -- allowing you to use Apple's FaceTime, Microsoft's Internet Explorer, or your favorite cell phone app -- BioCompute does not standardize the platforms or tools that are used for genome analysis. You continue to use your favorite platforms and tools, whether it's [HIVE](https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main), [Galaxy](https://galaxyproject.org/), [Seven Bridges](https://www.sevenbridges.com/), [DNAnexus](https://www.dnanexus.com/), or others. Also like WiFi, BioCompute can be layered with other privacy or security protocols depending on usage, so clinical trial data can be secured and HIPAA-compliant, while government-funded data sets shared between researchers can be completely open access.

Because BioCompute acts like an envelope for an entire analysis pipeline, it is compatible with other existing standards, including [FHIR Genomics](https://www.hl7.org/fhir/genomics.html) and [GA4GH](https://www.ga4gh.org/).

### BioCompute Description

BioCompute is written in [JavaScript Object Notation (JSON)](https://json.org/example.html), which is simply a set of key:value pairs (meaning that raw files can be read without any knowledge of programming). Information within the BCO is organized into "domains." The domains within a BCO record are the Provenance, Usability, Extension, Description, Execution, Input/Output, and Parametric Domains. For more information on the domains, please see the [BioCompute Schema](https://gitlab.com/IEEE-SA/2791/ieee-2791-schema).
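
As a rough illustration of the key:value layout, the sketch below loads a small JSON record and reports which of the domains listed above are absent. The key names (`provenance_domain`, `io_domain`, and so on) are assumptions about how the domains are spelled in the schema, and the record content is made up for the example; consult the schema for the authoritative field names.

```python
import json

# Assumed top-level keys, one per domain named in the paragraph above.
EXPECTED_DOMAINS = [
    "provenance_domain",
    "usability_domain",
    "extension_domain",
    "description_domain",
    "execution_domain",
    "io_domain",
    "parametric_domain",
]


def missing_domains(bco):
    """Return the expected domain keys that the record does not contain."""
    return [name for name in EXPECTED_DOMAINS if name not in bco]


# Illustrative, incomplete record; not a valid BCO.
record = json.loads(
    '{"provenance_domain": {"name": "Example pipeline"},'
    ' "usability_domain": ["Illustrative description of the analysis"]}'
)

print(missing_domains(record))
```

This only checks that the keys are present. Validating a real BCO would mean checking it against the published schema.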

BioCompute was built through a collaboration between The George Washington University and the FDA to improve communication of bioinformatics pipelines, and has since been expanded and refined through the contributions of hundreds of participants from throughout the public and private sectors. While we welcome interest and membership from anyone, most users will fall into one of three categories:

- [Research Community](/research) <br>
The BioCompute standard can help substantially improve replicability, making it possible to repeat a pipeline on a different sample with high fidelity and high confidence.

- [Clinical Community](/clinical) <br>
As BioCompute Objects become tested and validated, they can be applied in the clinic to identify risk factors, flag pharmacogenetic information, and much more.

- [Pharma, Biotech and Regulatory Pipeline](/regulatory) <br>
Protracted communications with the FDA can extend the review process by months. A standardized method of communicating HTS data may help repeat results more quickly and without the need for additional communication.

Research, clinical, and regulatory groups are key drivers of personalized medicine that is based on next generation sequencing, but there are barriers between these groups. BioCompute reduces these hurdles and brings transparency to the workflow, making it clearer what was done and clearly delineating expectations for data sharing. The BioCompute specification can be layered with other privacy and security protocols to guard sensitive data, or be made open source depending on the needs of the user.

The BioCompute project has generated two publications, three workshops, FDA funding, contributions from over 300 participants, and FDA submissions. The project has worked with individuals from NIH, Harvard, several biotech and pharma companies, EMBL-EBI, Galaxy Project, and many more, and can be integrated with any existing standard for HTS data. The project is expected to be both an IEEE and ISO recognized standard within 8-10 months.

More information about the current BioCompute standard can be found on the [Open Science Framework website](https://osf.io/h59uh/) (where the standard is developed and maintained), the [HIVE](https://hive.biochemistry.gwu.edu/htscsrs/biocompute) website, and the [Research Objects discussion of BioCompute](http://www.researchobject.org/2017-11-27-biocompute-objects/).

<div class="col-lg-12 text-center">
<img src="/images/about.2.png" class="img-fluid mx-auto d-block" alt="">
</div>

<br>

<div class="alert alert-primary" role="alert">

**Milestones in the BioCompute Program**

The major milestones of the BioCompute Partnership and future goals are paving the way for a consensus-driven, widely adopted standard. The FDA's Genomics Working Group (GWG) originally articulated the challenges of communicating genomic analysis pipelines in a regulatory context in 2013. Since then, the project has accumulated tremendous momentum, a testament to the GWG's efforts in describing communication challenges. More recently, the second BioCompute publication has been published, the 4th Workshop is scheduled, and the next major goal is the formal launch of the BioCompute Public Private Partnership. The [Executive Committee](https://www.biocomputeobject.org/leadership.html) will formalize the future roadmap beyond these goals.

</div>