Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Monte carlo tensor type #261

Closed
wants to merge 18 commits into from
Closed

Conversation

mortonjt
Copy link
Contributor

@mortonjt mortonjt commented May 24, 2021

Brief summary of the Pull Request, including any issues it may fix using the GitHub closing syntax:

This pull request is a follow up on this qiime2 forum post
https://forum.qiime2.org/t/thoughts-on-differential-and-tensor-types/18210

Furthermore, there are follow up discussions at
https://github.com/gibsramen/BIRDMAn/issues/26
qiime2/q2-gneiss#75
https://github.com/mortonjt/q2-matchmaker/pull/19

Briefly, this a type that would help unify all q2 plugins utilizing Bayesian inference, since it would enable easily-readable and manipulable output of monte carlo samples. I know of at least a half a dozen existing plugins that could make use of this.

What this PR does : Opens the possibility of exchanging objects specific to Bayesian computation
What this PR doesn't do yet: Provide standards for Bayesian workflows. This was discussed in the q2 forums - I believe its out of the scope of this PR since it isn't clear what standards are enforceable. We could start thinking about doing this for simple examples such as computing differentials between 2 groups as done in Aldex2 / Songbird, where we could force the exact same format. But this quickly breakdowns for more customizable stan models - for instance if we wanted to store information about over-dispersion, batch, matching, confounders, etc. I believe there are too many statistical edge-cases to be accounted for at the moment.

Finally, note the dependency to arviz.

Also, include any co-authors or contributors using the GitHub coauthor tag:

CC @gibsramen @thermokarst @ebolyen

https://help.github.com/articles/creating-a-commit-with-multiple-authors/


Copy link
Contributor Author

@mortonjt mortonjt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a possible typo

@@ -40,7 +42,10 @@ def test_aligned_sequence_semantic_type_registration(self):
self.assertRegisteredSemanticType(AlignedSequence)

def test_differential_semantic_type_registration(self):
self.assertRegisteredSemanticType(AlignedSequence)
self.assertRegisteredSemanticType(Differential)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thermokarst @ebolyen I think this is a typo from a previous commit.

@mortonjt mortonjt changed the title WIP : Monte carlo tensor type Monte carlo tensor type May 24, 2021
@nbokulich
Copy link
Member

Thanks @mortonjt ! IMO these specialist types should be moved to a new types package, maybe q2-types-bayesian-inference. The addition of another dependency with this PR is part of the motivation for this, also to avoid making q2-types too bloated with specialist types.

As we branch out more into specialized plugins and multi-omics functionality, new types packages will need to be created to enable more flexible and coherent organization of these specialist types, also so that users of specific functionality will not need to download a single central q2-types that defines everything. For example, we recently started a new q2-types-genomics package for genomics-focused types. q2-types will remain more focused on generalist types (e.g., feature table types).

In this regard, these new tensor types strike me as the foundation of a new types package, so that the relevant functionality and dependencies can be focused in one place — all relevant plugins can then depend on this new types package.

Does that sound reasonable to you?

@mortonjt
Copy link
Contributor Author

Hi @nbokulich I think that is very reasonable, and honestly preferable.

I guess this brings up additional questions about the higher level org. For instance, I would like to have a q2-gneiss command that takes in as input a MonteCarloTensor. Furthermore, I could see possible extensions of q2-sample-classifier and q2-longitudinal that could benefit from this as well.

Would this q2-bayes-types package belong in the core package, or should we discuss a broader org restructuring?

@nbokulich
Copy link
Member

Hi @mortonjt , I agree, there could totally be good applications of these types in q2-sample-classifier and q2-longitudinal, among others.

Re: core, that's a good question — in general we are moving away from the "core" concept, and eventually there will be no "core", but maybe different "flavors" of Q2 (e.g., for different omics applications). See the last section here for some idea. So I think in the near term it is more likely that plugins that depend on these new types will be dropped from the current core distribution... but the "core" distribution will be history soon in any case.

@mortonjt
Copy link
Contributor Author

mortonjt commented Jun 1, 2021

Hi @nbokulich , yes I agree that this would be the ideal path forward. I'm going to move ahead and close this issue and start building up a custom repo for these types.

@mortonjt mortonjt closed this Jun 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants