norm_functional_contrib - Error or misunderstanding #206

Closed
ragavishn opened this issue Sep 15, 2021 · 6 comments

Comments

@ragavishn

I have run the PICRUSt2 pipeline on my samples and noticed that the sum of 'norm_taxon_func_contrib' for a given function is not equal to 1. Am I missing something here? Can you please clarify how this is calculated? Our understanding is:

'norm_taxon_func_contrib' = taxon_fn_abun / (sum of taxon_fn_abun for the given fn)

Thank you.
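
For readers following along, the formula above can be sketched in pandas as below. This is only an illustration under assumptions, not the PICRUSt2 implementation: the column names `sample`, `function`, `taxon` and `taxon_function_abun`, the helper `add_norm_contrib` and the toy values are all hypothetical. The normalization is done within each sample and function pair.

```python
import pandas as pd


def add_norm_contrib(df, abun_col="taxon_function_abun"):
    """Return a copy of df with a normalized contribution column.

    Hypothetical helper: divides each taxon's function abundance by the
    total abundance of that function within the same sample.
    """
    # Per-row total for the (sample, function) group the row belongs to.
    totals = df.groupby(["sample", "function"])[abun_col].transform("sum")
    out = df.copy()
    out["norm_taxon_function_contrib"] = out[abun_col] / totals
    return out


# Toy data (made up for illustration only).
toy = pd.DataFrame(
    {
        "sample": ["s1", "s1", "s1", "s2", "s2"],
        "function": ["EC:1.1.1.1", "EC:1.1.1.1", "EC:1.1.1.100",
                     "EC:1.1.1.1", "EC:1.1.1.1"],
        "taxon": ["t1", "t2", "t1", "t1", "t3"],
        "taxon_function_abun": [2.0, 6.0, 5.0, 1.0, 3.0],
    }
)

result = add_norm_contrib(toy)
print(result)
```

With this definition, the new column sums to 1 within every sample and function pair, which is the property being questioned in this issue.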

@gavinmdouglas
Member

Hey @ragavishn,

Yes that should be what that column corresponds to for each sample and function. Would you mind sharing the table where you are noticing this problem?

If you don't mind sharing it, you can email it to me here: [email address given as an image]

Thanks,

Gavin

@ragavishn
Author

Hi Gavin,

Thank you so much for taking the time to look into this. Please find below the values I get when I sum the norm_func_contrib column by function. I have also emailed you my EC_metagenome_out/pred_metagenome_contrib.tsv file. Please let me know if I have missed something. Thank you.

function norm_taxon_function_contrib
EC:1.1.1.1 1.0007563418716223
EC:1.1.1.100 5.874426127552522
EC:1.1.1.103 0.7118477678873769
EC:1.1.1.108 0.0001659706257272136
EC:1.1.1.11 0.8416089934812513
EC:1.1.1.122 0.00030970298766573297
EC:1.1.1.125 1.1444731716377892
EC:1.1.1.130 0.0184457626485505

@gavinmdouglas
Member

Thanks for sending these files. That column is supposed to sum to one per sample / function, so that’s definitely concerning. Would you mind also sending the original input files and the exact command you used to generate the stratified table (e.g. through a Google Drive or Dropbox link)? I’d just like to confirm that I can reproduce the same problem from the full output file and double-check that the output file was written correctly.

Thanks,

Gavin
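
A quick way to check the "sums to one per sample / function" property on a contrib table is sketched below. It is a rough check under assumptions: the column names `sample`, `function` and `norm_taxon_function_contrib`, the helper `find_bad_sums`, the tolerance and the toy values are hypothetical and not taken from the PICRUSt2 code.

```python
import pandas as pd


def find_bad_sums(contrib, tol=1e-6):
    """Return the (sample, function) groups whose normalized
    contributions do not sum to 1 within the given tolerance."""
    sums = contrib.groupby(["sample", "function"])[
        "norm_taxon_function_contrib"
    ].sum()
    return sums[(sums - 1.0).abs() > tol]


# Toy table (made up): the first group sums to 1, the second to 0.5.
toy = pd.DataFrame(
    {
        "sample": ["s1", "s1", "s2", "s2"],
        "function": ["EC:1.1.1.1", "EC:1.1.1.1",
                     "EC:1.1.1.100", "EC:1.1.1.100"],
        "norm_taxon_function_contrib": [0.25, 0.75, 0.2, 0.3],
    }
)

bad = find_bad_sums(toy)
print(bad)

# On a real table (path as mentioned in this thread; columns assumed):
# contrib = pd.read_csv("EC_metagenome_out/pred_metagenome_contrib.tsv", sep="\t")
# print(find_bad_sums(contrib))
```

Grouping by both sample and function matters here: if the table holds several samples, summing by function alone adds up one total per sample, so a correct table would give the number of samples containing that function rather than 1.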

@ragavishn
Author

I will try to upload the original input file. The last column (norm_func_contrib) alone was not calculated past 100 rows. I am afraid it's more of a pandas.groupby issue, as it works for small datasets and your test data.
Thank you again for your time and great work!

@gavinmdouglas
Member

My apologies @ragavishn, I just realized that I had totally forgotten about this... I thought it had been resolved, but that seems not to be the case. I'm looking into it now and, unfortunately, I believe this is a bug, at least for larger tables.

@gavinmdouglas
Member

Thanks again - this is now fixed in the latest version (v2.5.0): https://github.com/picrust/picrust2/releases, which should soon be available through conda.
