negative entries in filtered count matrix? #306

Open
bobermayer opened this issue Nov 9, 2023 · 7 comments

@bobermayer

Hi,

I've been using this amazing tool successfully for more than three years now. Two days ago I updated (git pull && pip install) to the most recent version (0.3.1; commit 3a4dc8), and I'm getting weird results: there are negative entries in the filtered h5 file, and the HTML report looks weird too. This happened in all the datasets I've run, so I tried again with the Brain single-nuc and PBMC single-cell datasets from 10X, like so:

cellbender remove-background --input Brain_10X/5k_mouse_brain_CNIK_3pv3_raw_feature_bc_matrix.h5 --output Brain_10X/Brain_10X.h5 --fpr 0.01 --epochs 150 --cuda --expected-cells 5000 --total-droplets-included 50000

but I'm getting the same result: about 0.1% and 0.3% of the entries in the filtered count matrix are negative (loaded into R using this helper function).
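
A minimal sketch of this kind of check, written in Python with NumPy rather than the R helper mentioned above. The function name `negative_entry_stats` and the toy matrix are hypothetical, and the fraction is computed over nonzero entries, which is an assumption about how the percentages were measured.

```python
import numpy as np


def negative_entry_stats(counts):
    """Return (number of negative entries, fraction of nonzero entries that are negative)."""
    counts = np.asarray(counts)
    nonzero = counts[counts != 0]  # only entries that would be stored in a sparse matrix
    n_negative = int(np.count_nonzero(nonzero < 0))
    fraction = n_negative / nonzero.size if nonzero.size else 0.0
    return n_negative, fraction


# Toy cells-by-genes matrix with one corrupted (negative) entry.
toy_counts = np.array([[0, 3, -2],
                       [5, 0, 1],
                       [0, 0, 4]])
n_neg, frac = negative_entry_stats(toy_counts)
print(n_neg, frac)
```

For a real sparse matrix, the same comparison can be applied to the array of stored values instead of a dense array. A valid count matrix should report zero negatives.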

Here's a plot comparing total counts pre- and post-CellBender for Brain:
[image: mouse_brain_raw_vs_filt]
(Adding up absolute counts results in the post-CellBender totals being much larger than the pre-CellBender ones.)

The HTML report also contains negative entries for n_cellbender or fraction_remaining:
Brain_10X_report.zip

(Worryingly, Seurat doesn't even complain about negative counts and fails with cryptic errors only if you use vars.to.regress or TransferData.)

Am I missing something? This happens with the default mckp estimator but also with map and mean, and with or without PRmu posterior regularization.

thanks for your help!

@jordan841220

jordan841220 commented Nov 10, 2023

Same issue here.
I guess these negative counts caused the error I encountered when using SCTransform in Seurat.

FYI, I'm also using the most recent version (0.3.1)

@sjfleming
Member

Hi @bobermayer and @JordanCTLin, thanks for reporting. Something went wrong in the update to v0.3.1! Please use v0.3.0 for now until I figure out what happened. I am deleting the v0.3.1 release now.

@sjfleming sjfleming self-assigned this Nov 20, 2023
@sjfleming sjfleming added the bug Something isn't working label Nov 20, 2023
@sjfleming sjfleming added this to the v0.3.1 milestone Nov 20, 2023
@RvdKwast

RvdKwast commented Dec 29, 2023

Hi,
When using output from CellBender v0.3.0 run on a Windows system (no HTML report due to a bug), I was getting an error from SCTransform saying that there were negative counts, and then I found this post.

As above, I used the scCustomize package to load the h5 file and create a dual-assay object, then used the CellBender_Feature_Diff() function to compare raw and CellBender counts. The results suggest that in my case only 60 genes had their overall counts changed, 50 of which CellBender turned negative. Ly96, the most extreme case, had 590 counts, which somehow got turned into -15082.
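
The per-gene comparison described here can be sketched independently of scCustomize. The Python/NumPy snippet below is not CellBender_Feature_Diff(); the function name `suspicious_genes`, the gene names, and the toy matrices are hypothetical. It rests on the assumption that background removal only subtracts counts, so a per-gene total after cleaning should be neither negative nor larger than the raw total.

```python
import numpy as np


def suspicious_genes(raw, cleaned, gene_names):
    """Map gene name -> (raw total, cleaned total) for genes whose cleaned total
    is negative or exceeds the raw total. Matrices are cells x genes."""
    # Sum in int64 so the totals themselves cannot overflow a narrow dtype.
    raw_totals = np.asarray(raw, dtype=np.int64).sum(axis=0)
    cleaned_totals = np.asarray(cleaned, dtype=np.int64).sum(axis=0)
    flagged = {}
    for name, r, c in zip(gene_names, raw_totals, cleaned_totals):
        if c < 0 or c > r:
            flagged[name] = (int(r), int(c))
    return flagged


# Toy data: GeneB has a corrupted negative entry after "cleaning".
raw = np.array([[2, 5, 0],
                [1, 4, 3]])
cleaned = np.array([[2, -9, 0],
                    [1, 4, 2]])
flagged = suspicious_genes(raw, cleaned, ["GeneA", "GeneB", "GeneC"])
print(flagged)
```

Genes whose counts were legitimately reduced (GeneC in the toy data) are not flagged; only totals that are impossible under the stated assumption are reported.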

Could it be some issue with the reading of the data or with the new Seurat v5 object structure, or is it most likely a CellBender issue?

I have put a lot of time into optimizing CellBender settings for more than 20 samples, 8 of which were run with v0.2.0. So if it is a CellBender issue, I guess I have to second-guess all of them, or can I at least be sure that the v0.2 version is safe?

230927_yng_Heart_230810_CBfinal.log
230927_yng_Heart_230810_CBfinal.pdf

@onurcanbektas

onurcanbektas commented Jan 5, 2024

Same issue in my case with version 0.3.1 using the conda environment.

@nimne

nimne commented Jan 5, 2024

We also see this issue in both 0.3.0 and 0.3.1 where the outputs have negative counts. The same samples don't have apparent issues in v0.2.

@IrinaVKuznetsova

IrinaVKuznetsova commented Feb 21, 2024

I am having the same issue: negative counts in the raw matrix, NAs in Seurat, and the same behaviour mentioned by @bobermayer ("worryingly, Seurat doesn't even complain about negative counts..."). In my case it shows up in the normalisation function.

@sjfleming
Member

Apologies that it has taken me a long time to come back to this very important issue. I am working on it now.

I plan to release a v0.3.2 where this is fixed. v0.3.1 has been redacted and that version number will not be used.

sjfleming added a commit that referenced this issue Apr 11, 2024
* issue #348 lxml_html_clean dependency

* Fix #306: noise count integer overflow

* Introduce --force-use-checkpoint for redoing v0.3.1 runs

* Add a warning to the README about v0.3.1
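
The commit title points to an integer overflow in the noise counts. The thread does not say which integer type was involved, so the following Python/NumPy sketch uses `int16` purely as an assumed illustration of how a fixed-width integer can silently wrap around to a negative value. It is not CellBender's code.

```python
import numpy as np

# Two hypothetical noise-count contributions stored in a narrow integer type.
noise_a = np.array([30000], dtype=np.int16)
noise_b = np.array([10000], dtype=np.int16)

# Array arithmetic keeps the int16 dtype, so 40000 does not fit and wraps around.
wrapped = noise_a + noise_b

# Widening before the addition avoids the wraparound.
safe = noise_a.astype(np.int64) + noise_b.astype(np.int64)

print(wrapped[0], safe[0])  # -25536 40000
```

A wrapped value like this, subtracted from or written into a count matrix, would be one way for large negative entries to appear in otherwise valid output.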
@github-project-automation github-project-automation bot moved this to In Progress in remove-background Aug 26, 2024