tldr: If you want to quantify a 10x Multiome experiment (just the gene expression / GEX part), you MUST provide the proper onlist to kb count.
This is not a bug per se, but rather an oversight for an edge case. The 10x Multiome GEX+ATAC chemistry (at least for the GEX part) uses the same barcode setup (16 bp barcode + 12 bp UMI) as 10xv3.
BUT
If you run the kb count workflow with -x 10xv3, you may (at least in my case) get far too few cells. I was very miffed until I realized that 10x uses a different barcode whitelist for the Multiome kit than for 10xv3. If you give kb count the 10x Multiome barcode whitelist as -w gex_737K-arc-v1.txt, you get the proper result.
Versions
I noticed this error on 0.28.2 and confirmed it on 0.29.1 (though I did use an index built with 0.28.2, which I don't think matters).
Retrieve data
# get a gene 10x multiome fastq pair
fasterq-dump SRR29226057; pigz -p 8 SRR29226057*
wget https://teichlab.github.io/scg_lib_structs/data/10X-Genomics/gex_737K-arc-v1.txt.gz
gunzip gex_737K-arc-v1.txt.gz
Run kb count the "default" way
This gives you about 679 cells after sc.pp.filter_cells(adata, min_genes = 300)
# not the exact command, but no one wants to see the full paths
kb count --workflow nac --sum total -g t2g.txt -t 12 -x 10xv3 -i index.idx -c1 t2c.cdna.txt -c2 t2c.unprocessed.txt -o SRR29226057_kb_provided_whitelist --h5ad SRR29226057_1.fastq.gz SRR29226057_2.fastq.gz
# now in python
import scanpy as sc
adata = sc.read_h5ad('SRR29226057_kb_provided_whitelist/counts_unfiltered/adata.h5ad')
sc.pp.calculate_qc_metrics(adata, percent_top=None, log1p=False, inplace=True)
sc.pp.filter_cells(adata, min_genes=300)
adata
Give kb count the correct onlist
Same command as above, but adding the -w flag (-w gex_737K-arc-v1.txt). Now you get 12582 cells after the same filter.
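If you are not sure which onlist matches your library, a quick sanity check before running kb count is to see what fraction of raw barcodes appear in each candidate list. The following is only a rough sketch under some assumptions: the barcode is the first 16 bases of the first FASTQ of the pair, matching is exact (no Hamming-distance correction like bustools does), and the helper names `load_onlist`, `read_barcodes`, and `onlist_hit_rate` are made up for illustration and are not part of kb-python.

```python
import gzip
from itertools import islice


def load_onlist(path):
    """Load a barcode onlist (one barcode per line, optionally gzipped) into a set."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as fh:
        return {line.strip() for line in fh if line.strip()}


def read_barcodes(fastq_path, barcode_len=16, max_reads=100_000):
    """Yield the first `barcode_len` bases of up to `max_reads` reads.

    Assumes a standard 4-line-per-record FASTQ with the cell barcode
    at the start of the read (an assumption, check your library layout).
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as fh:
        # Sequence lines sit at indices 1, 5, 9, ... of the file.
        for seq in islice(fh, 1, 4 * max_reads, 4):
            yield seq.strip()[:barcode_len]


def onlist_hit_rate(barcodes, onlist):
    """Fraction of barcodes that exactly match an entry in the onlist."""
    barcodes = list(barcodes)
    if not barcodes:
        return 0.0
    return sum(bc in onlist for bc in barcodes) / len(barcodes)


# Example usage (file names taken from the steps above):
# arc = load_onlist("gex_737K-arc-v1.txt")
# rate = onlist_hit_rate(read_barcodes("SRR29226057_1.fastq.gz"), arc)
# print(f"exact-match rate against the Multiome onlist: {rate:.1%}")
```

Running the same check against whichever 10xv3 onlist you have on disk should make a mismatch obvious: the list that fits the library should give a much higher hit rate than the one that does not.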
davemcg changed the title from "WARNING: 10X Multiome (GEX + ARC) should not use bustools 10xv3 pre-built barcode whitelist" to "WARNING: 10X Multiome (GEX + ARC) should not use bustools 10xv3 internal barcode whitelist" on Jan 30, 2025