ValueError: Bin edges must be unique: #18

Closed
wangpenhok opened this issue May 8, 2024 · 8 comments

Comments

@wangpenhok

Run:
SEVtras.ESAI_calculator(adata_ev_path="./output/sEV_SEVtras_sample10.h5ad",
adata_cell_path='./scanpy_output/adata_gex_10sample.h5ad', out_path='./output/',
Xraw=False, OBSsample='sampleName', OBScelltype='celltype')

Error:
/home/data/wangp_sc/.conda/envs/SEVtras_env/lib/python3.8/site-packages/anndata/_core/merge.py:942: UserWarning: Only some AnnData objects have .raw attribute, not concatenating .raw attributes.
warn(
/home/data/wangp_sc/.conda/envs/SEVtras_env/lib/python3.8/site-packages/anndata/_core/anndata.py:1785: FutureWarning: X.dtype being converted to np.float32 from float64. In the next version of anndata (0.9) conversion will not be automatic. Pass dtype explicitly to avoid this warning. Pass AnnData(X, dtype=X.dtype, ...) to get the future behavour.
[AnnData(sparse.csr_matrix(a.shape), obs=a.obs) for a in all_adatas],
/home/data/wangp_sc/.conda/envs/SEVtras_env/lib/python3.8/site-packages/anndata/_core/anndata.py:1785: FutureWarning: X.dtype being converted to np.float32 from float64. In the next version of anndata (0.9) conversion will not be automatic. Pass dtype explicitly to avoid this warning. Pass AnnData(X, dtype=X.dtype, ...) to get the future behavour.
[AnnData(sparse.csr_matrix(a.shape), obs=a.obs) for a in all_adatas],
/home/data/wangp_sc/.conda/envs/SEVtras_env/lib/python3.8/site-packages/scanpy/preprocessing/_normalization.py:197: UserWarning: Some cells have zero counts
warn(UserWarning('Some cells have zero counts'))
/home/data/wangp_sc/.conda/envs/SEVtras_env/lib/python3.8/site-packages/scanpy/preprocessing/_simple.py:352: RuntimeWarning: invalid value encountered in log1p
np.log1p(X, out=X)

ValueError Traceback (most recent call last)
Cell In[24], line 1
----> 1 SEVtras.ESAI_calculator(adata_ev_path="./output/sEV_SEVtras_sample10.h5ad",
2 adata_cell_path='./scanpy_output/adata_gex_10sample.h5ad', out_path='./output/',
3 Xraw=False, OBSsample='sampleName', OBScelltype='celltype')

File ~/.conda/envs/SEVtras_env/lib/python3.8/site-packages/SEVtras/main.py:188, in ESAI_calculator(adata_ev_path, adata_cell_path, out_path, OBSsample, OBScelltype, OBSev, OBSMpca, cellN, Xraw, normalW, plot_cmp, save_plot_prefix, OBSMumap, size)
186 adata_cell = read_adata(adata_cell_path, get_only=False)
187 from .functional import deconvolver, ESAI_celltype, plot_SEVumap, plot_ESAIumap
--> 188 celltype_e_number, adata_evS, adata_com = deconvolver(adata_ev, adata_cell, OBSsample, OBScelltype, OBSev, OBSMpca, cellN, Xraw, normalW)
189 ##ESAI for sample
190 sample_ESAI = (adata_com[adata_com.obs[OBScelltype]==OBSev,].obs[OBSsample].value_counts() / adata_com[adata_com.obs[OBScelltype]!=OBSev,].obs[OBSsample].value_counts()).fillna(0)

File ~/.conda/envs/SEVtras_env/lib/python3.8/site-packages/SEVtras/functional.py:114, in deconvolver(adata_ev, adata_cell, OBSsample, OBScelltype, OBSev, OBSMpca, cellN, Xraw, normalW)
112 def deconvolver(adata_ev, adata_cell, OBSsample='batch', OBScelltype='celltype', OBSev='sEV', OBSMpca='X_pca', cellN=10, Xraw = True, normalW=True):
--> 114 adata_combined = preprocess_source(adata_ev, adata_cell, OBScelltype=OBScelltype, OBSev=OBSev, Xraw = Xraw)
115 gsea_pval_dat = source_biogenesis(adata_cell, OBScelltype=OBScelltype, Xraw = Xraw, normalW=normalW)
116 near_neighbor_dat = near_neighbor(adata_combined, OBSsample=OBSsample, OBSev=OBSev, OBScelltype=OBScelltype, OBSMpca=OBSMpca, cellN=cellN)

File ~/.conda/envs/SEVtras_env/lib/python3.8/site-packages/SEVtras/functional.py:88, in preprocess_source(adata_ev, adata_cell, OBScelltype, OBSev, Xraw)
86 sc.pp.normalize_total(adata_combined, target_sum=1e4)
87 sc.pp.log1p(adata_combined)
---> 88 sc.pp.highly_variable_genes(adata_combined, min_mean=0.0125, max_mean=3, min_disp=0.5)
89 # sc.pl.highly_variable_genes(Normal_combined)
90 adata_combined = adata_combined[:, adata_combined.var.highly_variable]#highly_variable

File ~/.conda/envs/SEVtras_env/lib/python3.8/site-packages/scanpy/preprocessing/_highly_variable_genes.py:440, in highly_variable_genes(adata, layer, n_top_genes, min_disp, max_disp, min_mean, max_mean, span, n_bins, flavor, subset, inplace, batch_key, check_values)
428 return _highly_variable_genes_seurat_v3(
429 adata,
430 layer=layer,
(...)
436 inplace=inplace,
437 )
439 if batch_key is None:
--> 440 df = _highly_variable_genes_single_batch(
441 adata,
442 layer=layer,
443 min_disp=min_disp,
444 max_disp=max_disp,
445 min_mean=min_mean,
446 max_mean=max_mean,
447 n_top_genes=n_top_genes,
448 n_bins=n_bins,
449 flavor=flavor,
450 )
451 else:
452 sanitize_anndata(adata)

File ~/.conda/envs/SEVtras_env/lib/python3.8/site-packages/scanpy/preprocessing/_highly_variable_genes.py:215, in _highly_variable_genes_single_batch(adata, layer, min_disp, max_disp, min_mean, max_mean, n_top_genes, n_bins, flavor)
213 df['dispersions'] = dispersion
214 if flavor == 'seurat':
--> 215 df['mean_bin'] = pd.cut(df['means'], bins=n_bins)
216 disp_grouped = df.groupby('mean_bin')['dispersions']
217 disp_mean_bin = disp_grouped.mean()

File ~/.conda/envs/SEVtras_env/lib/python3.8/site-packages/pandas/core/reshape/tile.py:293, in cut(x, bins, right, labels, retbins, precision, include_lowest, duplicates, ordered)
290 if (np.diff(bins.astype("float64")) < 0).any():
291 raise ValueError("bins must increase monotonically.")
--> 293 fac, bins = _bins_to_cuts(
294 x,
295 bins,
296 right=right,
297 labels=labels,
298 precision=precision,
299 include_lowest=include_lowest,
300 dtype=dtype,
301 duplicates=duplicates,
302 ordered=ordered,
303 )
305 return _postprocess_for_cut(fac, bins, retbins, dtype, original)

File ~/.conda/envs/SEVtras_env/lib/python3.8/site-packages/pandas/core/reshape/tile.py:420, in _bins_to_cuts(x, bins, right, labels, precision, include_lowest, dtype, duplicates, ordered)
418 if len(unique_bins) < len(bins) and len(bins) != 2:
419 if duplicates == "raise":
--> 420 raise ValueError(
421 f"Bin edges must be unique: {repr(bins)}.\n"
422 f"You can drop duplicate edges by setting the 'duplicates' kwarg"
423 )
424 bins = unique_bins
426 side: Literal["left", "right"] = "left" if right else "right"

ValueError: Bin edges must be unique: array([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan]).
You can drop duplicate edges by setting the 'duplicates' kwarg

@wangpenhok
Author

I have no idea why this happened. Could you please take a look at this error and give me a hint? Thank you very much.

@RuiqiaoHe
Member

Please refer to scverse/scanpy#391. There, converting adata.X to a SciPy CSR sparse matrix fixed this error.
Try: adata.X = scipy.sparse.csr_matrix(adata.X)
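
As a minimal sketch of that conversion on a toy matrix (the helper name `to_csr` and the values are illustrative only and are not part of SEVtras or scanpy):

```python
import numpy as np
from scipy import sparse


def to_csr(X):
    """Return X as a SciPy CSR matrix; CSR input is passed through unchanged."""
    if sparse.issparse(X) and X.format == "csr":
        return X
    return sparse.csr_matrix(X)


# Toy stand-in for adata.X (hypothetical values).
X_dense = np.array([[0.0, 2.0, 0.0],
                    [1.0, 0.0, 3.0]])
X_csr = to_csr(X_dense)

print(type(X_csr).__name__, X_csr.shape, X_csr.nnz)
```

On a real object this would be applied as `adata.X = to_csr(adata.X)` before writing the h5ad file. Note that the conversion only changes the storage format; it does not change the values, so it cannot help if the matrix already contains NaN or negative entries.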

@wangpenhok
Author

wangpenhok commented May 9, 2024

I tried this method on both adata_sev and adata_cell, as shown in the following screenshots:

[Screenshot 2024-05-09 15 49 40]

[Screenshot 2024-05-09 15 50 38]

But the same error still occurred:

[Screenshot 2024-05-09 15 52 38]

[Screenshot 2024-05-09 15 52 59]

@RuiqiaoHe
Member

Could you please pass adata_sEV and adata_cell to SEVtras without log normalization? Here, the values in adata.X should be raw, non-negative counts. I want to check whether this has any influence.
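
A quick way to check what kind of values a matrix holds is sketched below. This is only an illustration: `describe_matrix` is a hypothetical helper, the numbers are made up, and a sparse adata.X would first need `.toarray()`. It also shows one way NaN can arise, namely `np.log1p` applied to values below -1 (as in scaled data), which matches the "invalid value encountered in log1p" warning in the traceback; whether that is what happened in this dataset is an assumption.

```python
import numpy as np


def describe_matrix(X):
    """Summarize a dense expression matrix: minimum, negatives, NaN, integer-valued."""
    X = np.asarray(X, dtype=float)
    finite = X[~np.isnan(X)]
    return {
        "min": float(finite.min()) if finite.size else float("nan"),
        "has_negative": bool((finite < 0).any()),
        "has_nan": bool(np.isnan(X).any()),
        "all_integer": bool(np.all(np.mod(finite, 1) == 0)),
    }


# Raw counts: non-negative integers (hypothetical values).
raw = np.array([[0, 3], [2, 5]])
# Scaled data: can contain values below -1 (hypothetical values).
scaled = np.array([[-2.5, 0.3], [1.2, -0.1]])

print(describe_matrix(raw))
print(describe_matrix(scaled))

# log1p is undefined below -1, so the first entry becomes NaN,
# and the NaN propagates into the per-gene (column) mean.
with np.errstate(invalid="ignore"):
    logged = np.log1p(scaled)
gene_means = logged.mean(axis=0)
print(gene_means)
```

If `has_negative` or `has_nan` is true, or `all_integer` is false, the matrix is probably not raw counts.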

@wangpenhok
Author

Sorry, I am not very familiar with scanpy yet. To make sure I understand: I have to go through the standard scanpy pipeline, which requires log normalization of adata_cell, to get the cell types first, and then, in the last step, replace adata_cell.X with a previously saved unnormalized compressed sparse matrix. After that, I pass adata_cell as input to SEVtras.ESAI_calculator?

@wangpenhok
Author

I just did as I described above, and it worked. Thanks for your help!

[Screenshot 2024-05-09 17 09 03]

@kingwzun

kingwzun commented Aug 2, 2024

Hello wangpenhok, could I get your email? I would like to get the code for adata_cell used in ESAI_calculator. ([email protected])

@wangpenhok
Author

wangpenhok commented Aug 8, 2024

> Hello wangpenhok, could I get your email? I would like to get the code for adata_cell used in ESAI_calculator. ([email protected])

I suggest doing the single-cell annotation in Seurat and exporting the result as an h5ad file, following the 13th answer on the SEVtras Troubleshooting page (https://sevtras.readthedocs.io/en/latest/Troubleshooting.html).
