-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
number of genes in tl.rank_velocity_genes result #858
Comments
@natallah would you please provide the full pipeline you are running including the traceback and package versions are asked for in the issue template? Thanks! |
scvelo version 0.2.4 was used. I do not have the traceback anymore, but there was no error when I ran it (sorry!). I recognize that you will not be able to look into my specific question/use case. I assume that the other entries are NaN, which I'll check. So then is the way that rank_velocity_genes works to output all genes with velocity measurements regardless of statistical significance, ranked (up to the max number of genes)? Is there any way to see which were statistically significant? |
Sorry, did not necessarily mean error stack trace but also the logging statements (along with the corresponding code) provided by scvelo/scanpy which can be helpful to pinpoint why a certain behaviour is being observed (see e.g. #775 where the number of selected highly variable genes differed from the specified number).
No, there is a bit of gene selection going on (see here) which could also explain why might also contribute to the number of genes being smaller than 2000.
No, unfortunately not yet (see #154). |
Sorry for the delay! Attached are the logging statements and code. I separated the commands into small chunks so that I could troubleshoot easier. I also commented out irrelevant lines (for the purposes of this issue). 2_velocity_analysis_loom_macs_all.txt Also I appreciate your comments! I think many of the genes simply did not have enough cells to compute the dynamics and then possibly for others the gene selection that you pointed out may be causing the discrepancy between the number I requested and the number I am getting in the output. This is totally fine, but I wanted to make sure that I understood the output that I was getting prior to drawing any conclusions. |
@natallah, I'm closing this for now. Please provide reproducible code snippets I can simply copy and paste, here on GitHub. Additionally, please provide the accompanying traceback. Sorry, I do not have the time to go through different, random text files that do not contain merely the essential information. |
...
I want scVelo to output the top 2000 genes for each cluster, however only 706 genes were output for each cluster. There is no error message and it seems to run fine and work well, so I don't know that this is a bug and is likely moreso that I do not understand how scVelo is selecting the genes to output (I understand the ranking and statistics, but not why the list is cut at 706 rather than 2000 genes). Of note, I also computed velocities on every gene using scv.tl.recover_dynamics(adata, var_names='all') so the issue shouldnt be that some genes do not have velocities associated with them.
The text was updated successfully, but these errors were encountered: