Step 4 synteny error #162

Open
Hannah1746 opened this issue Jul 18, 2024 · 8 comments

Comments

@Hannah1746

I am running GENESPACE and keep hitting the error below, and I can't pin down what is causing it.

############################
4. Flagging synteny for each pair of genomes ...
# Chunk 1 / 2 (02:10:40 PM) ...
Error in rbindlist(mclapply(1:nrow(chnk), mc.cores = nCores, function(i) { :
Item 1 of input is not a data.frame, data.table or list
Calls: run_genespace -> synteny -> rbindlist -> lapply -> FUN -> rbindlist
In addition: Warning message:
In mclapply(1:nrow(chnk), mc.cores = nCores, function(i) { :
scheduled core 1 encountered error in user code, all values of the job will be affected
Execution halted

I know which individual is causing the problem, but its BED and protein FASTA inputs don't seem to have anything wrong with them.

I would love some help trying to debug this if you have time.

@jtlovell
Owner

Please try running with nCores = 1 and reporting the error. Usually this happens when there is no synteny, but there are other possible causes.
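A minimal sketch of why rerunning with nCores = 1 surfaces the underlying error. It uses only the base `parallel` package; `worker` is a hypothetical stand-in for the per-pair synteny function, not GENESPACE code. With more than one core (Unix-alikes only), `mclapply` catches a failing job and returns a "try-error" object in its place, so a later row-bind sees an item that is not a table. With one core it falls back to a plain `lapply`, and the original error propagates.

```r
library(parallel)

# Hypothetical worker: fails for one input, standing in for a genome pair
# that triggers an error inside the synteny step.
worker <- function(i) {
  if (i == 1) stop("object 'outHits' not found")
  data.frame(id = i)
}

# Forked run (not available on Windows): the failure is captured per job and
# returned as a "try-error" element, with only a warning.
res <- suppressWarnings(mclapply(1:2, worker, mc.cores = 2))
is_err <- vapply(res, inherits, logical(1), what = "try-error")

# Serial run: mclapply delegates to lapply, so the real error is raised.
serial_msg <- tryCatch(
  {
    mclapply(1:2, worker, mc.cores = 1)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
```

Here `is_err` flags the element that would break a downstream `rbindlist`, and `serial_msg` holds the message that the parallel run hid.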

@Hannah1746
Author

Here is the new error:
############################
4. Flagging synteny for each pair of genomes ...
Error in FUN(X[[i]], ...) : object 'outHits' not found
Calls: run_genespace ... lapply -> FUN -> rbindlist -> mclapply -> lapply -> FUN
Execution halted
I know for a fact there is synteny between all my individuals. The one that is causing issues (DR) has a one-to-two gene ratio with the others.
input code:

wd <- "/mnt/krab3/catostomid_GENESPACE"
setwd(wd)

path2mcscanx <- "/home/krablab/Documents/apps/MCScanX"

gpar <- init_genespace(
  wd = wd,
  path2mcscanx = path2mcscanx,
  genomeIDs = c("DR", "M.asiaticus", "X.texanus", "H.nigricans", "C.commersonii", "M.valenciennesi"),
  ploidy = c(0, 1, 1, 1, 1, 1),
  nCores = 1
)

out <- run_genespace(gpar, overwrite = TRUE)

@jtlovell
Owner

I never thought to check that ploidy is > 0 ... can you try that: ploidy = c(0,1,1,1,1,1) + 1
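A small sketch of the suggested change and of a pre-flight check that would catch it. `check_ploidy` is a hypothetical helper, not part of GENESPACE, and the rule it enforces (every ploidy value must be at least 1) is an assumption drawn from the comment above.

```r
# Hypothetical helper (not a GENESPACE function): stop early if any genome
# has a ploidy below 1, instead of failing later inside the synteny step.
check_ploidy <- function(ploidy, genomeIDs) {
  stopifnot(length(ploidy) == length(genomeIDs))
  bad <- genomeIDs[is.na(ploidy) | ploidy < 1]
  if (length(bad) > 0) {
    stop("ploidy must be >= 1 for: ", paste(bad, collapse = ", "))
  }
  invisible(ploidy)
}

genomeIDs <- c("DR", "M.asiaticus", "X.texanus", "H.nigricans",
               "C.commersonii", "M.valenciennesi")
ploidy <- c(0, 1, 1, 1, 1, 1)

# The original vector is rejected because DR has ploidy 0.
msg <- tryCatch(check_ploidy(ploidy, genomeIDs),
                error = function(e) conditionMessage(e))

# Adding 1 shifts every element, giving 1 for DR and 2 for the others.
fixed <- ploidy + 1
check_ploidy(fixed, genomeIDs)
```

The shift keeps the one-to-two relationship between DR and the other genomes while making every value positive.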

@Hannah1746
Author

So this moved me forward but I am still getting an error:
6. Integrating syntenic positions across genomes ...
##############
Generating syntenic dotplots ... Done!
##############
Interpolating syntenic positions of genes ...
Drer: (0 / 1 / 2 / >2 syntenic positions)
Error in vecseq(f__, len__, if (allow.cartesian || notjoin || !anyDuplicated(f__, :
Join results in 342554 rows; more than 338510 = nrow(x)+nrow(i). Check for duplicate key values in i each of which join to the same group in x over and over again. If that's ok, try by=.EACHI to run j for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and data.table issue tracker for advice.
Calls: run_genespace ... merge -> merge.data.table -> [ -> [.data.table -> vecseq
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Execution halted

The thing is, I have used this BED and FASTA before to plot and it worked, but now it is not working. When I take it out of the run, I can also get the run to complete.

I am sorry for taking up so much of your time!!!

@jtlovell
Owner

It's alright ... how do the dotplots look? Is it possible there is no synteny?

@Hannah1746
Author

No there is synteny. All the dotplots show that. Here are a couple of them:
Drer_vs_Gyro.syntenicHits.pdf
Drer_vs_H.nigricans.syntenicHits.pdf
X.texanus_vs_Drer.syntenicHits.pdf
X.texanus_vs_Gyro.syntenicHits.pdf

@jtlovell
Owner

pls send me an email so I can troubleshoot your run. jlovell [at] hudsonalpha [dot] org

@jtlovell
Owner

OK - there is something funky with your run that was causing there to be duplicated block coordinates ... I couldn't figure out what was causing that, but I did just commit a change to master that now runs through your genomes without erroring out.
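For anyone hitting the same join error, a rough sketch of how duplicated key rows inflate a merge, and how to spot them. It uses base R data frames as a stand-in for data.table. The tables and column names (`blks`, `genes`, `blkID`, and so on) are made up for illustration and are not GENESPACE's internal objects.

```r
# Hypothetical block table with one block coordinate row repeated.
blks <- data.frame(
  blkID = c("b1", "b2", "b2", "b3"),
  chr   = c("1", "1", "1", "2"),
  start = c(100L, 500L, 500L, 50L),
  end   = c(400L, 900L, 900L, 300L)
)

# Hypothetical gene table, one gene per block.
genes <- data.frame(
  gene  = c("g1", "g2", "g3"),
  blkID = c("b1", "b2", "b3")
)

# Rows whose block coordinates repeat an earlier row.
key_cols <- c("blkID", "chr", "start", "end")
dup_rows <- blks[duplicated(blks[, key_cols]), ]

# Joining against the duplicated table repeats gene g2; dropping the
# duplicate rows first restores one row per gene.
merged_raw   <- merge(genes, blks, by = "blkID")
merged_dedup <- merge(genes, unique(blks), by = "blkID")
```

In data.table, a join that grows past nrow(x)+nrow(i) in this way raises the error quoted earlier in the thread unless allow.cartesian=TRUE is set, so checking for duplicated keys first is usually the safer route.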
