Segmentation fault and no recovery possible #222

Open
RNieuwenhuis opened this issue Nov 14, 2023 · 0 comments

Describe the issue
I tried to run RepeatModeler on a very large genome in the TETools Singularity container on a machine with 64 cores and over 750 GB of RAM.
I raised the maximum sample size to 1 Gbp so that a reasonable fraction of my genome is sampled. The run proceeded normally for about a week and reached the eledef stage of round-5, where it exited with code 139, which corresponds to a segmentation fault (SIGSEGV).

99% completed,  00:0:00 (hh:mm:ss) est. time remaining.
      100% completed,  00:0:00 (hh:mm:ss) est. time remaining.
Comparison Time: 62:27:00 (hh:mm:ss) Elapsed Time, 757981572 HSPs Collected
  - RECON: Running imagespread..
RECON Elapsed: 01:34:19 (hh:mm:ss) Elapsed Time
  - RECON: Running initial definition of elements ( eledef )..
eledef failed. Exit code 139
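
For reference, this is how I read exit code 139 as a segmentation fault: by shell convention a status above 128 means the process was killed by signal (status - 128), and 139 - 128 = 11, which is SIGSEGV on Linux. A minimal sketch, assuming a POSIX shell; `decode_exit` is a hypothetical helper for illustration, not part of RepeatModeler or TETools.

```shell
# Decode an exit status the way the shell reports it:
# values above 128 conventionally mean "killed by signal (status - 128)".
# decode_exit is a hypothetical helper, not a RepeatModeler tool.
decode_exit() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l <number> prints the signal name without the SIG prefix
        echo "signal $(kill -l "$((status - 128))")"
    else
        echo "exit $status"
    fi
}

decode_exit 139
```

This only identifies the signal; it does not say why eledef crashed.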

Restarting the pipeline using -recoverDir results in the following message:

Oops...the RM_3967425.WedNov81814392023 run did not get passed round-1.
It makes more sense to restart this run from the beginning.
Remove the -recoverDir option and rerun the program.

The consensi.fa and families.stk files are empty, both at the top level of the run directory and inside the directory for each round.

ls -l ./
total 2004
-rw-r--r-- 1 nieuw133 domain users       0 Nov  8 18:49 consensi.fa
-rw-r--r-- 1 nieuw133 domain users       0 Nov  8 18:49 families.stk
-rw-r--r-- 1 nieuw133 domain users    7801 Nov 14 12:28 rmod.log
drwxr-xr-x 2 nieuw133 domain users   57344 Nov  8 18:47 round-1
drwxr-xr-x 4 nieuw133 domain users   53248 Nov  8 20:05 round-2
drwxr-xr-x 4 nieuw133 domain users  159744 Nov  9 05:24 round-3
drwxr-xr-x 4 nieuw133 domain users  434176 Nov 11 16:46 round-4
drwxr-xr-x 5 nieuw133 domain users 1318912 Nov 14 09:19 round-5
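
The empty output files can be listed in one pass rather than by inspecting each round directory. A minimal sketch, assuming a `find` that supports `-maxdepth` (GNU and BSD versions do); `list_empty_outputs` is a hypothetical helper name, and the file names are the ones shown in the listing above.

```shell
# List zero-byte consensi.fa / families.stk files in a run directory
# and its round-N subdirectories.
# list_empty_outputs is a hypothetical helper, not a RepeatModeler tool.
list_empty_outputs() {
    find "$1" -maxdepth 2 -type f \
        \( -name 'consensi.fa' -o -name 'families.stk' \) -size 0
}
```

Calling `list_empty_outputs .` from inside the RM_* run directory prints the path of every empty result file; no output means every matching file has content.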

Reproduction steps

RepeatModeler -database My_genome -threads 64 -LTRStruct -genomeSampleSizeMax 1000000000

Log output

See above

Environment (please include as much of the following information as you can find out):

I used the latest TETools Singularity image as is, mounted my working directory, and built the database with BuildDatabase. No other databases were installed.

What is going on? Why are there no results stored that I can use for recovery?
