Error methyltrain #15

Open
quanc1989 opened this issue Aug 10, 2018 · 4 comments

quanc1989 commented Aug 10, 2018

I downloaded the data from https://www.ebi.ac.uk/ena/data/view/PRJEB13021 and want to run the pipeline. Since the whole dataset is too large, I extracted 20-30 files from the ecoli R7 data for training (ecoli_er2925.MSssI.timp.100215.fast5, ecoli_er2925.native.timp.110915.fast5, ecoli_er2925.pcr_MSssI.timp.021216.fast5, ecoli_er2925.pcr.timp.021216.fast5).

When I ran the pipeline, the following error occurred. I found that the raw R7 fast5 files have no Signal object, so I wonder whether this pipeline can be run on data that lacks the raw signal.

poretools fasta --type 2D /Users/quanc/Documents/Data/Nanopore/data/ecoli_er2925.MSssI.timp.100215.fast5/pass > ecoli_er2925.MSssI.timp.100215.pass.fasta

poretools fasta --type 2D /Users/quanc/Documents/Data/Nanopore/data/ecoli_er2925.MSssI.timp.100215.fast5/fail > ecoli_er2925.MSssI.timp.100215.fail.fasta

cat ecoli_er2925.MSssI.timp.100215.pass.fasta ecoli_er2925.MSssI.timp.100215.fail.fasta > ecoli_er2925.MSssI.timp.100215.fasta

nanopolish index -d ~/Documents/Data/Nanopore/data/ ecoli_er2925.MSssI.timp.100215.fasta
[readdb] num reads: 17, num reads with path to fast5: 17

bwa mem -t 4 -x ont2d ecoli_k12.fasta ecoli_er2925.MSssI.timp.100215.fasta |\
        samtools view -q 20 -Sb - |\
        samtools sort -o ecoli_er2925.MSssI.timp.100215.sorted.bam -T %.tmp
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 17 sequences (125274 bp)...
[M::mem_process_seqs] Processed 17 reads in 1.828 CPU sec, 0.558 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 4 -x ont2d ecoli_k12.fasta ecoli_er2925.MSssI.timp.100215.fasta
[main] Real time: 0.582 sec; CPU: 1.845 sec

samtools index ecoli_er2925.MSssI.timp.100215.sorted.bam

/Users/quanc/Documents/Workspace/Github/methylation-analysis/initialize_model.sh /Users/quanc/Documents/Workspace/Github/methylation-analysis/models/r7.3_e6_70bps_6mer_template_median68pA.model template t.006 SQK006 > t.006.ont.model
ln -s t.006.ont.model t.006.ont.alphabet_nucleotide.model

/Users/quanc/Documents/Workspace/Github/methylation-analysis/initialize_model.sh /Users/quanc/Documents/Workspace/Github/methylation-analysis/models/r7.3_e6_70bps_6mer_complement_median68pA_pop1.model complement.pop1 c.p1.006 SQK006 > c.p1.006.ont.model
ln -s c.p1.006.ont.model c.p1.006.ont.alphabet_nucleotide.model

/Users/quanc/Documents/Workspace/Github/methylation-analysis/initialize_model.sh /Users/quanc/Documents/Workspace/Github/methylation-analysis/models/r7.3_e6_70bps_6mer_complement_median68pA_pop2.model complement.pop2 c.p2.006 SQK006 > c.p2.006.ont.model
ln -s c.p2.006.ont.model c.p2.006.ont.alphabet_nucleotide.model

echo t.006.ont.alphabet_nucleotide.model c.p1.006.ont.alphabet_nucleotide.model c.p2.006.ont.alphabet_nucleotide.model | tr " " "\n" > ont.alphabet_nucleotide.R7.fofn

nanopolish methyltrain -t 4  --train-kmers all --out-fofn ecoli_er2925.MSssI.timp.100215.alphabet_nucleotide.fofn --out-suffix .ecoli_er2925.MSssI.timp.100215.alphabet_nucleotide.model -m ont.alphabet_nucleotide.R7.fofn -b ecoli_er2925.MSssI.timp.100215.sorted.bam -r ecoli_er2925.MSssI.timp.100215.fasta -g ecoli_k12.fasta.alphabet_nucleotide --filter-policy R7

Training SQK006 for alphabet nucleotide for 6-mers
Starting round 0
HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 123145559535616:
  #000: H5L.c line 1117 in H5Lget_name_by_idx(): name doesn't exist
    major: Symbol table
    minor: Object already exists
  #001: H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #002: H5Gtraverse.c line 755 in H5G_traverse_real(): component not found
    major: Symbol table
    minor: Object not found
HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 123145560072192:
  #000: H5L.c line 1117 in H5Lget_name_by_idx(): name doesn't exist
    major: Symbol table
    minor: Object already exists
  #001: H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #002: H5Gtraverse.c line 755 in H5G_traverse_real(): component not found
    major: Symbol table
    minor: Object not found
HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 123145560072192:
  #000: H5D.c line 358 in H5Dopen2(): not found
    major: Dataset
    minor: Object not found
  #001: H5Gloc.c line 430 in H5G_loc_find(): can't find object
    major: Symbol table
    minor: Object not found
  #002: H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #003: H5Gtraverse.c line 641 in H5G_traverse_real(): traversal operator failed
    major: Symbol table
    minor: Callback failed
  #004: H5Gloc.c line 385 in H5G_loc_find_cb(): object 'Signal' doesn't exist
    major: Symbol table
    minor: Object not found
Assertion failed: (rt.n > 0), function load_from_raw, file src/nanopolish_squiggle_read.cpp, line 321.
HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 123145559535616:
  #000: H5D.c line 358 in H5Dopen2(): not found
    major: Dataset
    minor: Object not found
  #001: H5Gloc.c line 430 in H5G_loc_find(): can't find object
    major: Symbol table
    minor: Object not found
  #002: H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
make: *** [ecoli_er2925.MSssI.timp.100215.alphabet_nucleotide.fofn] Abort trap: 6
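
The informative line in the trace above is the frame reporting that object 'Signal' doesn't exist; the rest is HDF5 unwinding its call stack. As a rough sketch (not part of nanopolish or the pipeline), the hypothetical helper below pulls such messages out of a captured log so the missing object name is easy to spot. The function name and the regular expression are assumptions based only on the message format shown in this log.

```python
import re

# Matches HDF5 diagnostic lines such as:
#   ... H5G_loc_find_cb(): object 'Signal' doesn't exist
MISSING_OBJECT = re.compile(r"object '([^']+)' doesn't exist")


def missing_hdf5_objects(log_text):
    """Return the distinct object names reported missing, in first-seen order."""
    seen = []
    for name in MISSING_OBJECT.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


# A fragment copied from the trace in this issue.
log = """  #004: H5Gloc.c line 385 in H5G_loc_find_cb(): object 'Signal' doesn't exist
    major: Symbol table
    minor: Object not found"""

print(missing_hdf5_objects(log))  # prints ['Signal']
```
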
@jts
Owner

jts commented Aug 10, 2018

Hi,

The fast5 file structure has changed a lot since 2015 and R7 data is no longer well supported. If you want to exactly replicate the analysis for our paper you'll have to use the specific version of nanopolish that we have in pipeline.make.

Jared

@quanc1989
Author

@jts Thanks a lot! This error confused me for a few days.
Based on your suggestion, could I conclude that the current version of nanopolish only works with fast5 files that contain /Raw/Signal? And is that true for methyltrain, methyltest and call-methylation alike?

@jts
Owner

jts commented Aug 10, 2018

Yes, all modern ONT data will contain /Raw/Signal. We've tried to maintain support for older data in nanopolish but since no one really uses R7 data anymore some features may be neglected.
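
For anyone wanting to check a file before running the pipeline, here is a minimal sketch of a pre-flight test for the raw signal. It assumes the single-read fast5 layout, where the dataset sits at Raw/Reads/Read_<N>/Signal; that path pattern and the function names are assumptions for illustration, not something nanopolish exposes. Listing a file's contents uses the third-party h5py package, which is imported lazily so the path check itself needs only the standard library.

```python
import re

# Assumed location of the raw signal in single-read fast5 files.
RAW_SIGNAL_PATTERN = re.compile(r"^/?Raw/Reads/Read_\d+/Signal$")


def has_raw_signal(object_paths):
    """Return True if any HDF5 object path looks like a raw Signal dataset."""
    return any(RAW_SIGNAL_PATTERN.match(p) for p in object_paths)


def list_fast5_paths(fast5_path):
    """Collect the names of all groups and datasets in one fast5 file."""
    import h5py  # third-party; only needed when reading real files

    paths = []
    with h5py.File(fast5_path, "r") as f:
        # visit() calls the function once per object name; append returns
        # None, so the traversal continues over the whole file.
        f.visit(paths.append)
    return paths


# Example with hand-written path lists (no file access needed).
modern = ["Raw", "Raw/Reads", "Raw/Reads/Read_12", "Raw/Reads/Read_12/Signal"]
events_only = ["Analyses", "Analyses/EventDetection_000/Reads/Read_12/Events"]
print(has_raw_signal(modern))
print(has_raw_signal(events_only))
```

On a real file, `has_raw_signal(list_fast5_paths("read.fast5"))` would give the same kind of answer; the file name here is a placeholder.
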

@quanc1989
Author

Got it. Then I have to find some R9 data to train the model.
Thanks again.
