Using hifiasm on the gametophyte of a fern #718

guoshanf · 2024-11-02T09:15:21Z

Hello!
Thank you for providing such an excellent software!
I am currently working on a study involving a fern. Due to an oversight, I sent its haploid gametophyte for sequencing, so the sequencing data covers only one set of chromosomes. I did not notice this at the time and assembled the genome using hifiasm with default settings. I had previously estimated the genome size at approximately 1.8 Gb using flow cytometry, but the resulting assembly is 3.6 Gb. Is this because hifiasm, by default, assumes diploidy and assembles two sets of chromosomes? If so, is there a way to assemble just one haploid set with hifiasm? If not, what could explain the discrepancy between the assembled genome size and the predicted genome size?
Thanks in advance for your help!

chhylp123 · 2024-11-04T17:27:09Z

You could consider using purge_dups: https://github.com/dfguan/purge_dups

guoshanf · 2024-11-05T02:39:10Z

Thank you very much for your advice!

Previously, I thought the issue was with the sequencing depth, so I generated additional HiFi data, for a total of 222G of fastq data. I then ran three separate assemblies: one from the first sequencing run, one from the second sequencing run, and one from the combined data of both runs. The assembly sizes were 3.6 Gb, 4.0 Gb, and 4.6 Gb, respectively. I first evaluated the combined assembly with BUSCO, which gave 96.2% with the viridiplantae database but only 83.1% with the embryophyta database. After that, I processed it with purge_haplotigs and obtained a purged assembly of 4.2 Gb. Both sequencing runs were performed on the haploid gametophyte.

Since the three assemblies differ in size, I am now unsure whether my approach is correct. I am also uncertain whether data from haploid sequencing can be assembled directly with hifiasm. At this point, the flow cytometry result does not seem to offer much reference value either. Do you think it is necessary to resequence the diploid sporophyte? Or should I increase the sequencing depth further? Alternatively, should I switch to a different assembly tool?

Thank you in advance for your generous assistance!
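
For reference, the sizes reported in this thread can be set against the flow cytometry estimate with a few lines of Python. This is only an arithmetic sketch: it takes the quoted sizes as gigabases, and the names `EXPECTED_GB`, `ASSEMBLIES_GB`, and `size_ratios` are made up for illustration and are not part of hifiasm or any other tool.

```python
# Sizes in gigabases, as quoted in the thread (assumed unit).
EXPECTED_GB = 1.8  # flow cytometry estimate

ASSEMBLIES_GB = {
    "first run": 3.6,
    "second run": 4.0,
    "combined": 4.6,
    "combined, after purge_haplotigs": 4.2,
}


def size_ratios(assemblies, expected):
    """Return each assembly size divided by the expected genome size."""
    return {name: round(size / expected, 2) for name, size in assemblies.items()}


RATIOS = size_ratios(ASSEMBLIES_GB, EXPECTED_GB)

for name, ratio in RATIOS.items():
    print(f"{name}: {ratio:.2f}x the flow cytometry estimate")
```

Every assembly is at least twice the estimate, including the purged one. The sizes alone do not show whether the excess comes from duplicated sequence in the assembly or from an inaccurate estimate.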
