Increase of size of exons in new annotation created by liftoff #150

Open
sven0schuierer opened this issue Jun 8, 2023 · 0 comments

Hi,

We used Liftoff to transfer the Ensembl 109 genome annotation of Macaca fascicularis (Macaca_fascicularis.Macaca_fascicularis_6.0.109.gtf) to a new genome assembly that was created in-house.

In the annotation of the new assembly generated by Liftoff, we see that exons occasionally become much larger than in the original annotation. Here is an example for the transcript ENSMFAT00000077059 (each line gives exon ID/exon number, followed by the exon length in the Liftoff annotation minus its length in the Ensembl 109 annotation):

ENSMFAEE00000395752/1: 0
ENSMFAEE00000390940/2: 0
ENSMFAEE00000306469/3: 0
ENSMFAEE00000394765/4: 0
ENSMFAEE00000346828/5: 0
ENSMFAEE00000394763/6: 0
ENSMFAEE00000388088/7: 0
ENSMFAEE00000335427/8: 0
ENSMFAEE00000386907/9: 0
ENSMFAEE00000327363/10: 21
ENSMFAEE00000382736/11: 0
ENSMFAEE00000351310/12: 0
ENSMFAEE00000359950/13: 1
ENSMFAEE00000401398/14: 2
ENSMFAEE00000345644/15: 0
ENSMFAEE00000331052/16: 1
ENSMFAEE00000402627/17: 0
ENSMFAEE00000360812/18: 0
ENSMFAEE00000351868/19: 0
ENSMFAEE00000396914/20: 0
ENSMFAEE00000329600/21: 0
ENSMFAEE00000354579/22: 0
ENSMFAEE00000377326/23: 0
ENSMFAEE00000344810/24: 0
ENSMFAEE00000368280/25: 0
ENSMFAEE00000369733/26: 0
ENSMFAEE00000332228/27: 0
ENSMFAEE00000384438/28: 2
ENSMFAEE00000309224/29: 1
ENSMFAEE00000309974/30: 1
ENSMFAEE00000312020/31: 0
ENSMFAEE00000372379/32: 0
ENSMFAEE00000346943/33: 0
ENSMFAEE00000368619/34: 0
ENSMFAEE00000317729/35: -1
ENSMFAEE00000402464/36: 0
ENSMFAEE00000397616/37: 0
ENSMFAEE00000342437/38: 4
ENSMFAEE00000408104/39: 2
ENSMFAEE00000333135/40: 0
ENSMFAEE00000351950/41: 2
ENSMFAEE00000375812/42: 0
ENSMFAEE00000409156/43: 1
ENSMFAEE00000366366/44: 0
ENSMFAEE00000401471/45: 4
ENSMFAEE00000373865/46: 1
ENSMFAEE00000353440/47: 0
ENSMFAEE00000409481/48: 1
ENSMFAEE00000305093/49: 1
ENSMFAEE00000361133/50: 0
ENSMFAEE00000361387/51: 0
ENSMFAEE00000343244/52: 0
ENSMFAEE00000375959/53: 0
ENSMFAEE00000329204/54: 0
ENSMFAEE00000325732/55: 0
ENSMFAEE00000314368/56: 0
ENSMFAEE00000343446/57: 1463
ENSMFAEE00000315765/58: 0
ENSMFAEE00000407038/59: 0
ENSMFAEE00000312377/60: 0
ENSMFAEE00000371461/61: 0
ENSMFAEE00000348082/62: 0
ENSMFAEE00000328504/63: 495
ENSMFAEE00000316044/64: 0
ENSMFAEE00000338362/65: 1959
ENSMFAEE00000380540/66: 0
ENSMFAEE00000389418/67: 0
ENSMFAEE00000358043/68: 0
ENSMFAEE00000409953/69: 0
ENSMFAEE00000360312/70: 0
ENSMFAEE00000355622/71: 0
ENSMFAEE00000340181/72: 0
ENSMFAEE00000362474/73: 0
ENSMFAEE00000398786/74: 495
ENSMFAEE00000365651/75: 0
ENSMFAE00000110662/76: 0
ENSMFAEE00000349697/77: 0
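
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how per-exon length differences like the ones above could be computed. It assumes that both the reference and the lifted-over annotation are GTF files whose exon lines carry `transcript_id` and `exon_id` attributes; the helper names (`exon_lengths`, `length_differences`, `gtf_line`) and the toy coordinates and IDs are hypothetical and are not taken from our data or from Liftoff itself.

```python
import re

# Matches GTF attributes of the form: key "value"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def exon_lengths(gtf_lines, transcript_id):
    """Return {exon_id: length in bp} for the exon features of one transcript."""
    lengths = {}
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon features are of interest
        attrs = dict(ATTR_RE.findall(fields[8]))
        if attrs.get("transcript_id") != transcript_id:
            continue
        start, end = int(fields[3]), int(fields[4])
        # GTF coordinates are 1-based and inclusive
        lengths[attrs["exon_id"]] = end - start + 1
    return lengths


def length_differences(ref_lines, lifted_lines, transcript_id):
    """Return {exon_id: lifted length - reference length} for shared exons."""
    ref = exon_lengths(ref_lines, transcript_id)
    lifted = exon_lengths(lifted_lines, transcript_id)
    return {exon: lifted[exon] - ref[exon] for exon in ref if exon in lifted}


def gtf_line(start, end, exon_id):
    """Build a toy GTF exon line (hypothetical values, for illustration only)."""
    attrs = f'gene_id "g1"; transcript_id "tx1"; exon_id "{exon_id}";'
    return "\t".join(
        ["chr1", "toy", "exon", str(start), str(end), ".", "+", ".", attrs]
    )


# Toy input: exon e1 is 256 bp in the reference and 257 bp after lift-over.
REF = [gtf_line(101, 356, "e1"), gtf_line(1000, 1099, "e2")]
LIFTED = [gtf_line(5001, 5257, "e1"), gtf_line(7000, 7099, "e2")]

diffs = length_differences(REF, LIFTED, "tx1")
for exon, diff in diffs.items():
    print(f"{exon}: {diff}")
```

With real files, the two lists of lines would come from opening the reference GTF and the Liftoff output; exons missing from the lifted annotation are silently skipped in this sketch.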

This seems to contradict the description of the algorithm in the Bioinformatics paper, from which I would have concluded that exons in the annotation created by Liftoff are at most as long as the original exons. By the way, the results above were created without the -polish option. With -polish we get the following results, with exon 16 increasing considerably in length (in the original annotation it is only 256 bp long):

ENSMFAEE00000395752/1: 0
ENSMFAEE00000390940/2: 0
ENSMFAEE00000306469/3: 0
ENSMFAEE00000394765/4: 0
ENSMFAEE00000346828/5: 0
ENSMFAEE00000394763/6: 0
ENSMFAEE00000388088/7: 0
ENSMFAEE00000335427/8: 0
ENSMFAEE00000386907/9: 0
ENSMFAEE00000327363/10: 21
ENSMFAEE00000382736/11: 0
ENSMFAEE00000351310/12: 0
ENSMFAEE00000359950/13: 1
ENSMFAEE00000401398/14: 2
ENSMFAEE00000345644/15: 0
ENSMFAEE00000331052/16: 3412
ENSMFAEE00000402627/17: 0
ENSMFAEE00000360812/18: 0
ENSMFAEE00000351868/19: 0
ENSMFAEE00000396914/20: 0
ENSMFAEE00000329600/21: 0
ENSMFAEE00000354579/22: 0
ENSMFAEE00000377326/23: 507
ENSMFAEE00000344810/24: 0
ENSMFAEE00000368280/25: 0
ENSMFAEE00000369733/26: 0
ENSMFAEE00000332228/27: 0
ENSMFAEE00000384438/28: 2
ENSMFAEE00000309224/29: 1
ENSMFAEE00000309974/30: 1
ENSMFAEE00000312020/31: 0
ENSMFAEE00000372379/32: 0
ENSMFAEE00000346943/33: 0
ENSMFAEE00000368619/34: 0
ENSMFAEE00000317729/35: -1
ENSMFAEE00000402464/36: 0
ENSMFAEE00000397616/37: 0
ENSMFAEE00000342437/38: 4
ENSMFAEE00000408104/39: 2
ENSMFAEE00000333135/40: 0
ENSMFAEE00000351950/41: 2
ENSMFAEE00000375812/42: 0
ENSMFAEE00000409156/43: 1
ENSMFAEE00000366366/44: 0
ENSMFAEE00000401471/45: 4
ENSMFAEE00000373865/46: 1
ENSMFAEE00000353440/47: 0
ENSMFAEE00000409481/48: 1
ENSMFAEE00000305093/49: 1
ENSMFAEE00000361133/50: 0
ENSMFAEE00000361387/51: 0
ENSMFAEE00000343244/52: 0
ENSMFAEE00000375959/53: 0
ENSMFAEE00000329204/54: 0
ENSMFAEE00000325732/55: 0
ENSMFAEE00000314368/56: 0
ENSMFAEE00000343446/57: 0
ENSMFAEE00000315765/58: 0
ENSMFAEE00000407038/59: -21
ENSMFAEE00000312377/60: 0
ENSMFAEE00000371461/61: 0
ENSMFAEE00000348082/62: 0
ENSMFAEE00000328504/63: 4
ENSMFAEE00000316044/64: 0
ENSMFAEE00000338362/65: 0
ENSMFAEE00000380540/66: 0
ENSMFAEE00000389418/67: 0
ENSMFAEE00000358043/68: 0
ENSMFAEE00000409953/69: 0
ENSMFAEE00000360312/70: 0
ENSMFAEE00000355622/71: 0
ENSMFAEE00000340181/72: 0
ENSMFAEE00000362474/73: 0
ENSMFAEE00000398786/74: 0
ENSMFAEE00000365651/75: 0
ENSMFAE00000110662/76: 0
ENSMFAEE00000349697/77: 0
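
To make the effect of -polish easier to see, the two listings can be compared line by line. The sketch below parses the `EXON_ID/NUMBER: DIFF` format used above and reports the exons whose length difference changed between the two runs. The function name `parse_listing` is hypothetical; the embedded rows are copied from the two listings and are the only rows that differ between them.

```python
def parse_listing(text):
    """Parse lines like 'EXON_ID/NUMBER: DIFF' into {exon_number: diff}."""
    diffs = {}
    for line in text.strip().splitlines():
        label, value = line.rsplit(":", 1)
        _exon_id, number = label.rsplit("/", 1)
        diffs[int(number)] = int(value)
    return diffs


# Rows copied from the listing produced without -polish.
without_polish = parse_listing("""
ENSMFAEE00000331052/16: 1
ENSMFAEE00000377326/23: 0
ENSMFAEE00000343446/57: 1463
ENSMFAEE00000407038/59: 0
ENSMFAEE00000328504/63: 495
ENSMFAEE00000338362/65: 1959
ENSMFAEE00000398786/74: 495
""")

# The same exons in the listing produced with -polish.
with_polish = parse_listing("""
ENSMFAEE00000331052/16: 3412
ENSMFAEE00000377326/23: 507
ENSMFAEE00000343446/57: 0
ENSMFAEE00000407038/59: -21
ENSMFAEE00000328504/63: 4
ENSMFAEE00000338362/65: 0
ENSMFAEE00000398786/74: 0
""")

# Exon number -> (difference without -polish, difference with -polish)
changed = {
    number: (without_polish[number], with_polish[number])
    for number in sorted(without_polish)
    if without_polish[number] != with_polish[number]
}

for number, (before, after) in changed.items():
    print(f"exon {number}: {before} -> {after}")
```

So -polish removes the large increases in exons 57, 63, 65 and 74 but introduces new ones in exons 16 and 23.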

What is the explanation for these large increases in the size of the exons?
