v1.04 updated D genotype references
Cmnorris7 committed Oct 28, 2024
1 parent 94149ff commit c95415c
Showing 6 changed files with 11 additions and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -12,7 +12,7 @@ Using GenoFLU, fully Eurasian and distinct introductions of H5 2.3.4.4b virus ar

The GenoFLU tool is intended to identify the genotype of North American H5 2.3.4.4b viruses as well as providing information on individual segments when a sequence does not belong to a defined genotype. Input for the tool should be high quality, high coverage sequences with all eight segments present. The number of mixed bases should be low as mixed sequences may result in aberrant genotype calls. Both FASTQ and FASTA sequence data can be input into the tool; however, FASTA file input will not generate statistics on the average depth of coverage and care should be taken to utilize high quality consensus sequences.

- Within the [genotyping scheme](./docs/Genotyping_reference_for_US_H5_2.3.4.4b_10-21-2024.pdf), each segment has a set of reference “type” sequences for the segment, each assigned a unique number. Sequences that fall within 2% identity of the reference are called as that segment number. Each genotype is defined by the constellation of segment numbers.
+ Within the [genotyping scheme](./docs/Genotyping_reference_for_US_H5_2.3.4.4b.pdf), each segment has a set of reference “type” sequences for the segment, each assigned a unique number. Sequences that fall within 2% identity of the reference are called as that segment number. Each genotype is defined by the constellation of segment numbers.

## Installation

@@ -0,0 +1,8 @@
>am24 24-030039-001 PB2
ATGGAGAGAATAAAAGAACTGAGAGATCTAATGTCACAGTCTCGCACTCGCGAGATACTAACCAAAACCACTGTTGACCACATGGCCATAATCAAGAAGTACACATCAGGAAGGCAAGAAAAGAACCCTGCACTCAGGATGAAATGGATGATGGCAATGAAATACCCAATCACAGCAGACAAGCGAATAATGGAAATGATCCCTGAAAGGAATGAACAAGGACAAACCCTCTGGAGCAAGACAAATGATGCTGGATCAGATAGAGTGATGGTGTCACCCCTGGCTGTGACATGGTGGAATAGGAATGGACCAACAACAAGTACAGTTCACTATCCAAAGGTATACAAAACTTATTTTGAAAAAGTTGAAAGGTTGAAACACGGGACCTTTGGCCCTGTACACTTCAGAAACCAAGTTAAGATAAGACGGAGGGTCGACATAAACCCGGGCCATGCTGACCTCAGTGCCAAAGAGGCGCAGGACGTAATCATGGAAGTTGTCTTTCCAAATGAAGTAGGAGCGAGAATATTGACGTCGGAGTCACAATTGACGATAACAAAGGAAAAGAAGGAAGAACTCCAGGACTGCAAAATCGCCCCTCTGATGGTTGCATACATGCTAGAAAGAGAGCTGGTCCGCAAGACAAGGTTTCTCCCAGTGGCTGGTGGAACAAGCAGTGTCTACATTGAGGTGCTGCATTTGACCCAGGGAACATGCTGGGAGCAAATGTATACTCCAGGAGGAGAAGTGAGAAACGATGATGTAGACCAAAGCTTAATTATCGCTGCTAGGAACATAGTAAGAAGAGCAACAGTGTCAGCAGACCCATTAGCATCTCTATTGGAGATGTGCCACAGCACACAAATTGGAGGAATAAGAATGGTAGACATTCTTCGGCAAAATCCAACAGAGGAACAAGCCGTGGACATATGCAAGGCGGCAATGGGCTTGAGGATTAGCTCATCTTTCAGCTTTGGTGGATTCACTTTTAAAAGAACAAGTGGGTCATCAGTCAAAAGGGAAGAAGAAGTGCTTACGGGCAATCTTCAAACATTGAAAATAAGAGTGCATGAGGGGTATGAAGAATTCACTATGGTTGGAAGAAGAGCAACGGCCATTCTAAGGAAAGCAACCAGAAGACTGATCCAGTTAATAGTAAGTGGAAGGGACGAACAGTCAATTGCTGAAGCAATAATTGTGGCCATGGTATTCTCACAAGAGGATTGCATGATAAAGGCAGTTCGAGGTGACCTGAATTTTGTCAATAGGGCAAATCAGCGGCTGAATCCAATGCATCAGCTCTTGAGACACTTCCAAAAGGATGCAAAAGTGCTTTTCCAAAATTGGGGAATTGAGCCCATTGACAATGTGATGGGAATGATCGGGATATTGCCTGACATGACTCCAAGTACCGAGATGTCTCTGAGGGGAATAAGAGTCAGTAAGATGGGAGTAGATGAATACTCCAGTACAGAGCGGGTAGTAGTAAGCATCGACCGATTTTTAAGAGTCCGAGACCAACGGGGGAATGTACTACTGTCACCCGAAGAGGTCAGCGAGACACAAGGAACAGAGAAACTGACAATCACTTACTCGTCATCAATGATGTGGGAGATCAATGGTCCTGAGTCGGTGTTGGTCAATACCTATCAGTGGATAATCAGAAACTGGGAAACTGTAAAAATTCAATGGTCACAAGATCCCACGATGTTGTATAATAAGATGGAGTTCGAGCCATTCCAGTCTCTGGTCCCTAAGGCAGCCAGGGGTCAATACAGTGGGTTCGTGAGGACACTATTTCAGCAAATGCGAGATGTACTTGGAACATTTGACACTGTTCAGATAATAAAACTTCTCCCCTTTGCCGCTGCTCCACCGGAACAAAGTAGAATGCAATTCTCCTCCCTGACCGTGAATGTGAGAGGATCAGGAATGAGAATACTGGTAAGAGGCAATTCTCCAGTGTTCAACTACAACAAGGCCACCAAGAGGCTCACAGT
TCTCGGGAAAGACGCAGGTGCATTGACCGAAGACCCAGATGAAGGCACAGCTGGAGTGGAGTCTGCTGTTTTAAGAGGATTTCTCATTTTGGGCAAAGAAGACAAGAGATATGGCCCAGCACTGAGCATCAATGAGTTGAGCAATCTTGCAAAGGGAGAGAAGGCTAATGTGCTAATTGGGCAAGGAGACGTAGTGTTGGTGATGAAACGGAAACGGGACTCTAGCATACTTACTGACAGCCAGACAGCGACCAAAAGAATTCGGATGGCCATCAATTAG
>am4 24-030039-001 PA
ATGGAAGACTTTGTGCGACAATGCTTCAATCCAATGATCGTCGAGCTTGCGGAAAAGGCAATGAAAGAATATGGGGAAGATCCGAAAATCGAAACTAACAAGTTCGCTGCAATATGCACTCATTTGGAAGTCTGTTTCATGTATTCGGATTTCCATTTCATTGATGAACGGGGCGAATCAATAATTGTGGAATCTGGCGATCCAAATGCACTACTGAAGCATCGATTTGAGATAATTGAAGGAAGAGACAGAACAATGGCCTGGACAGTGGTAAATAGCATCTGCAACACCACGGGAGTCGAGAAGCCCAAGTTCCTTCCTGATTTGTATGATTACAAGGAGAACCGATTCATTGAGATTGGAGTGACACGGAGAGAGGTCCATATATATTACCTAGAGAAAGCCAACAAGATAAAATCCGAGAAGACACACATCCACATCTTCTCATTTACTGGAGAAGAAATGGCCACTAAAGCAGACTACACCCTTGACGAAGAAAGCAGAGCAAGGATTAAAACCAGGCTATTCACTATAAGACAAGAAATGGCCAGCAGGGGTCTATGGGATTCCTTTCGTCAGTCCGAAAGAGGCGAAGAGACAATTGAAGAAAGATTTGAAATCACAGGAACCATGCGCAGGCTTGCCGACCAAAGTCTCCCACCGAACTTCTCCAGCCTTGAAAACTTTAGAGCCTATGTGGATGGATTCGAACCGAACGGCTGCATTGAGGGCAAGCTTTCTCAAATGTCCAAAGAAGTGAACGCCAGAATTGAACCATTTTTGAAGACAACACCACGCCCTCTCAAATTGCCTGATGGGCCCCCCTGCTCTCAGCGGTCAAAATTTCTGCTGATGGATGCTTTGAAATTAAGCATTGAAGACCCAAGTCATGAGGGAGAGGGGATACCACTGTACGATGCAATCAAATGCATGAAGACATTTTTCGGCTGGAAAGAGCCCAATGTAATCAAACCACATGAAAAGGGCATAAACCCTAACTACCTCCTGGCTTGGAAGCAAGTGCTAGCAGAACTCCAGGACCTTGAAAATGGGGAGAAAATCCCAAAGACGAAGAACATGAAGAAAACAAGTCAATTAAAGTGGGCACTTGGTGAGAACATGGCACCGGAAAAAGTGGACTTTGAGGACTGCAAGGATGTTGGCGATCTAAAACAGTATGATAGCGATGAGCCAGAGCCTAGATCGCTAGCGAGTTGGATCCAGAGTGAATTCAATAAGGCATGTGAATTGACTGACTCAAGCTGGATAGAACTGGACGAAATAGGGGAAGATGTTGCTCCGATTGAACACATTGCAAGCATGAGGAGGAATTATTTCACAGCAGAAGTGTCCCATTGCAGGGCCACTGAATACATAATGAAAGGAGTCTACATAAATACAGCTCTGCTCAATGCATCTTGCGCGGCCATGGATGACTTCCAGCTGATTCCAATGATAAGCAAATGCAGGACCAAAGAAGGAAGACGGAAAACAAACCTATATGGGTTCATCATAAAAGGAAGGTCTCATTTGAGGAATGATACCGATGTAGTGAATTTTGTAAGTATGGAGTTTTCTCTCACCGACCCAAGGCTGGAACCACACAAATGGGAAAAGTACTGCGTTCTTGAAGTGGGAGATATGCTCCTGAGGACTGCAATAGGCCAAGTATCAAGACCCATGTTCCTGTATGTTAGGACCAACGGGACCTCCAAAATCAAGATGAAATGGGGTATGGAGATGAGGCGTTGCCTTCTTCAGTCTCTTCAACAGATTGAGAGCATGATTGAGGCCGAGTCTTCTGTCAAAGAAAAAGACATGACTAAAGAATTTTTTGAGAACAAGTCGGAAACGTGGCCAATTGGAGAATCCCCCAGAGGGGTAGAGGAAGGATCCATTGGGAAGGTATGCAGAACCCTGCTGGCAAAATCTGTGTTCAACAGTCTATACGCATCCCCACAACTTGAAGGATTTTCAGCAGAATCGAGGAAACTGCTTCT
CATTGTTCAGGCACTTAGGGACAACCTGGAACCTGGAACCTTCGATCTTGGAGGGCTATATGAAGCAATTGAGGAGTGCCTGATTAATGATCCCTGGGTTTTGCTTAATGCATCTTGGTTCAACTCCTTCCTCACACATGCACTGAAATAG
>am13 24-030039-001 NP
ATGGCGTCTCAAGGCACCAAACGATCCTATGAACAAATGGAAACTGGTGGGGAACGCCAGAATGCCACTGAAATCAGAGCATCTGTTGGAAGAATGGTTGGCGGAATCGGGAGATTCTACATACAGATGTGCACTGAGCTCAAACTCAGTGATTACGAAGGGAGGCTGATCCAAAACAGCATAACCATAGAAAGGATGGTTCTCTCAGCATTTGATGAGAGGAGGAACAAGTATCTGGAAGAACATCCCAGTGCTGGGAAGGATCCCAAGAAGACTGGAGGTCCAATCTACAGGAGAAGAGATGGCAAATGGATGAGAGAGTTGATCCTCTACGACAAAGAAGAGATCAGAAGAATTTGGCGTCAAGCTAATAATGGAGAGGATGCAACTGCTGGTCTCACTCATTTGATGATTTGGCATTCCAATCTGAATGATGCCACATACCAGAGAACAAGGGCACTTGTGCGTACTGGAATGGACCCTAGGATGTGCTCTCTGATGCAAGGCTCAACCCTCCCTAGGAGATCCGGGGCTGCTGGAGCAGCAGTGAAAGGAGTTGGAACAATGGTAATGGAATTGATTCGGATGATCAAACGAGGGATCAATGATCGGAATTTCTGGAGAGGCGAAAATGGACGGAGAACCAGGATTGCCTACGAGAGAATGTGCAACATTCTCAAGGGAAAGTTCCAAACAGCAGCACAACGAGCAATGATGGACCAAGTGAGGGAAAGCCGGAATCCTGGGAATGCTGAGATTGAAGATCTCATCTTTCTCGCACGATCTGCTCTCATCTTGAGGGGATCAGTGGCTCATAAGTCCTGTCTGCCTGCTTGCGTGTATGGACTTGCTGTAGCCAGTGGATATGACTTTGAAAGAGAAGGATACTCTCTAGTCGGAATTGATCCTTTCCGTCTGCTCCAAAACAGTCAAGTCTTCAGTCTCATCAGACCGAACGAAAATCCAGCTCATAAAAGTCAGCTGGTATGGATGGCATGCCACTCTGCGGCATTTGAGGATCTAAGAGTGTCAAGCTTCATCAGAGGGACAAGAGTAGTCCCAAGAGGACAACTGTCCACCAGAGGAGTTCAGATTGCTTCAAATGAAAACATGGAGACAATGGACTCCAGTACTCTCGAACTGAGGAGCAGATACTGGGCTATAAGAACAAGAAGTGGAGGAAACACTAACCAACAGAGAGCATCTGCAGGGCAAATCAGCGTACAGCCCACATTCTCTGTGCAGAGAAACCTCCCATTCGAAAGAGCAACCATCATGGCAGCATTTACGGGAAACACTGAAGGCAGAACTTCAGACATGAGAACTGAGATCATAAGGATGATGGAAAATGCCAGACCTGAAGATGTGTCTTTCCAGGGGCGGGGAGTCTTCGAGCTCTCGGACGAAAAGGCAACGAACCCGATCGTGCCTTCCTTTGACATGAGCAATGAAGGATCTTATTTCTTCGGAGACAATGCAGAGGAGTATGACAATTAA
>am4N1 24-030039-001 NA
ATGAATCCAAATCAAAAGATAATAACTATCGGGTCAATCTGCATGGTAATTGGAATAATAAGTCTGGTGCTACAAATTGGAAACATAATCTCAATATGGGTTAGTCATTCAATTCAAACTGGAAACCAGAACCACCCAGAAACATGCAATCAAAGTGTCATTACCTACGAAAACAATACTTGGGTGAATCAGACATACATCAACATAAGTAATACCAATTTAATTGCAGAACAGGCTGTAGATCCAGTAGCACTAGCAGGTAATTCCTCTCTCTGTCCAATCAGTGGATGGGCCATATACAGCAAGGACAATGGTATAAGGATAGGTTCCAAAGGAGATGTATTTGTCATCAGAGAGCCTTTTATTTCATGCTCTCACTTGGAATGCAGGACCTTTTTTCTAACTCAAGGGGCCCTGTTGAATGATAAGCATTCTAATGGAACCGTTAAAGACAGAAGCCCTTATAGAACCCTGATGAGCTGCCCTGTTGGTGAAGCTCCTTCACCATACAATTCAAGGTTTGAGTCTGTTGCTTGGTCAGCAAGTGCTTGTCATGATGGCATTAGTTGGTTGACAATTGGTATTTCTGGCCCAGACAATGGGGCGGTGGCTGTATTGAAATACAATGGCATAATAACAGATACTATTAAGAGTTGGAGAAGCAATATATTGAGAACACAAGAGTCTGAATGTGCCTGCATTAATGGTTCTTGCTTTACCATAATGACTGATGGACCGAGTAATGGCCAGGCCTCATACAAAATTTTCAGAATAGAAAAGGGAAAGGTAGTCAAATCAGTTGAGTTGAATGCCCCTAATTATCACTATGAGGAGTGCTCCTGTTATCCTGATGCTAGCGAGGTAATGTGTGTGTGCAGAGACAACTGGCATGGTTCAAACCGACCATGGGTGTCCTTCAATCAAAATCTGGAATATCAAATAGGGTACATATGCAGCGGAGTTTTTGGAGACAACCCGCGCCCCAGTGATGGAACAGGCAGTTGTGGTCCAGTGTCCTCTAATGGGGCATATGGAGTGAAGGGATTTTCATTTAAATACGGTAATGGTGTTTGGATAGGAAGAACTAAAAGTACTAGCTCAAGGAGTGGGTTTGAGATGATCTGGGATCCTAATGGGTGGACGGAGACAGACAGCAGTTTTTCTGTAAAGCAAGATATTGTAGCAATAACGGACTGGTCAGGATATAGCGGAAGTTTTGTTCAGCATCCAGAACTGACGGGGTTGGATTGCATGAGGCCTTGCTTCTGGGTTGAGCTGATCAGAGGAAGACCCAGAGAGAACACGATTTGGACCAGTGGAAGCAGCATTTCTTTTTGTGGAGTAAATAGCGACACTGTGGGTTGGTCTTGGCCAGACGGTGCTGAGTTGCCATTCACCATTGACAAGTAG
@@ -0,0 +1,2 @@
>am5 24-031478-001 PA
ATGGAAGATTTTGTGCGACAATGCTTCAATCCAATGATCGTCGAGCTTGCGGAAAAGGCAATGAAGGAATATGGGGAAGATCCAAAAATTGAGACAAACAAATTTGCTGCAATATGCACACACTTAGAAGTATGTTTCATGTATTCAGATTTCCATTTCATTGATGAACGAGGTGAATCAATAATCGTGGAATCTGGCGATCCAAATGCACTCCTGAAACACCGATTTGAAATAATTGAAGGGAGAGACCGCACCATGGCCTGGACAGTAGTGAACAGTATCTGCAACACTACAGGAGTCGAAAAACCCAAGTTTCTCCCGGATTTATACGATTACAAAGAGAACCGTTTCATTGAAATTGGAGTAACCAGGAGGGAAGTTCATATATACTATTTAGAAAAGGCCAATAAGATAAAGTCTGAGAAAACACACATTCATATCTTTTCATTCACTGGAGAAGAAATGGCCACTAAAGCAGACTACACCCTTGATGAAGAAAGCAGAGCGAGGATCAAAACCAGACTATTCACCATAAGGCAAGAGATGGCCAGTAGAGGCCTCTGGGATTCCTTTCGTCAGTCCGAGAGAGGCGAAGAGACAATTGAAGAAAGATTTGAAATTACAGGAACCATGCGCAGGCTCGCCGACCAAAGTCTCCCACCGAACTTCTCCAGCCTTGAAAACTTTAGAGCCTATGTGGATGGATTCGAACCGAACGGTTGCATTGAGGGCAAGCTTTCTCAAATGTCCAAAGAAGTAAATGCAAGAATTGAACCGTTTCTGAAGACAACACCACGCCCTCTGAGATTGCCAGAGGGGCCTCCTTGCTCTCAACGGTCGAAATTTCTGCTAATGGATGCTCTGAAGCTTAGTATTGAAGACCCGAGTCATGAAGGTGAGGGGATACCGCTGTATGATGCGATCAAATGCATGAAGACCTTTTTTGGCTGGAAAGAGCCTAACATTGTCAAGCCACATGAGAAGGGAATAAACCCCAATTATCTCCTGGCTTGGAAGCAAGTGTTAGCAGAACTTCAGGATATTGAAAACGAGGAGGAGATTCCAAAAACGAAAAACATGAAGAAAACAAGCCAATTGAAGTGGGCACTTGGTGAAAATATGGCACCAGAGAAAGTGGACTTTGAAGACTGCAAGGATGTCAGTGATTTGAGACAGTATGACAGTGACGAGCCTGAACAAAGATCACTAGCAAGTTGGATTCAAAGTGAATTCAACAAAGCTTGTGAATTGACTGACTCAAGTTGGATAGAGCTCGATGAAATAGGAGAGGATGTTGCCCCGATTGAACACATTGCAAGCATGAGAAGGAATTACTTTACCGCTGAAGTGTCTCATTGCAGGGCAACAGAATACATAATGAAGGGAGTATACATAAACACAGCTTTGCTCAATGCTTCTTGTGCGGCAATGGATGATTTTCAACTGATCCCAATGATAAGCAAATGCAGGACCAAAGAAGGGCGGCGGAAGACAAATCTGTATGGGTTCATAATAAAGGGAAGGTCTCATTTGAGGAATGATACTGATGTGGTGAATTTTGTGAGCATGGAGTTTTCTCTTACTGATCCTAGACTAGAACCACATAAATGGGAGAAGTACTGTGTCCTTGAGATAGGGGACATGCTCCTGCGTACTGCAATAGGCCAAGTATCAAGACCCATGTTCTTGTATGTGAGAACTAATGGAACCTCCAAAATCAAAATGAAATGGGGTATGGAGATGAGGCGTTGTCTTCTTCAATCTCTTCAACAAATTGAAAGTATGGTTGAAGCCGAATCCTCTGTCAAAGAGAAGGACATGACCAGAGAATTCTTCGAAAACAAATCAGAGACATGGCCCATTGGGGAATCACCCAAAGGAGTAGAAGAAGGTTCCATTGGGAAGGTGTGCAGGACTCTGCTGGCAAAATCTGTGTTCAACAGCTTGTATGCATCTCCACAACTTGAGGGGTTTTCAGCTGAGTCGAGAAAGCTGCTCCT
CATTGTTCAGGCACTTAGGGACAACCTGGAACCTGGTACCTTCGATCTTGGAGGGCTATATGAAGCAATTGAGGAGTGCCTGATTAATGATCCCTGGGTTTTGCTTAATGCATCTTGGTTCAACTCCTTCCTCACACATGCACTGAAATAG
Binary file modified dependencies/genotype_key.xlsx
Binary file not shown.
Binary file added docs/Genotyping_reference_for_US_H5_2.3.4.4b.pdf
Binary file not shown.
Binary file not shown.
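
The README hunk above states the calling rule in prose: a segment is assigned a type number when it falls within 2% identity of a reference, and a genotype is the constellation of the eight segment numbers. The sketch below illustrates that rule only. It is not GenoFLU's implementation: it assumes the query and reference are already aligned to equal length, reads "within 2%" as at least 98% identity, and uses hypothetical names (`percent_identity`, `call_segment`, `call_genotype`, `genotype_key`) and a conventional segment order that the commit does not confirm.

```python
from typing import Dict, Optional, Tuple

# Conventional influenza A segment names; the exact labels and order used by
# GenoFLU are an assumption here.
SEGMENTS = ("PB2", "PB1", "PA", "HA", "NP", "NA", "MP", "NS")


def percent_identity(query: str, reference: str) -> float:
    """Percent of matching positions between two pre-aligned sequences."""
    if not query or len(query) != len(reference):
        raise ValueError("sequences must be non-empty and aligned to equal length")
    matches = sum(q == r for q, r in zip(query.upper(), reference.upper()))
    return 100.0 * matches / len(query)


def call_segment(
    query: str, references: Dict[str, str], max_divergence: float = 2.0
) -> Optional[str]:
    """Return the best-matching reference type ID, or None if none is within range."""
    best_id: Optional[str] = None
    best_pid = 0.0
    for type_id, ref_seq in references.items():
        pid = percent_identity(query, ref_seq)
        if pid > best_pid:
            best_id, best_pid = type_id, pid
    # Assumption: "within 2% identity" means identity >= 98%.
    if best_id is not None and best_pid >= 100.0 - max_divergence:
        return best_id
    return None


def call_genotype(
    segment_calls: Dict[str, Optional[str]],
    genotype_key: Dict[Tuple[str, ...], str],
) -> Optional[str]:
    """Look up a genotype from the constellation of per-segment type IDs."""
    constellation = tuple(segment_calls.get(seg) for seg in SEGMENTS)
    if None in constellation:
        # At least one segment did not match a defined reference type.
        return None
    return genotype_key.get(constellation)
```

Returning `None` for an unmatched segment mirrors the README's note that the tool reports per-segment information when a sequence does not belong to a defined genotype; how GenoFLU actually labels such cases is not shown in this commit.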
