Self-sequence alignment is a novel mathematical model for measuring the information content of biological sequences such as DNA, RNA, or proteins. The model is not limited to biological sequences, however: it can be applied to sequences made of any symbols. The information content is calculated by a dedicated function 𝜎(s), which returns a percentage, where 100% represents the highest information content and 0% the lowest. Two approaches are explored: a global approach, in which a single value is obtained for a sequence of any length, and a local approach, in which a series of values is used to construct signals along the sequence. For more details, see the primary source of this model, Algorithms in Bioinformatics: Theory and Implementation. The relationship between the computer implementation and the mathematical model can be observed below:
https://gagniuc.github.io/Self-sequence-alignment/
- Paul A. Gagniuc. Algorithms in Bioinformatics: Theory and Implementation. John Wiley & Sons, Hoboken, NJ, USA, 2021, ISBN: 9781119697961.
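
To make the difference between the global and local approaches concrete, here is a minimal Python sketch. It does not implement the 𝜎(s) function from the book. The names `placeholder_score`, `global_score`, `local_signal`, and the `window` parameter are hypothetical, and the score is a simple stand-in (the percentage of symbol pairs in the sequence that differ) chosen only because it also ranges from 0% to 100%. The point is the shape of the two computations: one value for the whole sequence versus one value per sliding window.

```python
from itertools import combinations


def placeholder_score(s: str) -> float:
    """Hypothetical stand-in for sigma(s), NOT the function from the book.

    Compares the sequence with itself and returns the percentage of
    symbol pairs (i < j) that differ: 0.0 when all symbols are the same,
    100.0 when all symbols are distinct.
    """
    pairs = list(combinations(s, 2))
    if not pairs:
        return 0.0  # fewer than two symbols: nothing to compare
    mismatches = sum(1 for a, b in pairs if a != b)
    return 100.0 * mismatches / len(pairs)


def global_score(s: str) -> float:
    """Global approach: a single value for a sequence of any length."""
    return placeholder_score(s)


def local_signal(s: str, window: int) -> list[float]:
    """Local approach: one value per sliding window along the sequence."""
    if window < 2 or window > len(s):
        raise ValueError("window must be between 2 and len(s)")
    return [
        placeholder_score(s[i:i + window])
        for i in range(len(s) - window + 1)
    ]


if __name__ == "__main__":
    print(global_score("AAAA"))        # 0.0
    print(global_score("ACGT"))        # 100.0
    print(local_signal("AAAACGT", 4))  # one value per window of 4 symbols
```

Swapping `placeholder_score` for the actual 𝜎(s) from the primary source would leave `global_score` and `local_signal` unchanged, since both only assume a function that maps a sequence to a percentage.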