Christian J. Michel and Ahmed Ahmed
A trinucleotide circular code is a set of trinucleotides that allows the reading frame in genes to be retrieved locally, i.e. anywhere in a gene and in particular without a start codon, and automatically, using a window of a few nucleotides. In 1996, a common circular code X was identified simultaneously in two large populations of eukaryotic and prokaryotic genes. The method proposed here identifies periodic signals of this code X in the two frameshift types (+1 and -1) of both eukaryotic and prokaryotic frameshift genes. As predicted by circular code theory, the modulo 3 signals of the circular code shift in the same direction as the translational frameshift. Finally, in 68% of the frameshift genes in the RECODE 2 database, the frameshift type (+1 or -1) is identified automatically using only this circular code periodic signal. This circular code information constitutes a new structural property of frameshift genes. It may be used directly or in association with existing methods to identify frameshift genes in genomes and their encoded proteins.
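To make the idea of a modulo 3 signal concrete, the following Python sketch counts, for each of the three frames of a sequence, how many trinucleotides belong to a circular code, and takes the frame with the highest count as the likely reading frame. It is only an illustration under stated assumptions, not the statistical method of the paper: the abstract does not list the trinucleotides of X, so the 20-element set below is the one usually reported for X in the circular code literature and should be checked against the original publication; the function names `frame_counts`, `best_frame` and `frameshift_type`, and the idea of comparing the dominant frame on either side of a known site, are hypothetical.

```python
# Assumption: the 20 trinucleotides usually reported for the circular code X.
# The abstract does not list them; verify against the original paper.
X = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)


def frame_counts(seq, code=X):
    """Count trinucleotides of `code` starting at positions 0, 1, 2 modulo 3."""
    seq = seq.upper()
    counts = [0, 0, 0]
    for i in range(len(seq) - 2):
        if seq[i:i + 3] in code:
            counts[i % 3] += 1
    return counts


def best_frame(seq, code=X):
    """Return the frame (0, 1 or 2) with the most trinucleotides of `code`.

    Ties are resolved in favour of the lowest frame index.
    """
    counts = frame_counts(seq, code)
    return max(range(3), key=counts.__getitem__)


def frameshift_type(seq, site, code=X):
    """Hypothetical helper: compare the dominant frame before and after `site`.

    Returns "+1" or "-1" when the dominant frame moves by one position
    forward or backward (modulo 3), and "0" when it does not move.
    """
    upstream = best_frame(seq[:site], code)
    # Convert the frame found in the downstream slice back to absolute positions.
    downstream = (best_frame(seq[site:], code) + site) % 3
    return {0: "0", 1: "+1", 2: "-1"}[(downstream - upstream) % 3]


if __name__ == "__main__":
    gene = "GAAGTTCTGACCATCGGC"  # toy sequence built from six trinucleotides of X
    print(frame_counts(gene))
    print(frameshift_type(gene + "A" + gene, len(gene)))
```

In the toy example, inserting one extra nucleotide between two copies of the same in-frame sequence moves the dominant frame forward by one position, which the sketch reports as a +1 shift. Real genes are far noisier than this, so a practical detector would need windowing and a statistical test rather than a simple maximum.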
Journal of Computer Science & Systems Biology received 2279 citations as per Google Scholar report