Achraf El Allali
Accepted Abstracts: J Comput Sci Syst Biol
Computational gene-finding algorithms have proven robust at identifying genes in complete genomes. However, Next Generation Sequencing (NGS) has presented new challenges because the data are incomplete and fragmented. During the last few years, attempts have been made to extract complete and incomplete open reading frames (ORFs) directly from short reads and to identify the coding ORFs, bypassing other challenging tasks such as assembly of the metagenome. The result is a new generation of gene finders that have yet to be included in the metagenomic pipeline. Currently, most metagenomic analysis tools rely on Blastx for annotation and phylogenetic profiling. Researchers run Blastx against the NCBI-nr database even though it is at least six times slower, because blasting metagenomic reads against the nucleotide database yields a hit rate of less than 1%. Protein sequences let us look further back in evolutionary time and detect homology, though at a higher computational cost. If we can find the correct reading frame and identify only the sequences that code for proteins, we can reduce that computation at least sixfold. Additional computation time is saved when fragments that do not contain any genes are ignored. Given these immediate benefits, we believe we need to advocate for the use of the new generation of gene finders in the metagenomic pipeline.
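As a rough illustration of the first step described above, the following minimal Python sketch enumerates candidate ORFs in all six reading frames of a single short read. It is not the method of any particular gene finder: the function names, the `min_codons` threshold, and the choice of the standard stop codons (TAA, TAG, TGA) are assumptions made for this example. It treats any stop-free run of codons as a candidate, so ORFs truncated by the ends of the read (incomplete ORFs) are kept. Deciding which candidates are actually coding is the classifier's job and is not shown.

```python
# Hypothetical sketch: six-frame candidate ORF extraction from one read.
# Assumes the standard stop codons; bases other than A/C/G/T pass through as-is.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_in_frame(seq, offset, min_codons):
    """Yield stop-free codon runs of at least min_codons in one frame.

    Runs that touch either end of the read are kept, since a short read
    often contains only part of a gene (an incomplete ORF).
    """
    run = []
    for i in range(offset, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            if len(run) >= min_codons:
                yield "".join(run)
            run = []
        else:
            run.append(codon)
    if len(run) >= min_codons:
        yield "".join(run)


def six_frame_orfs(read, min_codons=20):
    """Return (strand, frame_offset, orf_sequence) for all six frames."""
    read = read.upper()
    results = []
    for strand, seq in (("+", read), ("-", reverse_complement(read))):
        for offset in range(3):
            for orf in orfs_in_frame(seq, offset, min_codons):
                results.append((strand, offset, orf))
    return results


if __name__ == "__main__":
    for strand, offset, orf in six_frame_orfs("ATGAAATAAGGG", min_codons=2):
        print(strand, offset, orf)
```

A translated search such as Blastx has to consider all six frames of every read. If a gene finder keeps only the one frame it predicts to be coding, and discards reads with no predicted coding ORF, only that subset needs to be searched at the protein level, which is where the sixfold saving argued above would come from.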
Achraf El Allali recently obtained his PhD from the University of South Carolina. He ranked fourth nationally in both his Baccalaureate and Bachelor's degrees. His research has focused on gene sequencing and next generation sequencing. He is currently an Assistant Professor at King Saud University.