A fast algorithm to "de novo" genome wide tandem repeats discovery.
Author(s): NARCISO, M.; YAMAGISHI, M.
Summary: Tandem Repeats (TRs) are sequences in which the same pattern repeats consecutively. They have been used as genomic markers (microsatellites and minisatellites) since the beginning of the genomic era. Recently, new studies have associated TRs with important regulatory processes, which has substantially increased interest in them. The exponential reduction in sequencing costs brought by new technologies has resulted in a proliferation of genome projects, particularly for novel model organisms. Very often, the first sequence analysis is the identification of genetic markers such as SNPs and TRs. As the former are a by-product of the assembly phase, the real challenge resides in the latter, since TR identification must be done de novo. This scenario requires faster and more efficient algorithms for de novo TR discovery. In this paper, we propose a new strategy to address this problem. Our algorithm is able to deal with large genomes in reduced computational time (on average 30% to 50% faster than other approaches). Furthermore, our algorithm finds all TRs in a genome, while some popular algorithms do not, as will be shown. Consequently, because our algorithm is faster and finds all TRs, it may be used on new genomes, and on previously analyzed genomes as well, to discover TRs that may have been missed.
Publication year: 2011
Types of publication: Abstract in annals or event proceedings
Unit: Embrapa Rice & Beans
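The Summary defines a tandem repeat as a pattern that repeats consecutively. As an illustration of that definition only, here is a minimal brute-force sketch in Python. It is not the authors' algorithm, which the abstract does not describe; the function name `find_tandem_repeats` and the parameters `max_unit` and `min_copies` are hypothetical, and the sketch handles exact repeats only.

```python
def find_tandem_repeats(seq, max_unit=6, min_copies=2):
    """Naive scan for exact tandem repeats (illustrative, not the paper's method).

    Returns a list of (start, unit, copies) tuples, scanning left to right
    and skipping past each repeat once it has been reported.
    """
    results = []
    n = len(seq)
    i = 0
    while i < n:
        best = None
        for k in range(1, max_unit + 1):
            unit = seq[i:i + k]
            if len(unit) < k:
                break  # not enough sequence left for a unit of this length
            # Count how many consecutive copies of the unit start at i.
            copies = 1
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            if copies >= min_copies:
                # Prefer the longest span; ties keep the shorter unit.
                if best is None or copies * k > best[2] * len(best[1]):
                    best = (i, unit, copies)
        if best is not None:
            results.append(best)
            i += best[2] * len(best[1])  # jump past the reported repeat
        else:
            i += 1
    return results


if __name__ == "__main__":
    print(find_tandem_repeats("GGCACACACATT", min_copies=3))
```

A scan of this kind tries every unit length at every position, so it is far too slow for whole genomes; it only makes concrete what a de novo TR search is looking for.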