A manually curated database of tetrapod mitochondrially encoded tRNA sequences and secondary structures

BioMed Central
BMC Bioinformatics
Open Access
Database

Konstantin Yu Popadin1, Leila A Mamirova1 and Fyodor A Kondrashov*2

Address: 1Institute for Information Transmission Problems RAS, Bolshoi Karetny pereulok 19, Moscow 127994, Russia and 2Section on Ecology, Behavior and Evolution, Division of Biological Sciences, University of California at San Diego, 2218 Muir Biology Building, La Jolla, CA 92093, USA

Email: Konstantin Yu Popadin - konstantinpopadin@gmail.com; Leila A Mamirova - leilamamirova@gmail.com; Fyodor A Kondrashov* - fkondrashov@ucsd.edu
* Corresponding author

Published: 14 November 2007
Received: 14 March 2007
Accepted: 14 November 2007

BMC Bioinformatics 2007, 8:441 doi:10.1186/1471-2105-8-441

© 2007 Popadin et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background: Mitochondrial tRNAs have been studied by structural biologists interested in their secondary structure characteristics, by evolutionary biologists researching patterns of compensatory and structural evolution, and in medical studies directed towards understanding the basis of human disease. However, an up-to-date, manually curated database of mitochondrially encoded tRNAs from higher animals is currently not available.

Description: We obtained the complete mitochondrial sequence for 277 tetrapod species from GenBank and re-annotated all of the tRNAs based on a multiple alignment of each tRNA gene and a secondary structure prediction made independently for each tRNA. The mitochondrial (mt) tRNA sequences and the secondary structure based multiple alignments are freely available as Supplemental Information online.

Conclusion: We compiled a manually curated database of mitochondrially encoded tRNAs from tetrapods with completely sequenced genomes. In the course of our work, we reannotated more than 10% of all tetrapod mt-tRNAs and subsequently predicted the secondary structures of 6060 mitochondrial tRNAs. This carefully constructed database can be utilized to enhance our knowledge in several different fields including the evolution of mt-tRNA secondary structure and prediction of pathogenic mt-tRNA mutations. In addition, researchers reporting novel mitochondrial genome sequences should check their tRNA gene annotations against our database to ensure a higher level of fidelity of their annotation.

Background

Mitochondrially encoded tRNAs (mt-tRNAs) are an excellent object of study for researchers in several fields for a variety of reasons. The primary reason is the wide variety of available completely sequenced mitochondrial genomes, which provides a large data sample from a broad phylogenetic background. Besides the obvious availability factor, mt-tRNAs show several unusual properties. mt-tRNAs are of particular interest to structural biologists, since the secondary structure of the mt-tRNAs is not as conserved as that of their nuclear encoded counterparts [1], and some mt-tRNAs in several lineages show accelerated rates of secondary structure evolution [2]. Although some changes of the secondary structure may be […] [1-3] may lead to our understanding of the broader issues associated with parallel evolution of secondary structure.

The evolution of secondary structure in mt-tRNAs is also coupled with rapid compensatory evolution that is aimed at the conservation of the secondary structure stability [4]. Indeed, as much as 50% of the substitutions in mt-tRNAs may be compensatory [4], and further study of these molecules may shed light on the selective pressure governing compensatory evolution. Compensatory changes have also been observed on a larger scale: the import of nuclear-encoded Lys-tRNA was shown to compensate for complete loss of mt-Lys-tRNA in metatherians [5]. Also, mt-tRNAs appear to be targets for post-transcriptional RNA modification mechanisms [6], including instances of RNA editing where modification of nucleotides that would otherwise be damaging to function or structure takes place [7].

Although mt-tRNAs span only 10% of the entire mitochondrial genome, they appear to be "hotspots" of disease-causing mutations, such that 50% of all pathogenic mutations that have been described for the mitochondria have been localized to one of the mt-tRNAs [8]. In addition, some mt-tRNAs appear to harbor more disease-causing variants than others [8,9]. While no explanation for these observations has been adequately tested, they underline the medical importance of mt-tRNAs [10,11]. The availability of secondary structure information [12,13] and evolutionary information including compensatory changes [13,14] has enabled progress in the identification and possible treatment of deleterious variants in mt-tRNAs [14,15]. Thus, a compilation of a manually curated database of mt-tRNAs incorporating a multiple alignment of genes from many closely related species and an independent secondary structure prediction would serve to advance structural, evolutionary and medically relevant studies of mt-tRNAs and aid in the annotation of mt-tRNAs in newly sequenced mitochondrial genomes.

Construction and content

We obtained complete tetrapod mitochondrial genomes from GenBank [16] using the Entrez search system [17] with "tetrapoda AND complete AND genome" as the key entry and setting the Limits option of the Entrez search to mitochondrial sequences. A total of 277 different tetrapod genomes were obtained, including 148 mammalian, 53 amphibian and 76 sauropsid genomes (all GenBank files used are available at the database website). We then obtained the sequence of all annotated tRNAs and their flanking regions using a Perl script. All of the mt-tRNA sequences were aligned with the muscle program [18] and then manually corrected, using the previously […] information from Helm et al. [1] and two more extensive datasets based on secondary structure alignments [4,14]. Due to extensive divergence between tetrapods we made no attempt to align sequences in D- and T-loops.

Obvious errors in the sequence and the alignment were corrected manually. Among large-scale errors, annotation of the tRNA on the wrong strand was the most common (71 cases out of 6060 tRNAs), followed by labeling of the tRNA as transporting the wrong amino acid or complete omissions (20 cases out of 6060 tRNAs). However, most errors were more subtle, such as an omission or addition of several nucleotides in the flanks of the gene (690 cases out of 6060 tRNAs), and were corrected only through the analysis of secondary structure information coupled with the availability of a multiple alignment for the entire gene.

Since mt-tRNAs are not as conserved in sequence and structure as nuclear tRNAs [1-3], a compilation of a multiple alignment alone is insufficient for an accurate secondary structure annotation. Thus, we have chosen to add a secondary structure prediction using mfold [19] in addition to that produced by the multiple alignment. We predicted secondary structure using the mfold web server and a Perl script to automate the Web routines. We ran mfold specifying pairing constraints for stem structures predicted from the earlier step of the multiple alignment. We made no attempt to restrict pairing of sequence sections predicted to be loops. If the above constraints (binding of some nucleotide pairs) returned implausible tRNA structures or no structures at all, the alignment and secondary structure were modified and mfold was run again to test the plausibility of the new structure. Finally, we ranked each type of tRNA molecule according to free energy and checked by hand the 25% of tRNAs with the highest free energy. Since we did not constrain loop sizes during these iterations, some loop sizes decreased and stem sizes increased, leading to increased stability of tRNA molecules.

Empirically we observed the following constraints of tRNA secondary structure folding as performed by mfold: 1) minimum loop size was never less than 3 nucleotides and 2) WC and GU pairs were not formed if both of the neighbor nucleotides did not participate in pairing. A few mt-tRNA mfold secondary structure predictions that did not conform to the expected cloverleaf were manually rechecked and the alignment altered until the mfold prediction yielded a cloverleaf-like structure. In addition, some mt-tRNAs, including the mt-tRNASerAGY in all species [1] and several reptilian and nine-banded armadillo mt-tRNAsCys [1-3], showed secondary structures that differed from the expected cloverleaf structure due to the loss of […], which is a structural evolutionary change particularly common in mt-tRNAs [2].

Utility and Discussion

Most tetrapod mitochondrial genomes code for 22 different tRNAs, with the exception of Metatherians, which have lost the mt-tRNALys [5]. In addition, some tetrapod mitochondrial genomes that were labeled as complete were only partially finished, such that seven mammalian genomes did not have sequences for tRNAPhe (Dromiciops gliroides, Metachirus nudicaudatus, Macrotis lagotis, Notoryctes typhlops, Perameles gunnii, Pseudocheirus peregrinus and Thylamys elegans) and five mammals did not have the sequence for tRNAPro (Arctocephalus forsteri, Dromiciops gliroides, Macrotis lagotis, Perameles gunnii and Thylamys elegans). Thus, our database contains complete manually curated sequence and secondary structure information for 6060 mitochondrially encoded tRNA molecules.

Our database is available in 22 text files, one for each tRNA, with sequences of the 277 different species presented in the same order in each file. The order of the species in the alignment is the same for each mt-tRNA gene and roughly recapitulates the tetrapod phylogeny. Each file in the database includes the species common and scientific names, basic phylogenetic information and a multiple alignment of the tRNA with unaligned flanking sequence and annotated secondary structure (Figures 1 and 2). The "|" characters in the alignment delineate the conserved secondary structure prediction that was made using the alignment of all tRNA genes. Capital letters in the files mark nucleotides that are paired according to the secondary structure prediction made with mfold; lower-case letters mark the remaining nucleotides. The two methods of secondary structure prediction generally showed similar results but small differences were common. For example, according to the mfold prediction many species in the tRNAAsn gene form 3 WC pairs in the D-stem, while the classical tRNA structure supported by the alignment predicts 4 interacting nucleotides in this stem (Figure 1b). The value of showing separate predictions made by the alignment and the secondary structure is more evident in complicated cases, such as the case of the anticodon stem in the tRNAAsn of the common iguana. In this case the alignment delineates the overall area where the anticodon stem should be formed, while mfold predicts which nucleotides form WC pairs in the structure (Figure 1b). Our database has a simple tab-delimited format with a set number of species in exactly the same order in each file, making it especially useful for those researchers that wish to use our database in batch by parsing information on the secondary structure from our files.

The first database of mammalian mt-tRNAs, which we used as a kernel in our alignment, reports only mammalian species, does not report any secondary structure that is independent of a multiple alignment and excludes complicated cases, such as the loss of D-stems [1]. Another, more current database that includes nuclear and mitochondrial tRNAs from the entire diversity of life forms has been, unfortunately, derived automatically [20] and is unlikely to be useful to researchers requiring a high level of sequence and structure annotation fidelity. In addition, both of these databases are difficult to use in batch mode, as they do not represent their results in a parsing-friendly format. Thus, our database is likely to be more useful for researchers that require a low level of annotation error or a phylogenetically diverse sample, or that prefer to work with many tRNA genes in simple text files. However, our database is not tailored to the needs of researchers that require a graphical interface for their work.

In the course of re-annotation and the compilation of a secondary-structure based multiple alignment, we have modified the annotation of the mt-tRNA gene location for 13% of all mt-tRNAs presented in our database. Such a high error rate in the annotation of such seemingly simple molecules as mt-tRNAs underscores the importance of the availability of manually annotated databases such as the one reported here. In particular, we suggest that researchers reporting novel mitochondrial genome sequences check their tRNA gene annotations against our database to ensure a higher level of fidelity of their annotation. Manually curated databases have an inherent advantage of a lower error rate than automatically created ones. However, a manual assembly of such an extensive database as the one reported here is a resource-intensive enterprise, and it is unlikely that the current database will be considerably expanded using the same manual approach. Rather, the aim for the further development of this resource is to use the alignments reported here as a basis for further automatic enlargement.

Figure 1. The secondary structure of the human mt-tRNAAsn (a) and the multiple alignment with annotated secondary structure for selected species of mt-tRNAAsn (b, c). The "|" characters separate the loops and stems based on the accepted basic secondary structure of mt-tRNAs from Helm et al. (2000), while capital letters denote those nucleotides that are predicted by mfold to participate in WC or GU pairing in stem structures.

Figure 2. The secondary structure of the nine-banded armadillo mt-tRNACys (a), and the multiple alignment with annotated secondary structure for selected species of mt-tRNACys (b, c). The "|" characters separate the loops and stems based on the accepted basic secondary structure of mt-tRNAs from Helm et al. (2000), while capital letters denote those nucleotides that are predicted by mfold to participate in WC or GU pairing in stem structures. The secondary structure of mt-tRNASerAGY in our database resembles the one of the nine-banded armadillo mt-tRNACys (c).

Conclusion

We report a secondary structure based multiple alignment of 6060 mt-tRNAs from 277 tetrapod species. In the course of our work, we have re-annotated a large fraction of mt-tRNA genes, and manually checked all secondary structure predictions. We expect that our database will facilitate further research of mitochondrially encoded tRNAs from structural, evolutionary and medical perspectives. Currently, mammalian mitochondrial tRNAs are thought to have a high level of similarity to the canonical tRNA secondary structure [1]. However, an analysis of exceptions to the canonical tRNA structures among the vertebrate mt-tRNAs, which is made possible with the database reported here, has not been undertaken. The evolutionary implications of compensations on a molecular level have been investigated previously [4]; however, the study of CPDs in mt-tRNAs has been performed only […], prediction of pathogenic mutations in mt-tRNAs relies heavily on evolutionary conservation [13,14] and the availability of a secondary structure-based alignment of an expanded set of species may contribute to a more accurate prediction of the phenotypic consequences of mt-tRNA mutations.

Availability and requirements

Project name: mt tRNA tetrapod database;
Project home page: ~kondrash/ Database/;
Operating system(s): Platform independent;
Programming language: none
License: no restriction;
Any restrictions to use by non-academics: no restriction.

Authors' contributions

KYuP, LAM and FAK conceived the construction of the database, and participated in the construction of the initial and final alignments, corrected erroneously annotated tRNAs and were involved in secondary structure prediction. FAK drafted the paper, and all authors read and approved the final manuscript.

Acknowledgements

KYuP and LAM were supported by the Molecular and Cellular Biology Program of the Russian Academy of Science. KYuP was supported by the Russian Fund of Basic Research (grant 04-04-49623). LAM was partially supported by grants from the Howard Hughes Medical Institute (55005610), INTAS (05-1000008-8028). FAK is a National Science Foundation Graduate Research Fellow.

References

1. Helm M, Brule H, Friede D, Giege R, Putz D, Florentz C: Search for characteristic structural features of mammalian mitochondrial tRNAs. RNA 2000, 6:1356-1379.
2. Macey JR, Larson A, Ananjeva NB, Papenfuss TJ: Replication slippage may cause parallel evolution in the secondary structures of mitochondrial transfer RNAs. Mol Biol Evol 1997, 14:30-39.
3. Kondrashov FA: Convergent evolution of secondary structure of mitochondrial cysteine tRNA in the nine-banded armadillo Dasypus novemcinctus. Biofizika 2005, 50:396-403.
4. Kern AD, Kondrashov FA: Mechanisms and convergence of compensatory evolution in mammalian mitochondrial tRNAs. Nat Genet 2004, 36:1207-1212.
5. Janke A, Feldmaier-Fuchs G, Thomas WK, von Haeseler A, Paabo S: The marsupial mitochondrial genome and the evolution of placental mammals. Genetics 1994, 137:243-256.
6. Borner GV, Yokobori S, Morl M, Dorner M, Paabo S: RNA editing in metazoan mitochondria: staying fit without sex. FEBS Lett 1997, 409:320-324.
7. Helm M, Brule H, Degoul F, Cepanec C, Leroux JP, Giege R, Florentz C: The presence of modified nucleotides is required for cloverleaf folding of a human mitochondrial tRNA. Nucleic Acids Res 1998, 26:1636-1643.
8. Brandon MC, Lott MT, Nguyen KC, Spolim S, Navathe SB, Baldi P, Wallace DC: MITOMAP: a human mitochondrial genome database – 2004 update. Nucleic Acids Res 2005, 33:D611-D613.
9. Wittenhagen LM, Kelley SO: Impact of disease-related mitochondrial mutations on tRNA structure and function. Trends Biochem Sci 2003, 28:605-611.
10. Mahata B, Mukherjee S, Mishra S, Bandyopadhyay A, Adhya S: Functional delivery of a cytosolic tRNA into mutant mitochondria of human cells. Science 2006, 314:471-474.
11. Moreno-Loshuertos R, Acin-Perez R, Fernandez-Silva P, Movilla N, Perez-Martos A, Rodriguez de Cordoba S, Gallardo ME, Enriquez JA: Differences in reactive oxygen species production explain the phenotypes associated with common mouse mitochondrial DNA variants. Nat Genet 2006, 38:1261-1268.
12. Florentz C, Sissler M: Disease-related versus polymorphic mutations in human mitochondrial tRNAs. Where is the difference? EMBO Rep 2001, 2:481-486.
13. McFarland R, Elson JL, Taylor RW, Howell N, Turnbull DM: Assigning pathogenicity to mitochondrial tRNA mutations: when "definitely maybe" is not good enough. Trends Genet 2004, 20:591-596.
14. Kondrashov FA: Prediction of pathogenic mutations in mitochondrially encoded human tRNAs. Hum Mol Genet 2005, 14:2415-2419.
15. Smith PM, Ross GF, Taylor RW, Turnbull DM, Lightowlers RN: Strategies for treating disorders of the mitochondrial genome. Biochim Biophys Acta 2004, 1659:232-239.
16. Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL: GenBank. Nucleic Acids Res 2006, 34:D16-D20.
17. Wheeler DL, Barrett T, Benson DA, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Edgar R, Federhen S, Geer LY, Kapustin Y, Khovayko O, Landsman D, Lipman DJ, Madden TL, Maglott DR, Ostell J, Miller V, Pruitt KD, Schuler GD, Sequeira E, Sherry ST, Sirotkin K, Souvorov A, Starchenko G, Tatusov RL, Tatusova TA, Wagner L, Yaschenko E: Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2007, 35:D5-D12.
18. Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 2004, 32:1792-1797.
19. Zuker M: Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 2003, 31:3406-15.
20. Sprinzl M, Vassilenko KS: Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res 2005, 33:D139-D140.
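The annotation convention described under Utility and Discussion ("|" characters delimiting the alignment-based regions, capital letters marking nucleotides paired according to mfold) lends itself to batch parsing. The following is a minimal sketch in Python, not the authors' code (their scripts were written in Perl). It takes a single annotated alignment string, splits it at the "|" characters, and counts paired and unpaired nucleotides in each segment. The function names, the gap characters "-" and ".", and the toy string in the usage lines are assumptions made for illustration; the column layout of the actual tab-delimited files is not described here, so extracting the alignment field from a row is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One region of an annotated alignment string, delimited by '|'."""
    text: str      # the segment exactly as written, gaps included
    paired: int    # capital letters: paired according to the mfold prediction
    unpaired: int  # lower-case letters: not paired according to mfold


def parse_annotated_alignment(field: str, gap_chars: str = "-.") -> List[Segment]:
    """Split an annotated alignment string at '|' and summarise each segment.

    The gap characters are an assumption; adjust them to match the files.
    """
    segments = []
    for text in field.split("|"):
        # Keep only nucleotide letters; skip gaps and whitespace.
        residues = [c for c in text if c.isalpha() and c not in gap_chars]
        paired = sum(1 for c in residues if c.isupper())
        unpaired = sum(1 for c in residues if c.islower())
        segments.append(Segment(text, paired, unpaired))
    return segments


def ungapped_sequence(field: str) -> str:
    """Return the plain nucleotide sequence, dropping '|', gaps and case."""
    return "".join(c.upper() for c in field if c.isalpha())


if __name__ == "__main__":
    # Toy string invented for illustration; it is not taken from the database.
    toy = "GCat|--aa|ATgc"
    for seg in parse_annotated_alignment(toy):
        print(seg.text, seg.paired, seg.unpaired)
    print(ungapped_sequence(toy))
```

Comparing the per-segment paired counts with the stem lengths expected from the alignment is one way to flag cases where the two predictions disagree, such as the 3 versus 4 pair D-stem of tRNAAsn mentioned above.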