TY - JOUR
T1 - Large-scale analysis of full-length cDNAs from the tomato (Solanum lycopersicum) cultivar Micro-Tom, a reference system for the Solanaceae genomics
AU - Aoki, Koh
AU - Yano, Kentaro
AU - Suzuki, Ayako
AU - Kawamura, Shingo
AU - Sakurai, Nozomu
AU - Suda, Kunihiro
AU - Kurabayashi, Atsushi
AU - Suzuki, Tatsuya
AU - Tsugane, Taneaki
AU - Watanabe, Manabu
AU - Ooga, Kazuhide
AU - Torii, Maiko
AU - Narita, Takanori
AU - Shin-i, Tadasu
AU - Kohara, Yuji
AU - Yamamoto, Naoki
AU - Takahashi, Hideki
AU - Watanabe, Yuichiro
AU - Egusa, Mayumi
AU - Kodama, Motoichiro
AU - Ichinose, Yuki
AU - Kikuchi, Mari
AU - Fukushima, Sumire
AU - Okabe, Akiko
AU - Arie, Tsutomu
AU - Sato, Yuko
AU - Yazawa, Katsumi
AU - Satoh, Shinobu
AU - Omura, Toshikazu
AU - Ezura, Hiroshi
AU - Shibata, Daisuke
N1 - Funding Information:
We are grateful for and acknowledge use of the draft tomato genome sequence, which was generated by the International Tomato Genome Sequencing Consortium http://solgenomics.net/tomato/. We are also grateful for and acknowledge use of the tomato SBM dataset, which was generated by Kazusa DNA Research Institute. We thank Hideki Hirakawa and Shinobu Nakayama (Kazusa DNA Res. Inst.) for assistance in GeneMark.hmm analysis. We thank Shusei Sato (Kazusa DNA Res. Inst.) for critical reading of the manuscript, and Kenta Shirasawa (Kazusa DNA Res. Inst.) for helpful discussion. We also thank Tsugumi Isozaki, Miyuki Inde, and Tsurue Aoyama (Kazusa DNA Res. Inst.) for plant care and lab assistance. We thank Hiroshi Otani (Tottori Univ.), Wataru Hasama (Oita Pref. Agric. Res. Center) and Hiroshi Shiomi (Takii and Co. Ltd.) for providing the Corynespora cassiicola isolate. This work was supported by National Bioresource Project, Genome program, “Enhancing tomato resources by sequencing Micro-Tom full-length cDNA” (2008, MEXT, Japan) to KA, by the Japan Solanaceae Consortium (JSOL), a grant from Meiji Univ. to KY, and a grant from the Kazusa DNA Res. Inst. to DS and KA.
PY - 2010/3/30
Y1 - 2010/3/30
N2 - Background: The Solanaceae family includes several economically important vegetable crops. The tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. Recently, a number of tomato resources have been developed in parallel with the ongoing tomato genome sequencing project. In particular, a miniature cultivar, Micro-Tom, is regarded as a model system in tomato genomics, and a number of genomics resources in the Micro-Tom background, such as ESTs and mutagenized lines, have been established by an international alliance. Results: To accelerate progress in tomato genomics, we developed a collection of 13,227 fully sequenced Micro-Tom full-length cDNAs. By checking for redundant sequences, coding sequences, and chimeric sequences, a set of 11,502 non-redundant full-length cDNAs (nrFLcDNAs) was generated. Analysis of untranslated regions demonstrated that tomato has longer 5'- and 3'-untranslated regions than most other plants except rice. Classification of the functions of proteins predicted from the coding sequences demonstrated that the nrFLcDNAs covered a broad range of functions. A comparison of the nrFLcDNAs with genes of sixteen plants facilitated the identification of tomato genes that are not found in other plants, most of which did not have known protein domains. Mapping of the nrFLcDNAs onto currently available tomato genome sequences facilitated prediction of exon-intron structure. Introns of tomato genes were longer than those of Arabidopsis and rice. According to a comparison of exon sequences between the nrFLcDNAs and the tomato genome sequences, the frequency of nucleotide mismatch in exons between Micro-Tom and the genome-sequencing cultivar (Heinz 1706) was estimated to be 0.061%. Conclusion: The collection of Micro-Tom nrFLcDNAs generated in this study will serve as a valuable genomic tool for plant biologists to bridge the gap between basic and applied studies. The nrFLcDNA sequences will help annotate the tomato whole-genome sequence and aid in tomato functional genomics and molecular breeding. Full-length cDNA sequences and their annotations are provided in the database KaFTom http://www.pgb.kazusa.or.jp/kaftom/ via the website of the National Bioresource Project Tomato http://tomato.nbrp.jp.
UR - http://www.scopus.com/inward/record.url?scp=77950400977&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77950400977&partnerID=8YFLogxK
U2 - 10.1186/1471-2164-11-210
DO - 10.1186/1471-2164-11-210
M3 - Article
C2 - 20350329
AN - SCOPUS:77950400977
SN - 1471-2164
VL - 11
JO - BMC Genomics
JF - BMC Genomics
IS - 1
M1 - 210
ER -