Plant Biol (Stuttg) 2005; 7(3): 238-250
DOI: 10.1055/s-2005-837578
Research Paper

Georg Thieme Verlag Stuttgart KG · New York

Representation and High-Quality Annotation of the Physcomitrella patens Transcriptome Demonstrates a High Proportion of Proteins Involved in Metabolism in Mosses

D. Lang¹, J. Eisinger², R. Reski¹, S. A. Rensing¹
  • ¹Plant Biotechnology, Faculty of Biology, University of Freiburg, Schänzlestraße 1, 79104 Freiburg, Germany
  • ²Faculty of Applied Science, Chair of Computer Architecture, University of Freiburg, Georges-Koehler-Allee, Building 051, 79110 Freiburg, Germany
Publication History

Received: December 12, 2004

Accepted: January 26, 2005

Publication Date:
15 April 2005 (online)

Abstract

To gain insight into the transcriptome of the widely used plant model system Physcomitrella patens, several EST sequencing projects have been undertaken. We have clustered, assembled, and annotated all publicly available EST and CDS sequences to represent the transcriptome of this non-seed plant. Here, we present our fully annotated knowledge resource for the Physcomitrella patens transcriptome, which integrates annotation from the production process of the clustered sequences with annotation from a high-quality pipeline developed during this study. Each transcript is represented as an entity containing full annotations and GO term associations. The entire production, filtering, clustering, and annotation process is modelled and yields seven datasets that represent the annotated Physcomitrella transcriptome from different perspectives. We were able to annotate 63.4 % of the 26 123 virtual transcripts. The transcript archetype, as covered by our clustered data, is compared to a compilation based on all available Physcomitrella full-length CDS. The distribution of the gene ontology annotations (GOA) for the virtual transcriptome of Physcomitrella patens shows that the ratios of the core molecular functions are consistent among the plant GOA. However, the metabolism subcategory is over-represented in bryophytes compared to seed plants. This observation can be taken as an indicator of the wealth of alternative metabolic pathways in moss relative to spermatophytes. All resources presented in this study have been made available to the scientific community through a suite of user-friendly web interfaces at www.cosmoss.org and form the basis for assembly and annotation of the moss genome, which will be sequenced in 2005.

S. A. Rensing

Plant Biotechnology
Faculty of Biology
University of Freiburg

Schänzlestraße 1

79104 Freiburg

Germany

Email: stefan.rensing@biologie.uni-freiburg.de

Editor: H. Rennenberg
