These three tiers are not proposed to replace the primary data archives, such as the INSDC (for nucleotide sequence) and GEO [23] and ArrayExpress [24] (for expression data), but rather to exist in parallel with them, providing biological context to the archived data, which remain a record of experiments that have been carried out. In contrast, the stream of information passing through the tiers represents the scientific community's best current understanding of these species. Biological specialization decreases from Tier 1 to Tier 3, whereas sophistication in engineering and computation increases. This structure provides for a diversity of datasets and approaches (in particular in Tier 1 and to some extent in Tier 2) while ensuring consistency and the preservation of high-value datasets within Tier 3. Importantly, it captures the enthusiasm and expertise of the specialized scientific groups around Tier 2 databases to keep information on specific genomes up to date, and provides a direct route for this information into the Tier 3 databases that are used by the wider scientific community. As in all scientific endeavors, openness and discussion between all participants need to be encouraged, but this structure places particular emphasis on communication between adjacent tiers.
For this structure to work, the different components need to be funded efficiently, minimizing unproductive overlap and maximizing the overall utility of the information. Because inter-tier communication is critical to this, we believe that funding schemes that deliberately span two tiers (that is, Tier 1 to Tier 2 or Tier 2 to Tier 3) are optimal. Such funding schemes guarantee the lines of communication and promote the transfer of information into the higher, longer-lived tiers.
There are well-developed funding streams from a variety of agencies for Tier 1 groups, primarily 'responsive-mode' schemes that encourage the submission of proposals within a broad area of scientific research. It is important to realize that Tier 1 groups require an increasing intensity of bioinformatics to perform the primary analysis of their own data, and that the presence of the other tiers, and the investment in informatics in those tiers, does not fundamentally change the need for bioinformatics at this level. In addition, funding agencies should support grants that deliberately couple the transfer of information to Tier 2, in some cases through joint funding episodes with the appropriate Tier 2 group. This sort of 'spanning' funding is particularly appropriate when the generation of a specific dataset is the major focus of a grant: for example, a program to expand a specific phylogenetic domain in terms of genomes sequenced, or to generate population genomics resources for a particular species.
There are a variety of existing funding mechanisms for Tier 2 resources, such as the Biological and Bioinformatics Resources (BBR) fund of the Biotechnology and Biological Sciences Research Council (BBSRC) in the United Kingdom and, in the United States, the model organism database funds of the National Human Genome Research Institute (NHGRI) and the Bioinformatics Resource Centers (BRCs) of the National Institute of Allergy and Infectious Diseases (NIAID). The focus of a Tier 2 resource is ideally a specific area of biology, and the resource is ideally led by scientists practicing in that area. However, it is best sited in, or allied to, an institution with an existing commitment to suitable infrastructure. This tier is currently the least well defined, and there are areas of biology with no obvious Tier 2 'aggregator' capable of providing a good feed of information into Tier 3. As with the Tier 1/Tier 2 interface, we see funding that spans Tier 2 and Tier 3 as a successful way to ensure the transfer of information up into the next tier. Such 'spanning' funds already exist in a number of areas (for example, the grants supporting VectorBase [20] and PomBase [25], both Tier 2 resources, each of which defines a relationship with a Tier 3 resource).
Schemes such as the BRCs and BBRs are welcome because they offer the possibility of continuity of funding, and partnership with Tier 3 resources provides the possibility of data persistence even beyond funding episodes. Indeed, the BBSRC is now addressing the needs of plant pathogens within this framework. The model-organism funding stream from NHGRI is also clearly targeted at this area. There are also initiatives under way to coordinate global funding for important Tier 2 resources, such as recent workshops held in the United Kingdom and the United States to develop a framework to secure funding for the ongoing needs of the Arabidopsis community. However, given the large number of species expected to have sequenced genomes over the next decade, we believe that Tier 2 is the tier least well understood by funding agencies and research communities, and the one that funding agencies most need to clarify and develop.
A Tier 3 resource is fundamentally an information infrastructure, and must be provided by institutions with a core commitment to infrastructure provision. For much biomolecular data, two obvious centers are the NCBI and EBI, although it is vital that these develop clear interfaces, not just with Tier 2 resources, but also with other infrastructure providers in adjacent domains (such as medical informatics, crop informatics and bioengineering). This area of funding is becoming better defined, with increasingly sophisticated links between institutes of the National Institutes of Health (NIH) and NCBI in the United States; the ELIXIR process led by the EBI to coordinate bioinformatics infrastructure funding in Europe; and increasing collaboration between EBI and NCBI on a number of Tier 2 and Tier 3 projects (for example, the Consensus Coding Sequence (CCDS) project in human and mouse to establish a universal set of reference transcripts for these species). Set against this is the fact that some 'aggregator' resources, such as the UCSC Genome Browser, are so widely used that, despite their different institutional contexts, they are likely to be very long lasting and thus have characteristics of Tier 3 resources. Despite this progress, however, it is still unclear how these new funding streams will mature as the volume and diversity of the underlying data continue to grow. This discussion needs to be considered in the context of the broader infrastructure challenges in bioinformatics and medical informatics.
To sum up, the structure proposed here is in many ways a formalization of current best practice, particularly in the model organism databases. However, by expanding and codifying the structure, and emphasizing the importance of information transfer between the tiers, it should go some way towards closing the loop between the public archival databases and the scientific literature, and ensuring that the latest functional information is propagated to relevant genome databases, where it can form an effective foundation for subsequent research from high-throughput analysis to individual hypothesis-based approaches.
We are grateful to Pat Goodwin and the Wellcome Trust for their encouragement, and for supporting a workshop in November 2008 in which aspects of this model were discussed.
Drysdale R: FlyBase: a database for the Drosophila research community.
Methods Mol Biol 2008, 420:45-59.
Harris TW, Antoshechkin I, Bieri T, Blasiar D, Chan J, Chen WJ, De La Cruz N, Davis P, Duesbury M, Fang R, Fernandes J, Han M, Kishore R, Lee R, Müller H, Nakamura C, Ozersky P, Petcherski A, Rangarajan A, Rogers A, Schindelman G, Schwarz EM, Tuli MA, Van Auken K, Wang D, Wang X, Williams G, Yook K, Durbin R, Stein LD, et al.: WormBase: a comprehensive resource for nematode research.
Nucleic Acids Res 2010, 38:D463-D467.
Engel SR, Balakrishnan R, Binkley G, Christie KR, Costanzo MC, Dwight SS, Fisk DG, Hirschman JE, Hitz BC, Hong EL, Krieger CJ, Livstone MS, Miyasato SR, Nash R, Oughtred R, Park J, Skrzypek MS, Weng S, Wong ED, Dolinski K, Botstein D, Cherry JM: Saccharomyces Genome Database provides mutant phenotype data.
Nucleic Acids Res 2010, 38:D433-D436.
Lee E, Harris N, Gibson M, Chetty R, Lewis S: Apollo: a community resource for genome annotation editing.
Bioinformatics 2009, 25:1836-1837.
Zhou P, Emmert D, Zhang P: Using Chado to store genome annotation data.
Curr Protoc Bioinformatics 2006, Chapter 9:Unit 9.6.
Carver T, Berriman M, Tivey A, Patel C, Böhme U, Barrell BG, Parkhill J, Rajandream M: Artemis and ACT: viewing, annotating and comparing sequences stored in a relational database.
Bioinformatics 2008, 24:2672-2676.
Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW: GenBank.
Nucleic Acids Res 2010, 38:D46-D51.
Leinonen R, Akhtar R, Birney E, Bonfield J, Bower L, Corbett M, Cheng Y, Demiralp F, Faruque N, Goodgame N, Gibson R, Hoad G, Hunter C, Jang M, Leonard S, Lin Q, Lopez R, Maguire M, McWilliam H, Plaister S, Radhakrishnan R, Sobhany S, Slater G, Ten Hoopen P, Valentin F, Vaughan R, Zalunin V, Zerbino D, Cochrane G: Improvements to services at the European Nucleotide Archive.
Nucleic Acids Res 2010, 38:D39-D45.
Kaminuma E, Mashima J, Kodama Y, Gojobori T, Ogasawara O, Okubo K, Takagi T, Nakamura Y: DDBJ launches a new archive database with analytical tools for next-generation sequence data.
Nucleic Acids Res 2010, 38:D33-D38.
Flicek P, Aken BL, Ballester B, Beal K, Bragin E, Brent S, Chen Y, Clapham P, Coates G, Fairley S, Fitzgerald S, Fernandez-Banet J, Gordon L, Graf S, Haider S, Hammond M, Howe K, Jenkinson A, Johnson N, Kahari A, Keefe D, Keenan S, Kinsella R, Kokocinski F, Koscielny G, Kulesha E, Lawson D, Longden I, Massingham T, McLaren W, et al.: Ensembl's 10th year.
Nucleic Acids Res 2010, 38:D557-D562.
Pruitt KD, Tatusova T, Klimke W, Maglott DR: NCBI Reference Sequences: current status, policy and new initiatives.
Nucleic Acids Res 2009, 37:D32-D36.
Rhead B, Karolchik D, Kuhn RM, Hinrichs AS, Zweig AS, Fujita PA, Diekhans M, Smith KE, Rosenbloom KR, Raney BJ, Pohl A, Pheasant M, Meyer LR, Learned K, Hsu F, Hillman-Jackson J, Harte RA, Giardine B, Dreszer TR, Clawson H, Barber GP, Haussler D, Kent WJ: The UCSC Genome Browser database: update 2010.
Nucleic Acids Res 2010, 38:D613-D619.
Sayers EW, Barrett T, Benson DA, Bolton E, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Federhen S, Feolo M, Geer LY, Helmberg W, Kapustin Y, Landsman D, Lipman DJ, Lu Z, Madden TL, Madej T, Maglott DR, Marchler-Bauer A, Miller V, Mizrachi I, Ostell J, Panchenko A, Pruitt KD, Schuler GD, Sequeira E, Sherry ST, Shumway M, et al.: Database resources of the National Center for Biotechnology Information.
Nucleic Acids Res 2010, 38:D5-D16.
Brooksbank C, Cameron G, Thornton J: The European Bioinformatics Institute's data resources.
Nucleic Acids Res 2010, 38:D17-D25.
Poole RL: The TAIR database.
Methods Mol Biol 2007, 406:179-212.
Giles PF, Soanes DM, Talbot NJ: A relational database for the discovery of genes encoding amino acid biosynthetic enzymes in pathogenic fungi.
Comp Funct Genomics 2003, 4:4-15.
Stein LD, Mungall C, Shu S, Caudy M, Mangone M, Day A, Nickerson E, Stajich JE, Harris TW, Arva A, Lewis S: The generic genome browser: a building block for a model organism system database.
Genome Res 2002, 12:1599-1610.
Aslett M, Mooney P, Adlem E, Berriman M, Berry A, Hertz-Fowler C, Ivens AC, Kerhornou A, Parkhill J, Peacock CS, Wood V, Rajandream M, Barrell B, Tivey A: Integration of tools and resources for display and analysis of genomic data for protozoan parasites.
Int J Parasitol 2005, 35:481-493.
Aurrecoechea C, Brestelli J, Brunk BP, Fischer S, Gajria B, Gao X, Gingle A, Grant G, Harb OS, Heiges M, Innamorato F, Iodice J, Kissinger JC, Kraemer ET, Li W, Miller JA, Nayak V, Pennington C, Pinney DF, Roos DS, Ross C, Srinivasamoorthy G, Stoeckert CJ, Thibodeau R, Treatman C, Wang H: EuPathDB: a portal to eukaryotic pathogen databases.
Nucleic Acids Res 2010, 38:D415-D419.
Lawson D, Arensburger P, Atkinson P, Besansky NJ, Bruggner RV, Butler R, Campbell KS, Christophides GK, Christley S, Dialynas E, Hammond M, Hill CA, Konopinski N, Lobo NF, MacCallum RM, Madey G, Megy K, Meyer J, Redmond S, Severson DW, Stinson EO, Topalis P, Birney E, Gelbart WM, Kafatos FC, Louis C, Collins FH: VectorBase: a data resource for invertebrate vector genomics.
Nucleic Acids Res 2009, 37:D583-D587.
Snyder EE, Kampanya N, Lu J, Nordberg EK, Karur HR, Shukla M, Soneja J, Tian Y, Xue T, Yoo H, Zhang F, Dharmanolla C, Dongre NV, Gillespie JJ, Hamelius J, Hance M, Huntington KI, Jukneliene D, Koziski J, Mackasmiel L, Mane SP, Nguyen V, Purkayastha A, Shallom J, Yu G, Guo Y, Gabbard J, Hix D, Azad AF, Baker SC, et al.: PATRIC: the VBI PathoSystems Resource Integration Center.
Nucleic Acids Res 2007, 35:D401-D406.
Giardine B, Riemer C, Hardison RC, Burhans R, Elnitski L, Shah P, Zhang Y, Blankenberg D, Albert I, Taylor J, Miller W, Kent WJ, Nekrutenko A: Galaxy: a platform for interactive large-scale genome analysis.
Genome Res 2005, 15:1451-1455.
Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Muertter RN, Edgar R: NCBI GEO: archive for high-throughput functional genomic data.
Nucleic Acids Res 2009, 37:D885-D890.
Parkinson H, Kapushesky M, Kolesnikov N, Rustici G, Shojatalab M, Abeygunawardena N, Berube H, Dylag M, Emam I, Farne A, Holloway E, Lukk M, Malone J, Mani R, Pilicheva E, Rayner TF, Rezwan F, Sharma A, Williams E, Bradley XZ, Adamusiak T, Brandizi M, Burdett T, Coulson R, Krestyaninova M, Kurnosov P, Maguire E, Neogi SG, Rocca-Serra P, Sansone S, et al.: ArrayExpress update - from an archive of functional genomics experiments to the atlas of gene expression.
Nucleic Acids Res 2009, 37:D868-D872.
Wixon J, Wood V: Tools and resources for Sz. pombe: a report from the 2006 European Fission Yeast Meeting.
Yeast 2006, 23:901-903.