Bibliography on: Genomic Standards Consortium

The Electronic Scholarly Publishing Project: Providing world-wide, free access to classic scientific papers and other scholarly materials, since 1993.


ESP: PubMed Auto Bibliography. Created: 25 Sep 2018 at 01:36

Genomic Standards Consortium

The Genomic Standards Consortium (GSC) is an open-membership working body formed in September 2005. The aim of the GSC is making genomic data discoverable. The GSC enables genomic data integration, discovery and comparison through international community-driven standards.

Created with PubMed® Query: ("genomic standards consortium" AND GSC OR RCN4GSC) NOT pmcbook NOT ispreviousversion
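The page does not say how the query above is executed. As a minimal sketch, assuming the public NCBI E-utilities ESearch endpoint is used, the same query string could be turned into a request URL as follows. The function name `build_esearch_url` and the page size of 100 are illustrative assumptions, not part of the original page.

```python
from urllib.parse import urlencode

# Query string copied verbatim from the bibliography header.
QUERY = ('("genomic standards consortium" AND GSC OR RCN4GSC) '
         'NOT pmcbook NOT ispreviousversion')

# NCBI E-utilities ESearch endpoint (assumed here; the page does not name it).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, retstart=0, retmax=100):
    """Build an ESearch URL that returns PubMed IDs matching `term`.

    retstart and retmax select one page of results; both defaults are
    illustrative choices.
    """
    params = {
        "db": "pubmed",        # search the PubMed database
        "term": term,          # the query, URL-encoded by urlencode
        "retstart": retstart,  # zero-based offset of the first hit returned
        "retmax": retmax,      # maximum number of IDs returned
        "retmode": "json",     # ask for a JSON response
    }
    return ESEARCH_URL + "?" + urlencode(params)


url = build_esearch_url(QUERY)
print(url)
```

Fetching the URL (for example with `urllib.request.urlopen`) would return the matching PubMed IDs; the citation records themselves would need a second request, such as EFetch, which is not shown here.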

Citations: The Papers (from PubMed®)

RevDate: 2018-06-19
CmpDate: 2017-08-21

Bowers RM, Kyrpides NC, Stepanauskas R, et al (2017)

Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea.

Nature biotechnology, 35(8):725-731.

We present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the Minimum Information about Any (x) Sequence (MIxS). The standards are the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), including, but not limited to, assembly quality, and estimates of genome completeness and contamination. These standards can be used in combination with other GSC checklists, including the Minimum Information about a Genome Sequence (MIGS), Minimum Information about a Metagenomic Sequence (MIMS), and Minimum Information about a Marker Gene Sequence (MIMARKS). Community-wide adoption of MISAG and MIMAG will facilitate more robust comparative genomic analyses of bacterial and archaeal diversity.

RevDate: 2017-06-15
CmpDate: 2017-06-15

Mukherjee S, Stamatis D, Bertsch J, et al (2017)

Genomes OnLine Database (GOLD) v.6: data updates and feature enhancements.

Nucleic acids research, 45(D1):D446-D456.

The Genomes Online Database (GOLD) (https://gold.jgi.doe.gov) is a manually curated data management system that catalogs sequencing projects with associated metadata from around the world. In the current version of GOLD (v.6), all projects are organized based on a four level classification system in the form of a Study, Organism (for isolates) or Biosample (for environmental samples), Sequencing Project and Analysis Project. Currently, GOLD provides information for 26 117 Studies, 239 100 Organisms, 15 887 Biosamples, 97 212 Sequencing Projects and 78 579 Analysis Projects. These are integrated with over 312 metadata fields from which 58 are controlled vocabularies with 2067 terms. The web interface facilitates submission of a diverse range of Sequencing Projects (such as isolate genome, single-cell genome, metagenome, metatranscriptome) and complex Analysis Projects (such as genome from metagenome, or combined assembly from multiple Sequencing Projects). GOLD provides a seamless interface with the Integrated Microbial Genomes (IMG) system and supports and promotes the Genomic Standards Consortium (GSC) Minimum Information standards. This paper describes the data updates and additional features added during the last two years.

RevDate: 2017-02-20
CmpDate: 2016-09-26

Endrullat C, Glökler J, Franke P, et al (2016)

Standardization and quality management in next-generation sequencing.

Applied & translational genomics, 10:2-9 pii:S2212-0661(16)30023-0.

DNA sequencing continues to evolve quickly even after > 30 years. Many new platforms suddenly appeared and former established systems have vanished in almost the same manner. Since establishment of next-generation sequencing devices, this progress gains momentum due to the continually growing demand for higher throughput, lower costs and better quality of data. In consequence of this rapid development, standardized procedures and data formats as well as comprehensive quality management considerations are still scarce. Here, we listed and summarized current standardization efforts and quality management initiatives from companies, organizations and societies in the form of published studies and ongoing projects. These comprise on the one hand quality documentation issues like technical notes, accreditation checklists and guidelines for validation of sequencing workflows. On the other hand, general standard proposals and quality metrics are developed and applied to the sequencing workflow steps with the main focus on upstream processes. Finally, certain standard developments for downstream pipeline data handling, processing and storage are discussed in brief. These standardization approaches represent a first basis for continuing work in order to prospectively implement next-generation sequencing in important areas such as clinical diagnostics, where reliable results and fast processing are crucial. Additionally, these efforts will exert a decisive influence on traceability and reproducibility of sequence data.

RevDate: 2017-02-20
CmpDate: 2015-06-29

Reddy TB, Thomas AD, Stamatis D, et al (2015)

The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification.

Nucleic acids research, 43(Database issue):D1099-106.

The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Here we report version 5 (v.5) of the database. The newly designed database schema and web user interface support several new features including the implementation of a four level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19,200 studies, 56,000 Biosamples, 56,000 sequencing projects and 39,400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards.

RevDate: 2017-02-20
CmpDate: 2014-09-08

Field D, Sterk P, Kottmann R, et al (2014)

Genomic standards consortium projects.

Standards in genomic sciences, 9(3):599-601 pii:sigs.5559680.

The Genomic Standards Consortium (GSC) is an open-membership community that was founded in 2005 to work towards the development, implementation and harmonization of standards in the field of genomics. Starting with the defined task of establishing a minimal set of descriptions the GSC has evolved into an active standards-setting body that currently has 18 ongoing projects, with additional projects regularly proposed from within and outside the GSC. Here we describe our recently enacted policy for proposing new activities that are intended to be taken on by the GSC, along with the template for proposing such new activities.

RevDate: 2017-02-20
CmpDate: 2013-05-01

Tuama EÓ, Deck J, Dröge G, et al (2012)

Meeting Report: Hackathon-Workshop on Darwin Core and MIxS Standards Alignment (February 2012).

Standards in genomic sciences, 7(1):166-170.

The Global Biodiversity Information Facility and the Genomic Standards Consortium convened a joint workshop at the University of Oxford, 27-29 February 2012, with a small group of experts from Europe, USA, China and Japan, to continue the alignment of the Darwin Core with the MIxS and related genomics standards. Several reference mappings were produced as well as test expressions of MIxS in RDF. The use and management of controlled vocabulary terms was considered in relation to both GBIF and the GSC, and tools for working with terms were reviewed. Extensions for publishing genomic biodiversity data to the GBIF network via a Darwin Core Archive were prototyped and work begun on preparing translations of the Darwin Core to Japanese and Chinese. Five genomic repositories were identified for engagement to begin the process of testing the publishing of genomic data to the GBIF network commencing with the SILVA rRNA database.

RevDate: 2017-02-20
CmpDate: 2013-05-01

Robbins RJ, Amaral-Zettler L, Bik H, et al (2012)

RCN4GSC Workshop Report: Managing Data at the Interface of Biodiversity and (Meta)Genomics, March 2011.

Standards in genomic sciences, 7(1):159-165.

Building on the planning efforts of the RCN4GSC project, a workshop was convened in San Diego to bring together experts from genomics and metagenomics, biodiversity, ecology, and bioinformatics with the charge to identify potential for positive interactions and progress, especially building on successes at establishing data standards by the GSC and by the biodiversity and ecological communities. Until recently, the contribution of microbial life to the biomass and biodiversity of the biosphere was largely overlooked (because it was resistant to systematic study). Now, emerging genomic and metagenomic tools are making investigation possible. Initial research findings suggest that major advances are in the offing. Although different research communities share some overlapping concepts and traditions, they differ significantly in sampling approaches, vocabularies and workflows. Likewise, their definitions of 'fitness for use' for data differ significantly, as this concept stems from the specific research questions of most importance in the different fields. Nevertheless, there is little doubt that there is much to be gained from greater coordination and integration. As a first step toward interoperability of the information systems used by the different communities, participants agreed to conduct a case study on two of the leading data standards from the two formerly disparate fields: (a) GSC's standard checklists for genomics and metagenomics and (b) TDWG's Darwin Core standard, used primarily in taxonomy and systematic biology.

RevDate: 2017-02-20
CmpDate: 2013-05-01

Robbins RJ, Cochrane G, Davies N, et al (2012)

RCN4GSC Workshop Report: Modeling a Testbed for Managing Data at the Interface of Biodiversity and (Meta)Genomics, April 2011.

Standards in genomic sciences, 7(1):153-158.

At the GSC11 meeting (4-6 April 2011, Hinxton, England), the GSC's genomic biodiversity working group (GBWG) developed an initial model for a data management testbed at the interface of biodiversity with genomics and metagenomics. With representatives of the Global Biodiversity Information Facility (GBIF) participating, it was agreed that the most useful course of action would be for GBIF to collaborate with the GSC in its ongoing GBWG workshops to achieve common goals around interoperability/data integration across (meta)-genomic and species level data. It was determined that a quick comparison should be made of the contents of the Darwin Core (DwC) and the GSC data checklists, with a goal of determining their degree of overlap and compatibility. An ad-hoc task group led by Renzo Kottmann and Peter Dawyndt undertook an initial comparison between the Darwin Core (DwC) standard used by the Global Biodiversity Information Facility (GBIF) and the MIxS checklists put forward by the Genomic Standards Consortium (GSC). A term-by-term comparison showed that DwC and GSC concepts complement each other far more than they compete with each other. Because the preliminary analysis done at this meeting was based on expertise with GSC standards, but not with DwC standards, the group recommended that a joint meeting of DwC and GSC experts be convened as soon as possible to continue this joint assessment and to propose additional work going forward.

RevDate: 2017-02-20
CmpDate: 2013-05-01

Robbins RJ, Beach J, Blum S, et al (2012)

RCN4GSC Meeting Report: Initiating a Testbed for Managing Data at the Interface of Biodiversity and Genomics/Metagenomics, May 2011.

Standards in genomic sciences, 7(1):171-174.

Following up on efforts from two earlier workshops, a meeting was convened in San Diego to (a) establish working connections between experts in the use of the Darwin Core and the GSC MIxS standards, (b) conduct mutual briefings to promote knowledge exchange and to increase the understanding of the two communities' approaches, constraints, community goals, subtleties, etc., (c) perform an element-by-element comparison of the two standards, assessing the compatibility and complementarity of the two approaches, (d) propose and consider possible use cases and test beds in which a joint annotation approach might be tried, to useful scientific effect, and (e) propose additional action items necessary to continue the development of this joint effort. Several focused working teams were identified to continue the work after the meeting ended.

RevDate: 2013-02-27
CmpDate: 2012-08-23

Gilbert JA, Catlett C, Desai N, et al (2012)

Conceptualizing a Genomics Software Institute (GSI).

Standards in genomic sciences, 6(1):136-144.

Microbial ecology has been enhanced greatly by the ongoing 'omics revolution, bringing half the world's biomass and most of its biodiversity into analytical view for the first time; indeed, it feels almost like the invention of the microscope and the discovery of the new world at the same time. With major microbial ecology research efforts accumulating prodigious quantities of sequence, protein, and metabolite data, we are now poised to address environmental microbial research at macro scales, and to begin to characterize and understand the dimensions of microbial biodiversity on the planet. What is currently impeding progress is the need for a framework within which the research community can develop, exchange and discuss predictive ecosystem models that describe the biodiversity and functional interactions. Such a framework must encompass data and metadata transparency and interoperation; data and results validation, curation, and search; application programming interfaces for modeling and analysis tools; and human and technical processes and services necessary to ensure broad adoption. Here we discuss the need for focused community interaction to augment and deepen established community efforts, beginning with the Genomic Standards Consortium (GSC), to create a science-driven strategic plan for a Genomic Software Institute (GSI).

RevDate: 2017-02-20
CmpDate: 2012-02-28

Hankeln W, Wendel NJ, Gerken J, et al (2011)

CDinFusion--submission-ready, on-line integration of sequence and contextual data.

PloS one, 6(9):e24797.

State of the art (DNA) sequencing methods applied in "Omics" studies grant insight into the 'blueprints' of organisms from all domains of life. Sequencing is carried out around the globe and the data is submitted to the public repositories of the International Nucleotide Sequence Database Collaboration. However, the context in which these studies are conducted often gets lost, because experimental data, as well as information about the environment are rarely submitted along with the sequence data. If these contextual or metadata are missing, key opportunities of comparison and analysis across studies and habitats are hampered or even impossible. To address this problem, the Genomic Standards Consortium (GSC) promotes checklists and standards to better describe our sequence data collection and to promote the capturing, exchange and integration of sequence data with contextual data. In a recent community effort the GSC has developed a series of recommendations for contextual data that should be submitted along with sequence data. To support the scientific community to significantly enhance the quality and quantity of contextual data in the public sequence data repositories, specialized software tools are needed. In this work we present CDinFusion, a web-based tool to integrate contextual and sequence data in (Multi)FASTA format prior to submission. The tool is open source and available under the Lesser GNU Public License 3. A public installation is hosted and maintained at the Max Planck Institute for Marine Microbiology at http://www.megx.net/cdinfusion. The tool may also be installed locally using the open source code available at http://code.google.com/p/cdinfusion.

RevDate: 2017-02-24
CmpDate: 2011-10-24

Field D, Amaral-Zettler L, Cochrane G, et al (2011)

The Genomic Standards Consortium.

PLoS biology, 9(6):e1001088.

A vast and rich body of information has grown up as a result of the world's enthusiasm for 'omics technologies. Finding ways to describe and make available this information that maximise its usefulness has become a major effort across the 'omics world. At the heart of this effort is the Genomic Standards Consortium (GSC), an open-membership organization that drives community-based standardization activities. Here we provide a short history of the GSC, provide an overview of its range of current activities, and make a call for the scientific community to join forces to improve the quality and quantity of contextual information about our public collections of genomes, metagenomes, and marker gene sequences.

RevDate: 2011-08-01
CmpDate: 2011-07-14

Morrison N, Hancock D, Hirschman L, et al (2011)

Data shopping in an open marketplace: Introducing the Ontogrator web application for marking up data using ontologies and browsing using facets.

Standards in genomic sciences, 4(2):286-292.

In the future, we hope to see an open and thriving data market in which users can find and select data from a wide range of data providers. In such an open access market, data are products that must be packaged accordingly. Increasingly, eCommerce sellers present heterogeneous product lines to buyers using faceted browsing. Using this approach we have developed the Ontogrator platform, which allows for rapid retrieval of data in a way that would be familiar to any online shopper. Using Knowledge Organization Systems (KOS), especially ontologies, Ontogrator uses text mining to mark up data and faceted browsing to help users navigate, query and retrieve data. Ontogrator offers the potential to impact scientific research in two major ways: 1) by significantly improving the retrieval of relevant information; and 2) by significantly reducing the time required to compose standard database queries and assemble information for further research. Here we present a pilot implementation developed in collaboration with the Genomic Standards Consortium (GSC) that includes content from the StrainInfo, GOLD, CAMERA, Silva and Pubmed databases. This implementation demonstrates the power of ontogration and highlights that the usefulness of this approach is fully dependent on both the quality of data and the KOS (ontologies) used. Ideally, the use and further expansion of this collaborative system will help to surface issues associated with the underlying quality of annotation and could lead to a systematic means for accessing integrated data resources.

RevDate: 2017-02-20
CmpDate: 2011-07-14

Duhaime MB, Kottmann R, Field D, et al (2011)

Enriching public descriptions of marine phages using the Genomic Standards Consortium MIGS standard.

Standards in genomic sciences, 4(2):271-285.

In any sequencing project, the possible depth of comparative analysis is determined largely by the amount and quality of the accompanying contextual data. The structure, content, and storage of this contextual data should be standardized to ensure consistent coverage of all sequenced entities and facilitate comparisons. The Genomic Standards Consortium (GSC) has developed the "Minimum Information about Genome/Metagenome Sequences (MIGS/MIMS)" checklist for the description of genomes and here we annotate all 30 publicly available marine bacteriophage sequences to the MIGS standard. These annotations build on existing International Nucleotide Sequence Database Collaboration (INSDC) records, and confirm, as expected, that current submissions lack most MIGS fields. MIGS fields were manually curated from the literature and placed in XML format as specified by the Genomic Contextual Data Markup Language (GCDML). These "machine-readable" reports were then analyzed to highlight patterns describing this collection of genomes. Completed reports are provided in GCDML. This work represents one step towards the annotation of our complete collection of genome sequences and shows the utility of capturing richer metadata along with raw sequences.

RevDate: 2018-04-25
CmpDate: 2011-09-08

Yilmaz P, Kottmann R, Field D, et al (2011)

Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications.

Nature biotechnology, 29(5):415-420.

Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences--the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The 'environmental packages' apply to any genome sequence of known origin and can be used in combination with MIMARKS and other GSC checklists. Finally, to establish a unified standard for describing sequence data and to provide a single point of entry for the scientific community to access and learn about GSC checklists, we present the minimum information about any (x) sequence (MIxS). Adoption of MIxS will enhance our ability to analyze natural genetic diversity documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere.

RevDate: 2011-07-25
CmpDate: 2011-07-14

Gilbert JA, Meyer F, Knight R, et al (2010)

Meeting report: GSC M5 roundtable at the 13th International Society for Microbial Ecology meeting in Seattle, WA, USA August 22-27, 2010.

Standards in genomic sciences, 3(3):235-239.

This report summarizes the proceedings of the Metagenomics, Metadata, Metaanalysis, Models and Metainfrastructure (M5) Roundtable at the 13th International Society for Microbial Ecology Meeting in Seattle, WA, USA August 22-27, 2010. The Genomic Standards Consortium (GSC) hosted this meeting as a community engagement exercise to describe the GSC to the microbial ecology community during this important international meeting. The roundtable included five talks given by members of the GSC, and was followed by audience participation in the form of a roundtable discussion. This report summarizes this event. Further information on the GSC and its range of activities can be found at http://www.gensc.org.

RevDate: 2017-02-20
CmpDate: 2011-07-14

Field D, Sansone S, Delong EF, et al (2010)

Meeting Report: Metagenomics, Metadata and MetaAnalysis (M3) at ISMB 2010.

Standards in genomic sciences, 3(3):232-234.

This report summarizes the proceedings of the first day of the Metagenomics, Metadata and MetaAnalysis (M3) workshop held at the Intelligent Systems for Molecular Biology 2010 conference. The second day, which was dedicated to the inaugural meeting of the BioSharing initiative, is presented in a separate report. The Genomic Standards Consortium (GSC) hosted the first day of this Special Interest Group (SIG) at ISMB to continue exploring the bottlenecks and emerging solutions for obtaining biological insights through large-scale comparative analysis of metagenomic datasets. The M3 SIG included invited and selected talks and a panel discussion at the end of the day involving the plenary speakers. Further information about the GSC and its range of activities can be found at http://gensc.org. Information about the newly established BioSharing effort can be found at http://biosharing.org/.

RevDate: 2017-02-20
CmpDate: 2011-07-14

Glass E, Meyer F, Gilbert JA, et al (2010)

Meeting Report from the Genomic Standards Consortium (GSC) Workshop 10.

Standards in genomic sciences, 3(3):225-231.

This report summarizes the proceedings of the 10th workshop of the Genomic Standards Consortium (GSC), held at Argonne National Laboratory, IL, USA. It was the second GSC workshop to have open registration and attracted over 60 participants who worked together to progress the full range of projects ongoing within the GSC. Overall, the primary focus of the workshop was on advancing the M5 platform for next-generation collaborative computational infrastructures. Other key outcomes included the formation of a GSC working group focused on MIGS/MIMS/MIENS compliance using the ISA software suite and the formal launch of the GSC Developer Working Group. Further information about the GSC and its range of activities can be found at http://gensc.org/.

RevDate: 2017-02-20
CmpDate: 2011-07-14

Davidsen T, Madupu R, Sterk P, et al (2010)

Meeting Report from the Genomic Standards Consortium (GSC) Workshop 9.

Standards in genomic sciences, 3(3):216-224.

This report summarizes the proceedings of the 9th workshop of the Genomic Standards Consortium (GSC), held at the J. Craig Venter Institute, Rockville, MD, USA. It was the first GSC workshop to have open registration and attracted over 90 participants. This workshop featured sessions that provided overviews of the full range of ongoing GSC projects. It included sessions on Standards in Genomic Sciences, the open access journal of the GSC, building standards for genome annotation, the M5 platform for next-generation collaborative computational infrastructures, building ties with the biodiversity research community and two discussion panels with government and industry participants. Progress was made on all fronts, and major outcomes included the completion of the MIENS specification for publication and the formation of the Biodiversity working group.

RevDate: 2017-02-20
CmpDate: 2011-07-14

Hirschman L, Sterk P, Field D, et al (2010)

Meeting Report: "Metagenomics, Metadata and Meta-analysis" (M3) Workshop at the Pacific Symposium on Biocomputing 2010.

Standards in genomic sciences, 2(3):357-360.

This report summarizes the M3 Workshop held at the January 2010 Pacific Symposium on Biocomputing. The workshop, organized by Genomic Standards Consortium members, included five contributed talks, a series of short presentations from stakeholders in the genomics standards community, a poster session, and, in the evening, an open discussion session to review current projects and examine future directions for the GSC and its stakeholders.

RevDate: 2017-02-20
CmpDate: 2011-07-14

Kyrpides N, Field D, Sterk P, et al (2010)

Meeting Report from the Genomic Standards Consortium (GSC) Workshop 8.

Standards in genomic sciences, 3(1):93-96.

This report summarizes the proceedings of the 8th meeting of the Genomic Standards Consortium held at the Department of Energy Joint Genome Institute in Walnut Creek, CA, USA on September 9-11, 2009. This three-day workshop marked the maturing of the Genomic Standards Consortium from an informal gathering of researchers interested in developing standards in the field of genomics and metagenomics to an established community with a defined governance mechanism, its own open access journal, and a family of established standards for describing genomes, metagenomes and marker studies (i.e. ribosomal RNA gene surveys). There will be increased efforts within the GSC to reach out to the wider scientific community via a range of new projects. Further information about the GSC and its activities can be found at http://gensc.org/.

RevDate: 2017-02-20
CmpDate: 2011-04-28

Sun S, Chen J, Li W, et al (2011)

Community cyberinfrastructure for Advanced Microbial Ecology Research and Analysis: the CAMERA resource.

Nucleic acids research, 39(Database issue):D546-51.

The Community Cyberinfrastructure for Advanced Microbial Ecology Research and Analysis (CAMERA, http://camera.calit2.net/) is a database and associated computational infrastructure that provides a single system for depositing, locating, analyzing, visualizing and sharing data about microbial biology through an advanced web-based analysis portal. CAMERA collects and links metadata relevant to environmental metagenome data sets with annotation in a semantically-aware environment allowing users to write expressive semantic queries against the database. To meet the needs of the research community, users are able to query metadata categories such as habitat, sample type, time, location and other environmental physicochemical parameters. CAMERA is compliant with the standards promulgated by the Genomic Standards Consortium (GSC), and sustains a role within the GSC in extending standards for content and format of the metagenomic data and metadata and its submission to the CAMERA repository. To ensure wide, ready access to data and annotation, CAMERA also provides data submission tools to allow researchers to share and forward data to other metagenomics sites and community data archives such as GenBank. It has multiple interfaces for easy submission of large or complex data sets, and supports pre-registration of samples for sequencing. CAMERA integrates a growing list of tools and viewers for querying, analyzing, annotating and comparing metagenome and genome data.

RevDate: 2017-02-20
CmpDate: 2011-07-14

Field D, Friedberg I, Sterk P, et al (2009)

Meeting Report: "Metagenomics, Metadata and Meta-analysis" (M3) Special Interest Group at ISMB 2009.

Standards in genomic sciences, 1(3):278-282.

This report summarizes the proceedings of the "Metagenomics, Metadata and Meta-analysis" (M3) Special Interest Group (SIG) meeting held at the Intelligent Systems for Molecular Biology 2009 conference. The Genomic Standards Consortium (GSC) hosted this meeting to explore the bottlenecks and emerging solutions for obtaining biological insights through large-scale comparative analysis of metagenomic datasets. The M3 SIG included 16 talks, half of which were selected from submitted abstracts, a poster session and a panel discussion involving members of the GSC Board. This report summarizes this one-day SIG, attempts to identify shared themes and recapitulates community recommendations for the future of this field. The GSC will also host an M3 workshop at the Pacific Symposium on Biocomputing (PSB) in January 2010. Further information about the GSC and its range of activities can be found at http://gensc.org/.

RevDate: 2011-07-25
CmpDate: 2011-07-14

Wooley JC, Field D, Glöckner FO (2009)

Extending Standards for Genomics and Metagenomics Data: A Research Coordination Network for the Genomic Standards Consortium (RCN4GSC).

Standards in genomic sciences, 1(1):87-90.

Through a newly established Research Coordination Network for the Genomic Standards Consortium (RCN4GSC), the GSC will continue its leadership in establishing and integrating genomic standards through community-based efforts. These efforts, undertaken in the context of genomic and metagenomic research, aim to ensure the electronic capture of all genomic data and to facilitate the achievement of a community consensus around collecting and managing relevant contextual information connected to the sequence data. The GSC operates as an open, inclusive organization, welcoming inspired biologists with a commitment to community service. Within the collaborative framework of the ongoing, international activities of the GSC, the RCN will expand the range of research domains engaged in these standardization efforts and sustain scientific networking to encourage active participation by the broader community. The RCN4GSC, funded for five years by the US National Science Foundation, will primarily support outcome-focused working meetings and the exchange of early-career scientists between GSC research groups in order to advance key standards contributions such as GCDML. Focusing on the timely delivery of the extant GSC core projects, the RCN will also extend the pioneering efforts of the GSC to engage researchers active in developing ecological, environmental and biodiversity data standards. As the initial goals of the GSC are increasingly achieved, promoting the comprehensive use of effective standards will be essential to ensure the effective use of sequence and associated data, to provide access for all biologists to all of the information, and to create interdisciplinary opportunities for discovery. The RCN will facilitate these implementation activities through participation in major scientific conferences and presentations on scientific advances enabled by community usage of genomic standards.

RevDate: 2011-07-25
CmpDate: 2011-07-14

Field D, Sterk P, Kyrpides N, et al (2009)

Meeting Report from the Genomic Standards Consortium (GSC) Workshops 6 and 7.

Standards in genomic sciences, 1(1):68-71.

This report summarizes the proceedings of the 6th and 7th workshops of the Genomic Standards Consortium (GSC), held back-to-back in 2008. GSC 6 focused on furthering the activities of GSC working groups, while GSC 7 focused on outreach to the wider community. GSC 6 was held October 10-14, 2008 at the European Bioinformatics Institute, Cambridge, United Kingdom and included a two-day workshop focused on the refinement of the Genomic Contextual Data Markup Language (GCDML). GSC 7 was held as the opening day of the International Congress on Metagenomics 2008 in San Diego, California. Major achievements of these combined meetings included an agreement from the International Nucleotide Sequence Database Consortium (INSDC) to create a "MIGS" keyword for capturing "Minimum Information about a Genome Sequence" compliant information within INSDC (DDBJ/EMBL/GenBank) records, launch of GCDML 1.0, MIGS compliance of the first set of "Genomic Encyclopedia of Bacteria and Archaea" project genomes, approval of a proposal to extend MIGS to 16S rRNA sequences within a "Minimum Information about an Environmental Sequence", finalization of plans for the GSC eJournal, "Standards in Genomic Sciences" (SIGS), and the formation of a GSC Board. Subsequently, the GSC has been awarded a Research Co-ordination Network (RCN4GSC) grant from the National Science Foundation, held the first SIGS workshop and launched the journal. The GSC will also be hosting outreach workshops at both ISMB 2009 and PSB 2010 focused on "Metagenomics, Metadata and MetaAnalysis" (M(3)). Further information about the GSC and its range of activities can be found at http://gensc.org, including videos of all the presentations at GSC 7.

RevDate: 2017-09-22
CmpDate: 2008-09-03

Garrity GM, Field D, Kyrpides N, et al (2008)

Toward a standards-compliant genomic and metagenomic publication record.

Omics : a journal of integrative biology, 12(2):157-160.

Increasingly, we are aware as a community of the growing need to manage the avalanche of genomic and metagenomic data, in addition to related data types like ribosomal RNA and barcode sequences, in a way that tightly integrates contextual data with traditional literature in a machine-readable way. It is for this reason that the Genomic Standards Consortium (GSC) formed in 2005. Here we suggest that we move beyond the development of standards and tackle standards compliance and improved data capture at the level of the scientific publication. We are supported in this goal by the fact that the scientific community is in the midst of a publishing revolution. This revolution is marked by a growing shift away from a traditional dichotomy between "journal articles" and "database entries" and an increasing adoption of hybrid models of collecting and disseminating scientific information. With respect to genomes and metagenomes and related data types, we feel the scientific community would be best served by the immediate launch of a central repository of short, highly structured "Genome Notes" that must be standards compliant. This could be done in the context of an existing journal, but we also suggest the more radical solution of launching a new journal. Such a journal could be designed to cater to a wide range of standards-related content types that are not currently centralized in the published literature. It could also support the demand for centralizing aspects of the "gray literature" (documents developed by institutions or communities) such as the call by the GSC for a central repository of Standard Operating Procedures describing the genomic annotation pipelines of the major sequencing centers. We argue that such an "eJournal," published under the Open Access paradigm by the GSC, could be an attractive publishing forum for a broader range of standardization initiatives within, and beyond, the GSC and thereby fill an unoccupied yet increasingly important niche within the current research landscape.

RevDate: 2017-09-22
CmpDate: 2008-09-03

Field D, Garrity GM, Sansone SA, et al (2008)

Meeting report: the fifth Genomic Standards Consortium (GSC) workshop.

Omics : a journal of integrative biology, 12(2):109-113.

This meeting report summarizes the proceedings of the fifth Genomic Standards Consortium (GSC) workshop held December 12-14, 2007, at the European Bioinformatics Institute (EBI), Cambridge, UK. This fifth workshop served as a milestone event in the evolution of the GSC (launched in September 2005); the key outcome of the workshop was the finalization of a stable version of the MIGS specification (v2.0) for publication. This accomplishment enables, and also in some cases necessitates, downstream activities, which are described in the multiauthor, consensus-driven articles in this special issue of OMICS produced as a direct result of the workshop. This report briefly summarizes the workshop and overviews the special issue. In particular, it aims to explain how the various GSC-led projects are working together to help this community achieve its stated mission of further standardizing the descriptions of genomes and metagenomes and implementing improved mechanisms of data exchange and integration to enable more accurate comparative analyses. Further information about the GSC and its range of activities can be found at http://gensc.org.

RevDate: 2008-06-20
CmpDate: 2008-09-03

Field D, Glöckner FO, Garrity GM, et al (2008)

Meeting report: the fourth Genomic Standards Consortium (GSC) workshop.

Omics : a journal of integrative biology, 12(2):101-108.

This meeting report summarizes the proceedings of the "eGenomics: Cataloguing our Complete Genome Collection IV" workshop held June 6-8, 2007, at the National Institute for Environmental eScience (NIEeS), Cambridge, United Kingdom. This fourth workshop of the Genomic Standards Consortium (GSC) was a mix of short presentations, strategy discussions, and technical sessions. Speakers provided progress reports on the development of the "Minimum Information about a Genome Sequence" (MIGS) specification and the closely integrated "Minimum Information about a Metagenome Sequence" (MIMS) specification. The key outcome of the workshop was consensus on the next version of the MIGS/MIMS specification (v1.2). This drove further definition and restructuring of the MIGS/MIMS XML schema (syntax). With respect to semantics, a term vetting group was established to ensure that terms are properly defined and submitted to the appropriate ontology projects. Perhaps the single most important outcome of the workshop was a proposal to move beyond the concept of "minimum" to create a far richer XML schema that would define a "Genomic Contextual Data Markup Language" (GCDML) suitable for wider semantic integration across databases. GCDML will contain not only curated information (e.g., compliant with MIGS/MIMS), but also be extended to include a variety of data processing and calculations. Further information about the Genomic Standards Consortium and its range of activities can be found at http://gensc.org.

RevDate: 2008-06-20
CmpDate: 2008-09-03

Van Brabant B, Gray T, Verslyppe B, et al (2008)

Laying the foundation for a Genomic Rosetta Stone: creating information hubs through the use of consensus identifiers.

Omics : a journal of integrative biology, 12(2):123-127.

Given the growing wealth of downstream information, the integration of molecular and non-molecular data on a given organism has become a major challenge. For micro-organisms, this information now includes a growing collection of sequenced genes and complete genomes, and for communities of organisms it includes metagenomes. Integration of the data is facilitated by the existence of authoritative, community-recognized, consensus identifiers that may form the heart of so-called information knuckles. The Genomic Standards Consortium (GSC) is building a mapping of identifiers across a group of federated databases with the aim of improving navigation across these resources and enabling the integration of their information in the near future. In particular, this is possible because of the existence of INSDC Genome Project Identifiers (GPIDs) and accession numbers, and the ability of the community to define new consensus identifiers such as the culture identifiers used in the StrainInfo.net bioportal. Here we (1) outline the general design of the Genomic Rosetta Stone project, (2) introduce example linkages between key databases (that cover information about genomes, 16S rRNA gene sequences, and microbial biological resource centers), and (3) make an open call for participation in this project, providing a vision for its future use.

RevDate: 2008-06-20
CmpDate: 2008-09-03

Kottmann R, Gray T, Murphy S, et al (2008)

A standard MIGS/MIMS compliant XML Schema: toward the development of the Genomic Contextual Data Markup Language (GCDML).

Omics : a journal of integrative biology, 12(2):115-121.

The Genomic Contextual Data Markup Language (GCDML) is a core project of the Genomic Standards Consortium (GSC) that implements the "Minimum Information about a Genome Sequence" (MIGS) specification and its extension, the "Minimum Information about a Metagenome Sequence" (MIMS). GCDML is an XML Schema for generating MIGS/MIMS compliant reports for data entry, exchange, and storage. When mature, this sample-centric, strongly-typed schema will provide a diverse set of descriptors for describing the exact origin and processing of a biological sample, from sampling to sequencing, and subsequent analysis. Here we describe the need for such a project, outline design principles required to support the project, and make an open call for participation in defining the future content of GCDML. GCDML is freely available, and can be downloaded, along with documentation, from the GSC Web site (http://gensc.org).

RevDate: 2018-04-25
CmpDate: 2008-06-04

Field D, Garrity G, Gray T, et al (2008)

The minimum information about a genome sequence (MIGS) specification.

Nature biotechnology, 26(5):541-547.

With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the 'transparency' of the information contained in existing genomic databases.

RevDate: 2008-07-28
CmpDate: 2008-09-03

Hirschman L, Clark C, Cohen KB, et al (2008)

Habitat-Lite: a GSC case study based on free text terms for environmental metadata.

Omics : a journal of integrative biology, 12(2):129-136.

There is an urgent need to capture metadata on the rapidly growing number of genomic, metagenomic and related sequences, such as 16S ribosomal genes. This need is a major focus within the Genomic Standards Consortium (GSC), and Habitat is a key metadata descriptor in the proposed "Minimum Information about a Genome Sequence" (MIGS) specification. The goal of the work described here is to provide a light-weight, easy-to-use (small) set of terms ("Habitat-Lite") that captures high-level information about habitat while preserving a mapping to the recently launched Environment Ontology (EnvO). Our motivation for building Habitat-Lite is to meet the needs of multiple users, such as annotators curating these data, database providers hosting the data, and biologists and bioinformaticians alike who need to search and employ such data in comparative analyses. Here, we report a case study based on semiautomated identification of terms from GenBank and GOLD. We estimate that the terms in the initial version of Habitat-Lite would provide useful labels for over 60% of the kinds of information found in the GenBank isolation_source field, and around 85% of the terms in the GOLD habitat field. We present a revised version of Habitat-Lite defined within the EnvO Environmental Ontology through a new category, EnvO-Lite-GSC. We invite the community's feedback on its further development to provide a minimum list of terms to capture high-level habitat information and to provide classification bins needed for future studies.

RevDate: 2008-07-28
CmpDate: 2008-09-03

Gil IS, Sheldon W, Schmidt T, et al (2008)

Defining linkages between the GSC and NSF's LTER program: how the Ecological Metadata Language (EML) relates to GCDML and other outcomes.

Omics : a journal of integrative biology, 12(2):151-156.

The Genomic Standards Consortium (GSC) invited a representative of the Long-Term Ecological Research (LTER) program to its fifth workshop to present the Ecological Metadata Language (EML) metadata standard and its relationship to the Minimum Information about a Genome/Metagenome Sequence (MIGS/MIMS) and its implementation, the Genomic Contextual Data Markup Language (GCDML). The LTER has been one of the top National Science Foundation (NSF) programs in biology since 1980, representing diverse ecosystems and creating long-term, interdisciplinary research, synthesis of information, and theory. The adoption of EML as the LTER network standard has been key to building network synthesis architectures based on high-quality standardized metadata. EML is the NSF-recognized metadata standard for LTER, and EML is a criterion used to review the LTER program progress. At the workshop, a potential crosswalk between the GCDML and EML was explored. Also, collaboration between the LTER and GSC developers was proposed to join efforts toward a common metadata cataloging designer's tool. The community adoption success of a metadata standard depends, among other factors, on the tools and training developed to use the standard. LTER's experience in embracing EML may help GSC to achieve similar success. A possible collaboration between LTER and GSC to provide training opportunities for GCDML and the associated tools is being explored. Finally, LTER is investigating EML enhancements to better accommodate genomics data, possibly integrating the GCDML schema into EML. All these action items have been accepted by the LTER contingent, and further collaboration between the GSC and LTER is expected.

RevDate: 2017-09-22
CmpDate: 2007-08-30

Field D, Kyrpides N (2007)

The positive role of the ecological community in the genomic revolution.

Microbial ecology, 53(3):507-511.

The exponential increase of genomic and metagenomic data, fueled in part by recent advancements in sequencing technology, is greatly expanding our understanding of the phylogenetic diversity and metabolic capacity present in the environment. Two of the central challenges that bioinformaticians and ecologists alike must face are the design of bioinformatic resources that facilitate the analysis of genomic and metagenomic data in a comparative context and the efficient capture and organization of the plethora of descriptive information required to usefully describe these data sets. In this commentary, we review three initiatives presented in the "new frontiers" session of the second SCOPE meeting on Microbial Environmental Genomics (MicroEnGen-II, Shanghai, June 12-15, 2006). These are (1) the Integrated Microbial Genomes Resources (IMG), (2) the Genomic Standards Consortium (GSC), and (3) the Natural Environment Research Council (NERC) Environmental Bioinformatics Centre (NEBC). These integrative bioinformatics and data management initiatives underscore the increasingly important role ecologists have to play in the genomic (metagenomic) revolution.

RevDate: 2006-08-11
CmpDate: 2007-11-07

Morrison N, Cochrane G, Faruque N, et al (2006)

Concept of sample in OMICS technology.

Omics : a journal of integrative biology, 10(2):127-137.

Fundamental biological processes can now be studied by applying the full range of OMICS technologies (genomics, transcriptomics, proteomics, metabolomics, and beyond) to the same biological sample. Clearly, it would be desirable if the concept of sample were shared among these technologies, especially as up until the time a biological sample is prepared for use in a specific OMICS assay, its description is inherently technology independent. Sharing a common informatic representation would encourage data sharing (rather than data replication), thereby reducing redundant data capture and the potential for error. This would result in a significant degree of harmonization across different OMICS data standardization activities, a task that is critical if we are to integrate data from these different data sources. Here, we review the current concept of sample in OMICS technologies as it is being dealt with by different OMICS standardization initiatives and discuss the special role that the newly formed Genomic Standards Consortium (GSC) might have to play in this domain.

RevDate: 2007-11-15
CmpDate: 2007-11-07

Field D, Morrison N, Selengut J, et al (2006)

Meeting report: eGenomics: Cataloguing our Complete Genome Collection II.

Omics : a journal of integrative biology, 10(2):100-104.

This article summarizes the proceedings of the "eGenomics: Cataloguing our Complete Genome Collection II" workshop held November 10-11, 2005, at the European Bioinformatics Institute. This exploratory workshop, organized by members of the Genomic Standards Consortium (GSC), brought together researchers from the genomic, functional OMICS, and computational biology communities to discuss standardization activities across a range of projects. The workshop proceedings and outcomes are set to help guide the development of the GSC's Minimal Information about a Genome Sequence (MIGS) specification.
