
Bibliography on: Pangenome

The Electronic Scholarly Publishing Project: Providing world-wide, free access to classic scientific papers and other scholarly materials, since 1993.


ESP: PubMed Auto Bibliography. Created: 21 Sep 2018 at 01:32


Although the enforced stability of genomic content is ubiquitous among multicellular eukaryotes (MCEs), the opposite is proving to be the case among prokaryotes, which exhibit remarkable and adaptive plasticity of genomic content. Early bacterial whole-genome sequencing efforts discovered that whenever a particular "species" was re-sequenced, new genes were found that had not been detected earlier: entirely new genes, not merely new alleles. This led to the concepts of the bacterial core-genome, the set of genes found in all members of a particular "species", and the flex-genome, the set of genes found in some, but not all, members of the "species". Together these make up the species' pan-genome.
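The set relationships described above can be illustrated with a minimal Python sketch. It assumes each strain has already been reduced to a set of gene-family names; the function name `partition_pangenome` and the toy strain data are hypothetical and serve only as illustration. Real pan-genome tools must first cluster orthologous genes, which this sketch does not attempt.

```python
from typing import Dict, Set, Tuple


def partition_pangenome(
    genomes: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (pan, core, flex) gene sets for a collection of strains.

    pan  = genes found in at least one strain
    core = genes found in every strain
    flex = genes found in some, but not all, strains
    """
    if not genomes:
        return set(), set(), set()
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)          # union over all strains
    core = set.intersection(*gene_sets)    # shared by every strain
    flex = pan - core                      # everything else
    return pan, core, flex


# Toy data (made up for illustration only).
strains = {
    "strain_A": {"dnaA", "gyrB", "recA", "stx1"},
    "strain_B": {"dnaA", "gyrB", "recA", "eae"},
    "strain_C": {"dnaA", "gyrB", "recA", "stx1", "hlyA"},
}

pan, core, flex = partition_pangenome(strains)
print(sorted(core))  # ['dnaA', 'gyrB', 'recA']
print(sorted(flex))  # ['eae', 'hlyA', 'stx1']
```

Because the core can only shrink and the pan-genome can only grow as strains are added, re-running the function on a larger collection shows the effect described above: each newly sequenced strain may contribute genes not seen before.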

Created with PubMed® Query: pangenome or "pan-genome" or "pan genome" NOT pmcbook NOT ispreviousversion

The Papers (from PubMed®)

RevDate: 2018-09-20

Wang LYR, Jokinen CC, Laing CR, et al (2018)

Multi-Year Persistence of Verotoxigenic Escherichia coli (VTEC) in a Closed Canadian Beef Herd: A Cohort Study.

Frontiers in microbiology, 9:2040.

In this study, fecal samples were collected from a closed beef herd in Alberta, Canada from 2012 to 2015. To limit serotype bias, which was observed in enrichment broth cultures, Verotoxigenic Escherichia coli (VTEC) were isolated directly from samples using a hydrophobic grid-membrane filter verotoxin immunoblot assay. Overall VTEC isolation rates were similar for three different cohorts of yearling heifers on both an annual (68.5 to 71.8%) and seasonal basis (67.3 to 76.0%). Across all three cohorts, O139:H19 (37.1% of VTEC-positive samples), O22:H8 (15.8%) and O?(O108):H8 (15.4%) were among the most prevalent serotypes. However, isolation rates for serotypes O139:H19, O130:H38, O6:H34, O91:H21, and O113:H21 differed significantly between cohort-years, as did isolation rates for some serotypes within a single heifer cohort. There was a high level of VTEC serotype diversity with an average of 4.3 serotypes isolated per heifer and 65.8% of the heifers classified as "persistent shedders" of VTEC based on the criteria of >50% of samples positive and ≥4 consecutive samples positive. Only 26.8% (90/336) of the VTEC isolates from yearling heifers belonged to the human disease-associated seropathotypes A (O157:H7), B (O26:H11, O111:NM), and C (O22:H8, O91:H21, O113:H21, O137:H41, O2:H6). Conversely, seropathotypes B (O26:NM, O111:NM) and C (O91:H21, O2:H29) strains were dominant (76.0%, 19/25) among VTEC isolates from month-old calves from this herd. Among VTEC from heifers, carriage rates of vt1, vt2, vt1+vt2, eae, and hlyA were 10.7, 20.8, 68.5, 3.9, and 88.7%, respectively. The adhesin gene saa was present in 82.7% of heifer strains but absent from all of 13 eae+ve strains (from serotypes/intimin types O157:H7/γ1, O26:H11/β1, O111:NM/θ, O84:H2/ζ, and O182:H25/ζ). Phylogenetic relationships inferred from wgMLST and pan genome-derived core SNP analysis showed that strains clustered by phylotype and serotype. 
Further, VTEC strains of the same serotype usually shared the same suite of antibiotic resistance and virulence genes, suggesting the circulation of dominant clones within this distinct herd. This study provides insight into the diverse and dynamic nature of VTEC populations within groups of cattle and points to a broad spectrum of human health risks associated with these E. coli strains.

RevDate: 2018-09-20

Golanowska M, Potrykus M, Motyka-Pomagruk A, et al (2018)

Comparison of Highly and Weakly Virulent Dickeya solani Strains, With a View on the Pangenome and Panregulon of This Species.

Frontiers in microbiology, 9:1940.

Bacteria belonging to the genera Dickeya and Pectobacterium are responsible for significant economic losses in a wide variety of crops and ornamentals. In recent years, increasing losses in potato production have been attributed to the appearance of Dickeya solani. The D. solani strains investigated so far share genetic homogeneity, although different virulence levels were observed among strains of various origins. The purpose of this study was to investigate the genetic traits possibly related to the diverse virulence levels by means of comparative genomics. First, we developed a new genome assembly pipeline which allowed us to complete the D. solani genomes. Four de novo sequenced and ten publicly available genomes were used to identify the structure of the D. solani pangenome, in which 74.8 and 25.2% of genes were grouped into the core and dispensable genome, respectively. For D. solani panregulon analysis, we performed a binding site prediction for four transcription factors, namely CRP, KdgR, PecS and Fur, to detect the regulons of these virulence regulators. Most of the D. solani potential virulence factors were predicted to belong to the accessory regulons of CRP, KdgR, and PecS. Thus, some differences in gene expression could exist between D. solani strains. The comparison between a highly virulent and a weakly virulent strain, IFB0099 and IFB0223, respectively, disclosed only small differences between their genomes but significant differences in the production of virulence factors like pectinases, cellulases and proteases, and in their mobility. The D. solani strains also diverge in the number and size of prophages present in their genomes. Another relevant difference is the disruption of the adhesin gene fhaB2 in the highly virulent strain. Strain IFB0223, which has a complete adhesin gene, is less mobile and less aggressive than IFB0099. This suggests that in this case, mobility rather than adherence is needed in order to trigger disease symptoms.
This study highlights the utility of comparative genomics in predicting D. solani traits involved in the aggressiveness of this emerging plant pathogen.

RevDate: 2018-09-19

Bayer PE, Golicz AA, Tirnaz S, et al (2018)

Variation in abundance of predicted resistance genes in the Brassica oleracea pangenome.

Plant biotechnology journal [Epub ahead of print].

Brassica oleracea is an important agricultural species encompassing many vegetable crops including cabbage, cauliflower, broccoli and kale; however, it can be susceptible to a variety of fungal diseases such as clubroot, blackleg, leaf spot, and downy mildew. Resistance to these diseases is mediated by specific disease resistance gene analogs (RGAs), which are differently distributed across B. oleracea lines. The sequenced reference cultivar does not contain all B. oleracea genes due to gene presence/absence variation between individuals, which makes it necessary to search for RGA candidates in the B. oleracea pangenome. Here we present a comparative analysis of RGA candidates in the pangenome of B. oleracea. We show that the presence of RGA candidates differs between lines and suggest that in B. oleracea, SNPs and presence/absence variation drive RGA diversity using separate mechanisms. We identified 32 RGA candidates linked to Sclerotinia, clubroot, and Fusarium wilt resistance QTL, and these findings have implications for crop breeding in B. oleracea, which may also be applicable in other crop species.

RevDate: 2018-09-18

Checcucci A, diCenzo G, Ghini V, et al (2018)

Creation and characterization of a genomically hybrid strain in the nitrogen-fixing symbiotic bacterium Sinorhizobium meliloti.

ACS synthetic biology [Epub ahead of print].

Many bacteria, often associated with eukaryotic hosts and of relevance for biotechnological applications, harbour a multipartite genome composed of more than one replicon. Biotechnologically relevant phenotypes are often encoded by genes residing on the secondary replicons. A synthetic biology approach to developing enhanced strains for biotechnological purposes could therefore involve merging pieces or entire replicons from multiple strains into a single genome. Here we report the creation of a genomic hybrid strain in a model multipartite genome species, the plant-symbiotic bacterium Sinorhizobium meliloti. We term this strain a cis-hybrid, since it is produced from genomic material coming from the same species' pangenome. In particular, we moved the secondary replicon pSymA (accounting for nearly 20% of total genome content) from a donor S. meliloti strain to an acceptor strain. The cis-hybrid strain was screened for a panel of complex phenotypes (carbon/nitrogen utilization phenotypes, intra- and extra-cellular metabolomes, symbiosis, and various microbiological tests). Additionally, metabolic network reconstruction and constraint-based modelling were employed for in silico prediction of metabolic flux reorganization. Phenotypes of the cis-hybrid strain were in good agreement with those of both parental strains. Interestingly, the symbiotic phenotype showed a marked cultivar-specific improvement with the cis-hybrid strains compared to both parental strains. These results provide a proof-of-principle for the feasibility of genome-wide replicon-based remodelling of bacterial strains for improved biotechnological applications in precision agriculture.

RevDate: 2018-09-13

Le KK, Whiteside MD, Hopkins JE, et al (2018)

Spfy: an integrated graph database for real-time prediction of bacterial phenotypes and downstream comparative analyses.

Database : the journal of biological databases and curation, 2018:1-10 pii:5096058.

Public health laboratories are currently moving to whole-genome sequence (WGS)-based analyses, and require the rapid prediction of standard reference laboratory methods based solely on genomic data. Currently, these predictive genomics tasks rely on workflows that chain together multiple programs for the requisite analyses. While useful, these systems do not store the analyses in a genome-centric way, meaning the same analyses are often re-computed for the same genomes. To solve this problem, we created Spfy, a platform that rapidly performs the common reference laboratory tests, uses a graph database to store and retrieve the results from the computational workflows and links data to individual genomes using standardized ontologies. The Spfy platform facilitates rapid phenotype identification, as well as the efficient storage and downstream comparative analysis of tens of thousands of genome sequences. Though generally applicable to bacterial genome sequences, Spfy currently contains 10 243 Escherichia coli genomes, for which in-silico serotype and Shiga-toxin subtype, as well as the presence of known virulence factors and antimicrobial resistance determinants have been computed. Additionally, the presence/absence of the entire E. coli pan-genome was computed and linked to each genome. Owing to its database of diverse pre-computed results, and the ability to easily incorporate user data, Spfy facilitates hypothesis testing in fields ranging from population genomics to epidemiology, while mitigating the re-computation of analyses. The graph approach of Spfy is flexible, and can accommodate new analysis software modules as they are developed, easily linking new results to those already stored. Spfy provides a database and analyses approach for E. coli that is able to match the rapid accumulation of WGS data in public databases.

RevDate: 2018-09-11

Kavya VNS, Tayal K, Srinivasan R, et al (2018)

Sequence Alignment on Directed Graphs.

Journal of computational biology : a journal of computational molecular cell biology [Epub ahead of print].

Genomic variations in a reference collection are naturally represented as genome variation graphs. Such graphs encode common subsequences as vertices and the variations are captured using additional vertices and directed edges. The resulting graphs are directed graphs possibly with cycles. Existing algorithms for aligning sequences on such graphs make use of partial order alignment (POA) techniques that work on directed acyclic graphs (DAGs). To achieve this, acyclic extensions of the input graphs are first constructed through expensive loop unrolling steps (DAGification). Furthermore, such graph extensions could have considerable blowup in their size and in the worst case the blow-up factor is proportional to the input sequence length. We provide a novel alignment algorithm V-ALIGN that aligns the input sequence directly on the input graph while avoiding such expensive DAGification steps. V-ALIGN is based on a novel dynamic programming (DP) formulation that allows gapped alignment directly on the input graph. It supports affine and linear gaps. We also propose refinements to V-ALIGN for better performance in practice. With the proposed refinements, the time to fill the DP table has linear dependence on the sizes of the sequence, the graph, and its feedback vertex set. We conducted experiments to compare the proposed algorithm against the existing POA-based techniques. We also performed alignment experiments on the genome variation graphs constructed from the 1000 Genomes data. For aligning short sequences, standard approaches restrict the expensive gapped alignment to small filtered subgraphs having high similarity to the input sequence. In such cases, the performance of V-ALIGN for gapped alignment on the filtered subgraph depends on the subgraph sizes.
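For orientation, the POA-style baseline that the abstract contrasts against can be sketched as a dynamic program over a directed acyclic graph. The Python below is not V-ALIGN and does not handle cycles or affine gaps. It is a simplified global aligner with linear gap penalties, assuming one character per vertex and vertices numbered in topological order. The function name `align_to_dag`, the scoring defaults and the toy graph are all hypothetical.

```python
from typing import List, Sequence


def align_to_dag(
    labels: Sequence[str],
    preds: Sequence[Sequence[int]],
    query: str,
    match: int = 1,
    mismatch: int = -1,
    gap: int = -1,
) -> int:
    """Best global score of `query` against any source-to-sink path.

    labels[v] is the single character on vertex v; preds[v] lists the
    predecessors of v. Vertices must be in topological order, so every
    predecessor index is smaller than v.
    """
    m = len(query)
    if not labels:
        return m * gap  # only gaps are possible against an empty graph
    # Virtual start row: query prefix aligned entirely to gaps.
    start = [j * gap for j in range(m + 1)]
    rows: List[List[int]] = []
    has_successor = [False] * len(labels)
    for v, label in enumerate(labels):
        for u in preds[v]:
            if u >= v:
                raise ValueError("vertices must be topologically ordered")
            has_successor[u] = True
        prev_rows = [rows[u] for u in preds[v]] or [start]
        # Column 0: vertex v is aligned to a gap.
        row = [max(p[0] for p in prev_rows) + gap]
        for j in range(1, m + 1):
            sub = match if label == query[j - 1] else mismatch
            diag = max(p[j - 1] for p in prev_rows) + sub  # align v to query[j-1]
            up = max(p[j] for p in prev_rows) + gap        # skip vertex v
            left = row[j - 1] + gap                        # skip query[j-1]
            row.append(max(diag, up, left))
        rows.append(row)
    sinks = [v for v in range(len(labels)) if not has_successor[v]]
    return max(rows[v][m] for v in sinks)


# Toy variation graph: A -> C -> (G | T) -> A, i.e. paths ACGA and ACTA.
labels = ["A", "C", "G", "T", "A"]
preds = [[], [0], [1], [1], [2, 3]]
print(align_to_dag(labels, preds, "ACTA"))  # 4
print(align_to_dag(labels, preds, "ACA"))   # 2
```

The only change from ordinary sequence-to-sequence alignment is that each cell takes the maximum over all predecessor rows rather than a single previous row. That step is also why the approach requires an acyclic graph, which is the restriction the abstract says V-ALIGN avoids.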

RevDate: 2018-09-06

Chen X, Zhang Y, Zhang Z, et al (2018)

PGAweb: A Web Server for Bacterial Pan-Genome Analysis.

Frontiers in microbiology, 9:1910.

An astronomical increase in microbial genome data in recent years has led to strong demand for bioinformatic tools for pan-genome analysis within and across species. Here, we present PGAweb, a user-friendly, web-based tool for bacterial pan-genome analysis, which is composed of two main pan-genome analysis modules, PGAP and PGAP-X. PGAweb provides key interactive and customizable functions that include orthologous clustering, pan-genome profiling, sequence variation and evolution analysis, and functional classification. PGAweb presents features of genomic structural dynamics and sequence diversity with different visualization methods that are helpful for intuitively understanding the dynamics and evolution of bacterial genomes. PGAweb has an intuitive interface with one-click setting of parameters and is freely available at

RevDate: 2018-09-05

Syme RA, Tan KC, Rybak K, et al (2018)

Pan-Parastagonospora Comparative Genome Analysis - effector prediction and genome evolution.

Genome biology and evolution pii:5090454 [Epub ahead of print].

We report a fungal pan-genome study involving Parastagonospora spp., including 21 isolates of the wheat (Triticum aestivum) pathogen P. nodorum, 10 of the grass-infecting P. avenae and 2 of a closely-related undefined sister species. We observed substantial variation in the distribution of polymorphisms across the pan-genome, including repeat-induced point mutations (RIP), diversifying selection and gene gains and losses. We also discovered chromosome-scale inter- and intra-specific presence/absence variation of some sequences, suggesting the occurrence of one or more accessory chromosomes or regions that may play a role in host-pathogen interactions. The presence of known pathogenicity effector loci SnToxA, SnTox1 and SnTox3 varied substantially among isolates. Three P. nodorum isolates lacked functional versions for all three loci whilst three P. avenae isolates carried one or both of the SnTox1 and SnTox3 genes, indicating previously unrecognized potential for discovering additional effectors in the P. nodorum-wheat pathosystem. We utilized the pan-genomic comparative analysis to improve the prediction of pathogenicity effector candidates, recovering the three confirmed effectors among our top-ranked candidates. We propose applying this pan-genomic approach to identify the effector repertoire involved in other host-microbe interactions involving necrotrophic pathogens in the Pezizomycotina.

RevDate: 2018-09-04

Yang T, Zhong J, Zhang J, et al (2018)

Pan-Genomic Study of Mycobacterium tuberculosis Reflecting the Primary/Secondary Genes, Generality/Individuality, and the Interconversion Through Copy Number Variations.

Frontiers in microbiology, 9:1886.

Tuberculosis (TB) has surpassed HIV as the leading infectious disease killer worldwide since 2014. The main pathogen, Mycobacterium tuberculosis (Mtb), contains ~4,000 genes that account for ~90% of the genome. However, it is still unclear which of these genes are primary/secondary, which are responsible for generality/individuality, and which interconvert during evolution. Here we utilized a pan-genomic analysis of 36 Mtb genomes to address these questions. We identified 3,679 Mtb core (i.e., primary) genes, determining their phenotypic generality (e.g., virulence, slow growth, dormancy). We also observed 1,122 dispensable and 964 strain-specific secondary genes, reflecting partially shared and lineage-/strain-specific individualities. Among these, five L2 lineage-specific genes might be related to the increased virulence of the L2 lineage. Notably, we discovered 28 Mtb "Super Core Genes" (SCGs: more than one copy in at least 90% of strains), which might be of increased importance, and reflected the "super phenotype generality." Most SCGs encode PE/PPE, virulence factors, antigens, and transposases, and have been verified as playing crucial roles in Mtb pathogenicity. Further investigation of the 28 SCGs demonstrated the interconversion among SCGs, single-copy core, dispensable, and strain-specific genes through copy number variations (CNVs) during evolution; different mutations on different copies highlight the delicate adaptive-evolution regulation amongst Mtb lineages. This reflects that the importance of genes varied through CNVs, which might be driven by selective pressure from environment/host-adaptation. In addition, compared with Mycobacterium bovis (Mbo), Mtb possesses 48 specific single core genes that partially reflect the differences between Mtb and Mbo individuality.
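The gene categories named in the abstract above can be illustrated with a small Python sketch that classifies a gene family from its per-strain copy numbers. This is an assumption-laden reading, not the authors' pipeline. The function name `classify_gene` is hypothetical, and so is the order of the checks: the abstract does not say whether an SCG must also be present in every strain, so the sketch simply tests the multi-copy threshold first.

```python
from typing import Sequence


def classify_gene(copies: Sequence[int], scg_fraction: float = 0.9) -> str:
    """Classify one gene family from its copy number in each strain.

    copies[i] is the number of copies found in strain i.
    Categories (assumed reading of the abstract):
      super_core      - more than one copy in >= scg_fraction of strains
      core            - present in every strain
      strain_specific - present in exactly one strain
      dispensable     - present in some, but not all, strains
    """
    n = len(copies)
    if n == 0:
        raise ValueError("at least one strain is required")
    present = sum(1 for c in copies if c > 0)
    multi_copy = sum(1 for c in copies if c > 1)
    if present == 0:
        return "absent"
    if multi_copy / n >= scg_fraction:
        return "super_core"
    if present == n:
        return "core"
    if present == 1:
        return "strain_specific"
    return "dispensable"


# Toy copy-number table for 10 strains (hypothetical gene names and values).
table = {
    "gene_single": [1] * 10,
    "gene_multi": [2] * 9 + [1],
    "gene_partial": [1, 1, 0, 0, 1, 0, 0, 0, 0, 0],
    "gene_unique": [0] * 9 + [1],
}
for name, copies in table.items():
    print(name, classify_gene(copies))
```

Changing a few entries in a row of the table (for example, dropping the second copy in several strains) moves a gene from one category to another, which is the kind of interconversion through copy number variation that the abstract describes.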

RevDate: 2018-09-03

Asaf S, Khan AL, Khan MA, et al (2018)

Complete genome sequencing and analysis of endophytic Sphingomonas sp. LK11 and its potential in plant growth.

3 Biotech, 8(9):389.

Our study aimed to elucidate the plant growth-promoting characteristics and the structure and composition of Sphingomonas sp. LK11 genome using the single molecule real-time (SMRT) sequencing technology of Pacific Biosciences. The results revealed that LK11 produces different types of gibberellins (GAs) in pure culture and significantly improves soybean plant growth by influencing endogenous GAs compared with non-inoculated control plants. Detailed genomic analyses revealed that the Sphingomonas sp. LK11 genome consists of a circular chromosome (3.78 Mbp; 66.2% G+C content) and two circular plasmids (122,975 bps and 34,160 bps; 63 and 65% G+C content, respectively). Annotation showed that the LK11 genome consists of 3656 protein-coding genes, 59 tRNAs, and 4 complete rRNA operons. Functional analyses predicted that LK11 encodes genes for phosphate solubilization and nitrate/nitrite ammonification, which are beneficial for promoting plant growth. Genes for production of catalases, superoxide dismutase, and peroxidases that confer resistance to oxidative stress in plants were also identified in LK11. Moreover, genes for trehalose and glycine betaine biosynthesis were also found in LK11 genome. Similarly, Sphingomonas spp. analysis revealed an open pan-genome and a total of 8507 genes were identified in the Sphingomonas spp. pan-genome and about 1356 orthologous genes were found to comprise the core genome. However, the number of genomes analyzed was not enough to describe complete gene sets. Our findings indicated that the genetic makeup of Sphingomonas sp. LK11 can be utilized as an eco-friendly bioresource for cleaning contaminated sites and promoting growth of plants confronted with environmental perturbations.

RevDate: 2018-08-31

Kiu R, LJ Hall (2018)

Response: Commentary: Probing Genomic Aspects of the Multi-Host Pathogen Clostridium perfringens Reveals Significant Pangenome Diversity, and a Diverse Array of Virulence Factors.

Frontiers in microbiology, 9:1857.

RevDate: 2018-08-30

Inman JM, Sutton GG, Beck E, et al (2018)

Large-Scale Comparative Analysis of Microbial Pan-genomes using PanOCT.

Bioinformatics (Oxford, England) pii:5079328 [Epub ahead of print].

Summary: The JCVI Pan-Genome Pipeline is a collection of programs to run PanOCT and tools that support and extend the capabilities of PanOCT. PanOCT (Pan-genome Ortholog Clustering Tool) is a tool for pan-genome analysis of closely related prokaryotic species or strains. The JCVI Pan-Genome Pipeline wrapper invokes command-line utilities that prepare input genomes, invoke third-party tools such as NCBI Blast+, run PanOCT, generate a consensus pan-genome, annotate features of the pan-genome, detect sets of genes of interest such as antimicrobial resistance (AMR) genes, and generate figures, tables, and html pages to visualize the results. The pipeline can run in a hierarchical mode, lowering the RAM and compute resources used.

Availability: Source code, demo data, and detailed documentation are freely available at

RevDate: 2018-08-29

Mehdizadeh Gohari I, JF Prescott (2018)

Commentary: Probing Genomic Aspects of the Multi-Host Pathogen Clostridium perfringens Reveals Significant Pangenome Diversity, and a Diverse Array of Virulence Factors.

Frontiers in microbiology, 9:1856.

RevDate: 2018-08-28

Sánchez-Vallet A, Fouché S, Fudal I, et al (2018)

The Genome Biology of Effector Gene Evolution in Filamentous Plant Pathogens.

Annual review of phytopathology, 56:21-40.

Filamentous pathogens, including fungi and oomycetes, pose major threats to global food security. Crop pathogens cause damage by secreting effectors that manipulate the host to the pathogen's advantage. Genes encoding such effectors are among the most rapidly evolving genes in pathogen genomes. Here, we review how the major characteristics of the emergence, function, and regulation of effector genes are tightly linked to the genomic compartments where these genes are located in pathogen genomes. The presence of repetitive elements in these compartments is associated with elevated rates of point mutations and sequence rearrangements with a major impact on effector diversification. The expression of many effectors converges on an epigenetic control mediated by the presence of repetitive elements. Population genomics analyses showed that rapidly evolving pathogens show high rates of turnover at effector loci and display a mosaic in effector presence-absence polymorphism among strains. We conclude that effective pathogen containment strategies require a thorough understanding of the effector genome biology and the pathogen's potential for rapid adaptation.

RevDate: 2018-08-21

Ou L, Li D, Lv J, et al (2018)

Pan-genome of cultivated pepper (Capsicum) and its use in gene presence-absence variation analyses.

RevDate: 2018-08-21

Argemi X, Matelska D, Ginalski K, et al (2018)

Comparative genomic analysis of Staphylococcus lugdunensis shows a closed pan-genome and multiple barriers to horizontal gene transfer.

BMC genomics, 19(1):621 pii:10.1186/s12864-018-4978-1.

BACKGROUND: Coagulase negative staphylococci (CoNS) are commensal bacteria on human skin. Staphylococcus lugdunensis is a unique CoNS which produces various virulence factors and may, like S. aureus, cause severe infections, particularly in hospital settings. Unlike other staphylococci, it remains highly susceptible to antimicrobials, and genome-based phylogenetic studies have evidenced a highly conserved genome that distinguishes it from all other staphylococci.

RESULTS: We demonstrate that S. lugdunensis possesses a closed pan-genome with a very limited number of new genes, in contrast to other staphylococci that have an open pan-genome. Whole-genome nucleotide and amino acid identity levels are also higher than in other staphylococci. We identified numerous genetic barriers to horizontal gene transfer that might explain this result. The S. lugdunensis genome has multiple operons encoding for restriction-modification, CRISPR/Cas and toxin/antitoxin systems. We also identified a new PIN-like domain-associated protein that might belong to a larger operon, comprising a metalloprotease, that could function as a new toxin/antitoxin or detoxification system.

CONCLUSION: We show that S. lugdunensis has a unique genome profile within staphylococci, with a closed pan-genome and several systems to prevent horizontal gene transfer. Its virulence in clinical settings does not rely on its ability to acquire and exchange antibiotic resistance genes or other virulence factors as shown for other staphylococci.

RevDate: 2018-08-17

Pena-Gonzalez A, Rodriguez-R LM, Marston CK, et al (2018)

Genomic Characterization and Copy Number Variation of Bacillus anthracis Plasmids pXO1 and pXO2 in a Historical Collection of 412 Strains.

mSystems, 3(4): pii:mSystems00065-18.

Bacillus anthracis plasmids pXO1 and pXO2 carry the main virulence factors responsible for anthrax. However, the extent of copy number variation within the species and how the plasmids are related to pXO1/pXO2-like plasmids in other species of the Bacillus cereus sensu lato group remain unclear. To gain new insights into these issues, we sequenced 412 B. anthracis strains representing the total phylogenetic and ecological diversity of the species. Our results revealed that B. anthracis genomes carried, on average, 3.86 and 2.29 copies of pXO1 and pXO2, respectively, and also revealed a positive linear correlation between the copy numbers of pXO1 and pXO2. No correlation between the plasmid copy number and the phylogenetic relatedness of the strains was observed. However, genomes of strains isolated from animal tissues generally maintained a higher plasmid copy number than genomes of strains from environmental sources (P < 0.05 [Welch two-sample t test]). Comparisons against B. cereus genomes carrying complete or partial pXO1-like and pXO2-like plasmids showed that the plasmid-based phylogeny recapitulated that of the main chromosome, indicating limited plasmid horizontal transfer between or within these species. Comparisons of gene content revealed a closed pXO1 and pXO2 pangenome; e.g., plasmids encode <8 unique genes, on average, and a single large fragment deletion of pXO1 in one B. anthracis strain (2000031682) was detected. Collectively, our results provide a more complete view of the genomic diversity of B. anthracis plasmids, their copy number variation, and the virulence potential of other Bacillus species carrying pXO1/pXO2-like plasmids. IMPORTANCE Bacillus anthracis microorganisms are of historical and epidemiological importance and are among the most homogenous bacterial groups known, even though the B. anthracis genome is rich in mobile elements. 
Mobile elements can trigger the diversification of lineages; therefore, characterizing the extent of genomic variation in a large collection of strains is critical for a complete understanding of the diversity and evolution of the species. Here, we sequenced a large collection of B. anthracis strains (>400) that were recovered from human, animal, and environmental sources around the world. Our results confirmed the remarkable stability of gene content and synteny of the anthrax plasmids and revealed no signal of plasmid exchange between B. anthracis and pathogenic B. cereus isolates but rather predominantly vertical descent. These findings advance our understanding of the biology and pathogenomic evolution of B. anthracis and its plasmids.

RevDate: 2018-08-17

Thind AK, Wicker T, Müller T, et al (2018)

Chromosome-scale comparative sequence analysis unravels molecular mechanisms of genome dynamics between two wheat cultivars.

Genome biology, 19(1):104 pii:10.1186/s13059-018-1477-2.

BACKGROUND: Recent improvements in DNA sequencing and genome scaffolding have paved the way to generate high-quality de novo assemblies of pseudomolecules representing complete chromosomes of wheat and its wild relatives. These assemblies form the basis to compare the dynamics of wheat genomes on a megabase scale.

RESULTS: Here, we provide a comparative sequence analysis of the 700-megabase chromosome 2D between two bread wheat genotypes: the old landrace Chinese Spring and the elite Swiss spring wheat line 'CH Campala Lr22a'. Both chromosomes were assembled into megabase-sized scaffolds. There is a high degree of sequence conservation between the two chromosomes. Analysis of large structural variations reveals four large indels of more than 100 kb. Based on the molecular signatures at the breakpoints, unequal crossing over and double-strand break repair were identified as the molecular mechanisms that caused these indels. Three of the large indels affect copy number of NLRs, a gene family involved in plant immunity. Analysis of SNP density reveals four haploblocks of 4, 8, 9 and 48 Mb with a 35-fold increased SNP density compared to the rest of the chromosome. Gene content across the two chromosomes was highly conserved. Ninety-nine percent of the genic sequences were present in both genotypes and the fraction of unique genes ranged from 0.4 to 0.7%.

CONCLUSIONS: This comparative analysis of two high-quality chromosome assemblies enabled a comprehensive assessment of large structural variations and gene content. The insight obtained from this analysis will form the basis of future wheat pan-genome studies.

RevDate: 2018-08-14

Cleary A, Ramaraj T, Kahanda I, et al (2018)

Exploring Frequented Regions in Pan-Genomic Graphs.

IEEE/ACM transactions on computational biology and bioinformatics [Epub ahead of print].

We consider the problem of identifying regions within a pan-genome De Bruijn graph that are traversed by many sequence paths. We define such regions and the subpaths that traverse them as frequented regions (FRs). In this work, we formalize the FR problem and describe an efficient algorithm for finding FRs. Subsequently, we propose some applications of FRs based on machine-learning and pan-genome graph simplification. We demonstrate the effectiveness of these applications using data sets for the organisms Staphylococcus aureus (bacterium) and Saccharomyces cerevisiae (yeast). We corroborate the biological relevance of FRs such as identifying introgressions in yeast that aid in alcohol tolerance, and show that FRs are useful for classification of yeast strains by industrial use and visualizing pan-genomic space.
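As a rough illustration of the underlying idea (graph nodes traversed by many sequence paths), the sketch below counts how many input sequences pass through each k-mer node of a De Bruijn graph. It is not the frequented-region algorithm from the paper, whose formal definition is not given in the abstract. It scores single nodes rather than regions, and the names `kmer_support`, `frequented_kmers` and the `min_fraction` threshold are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


def kmer_support(sequences: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """Map each k-mer (a De Bruijn graph node) to the sequences traversing it."""
    support: Dict[str, Set[str]] = defaultdict(set)
    for name, seq in sequences.items():
        for i in range(len(seq) - k + 1):
            support[seq[i:i + k]].add(name)
    return support


def frequented_kmers(
    sequences: Dict[str, str], k: int, min_fraction: float
) -> Set[str]:
    """Return k-mers traversed by at least min_fraction of the sequences."""
    support = kmer_support(sequences, k)
    n = len(sequences)
    return {
        kmer for kmer, names in support.items()
        if len(names) / n >= min_fraction
    }


# Toy sequences (made up for illustration).
seqs = {"s1": "ACGTAC", "s2": "ACGTTT", "s3": "GGACGT"}
print(sorted(frequented_kmers(seqs, k=3, min_fraction=1.0)))  # ['ACG', 'CGT']
```

Grouping adjacent high-support nodes into larger regions, and tolerating small gaps along each supporting path, would be the natural next steps toward anything resembling the regions the abstract describes.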

RevDate: 2018-08-14

Das S, Pettersson BMF, Behra PRK, et al (2018)

Extensive genomic diversity among Mycobacterium marinum strains revealed by whole genome sequencing.

Scientific reports, 8(1):12040 pii:10.1038/s41598-018-30152-y.

Mycobacterium marinum is the causative agent for the tuberculosis-like disease mycobacteriosis in fish and skin lesions in humans. Ubiquitous in its geographical distribution, M. marinum is known to occupy diverse fish as hosts. However, information about its genomic diversity is limited. Here, we provide the genome sequences for 15 M. marinum strains isolated from infected humans and fish. Comparative genomic analysis of these and four available genomes of the M. marinum strains M, E11, MB2 and Europe reveals high genomic diversity among the strains, leading to the conclusion that M. marinum should be divided into two different clusters, the "M"- and the "Aronson"-type. We suggest that these two clusters should be considered to represent two M. marinum subspecies. Our data also show that the M. marinum pan-genome for both groups is open and expanding and we provide data showing a high number of mutational hotspots in M. marinum relative to other mycobacteria such as Mycobacterium tuberculosis. This high genomic diversity might be related to the ability of M. marinum to occupy different ecological niches.

RevDate: 2018-08-03

Pluta R, M Espinosa (2018)

Antisense and yet sensitive: Copy number control of rolling circle-replicating plasmids by small RNAs.

Wiley interdisciplinary reviews. RNA [Epub ahead of print].

Bacterial plasmids constitute a wealth of shared DNA amounting to about 20% of the total prokaryotic pangenome. Plasmids replicate autonomously and control their replication by maintaining a fairly constant number of copies within a given host. Plasmids should acquire a good fitness to their hosts so that they do not constitute a genetic load. Here we review some basic concepts in plasmid biology, pertaining to the control of replication and distribution of plasmid copies among daughter cells. A particular class of plasmids is constituted by those that replicate by the rolling circle mode (rolling circle-replicating [RCR]-plasmids). They are small double-stranded DNA molecules, with a rather high number of copies in the original host. RCR-plasmids control their replication by means of a small short-lived antisense RNA, alone or in combination with a plasmid-encoded transcriptional repressor protein. Two plasmid prototypes have been studied in depth, namely the staphylococcal plasmid pT181 and the streptococcal plasmid pMV158, each corresponding to the two types of replication control circuits, respectively. We further discuss possible applications of the plasmid-encoded antisense RNAs and address some future directions that, in our opinion, should be pursued in the study of these small molecules. This article is categorized under: Regulatory RNAs/RNAi/Riboswitches > Regulatory RNAs RNA Structure and Dynamics > Influence of RNA Structure in Biological Systems.

RevDate: 2018-08-03

González-Torres P, T Gabaldón (2018)

Genome Variation in the Model Halophilic Bacterium Salinibacter ruber.

Frontiers in microbiology, 9:1499.

The halophilic bacterium Salinibacter ruber is an abundant and ecologically important member of halophilic communities worldwide. Given its broad distribution and high intraspecific genetic diversity, S. ruber is considered one of the main models for ecological and evolutionary studies of bacterial adaptation to hypersaline environments. However, current insight into the genomic diversity of this species is limited to the comparison of the genomes of two co-isolated strains. Here, we present a comparative genomic analysis of eight S. ruber strains isolated at two different time points in each of two different Mediterranean solar salterns. Our results show an open pangenome with contrasting evolutionary patterns in the core and accessory genomes. We found that the core genome is shaped by extensive homologous recombination (HR), which results in limited sequence variation within population clusters. In contrast, the accessory genome is modulated by horizontal gene transfer (HGT), with genomic islands and plasmids acting as gateways to the rest of the genome. In addition, both types of genetic exchange are modulated by restriction and modification (RM) or CRISPR-Cas systems. Finally, genes differentially impacted by such processes reveal functional processes potentially relevant for environmental interactions and adaptation to extremophilic conditions. Altogether, our results support scenarios that conciliate "Neutral" and "Constant Diversity" models of bacterial evolution.

RevDate: 2018-07-31

Springer NM, Anderson SN, Andorf CM, et al (2018)

The maize W22 genome provides a foundation for functional genomics and transposon biology.

Nature genetics pii:10.1038/s41588-018-0158-0 [Epub ahead of print].

The maize W22 inbred has served as a platform for maize genetics since the mid twentieth century. To streamline maize genome analyses, we have sequenced and de novo assembled a W22 reference genome using short-read sequencing technologies. We show that significant structural heterogeneity exists in comparison to the B73 reference genome at multiple scales, from transposon composition and copy number variation to single-nucleotide polymorphisms. The generation of this reference genome enables accurate placement of thousands of Mutator (Mu) and Dissociation (Ds) transposable element insertions for reverse and forward genetics studies. Annotation of the genome has been achieved using RNA-seq analysis, differential nuclease sensitivity profiling and bisulfite sequencing to map open reading frames, open chromatin sites and DNA methylation profiles, respectively. Collectively, the resources developed here integrate W22 as a community reference genome for functional genomics and provide a foundation for the maize pan-genome.

RevDate: 2018-07-29

Wolf IR, Paschoal AR, Quiroga C, et al (2018)

Functional annotation and distribution overview of RNA families in 27 Streptococcus agalactiae genomes.

BMC genomics, 19(1):556 pii:10.1186/s12864-018-4951-z.

BACKGROUND: Streptococcus agalactiae, also known as Group B Streptococcus (GBS), is a Gram-positive bacterium that colonizes the gastrointestinal and genitourinary tract of humans. This bacterium has also been isolated from various animals, such as fish and cattle. Non-coding RNAs (ncRNAs) can act as regulators of gene expression in bacteria, such as Streptococcus pneumoniae and Streptococcus pyogenes. However, little is known about the genomic distribution of ncRNAs and RNA families in S. agalactiae.

RESULTS: Comparative genome analysis of 27 S. agalactiae strains showed more than 5 thousand genomic regions identified and classified as Core, Exclusive, and Shared genome sequences. We identified 27 to 89 RNA families per genome distributed over these regions; of these, 25 were in Core regions, while Shared and Exclusive regions showed variation among strains. We propose that the amount and type of ncRNA present in each genome can provide a pattern that contributes to the identification of clonal types.

CONCLUSIONS: The identification of RNA families provides insight into the function of ncRNAs, sRNAs and ribozymes, which can be further explored as targets for antibiotic development or studied in gene regulation of cellular processes. RNA families could be considered as markers to determine infection capabilities of different strains. Lastly, pan-genome analysis of GBS including the full range of functional transcripts provides a broader approach to understanding this pathogen.

RevDate: 2018-07-27

Luo Y, Cheng Y, Yi J, et al (2018)

Complete Genome Sequence of Industrial Biocontrol Strain Paenibacillus polymyxa HY96-2 and Further Analysis of Its Biocontrol Mechanism.

Frontiers in microbiology, 9:1520.

Paenibacillus polymyxa (formerly known as Bacillus polymyxa) has been extensively studied for agricultural applications as a plant-growth-promoting rhizobacterium and is also an important biocontrol agent. Our team has developed the P. polymyxa strain HY96-2 from the tomato rhizosphere as the first microbial biopesticide based on P. polymyxa for controlling plant diseases around the world, leading to the commercialization of this microbial biopesticide in China. However, further research is essential for understanding its precise biocontrol mechanisms. In this paper, we report the complete genome sequence of HY96-2 and the results of a comparative genomic analysis between different P. polymyxa strains. The complete genome size of HY96-2 was found to be 5.75 Mb and 5207 coding sequences were predicted. HY96-2 was compared with seven other P. polymyxa strains for which complete genome sequences have been published, using phylogenetic tree, pan-genome, and nucleic acid co-linearity analysis. In addition, the genes and gene clusters involved in biofilm formation, antibiotic synthesis, and systemic resistance inducer production were compared between strain HY96-2 and two other strains, namely, SC2 and E681. The results revealed that all three of the P. polymyxa strains have the ability to control plant diseases via the mechanisms of colonization (biofilm formation), antagonism (antibiotic production), and induced resistance (systemic resistance inducer production). However, the variation of the corresponding genes or gene clusters between the three strains may lead to different antimicrobial spectra and biocontrol efficacies. Two possible pathways of biofilm formation in P. polymyxa were reported for the first time after searching the KEGG database. This study provides a scientific basis for the further optimization of the field applications and quality standards of industrial microbial biopesticides based on HY96-2. 
It may also serve as a reference for studying the differences in antimicrobial spectra and biocontrol capability between different biocontrol agents.

RevDate: 2018-07-25

Aherfi S, Andreani J, Baptiste E, et al (2018)

A Large Open Pangenome and a Small Core Genome for Giant Pandoraviruses.

Frontiers in microbiology, 9:1486.

Giant viruses of amoebae are distinct from classical viruses by the giant size of their virions and genomes. Pandoraviruses are the record holders in size of genomes and number of predicted genes. Three strains, P. salinus, P. dulcis, and P. inopinatum, have been described to date. We isolated three new ones, namely P. massiliensis, P. braziliensis, and P. pampulha, from environmental samples collected in Brazil. We describe here their genomes, the transcriptome and proteome of P. massiliensis, and the pangenome of the group encompassing the six pandoravirus isolates. Genome sequencing was performed with an Illumina MiSeq instrument. Genome annotation was performed using the GeneMarkS and Prodigal software and comparative genomic analyses. The core genome and pangenome were determined using notably the ProteinOrtho and CD-HIT programs. Transcriptomics was performed for P. massiliensis with the Illumina MiSeq instrument; proteomics was also performed for this virus using 1D/2D gel electrophoresis and mass spectrometry on a Synapt G2Si Q-TOF traveling wave mobility spectrometer. The genomes of the three new pandoraviruses range between 1.6 and 1.8 Mbp. The genomes of P. massiliensis, P. pampulha, and P. braziliensis were predicted to harbor 1,414, 2,368, and 2,696 genes, respectively. These genes comprise up to 67% of ORFans. Phylogenomic analyses showed that P. massiliensis and P. braziliensis were more closely related to each other than to the other pandoraviruses. The core genome of pandoraviruses comprises 352 clusters of genes, and the ratio core genome/pangenome is less than 0.05. The extinction curve shows clearly that the pangenome is still open. A quarter of the gene content of P. massiliensis was detected by transcriptomics. In addition, products of a total of 162 open reading frames were found by proteomic analysis of P. massiliensis virions, including notably the products of 28 ORFans, 99 hypothetical proteins, and 90 core genes. Further analyses should allow us to gain a better knowledge and understanding of the evolution and origin of these giant pandoraviruses, and of their relationships with viruses and cellular microorganisms.
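
The two figures quoted for the core genome (352 clusters, and a core genome/pangenome ratio below 0.05) together imply a lower bound on the pangenome size. The short derivation below only rearranges those two numbers; the symbols C and P are notation introduced here, not the authors', and the abstract does not state the pangenome size itself.

```latex
% C = number of core gene clusters, P = number of pangenome clusters (notation assumed here)
\[
\frac{C}{P} < 0.05, \qquad C = 352
\quad\Longrightarrow\quad
P > \frac{352}{0.05} = 7040 .
\]
```

In other words, under these figures the pangenome of the six isolates would contain more than 7,040 gene clusters.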

RevDate: 2018-07-23

Fleshman A, Mullins K, Sahl J, et al (2018)

Comparative pan-genomic analyses of Orientia tsutsugamushi reveal an exceptional model of bacterial evolution driving genomic diversity.

Microbial genomics [Epub ahead of print].

Orientia tsutsugamushi, formerly Rickettsia tsutsugamushi, is an obligate intracellular pathogen that causes scrub typhus, an underdiagnosed acute febrile disease with high morbidity. Scrub typhus is transmitted by the larval stage (chigger) of Leptotrombidium mites and is irregularly distributed across endemic regions of Asia, Australia and islands of the western Pacific Ocean. Previous work to understand population genetics in O. tsutsugamushi has been based on sub-genomic sampling methods and whole-genome characterization of two genomes. In this study, we compared 40 genomes from geographically dispersed areas and confirmed patterns of extensive homologous recombination likely driven by transposons, conjugative elements and repetitive sequences. High rates of lateral gene transfer (LGT) among O. tsutsugamushi genomes appear to have effectively eliminated a detectable clonal frame, but not our ability to infer evolutionary relationships and phylogeographical clustering. Pan-genomic comparisons using 31 082 high-quality bacterial genomes from 253 species suggest that genomic duplication in O. tsutsugamushi is almost unparalleled. Unlike other highly recombinant species where the uptake of exogenous DNA largely drives genomic diversity, the pan-genome of O. tsutsugamushi is driven by duplication and divergence. Extensive gene innovation by duplication is most commonly attributed to plants and animals and, in contrast with LGT, is thought to be only a minor evolutionary mechanism for bacteria. The near unprecedented evolutionary characteristics of O. tsutsugamushi, coupled with extensive intra-specific LGT, expand our present understanding of rapid bacterial evolutionary adaptive mechanisms.

RevDate: 2018-07-23

Zhou Z, Lundstrøm I, Tran-Dien A, et al (2018)

Pan-genome Analysis of Ancient and Modern Salmonella enterica Demonstrates Genomic Stability of the Invasive Para C Lineage for Millennia.

Current biology : CB pii:S0960-9822(18)30694-8 [Epub ahead of print].

Salmonella enterica serovar Paratyphi C causes enteric (paratyphoid) fever in humans. Its presentation can range from asymptomatic infections of the blood stream to gastrointestinal or urinary tract infection or even a fatal septicemia [1]. Paratyphi C is very rare in Europe and North America except for occasional travelers from South and East Asia or Africa, where the disease is more common [2, 3]. However, early 20th-century observations in Eastern Europe [3, 4] suggest that Paratyphi C enteric fever may once have had a wide-ranging impact on human societies. Here, we describe a draft Paratyphi C genome (Ragna) recovered from the 800-year-old skeleton (SK152) of a young woman in Trondheim, Norway. Paratyphi C sequences were recovered from her teeth and bones, suggesting that she died of enteric fever and demonstrating that these bacteria have long caused invasive salmonellosis in Europeans. Comparative analyses against modern Salmonella genome sequences revealed that Paratyphi C is a clade within the Para C lineage, which also includes serovars Choleraesuis, Typhisuis, and Lomita. Although Paratyphi C only infects humans, Choleraesuis causes septicemia in pigs and boar [5] (and occasionally humans), and Typhisuis causes epidemic swine salmonellosis (chronic paratyphoid) in domestic pigs [2, 3]. These different host specificities likely evolved in Europe over the last ∼4,000 years since the time of their most recent common ancestor (tMRCA) and are possibly associated with the differential acquisitions of two genomic islands, SPI-6 and SPI-7. The tMRCAs of these bacterial clades coincide with the timing of pig domestication in Europe [6].

RevDate: 2018-07-20

Zhong C, Han M, Yu S, et al (2018)

Pan-genome analyses of 24 Shewanella strains re-emphasize the diversification of their functions yet evolutionary dynamics of metal-reducing pathway.

Biotechnology for biofuels, 11:193 pii:1201.

Background: Shewanella strains are important dissimilatory metal-reducing bacteria which are widely distributed in diverse habitats. Despite efforts to genomically characterize Shewanella, knowledge of the molecular components, functional information and evolutionary patterns remains lacking, especially for their compatibility in the metal-reducing pathway. The increasing number of genome sequences of Shewanella strains offers a basis for pan-genome studies.

Results: A comparative pan-genome analysis was conducted to study genomic diversity and evolutionary relationships among 24 Shewanella strains. Results revealed an open pan-genome of 13,406 non-redundant genes and a core-genome of 1,878 non-redundant genes. Selective pressure acted on the invariant members of the core genome, in which purifying selection drove evolution in the housekeeping mechanisms. Shewanella strains exhibited extensive genome variability, with high levels of gene gain and loss during evolution, which affected variable gene sets and facilitated rapid evolution. Additionally, genes related to metal reduction were diversely distributed in Shewanella strains and evolved under purifying selection, which highlighted the basic conserved functionality and specificity of respiratory systems.

Conclusions: The diversity of genes present in the accessory and specific genomes of Shewanella strains indicates that each strain uses different strategies to adapt to diverse environments. Horizontal gene transfer is an important evolutionary force in shaping Shewanella genomes. Purifying selection plays an important role in the stability of the core-genome and also drives evolution in the mtr-omc cluster of different Shewanella strains.
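
The core, accessory, and strain-specific partition referred to in this abstract can be sketched from a gene presence/absence table. The example below is a minimal illustration under stated assumptions: gene clusters are taken as already computed, the function name `partition_pan_genome` and the toy strain and gene names are hypothetical, and nothing here reproduces the pipeline used in the study.

```python
def partition_pan_genome(presence):
    """Split gene clusters into core, accessory, and strain-specific sets.

    `presence` maps a gene-cluster name to the set of strains carrying it.
    Core: present in every strain. Specific: present in exactly one strain.
    Accessory: everything in between.
    """
    strains = set().union(*presence.values())
    core, accessory, specific = set(), set(), set()
    for gene, carriers in presence.items():
        if carriers == strains:
            core.add(gene)
        elif len(carriers) == 1:
            specific.add(gene)
        else:
            accessory.add(gene)
    return core, accessory, specific


# Invented toy data: three strains, four gene clusters.
toy = {
    "geneW": {"A", "B", "C"},
    "geneX": {"A", "B"},
    "geneY": {"C"},
    "geneZ": {"A", "B", "C"},
}
core, accessory, specific = partition_pan_genome(toy)
pan_size = len(core) + len(accessory) + len(specific)
print(len(core), pan_size)
```

The pan-genome size is simply the number of distinct clusters, so the three sets always sum to it.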

RevDate: 2018-07-20

Collins FWJ, Mesa-Pereira B, O'Connor PM, et al (2018)

Reincarnation of Bacteriocins From the Lactobacillus Pangenomic Graveyard.

Frontiers in microbiology, 9:1298.

Bacteria commonly produce narrow spectrum bacteriocins as a means of inhibiting closely related species competing for similar resources in an environment. The increasing availability of genomic data means that it is becoming easier to identify bacteriocins encoded within genomes. However, the presence of bacteriocin genes in a strain does not always translate into biological antimicrobial activity. For example, when analysing the Lactobacillus pangenome we identified strains encoding ten pediocin-like bacteriocin structural genes which failed to display inhibitory activity. Nine of these bacteriocins were novel whilst one was identified as the previously characterized bacteriocin "penocin A." The composition of these bacteriocin operons varied between strains, often with key components missing which are required for bacteriocin production, such as dedicated bacteriocin transporters and accessory proteins. In an effort to functionally express these bacteriocins, the structural genes for the ten pediocin homologs were cloned alongside the dedicated pediocin PA-1 transporter in both Escherichia coli and Lactobacillus paracasei heterologous hosts. Each bacteriocin was cloned with its native leader sequence and as a fusion protein with the pediocin PA-1 leader sequence. Several of these bacteriocins displayed a broader spectrum of inhibition than the original pediocin PA-1. We show how potentially valuable bacteriocins can easily be "reincarnated" from in silico data and produced in vitro despite often lacking the necessary accompanying machinery. Moreover, the study demonstrates how genomic datasets such as the Lactobacillus pangenome harbor a potential "arsenal" of antimicrobial activity with the possibility of being activated when expressed in more genetically amenable hosts.

RevDate: 2018-07-16

Holley G, Wittler R, Stoye J, et al (2018)

Dynamic Alignment-Free and Reference-Free Read Compression.

Journal of computational biology : a journal of computational molecular cell biology, 25(7):825-836.

The advent of high throughput sequencing (HTS) technologies raises a major concern about storage and transmission of data produced by these technologies. In particular, large-scale sequencing projects generate an unprecedented volume of genomic sequences ranging from tens to several thousands of genomes per species. These collections contain highly similar and redundant sequences, also known as pangenomes. The ideal way to represent and transfer pangenomes is through compression. A number of HTS-specific compression tools have been developed to reduce the storage and communication costs of HTS data, yet none of them is designed to process a pangenome. In this article, we present dynamic alignment-free and reference-free read compression (DARRC), a new alignment-free and reference-free compression method. It addresses the problem of pangenome compression by encoding the sequences of a pangenome as a guided de Bruijn graph. The novelty of this method is its ability to incrementally update DARRC archives with new genome sequences without full decompression of the archive. DARRC can compress both single-end and paired-end read sequences of any length using all symbols of the IUPAC nucleotide code. On a large Pseudomonas aeruginosa data set, our method outperforms all other tested tools. It provides a 30% compression ratio improvement in single-end mode compared with the best performing state-of-the-art HTS-specific compression method in our experiments.

RevDate: 2018-07-14

Driscoll CB, Meyer KA, Šulčius S, et al (2018)

A closely-related clade of globally distributed bloom-forming cyanobacteria within the Nostocales.

Harmful algae, 77:93-107.

In order to better understand the relationships among current Nostocales cyanobacterial blooms, eight genomes were sequenced from cultured isolates or from environmental metagenomes of recent planktonic Nostocales blooms. Phylogenomic analysis of publicly available sequences placed the new genomes among a group of 15 genomes from four continents in a distinct ADA clade (Anabaena/Dolichospermum/Aphanizomenon) within the Nostocales. This clade contains four species-level groups, two of which include members with both Anabaena-like and Aphanizomenon flos-aquae-like morphology. The genomes contain many repetitive genetic elements and a sizable pangenome, in which ABC-type transporters are highly represented. Alongside common core genes for photosynthesis, the differentiation of N2-fixing heterocysts, and the uptake and incorporation of the major nutrients P, N and S, we identified several gene pathways in the pangenome that may contribute to niche partitioning. Genes for problematic secondary metabolites (cyanotoxins and taste-and-odor compounds) were sporadically present, as were other polyketide synthase (PKS) and nonribosomal peptide synthetase (NRPS) gene clusters. By contrast, genes predicted to encode the ribosomally generated bacteriocin peptides were found in all genomes.

RevDate: 2018-07-11

Rizzi R, Cairo M, Makinen V, et al (2018)

Hardness of Covering Alignment: Phase Transition in Post-Sequence Genomics.

IEEE/ACM transactions on computational biology and bioinformatics [Epub ahead of print].

Covering alignment problems arise from recent developments in genomics; so-called pan-genome graphs are replacing reference genomes, and advances in haplotyping enable full content of diploid genomes to be used as basis of sequence analysis. In this paper, we show that the computational complexity will change for natural extensions of alignments to pan-genome representations and to diploid genomes. More broadly, our approach can also be seen as a minimal extension of sequence alignment to labeled directed acyclic graphs (labeled DAGs). Namely, we show that finding a covering alignment of two labeled DAGs is NP-hard even on binary alphabets. A covering alignment asks for two paths R1 (red) and G1 (green) in DAG D1 and two paths R2 (red) and G2 (green) in DAG D2 that cover the nodes of the graphs and maximize the sum of the global alignment scores: as(sp(R1), sp(R2)) + as(sp(G1), sp(G2)), where sp(P) is the concatenation of labels on the path P. Pair-wise alignment of haplotype sequences forming a diploid chromosome can be converted to a two-path coverable labeled DAG, and then the covering alignment models the similarity of two diploids over arbitrary recombinations. A reduction in the other direction shows that the problem is NP-hard on alphabets of size 3.
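
The objective described in this abstract, the sum of two global alignment scores over a red and a green pair of paths, can be evaluated for a given choice of paths even though finding the best choice is NP-hard. The sketch below only scores candidate paths that are supplied by hand; it does not search for an optimum. The function names, the unit scoring scheme (match +1, mismatch -1, gap -1), and the toy DAG labels are all assumptions, and each path is assumed to follow edges of its DAG.

```python
def spell(labels, path):
    """Concatenate the node labels along a path."""
    return "".join(labels[node] for node in path)


def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global (Needleman-Wunsch) alignment score with linear gap cost."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def covering_score(labels1, red1, green1, labels2, red2, green2):
    """Score one candidate covering alignment; paths must cover all nodes."""
    if set(red1) | set(green1) != set(labels1):
        raise ValueError("paths do not cover the first DAG")
    if set(red2) | set(green2) != set(labels2):
        raise ValueError("paths do not cover the second DAG")
    return (alignment_score(spell(labels1, red1), spell(labels2, red2))
            + alignment_score(spell(labels1, green1), spell(labels2, green2)))


# Toy DAGs (invented): node id -> single-character label.
dag1 = {0: "A", 1: "C", 2: "G", 3: "T"}   # edges 0->1->3 and 0->2->3
dag2 = {0: "A", 1: "C", 2: "T"}           # edges 0->1->2 and 0->2
score = covering_score(dag1, [0, 1, 3], [0, 2, 3], dag2, [0, 1, 2], [0, 2])
print(score)
```

Here the red paths spell "ACT" in both graphs and the green paths spell "AGT" and "AT", so the two pairwise scores are added to give the objective value for this one candidate.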

RevDate: 2018-07-06

Tetz G, V Tetz (2018)

Tetz's theory and law of longevity.

Theory in biosciences = Theorie in den Biowissenschaften pii:10.1007/s12064-018-0267-4 [Epub ahead of print].

Here, we present a new theory and law of longevity intended to evaluate fundamental factors that control lifespan. This theory is based on the fact that genes affecting host organism longevity are represented by subpopulations: genes of host eukaryotic cells, commensal microbiota, and non-living genetic elements. Based on Tetz's theory of longevity, we propose that lifespan and aging are defined by the accumulation of alterations over all genes of the macroorganism and microbiome and the non-living genetic elements associated with them. Tetz's law of longevity states that longevity is limited by the accumulation of alterations to the limiting value that is not compatible with life. Based on this theory and law, we also propose a novel model to calculate several parameters, including the rate of aging and the remaining lifespan of individuals. We suggest that this theory and model have explanatory and predictive potential for eukaryotic organisms, allowing the influence of diseases, medication, and medical procedures to be re-examined in relation to longevity. Such estimates also provide a framework to evaluate new fundamental aspects that control aging and lifespan.

RevDate: 2018-07-05

Choi S, Jin GD, Park J, et al (2018)

Pangenomics of Lactobacillus plantarum revealed Group-specific genomic profiles without habitat association.

Journal of microbiology and biotechnology pii:10.4014/jmb.1803.03029 [Epub ahead of print].

Lactobacillus plantarum is a lactic acid bacterium that promotes animal intestinal health as a probiotic and is found in a wide variety of habitats. Here, we investigated the genomic features of different clusters of L. plantarum strains via pan-genomic analysis. We compared the genomes of 108 L. plantarum strains that were available from the NCBI GenBank database. These genomes were 2.9-3.7 Mbp in size and 44-45% in G+C content. A total of 8,847 orthologs were collected, and 1,709 genes were identified to be shared as core genes by all the strains analyzed. On the basis of SNPs from the core genes, 108 strains were clustered into five major groups (G1-G5) that are different from previous reports and are not clearly associated with habitats. Analysis of group-specific enriched or depleted genes revealed that G1 and G2 were rich in genes for carbohydrate utilization (L-arabinose, L-rhamnose, and fructo-oligosaccharides) and that G3, G4, and G5 possessed more genes for restriction-modification systems and the MazEF toxin-antitoxin system. These results indicate that there are critical differences in gene content and survival strategies among genetically clustered L. plantarum strains, regardless of habitats.

RevDate: 2018-07-16

Lemos Junior WJF, da Silva Duarte V, Treu L, et al (2018)

Whole genome comparison of two Starmerella bacillaris strains with other wine yeasts uncovers genes involved in modulating important winemaking traits.

FEMS yeast research, 18(7):.

Starmerella bacillaris is an osmotolerant yeast with interesting winemaking traits such as low-ethanol and high-glycerol production, previously considered as wine spoilage and recently proposed to improve the sensory quality of wine. This is the first work performing a whole-genome analysis of the variants identified by comparing two S. bacillaris strains (PAS13 and FRI751). Additionally, an extensive search for orthologous genes against Saccharomyces and non-Saccharomyces yeasts produced a detailed reconstruction of the pan-genome for yeast species used in winemaking. Starmerella bacillaris PAS13 was able to produce 36% more glycerol than S. bacillaris FRI751 without increasing ethanol level over 5% (v/v). Orthologous genes revealed new insights in the response to osmotic stress determined by the mitogen-activated protein kinase (MAPK) from S. bacillaris strains. The comparison between the two S. bacillaris genomes revealed 33 771 high-quality variants that were ranked considering their predicted impact on gene functions. Furthermore, analysis of structural variations in the genome revealed five translocations. The absence of some transcriptional factors involved in the regulation of GPD (glycerol-3-phosphate dehydrogenase), like the protein kinases YpK1p and YpK2p, and the identification of a tandem duplication increasing the GPP1 (glycerol-3-phosphate phosphatase) gene copy number suggest a remarkably different regulation of the glycerol pathway for S. bacillaris in comparison to S. cerevisiae.

RevDate: 2018-07-11

Her HL, YW Wu (2018)

A pan-genome-based machine learning approach for predicting antimicrobial resistance activities of the Escherichia coli strains.

Bioinformatics (Oxford, England), 34(13):i89-i95.

Motivation: Antimicrobial resistance (AMR) is becoming a huge problem in both developed and developing countries, and identifying strains resistant or susceptible to certain antibiotics is essential in fighting against antibiotic-resistant pathogens. Whole-genome sequences have been collected for different microbial strains in order to identify crucial characteristics that allow certain strains to become resistant to antibiotics; however, a global inspection of the gene content responsible for AMR activities remains to be done.

Results: We propose a pan-genome-based approach to characterize antibiotic-resistant microbial strains and test this approach on the bacterial model organism Escherichia coli. By identifying core and accessory gene clusters and predicting AMR genes for the E. coli pan-genome, we not only showed that certain classes of genes are unevenly distributed between the core and accessory parts of the pan-genome but also demonstrated that only a portion of the identified AMR genes belong to the accessory genome. Application of machine learning algorithms to predict whether specific strains were resistant to antibiotic drugs yielded the best prediction accuracy for the set of AMR genes within the accessory part of the pan-genome, suggesting that these gene clusters were most crucial to AMR activities in E. coli. Selecting subsets of AMR genes for different antibiotic drugs based on a genetic algorithm (GA) achieved better prediction performances than the gene sets established in the literature, hinting that the gene sets selected by the GA may warrant further analysis in investigating more details about how E. coli fight against antibiotics.

Supplementary information: Supplementary data are available at Bioinformatics online.

RevDate: 2018-06-29

Gemmell MR, Berry S, Mukhopadhya I, et al (2018)

Comparative genomics of Campylobacter concisus: Analysis of clinical strains reveals genome diversity and pathogenic potential.

Emerging microbes & infections, 7(1):116 pii:10.1038/s41426-018-0118-x.

In recent years, an increasing number of Campylobacter species have been associated with human gastrointestinal (GI) diseases including gastroenteritis, inflammatory bowel disease, and colorectal cancer. Campylobacter concisus, an oral commensal historically linked to gingivitis and periodontitis, has been increasingly detected in the lower GI tract. In the present study, we generated robust genome sequence data from C. concisus strains and undertook a comprehensive pangenome assessment to identify C. concisus virulence properties and to explain potential adaptations acquired while residing in specific ecological niche(s) of the GI tract. Genomes of 53 new C. concisus strains were sequenced, assembled, and annotated including 36 strains from gastroenteritis patients, 13 strains from Crohn's disease patients and four strains from colitis patients (three collagenous colitis and one lymphocytic colitis). When compared with previously published sequences, strains clustered into two main groups/genomospecies (GS) with phylogenetic clustering explained neither by disease phenotype nor sample location. Paired oral/faecal isolates, from the same patient, indicated that there are few genetic differences between oral and gut isolates, which suggests that gut isolates most likely reflect oral strain relocation. Type IV and VI secretion systems genes, genes known to be important for pathogenicity in the Campylobacter genus, were present in the genome assemblies, with 82% containing Type VI secretion system genes. Our findings indicate that C. concisus strains are genetically diverse, and the variability in bacterial secretion system content may play an important role in their virulence potential.

RevDate: 2018-07-08

Clarke TH, Brinkac LM, Inman JM, et al (2018)

PanACEA: a bioinformatics tool for the exploration and visualization of bacterial pan-chromosomes.

BMC bioinformatics, 19(1):246 pii:10.1186/s12859-018-2250-y.

BACKGROUND: Bacterial pan-genomes, comprised of conserved and variable genes across multiple sequenced bacterial genomes, allow for identification of genomic regions that are phylogenetically discriminating or functionally important. Pan-genomes consist of large amounts of data, which can restrict researchers' ability to locate and analyze these regions. Multiple software packages are available to visualize pan-genomes, but currently their ability to address these concerns is limited by using only pre-computed data sets, prioritizing core over variable gene clusters, or not accounting for pan-chromosome positioning in the viewer.

RESULTS: We introduce PanACEA (Pan-genome Atlas with Chromosome Explorer and Analyzer), which utilizes locally-computed interactive web-pages to view ordered pan-genome data. It consists of multi-tiered, hierarchical display pages that extend from pan-chromosomes to both core and variable regions to single genes. Regions and genes are functionally annotated to allow for rapid searching and visual identification of regions of interest with the option that user-supplied genomic phylogenies and metadata can be incorporated. PanACEA's memory and time requirements are within the capacities of standard laptops. The capability of PanACEA as a research tool is demonstrated by highlighting a variable region important in differentiating strains of Enterobacter hormaechei.

CONCLUSIONS: PanACEA can rapidly translate the results of pan-chromosome programs into an intuitive and interactive visual representation. It will empower researchers to visually explore and identify regions of the pan-chromosome that are most biologically interesting, and to obtain publication quality images of these regions.

RevDate: 2018-07-08

Matey-Hernandez ML, Danish Pan Genome Consortium, Brunak S, et al (2018)

Benchmarking the HLA typing performance of Polysolver and Optitype in 50 Danish parental trios.

BMC bioinformatics, 19(1):239 pii:10.1186/s12859-018-2239-6.

BACKGROUND: The adaptive immune response intrinsically depends on hypervariable human leukocyte antigen (HLA) genes. Concomitantly, correct HLA phenotyping is crucial for successful donor-patient matching in organ transplantation. The cost and technical limitations of current laboratory techniques, together with advances in next-generation sequencing (NGS) methodologies, have increased the need for precise computational typing methods.

RESULTS: We tested two widespread HLA typing methods using high quality full genome sequencing data from 150 individuals in 50 family trios from the Genome Denmark project. First, we computed descendant accuracies assessing the agreement in the inheritance of alleles from parents to offspring. Second, we compared the locus-specific homozygosity rates as well as the allele frequencies; and we compared those to the observed values in related populations. We provide guidelines for testing the accuracy of HLA typing methods by comparing family information, which is independent of the availability of curated alleles.

CONCLUSIONS: Although current computational methods for HLA typing generally provide satisfactory results, our benchmark, which uses data with ultra-high sequencing depth, demonstrates the incompleteness of current reference databases and highlights the importance of providing genomic databases that address current sequencing standards, a problem yet to be resolved before we can benefit fully from personalised medicine approaches in which HLA phenotyping is essential.
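
The abstract above does not give the exact definition of its "descendant accuracy", so the following is only a minimal sketch of the general idea of checking HLA calls against family structure. It assumes one unphased genotype (a pair of allele names) per person per locus and no de novo events. The function names and the example genotypes are hypothetical and are not taken from the paper.

```python
def trio_consistent(father, mother, child):
    """Return True if the child's two alleles at one locus can be explained
    by inheriting one allele from each parent.

    Each genotype is an unphased pair of allele names, e.g. ("A*01:01", "A*02:01").
    """
    first, second = child
    return (first in father and second in mother) or (
        second in father and first in mother
    )


def descendant_accuracy(trios):
    """Fraction of (father, mother, child) genotype triples that are consistent.

    This is an illustrative metric, not necessarily the one used in the paper.
    """
    if not trios:
        raise ValueError("at least one trio is required")
    consistent = sum(trio_consistent(f, m, c) for f, m, c in trios)
    return consistent / len(trios)


# Made-up example calls for a single locus (illustrative only).
example_trios = [
    # Child inherits A*02:01 from the father and A*24:02 from the mother.
    (("A*01:01", "A*02:01"), ("A*03:01", "A*24:02"), ("A*02:01", "A*24:02")),
    # Child carries two paternal alleles and no maternal allele: inconsistent.
    (("A*01:01", "A*02:01"), ("A*03:01", "A*24:02"), ("A*01:01", "A*02:01")),
]
accuracy = descendant_accuracy(example_trios)
```

A check of this kind needs no curated truth set, which is the property the abstract emphasizes: an inconsistent trio signals that at least one of the three calls is wrong, although it cannot say which one.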

RevDate: 2018-06-25

Cislak A, Grabowski S, Holub J (2018)

SOPanG: online text searching over a pan-genome.

Bioinformatics (Oxford, England) pii:5043008 [Epub ahead of print].

Motivation: The many thousands of high-quality genomes available nowadays imply a shift from single genome to pan-genomic analyses. A basic algorithmic building block for such a scenario is online search over a collection of similar texts, a problem with surprisingly few solutions presented so far.

Results: We present SOPanG, a simple tool for exact pattern matching over an elastic-degenerate string, a recently proposed simplified model for the pan-genome. Thanks to bit-parallelism, it achieves pattern matching speeds above 400MB/s, more than an order of magnitude higher than that of other software.

Availability: SOPanG is available for free from:

Supplementary information: Supplementary data are available at Bioinformatics online.
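
To make the idea of bit-parallel matching over an elastic-degenerate string concrete, here is a minimal sketch. It is not SOPanG's implementation: it uses the classic Shift-And automaton, whereas the tool's name suggests a Shift-Or variant, and it assumes the text is given as a list of segments, each a list of alternative strings (an empty string stands for a deletion). All function names are hypothetical.

```python
def build_masks(pattern):
    """Map each character to a bitmask of the pattern positions where it occurs."""
    masks = {}
    for position, char in enumerate(pattern):
        masks[char] = masks.get(char, 0) | (1 << position)
    return masks


def search_elastic_degenerate(segments, pattern):
    """Return indices of segments in which some occurrence of `pattern` ends.

    Bit j of a state is set when pattern[0..j] matches a path ending at the
    current text position. States are carried across segment boundaries by
    OR-ing the final states of all variants of a segment.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    masks = build_masks(pattern)
    accept = 1 << (len(pattern) - 1)
    state_in = 0
    hits = []
    for index, variants in enumerate(segments):
        state_out = 0
        found = False
        for variant in variants:
            state = state_in  # every variant starts from the same incoming state
            for char in variant:
                state = ((state << 1) | 1) & masks.get(char, 0)
                if state & accept:
                    found = True
            state_out |= state  # an empty variant passes state_in through
        if found:
            hits.append(index)
        state_in = state_out
    return hits


# Toy text: "AC", then one of "G", "T" or a deletion, then "TA".
toy_text = [["AC"], ["G", "T", ""], ["TA"]]
ends_for_cgt = search_elastic_degenerate(toy_text, "CGT")
ends_for_tg = search_elastic_degenerate(toy_text, "TG")
```

Python integers are unbounded, so the sketch handles patterns of any length. A tuned implementation would instead pack the state into machine words, which is where the speed of bit-parallel methods comes from.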

RevDate: 2018-06-27

Zhang X, Liu Z, Wei G, et al (2018)

In Silico Genome-Wide Analysis Reveals the Potential Links Between Core Genome of Acidithiobacillus thiooxidans and Its Autotrophic Lifestyle.

Frontiers in microbiology, 9:1255.

The term "pan-genome" was first introduced in 2005 to describe the entire gene repertoire of a given species. The core genome consists of genes shared by all bacterial strains studied and is considered to encode essential functions associated with the species' basic biology and phenotypes, yet its relationship to the bacterial lifestyle of the species remains elusive. We performed a pan-genome analysis of the sulfur-oxidizing acidophile Acidithiobacillus thiooxidans as a case study to highlight the species' core genome and its relevance to the autotrophic lifestyle of bacterial species. Mathematical modeling based on the genomes of A. thiooxidans, including a novel strain ZBY isolated from a Zambian copper mine plus eight other recognized strains, was used to extrapolate the expansion of its pan-genome and suggested that the A. thiooxidans pan-genome is closed. Further investigation revealed a common set of genes, many of which were assigned to metabolic profiles, notably with respect to energy metabolism, amino acid metabolism, and carbohydrate metabolism. The predicted metabolic profiles of A. thiooxidans were characterized by the fixation of inorganic carbon, assimilation of nitrogen compounds, and aerobic oxidation of various sulfur species. Notably, several hydrogenase (H2ase)-like genes dispersed in the core genome might represent novel classes because of potential functional disparities, despite being closely related homologs of genes that code for H2ase. Overall, the findings shed light on the distinguishing features of A. thiooxidans genomes on a global scale, and extend the understanding of its conserved core genome pertaining to the autotrophic lifestyle.
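
The abstract above does not say which model was fitted, so the following is only a sketch of one commonly used approach: a Heaps'-law style power law, P(N) = kappa * N^gamma, fitted by least squares on log-transformed values. An exponent close to zero is commonly read as a pan-genome approaching closure, and a clearly positive exponent as an open one. The function name and the input numbers are hypothetical, and the data are synthetic.

```python
import math


def fit_power_law(genome_counts, pan_sizes):
    """Fit pan_size = kappa * genome_count ** gamma by least squares in log space.

    Returns (kappa, gamma). Inputs must be positive and have at least two
    distinct genome counts.
    """
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(p) for p in pan_sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx
    kappa = math.exp(mean_y - gamma * mean_x)
    return kappa, gamma


# Synthetic pan-genome sizes for 1..9 genomes, generated from a known curve
# (kappa = 3000, gamma = 0.1) purely to show that the fit recovers it.
genome_counts = list(range(1, 10))
synthetic_sizes = [3000 * n ** 0.1 for n in genome_counts]
kappa, gamma = fit_power_law(genome_counts, synthetic_sizes)
```

In practice the pan-genome size at each N is usually averaged over many random orderings of the genomes before fitting, because a single ordering gives a noisy curve.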

RevDate: 2018-07-06

Tschitschko B, Erdmann S, DeMaere MZ, et al (2018)

Genomic variation and biogeography of Antarctic haloarchaea.

Microbiome, 6(1):113 pii:10.1186/s40168-018-0495-3.

BACKGROUND: The genomes of halophilic archaea (haloarchaea) often comprise multiple replicons. Genomic variation in haloarchaea has been linked to viral infection pressure and, in the case of Antarctic communities, can be caused by intergenera gene exchange. To expand understanding of genome variation and biogeography of Antarctic haloarchaea, here we assessed genomic variation between two strains of Halorubrum lacusprofundi that were isolated from Antarctic hypersaline lakes from different regions (Vestfold Hills and Rauer Islands). To assess variation in haloarchaeal populations, including the presence of genomic islands, metagenomes from six hypersaline Antarctic lakes were characterised.

RESULTS: The sequence of the largest replicon of each Hrr. lacusprofundi strain (primary replicon) was highly conserved, while each of the strains' two smaller replicons (secondary replicons) were highly variable. Intergenera gene exchange was identified, including the sharing of a type I-B CRISPR system. Evaluation of infectivity of an Antarctic halovirus provided experimental evidence for the differential susceptibility of the strains, bolstering inferences that strain variation is important for modulating interactions with viruses. A relationship was found between genomic structuring and the location of variation within replicons and genomic islands, demonstrating that the way in which haloarchaea accommodate genomic variability relates to replicon structuring. Metagenome read and contig mapping and clustering and scaling analyses demonstrated biogeographical patterning of variation consistent with environment and distance effects. The metagenome data also demonstrated that specific haloarchaeal species dominated the hypersaline systems indicating they are endemic to Antarctica.

CONCLUSION: The study describes how genomic variation manifests in Antarctic-lake haloarchaeal communities and provides the basis for future assessments of Antarctic regional and global biogeography of haloarchaea.

RevDate: 2018-06-21

Yu J, Zhao J, Song Y, et al (2018)

Comparative Genomics of the Herbivore Gut Symbiont Lactobacillus reuteri Reveals Genetic Diversity and Lifestyle Adaptation.

Frontiers in microbiology, 9:1151.

Lactobacillus reuteri is a catalase-negative, Gram-positive, non-motile, obligately heterofermentative bacterial species that has been used as a model to describe the ecology and evolution of vertebrate gut symbionts. However, the genetic features and evolutionary strategies of L. reuteri from the gastrointestinal tract of herbivores remain unknown. Therefore, 16 L. reuteri strains isolated from goat, sheep, cow, and horse in Inner Mongolia, China were sequenced in this study. A comparative genomic approach was used to assess genetic diversity and gain insight into the distinguishing features related to the different hosts based on 21 published genomic sequences. Genome size, G + C content, and average nucleotide identity values of the L. reuteri strains from different hosts indicated that the strains have broad genetic diversity. The pan-genome of 37 L. reuteri strains contained 8,680 gene families, and the core genome contained 726 gene families. A total of 92,270 nucleotide mutation sites were discovered among 37 L. reuteri strains, and all core genes displayed a Ka/Ks ratio much lower than 1, suggesting strong purifying selective pressure (negative selection). A highly robust maximum likelihood tree based on the core genes showed that the herbivore isolates were divided into three clades; clades A and B contained most of the herbivore isolates and were more closely related to human isolates and vastly distinct from clade C. Some functional genes may be attributable to the host specificity of the herbivore, omnivore, and sourdough groups. Moreover, the numbers of genes encoding cell surface proteins and active carbohydrate enzymes were host-specific. This study provides new insight into the adaptation of L. reuteri to the intestinal habitat of herbivores, suggesting that the genomic diversity of L. reuteri from different ecological origins is closely associated with their living environment.
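
Counts such as "8,680 gene families in the pan-genome, 726 in the core" come from set operations on a table of which gene families occur in which strain. The sketch below shows those operations on toy data. It assumes gene families have already been clustered and named, which is the hard part in practice. The function name, strain labels, and family labels are hypothetical.

```python
from collections import Counter


def summarize_pangenome(families_by_strain):
    """Split gene families into pan, core, accessory and strain-specific sets.

    `families_by_strain` maps a strain name to the set of gene-family
    identifiers found in that strain.
    """
    family_sets = list(families_by_strain.values())
    if not family_sets:
        raise ValueError("at least one strain is required")
    pan = set().union(*family_sets)            # present in at least one strain
    core = set.intersection(*family_sets)      # present in every strain
    occurrences = Counter(f for families in family_sets for f in families)
    unique = {f for f, n in occurrences.items() if n == 1}  # exactly one strain
    return {
        "pan": pan,
        "core": core,
        "accessory": pan - core,
        "unique": unique,
    }


# Toy presence/absence data (illustrative only).
toy_strains = {
    "strain_1": {"famA", "famB", "famC"},
    "strain_2": {"famA", "famB", "famD"},
    "strain_3": {"famA", "famC", "famE"},
}
summary = summarize_pangenome(toy_strains)
```

With a single strain the core and pan sets coincide and every family counts as unique, so the split is only informative once several genomes are compared.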

RevDate: 2018-06-19

Sibbesen JA, Maretty L, Danish Pan-Genome Consortium, et al (2018)

Accurate genotyping across variant classes and lengths using variant graphs.

Nature genetics pii:10.1038/s41588-018-0145-5 [Epub ahead of print].

Genotype estimates from short-read sequencing data are typically based on the alignment of reads to a linear reference, but reads originating from more complex variants (for example, structural variants) often align poorly, resulting in biased genotype estimates. This bias can be mitigated by first collecting a set of candidate variants across discovery methods, individuals and databases, and then realigning the reads to the variants and reference simultaneously. However, this realignment problem has proved computationally difficult. Here, we present a new method (BayesTyper) that uses exact alignment of read k-mers to a graph representation of the reference and variants to efficiently perform unbiased, probabilistic genotyping across the variation spectrum. We demonstrate that BayesTyper generally provides superior variant sensitivity and genotyping accuracy relative to existing methods when used to integrate variants across discovery approaches and individuals. Finally, we demonstrate that including a 'variation-prior' database containing already known variants significantly improves sensitivity.

RevDate: 2018-06-19

Kawasaki M, Delamare-Deboutteville J, Bowater RO, et al (2018)

Microevolution of aquatic Streptococcus agalactiae ST-261 from Australia indicates dissemination via imported tilapia and ongoing adaptation to marine hosts or environment.

Applied and environmental microbiology pii:AEM.00859-18 [Epub ahead of print].

Streptococcus agalactiae (GBS) causes disease in a wide range of animals. The serotype Ib lineage is highly adapted to aquatic hosts, exhibiting substantial genome reduction compared with terrestrial conspecifics. Here we sequence genomes from 40 GBS isolates including 25 from wild fish and captive stingrays in Australia, six local veterinary or human clinical isolates, and nine isolates from farmed tilapia in Honduras and compare with 42 genomes from public databases. Phylogenetic analysis based on non-recombinant core genome SNPs indicated that aquatic serotype Ib isolates from Queensland were distantly related to local veterinary and human clinical isolates. In contrast, Australian aquatic isolates are most closely related to a tilapia isolate from Israel, differing by only 63 core-genome SNPs. A consensus minimum spanning tree based on core genome SNPs indicates dissemination of ST-261 from an ancestral tilapia strain, which is congruent with several introductions of tilapia into Australia from Israel during the 1970s and 1980s. Pan-genome analysis identified 1,440 genes as core with the majority being dispensable or strain-specific with non-protein-coding intergenic regions (IGRs) divided amongst core and strain-specific genes. Aquatic serotype Ib strains have lost many virulence factors during adaptation, but six adhesins were well conserved across the aquatic isolates and might be critical for virulence in fish and targets for vaccine development. The close relationship amongst recent ST-261 isolates from Ghana, USA and China with the Israeli tilapia isolate from 1988 implicates the global trade in tilapia seed for aquaculture in the widespread dissemination of serotype Ib fish-adapted GBS.

IMPORTANCE: Streptococcus agalactiae (GBS) is a significant pathogen of humans and animals. Some lineages have become adapted to particular hosts and serotype Ib is highly specialized to fish. Here we show that this lineage is likely to have been distributed widely by the global trade in tilapia for aquaculture, with probable introduction into Australia in the 1970s and subsequent dissemination in wild fish populations. We report variability in the polysaccharide capsule amongst this lineage, but identify a cohort of common surface proteins that may be a focus of future vaccine development to reduce the biosecurity risk in international fish trade.

RevDate: 2018-07-11

Rodriguez-R LM, Gunturu S, Harvey WT, et al (2018)

The Microbial Genomes Atlas (MiGA) webserver: taxonomic and gene diversity analysis of Archaea and Bacteria at the whole genome level.

Nucleic acids research, 46(W1):W282-W288.

The small subunit ribosomal RNA gene (16S rRNA) has been successfully used to catalogue and study the diversity of prokaryotic species and communities but it offers limited resolution at the species and finer levels, and cannot represent the whole-genome diversity and fluidity. To overcome these limitations, we introduced the Microbial Genomes Atlas (MiGA), a webserver that allows the classification of an unknown query genomic sequence, complete or partial, against all taxonomically classified taxa with available genome sequences, as well as comparisons to other related genomes including uncultivated ones, based on the genome-aggregate Average Nucleotide and Amino Acid Identity (ANI/AAI) concepts. MiGA integrates best practices in sequence quality trimming and assembly and allows input to be raw reads or assemblies from isolate genomes, single-cell sequences, and metagenome-assembled genomes (MAGs). Further, MiGA can take as input hundreds of closely related genomes of the same or closely related species (a so-called 'Clade Project') to assess their gene content diversity and evolutionary relationships, and calculate important clade properties such as the pangenome and core gene sets. Therefore, MiGA is expected to facilitate a range of genome-based taxonomic and diversity studies, and quality assessment across environmental and clinical settings. MiGA is available at

RevDate: 2018-06-22

Mahfouz N, Caucci S, Achatz E, et al (2018)

High genomic diversity of multi-drug resistant wastewater Escherichia coli.

Scientific reports, 8(1):8928 pii:10.1038/s41598-018-27292-6.

Wastewater treatment plants play an important role in the emergence of antibiotic resistance. They provide a hot spot for exchange of resistance within and between species. Here, we analyse and quantify the genomic diversity of the indicator Escherichia coli in a German wastewater treatment plant and we relate it to isolates' antibiotic resistance. Our results show a surprisingly large pan-genome, which mirrors how rich an environment a treatment plant is. We link the genomic analysis to a phenotypic resistance screen and pinpoint genomic hot spots, which correlate with a resistance phenotype. Besides well-known resistance genes, this forward genomics approach identifies many novel genes that correlate with resistance, some of which are completely unknown. A surprising overall finding of our analyses is that we do not see any difference in resistance and pan-genome size between isolates taken from the inflow of the treatment plant and from the outflow. This means that while treatment plants reduce the amount of bacteria released into the environment, they do not reduce the potential for antibiotic resistance of these bacteria.

RevDate: 2018-06-15

Legendre M, Fabre E, Poirot O, et al (2018)

Diversity and evolution of the emerging Pandoraviridae family.

Nature communications, 9(1):2285 pii:10.1038/s41467-018-04698-4.

With DNA genomes reaching 2.5 Mb packed in particles of bacterium-like shape and dimension, the first two Acanthamoeba-infecting pandoraviruses remained up to now the most complex viruses since their discovery in 2013. Our isolation of three new strains from distant locations and environments is now used to perform the first comparative genomics analysis of the emerging worldwide-distributed Pandoraviridae family. Thorough annotation of the genomes combining transcriptomic, proteomic, and bioinformatic analyses reveals many non-coding transcripts and significantly reduces the former set of predicted protein-coding genes. Here we show that the pandoraviruses exhibit an open pan-genome, the enormous size of which is not adequately explained by gene duplications or horizontal transfers. As most of the strain-specific genes have no extant homolog and exhibit statistical features comparable to intergenic regions, we suggest that de novo gene creation could contribute to the evolution of the giant pandoravirus genomes.

RevDate: 2018-06-27

Fang X, Monk JM, Mih N, et al (2018)

Escherichia coli B2 strains prevalent in inflammatory bowel disease patients have distinct metabolic capabilities that enable colonization of intestinal mucosa.

BMC systems biology, 12(1):66 pii:10.1186/s12918-018-0587-5.

BACKGROUND: Escherichia coli is considered a leading bacterial trigger of inflammatory bowel disease (IBD). E. coli isolates from IBD patients primarily belong to phylogroup B2. Previous studies have focused on broad comparative genomic analysis of E. coli B2 isolates, and identified virulence factors that allow B2 strains to reside within human intestinal mucosa. Metabolic capabilities of E. coli strains have been shown to be related to their colonization site, but remain unexplored in IBD-associated strains.

RESULTS: In this study, we utilized pan-genome analysis and genome-scale models (GEMs) of metabolism to study metabolic capabilities of IBD-associated E. coli B2 strains. The study yielded three results: i) Pan-genome analysis of 110 E. coli strains (including 53 isolates from IBD studies) revealed discriminating metabolic genes between B2 strains and other strains; ii) Both comparative genomic analysis and GEMs suggested that B2 strains have an advantage in degrading and utilizing sugars derived from mucus glycan; and iii) GEMs revealed distinct metabolic features in B2 strains that potentially allow them to utilize energy more efficiently. For example, B2 strains lack the enzymes to degrade Amadori products, but instead rely on neighboring bacteria to convert these substrates into a more readily usable and potentially less sought-after product.

CONCLUSIONS: Taken together, these results suggest that the metabolic capabilities of B2 strains vary significantly from those of other strains, enabling B2 strains to colonize intestinal mucosa. The results from this study motivate a broad experimental assessment of the nutritional effects on E. coli B2 pathophysiology in IBD patients.

RevDate: 2018-06-08

de Moraes MH, Soto EB, Salas González I, et al (2018)

Genome-Wide Comparative Functional Analyses Reveal Adaptations of Salmonella sv. Newport to a Plant Colonization Lifestyle.

Frontiers in microbiology, 9:877.

Outbreaks of salmonellosis linked to the consumption of vegetables have been disproportionately associated with strains of serovar Newport. We tested the hypothesis that strains of sv. Newport have evolved unique adaptations to persistence in plants that are not shared by strains of other Salmonella serovars. We used a genome-wide mutant screen to compare growth in tomato fruit of a sv. Newport strain from an outbreak traced to tomatoes, and a sv. Typhimurium strain from animals. Most genes in the sv. Newport strain that were selected during persistence in tomatoes were shared with, and similarly selected in, the sv. Typhimurium strain. Many of their functions are linked to central metabolism, including amino acid biosynthetic pathways, iron acquisition, and maintenance of cell structure. One exception was a greater need for the core genes involved in purine metabolism in sv. Typhimurium than in sv. Newport. We discovered a gene, papA, that was unique to sv. Newport and contributed to the strain's fitness in tomatoes. The papA gene was present in about 25% of sv. Newport Group III genomes and generally absent from other Salmonella genomes. Homologs of papA were detected in the genomes of Pantoea, Dickeya, and Pectobacterium, members of the Enterobacteriaceae family that can colonize both plants and animals.

RevDate: 2018-06-08

Adamek M, Alanjary M, Sales-Ortells H, et al (2018)

Comparative genomics reveals phylogenetic distribution patterns of secondary metabolites in Amycolatopsis species.

BMC genomics, 19(1):426 pii:10.1186/s12864-018-4809-4.

BACKGROUND: Genome mining tools have enabled us to predict biosynthetic gene clusters that might encode compounds with valuable functions for industrial and medical applications. With the continuously increasing number of genomes sequenced, we are confronted with an overwhelming number of predicted clusters. In order to guide the effective prioritization of biosynthetic gene clusters towards finding the most promising compounds, knowledge about diversity, phylogenetic relationships and distribution patterns of biosynthetic gene clusters is necessary.

RESULTS: Here, we provide a comprehensive analysis of the model actinobacterial genus Amycolatopsis and its potential for the production of secondary metabolites. A phylogenetic characterization, together with a pan-genome analysis, showed that within this highly diverse genus, four major lineages could be distinguished which differed in their potential to produce secondary metabolites. Furthermore, we were able to distinguish gene cluster families whose distribution correlated with phylogeny, indicating that vertical gene transfer plays a major role in the evolution of secondary metabolite gene clusters. Still, the vast majority of the diverse biosynthetic gene clusters were derived from clusters unique to the genus, and also unique in comparison to a database of known compounds. Our study on the locations of biosynthetic gene clusters in the genomes of Amycolatopsis strains showed that clusters acquired by horizontal gene transfer tend to be incorporated into non-conserved regions of the genome, thereby allowing us to distinguish core and hypervariable regions in Amycolatopsis genomes.

CONCLUSIONS: Using a comparative genomics approach, it was possible to determine the potential of the genus Amycolatopsis to produce a huge diversity of secondary metabolites. Furthermore, the analysis demonstrates that horizontal and vertical gene transfer play an important role in the acquisition and maintenance of valuable secondary metabolites. Our results cast light on the interconnections between secondary metabolite gene clusters and provide a way to prioritize biosynthetic pathways in the search and discovery of novel compounds.

RevDate: 2018-06-02

Zhao Q, Feng Q, Lu H, et al (2018)

Publisher Correction: Pan-genome analysis highlights the extent of genomic variation in cultivated and wild rice.

When published, this article did not initially appear open access. This error has been corrected, and the open access status of the paper is noted in all versions of the paper.

RevDate: 2018-07-17

Angermeyer A, Das MM, Singh DV, et al (2018)

Analysis of 19 Highly Conserved Vibrio cholerae Bacteriophages Isolated from Environmental and Patient Sources Over a Twelve-Year Period.

Viruses, 10(6): pii:v10060299.

The Vibrio cholerae biotype "El Tor" is responsible for all of the current epidemic and endemic cholera outbreaks worldwide. These outbreaks are clonal, and it is hypothesized that they originate from the coastal areas near the Bay of Bengal, where the lytic bacteriophage ICP1 (International Centre for Diarrhoeal Disease Research, Bangladesh cholera phage 1) specifically preys upon these pathogenic outbreak strains. ICP1 has also been the dominant bacteriophage found in cholera patient stools since 2001. However, little is known about the genomic differences between the ICP1 strains that have been collected over time. Here, we elucidate the pan-genome and the phylogeny of the ICP1 strains by aligning, annotating, and analyzing the genomes of 19 distinct isolates that were collected between 2001 and 2012. Our results reveal that the ICP1 isolates are highly conserved and possess a large core-genome as well as a smaller, somewhat flexible accessory-genome. Despite its overall conservation, ICP1 strains have managed to acquire a number of unknown genes, as well as a CRISPR-Cas system which is known to be critical for its ongoing struggle for co-evolutionary dominance over its host. This study describes a foundation on which to construct future molecular and bioinformatic studies of these V. cholerae-associated bacteriophages.

RevDate: 2018-06-07

Bulagonda EP, Manivannan B, Mahalingam N, et al (2018)

Comparative genomic analysis of a naturally competent Elizabethkingia anophelis isolated from an eye infection.

Scientific reports, 8(1):8447 pii:10.1038/s41598-018-26874-8.

Elizabethkingia anophelis has now emerged as an opportunistic human pathogen. However, its mechanisms of transmission remain unexplained. Comparative genomic (CG) analysis of an E. anophelis endophthalmitis strain, unexpectedly isolated from a patient with an eye infection, against twenty-five other E. anophelis genomes revealed its potential to participate in horizontal gene transfer. CG analysis revealed that the study isolate has an open pan genome and has undergone extensive gene rearrangements. We demonstrate that the strain is naturally competent, a property not hitherto reported in any member of Elizabethkingia. The presence of competence-related genes, mobile genetic elements, Type IV and VI secretory systems and a unique virulence factor, arylsulfatase, suggests a different lineage of the strain. Deciphering the genome of E. anophelis, which has a reservoir of antibiotic resistance genes and virulence factors associated with diverse human infections, may open up avenues to deal with the myriad of its human infections and devise strategies to combat the pathogen.

RevDate: 2018-06-03

Oyedara OO, Segura-Cabrera A, Guo X, et al (2018)

Whole-Genome Sequencing and Comparative Genome Analysis Provided Insight into the Predatory Features and Genetic Diversity of Two Bdellovibrio Species Isolated from Soil.

International journal of genomics, 2018:9402073.

Bdellovibrio spp. are predatory bacteria with great potential as antimicrobial agents. Studies have shown that members of the genus Bdellovibrio exhibit peculiar characteristics that influence their ecological adaptations. In this study, whole genomes of two different Bdellovibrio spp. designated SKB1291214 and SSB218315 isolated from soil were sequenced. A total of 795 core genes were shared by all the Bdellovibrio spp. considered for the pangenome analysis, including the epibiotic B. exovorus. The number of unique genes identified in Bdellovibrio spp. SKB1291214, SSB218315, W, and B. exovorus JJS was 1343, 113, 857, and 1572, respectively. These unique genes encode hydrolytic, chemotaxis, and transporter proteins which might be useful for predation in the Bdellovibrio strains. Furthermore, the two Bdellovibrio strains exhibited differences based on the % GC content, amino acid identity, and 16S rRNA gene sequence. The 16S rRNA gene sequence of Bdellovibrio sp. SKB1291214 shared 99% identity with that of an uncultured Bdellovibrio sp. clone 12L 106 (a pairwise distance of 0.008) and 95-97% identity (a pairwise distance of 0.043) with that of other culturable terrestrial Bdellovibrio spp., including strain SSB218315. In Bdellovibrio sp. SKB1291214, a 174 bp sequence was inserted at the host interaction (hit) locus region usually attributed to prey attachment, invasion, and development of host-independent Bdellovibrio phenotypes. Also, a gene equivalent to Bd0108 in B. bacteriovorus HD100 was not conserved in Bdellovibrio sp. SKB1291214. The results of this study provided information on the genetic characteristics and diversity of the genus Bdellovibrio that can contribute to their successful application as biocontrol agents.

RevDate: 2018-05-28

Gias E, Brosnahan CL, Orr D, et al (2018)

In vivo growth and genomic characterization of rickettsia-like organisms isolated from farmed Chinook salmon (Oncorhynchus tshawytscha) in New Zealand.

Journal of fish diseases [Epub ahead of print].

A rickettsia-like organism, designated NZ-RLO2, was isolated from Chinook salmon (Oncorhynchus tshawytscha) farmed in the South Island, New Zealand. In vivo growth showed NZ-RLO2 was able to grow in CHSE-214, EPC, BHK-21, C6/36 and Sf21 cell lines, while Piscirickettsia salmonis LF-89T grew in all but BHK-21 and Sf21. NZ-RLO2 grew optimally in EPC at 15°C and in CHSE-214 and EPC at 18°C. The growth of LF-89T was optimal at 15°C, 18°C and 22°C in CHSE-214, but appeared less efficient in EPC cells at all temperatures. Pan-genome comparison of predicted proteomes shows that available Chilean strains of P. salmonis grouped into two clusters (p-value = 94%). NZ-RLO2 was genetically different from previously described NZ-RLO1, and both strains grouped separately from the Chilean strains in one of the two clusters (p-value = 88%), but were closely related to each other. TaqMan and Sybr Green real-time PCR targeting RNA polymerase (rpoB) and DNA primase (dnaG), respectively, were developed to detect NZ-RLO2. This study indicates that the New Zealand strains showed a closer genetic relationship to one of the Chilean P. salmonis clusters; however, more Piscirickettsia genomes from wider geographical regions and diverse hosts are needed to better understand the classification within this genus.

RevDate: 2018-06-22
CmpDate: 2018-06-22

Hurtado R, Carhuaricra D, Soares S, et al (2018)

Pan-genomic approach shows insight of genetic divergence and pathogenic-adaptation of Pasteurella multocida.

Gene, 670:193-206.

Pasteurella multocida is a gram-negative, non-motile bacterial pathogen, which is associated with chronic and acute infections such as snuffles, pneumonia, atrophic rhinitis, fowl cholera and hemorrhagic septicemia. These diseases affect a wide range of domestic animals, leading to significant morbidity and mortality and causing significant economic losses worldwide. Due to the interest in deciphering the genetic diversity and adaptive processes among P. multocida strains, this work aimed to perform a pan-genome analysis to find evidence of horizontal gene transfer and positive selection among 23 P. multocida strains isolated from distinct diseases and hosts. The results revealed an open pan-genome containing 3585 genes and an accessory genome presenting 1200 genes. The phylogenomic analysis based on the presence/absence of genes and islands exhibits high levels of plasticity, which reflects a high intraspecific diversity and a possible adaptive mechanism responsible for the specific disease manifestation between the established groups (pneumonia, fowl cholera, hemorrhagic septicemia and snuffles). Additionally, we identified differences in accessory genes among groups, which are involved in sugar metabolism and transport systems, virulence-related genes and a high concentration of hypothetical proteins. However, there was no specific indispensable functional mechanism to decisively correlate the presence of genes and their adaptation to a specific host/disease. Also, positive selection was found only for two genes from sub-group hemorrhagic septicemia, serotype B. This comprehensive comparative genome analysis will provide new insights into horizontal gene transfers that play an essential role in the diversification and adaptation mechanisms of P. multocida to a specific disease.

RevDate: 2018-05-25

Abreu VAC, Popin RV, Alvarenga DO, et al (2018)

Corrigendum: Genomic and Genotypic Characterization of Cylindrospermopsis raciborskii: Toward an Intraspecific Phylogenetic Evaluation by Comparative Genomics.

Frontiers in microbiology, 9:979.

[This corrects the article on p. 306 in vol. 9, PMID: 29535689.].

RevDate: 2018-07-05
CmpDate: 2018-07-05

Jiao J, Ni M, Zhang B, et al (2018)

Coordinated regulation of core and accessory genes in the multipartite genome of Sinorhizobium fredii.

PLoS genetics, 14(5):e1007428 pii:PGENETICS-D-18-00237.

Prokaryotes benefit from having accessory genes, but it is unclear how accessory genes can be linked with the core regulatory network when developing adaptations to new niches. Here we determined hierarchical core/accessory subsets in the multipartite pangenome (composed of genes from the chromosome, chromid and plasmids) of the soybean microsymbiont Sinorhizobium fredii by comparing twelve Sinorhizobium genomes. Transcriptomes of two S. fredii strains at mid-log and stationary growth phases and in symbiotic conditions were obtained. The average level of gene expression, variation of expression between different conditions, and gene connectivity within the co-expression network were positively correlated with the gene conservation level from strain-specific accessory genes to genus core. Condition-dependent transcriptomes exhibited adaptive transcriptional changes in pangenome subsets shared by the two strains, while strain-dependent transcriptomes were enriched with accessory genes on the chromid. Proportionally more chromid genes than plasmid genes were co-expressed with chromosomal genes, while plasmid genes had a higher within-replicon connectivity in expression than chromid ones. However, key nitrogen fixation genes on the symbiosis plasmid were characterized by high connectivity in both within- and between-replicon analyses. Among those genes with host-specific upregulation patterns, chromosomal znu and mdt operons, encoding a conserved high-affinity zinc transporter and an accessory multi-drug efflux system, respectively, were experimentally demonstrated to be involved in host-specific symbiotic adaptation. These findings highlight the importance of integrative regulation of hierarchical core/accessory components in the multipartite genome of bacteria during niche adaptation and in shaping the prokaryotic pangenome in the long run.

RevDate: 2018-06-27

Satti M, Tanizawa Y, Endo A, et al (2018)

Comparative analysis of probiotic bacteria based on a new definition of core genome.

Journal of bioinformatics and computational biology, 16(3):1840012.

The commensal genus Bifidobacterium has probiotic properties. We prepared a public library of the gene functions of the genus Bifidobacterium for its online annotation. Orthologous gene cluster analysis showed that the pan genomes of Bifidobacterium and Lactobacillus exhibit striking similarities when mapped to the Clusters of Orthologous Group (COG) database of proteins. When the core genes in each genus were selected based on our statistical definition of "core genome", core genes were present in at least 92% of 52 Bifidobacterium and in 97% of 178 Lactobacillus genomes. Functional comparison of the core genes of the two genera revealed a significant difference in the categories "amino acid transport and metabolism" representing their difference in niche specificity. Over-represented Bifidobacterium protein families were primarily involved in host interactions, the complex compound metabolism, and in stress responses. These findings coincide with the published information and validate our bias-resilient definition of the core genome.
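
The abstract above reports core genes present in at least 92% of 52 Bifidobacterium genomes, rather than in every genome. The authors' statistical definition is not given here, so the following is only a minimal sketch of the simpler idea of a fixed-fraction ("relaxed") core genome. The function name, the toy genomes and the gene labels are all hypothetical.

```python
from math import ceil


def threshold_core(presence, min_fraction):
    """Return gene families present in at least min_fraction of the genomes.

    presence maps a genome name to the set of gene-family IDs it carries.
    This is a plain frequency cut-off, not the statistical definition
    used by Satti et al. (an assumption made for illustration).
    """
    n_genomes = len(presence)
    # Smallest number of genomes that satisfies the requested fraction.
    needed = ceil(min_fraction * n_genomes)
    counts = {}
    for genes in presence.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    return {gene for gene, n in counts.items() if n >= needed}


# Hypothetical toy data: four genomes, four gene families.
toy = {
    "genome1": {"A", "B", "C"},
    "genome2": {"A", "B"},
    "genome3": {"A", "C"},
    "genome4": {"A", "B", "D"},
}

strict_core = threshold_core(toy, 1.0)    # must be in all four genomes
relaxed_core = threshold_core(toy, 0.75)  # must be in at least three genomes
print(sorted(strict_core), sorted(relaxed_core))
```

With a cut-off of 0.92 and 52 genomes, this rule would require a gene family to occur in at least 48 genomes.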

RevDate: 2018-05-25

Lacey JA, Allnutt TR, Vezina B, et al (2018)

Whole genome analysis reveals the diversity and evolutionary relationships between necrotic enteritis-causing strains of Clostridium perfringens.

BMC genomics, 19(1):379 pii:10.1186/s12864-018-4771-1.

BACKGROUND: Clostridium perfringens causes a range of diseases in animals and humans including necrotic enteritis in chickens and food poisoning and gas gangrene in humans. Necrotic enteritis is of concern in commercial chicken production due to the cost of the implementation of infection control measures and to productivity losses. This study has focused on the genomic analysis of a range of chicken-derived C. perfringens isolates, from around the world and from different years. The genomes were sequenced and compared with 20 genomes available from public databases, which were from a diverse collection of isolates from chickens, other animals, and humans. We used a distance based phylogeny that was constructed based on gene content rather than sequence identity. Similarity between strains was defined as the number of genes that they have in common divided by their total number of genes. In this type of phylogenetic analysis, evolutionary distance can be interpreted in terms of evolutionary events such as acquisition and loss of genes, whereas the underlying properties (the gene content) can be interpreted in terms of function. We also compared these methods to the sequence-based phylogeny of the core genome.

RESULTS: Distinct pathogenic clades of necrotic enteritis-causing C. perfringens were identified. They were characterised by variable regions encoded on the chromosome, with predicted roles in capsule production, adhesion, inhibition of related strains, phage integration, and metabolism. Some strains have almost identical genomes, even though they were isolated from different geographic regions at various times, while other highly distant genomes appear to result in similar outcomes with regard to virulence and pathogenesis.

CONCLUSIONS: The high level of diversity in chicken isolates suggests there is no reliable factor that defines a chicken strain of C. perfringens, however, disease-causing strains can be defined by the presence of netB-encoding plasmids. This study reveals that horizontal gene transfer appears to play a significant role in genetic variation of the C. perfringens chromosome as well as the plasmid content within strains.
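
The Lacey et al. abstract defines similarity between strains as the number of genes they have in common divided by their total number of genes. A minimal sketch of that measure follows, assuming "total number of genes" means the union of the two gene sets (which makes it the Jaccard index). The abstract does not confirm this reading, and the function names and toy gene sets are hypothetical.

```python
def gene_content_similarity(genes_a, genes_b):
    """Shared genes divided by total distinct genes across both strains.

    Assumes "total" means the union of the two gene sets (Jaccard index).
    """
    total = len(genes_a | genes_b)
    if total == 0:
        return 1.0  # two empty gene sets are treated as identical
    return len(genes_a & genes_b) / total


def gene_content_distance(genes_a, genes_b):
    """Distance usable for a distance-based tree: 1 minus similarity."""
    return 1.0 - gene_content_similarity(genes_a, genes_b)


# Hypothetical strains: three shared genes, five distinct genes in total.
strain_1 = {"a", "b", "c", "d"}
strain_2 = {"a", "b", "c", "e"}

similarity = gene_content_similarity(strain_1, strain_2)
distance = gene_content_distance(strain_1, strain_2)
print(similarity, distance)
```

A matrix of such pairwise distances could then be passed to any distance-based clustering method; which method the authors used is not stated in the abstract.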

RevDate: 2018-05-22

Kulsum U, Kapil A, Singh H, et al (2018)

NGSPanPipe: A Pipeline for Pan-genome Identification in Microbial Strains from Experimental Reads.

Advances in experimental medicine and biology, 1052:39-49.

Recent advancements in sequencing technologies have decreased both time span and cost for sequencing the whole bacterial genome. High-throughput Next-Generation Sequencing (NGS) technology has led to the generation of enormous data concerning microbial populations publicly available across various repositories. As a consequence, it has become possible to study and compare the genomes of different bacterial strains within a species or genus in terms of evolution, ecology and diversity. Studying the pan-genome provides insights into deciphering microevolution, global composition and diversity in virulence and pathogenesis of a species. It can also assist in identifying drug targets and proposing vaccine candidates. The effective analysis of these large genome datasets necessitates the development of robust tools. Current methods to develop pan-genome do not support direct input of raw reads from the sequencer machine but require preprocessing of reads as an assembled protein/gene sequence file or the binary matrix of orthologous genes/proteins. We have designed an easy-to-use integrated pipeline, NGSPanPipe, which can directly identify the pan-genome from short reads. The output from the pipeline is compatible with other pan-genome analysis tools. We evaluated our pipeline with other methods for developing pan-genome, i.e. reference-based assembly and de novo assembly using simulated reads of Mycobacterium tuberculosis. The single script pipeline is applicable for all bacterial strains. It integrates multiple in-house Perl scripts and is freely accessible.

RevDate: 2018-05-25

Kim YB, Kim JY, Song HS, et al (2018)

Novel haloarchaeon Natrinema thermophila having the highest growth temperature among haloarchaea with a large genome size.

Scientific reports, 8(1):7777 pii:10.1038/s41598-018-25887-7.

Environmental temperature is one of the most important factors for the growth and survival of microorganisms. Here we describe a novel extremely halophilic archaeon (haloarchaea) designated as strain CBA1119T isolated from solar salt. Strain CBA1119T had the highest maximum and optimal growth temperatures (66 °C and 55 °C, respectively) and one of the largest genome sizes among haloarchaea (5.1 Mb). It also had the largest number of strain-specific pan-genome orthologous groups and unique pathways among members of the genus Natrinema in the class Halobacteria. A dendrogram based on the presence/absence of genes and a phylogenetic tree constructed based on OrthoANI values highlighted the particularities of strain CBA1119T as compared to other Natrinema species and other haloarchaea members. The large genome of strain CBA1119T may provide information on genes that confer tolerance to extreme environmental conditions, which may lead to the discovery of other thermophilic strains with potential applications in industrial biotechnology.

RevDate: 2018-05-18

Vinuesa P, Ochoa-Sánchez LE, B Contreras-Moreira (2018)

GET_PHYLOMARKERS, a Software Package to Select Optimal Orthologous Clusters for Phylogenomics and Inferring Pan-Genome Phylogenies, Used for a Critical Geno-Taxonomic Revision of the Genus Stenotrophomonas.

Frontiers in microbiology, 9:771.

The massive accumulation of genome-sequences in public databases promoted the proliferation of genome-level phylogenetic analyses in many areas of biological research. However, due to diverse evolutionary and genetic processes, many loci have undesirable properties for phylogenetic reconstruction. These, if undetected, can result in erroneous or biased estimates, particularly when estimating species trees from concatenated datasets. To deal with these problems, we developed GET_PHYLOMARKERS, a pipeline designed to identify high-quality markers to estimate robust genome phylogenies from the orthologous clusters, or the pan-genome matrix (PGM), computed by GET_HOMOLOGUES. In the first context, a set of sequential filters are applied to exclude recombinant alignments and those producing anomalous or poorly resolved trees. Multiple sequence alignments and maximum likelihood (ML) phylogenies are computed in parallel on multi-core computers. A ML species tree is estimated from the concatenated set of top-ranking alignments at the DNA or protein levels, using either FastTree or IQ-TREE (IQT). The latter is used by default due to its superior performance revealed in an extensive benchmark analysis. In addition, parsimony and ML phylogenies can be estimated from the PGM. We demonstrate the practical utility of the software by analyzing 170 Stenotrophomonas genome sequences available in RefSeq and 10 new complete genomes of Mexican environmental S. maltophilia complex (Smc) isolates reported herein. A combination of core-genome and PGM analyses was used to revise the molecular systematics of the genus. An unsupervised learning approach that uses a goodness of clustering statistic identified 20 groups within the Smc at a core-genome average nucleotide identity (cgANIb) of 95.9% that are perfectly consistent with strongly supported clades on the core- and pan-genome trees. In addition, we identified 16 misclassified RefSeq genome sequences, 14 of them labeled as S. maltophilia, demonstrating the broad utility of the software for phylogenomics and geno-taxonomic studies. The code, a detailed manual and tutorials are freely available for Linux/UNIX servers under the GNU GPLv3 license. A docker image bundling GET_PHYLOMARKERS with GET_HOMOLOGUES is also available and can be easily run on any platform.

RevDate: 2018-05-22

Valenzuela D, Norri T, Välimäki N, et al (2018)

Towards pan-genome read alignment to improve variation calling.

BMC genomics, 19(Suppl 2):87 pii:10.1186/s12864-018-4465-8.

BACKGROUND: A typical human genome differs from the reference genome at 4-5 million sites. This diversity is increasingly catalogued in repositories such as ExAC/gnomAD, consisting of >15,000 whole-genomes and >126,000 exome sequences from different individuals. Despite this enormous diversity, resequencing data workflows are still based on a single human reference genome. Identification and genotyping of genetic variants is typically carried out on short-read data aligned to a single reference, disregarding the underlying variation.

RESULTS: We propose a new unified framework for variant calling with short-read data utilizing a representation of human genetic variation - a pan-genomic reference. We provide a modular pipeline that can be seamlessly incorporated into existing sequencing data analysis workflows. Our tool is open source and available online.

CONCLUSIONS: Our experiments show that by replacing a standard human reference with a pan-genomic one we achieve an improvement in single-nucleotide variant calling accuracy and in short indel calling accuracy over the widely adopted Genome Analysis Toolkit (GATK) in difficult genomic regions.

RevDate: 2018-05-17

Howat AM, Vollmers J, Taubert M, et al (2018)

Comparative Genomics and Mutational Analysis Reveals a Novel XoxF-Utilizing Methylotroph in the Roseobacter Group Isolated From the Marine Environment.

Frontiers in microbiology, 9:766.

The Roseobacter group comprises a significant group of marine bacteria which are involved in global carbon and sulfur cycles. Some members are methylotrophs, using one-carbon compounds as a carbon and energy source. It has recently been shown that methylotrophs generally require a rare earth element when using the methanol dehydrogenase enzyme XoxF for growth on methanol. Addition of lanthanum to methanol enrichments of coastal seawater facilitated the isolation of a novel methylotroph in the Roseobacter group: Marinibacterium anthonyi strain La 6. Mutation of xoxF5 revealed the essential nature of this gene during growth on methanol and ethanol. Physiological characterization demonstrated the metabolic versatility of this strain. Genome sequencing revealed that strain La 6 has the largest genome of all Roseobacter group members sequenced to date, at 7.18 Mbp. Multilocus sequence analysis (MLSA) showed that whilst it displays the highest core gene sequence similarity with subgroup 1 of the Roseobacter group, it shares very little of its pangenome, suggesting unique genetic adaptations. This research revealed that the addition of lanthanides to isolation procedures was key to cultivating novel XoxF-utilizing methylotrophs from the marine environment, whilst genome sequencing and MLSA provided insights into their potential genetic adaptations and relationship to the wider community.

RevDate: 2018-05-15

Zolfo M, Asnicar F, Manghi P, et al (2018)

Profiling microbial strains in urban environments using metagenomic sequencing data.

Biology direct, 13(1):9 pii:10.1186/s13062-018-0211-z.

BACKGROUND: The microbial communities populating human and natural environments have been extensively characterized with shotgun metagenomics, which provides an in-depth representation of the microbial diversity within a sample. Microbes thriving in urban environments may be crucially important for human health, but have received less attention than those of other environments. Ongoing efforts started to target urban microbiomes at a large scale, but the most recent computational methods to profile these metagenomes have never been applied in this context. It is thus currently unclear whether such methods, that have proven successful at distinguishing even closely related strains in human microbiomes, are also effective in urban settings for tasks such as cultivation-free pathogen detection and microbial surveillance. Here, we aimed at a) testing the currently available metagenomic profiling tools on urban metagenomics; b) characterizing the organisms in urban environment at the resolution of single strain and c) discussing the biological insights that can be inferred from such methods.

RESULTS: We applied three complementary methods on the 1614 metagenomes of the CAMDA 2017 challenge. With MetaMLST we identified 121 known sequence-types from 15 species of clinical relevance. For instance, we identified several Acinetobacter strains that were close to the nosocomial opportunistic pathogen A. nosocomialis. With StrainPhlAn, a generalized version of the MetaMLST approach, we inferred the phylogenetic structure of Pseudomonas stutzeri strains and suggested that the strain-level heterogeneity in environmental samples is higher than in the human microbiome. Finally, we also probed the functional potential of the different strains with PanPhlAn. We further showed that SNV-based and pangenome-based profiling provide complementary information that can be combined to investigate the evolutionary trajectories of microbes and to identify specific genetic determinants of virulence and antibiotic resistances within closely related strains.

CONCLUSION: We show that strain-level methods developed primarily for the analysis of human microbiomes can be effective for city-associated microbiomes. In fact, (opportunistic) pathogens can be tracked and monitored across many hundreds of urban metagenomes. However, while more effort is needed to profile strains of currently uncharacterized species, this work poses the basis for high-resolution analyses of microbiomes sampled in city and mass transportation environments.

REVIEWERS: This article was reviewed by Alexandra Bettina Graf, Daniel Huson and Trevor Cickovski.

RevDate: 2018-05-07

Kumar R, Acharya V, Singh D, et al (2018)

Strategies for high-altitude adaptation revealed from high-quality draft genome of non-violacein producing Janthinobacterium lividum ERGS5:01.

Standards in genomic sciences, 13:11 pii:313.

A light pink coloured bacterial strain ERGS5:01 isolated from glacial stream water of Sikkim Himalaya was affiliated to Janthinobacterium lividum based on 16S rRNA gene sequence identity and phylogenetic clustering. Whole genome sequencing was performed for the strain to confirm its taxonomy, as it lacked the typical violet pigmentation of the genus, and also to decipher its survival strategy in the aquatic ecosystem of high elevation. The PacBio RSII sequencing generated a genome of 5,168,928 bp with 4575 protein-coding genes and 118 RNA genes. Whole genome-based multilocus sequence analysis clustering, an in silico DDH similarity value of 95.1% and an ANI value of 99.25% established the identity of the strain ERGS5:01 (MCC 2953) as a non-violacein producing J. lividum. The genome comparisons across genus Janthinobacterium revealed an open pan-genome with scope for the addition of new orthologous clusters to complete the genomic inventory. The genomic insight provided the genetic basis of tolerance to freezing and frequent freeze-thaw cycles, and of industrially important enzymes. Extended insight into the genome provided clues of crucial genes associated with adaptation in the harsh aquatic ecosystem of high altitude.

RevDate: 2018-05-11

Oliver A, Kay M, KK Cooper (2018)

Comparative genomics of cocci-shaped Sporosarcina strains with diverse spatial isolation.

BMC genomics, 19(1):310 pii:10.1186/s12864-018-4635-8.

BACKGROUND: Cocci-shaped Sporosarcina strains are currently one of the few known cocci-shaped spore-forming bacteria, yet we know very little about the genomics. The goal of this study is to utilize comparative genomics to investigate the diversity of cocci-shaped Sporosarcina strains that differ in their geographical isolation and show different nutritional requirements.

RESULTS: For this study, we sequenced 28 genomes of cocci-shaped Sporosarcina strains isolated from 13 different locations around the world. We generated the first six complete genomes and methylomes utilizing PacBio sequencing, and an additional 22 draft genomes using Illumina sequencing. Genomic analysis revealed that cocci-shaped Sporosarcina strains contained an average genome of 3.3 Mb comprised of 3222 CDS, 54 tRNAs and 6 rRNAs, while only two strains contained plasmids. The cocci-shaped Sporosarcina genome on average contained 2.3 prophages and 15.6 IS elements, while methylome analysis supported the diversity of these strains as only one of 31 methylation motifs were shared under identical growth conditions. Analysis with a 90% identity cut-off revealed 221 core genes or ~ 7% of the genome, while a 30% identity cut-off generated a pan-genome of 8610 genes. The phylogenetic relationship of the cocci-shaped Sporosarcina strains based on either core genes, accessory genes or spore-related genes consistently resulted in the 29 strains being divided into eight clades.

CONCLUSIONS: This study begins to unravel the phylogenetic relationship of cocci-shaped Sporosarcina strains, and the comparative genomics of these strains supports identification of several new species.

RevDate: 2018-05-03

Wang W, Mauleon R, Hu Z, et al (2018)

Genomic variation in 3,010 diverse accessions of Asian cultivated rice.

Nature, 557(7703):43-49.

Here we analyse genetic variation, population structure and diversity among 3,010 diverse Asian cultivated rice (Oryza sativa L.) genomes from the 3,000 Rice Genomes Project. Our results are consistent with the five major groups previously recognized, but also suggest several unreported subpopulations that correlate with geographic location. We identified 29 million single nucleotide polymorphisms, 2.4 million small indels and over 90,000 structural variations that contribute to within- and between-population variation. Using pan-genome analyses, we identified more than 10,000 novel full-length protein-coding genes and a high number of presence-absence variations. The complex patterns of introgression observed in domestication genes are consistent with multiple independent rice domestication events. The public availability of data from the 3,000 Rice Genomes Project provides a resource for rice genomics research and breeding.

RevDate: 2018-06-28

Rodrigues RAL, Andreani J, Andrade ACDSP, et al (2018)

Morphologic and Genomic Analyses of New Isolates Reveal a Second Lineage of Cedratviruses.

Journal of virology, 92(13): pii:JVI.00372-18.

Giant viruses have been isolated and characterized in different environments, expanding our knowledge about the biology of these unique microorganisms. In the last 2 years, a new group was discovered, the cedratviruses, currently composed of only two isolates and members of a putative new family, "Pithoviridae," along with previously known pithoviruses. Here we report the isolation and biological and genomic characterization of two novel cedratviruses isolated from samples collected in France and Brazil. Both viruses were isolated using Acanthamoeba castellanii as a host cell and exhibit ovoid particles with corks at either extremity of the particle. Curiously, the Brazilian cedratvirus is ∼20% smaller and presents a shorter genome of 460,038 bp, coding for fewer proteins than other cedratviruses. In addition, it has a completely asyntenic genome and presents a lower amino acid identity of orthologous genes (∼73%). Pangenome analysis comprising the four cedratviruses revealed an increase in the pangenome concomitant with a decrease in the core genome with the addition of the two novel viruses. Finally, phylogenetic analyses clustered the Brazilian virus in a separate branch within the group of cedratviruses, while the French isolate is closer to the previously reported Cedratvirus lausannensis. Taking all together, we propose the existence of a second lineage of this emerging viral genus and provide new insights into the biodiversity and ubiquity of these giant viruses.

IMPORTANCE: Various giant viruses have been described in recent years, revealing a unique part of the virosphere. A new group among the giant viruses has recently been described, the cedratviruses, which is currently composed of only two isolates. In this paper, we describe two novel cedratviruses isolated from French and Brazilian samples. Biological and genomic analyses showed viruses with different particle sizes, genome lengths, and architecture, revealing the existence of a second lineage of this new group of giant viruses. Our results provide new insights into the biodiversity of cedratviruses and highlight the importance of ongoing efforts to prospect for and characterize new giant viruses.
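
The cedratvirus abstract notes that adding two genomes increased the pangenome while decreasing the core genome. That behaviour follows directly from the set definitions (pangenome as a union, core as an intersection), which the following minimal sketch illustrates. The gene labels and the order of the genomes are hypothetical toy data, not the cedratvirus gene content.

```python
def accumulation(genomes):
    """Track pangenome and core-genome size as genomes are added in order.

    genomes is an ordered list of gene sets. Returns one
    (pan_size, core_size) tuple per genome added.
    """
    pan = set()
    core = None
    curve = []
    for genes in genomes:
        pan |= genes  # the union can only grow or stay the same
        # The intersection can only shrink or stay the same.
        core = set(genes) if core is None else core & genes
        curve.append((len(pan), len(core)))
    return curve


# Hypothetical gene content of four genomes.
toy_genomes = [
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"a", "e", "f"},
    {"a", "b", "g"},
]

curve = accumulation(toy_genomes)
print(curve)
```

In practice such curves are usually averaged over many random orderings of the genomes, because a single ordering can exaggerate or hide the contribution of an unusual isolate.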

RevDate: 2018-06-07

Nguyen TL, DH Kim (2018)

Genome-Wide Comparison Reveals a Probiotic Strain Lactococcus Lactis WFLU12 Isolated from the Gastrointestinal Tract of Olive Flounder (Paralichthys Olivaceus) Harboring Genes Supporting Probiotic Action.

Marine drugs, 16(5): pii:md16050140.

Our previous study has shown that dietary supplementation with Lactococcus lactis WFLU12 can enhance the growth of olive flounder and its resistance against streptococcal infection. The objective of the present study was to use comparative genomics tools to investigate genomic characteristics of strain WFLU12 and the presence of genes supporting its probiotic action using sequenced genomes of L. lactis strains. Dispensable and singleton genes of strain WFLU12 were found to be more enriched in genes associated with metabolism (e.g., energy production and conversion, and carbohydrate transport and metabolism) than pooled dispensable and singleton genes in other L. lactis strains, reflecting WFLU12 strain-specific ecosystem origin and its ability to metabolize different energy sources. Strain WFLU12 produced antimicrobial compounds that could inhibit several bacterial fish pathogens. It possessed the nisin gene cluster (nisZBTCIPRKFEG) and genes encoding lysozyme and colicin V. However, only three other strains (CV56, IO-1, and SO) harbor a complete nisin gene cluster. We also found that L. lactis WFLU12 possessed many other important functional genes involved in stress responses to the gastrointestinal tract environment, dietary energy extraction, and metabolism to support the probiotic action of this strain found in our previous study. This strongly indicates that not all L. lactis strains can be used as probiotics. This study highlights comparative genomics approaches as very useful and powerful tools to select probiotic candidates and predict their probiotic effects.

RevDate: 2018-07-19
CmpDate: 2018-07-19

Chen C, Wu L, Cao Q, et al (2018)

Genome comparison of different Zymomonas mobilis strains provides insights on conservation of the evolution.

PloS one, 13(4):e0195994 pii:PONE-D-17-35325.

Zymomonas mobilis has the special Entner-Doudoroff (ED) pathway and it has excellent industrial characteristics, including low cell mass formation, high-specific productivity, ethanol yield, notable ethanol tolerance and wide pH range, and a relatively small genome size. In this study, the genomes of NRRL B-14023 and NRRL B-12526 were sequenced and compared with other strains to explore their evolutionary relationships and the genetic basis of Z. mobilis. The comparative genomic analyses revealed that the 8 strains share a conserved core chromosomal backbone. ZM4, NRRL B-12526, NRRL B-14023, NCIMB 11163 and NRRL B-1960 share 98% sequence identity across the whole genome sequences. Highly similar plasmids and CRISPR repeats were detected in these strains. A whole-genome phylogenetic tree of the 8 strains indicated that NRRL B-12526, NRRL B-14023 and ATCC 10988 had a close evolutionary relationship with the strain ZM4. Furthermore, strains ATCC29191 and ATCC29192 had distinctive CRISPR with a far distant relationship. The size of the pan-genome was 1945 genes, including 1428 core genes and 517 accessory genes. The genomes of Z. mobilis were highly conserved; particularly strains ZM4, NRRL B-12526, NRRL B-14023, NCIMB 11163 and NRRL B-1960 had a close genomic relationship. This comparative study of Z. mobilis presents a foundation for future functional analyses and applications.
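
The Z. mobilis abstract reports a pan-genome of 1945 genes made up of 1428 core and 517 accessory genes. A minimal sketch of how such a partition can be derived from per-strain gene sets follows, assuming a strict core (genes present in every strain). The strain names and gene labels are hypothetical; only the three totals come from the abstract.

```python
def partition_pangenome(strains):
    """Split the pan-genome into core and accessory gene sets.

    strains maps a strain name to its set of gene-family IDs.
    Core = genes in every strain; accessory = everything else.
    """
    gene_sets = list(strains.values())
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)  # requires at least one strain
    accessory = pan - core
    return pan, core, accessory


# Hypothetical toy strains.
toy_strains = {
    "strain1": {"g1", "g2", "g3"},
    "strain2": {"g1", "g2", "g4"},
    "strain3": {"g1", "g2"},
}

pan, core, accessory = partition_pangenome(toy_strains)
print(len(pan), len(core), len(accessory))

# Core and accessory sets never overlap, so their sizes add up to the
# pan-genome size. The figures reported in the abstract agree with this.
reported_core, reported_accessory, reported_pan = 1428, 517, 1945
print(reported_core + reported_accessory == reported_pan)
```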

RevDate: 2018-05-15

Inglin RC, Meile L, MJA Stevens (2018)

Clustering of Pan- and Core-genome of Lactobacillus provides Novel Evolutionary Insights for Differentiation.

BMC genomics, 19(1):284 pii:10.1186/s12864-018-4601-5.

BACKGROUND: Bacterial taxonomy aims to classify bacteria based on true evolutionary events and relies on a polyphasic approach that includes phenotypic, genotypic and chemotaxonomic analyses. Until now, complete genomes are largely ignored in taxonomy. The genus Lactobacillus consists of 173 species and many genomes are available to study taxonomy and evolutionary events.

RESULTS: We analyzed and clustered 98 completely sequenced genomes of the genus Lactobacillus and 234 draft genomes of 5 different Lactobacillus species, i.e. L. reuteri, L. delbrueckii, L. plantarum, L. rhamnosus and L. helveticus. The core-genome of the genus Lactobacillus contains 266 genes and the pan-genome 20,800 genes. Clustering of the Lactobacillus pan- and core-genome resulted in two highly similar trees. This shows that evolutionary history is traceable in the core-genome and that clustering of the core-genome is sufficient to explore relationships. Clustering of core- and pan-genomes at species' level resulted in similar trees as well. Detailed analyses of the core-genomes showed that the functional class "genetic information processing" is conserved in the core-genome but that "signaling and cellular processes" is not. The latter class encodes functions that are involved in environmental interactions. Evolution of lactobacilli seems therefore directed by the environment. The type species L. delbrueckii was analyzed in detail and its pan-genome based tree contained two major clades whose members contained different genes yet identical functions. In addition, evidence for horizontal gene transfer between strains of L. delbrueckii, L. plantarum, and L. rhamnosus, and between species of the genus Lactobacillus is presented. Our data provide evidence for evolution of some lactobacilli according to a parapatric-like model for species differentiation.

CONCLUSIONS: Core-genome trees are useful to detect evolutionary relationships in lactobacilli and might be useful in taxonomic analyses. Lactobacillus' evolution is directed by the environment and HGT.
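
The core- and pan-genome counts reported above follow from simple set operations on per-genome gene content. The following is a minimal sketch of that idea, not the authors' pipeline: the strain names and gene identifiers are hypothetical, and it assumes genes have already been grouped into ortholog families.

```python
# Hypothetical gene-family content per genome (all identifiers are illustrative).
genomes = {
    "strain_A": {"dnaA", "rpoB", "gyrB", "lacZ"},
    "strain_B": {"dnaA", "rpoB", "gyrB", "bglA"},
    "strain_C": {"dnaA", "rpoB", "gyrB", "lacZ", "pepX"},
}


def pan_genome(gene_sets):
    """Union: every gene family seen in at least one genome."""
    return set().union(*gene_sets)


def core_genome(gene_sets):
    """Intersection: gene families present in every genome."""
    gene_sets = list(gene_sets)
    return set.intersection(*gene_sets) if gene_sets else set()


pan = pan_genome(genomes.values())
core = core_genome(genomes.values())
accessory = pan - core  # families found in some, but not all, genomes

print(f"pan={len(pan)} core={len(core)} accessory={len(accessory)}")
```

With real data the hard part is the upstream ortholog clustering; once that is done, the core/pan split is this kind of set arithmetic.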

RevDate: 2018-05-14

Kupczok A, Neve H, Huang KD, et al (2018)

Rates of Mutation and Recombination in Siphoviridae Phage Genome Evolution over Three Decades.

Molecular biology and evolution, 35(5):1147-1159.

The evolution of asexual organisms is driven not only by the inheritance of genetic modification but also by the acquisition of foreign DNA. The contribution of vertical and horizontal processes to genome evolution depends on their rates per year and is quantified by the ratio of recombination to mutation. These rates have been estimated for bacteria; however, no estimates have been reported for phages. Here, we delineate the contribution of mutation and recombination to dsDNA phage genome evolution. We analyzed 34 isolates of the 936 group of Siphoviridae phages using a Lactococcus lactis strain from a single dairy over 29 years. We estimate a constant substitution rate of 1.9 × 10⁻⁴ substitutions per site per year due to mutation that is within the range of estimates for eukaryotic RNA and DNA viruses. The reconstruction of recombination events reveals a constant rate of five recombination events per year and 4.5 × 10⁻³ nucleotide alterations due to recombination per site per year. Thus, the recombination rate exceeds the substitution rate, resulting in a relative effect of recombination to mutation (r/m) of ∼24 that is homogenous over time. Especially in the early transcriptional region, we detect frequent gene loss and regain due to recombination with phages of the 936 group, demonstrating the role of the 936 group pangenome as a reservoir of genetic variation. The observed substitution rate homogeneity conforms to the neutral theory of evolution; hence, the neutral theory can be applied to phage genome evolution and also to genetic variation brought about by recombination.
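
The r/m value quoted in this abstract can be checked from the two per-site rates it reports. A minimal arithmetic sketch follows; the variable names are illustrative, and only the two rates come from the abstract.

```python
# Rates as reported in the abstract (per site per year).
substitution_rate = 1.9e-4    # substitutions due to mutation
recombination_rate = 4.5e-3   # nucleotide alterations due to recombination

# Relative effect of recombination to mutation.
r_over_m = recombination_rate / substitution_rate

print(f"r/m = {r_over_m:.1f}")  # rounds to the ~24 stated in the abstract
```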

RevDate: 2018-05-13

Parry-Hanson Kunadu A, Holmes M, Miller EL, et al (2018)

Microbiological quality and antimicrobial resistance characterization of Salmonella spp. in fresh milk value chains in Ghana.

International journal of food microbiology, 277:41-49.

Consumer perception of poor hygiene of fresh milk products is a major barrier to promotion of milk consumption as an intervention to alleviate the burden of malnutrition in Ghana. Fresh milk is retailed raw, boiled, or processed into unfermented cheese and spontaneously fermented products in unlicensed outlets. In this study, we have determined microbiological quality of informally retailed fresh milk products and characterized the genomic diversity and antimicrobial resistance (AMR) patterns of non-typhoidal Salmonella (NTS) in implicated products. A total of 159 common dairy products were purchased from five traditional milk markets in Accra. Samples were analysed for concentrations of aerobic bacteria, total and fecal coliforms, Escherichia coli, staphylococci, lactic acid bacteria and yeast and moulds. The presence of Salmonella, E. coli O157:H7, Listeria monocytogenes and Staphylococcus aureus were determined. AMR of Salmonella against 18 antibiotics was experimentally determined. Genome sequencing of 19 Salmonella isolates allowed determination of serovars, antigenic profiles, prediction of AMR genes in silico and inference of phylogenetic relatedness between strains. Raw and heat-treated milk did not differ significantly in overall bacterial quality (P = 0.851). E. coli O157:H7 and Staphylococcus aureus were present in 34.3% and 12.9% of dairy products respectively. Multidrug resistant (MDR) Salmonella enterica serovars Muenster and Legon were identified in 11.8% and 5.9% of unfermented cheese samples respectively. Pan genome analysis revealed a total of 3712 core genes. All Salmonella strains were resistant to Trimethoprim/Sulfamethoxazole, Cefoxitin, Cefuroxime Axetil and Cefuroxime. Resistance to Chloramphenicol (18%) and Ciprofloxacin (100%), which are first line antibiotics used in treatment of NTS bacteremia in Ghana, was evident. 
AMR was attributed to presence and/or mutations in the following genes: golS, sdiA for cephalosporins, aac(6')-Iy, ant(9) for aminoglycosides, mdtK, gyrA, gyrB, parC, parE for quinolones and cat1, cat4 for phenicols. Phylogenetic analysis based on accessory genes clustered S. Legon strains separately from the S. Muenster strains. These strains were from different markets suggesting local circulation of related strains. Our study justifies consumer resistance to consumption of unripened soft cheese without further lethal heat treatment, and provides evidence that supports the Ghana Health Service recommendation for use of 3rd generation cephalosporins for the treatment of MDR NTS infections.

RevDate: 2018-05-25

Baby V, Lachance JC, Gagnon J, et al (2018)

Inferring the Minimal Genome of Mesoplasma florum by Comparative Genomics and Transposon Mutagenesis.

mSystems, 3(3): pii:mSystems00198-17.

The creation and comparison of minimal genomes will help better define the most fundamental mechanisms supporting life. Mesoplasma florum is a near-minimal, fast-growing, nonpathogenic bacterium potentially amenable to genome reduction efforts. In a comparative genomic study of 13 M. florum strains, including 11 newly sequenced genomes, we have identified the core genome and open pangenome of this species. Our results show that all of the strains have approximately 80% of their gene content in common. Of the remaining 20%, 17% of the genes were found in multiple strains and 3% were unique to any given strain. On the basis of random transposon mutagenesis, we also estimated that ~290 out of 720 genes are essential for M. florum L1 in rich medium. We next evaluated different genome reduction scenarios for M. florum L1 by using gene conservation and essentiality data, as well as comparisons with the first working approximation of a minimal organism, Mycoplasma mycoides JCVI-syn3.0. Our results suggest that 409 of the 473 M. mycoides JCVI-syn3.0 genes have orthologs in M. florum L1. Conversely, 57 putatively essential M. florum L1 genes have no homolog in M. mycoides JCVI-syn3.0. This suggests differences in minimal genome compositions, even for these evolutionarily closely related bacteria. IMPORTANCE The last years have witnessed the development of whole-genome cloning and transplantation methods and the complete synthesis of entire chromosomes. Recently, the first minimal cell, Mycoplasma mycoides JCVI-syn3.0, was created. Despite these milestone achievements, several questions remain to be answered. For example, is the composition of minimal genomes virtually identical in phylogenetically related species? On the basis of comparative genomics and transposon mutagenesis, we investigated this question by using an alternative model, Mesoplasma florum, that is also amenable to genome reduction efforts. 
Our results suggest that the creation of additional minimal genomes could help reveal different gene compositions and strategies that can support life, even within closely related species.

RevDate: 2018-04-12

Zhang X, Liu X, Yang F, et al (2018)

Pan-Genome Analysis Links the Hereditary Variation of Leptospirillum ferriphilum With Its Evolutionary Adaptation.

Frontiers in microbiology, 9:577.

Niche adaptation has long been recognized to drive intra-species differentiation and speciation, yet knowledge about its relatedness with hereditary variation of microbial genomes is relatively limited. Using Leptospirillum ferriphilum species as a case study, we present a detailed analysis of genomic features of five recognized strains. Genome-to-genome distance calculation preliminarily determined the roles of spatial distance and environmental heterogeneity that potentially contribute to intra-species variation within L. ferriphilum species at the genome level. Mathematical models were further constructed to extrapolate the expansion of L. ferriphilum genomes (an 'open' pan-genome), indicating the emergence of novel genes with new sequenced genomes. The identification of diverse mobile genetic elements (MGEs) (such as transposases, integrases, and phage-associated genes) revealed the prevalence of horizontal gene transfer events, which is an important evolutionary mechanism that provides avenues for the recruitment of novel functionalities and further for the genetic divergence of microbial genomes. Comprehensive analysis also demonstrated that the genome reduction by gene loss in a broad sense might contribute to the observed diversification. We thus inferred a plausible explanation to address this observation: the community-dependent adaptation that potentially economizes the limiting resources of the entire community. Now that the introduction of new genes is accompanied by a parallel abandonment of some other ones, our results provide snapshots on the biological fitness cost of environmental adaptation within the L. ferriphilum genomes. In short, our genome-wide analyses bridge the relation between genetic variation of L. ferriphilum with its evolutionary adaptation.
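
The abstract does not say which mathematical models were used to extrapolate pan-genome growth. One commonly used choice is a power law of the form n = κN^γ fitted to pan-genome size against the number of genomes sampled, and the sketch below assumes that form. The data points are invented for illustration and are not from the paper.

```python
import math


def fit_power_law(genome_counts, pan_sizes):
    """Least-squares fit of pan_size = kappa * N**gamma on log-log axes."""
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(p) for p in pan_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx
    kappa = math.exp(mean_y - gamma * mean_x)
    return kappa, gamma


# Hypothetical pan-genome sizes after adding 1..5 genomes (not real data).
genome_counts = [1, 2, 3, 4, 5]
pan_sizes = [2400, 2750, 2990, 3170, 3320]

kappa, gamma = fit_power_law(genome_counts, pan_sizes)
print(f"kappa = {kappa:.0f}, gamma = {gamma:.2f}")
```

Under this assumed model, a growth exponent clearly above zero means the gene pool keeps expanding as genomes are added, which is what an 'open' pan-genome describes. With only five genomes, as in this study, any such fit is very uncertain.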

RevDate: 2018-06-07

Holm KO, Bækkedal C, Söderberg JJ, et al (2018)

Complete Genome Sequences of Seven Vibrio anguillarum Strains as Derived from PacBio Sequencing.

Genome biology and evolution, 10(4):1127-1131.

We report here the complete genome sequences of seven Vibrio anguillarum strains isolated from multiple geographic locations, thus increasing the total number of genomes of finished quality to 11. The genomes were de novo assembled from long-sequence PacBio reads. Including draft genomes, a total of 44 V. anguillarum genomes are currently available in the genome databases. They represent an important resource in the study of, for example, genetic variations and for identifying virulence determinants. In this article, we present the genomes and basic genome comparisons of the 11 complete genomes, including a BRIG analysis, and pan genome calculation. We also describe some structural features of superintegrons on chromosome 2s, and associated insertion sequence (IS) elements, including 18 new ISs (ISVa3 - ISVa20), both of importance in the complement of V. anguillarum genomes.

RevDate: 2018-04-19

Thorpe HA, Bayliss SC, Sheppard SK, et al (2018)

Piggy: a rapid, large-scale pan-genome analysis tool for intergenic regions in bacteria.

GigaScience, 7(4):1-11.

Background: The concept of the "pan-genome," which refers to the total complement of genes within a given sample or species, is well established in bacterial genomics. Rapid and scalable pipelines are available for managing and interpreting pan-genomes from large batches of annotated assemblies. However, despite overwhelming evidence that variation in intergenic regions in bacteria can directly influence phenotypes, most current approaches for analyzing pan-genomes focus exclusively on protein-coding sequences.

Findings: To address this we present Piggy, a novel pipeline that emulates Roary except that it is based only on intergenic regions. A key utility provided by Piggy is the detection of highly divergent ("switched") intergenic regions (IGRs) upstream of genes. We demonstrate the use of Piggy on large datasets of clinically important lineages of Staphylococcus aureus and Escherichia coli.

Conclusions: For S. aureus, we show that highly divergent (switched) IGRs are associated with differences in gene expression and we establish a multilocus reference database of IGR alleles (igMLST; implemented in BIGSdb).

RevDate: 2018-04-09

Croxen MA, Lee TD, Azana R, et al (2018)

Use of genomics to design a diagnostic assay to discriminate between Streptococcus pneumoniae and Streptococcus pseudopneumoniae.

Microbial genomics [Epub ahead of print].

Distinguishing the species of mitis group streptococci is challenging due to ambiguous phenotypic characteristics and a high degree of genetic similarity. This has been particularly true for resolving atypical Streptococcus pneumoniae and Streptococcus pseudopneumoniae. We used phylogenetic clustering to demonstrate specific and separate clades for both S. pneumoniae and S. pseudopneumoniae genomes. The genomes that clustered within these defined clades were used to extract species-specific genes from the pan-genome. The S. pneumoniae marker was detected in 8027 out of 8051 (>99.7 %) S. pneumoniae genomes. The S. pseudopneumoniae marker was specific for all genomes that clustered in the S. pseudopneumoniae clade, including unresolved species of the genus Streptococcus sequenced by the BC Centre for Disease Control Public Health Laboratory that previously could not be distinguished by other methods. Other than the presence of the S. pseudopneumoniae marker in six of 8051 (<0.08 %) S. pneumoniae genomes, both the S. pneumoniae and S. pseudopneumoniae markers showed little to no detectable cross-reactivity to the genomes of any other species of the genus Streptococcus or to a panel of over 46 000 genomes from viral, fungal, bacterial pathogens and microbiota commonly found in the respiratory tract. A real-time PCR assay was designed targeting these two markers. Genomics provides a useful technique for PCR assay design and development.
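
The two percentages in this abstract can be reproduced from the genome counts it gives. A minimal sketch follows; the helper and variable names are illustrative, and only the counts come from the abstract.

```python
def percent(hits, total):
    """Share of genomes in which a marker was detected, as a percentage."""
    return 100.0 * hits / total


# Counts reported in the abstract.
pneumoniae_genomes = 8051
marker_hits = 8027       # S. pneumoniae marker found in S. pneumoniae genomes
cross_hits = 6           # S. pseudopneumoniae marker found in S. pneumoniae genomes

detection_rate = percent(marker_hits, pneumoniae_genomes)
cross_detection_rate = percent(cross_hits, pneumoniae_genomes)

print(f"marker detected in {detection_rate:.2f} % of genomes")
print(f"cross-detection in {cross_detection_rate:.3f} % of genomes")
```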

RevDate: 2018-06-07

Murillo T, Ramírez-Vargas G, Riedel T, et al (2018)

Two Groups of Cocirculating, Epidemic Clostridiodes difficile Strains Microdiversify through Different Mechanisms.

Genome biology and evolution, 10(3):982-998.

Clostridioides difficile strains from the NAPCR1/ST54 and NAP1/ST01 types have caused outbreaks despite their notable differences in genome diversity. By comparing whole genome sequences of 32 NAPCR1/ST54 isolates and 17 NAP1/ST01 isolates recovered from patients infected with C. difficile we assessed whether mutation, homologous recombination (r) or nonhomologous recombination (NHR) through lateral gene transfer (LGT) have differentially shaped the microdiversification of these strains. The average number of single nucleotide polymorphisms (SNPs) in coding sequences (NAPCR1/ST54 = 24; NAP1/ST01 = 19) and SNP densities (NAPCR1/ST54 = 0.54/kb; NAP1/ST01 = 0.46/kb) in the NAPCR1/ST54 and NAP1/ST01 isolates were comparable. However, the NAP1/ST01 isolates showed 3× higher average dN/dS rates (8.35) than the NAPCR1/ST54 isolates (2.62). Regarding r, whereas 31 of the NAPCR1/ST54 isolates showed 1 recombination block (3,301-8,226 bp), the NAP1/ST01 isolates showed no bases in recombination. As to NHR, the pangenome of the NAPCR1/ST54 isolates was larger (4,802 gene clusters, 26% noncore genes) and more heterogeneous (644 ± 33 gene content changes) than that of the NAP1/ST01 isolates (3,829 gene clusters, ca. 6% noncore genes, 129 ± 37 gene content changes). Nearly 55% of the gene content changes seen among the NAPCR1/ST54 isolates (355 ± 31) were traced back to MGEs with putative genes for antimicrobial resistance and virulence factors that were only detected in single isolates or isolate clusters. Congruently, the LGT/SNP rate calculated for the NAPCR1/ST54 isolates (26.8 ± 2.8) was 4× higher than the one obtained for the NAP1/ST01 isolates (6.8 ± 2.0). We conclude that NHR-LGT has had a greater role in the microdiversification of the NAPCR1/ST54 strains, opposite to the NAP1/ST01 strains, where mutation is known to play a more prominent role.

RevDate: 2018-06-26
CmpDate: 2018-06-26

Azarian T, Grant LR, Arnold BJ, et al (2018)

The impact of serotype-specific vaccination on phylodynamic parameters of Streptococcus pneumoniae and the pneumococcal pan-genome.

PLoS pathogens, 14(4):e1006966 pii:PPATHOGENS-D-17-02341.

In the United States, the introduction of the heptavalent pneumococcal conjugate vaccine (PCV) largely eliminated vaccine serotypes (VT); non-vaccine serotypes (NVT) subsequently increased in carriage and disease. Vaccination also disrupts the composition of the pneumococcal pangenome, which includes mobile genetic elements and polymorphic non-capsular antigens important for virulence, transmission, and pneumococcal ecology. Antigenic proteins are of interest for future vaccines; yet, little is known about how they are affected by PCV use. To investigate the evolutionary impact of vaccination, we assessed recombination, evolution, and pathogen demographic history of 937 pneumococci collected from 1998-2012 among Navajo and White Mountain Apache Native American communities. We analyzed changes in the pneumococcal pangenome, focusing on metabolic loci and 19 polymorphic protein antigens. We found the impact of PCV on the pneumococcal population could be observed in reduced diversity, a smaller pangenome, and changing frequencies of accessory clusters of orthologous groups (COGs). Post-PCV7, diversity rebounded through clonal expansion of NVT lineages and inferred in-migration of two previously unobserved lineages. Accessory COG frequencies trended toward pre-PCV7 values with increasing time since vaccine introduction. Contemporary frequencies of protein antigen variants are better predicted by pre-PCV7 values (1998-2000) than the preceding period (2006-2008), suggesting balancing selection may have acted in maintaining variant frequencies in this population. Overall, we present the largest genomic analysis of pneumococcal carriage in the United States to date, which includes a snapshot of a true vaccine-naïve community prior to the introduction of PCV7. 
These data improve our understanding of pneumococcal evolution and emphasize the need to consider pangenome composition when inferring the impact of vaccination and developing future protein-based pneumococcal vaccines.

RevDate: 2018-04-08

Åvall-Jääskeläinen S, Taponen S, Kant R, et al (2018)

Comparative genome analysis of 24 bovine-associated Staphylococcus isolates with special focus on the putative virulence genes.

PeerJ, 6:e4560 pii:4560.

Non-aureus staphylococci (NAS) are most commonly isolated from subclinical mastitis. Different NAS species may, however, have diverse effects on the inflammatory response in the udder. We determined the genome sequences of 20 staphylococcal isolates from clinical or subclinical bovine mastitis, belonging to the NAS species Staphylococcus agnetis, S. chromogenes, and S. simulans, and focused on the putative virulence factor genes present in the genomes. For comparison we used our previously published genome sequences of four S. aureus isolates from bovine mastitis. The pan-genome and core genomes of the non-aureus isolates were characterized. After that, putative virulence factor orthologues were searched in silico. We compared the presence of putative virulence factors in the NAS species and S. aureus and evaluated the potential association between bacterial genotype and type of mastitis (clinical vs. subclinical). The NAS isolates had far fewer virulence gene orthologues than the S. aureus isolates. One third of the virulence genes were detected only in S. aureus. About 100 virulence genes were present in all S. aureus isolates, compared to about 40 to 50 in each NAS isolate. S. simulans differed the most. Several of the virulence genes detected among NAS were harbored only by S. simulans, but it also lacked a number of genes present both in S. agnetis and S. chromogenes. The type of mastitis was not associated with any specific virulence gene profile. It seems that the virulence gene profiles or cumulative number of different virulence genes are not directly associated with the type of mastitis (clinical or subclinical), indicating that host derived factors such as the immune status play a pivotal role in the manifestation of mastitis.

RevDate: 2018-04-01

Moldovan MA, MS Gelfand (2018)

Pangenomic Definition of Prokaryotic Species and the Phylogenetic Structure of Prochlorococcus spp.

Frontiers in microbiology, 9:428.

The pangenome is the collection of all groups of orthologous genes (OGGs) from a set of genomes. We apply the pangenome analysis to propose a definition of prokaryotic species based on identification of lineage-specific gene sets. While being similar to the classical biological definition based on allele flow, it does not rely on DNA similarity levels and does not require analysis of homologous recombination. Hence this definition is relatively objective and independent of arbitrary thresholds. A systematic analysis of 110 accepted species with the largest numbers of sequenced strains yields results largely consistent with the existing nomenclature. However, it has revealed that abundant marine cyanobacteria Prochlorococcus marinus should be divided into two species. As a control we have confirmed the paraphyletic origin of Yersinia pseudotuberculosis (with embedded, monophyletic Y. pestis) and Burkholderia pseudomallei (with B. mallei). We also demonstrate that by our definition and in accordance with recent studies Escherichia coli and Shigella spp. are one species.

RevDate: 2018-07-09
CmpDate: 2018-07-09

Kelly AC, TJ Ward (2018)

Population genomics of Fusarium graminearum reveals signatures of divergent evolution within a major cereal pathogen.

PloS one, 13(3):e0194616 pii:PONE-D-18-01198.

The cereal pathogen Fusarium graminearum is the primary cause of Fusarium head blight (FHB) and a significant threat to food safety and crop production. To elucidate population structure and identify genomic targets of selection within major FHB pathogen populations in North America we sequenced the genomes of 60 diverse F. graminearum isolates. We also assembled the first pan-genome for F. graminearum to clarify population-level differences in gene content potentially contributing to pathogen diversity. Bayesian and phylogenomic analyses revealed genetic structure associated with isolates that produce the novel NX-2 mycotoxin, suggesting a North American population that has remained genetically distinct from other endemic and introduced cereal-infecting populations. Genome scans uncovered distinct signatures of selection within populations, focused in high diversity, frequently recombining regions. These patterns suggested selection for genomic divergence at the trichothecene toxin gene cluster and thirteen additional regions containing genes potentially involved in pathogen specialization. Gene content differences further distinguished populations, in that 121 genes showed population-specific patterns of conservation. Genes that differentiated populations had predicted functions related to pathogenesis, secondary metabolism and antagonistic interactions, though a subset had unique roles in temperature and light sensitivity. Our results indicated that F. graminearum populations are distinguished by dozens of genes with signatures of selection and an array of dispensable accessory genes, suggesting that FHB pathogen populations may be equipped with different traits to exploit the agroecosystem. 
These findings provide insights into the evolutionary processes and genomic features contributing to population divergence in plant pathogens, and highlight candidate genes for future functional studies of pathogen specialization across evolutionarily and ecologically diverse fungi.

RevDate: 2018-04-27
CmpDate: 2018-04-27

Dutkiewicz J, Zając V, Sroka J, et al (2018)

Streptococcus suis: a re-emerging pathogen associated with occupational exposure to pigs or pork products. Part II - Pathogenesis.

Annals of agricultural and environmental medicine : AAEM, 25(1):186-203.

Streptococcus suis is a re-emerging zoonotic pathogen that may cause severe disease, mostly meningitis, in pigs and in humans having occupational contact with pigs and pork, such as farmers, slaughterhouse workers and butchers. The first stage of the pathogenic process, similar in pigs and humans, is adherence to and colonisation of mucosal and/or epithelial surface(s) of the host. The second stage is invasion into deeper tissue and extracellular translocation of bacterium in the bloodstream, either free in circulation or attached to the surface of monocytes. If S. suis present in blood fails to cause fatal septicaemia, it is able to progress into the third stage comprising penetration into host's organs, mostly by crossing the blood-brain barrier and/or blood-cerebrospinal fluid barrier to gain access to the central nervous system (CNS) and cause meningitis. The fourth stage is inflammation that plays a key role in the pathogenesis of both systemic and CNS infections caused by S. suis. The pathogen may induce the overproduction of pro-inflammatory cytokines that cause septic shock and/or the recruitment and activation of different leukocyte populations, causing acute inflammation of the CNS. Streptococcus suis can also evoke - through activation of microglial cells, astrocytes and possibly other cell types - a fulminant inflammatory reaction of the brain which leads to intracranial complications, including brain oedema, increased intracranial pressure, cerebrovascular insults, and deafness, as a result of cochlear sepsis. In all stages of the pathogenic process, S. suis interacts with many types of immunocompetent host's cells, such as polymorphonuclear leukocytes, mononuclear macrophages, lymphocytes, dendritic cells and microglia, using a range of versatile virulence factors for evasion of the innate and adaptive immune defence of the host, and for overcoming environmental stress. It is estimated that S. suis produces more than 100 different virulence factors that could be classified into 4 groups: surface components or secreted elements, enzymes, transcription factors or regulatory systems and transporter factors or secretion systems. A major virulence factor is capsular polysaccharide (CPS) that protects bacteria from phagocytosis. However, it hampers adhesion to and invasion of host's cells, release of inflammatory cytokines and formation of the resistant biofilm which, in many cases, is vital for the persistence of bacteria. It has been demonstrated that unencapsulated S. suis clones arising by mutation, which are more successful in penetration into and propagation within the host's cells, may coexist in the organism of a single host together with those that are encapsulated. Both 'complementary' clones assist each other in the successful colonization of host's tissues and persistence therein. S. suis has an open pan-genome characterized by a frequent gene transfer and a large diversity. Of the genetic determinants of S. suis pathogenicity, the most important are pathogenicity islands (PAI), in particular, a novel DNA segment of 89 kb length with evident pathogenic traits that has been designated as 89K PAI. It has been estimated that more than one-third of the S. suis virulence factors are associated with this PAI. It has been proved that the virulent S. suis strains possess smaller genomes, compared to avirulent ones, but more genes associated with virulence. Overall, the evolution of the species most probably aims towards increased pathogenicity, and hence the most significant task of the current research is an elaboration of a vaccine, efficient both for humans and pigs.

RevDate: 2018-03-19

Cundon CC, Ameal A, Maubecín E, et al (2018)

[Characterization of extraintestinal pathogenic Escherichia coli strains isolated from household dogs and cats in Buenos Aires, Argentina].

Revista Argentina de microbiologia pii:S0325-7541(17)30182-7 [Epub ahead of print].

The pangenome of Escherichia coli is composed of a conserved core and variable genomic regions. The constant genetic component allows the phylogeny of the microorganism to be determined, while genetic variability has promoted the emergence of intestinal pathogenic strains and extraintestinal strains. In this study we genetically characterized 85 extraintestinal pathogenic isolates from canines and felines. We used the Clermont scheme that includes intestinal (A and B1) and extraintestinal (B2 and D) phylogroups, virulence markers (pap1-2, pap3-4, sfa, afa, hlyA, aer and cnf) and hybrid pathogens. Of the isolates, 69.4% belonged to phylogroup A, 1.2% to phylogroup B1, 16.5% to phylogroup B2 and 12.9% to phylogroup D. The most commonly found gene was sfa (21/85), followed by pap1-2 and cnf (20/85) and pap3-4 (19/85). No hybrids were detected. Animal isolates should be studied due to the zoonotic potential of the microorganism.

RevDate: 2018-03-18

Abreu VAC, Popin RV, Alvarenga DO, et al (2018)

Genomic and Genotypic Characterization of Cylindrospermopsis raciborskii: Toward an Intraspecific Phylogenetic Evaluation by Comparative Genomics.

Frontiers in microbiology, 9:306.

Cylindrospermopsis raciborskii is a freshwater cyanobacterial species with increasing bloom reports worldwide that are likely due to factors related to climate change. In addition to the deleterious effects of blooms on aquatic ecosystems, the majority of ecotypes can synthesize toxic secondary metabolites causing public health issues. To overcome the harmful effects of C. raciborskii blooms, it is important to advance knowledge of diversity, genetic variation, and evolutionary processes within populations. An efficient approach to exploring this diversity and understanding the evolution of C. raciborskii is to use comparative genomics. Here, we report two new draft genomes of C. raciborskii (strains CENA302 and CENA303) from Brazilian isolates of different origins and explore their molecular diversity, phylogeny, and evolutionary diversification by comparing their genomes with sequences from other strains available in public databases. The results obtained by comparing seven C. raciborskii and the Raphidiopsis brookii D9 genomes revealed a set of conserved core genes and a variable set of accessory genes, such as those involved in the biosynthesis of natural products, heterocyte glycolipid formation, and nitrogen fixation. Gene cluster arrangements related to the biosynthesis of the antifungal cyclic glycosylated lipopeptide hassallidin were identified in four C. raciborskii genomes, including the non-nitrogen fixing strain CENA303. Shifts in gene clusters involved in toxin production according to geographic origins were observed, as well as a lack of nitrogen fixation (nif) and heterocyte glycolipid (hgl) gene clusters in some strains. Single gene phylogeny (16S rRNA sequences) was congruent with phylogeny based on 31 concatenated housekeeping protein sequences, and both analyses have shown, with high support values, that the species C. raciborskii is monophyletic. This comparative genomics study allowed a species-wide view of the biological diversity of C. raciborskii and in some cases linked genome differences to phenotype.

RevDate: 2018-07-10
CmpDate: 2018-07-10

Beck C, Knoop H, R Steuer (2018)

Modules of co-occurrence in the cyanobacterial pan-genome reveal functional associations between groups of ortholog genes.

PLoS genetics, 14(3):e1007239 pii:PGENETICS-D-17-02072.

Cyanobacteria are a monophyletic phylogenetic group of global importance and have received considerable attention as potential host organisms for the renewable synthesis of chemical bulk products from atmospheric CO2. The cyanobacterial phylum exhibits enormous metabolic diversity with respect to morphology, lifestyle and habitat. As yet, however, research has mostly focused on a few model strains and cyanobacterial diversity is insufficiently understood. In this respect, the increasing availability of fully sequenced bacterial genomes opens new and unprecedented opportunities to investigate the genetic inventory of organisms in the context of their pan-genome. Here, we seek to understand cyanobacterial diversity using a comparative genome analysis of 77 fully sequenced and assembled cyanobacterial genomes. We use phylogenetic profiling to analyze the co-occurrence of clusters of likely ortholog genes (CLOGs) and reveal novel functional associations between CLOGs that are not captured by co-localization of genes. Going beyond pair-wise co-occurrences, we propose a network approach that allows us to identify modules of co-occurring CLOGs. The extracted modules exhibit a high degree of functional coherence and reveal known as well as previously unknown functional associations. We argue that the high functional coherence observed for the modules is a consequence of the similar-yet-diverse nature of cyanobacteria. Our approach highlights the importance of a multi-strain analysis to understand gene functions and environmental adaptations, with implications beyond the cyanobacterial phylum. The analysis is augmented with a simple toolbox that facilitates further analysis to investigate the co-occurrence neighborhood of specific CLOGs of interest.
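
Phylogenetic profiling, as described in this abstract, compares the sets of genomes in which each ortholog cluster occurs. The sketch below illustrates pair-wise co-occurrence only. It assumes Jaccard similarity as the score (the abstract does not name the measure the authors used), and the profiles, names, and threshold are hypothetical.

```python
from itertools import combinations

# Hypothetical phylogenetic profiles: for each ortholog cluster, the set of
# genomes in which it is present (all identifiers are illustrative).
profiles = {
    "CLOG_1": {"g1", "g2", "g3"},
    "CLOG_2": {"g1", "g2", "g3"},
    "CLOG_3": {"g3", "g4"},
    "CLOG_4": {"g4"},
}


def jaccard(a, b):
    """Shared genomes divided by genomes containing either cluster."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cooccurring_pairs(profiles, threshold=0.5):
    """Return (cluster, cluster, score) for pairs scoring at or above threshold."""
    pairs = []
    for name_a, name_b in combinations(sorted(profiles), 2):
        score = jaccard(profiles[name_a], profiles[name_b])
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    return pairs


for name_a, name_b, score in cooccurring_pairs(profiles):
    print(f"{name_a} ~ {name_b}: {score:.2f}")
```

Treating the returned pairs as edges of a graph would be a natural starting point for the module detection the abstract mentions, although the network method itself is not specified there.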

RevDate: 2018-03-11

Dias GM, Bidault A, Le Chevalier P, et al (2018)

Vibrio tapetis Displays an Original Type IV Secretion System in Strains Pathogenic for Bivalve Molluscs.

Frontiers in microbiology, 9:227.

Brown Ring Disease (BRD) has caused high mortality rates since 1986 in the Manila clam Venerupis philippinarum, introduced and cultured in Western Europe from the 1970s. The causative agent of BRD is a Gram-negative bacterium, Vibrio tapetis, which is also pathogenic to fish. Here we report the first assembly of the complete genome of V. tapetis CECT4600T, together with the genome sequences of 16 additional strains isolated across a broad host and geographic range. Our extensive genome dataset allowed us to describe the pathogen pan- and core genomes and to identify putative virulence factors. The V. tapetis core genome consists of 3,352 genes, including multiple potential virulence factors represented by haemolysins, transcriptional regulators, Type I restriction modification system, GGDEF domain proteins, several conjugative plasmids, and a Type IV secretion system. Future research on the coevolutionary arms race between V. tapetis virulence factors and host resistance mechanisms will improve our understanding of how pathogenicity develops in this emerging pathogen.

RevDate: 2018-05-18

Méric G, Mageiros L, Pascoe B, et al (2018)

Lineage-specific plasmid acquisition and the evolution of specialized pathogens in Bacillus thuringiensis and the Bacillus cereus group.

Molecular ecology, 27(7):1524-1540.

Bacterial plasmids can vary from small selfish genetic elements to large autonomous replicons that constitute a significant proportion of total cellular DNA. By conferring novel function to the cell, plasmids may facilitate evolution but their mobility may be opposed by co-evolutionary relationships with chromosomes or encouraged via the infectious sharing of genes encoding public goods. Here, we explore these hypotheses through large-scale examination of the association between plasmids and chromosomal DNA in the phenotypically diverse Bacillus cereus group. This complex group is rich in plasmids, many of which encode essential virulence factors (Cry toxins) that are known public goods. We characterized population genomic structure, gene content and plasmid distribution to investigate the role of mobile elements in diversification. We analysed coding sequence within the core and accessory genome of 190 B. cereus group isolates, including 23 novel sequences and genes from 410 reference plasmid genomes. While cry genes were widely distributed, those with invertebrate toxicity were predominantly associated with one sequence cluster (clade 2) and phenotypically defined Bacillus thuringiensis. Cry toxin plasmids in clade 2 showed evidence of recent horizontal transfer and variable gene content, a pattern of plasmid segregation consistent with transfer during infectious cooperation. Nevertheless, comparison between clades suggests that co-evolutionary interactions may drive association between plasmids and chromosomes and limit wider transfer of key virulence traits. Proliferation of successful plasmid and chromosome combinations is a feature of specialized pathogens with characteristic niches (Bacillus anthracis, B. thuringiensis) and has occurred multiple times in the B. cereus group.

RevDate: 2018-03-29

Argemi X, Nanoukon C, Affolabi D, et al (2018)

Comparative Genomics and Identification of an Enterotoxin-Bearing Pathogenicity Island, SEPI-1/SECI-1, in Staphylococcus epidermidis Pathogenic Strains.

Toxins, 10(3): pii:toxins10030093.

Staphylococcus epidermidis is a leading cause of nosocomial infections, largely resistant to beta-lactam antibiotics, and may transfer several mobile genetic elements among the members of its own species, as well as to Staphylococcus aureus; however, a genetic exchange from S. aureus to S. epidermidis remains controversial. We recently identified two pathogenic clinical strains of S. epidermidis that produce a staphylococcal enterotoxin C3-like (SEC) similar to that encoded by S. aureus pathogenicity islands. This study aimed to determine the genetic environment of the SEC-coding sequence and to identify the mobile genetic elements. Whole-genome sequencing and annotation of the S. epidermidis strains were performed using Illumina technology and a bioinformatics pipeline for assembly, which provided evidence that the SEC-coding sequences were located in a composite pathogenicity island that was previously described in the S. epidermidis strain FRI909, called SePI-1/SeCI-1, with 83.8-89.7% nucleotide similarity. Various other plasmids were identified, particularly p_3_95 and p_4_95, which carry antibiotic resistance genes (hsrA and dfrG, respectively), and share homologies with SAP085A and pUSA04-2-SUR11, two plasmids described in S. aureus. Finally, one complete prophage was identified, ΦSE90, sharing 30 out of 52 coding sequences with the Acinetobacter phage vB_AbaM_IME200. Thus, the SePI-1/SeCI-1 pathogenicity island was identified in two pathogenic strains of S. epidermidis that produced a SEC enterotoxin causing septic shock. These findings suggest the existence of in vivo genetic exchange from S. aureus to S. epidermidis.

RevDate: 2018-03-04

Stice SP, Stumpf SD, Gitaitis RD, et al (2018)

Pantoea ananatis Genetic Diversity Analysis Reveals Limited Genomic Diversity as Well as Accessory Genes Correlated with Onion Pathogenicity.

Frontiers in microbiology, 9:184.

Pantoea ananatis is a member of the family Enterobacteriaceae and an enigmatic plant pathogen with a broad host range. Although P. ananatis strains can be aggressive on onion causing foliar necrosis and onion center rot, previous genomic analysis has shown that P. ananatis lacks the primary virulence secretion systems associated with other plant pathogens. We assessed a collection of fifty P. ananatis strains collected from Georgia over three decades to determine genetic factors that correlated with onion pathogenic potential. Previous genetic analysis studies have compared strains isolated from different hosts with varying disease potential and isolation sources. Strains varied greatly in their pathogenic potential and aggressiveness on different cultivated Allium species like onion, leek, shallot, and chive. Using multi-locus sequence analysis (MLSA) and repetitive extragenic palindrome repeat (rep)-PCR techniques, we did not observe any correlation between onion pathogenic potential and genetic diversity among strains. Whole genome sequencing and pan-genomic analysis of a sub-set of 10 strains aided in the identification of a novel series of genetic regions, likely plasmid borne, and correlating with onion pathogenicity observed on single contigs of the genetic assemblies. We named these loci Onion Virulence Regions (OVR) A-D. The OVR loci contain genes involved in redox regulation as well as pectate lyase and rhamnogalacturonase genes. Previous studies have not identified distinct genetic loci or plasmids correlating with onion foliar pathogenicity or pathogenicity on a single host pathosystem. The lack of focus on a single host system for this phytopathogenic disease necessitates the pan-genomic analysis performed in this study.

RevDate: 2018-06-01

Lee LL, Blumer-Schuette SE, Izquierdo JA, et al (2018)

Genus-Wide Assessment of Lignocellulose Utilization in the Extremely Thermophilic Genus Caldicellulosiruptor by Genomic, Pangenomic, and Metagenomic Analyses.

Applied and environmental microbiology, 84(9): pii:AEM.02694-17.

Metagenomic data from Obsidian Pool (Yellowstone National Park, USA) and 13 genome sequences were used to reassess genus-wide biodiversity for the extremely thermophilic Caldicellulosiruptor. The updated core genome contains 1,401 ortholog groups (average genome size for 13 species = 2,516 genes). The pangenome, which remains open with a revised total of 3,493 ortholog groups, encodes a variety of multidomain glycoside hydrolases (GHs). These include three cellulases with GH48 domains that are colocated in the glucan degradation locus (GDL) and are specific determinants for microcrystalline cellulose utilization. Three recently sequenced species, Caldicellulosiruptor sp. strain Rt8.B8 (renamed here Caldicellulosiruptor morganii), Thermoanaerobacter cellulolyticus strain NA10 (renamed here Caldicellulosiruptor naganoensis), and Caldicellulosiruptor sp. strain Wai35.B1 (renamed here Caldicellulosiruptor danielii), degraded Avicel and lignocellulose (switchgrass). C. morganii was more efficient than Caldicellulosiruptor bescii in this regard and differed from the other 12 species examined, both based on genome content and organization and in the specific domain features of conserved GHs. Metagenomic analysis of lignocellulose-enriched samples from Obsidian Pool revealed limited new information on genus biodiversity. Enrichments yielded genomic signatures closely related to that of Caldicellulosiruptor obsidiansis, but there was also evidence for other thermophilic fermentative anaerobes (Caldanaerobacter, Fervidobacterium, Caloramator, and Clostridium). One enrichment, containing 89.8% Caldicellulosiruptor and 9.7% Caloramator, had a capacity for switchgrass solubilization comparable to that of C. bescii. These results refine the known biodiversity of Caldicellulosiruptor and indicate that microcrystalline cellulose degradation at temperatures above 70°C, based on current information, is limited to certain members of this genus that produce GH48 domain-containing enzymes.

IMPORTANCE: The genus Caldicellulosiruptor contains the most thermophilic bacteria capable of lignocellulose deconstruction, which are promising candidates for consolidated bioprocessing for the production of biofuels and bio-based chemicals. The focus here is on the extant capability of this genus for plant biomass degradation and the extent to which this can be inferred from the core and pangenomes, based on analysis of 13 species and metagenomic sequence information from environmental samples. Key to microcrystalline hydrolysis is the content of the glucan degradation locus (GDL), a set of genes encoding glycoside hydrolases (GHs), several of which have GH48 and family 3 carbohydrate binding module domains, that function as primary cellulases. Resolving the relationship between the GDL and lignocellulose degradation will inform efforts to identify more prolific members of the genus and to develop metabolic engineering strategies to improve this characteristic.

RevDate: 2018-02-27

Castillo D, Pérez-Reytor D, Plaza N, et al (2018)

Exploring the Genomic Traits of Non-toxigenic Vibrio parahaemolyticus Strains Isolated in Southern Chile.

Frontiers in microbiology, 9:161.

Vibrio parahaemolyticus is the leading cause of seafood-borne gastroenteritis worldwide. As reported in other countries, after the rise and fall of the pandemic strain in Chile, other post-pandemic strains have been associated with clinical cases, including strains lacking the major toxins TDH and TRH. Since the presence or absence of tdh and trh genes has been used for diagnostic purposes and as a proxy of the virulence of V. parahaemolyticus isolates, the understanding of virulence in V. parahaemolyticus strains lacking toxins is essential to detect these strains present in water and marine products to avoid possible food-borne infection. In this study, we characterized the genome of four environmental and two clinical non-toxigenic strains (tdh-, trh-, and T3SS2-). Using whole-genome sequencing, phylogenetic, and comparative genome analysis, we identified the core and pan-genome of V. parahaemolyticus of strains of southern Chile. The phylogenetic tree based on the core genome showed low genetic diversity but the analysis of the pan-genome revealed that all strains harbored genomic islands carrying diverse virulence and fitness factors or prophage-like elements that encode toxins like Zot and RTX. Interestingly, the three strains carrying Zot-like toxin have a different sequence, although the alignment showed some conserved areas with the zot sequence found in V. cholerae. In addition, we identified an unexpected diversity in the genetic architecture of the T3SS1 gene cluster and the presence of the T3SS2 gene cluster in a non-pandemic environmental strain. Our study sheds light on the diversity of V. parahaemolyticus strains from the southern Pacific which increases our current knowledge regarding the global diversity of this organism.

RevDate: 2018-02-27

Duchaud E, Rochat T, Habib C, et al (2018)

Genomic Diversity and Evolution of the Fish Pathogen Flavobacterium psychrophilum.

Frontiers in microbiology, 9:138.

Flavobacterium psychrophilum, the etiological agent of rainbow trout fry syndrome and bacterial cold-water disease in salmonid fish, is currently one of the main bacterial pathogens hampering the productivity of salmonid farming worldwide. In this study, the genomic diversity of the F. psychrophilum species is analyzed using a set of 41 genomes, including 30 newly sequenced isolates. These were selected on the basis of available MLST data with the two-fold objective of maximizing the coverage of the species diversity and of allowing a focus on the main clonal complex (CC-ST10) infecting farmed rainbow trout (Oncorhynchus mykiss) worldwide. The results reveal a bacterial species harboring a limited genomic diversity both in terms of nucleotide diversity, with ~0.3% nucleotide divergence inside CDSs in pairwise genome comparisons, and in terms of gene repertoire, with the core genome accounting for ~80% of the genes in each genome. The pan-genome seems nevertheless "open" according to the scaling exponent of a power-law fitted on the rate of new gene discovery when genomes are added one-by-one. Recombination is a key component of the evolutionary process of the species as seen in the high level of apparent homoplasy in the core genome. Using a Hidden Markov Model to delineate recombination tracts in pairs of closely related genomes, the average recombination tract length was estimated to ~4.0 Kbp and the typical ratio of the contributions of recombination and mutations to nucleotide-level differentiation (r/m) was estimated to ~13. Within CC-ST10, evolutionary distances computed on non-recombined regions and comparisons between 22 isolates sampled up to 27 years apart suggest a most recent common ancestor in the second half of the nineteenth century in North America with subsequent diversification and transmission of this clonal complex coinciding with the worldwide expansion of rainbow trout farming. 
With the goal of promoting the development of tools for the genetic manipulation of F. psychrophilum, particular attention was also paid to plasmids. Their extraction and sequencing to completion revealed plasmid diversity that remained hidden to classical plasmid profiling due to size similarities.
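
Editorial illustration (not taken from the paper): the abstract above judges pan-genome openness from the scaling exponent of a power law fitted to the rate of new gene discovery as genomes are added one by one. A minimal sketch of such a fit follows, assuming the commonly used form n(N) = kappa * N^(-alpha) for the number of new genes contributed by the N-th genome, fitted by least squares in log-log space for N >= 2. Under that convention, an exponent alpha of 1 or less is usually read as an open pan-genome. The function names and toy data are hypothetical, and published analyses typically average over many random genome orderings, which this sketch omits.

```python
import math


def new_genes_per_genome(genomes):
    """Count genes first seen in each genome, in the order given."""
    seen = set()
    counts = []
    for genes in genomes:
        counts.append(len(genes - seen))
        seen |= genes
    return counts


def fit_power_law(counts):
    """Fit counts[N-1] ~ kappa * N**(-alpha) for N >= 2; return (kappa, alpha).

    Uses ordinary least squares on (log N, log count); zero counts are skipped.
    """
    points = [
        (math.log(n), math.log(c))
        for n, c in enumerate(counts, start=1)
        if n >= 2 and c > 0
    ]
    if len(points) < 2:
        raise ValueError("need at least two positive counts for N >= 2")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Hypothetical toy gene sets for three genomes.
toy = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "e"}]
toy_counts = new_genes_per_genome(toy)

# Synthetic discovery curve that follows 100 * N**-0.5 exactly for N >= 2.
synthetic = [0.0] + [100.0 * n ** -0.5 for n in range(2, 11)]
kappa, alpha = fit_power_law(synthetic)
print(toy_counts, round(kappa, 3), round(alpha, 3))
```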

RevDate: 2018-02-18

Lin H, Yu M, Wang X, et al (2018)

Comparative genomic analysis reveals the evolution and environmental adaptation strategies of vibrios.

BMC genomics, 19(1):135 pii:10.1186/s12864-018-4531-2.

BACKGROUND: Vibrios are among the most diverse and ecologically important marine bacteria, which have evolved many characteristics and lifestyles to occupy various niches. The relationship between genome features and environmental adaptation strategies is an essential part for understanding the ecological functions of vibrios in the marine system. The advent of complete genome sequencing technology has provided an important method of examining the genetic characteristics of vibrios on the genomic level.

RESULTS: Two Vibrio genomes were sequenced and found to contain many unique orthologous families that are absent from the gene pool of previously completed vibrio genomes. Comparative genomics analysis found that vibrios encompass a steady core-genome and a tremendous pan-genome, with substantial gene gain and horizontal gene transfer events in their evolutionary history. Evolutionary analysis based on the core-genome tree suggested that V. fischeri emerged ~385 million years ago, along with the occurrence of cephalopods and the flourishing of fish. The relatively large genomes, the high number of 16S rRNA gene copies, and the presence of R-M systems and CRISPR systems help vibrios live in various marine environments. Chitin-degrading related genes are carried in nearly all the Vibrio genomes. The number of chitinase genes in vibrios has expanded greatly compared with that in the most recent ancestor of the genus. The chitinase A genes were estimated to have evolved along with the genus, and have undergone significant purifying selection to conserve the ancestral state.

CONCLUSIONS: Vibrios have experienced extreme genome expansion events during their evolutionary history, allowing them to develop various functions to spread globally. Despite their close phylogenetic relationships, vibrios were found to have a tremendous pan-genome with a steady core-genome, which indicates the highly plastic genome of the genus. Additionally, the existence of various chitin-degrading related genes and the expansion of chitinase A in the genus demonstrate the importance of chitin utilization for vibrios. Defensive systems in the Vibrio genomes may protect them from the invasion of external DNA. The genomic features investigated here provide a better understanding of how the evolutionary process has forged Vibrio genomes to occupy various niches.

RevDate: 2018-04-21

Viver T, Orellana L, González-Torres P, et al (2018)

Genomic comparison between members of the Salinibacteraceae family, and description of a new species of Salinibacter (Salinibacter altiplanensis sp. nov.) isolated from high altitude hypersaline environments of the Argentinian Altiplano.

Systematic and applied microbiology, 41(3):198-212.

The application of tandem MALDI-TOF MS screening with 16S rRNA gene sequencing of selected isolates has been demonstrated to be an excellent approach for retrieving novelty from large-scale culturing. The application of such methodologies in different hypersaline samples allowed the isolation of the culture-recalcitrant Salinibacter ruber second phylotype (EHB-2) for the first time, as well as a new species recently isolated from the Argentinian Altiplano hypersaline lakes. In this study, the genome sequences of the different species of the phylum Rhodothermaeota were compared and the genetic repertoire along the evolutionary gradient was analyzed together with each intraspecific variability. Altogether, the results indicated an open pan-genome for the family Salinibacteraceae, as well as the codification of relevant traits such as diverse rhodopsin genes, CRISPR-Cas systems and spacers, and one T6SS secretion system that could give ecological advantages to an EHB-2 isolate. For the new Salinibacter species, we propose the name Salinibacter altiplanensis sp. nov. (the designated type strain is AN15T=CECT 9105T=IBRC-M 11031T).


