
Bibliography on: Pangenome

The Electronic Scholarly Publishing Project: Providing world-wide, free access to classic scientific papers and other scholarly materials, since 1993.


ESP: PubMed Auto Bibliography. Created: 09 Apr 2020 at 01:32


Although the enforced stability of genomic content is ubiquitous among multicellular eukaryotes (MCEs), the opposite is proving to be the case among prokaryotes, which exhibit remarkable and adaptive plasticity of genomic content. Early bacterial whole-genome sequencing efforts discovered that whenever a particular "species" was re-sequenced, new genes were found that had not been detected earlier: entirely new genes, not merely new alleles. This led to the concepts of the bacterial core-genome, the set of genes found in all members of a particular "species", and the flex-genome, the set of genes found in some, but not all, members of the "species". Together these make up the species' pan-genome.
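The three definitions above map directly onto set operations. The following is a minimal sketch, assuming each genome is represented simply as a set of gene-family identifiers; the function name and the toy strains are hypothetical and are not taken from any of the papers listed here.

```python
from typing import Dict, Set, Tuple


def partition_pangenome(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (core, flex, pan) gene sets for a collection of genomes."""
    gene_sets = list(genomes.values())
    if not gene_sets:
        return set(), set(), set()
    pan = set().union(*gene_sets)            # genes found in at least one genome
    core = set.intersection(*gene_sets)      # genes found in every genome
    flex = pan - core                        # genes found in some, but not all, genomes
    return core, flex, pan


# Hypothetical toy data: three strains of one "species".
strains = {
    "strain_A": {"gene1", "gene2", "gene3"},
    "strain_B": {"gene1", "gene2", "gene4"},
    "strain_C": {"gene1", "gene3", "gene5"},
}

core, flex, pan = partition_pangenome(strains)
print("core:", sorted(core))
print("flex:", sorted(flex))
print("pan: ", sorted(pan))
```

Real pangenome tools first have to cluster genes into families (by sequence similarity, synteny, or both) before any such set arithmetic is meaningful; that clustering step is deliberately left out here.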

Created with PubMed® Query: pangenome or "pan-genome" or "pan genome" NOT pmcbook NOT ispreviousversion
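For readers who want to rerun a search like this programmatically, the sketch below builds a request URL for the NCBI E-utilities `esearch` endpoint from the query string above. It only constructs the URL and makes no network call. The `retmax` value of 100 and the JSON return mode are assumptions chosen for illustration, and nothing here is claimed to be how the ESP site itself generates the bibliography.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query string copied from the line above.
QUERY = 'pangenome or "pan-genome" or "pan genome" NOT pmcbook NOT ispreviousversion'


def build_esearch_url(term: str, retmax: int = 100) -> str:
    """Build an esearch URL for PubMed; retmax and retmode are illustrative choices."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


url = build_esearch_url(QUERY)
print(url)
```

Fetching the URL returns a list of PubMed identifiers; retrieving the citation details would need a second request (for example to `efetch`), which is omitted here.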

Citations: The Papers (from PubMed®)

RevDate: 2020-04-08

Zhou Y, Chebotarov D, Kudrna D, et al (2020)

A platinum standard pan-genome resource that represents the population structure of Asian rice.

Scientific data, 7(1):113 pii:10.1038/s41597-020-0438-2.

As the human population grows from 7.8 billion to 10 billion over the next 30 years, breeders must do everything possible to create crops that are highly productive and nutritious, while simultaneously having less of an environmental footprint. Rice will play a critical role in meeting this demand and thus, knowledge of the full repertoire of genetic diversity that exists in germplasm banks across the globe is required. To meet this demand, we describe the generation, validation and preliminary analyses of transposable element and long-range structural variation content of 12 near-gap-free reference genome sequences (RefSeqs) from representatives of 12 of 15 subpopulations of cultivated Asian rice. When combined with 4 existing RefSeqs, that represent the 3 remaining rice subpopulations and the largest admixed population, this collection of 16 Platinum Standard RefSeqs (PSRefSeq) can be used as a template to map resequencing data to detect virtually all standing natural variation that exists in the pan-genome of cultivated Asian rice.

RevDate: 2020-04-04

Smith EA, Miller EA, Weber BP, et al (2020)

Genomic landscape of Ornithobacterium rhinotracheale in commercial turkey production in the United States.

Applied and environmental microbiology pii:AEM.02874-19 [Epub ahead of print].

Ornithobacterium rhinotracheale (ORT) is a causative agent of respiratory tract infections in avian hosts worldwide, but is a particular problem for commercial turkey production. Little is known about the ecologic and evolutionary dynamics of ORT, which makes prevention and control of this pathogen a challenge. The purpose of this study was to gain insight into the genetic relationships between ORT populations through comparative genomics of clinical isolates from different US turkey producers. ORT clinical isolates were collected from four major US turkey producers and several independent turkey growers from the upper Midwest and Southeast, and whole-genome sequencing was performed. Genomes were compared phylogenetically using single nucleotide polymorphism (SNP)-based analysis, and then assemblies and annotations were performed to identify genes encoding putative virulence factors and antimicrobial resistance determinants. A pangenome approach was also used to establish a core set of genes consistently present in ORT, and to highlight differences in gene content between phylogenetic clades. A total of 1,457 non-recombinant SNPs were identified from 157 ORT genomes, and four distinct phylogenetic clades were identified. Isolates clustered by company on the phylogenetic tree; however, each company had isolates in multiple clades with similar collection dates, indicating that there are multiple ORT strains circulating within each of the companies examined. Additionally, several antimicrobial resistance proteins, putative virulence factors, and the pOR1 plasmid were associated with particular clades and multi-locus sequence types, which may explain why the same strains seem to have persisted in the same turkey operations for decades.

IMPORTANCE: The whole-genome approach enhances our understanding of evolutionary relationships between clinical ORT isolates from different commercial turkey producers, and allows for identification of genes associated with virulence, antimicrobial resistance, or mobile genetic elements that are often excluded using traditional typing methods. Additionally, differentiating ORT isolates at the whole-genome level may provide insight into selection of the most appropriate autogenous vaccine strain, or groups of strains, for a given population of clinical isolates.

RevDate: 2020-04-02

Zhu L, Zhao M, Chen M, et al (2020)

The bHLH gene family and its response to saline stress in Jilin ginseng, Panax ginseng C.A. Meyer.

Molecular genetics and genomics : MGG pii:10.1007/s00438-020-01658-w [Epub ahead of print].

Basic helix-loop-helix (bHLH) gene family is a gene family of transcription factors that plays essential roles in plant growth and development, secondary metabolism and response to biotic and abiotic stresses. Therefore, a comprehensive knowledge of the bHLH gene family is paramount to understand the molecular mechanisms underlying these processes and develop advanced technologies to manipulate the processes efficiently. Ginseng, Panax ginseng C.A. Meyer, is a well-known medicinal herb; however, little is known about the bHLH genes (PgbHLH) in the species. Here, we identified 137 PgbHLH genes from Jilin ginseng cultivar, Damaya, widely cultivated in Jilin, China, of which 50 are newly identified by pan-genome analysis. These 137 PgbHLH genes were phylogenetically classified into 26 subfamilies, suggesting their sequence diversification. They are alternatively spliced into 366 transcripts in a 4-year-old plant and involved in 11 functional subcategories of the gene ontology, indicating their functional differentiation in ginseng. The expressions of the PgbHLH genes dramatically vary spatio-temporally and across 42 genotypes, but they are still somehow functionally correlated. Moreover, the PgbHLH gene family, at least some of its genes, is shown to have roles in plant response to the abiotic stress of saline. These results provide a new insight into the evolution and functional differentiation of the bHLH gene family in plants, new bHLH genes to the PgbHLH gene family, and saline stress-responsive genes for genetic improvement in ginseng and other plant species.

RevDate: 2020-04-01

Niu XK, Narsing Rao MP, Dong ZY, et al (2020)

Vulcaniibacterium gelatinicum sp. nov., a moderately thermophilic bacterium isolated from a hot spring.

International journal of systematic and evolutionary microbiology, 70(3):1571-1577.

The present study aimed to determine the taxonomic positions of strains designated R-5-52-3T, R-5-33-5-1-2, R-5-48-2 and R-5-51-4 isolated from hot spring water samples. Cells of these strains were Gram-stain-negative, non-motile and rod-shaped. The strains shared highest 16S rRNA gene sequence similarity with Vulcaniibacterium thermophilum KCTC 32020T (95.1%). Growth occurred at 28-55 °C, at pH 6-8 and with up to 3 % (w/v) NaCl. DNA fingerprinting, biochemical, phylogenetic and 16S rRNA gene sequence analyses suggested that R-5-52-3T, R-5-33-5-1-2, R-5-48-2 and R-5-51-4 were different strains but belonged to the same species. Hence, R-5-52-3T was chosen for further analysis and R-5-33-5-1-2, R-5-48-2 and R-5-51-4 were considered as additional strains of this species. R-5-52-3T possessed Q-8 as the only quinone and iso-C15:0, iso-C11:0, C16 : 0 and iso-C17 : 0 as major fatty acids. The polar lipids were diphosphatidylglycerol, phosphatidylglycerol, phosphatidylethanolamine, unidentified polar lipids and two unidentified phospholipids. The genomic G+C content was 71.6 mol%. Heat shock proteins (e.g. Hsp20, GroEL, DnaK and Clp ATPases) were noted in the R-5-52-3T genome, which could suggest its protection in the hot spring environment. Pan-genome analysis showed the number of singleton gene clusters among Vulcaniibacterium members varied. Average nucleotide identity (ANI) values between R-5-52-3T, Vulcaniibacterium tengchongense YIM 77520T and V. thermophilum KCTC 32020T were 80.1-85.8 %, which were below the cut-off level (95-96 %) recommended as the ANI criterion for interspecies identity. Thus, based on the above results, strain R-5-52-3T represents a novel species of the genus Vulcaniibacterium, for which the name Vulcaniibacterium gelatinicum sp. nov. is proposed. The type strain is R-5-52-3T (=KCTC 72061T=CGMCC 1.16678T).
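The species-delineation argument in this abstract comes down to a threshold comparison: ANI values of 80.1-85.8 % fall below the 95-96 % cut-off. A minimal sketch of that check follows, assuming the lower bound of the quoted range (95 %) as the default cut-off; the function name is hypothetical, and the code does not compute ANI itself.

```python
# Lower bound of the 95-96 % range quoted in the abstract (an illustrative default).
ANI_SPECIES_CUTOFF = 95.0


def same_species_by_ani(ani_percent: float, cutoff: float = ANI_SPECIES_CUTOFF) -> bool:
    """Return True if an ANI value meets the species-level cut-off."""
    return ani_percent >= cutoff


# ANI range reported between R-5-52-3T and the two reference type strains.
reported_ani_range = (80.1, 85.8)

novel = all(not same_species_by_ani(value) for value in reported_ani_range)
print("Consistent with a novel species:", novel)
```

In practice ANI is one line of evidence among several; the abstract also relies on 16S rRNA similarity, chemotaxonomy, and phenotype.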

RevDate: 2020-03-21

Dunning LT, PA Christin (2020)

Reticulate evolution, lateral gene transfer, and innovation in plants.

American journal of botany [Epub ahead of print].

RevDate: 2020-03-20

Muthukumarasamy U, Preusse M, Kordes A, et al (2020)

Single-nucleotide polymorphism-based genetic diversity analysis of clinical Pseudomonas aeruginosa isolates.

Genome biology and evolution pii:5810496 [Epub ahead of print].

Extensive use of next-generation sequencing has the potential to transform our knowledge on how genomic variation within bacterial species impacts phenotypic versatility. Since different environments have unique selection pressures, they drive divergent evolution. However, there is also parallel or convergent evolution of traits in independent bacterial isolates inhabiting similar environments. The application of tools to describe population-wide genomic diversity provides an opportunity to measure the predictability of genetic changes underlying adaptation. Here we describe patterns of sequence variations in the core genome among 99 individual Pseudomonas aeruginosa clinical isolates and identified single nucleotide polymorphisms (SNPs) that are the basis for branching of the phylogenetic tree. We also identified SNPs that were acquired independently, in separate lineages, and not through inheritance from a common ancestor. While our results demonstrate that the P. aeruginosa core genome is highly conserved and in general, not subject to adaptive evolution, instances of parallel evolution will provide an opportunity to uncover genetic changes that underlie phenotypic diversity.

RevDate: 2020-03-19

Gautreau G, Bazin A, Gachet M, et al (2020)

PPanGGOLiN: Depicting microbial diversity via a partitioned pangenome graph.

PLoS computational biology, 16(3):e1007732 pii:PCOMPBIOL-D-19-02015 [Epub ahead of print].

The use of comparative genomics for functional, evolutionary, and epidemiological studies requires methods to classify gene families in terms of occurrence in a given species. These methods usually lack multivariate statistical models to infer the partitions and the optimal number of classes and don't account for genome organization. We introduce a graph structure to model pangenomes in which nodes represent gene families and edges represent genomic neighborhood. Our method, named PPanGGOLiN, partitions nodes using an Expectation-Maximization algorithm based on multivariate Bernoulli Mixture Model coupled with a Markov Random Field. This approach takes into account the topology of the graph and the presence/absence of genes in pangenomes to classify gene families into persistent, cloud, and one or several shell partitions. By analyzing the partitioned pangenome graphs of isolate genomes from 439 species and metagenome-assembled genomes from 78 species, we demonstrate that our method is effective in estimating the persistent genome. Interestingly, it shows that the shell genome is a key element to understand genome dynamics, presumably because it reflects how genes present at intermediate frequencies drive adaptation of species, and its proportion in genomes is independent of genome size. The graph-based approach proposed by PPanGGOLiN is useful to depict the overall genomic diversity of thousands of strains in a compact structure and provides an effective basis for very large scale comparative genomics. The software is freely available at

RevDate: 2020-03-19

Hasni I, Andréani J, Colson P, et al (2020)

Description of Virulent Factors and Horizontal Gene Transfers of Keratitis-Associated Amoeba Acanthamoeba Triangularis by Genome Analysis.

Pathogens (Basel, Switzerland), 9(3): pii:pathogens9030217.

Acanthamoeba triangularis strain SH 621 is a free-living amoeba belonging to Acanthamoeba ribo-genotype T4. This ubiquitous protist is among the free-living amoebas responsible for Acanthamoeba keratitis, a severe infection of the human cornea. Genome sequencing and genomic comparison were carried out to explore the biological functions and to better understand the virulence mechanism related to the pathogenicity of Acanthamoeba keratitis. The genome assembly had a length of 66.43 Mb encompassing 13,849 scaffolds. The analysis of predicted proteins reported the presence of 37,062 ORFs. A complete annotation revealed 33,168 and 16,605 genes that matched with NCBI non-redundant protein sequence (nr) and Cluster of Orthologous Group of proteins (COG) databases, respectively. The Kyoto Encyclopedia of Genes and Genomes Pathway (KEGG) annotation reported a great number of genes related to carbohydrate, amino acid and lipid metabolic pathways. The pangenome analysis performed with 8 available amoeba genomes belonging to genus Acanthamoeba revealed a core genome containing 843 clusters of orthologous genes with a ratio core genome/pangenome of less than 0.02. We detected 48 genes related to virulence factors of Acanthamoeba keratitis. Best hit analyses in the nr database identified 99 homologous genes shared with amoeba-resisting microorganisms. This study deciphers the genome of a free-living amoeba of medical interest and provides genomic data to better understand virulence-related Acanthamoeba keratitis.

RevDate: 2020-03-19

Kim YJ, Park JY, Balusamy SR, et al (2020)

Comprehensive Genome Analysis on the Novel Species Sphingomonas panacis DCY99T Reveals Insights into Iron Tolerance of Ginseng.

International journal of molecular sciences, 21(6): pii:ijms21062019.

Plant growth-promoting rhizobacteria play vital roles not only in plant growth, but also in reducing biotic/abiotic stress. Sphingomonas panacis DCY99T was isolated from soil and root of Panax ginseng with rusty root disease, which is characterized by raised reddish-brown root and seriously affects ginseng cultivation. To investigate the relationship between 159 sequenced Sphingomonas strains, pan-genome analysis was carried out, which suggested genomic diversity of the Sphingomonas genus. Comparative analysis of S. panacis DCY99T with Sphingomonas sp. LK11 revealed plant growth-promoting potential of S. panacis DCY99T through indole acetic acid production, phosphate solubilizing, and antifungal abilities. Detailed genomic analysis has shown that S. panacis DCY99T contains various heavy metal resistance genes in its genome and the plasmid. Functional analysis with Sphingomonas paucimobilis EPA505 predicted that S. panacis DCY99T possesses genes for degradation of polyaromatic hydrocarbon and phenolic compounds in rusty-ginseng root. Interestingly, when ginseng was primed with S. panacis DCY99T during exposure to a high concentration of iron, iron stress of ginseng was suppressed. In order to detect S. panacis DCY99T in soil, a biomarker was designed using the spt gene. This study brings new insights into the role of S. panacis DCY99T as a microbial inoculant to protect ginseng plants against rusty root disease.

RevDate: 2020-03-18

Kang SM, Asaf S, Khan AL, et al (2020)

Complete Genome Sequence of Pseudomonas psychrotolerans CS51, a Plant Growth-Promoting Bacterium, Under Heavy Metal Stress Conditions.

Microorganisms, 8(3): pii:microorganisms8030382.

In the current study, we aimed to elucidate the plant growth-promoting characteristics of Pseudomonas psychrotolerans CS51 under heavy metal stress conditions (Zn, Cu, and Cd) and determine the genetic makeup of the CS51 genome using the single-molecule real-time (SMRT) sequencing technology of Pacific Biosciences. The results revealed that inoculation with CS51 induced endogenous indole-3-acetic acid (IAA) and gibberellins (GAs), which significantly enhanced cucumber growth (root shoot length) and increased the heavy metal tolerance of cucumber plants. Moreover, genomic analysis revealed that the CS51 genome consisted of a circular chromosome of 5,364,174 base pairs with an average G+C content of 64.71%. There were around 4774 predicted protein-coding sequences (CDSs) in 4859 genes, 15 rRNA genes, and 67 tRNA genes. Around 3950 protein-coding genes with function prediction and 733 genes without function prediction were identified. Furthermore, functional analyses predicted that the CS51 genome could encode genes required for auxin biosynthesis, nitrate and nitrite ammonification, the phosphate-specific transport system, and the sulfate transport system, which are beneficial for plant growth promotion. The heavy metal resistance of CS51 was confirmed by the presence of genes responsible for cobalt-zinc-cadmium resistance, nickel transport, and copper homeostasis in the CS51 genome. The extrapolation of the curve showed that the core genome contained a minimum of 2122 genes (95% confidence interval = 2034.24 to 2080.215). Our findings indicated that the genome sequence of CS51 may be used as an eco-friendly bioresource to promote plant growth in heavy metal-contaminated areas.

RevDate: 2020-03-14

Satyam R, Bhardwaj T, Jha NK, et al (2020)

Toward a chimeric vaccine against multiple isolates of Mycobacteroides - An integrative approach.

Life sciences pii:S0024-3205(20)30289-7 [Epub ahead of print].

AIM: Nontuberculous mycobacterial infection (NTM) such as endophthalmitis, dacryocystitis, and canaliculitis are pervasive across the globe and are currently managed by antibiotics. However, the recent cases of Mycobacteroides developing drug resistance reported along with the improper practice of medicine intrigued us to explore its genomic and proteomic canvas at a global scale and develop a chimeric vaccine against Mycobacteroides.

MAIN METHODS: We carried out a vivid genomic study on five recently sequenced strains of Mycobacteroides and explored their Pan-Core genome/proteome in three different Phases. The promiscuous antigenic proteins were identified via a subtractive proteomics approach that qualified for virulence causation, resistance and essentiality factors for this notorious bacterium. An integrated pipeline was developed for the identification of B-Cell, MHC (Major histocompatibility complex) class I and II epitopes.

KEY FINDINGS: Phase I identified the shreds of evidence of reductive evolution and propensity of the Pan-genome of Mycobacteroides getting closed soon. Phase II and Phase III produced 8 vaccine constructs. Our final vaccine construct, V6 qualified for all tests such as absence for allergenicity, presence of antigenicity, etc. V6 contains β defensin as an adjuvant, linkers, LAMP1 (Lysosomal-associated membrane protein 1) signal peptide, and PADRE (Pan HLA-DR epitopes) amino acid sequence. Besides, V6 also interacts with a maximum number of MHC molecules and the TLR4/MD2 (Toll-like Receptor 4/Myeloid Differentiation Factor 2) complex confirmed by docking and molecular dynamics simulation studies.

SIGNIFICANCE: The knowledge harnessed from the current study can help improve the current treatment regimens or in an event of an outbreak and propel further related studies.

RevDate: 2020-03-10

Chen M, Xu CY, Wang X, et al (2020)

Comparative genomics analysis of c-di-GMP metabolism and regulation in Microcystis aeruginosa.

BMC genomics, 21(1):217 pii:10.1186/s12864-020-6591-3.

BACKGROUND: Cyanobacteria are of special concern because they proliferate in eutrophic water bodies worldwide and affect water quality. As an ancient photosynthetic microorganism, cyanobacteria can survive in ecologically diverse habitats because of their capacity to rapidly respond to environmental changes through a web of complex signaling networks, including using second messengers to regulate physiology or metabolism. A ubiquitous second messenger, bis-(3',5')-cyclic-dimeric-guanosine monophosphate (c-di-GMP), has been found to regulate essential behaviors in a few cyanobacteria but not Microcystis, which are the most dominant species in cyanobacterial blooms. In this study, comparative genomics analysis was performed to explore the genomic basis of c-di-GMP signaling in Microcystis aeruginosa.

RESULTS: Proteins involved in c-di-GMP metabolism and regulation, such as diguanylate cyclases, phosphodiesterases, and PilZ-containing proteins, were encoded in M. aeruginosa genomes. However, the number of identified protein domains involved in c-di-GMP signaling was not proportional to the size of M. aeruginosa genomes (4.97 Mb in average). Pan-genome analysis showed that genes involved in c-di-GMP metabolism and regulation are conservative in M. aeruginosa strains. Phylogenetic analysis showed good congruence between the two types of phylogenetic trees based on 31 highly conserved protein-coding genes and sensor domain-coding genes. Propensity for gene loss analysis revealed that most of genes involved in c-di-GMP signaling are stable in M. aeruginosa strains. Moreover, bioinformatics and structure analysis of c-di-GMP signal-related GGDEF and EAL domains revealed that they all possess essential conserved amino acid residues that bind the substrate. In addition, it was also found that all selected M. aeruginosa genomes encode PilZ domain containing proteins.

CONCLUSIONS: Comparative genomics analysis of c-di-GMP metabolism and regulation in M. aeruginosa strains helped elucidating the genetic basis of c-di-GMP signaling pathways in M. aeruginosa. Knowledge of c-di-GMP metabolism and relevant signal regulatory processes in cyanobacteria can enhance our understanding of their adaptability to various environments and bloom-forming mechanism.

RevDate: 2020-03-09

Aaltonen K, Kant R, Eklund M, et al (2020)

Streptococcus halichoeri: Comparative Genomics of an Emerging Pathogen.

International journal of genomics, 2020:8708305.

Streptococcus halichoeri is an emerging pathogen with a variety of host species and zoonotic potential. It has been isolated from grey seals and other marine mammals as well as from human infections. Beginning in 2010, two concurrent epidemics were identified in Finland, in fur animals and domestic dogs, respectively. The fur animals suffered from a new disease fur animal epidemic necrotic pyoderma (FENP) and the dogs presented with ear infections with poor treatment response. S. halichoeri was isolated in both studies, albeit among other pathogens, indicating a possible role in the disease etiologies. The aim was to find a possible common origin of the fur animal and dog isolates and study the virulence factors to assess pathogenic potential. Isolates from seal, human, dogs, and fur animals were obtained for comparison. The whole genomes were sequenced from 20 different strains using the Illumina MiSeq platform and annotated using an automatic annotation pipeline RAST. The core and pangenomes were formed by comparing the genomes against each other in an all-against-all comparison. A phylogenetic tree was constructed using the genes of the core genome. Virulence factors were assessed using the Virulence Factor Database (VFDB) concentrating on the previously confirmed streptococcal factors. A core genome was formed which encompassed approximately half of the genes in Streptococcus halichoeri. The resulting core was nearly saturated and would not change significantly by adding more genomes. The remaining genes formed the pangenome which was highly variable and would still evolve after additional genomes. The results highlight the great adaptability of this bacterium possibly explaining the ease at which it switches hosts and environments. Virulence factors were also analyzed and were found primarily in the core genome. They represented many classes and functions, but the largest single category was adhesins which again supports the marine origin of this species.

RevDate: 2020-03-06

Moustafa AM, PJ Planet (2020)

WhatsGNU: a tool for identifying proteomic novelty.

Genome biology, 21(1):58 pii:10.1186/s13059-020-01965-w.

To understand diversity in enormous collections of genome sequences, we need computationally scalable tools that can quickly contextualize individual genomes based on their similarities and identify features of each genome that make them unique. We present WhatsGNU, a tool based on exact match proteomic compression that, in seconds, classifies any new genome and provides a detailed report of protein alleles that may have novel functional differences. We use this technique to characterize the total allelic diversity (panallelome) of Salmonella enterica, Mycobacterium tuberculosis, Pseudomonas aeruginosa, and Staphylococcus aureus. It could be extended to others. WhatsGNU is available from

RevDate: 2020-03-05

Seif Y, Choudhary KS, Hefner Y, et al (2020)

Metabolic and genetic basis for auxotrophies in Gram-negative species.

Proceedings of the National Academy of Sciences of the United States of America pii:1910499117 [Epub ahead of print].

Auxotrophies constrain the interactions of bacteria with their environment, but are often difficult to identify. Here, we develop an algorithm (AuxoFind) using genome-scale metabolic reconstruction to predict auxotrophies and apply it to a series of available genome sequences of over 1,300 Gram-negative strains. We identify 54 auxotrophs, along with the corresponding metabolic and genetic basis, using a pangenome approach, and highlight auxotrophies conferring a fitness advantage in vivo. We show that the metabolic basis of auxotrophy is species-dependent and varies with 1) pathway structure, 2) enzyme promiscuity, and 3) network redundancy. Various levels of complexity constitute the genetic basis, including 1) deleterious single-nucleotide polymorphisms (SNPs), in-frame indels, and deletions; 2) single/multigene deletion; and 3) movement of mobile genetic elements (including prophages) combined with genomic rearrangements. Fourteen out of 19 predictions agree with experimental evidence, with the remaining cases highlighting shortcomings of sequencing, assembly, annotation, and reconstruction that prevent predictions of auxotrophies. We thus develop a framework to identify the metabolic and genetic basis for auxotrophies in Gram-negatives.

RevDate: 2020-03-05

Jin Y, Zhou J, Zhou J, et al (2020)

Genome-based classification of Burkholderia cepacia complex provides new insight into its taxonomic status.

Biology direct, 15(1):6 pii:10.1186/s13062-020-0258-5.

BACKGROUND: Accurate classification of different Burkholderia cepacia complex (BCC) species is essential for therapy, prognosis assessment and research. The taxonomic status of BCC remains problematic and an improved knowledge about the classification of BCC is in particular needed.

METHODS: We compared phylogenetic trees of BCC based on 16S rRNA, recA, hisA and MLSA (multilocus sequence analysis). Using the available whole genome sequences of BCC, we inferred a species tree based on estimated single-copy orthologous genes and demarcated species of BCC using dDDH/ANI clustering.

RESULTS: We showed that 16S rRNA, recA, hisA and MLSA have limited resolutions in the taxonomic study of closely related bacteria such as BCC. Our estimated species tree and dDDH/ANI clustering clearly separated 116 BCC strains into 36 clusters. With the appropriate reclassification of misidentified strains, these clusters corresponded to 22 known species as well as 14 putative novel species.

CONCLUSIONS: This is the first large-scale and systematic study of the taxonomic status of the BCC and could contribute to further insights into BCC taxonomy. Our study suggested that conjunctive use of core phylogeny based on single-copy orthologous genes, as well as pangenome-based dDDH/ANI clustering would provide a preferable framework for demarcating closely related species.

REVIEWER: This article was reviewed by Dr. Xianwen Ren.

RevDate: 2020-03-04

Thukral A, Ross K, Hansen C, et al (2020)

A single dose polyanhydride-based nanovaccine against paratuberculosis infection.

NPJ vaccines, 5:15 pii:164.

Mycobacterium avium subsp. paratuberculosis (M. paratuberculosis) causes Johne's disease in ruminants and is characterized by chronic gastroenteritis leading to heavy economic losses to the dairy industry worldwide. The currently available vaccine (inactivated bacterin in oil base) is not effective in preventing pathogen shedding and is rarely used to control Johne's disease in dairy herds. To develop a better vaccine that can prevent the spread of Johne's disease, we utilized polyanhydride nanoparticles (PAN) to encapsulate mycobacterial antigens composed of whole cell lysate (PAN-Lysate) and culture filtrate (PAN-Cf) of M. paratuberculosis. These nanoparticle-based vaccines (i.e., nanovaccines) were well tolerated in mice causing no inflammatory lesions at the site of injection. Immunological assays demonstrated a substantial increase in the levels of antigen-specific T cell responses post-vaccination in the PAN-Cf vaccinated group as indicated by high percentages of triple cytokine (IFN-γ, IL-2, TNF-α) producing CD8+ T cells. Following challenge, animals vaccinated with PAN-Cf continued to produce significant levels of double (IFN-γ, TNF-α) and single cytokine (IFN-γ) secreting CD8+ T cells compared with animals vaccinated with an inactivated vaccine. A significant reduction in bacterial load was observed in multiple organs of animals vaccinated with PAN-Cf, which is a clear indication of protection. Overall, the use of polyanhydride nanovaccines resulted in development of protective and sustained immunity against Johne's disease, an approach that could be applied to counter other intracellular pathogens.

RevDate: 2020-02-28

Tekedar HC, Blom J, Kalindamar S, et al (2020)

Comparative genomics of the fish pathogens Edwardsiella ictaluri 93-146 and Edwardsiella piscicida C07-087.

Microbial genomics, 6(2):.

Edwardsiella ictaluri and Edwardsiella piscicida are important fish pathogens affecting cultured and wild fish worldwide. To investigate the genome-level differences and similarities between catfish-adapted strains in these two species, the complete E. ictaluri 93-146 and E. piscicida C07-087 genomes were evaluated by applying comparative genomics analysis. All available complete (10) and non-complete (19) genomes from five Edwardsiella species were also included in a systematic analysis. Average nucleotide identity and core-genome phylogenetic tree analyses indicated that the five Edwardsiella species were separated from each other. Pan-/core-genome analyses for the 29 strains from the five species showed that genus Edwardsiella members have 9474 genes in their pan genome, while the core genome consists of 1421 genes. Orthology cluster analysis showed that E. ictaluri and E. piscicida genomes have the greatest number of shared clusters. However, E. ictaluri and E. piscicida also have unique features; for example, the E. ictaluri genome encodes urease enzymes and cytochrome o ubiquinol oxidase subunits, whereas E. piscicida genomes encode tetrathionate reductase operons, capsular polysaccharide synthesis enzymes and vibrioferrin-related genes. Additionally, we report for what is believed to be the first time that E. ictaluri 93-146 and three other E. ictaluri genomes encode a type IV secretion system (T4SS), whereas none of the E. piscicida genomes encode this system. Additionally, the E. piscicida C07-087 genome encodes two different type VI secretion systems. E. ictaluri genomes tend to encode more insertion elements, phage regions and genomic islands than E. piscicida. We speculate that the T4SS could contribute to the increased number of mobilome elements in E. ictaluri compared to E. piscicida. Two of the E. piscicida genomes encode full CRISPR-Cas regions, whereas none of the E. ictaluri genomes encode Cas proteins. Overall, comparison of the E. ictaluri and E. piscicida genomes reveals unique features and provides new insights on pathogenicity that may reflect the host adaptation of the two species.

RevDate: 2020-02-28

Li Q, Cooper RE, Wegner CE, et al (2020)

Molecular Mechanisms Underpinning Aggregation in Acidiphilium sp. C61 Isolated from Iron-Rich Pelagic Aggregates.

Microorganisms, 8(3): pii:microorganisms8030314.

Iron-rich pelagic aggregates (iron snow) are hot spots for microbial interactions. Using iron snow isolates, we previously demonstrated that the iron-oxidizer Acidithrix sp. C25 triggers Acidiphilium sp. C61 aggregation by producing the infochemical 2-phenethylamine (PEA). Here, we showed slightly enhanced aggregate formation in the presence of PEA on different Acidiphilium spp. but not other iron-snow microorganisms, including Acidocella sp. C78 and Ferrovum sp. PN-J47. Next, we sequenced the Acidiphilium sp. C61 genome to reconstruct its metabolic potential. Pangenome analyses of Acidiphilium spp. genomes revealed the core genome contained 65 gene clusters associated with aggregation, including autoaggregation, motility, and biofilm formation. Screening the Acidiphilium sp. C61 genome revealed the presence of autotransporter, flagellar, and extracellular polymeric substances (EPS) production genes. RNA-seq analyses of Acidiphilium sp. C61 incubations (+/- 10 µM PEA) indicated genes involved in energy production, respiration, and genetic processing were the most upregulated differentially expressed genes in the presence of PEA. Additionally, genes involved in flagellar basal body synthesis were highly upregulated, whereas the expression pattern of biofilm formation-related genes was inconclusive. Our data shows aggregation is a common trait among Acidiphilium spp. and PEA stimulates the central cellular metabolism, potentially advantageous in aggregates rapidly falling through the water column.

RevDate: 2020-02-27

González-Castillo A, Enciso-Ibarra J, B Gomez-Gil (2020)

Genomic taxonomy of the Mediterranei clade of the genus Vibrio (Gammaproteobacteria).

Antonie van Leeuwenhoek pii:10.1007/s10482-020-01396-4 [Epub ahead of print].

The first genomic study of Mediterranei clade using five type strains (V. mediterranei, V. maritimus, V. variabilis, V. thalassae, and V. barjaei) and fourteen reference strains isolated from marine organisms, seawater, water and sediments of the sea was performed. These bacterial strains were characterised by means of a polyphasic approach comprising 16S rRNA gene, multilocus sequence analysis (MLSA) of 139 single-copy genes, the DNA G + C content, ANI, and in silico phenotypic characterisation. We found that the species of the Mediterranei clade formed two separate clusters based in 16S rRNA gene sequence similarity, MLSA, OrthoANI, and Codon and Amino Acid usage. The Mediterranei clade species showed values between 76 and 95% for ANIb, 84 and 95% for ANIm. The core genome consisted of 2057 gene families and the pan-genome of 13,094 gene families. Based on the genomic analyses performed, the Mediterranei clade can be divided in two clusters, one with the strains of V. maritimus, V. variabilis and two potential new species, and the other cluster with the strains of V. mediterranei, V. thalassae, and V. barjaei.

RevDate: 2020-02-26

Whelan FJ, Rusilowicz M, JO McInerney (2020)

Coinfinder: detecting significant associations and dissociations in pangenomes.

Microbial genomics [Epub ahead of print].

The accessory genes of prokaryote and eukaryote pangenomes accumulate by horizontal gene transfer, differential gene loss, and the effects of selection and drift. We have developed Coinfinder, a software program that assesses whether sets of homologous genes (gene families) in pangenomes associate or dissociate with each other (i.e. are 'coincident') more often than would be expected by chance. Coinfinder employs a user-supplied phylogenetic tree in order to assess the lineage-dependence (i.e. the phylogenetic distribution) of each accessory gene, allowing Coinfinder to focus on coincident gene pairs whose joint presence is not simply because they happened to appear in the same clade, but rather that they tend to appear together more often than expected across the phylogeny. Coinfinder is implemented in C++, Python3 and R and is freely available under the GNU license from
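The idea of asking whether two gene families co-occur "more often than would be expected by chance" can be illustrated with a small sketch. The code below is not Coinfinder's algorithm: it ignores phylogeny entirely (the lineage correction is the part Coinfinder adds) and simply computes a one-sided hypergeometric tail probability for the overlap between the genomes carrying family A and those carrying family B. The function name `cooccurrence_p_value` and the toy data are assumptions made for illustration.

```python
from math import comb
from typing import Set


def cooccurrence_p_value(genomes_with_a: Set[int],
                         genomes_with_b: Set[int],
                         n_genomes: int) -> float:
    """One-sided hypergeometric tail probability.

    Returns P(overlap >= observed overlap) if the genomes carrying family B
    were drawn at random from all n_genomes, independently of family A.
    No phylogenetic correction is applied (hypothetical, simplified test).
    """
    a = len(genomes_with_a)
    b = len(genomes_with_b)
    observed = len(genomes_with_a & genomes_with_b)
    total = comb(n_genomes, b)  # all ways to place family B in b genomes
    tail = sum(
        comb(a, i) * comb(n_genomes - a, b - i)
        for i in range(observed, min(a, b) + 1)
    )
    return tail / total


# Toy example: 10 genomes, both families present in exactly the same 5.
p_same = cooccurrence_p_value({0, 1, 2, 3, 4}, {0, 1, 2, 3, 4}, 10)

# Toy example: the two families never share a genome.
p_disjoint = cooccurrence_p_value({0, 1, 2, 3, 4}, {5, 6, 7, 8, 9}, 10)
```

A small value such as `p_same` suggests association under this naive model, but closely related genomes would inflate it, which is why a tree-aware method is needed in practice.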

RevDate: 2020-02-22

Khan AMAM, Hauk VJ, Ibrahim M, et al (2020)

Caldicellulosiruptor bescii adheres to polysaccharides using a type IV pilin-dependent mechanism.

Applied and environmental microbiology pii:AEM.00200-20 [Epub ahead of print].

Biological hydrolysis of cellulose above 70°C involves microorganisms that secrete free enzymes, and deploy separate protein systems to adhere to their substrate. Strongly cellulolytic Caldicellulosiruptor bescii is one such extreme thermophile, which deploys modular, multi-functional carbohydrate acting enzymes to deconstruct plant biomass. Additionally, C. bescii also encodes for non-catalytic carbohydrate binding proteins, which likely evolved as a mechanism to compete against other heterotrophs in carbon limited biotopes that these bacteria inhabit. Analysis of the Caldicellulosiruptor pangenome identified a type IV pilus (T4P) locus encoded upstream of the tāpirins, that is encoded for by all Caldicellulosiruptor species. In this study, we sought to determine if the C. bescii T4P plays a role in attachment to plant polysaccharides. The major C. bescii pilin (CbPilA) was identified by the presence of pilin-like protein domains, paired with transcriptomics and proteomics data. Using immuno-dot blots, we determined that the plant polysaccharide, xylan, induced production of CbPilA 10 to 14-fold higher than glucomannan or xylose. Furthermore, we are able to demonstrate that recombinant CbPilA directly interacts with xylan, and cellulose at elevated temperatures. Localization of CbPilA at the cell surface was confirmed by immunofluorescence microscopy. Lastly, a direct role for CbPilA in cell adhesion was demonstrated using recombinant CbPilA or anti-CbPilA antibodies to reduce C. bescii cell adhesion to xylan and crystalline cellulose up to 4.5 and 2-fold, respectively. Based on these observations, we propose that CbPilA and by extension, the T4P, play a role in Caldicellulosiruptor cell attachment to plant biomass.

IMPORTANCE: Most microorganisms are capable of attaching to surfaces in order to persist in their environment. Type IV (T4) pili produced by select mesophilic Firmicutes promote adherence, however a role for T4 pili encoded by thermophilic members of this phylum has yet to be demonstrated. Prior comparative genomics analyses identified a T4 pilus locus encoded by an extremely thermophilic genus within the Firmicutes. Here, we demonstrate that attachment to plant biomass-related carbohydrates by strongly cellulolytic Caldicellulosiruptor bescii is mediated by T4 pilins. Surprisingly, xylan but not cellulose induced expression of the major T4 pilin. Regardless, the C. bescii T4 pilin interacts with both polysaccharides at high temperatures, and is located to the cell surface where it is directly involved in C. bescii attachment. Adherence to polysaccharides is likely key to survival in environments where carbon sources are limiting, allowing C. bescii to compete against other plant degrading microorganisms.

RevDate: 2020-02-20

Romano I, Ventorino V, O Pepe (2020)

Effectiveness of Plant Beneficial Microbes: Overview of the Methodological Approaches for the Assessment of Root Colonization and Persistence.

Frontiers in plant science, 11:6.

Issues concerning the use of harmful chemical fertilizers and pesticides that have large negative impacts on environmental and human health have generated increasing interest in the use of beneficial microorganisms for the development of sustainable agri-food systems. A successful microbial inoculant has to colonize the root system, establish a positive interaction and persist in the environment in competition with native microorganisms living in the soil through rhizocompetence traits. Currently, several approaches based on culture-dependent, microscopic and molecular methods have been developed to follow bioinoculants in the soil and plant surface over time. Although culture-dependent methods are commonly used to estimate the persistence of bioinoculants, it is difficult to differentiate inoculated organisms from native populations based on morphological characteristics. Therefore, these methods should be used complementary to culture-independent approaches. Microscopy-based techniques (bright-field, electron and fluorescence microscopy) allow to obtain a picture of microbial colonization outside and inside plant tissues also at high resolution, but it is not possible to always distinguish living cells from dead cells by direct observation as well as distinguish bioinoculants from indigenous microbial populations living in soils. In addition, the development of metagenomic techniques, including the use of DNA probes, PCR-based methods, next-generation sequencing, whole-genome sequencing and pangenome methods, provides a complementary approach useful to understand plant-soil-microbe interactions. However, to ensure good results in microbiological analysis, the first fundamental prerequisite is correct soil sampling and sample preparation for the different methodological approaches that will be assayed. Here, we provide an overview of the advantages and limitations of the currently used methods and new methodological approaches that could be developed to assess the presence, plant colonization and soil persistence of bioinoculants in the rhizosphere. We further discuss the possibility of integrating multidisciplinary approaches to examine the variations in microbial communities after inoculation and to track the inoculated microbial strains.

RevDate: 2020-02-19

Yu YY, CC Wei (2020)

[HUPAN promotes striding across of biomedical research from human genome to human pan-genome].

Zhonghua bing li xue za zhi = Chinese journal of pathology, 49(2):105-107.

RevDate: 2020-02-18

Iversen KH, Rasmussen LH, Al-Nakeeb K, et al (2020)

Similar genomic patterns of clinical infective endocarditis and oral isolates of Streptococcus sanguinis and Streptococcus gordonii.

Scientific reports, 10(1):2728 pii:10.1038/s41598-020-59549-4.

Streptococcus gordonii and Streptococcus sanguinis belong to the Mitis group streptococci, which mostly are commensals in the human oral cavity. Though they are oral commensals, they can escape their niche and cause infective endocarditis, a severe infection with high mortality. Several virulence factors important for the development of infective endocarditis have been described in these two species. However, the background for how the commensal bacteria, in some cases, become pathogenic is still not known. To gain a greater understanding of the mechanisms of the pathogenic potential, we performed a comparative analysis of 38 blood culture strains, S. sanguinis (n = 20) and S. gordonii (n = 18) from patients with verified infective endocarditis, along with 21 publicly available oral isolates from healthy individuals, S. sanguinis (n = 12) and S. gordonii (n = 9). Using whole genome sequencing data of the 59 streptococci genomes, functional profiles were constructed, using protein domain predictions based on the translated genes. These functional profiles were used for clustering, phylogenetics and machine learning. A clear separation could be made between the two species. No clear differences between oral isolates and clinical infective endocarditis isolates were found in any of the 675 translated core-genes. Additionally, random forest-based machine learning and clustering of the pan-genome data as well as amino acid variations in the core-genome could not separate the clinical and oral isolates. A total of 151 different virulence genes was identified in the 59 genomes. Among these homologs of genes important for adhesion and evasion of the immune system were found in all of the strains. Based on the functional profiles and virulence gene content of the genomes, we believe that all analysed strains had the ability to become pathogenic.

RevDate: 2020-02-17

Wu H, Wang D, F Gao (2020)

Toward a high-quality pan-genome landscape of Bacillus subtilis by removal of confounding strains.

Briefings in bioinformatics pii:5739184 [Epub ahead of print].

Pan-genome analysis is widely used to study the evolution and genetic diversity of species, particularly in bacteria. However, the impact of strain selection on the outcome of pan-genome analysis is poorly understood. Furthermore, a standard protocol to ensure high-quality pan-genome results is lacking. In this study, we carried out a series of pan-genome analyses of different strain sets of Bacillus subtilis to understand the impact of various strains on the performance and output quality of pan-genome analyses. Consequently, we found that the results obtained by pan-genome analyses of B. subtilis can be influenced by the inclusion of incorrectly classified Bacillus subspecies strains, phylogenetically distinct strains, engineered genome-reduced strains, chimeric strains, strains with a large number of unique genes or a large proportion of pseudogenes, and multiple clonal strains. Since the presence of these confounding strains can seriously affect the quality and true landscape of the pan-genome, we should remove these deviations in the process of pan-genome analyses. Our study provides new insights into the removal of biases from confounding strains in pan-genome analyses at the beginning of data processing, which enables the achievement of a closer representation of a high-quality pan-genome landscape of B. subtilis that better reflects the performance and credibility of the B. subtilis pan-genome. This procedure could be added as an important quality control step in pan-genome analyses for improving the efficiency of analyses, and ultimately contributing to a better understanding of genome function, evolution and genome-reduction strategies for B. subtilis in the future.
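One of the confounders named above, strains carrying an unusually large number of unique genes, can be screened for with a simple count. The following is a minimal sketch under stated assumptions and is not the procedure used in the paper: the input format (strain name mapped to a set of gene-cluster IDs), the function names, and the cutoff of three times the median are all hypothetical choices made for illustration.

```python
from statistics import median
from typing import Dict, List, Set


def unique_gene_counts(presence: Dict[str, Set[str]]) -> Dict[str, int]:
    """Count, per strain, the gene clusters found in that strain only."""
    occurrences: Dict[str, int] = {}
    for genes in presence.values():
        for gene in genes:
            occurrences[gene] = occurrences.get(gene, 0) + 1
    return {
        strain: sum(1 for gene in genes if occurrences[gene] == 1)
        for strain, genes in presence.items()
    }


def flag_outlier_strains(presence: Dict[str, Set[str]],
                         factor: float = 3.0) -> List[str]:
    """Flag strains whose unique-gene count exceeds factor * median.

    The factor is an arbitrary illustrative cutoff, not a published value.
    """
    counts = unique_gene_counts(presence)
    cutoff = factor * median(counts.values())
    return sorted(s for s, n in counts.items() if n > cutoff)


# Toy data: three ordinary strains and one with many strain-specific genes.
toy = {
    "A": {"core1", "core2", "x1"},
    "B": {"core1", "core2", "x2"},
    "C": {"core1", "core2", "x3"},
    "D": {"core1", "core2"} | {f"y{i}" for i in range(8)},
}
flagged = flag_outlier_strains(toy)
```

Flagged strains would still need manual review (for example, checking taxonomy or assembly quality) before being excluded from an analysis.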

RevDate: 2020-02-14

Laflamme B, Dillon MM, Martel A, et al (2020)

The pan-genome effector-triggered immunity landscape of a host-pathogen interaction.

Science (New York, N.Y.), 367(6479):763-768.

Effector-triggered immunity (ETI), induced by host immune receptors in response to microbial effectors, protects plants against virulent pathogens. However, a systematic study of ETI prevalence against species-wide pathogen diversity is lacking. We constructed the Pseudomonas syringae Type III Effector Compendium (PsyTEC) to reduce the pan-genome complexity of 5127 unique effector proteins, distributed among 70 families from 494 strains, to 529 representative alleles. We screened PsyTEC on the model plant Arabidopsis thaliana and identified 59 ETI-eliciting alleles (11.2%) from 19 families (27.1%), with orthologs distributed among 96.8% of P. syringae strains. We also identified two previously undescribed host immune receptors, including CAR1, which recognizes the conserved effectors AvrE and HopAA1, and found that 94.7% of strains harbor alleles predicted to be recognized by either CAR1 or ZAR1.

RevDate: 2020-02-14

Liao F, Mo Z, Gu W, et al (2020)

A comparative genomic analysis between methicillin-resistant Staphylococcus aureus strains of hospital acquired and community infections in Yunnan province of China.

BMC infectious diseases, 20(1):137 pii:10.1186/s12879-020-4866-6.

BACKGROUND: Currently, Staphylococcus aureus is one of the most important pathogens worldwide, especially for methicillin-resistant S. aureus (MRSA) infection. However, few reports referred to patients' MRSA infections in Yunnan province, southwest China.

METHODS: In this study, we selected representative MRSA strains from patients' systemic surveillance in Yunnan province of China, performed the genomic sequencing and compared their features, together with some food derived strains.

RESULTS: Among sixty selective isolates, forty strains were isolated from patients, and twenty isolated from food. Among the patients' strains, sixteen were recognized as community-acquired (CA), compared with 24 for hospital-acquired (HA). ST6-t701, ST59-t437 and ST239-t030 were the three major genotype profiles. ST6-t701 was predominated in food strains, while ST59-t437 and ST239-t030 were the primary clones in patients. The clinical features between CA and HA-MRSA of patients were statistical different. Compared the antibiotic resistant results between patients and food indicated that higher antibiotic resistant rates were found in patients' strains. Totally, the average genome sizes of 60 isolates were 2.79 ± 0.05 Mbp, with GC content 33% and 84.50 ± 0.20% of coding rate. The core genomes of these isolates were 1593 genes. Phylogenetic analysis based on pan-genome and SNP of strains showed that five clustering groups were generated. Clustering ST239-t030 contained all the HA-MRSA cases in this study; clustering ST6-t701 referred to food and CA-MRSA infections in community; clustering ST59-t437 showed the heterogeneity for provoking different clinical diseases in both community and hospital. Phylogenetic tree, incorporating 24 isolates from different regions, indicated ST239-t030 strains in this study were more closely related to T0131 isolate from Tianjin, China, belonged to 'Turkish clade' from Eastern Europe; two groups of ST59-t437 clones of MRSA in Yunnan province were generated, belonged to the 'Asian-Pacific' clone (AP) and 'Taiwan' clone (TW) respectively.

CONCLUSIONS: ST239-t030, ST59-t437 and ST6-t701 were the three major MRSA clones in Yunnan province of China. ST239-t030 clonal Yunnan isolates demonstrated the local endemic of clone establishment for a number of years, whereas ST59-t437 strains revealed the multi-origins of this clone. In general, genomic study on epidemic clones of MRSA in southwest China provided the features and evolution of this pathogen.

RevDate: 2020-02-13

Dos Santos Silva LK, Rodrigues RAL, Dos Santos Pereira Andrade AC, et al (2020)

Isolation and genomic characterization of a new mimivirus of lineage B from a Brazilian river.

Archives of virology pii:10.1007/s00705-020-04542-5 [Epub ahead of print].

Since its discovery, the first identified giant virus associated with amoebae, Acanthamoeba polyphaga mimivirus (APMV), has been rigorously studied to understand the structural and genomic complexity of this virus. In this work, we report the isolation and genomic characterization of a new mimivirus of lineage B, named "Borely moumouvirus". This new virus exhibits a structure and replicative cycle similar to those of other members of the family Mimiviridae. The genome of the new isolate is a linear double-strand DNA molecule of ~1.0 Mb, containing over 900 open reading frames. Genome annotation highlighted different translation system components encoded in the DNA of Borely moumouvirus, including aminoacyl-tRNA synthetases, translation factors, and tRNA molecules, in a distribution similar to that in other lineage B mimiviruses. Pan-genome analysis indicated an increase in the genetic arsenal of this group of viruses, showing that the family Mimiviridae is still expanding. Furthermore, phylogenetic analysis has shown that Borely moumouvirus is closely related to moumouvirus australiensis. This is the first mimivirus lineage B isolated from Brazilian territory to be characterized. Further prospecting studies are necessary for us to better understand the diversity of these viruses so a better classification system can be established.

RevDate: 2020-02-13

Hickey G, Heller D, Monlong J, et al (2020)

Genotyping structural variants in pangenome graphs using the vg toolkit.

Genome biology, 21(1):35 pii:10.1186/s13059-020-1941-7.

Structural variants (SVs) remain challenging to represent and study relative to point mutations despite their demonstrated importance. We show that variation graphs, as implemented in the vg toolkit, provide an effective means for leveraging SV catalogs for short-read SV genotyping experiments. We benchmark vg against state-of-the-art SV genotypers using three sequence-resolved SV catalogs generated by recent long-read sequencing studies. In addition, we use assemblies from 12 yeast strains to show that graphs constructed directly from aligned de novo assemblies improve genotyping compared to graphs built from intermediate SV catalogs in the VCF format.

RevDate: 2020-02-12

Maistrenko OM, Mende DR, Luetge M, et al (2020)

Disentangling the impact of environmental and phylogenetic constraints on prokaryotic within-species diversity.

The ISME journal pii:10.1038/s41396-020-0600-z [Epub ahead of print].

Microbial organisms inhabit virtually all environments and encompass a vast biological diversity. The pangenome concept aims to facilitate an understanding of diversity within defined phylogenetic groups. Hence, pangenomes are increasingly used to characterize the strain diversity of prokaryotic species. To understand the interdependence of pangenome features (such as the number of core and accessory genes) and to study the impact of environmental and phylogenetic constraints on the evolution of conspecific strains, we computed pangenomes for 155 phylogenetically diverse species (from ten phyla) using 7,000 high-quality genomes to each of which the respective habitats were assigned. Species habitat ubiquity was associated with several pangenome features. In particular, core-genome size was more important for ubiquity than accessory genome size. In general, environmental preferences had a stronger impact on pangenome evolution than phylogenetic inertia. Environmental preferences explained up to 49% of the variance for pangenome features, compared with 18% by phylogenetic inertia. This observation was robust when the dataset was extended to 10,100 species (59 phyla). The importance of environmental preferences was further accentuated by convergent evolution of pangenome features in a given habitat type across different phylogenetic clades. For example, the soil environment promotes expansion of pangenome size, while host-associated habitats lead to its reduction. Taken together, we explored the global principles of pangenome evolution, quantified the influence of habitat, and phylogenetic inertia on the evolution of pangenomes and identified criteria governing species ubiquity and habitat specificity.

RevDate: 2020-02-12

Badet T, Oggenfuss U, Abraham L, et al (2020)

A 19-isolate reference-quality global pangenome for the fungal wheat pathogen Zymoseptoria tritici.

BMC biology, 18(1):12 pii:10.1186/s12915-020-0744-3.

BACKGROUND: The gene content of a species largely governs its ecological interactions and adaptive potential. A species is therefore defined by both core genes shared between all individuals and accessory genes segregating presence-absence variation. There is growing evidence that eukaryotes, similar to bacteria, show intra-specific variability in gene content. However, it remains largely unknown how functionally relevant such a pangenome structure is for eukaryotes and what mechanisms underlie the emergence of highly polymorphic genome structures.

RESULTS: Here, we establish a reference-quality pangenome of a fungal pathogen of wheat based on 19 complete genomes from isolates sampled across six continents. Zymoseptoria tritici causes substantial worldwide losses to wheat production due to rapidly evolved tolerance to fungicides and evasion of host resistance. We performed transcriptome-assisted annotations of each genome to construct a global pangenome. Major chromosomal rearrangements are segregating within the species and underlie extensive gene presence-absence variation. Conserved orthogroups account for only ~ 60% of the species pangenome. Investigating gene functions, we find that the accessory genome is enriched for pathogenesis-related functions and encodes genes involved in metabolite production, host tissue degradation and manipulation of the immune system. De novo transposon annotation of the 19 complete genomes shows that the highly diverse chromosomal structure is tightly associated with transposable element content. Furthermore, transposable element expansions likely underlie recent genome expansions within the species.

CONCLUSIONS: Taken together, our work establishes a highly complex eukaryotic pangenome providing an unprecedented toolbox to study how pangenome structure impacts crop-pathogen interactions.

RevDate: 2020-02-12

Zwickl NF, Stralis-Pavese N, Schäffer C, et al (2020)

Comparative genome characterization of the periodontal pathogen Tannerella forsythia.

BMC genomics, 21(1):150 pii:10.1186/s12864-020-6535-y.

BACKGROUND: Tannerella forsythia is a bacterial pathogen implicated in periodontal disease. Numerous virulence-associated T. forsythia genes have been described, however, it is necessary to expand the knowledge on T. forsythia's genome structure and genetic repertoire to further elucidate its role within pathogenesis. Tannerella sp. BU063, a putative periodontal health-associated sister taxon and closest known relative to T. forsythia is available for comparative analyses. In the past, strain confusion involving the T. forsythia reference type strain ATCC 43037 led to discrepancies between results obtained from in silico analyses and wet-lab experimentation.

RESULTS: We generated a substantially improved genome assembly of T. forsythia ATCC 43037 covering 99% of the genome in three sequences. Using annotated genomes of ten Tannerella strains we established a soft core genome encompassing 2108 genes, based on orthologs present in ≥ 80% of the strains analysed. We used a set of known and hypothetical virulence factors for comparisons in pathogenic strains and the putative periodontal health-associated isolate Tannerella sp. BU063 to identify candidate genes promoting T. forsythia's pathogenesis. Searching for pathogenicity islands we detected 38 candidate regions in the T. forsythia genome. Only four of these regions corresponded to previously described pathogenicity islands. While the general protein O-glycosylation gene cluster of T. forsythia ATCC 43037 has been described previously, genes required for the initiation of glycan synthesis are yet to be discovered. We found six putative glycosylation loci which were only partially conserved in other bacteria. Lastly, we performed a comparative analysis of translational bias in T. forsythia and Tannerella sp. BU063 and detected highly biased genes.

CONCLUSIONS: We provide resources and important information on the genomes of Tannerella strains. Comparative analyses enabled us to assess the suitability of T. forsythia virulence factors as therapeutic targets and to suggest novel putative virulence factors. Further, we report on gene loci that should be addressed in the context of elucidating T. forsythia's protein O-glycosylation pathway. In summary, our work paves the way for further molecular dissection of T. forsythia biology in general and virulence of this species in particular.
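The distinction between a strict core genome (orthologs present in every strain) and a soft core genome (orthologs present in at least 80% of strains, as in the abstract above) can be sketched as follows. This is an illustrative toy, not the authors' pipeline: the input format (strain name mapped to a set of ortholog-cluster IDs) and the name `classify_genes` are assumptions.

```python
from typing import Dict, Set, Tuple


def classify_genes(presence: Dict[str, Set[str]],
                   soft_core_fraction: float = 0.8
                   ) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (pan, core, soft_core) sets of ortholog-cluster IDs.

    pan:       clusters seen in at least one strain
    core:      clusters seen in every strain
    soft_core: clusters seen in >= soft_core_fraction of strains
    """
    n_strains = len(presence)
    counts: Dict[str, int] = {}
    for genes in presence.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    pan = set(counts)
    core = {g for g, c in counts.items() if c == n_strains}
    soft_core = {g for g, c in counts.items()
                 if c / n_strains >= soft_core_fraction}
    return pan, core, soft_core


# Toy data with five strains: "a" is in 5/5, "b" in 4/5, "c" in 3/5, "d" in 1/5.
toy = {
    "s1": {"a", "b", "c", "d"},
    "s2": {"a", "b", "c"},
    "s3": {"a", "b", "c"},
    "s4": {"a", "b"},
    "s5": {"a"},
}
pan, core, soft_core = classify_genes(toy)
```

Relaxing the threshold from 100% to 80% makes the estimate more tolerant of incomplete draft assemblies, at the cost of admitting genes that are truly absent from some strains.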

RevDate: 2020-02-10

Sherman RM, SL Salzberg (2020)

Pan-genomics in the human genome era.

Nature reviews. Genetics pii:10.1038/s41576-020-0210-7 [Epub ahead of print].

Since the early days of the genome era, the scientific community has relied on a single 'reference' genome for each species, which is used as the basis for a wide range of genetic analyses, including studies of variation within and across species. As sequencing costs have dropped, thousands of new genomes have been sequenced, and scientists have come to realize that a single reference genome is inadequate for many purposes. By sampling a diverse set of individuals, one can begin to assemble a pan-genome: a collection of all the DNA sequences that occur in a species. Here we review efforts to create pan-genomes for a range of species, from bacteria to humans, and we further consider the computational methods that have been proposed in order to capture, interpret and compare pan-genome data. As scientists continue to survey and catalogue the genomic variation across human populations and begin to assemble a human pan-genome, these efforts will increase our power to connect variation to human diversity, disease and beyond.

RevDate: 2020-02-05

Zhao J, Bayer PE, Ruperao P, et al (2020)

Trait associations in the pangenome of pigeon pea (Cajanus cajan).

Plant biotechnology journal [Epub ahead of print].

Pigeon pea (Cajanus cajan) is an important orphan crop mainly grown by smallholder farmers in India and Africa. Here we present the first pigeon pea pangenome based on 89 accessions mainly from India and the Philippines, showing that there is significant genetic diversity in Philippine individuals that is not present in Indian individuals. Annotation of variable genes suggests that they are associated with self-fertilisation and response to disease. We identified 225 SNPs associated with nine agronomically important traits over three locations and 2 different time-points, with SNPs associated with genes for transcription factors and kinases. These results will lead the way to an improved pigeon pea breeding program.

RevDate: 2020-02-06

Zhou X, Yang B, Stanton C, et al (2020)

Comparative analysis of Lactobacillus gasseri from Chinese subjects reveals a new species-level taxa.

BMC genomics, 21(1):119.

BACKGROUND: Lactobacillus gasseri, a probiotic with a history of safe consumption, is prevalent in the gut microbiota of infants and adults, where it helps maintain gut homeostasis.

RESULTS: In this study, to explore the genomic diversity and mine potential probiotic characteristics of L. gasseri, 92 strains of L. gasseri were isolated from Chinese human feces and identified based on 16S rDNA sequencing. After draft genome sequencing, average nucleotide identity (ANI) values and phylogenetic analysis reclassified them as L. paragasseri (n = 79) and L. gasseri (n = 13). Their pan/core-genomes were determined, revealing that L. paragasseri had an open pan-genome. Comparative analysis was carried out to identify genetic features, and the results indicated that 39 strains of L. paragasseri harboured Type II-A CRISPR-Cas system while 12 strains of L. gasseri contained Type I-E and II-A CRISPR-Cas systems. Bacteriocin operons and the number of carbohydrate-active enzymes were significantly different between the two species.

CONCLUSIONS: This is the first study of the pan/core-genomes of L. gasseri and L. paragasseri to compare their genetic diversity, and the results provide a better understanding of the genetics of the two species.

RevDate: 2020-02-09

Isidro J, Ferreira S, Pinto M, et al (2020)

Virulence and antibiotic resistance plasticity of Arcobacter butzleri: Insights on the genomic diversity of an emerging human pathogen.

Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases, 80:104213 pii:S1567-1348(20)30045-9 [Epub ahead of print].

Arcobacter butzleri is a foodborne emerging human pathogen, frequently displaying a multidrug resistant character. Still, the lack of comprehensive genome-scale comparative analysis has limited our knowledge on A. butzleri diversification and pathogenicity. Here, we performed a deep genome analysis of A. butzleri focused on decoding its core- and pan-genome diversity and specific genetic traits underlying its pathogenic potential and diverse ecology. A. butzleri (genome size 2.07-2.58 Mbp) revealed a large open pan-genome with 7474 genes (about 50% being singletons) and a small but diverse core-genome with 1165 genes. It presents a plastic virulome (including newly identified determinants), marked by the differential presence of multiple adaptation-related virulence factors, such as the urease cluster ureD(AB)CEFG (phenotypically confirmed), the hypervariable hemagglutinin-encoding hecA, a type I secretion system (T1SS) harboring another agglutinin and a novel VirB/D4 T4SS likely linked to interbacterial competition and cytotoxicity. In addition, A. butzleri harbors a large repertoire of efflux pumps (EPs) and other antibiotic resistant determinants. We unprecedentedly describe a genetic mechanism of A. butzleri macrolides resistance, (inactivation of a TetR repressor likely regulating an EP). Fluoroquinolones resistance correlated with Thr-85-Ile in GyrA and ampicillin resistance was linked to an OXA-15-like β-lactamase. Remarkably, by decoding the polymorphism pattern of the main antigen PorA, we show that A. butzleri is able to exchange porA as a whole and/or hypervariable epitope-encoding regions separately, leading to a multitude of chimeric PorA presentations that can impact pathogen-host interaction during infection. Ultimately, our unprecedented screening of short sequence repeats indicates that phase variation likely modulates A. butzleri key adaptive functions. In summary, this study constitutes a turning point on A. butzleri comparative genomics revealing that this human gastrointestinal pathogen is equipped with vast and diverse virulence and antibiotic resistance arsenals that open a multitude of phenotypic fingerprints for environmental/host adaptation and pathogenicity.

RevDate: 2020-02-07

Danilevicz MF, Tay Fernandez CG, Marsh JI, et al (2020)

Plant pangenomics: approaches, applications and advancements.

Current opinion in plant biology, 54:18-25 pii:S1369-5266(19)30120-7 [Epub ahead of print].

With the assembly of increasing numbers of plant genomes, it is becoming accepted that a single reference assembly does not reflect the gene diversity of a species. The production of pangenomes, which reflect the structural variation and polymorphisms in genomes, enables in depth comparisons of variation within species or higher taxonomic groups. In this review, we discuss the current and emerging approaches for pangenome assembly, analysis and visualisation. In addition, we consider the potential of pangenomes for applied crop improvement, evolutionary and biodiversity studies. To fully exploit the value of pangenomes it is important to integrate broad information such as phenotypic, environmental, and expression data to gain insights into the role of variable regions within genomes.

RevDate: 2020-01-31

Talwar C, Nagar S, Kumar R, et al (2020)

Defining the Environmental Adaptations of Genus Devosia: Insights into its Expansive Short Peptide Transport System and Positively Selected Genes.

Scientific reports, 10(1):1151.

Devosia are well known for their dominance in soil habitats contaminated with various toxins and are best characterized for their bioremediation potential. In this study, we compared the genomes of 27 strains of Devosia with aim to understand their metabolic abilities. The analysis revealed their adaptive gene repertoire which was bared from 52% unique pan-gene content. A striking feature of all genomes was the abundance of oligo- and di-peptide permeases (oppABCDF and dppABCDF) with each genome harboring an average of 60.7 ± 19.1 and 36.5 ± 10.6 operon associated genes respectively. Apart from their primary role in nutrition, these permeases may help Devosia to sense environmental signals and in chemotaxis at stressed habitats. Through sequence similarity network analyses, we identified 29 Opp and 19 Dpp sequences that shared very little homology with any other sequence suggesting an expansive short peptidic transport system within Devosia. The substrate determining components of these permeases viz. OppA and DppA further displayed a large diversity that separated into 12 and 9 homologous clusters respectively in addition to large number of isolated nodes. We also dissected the genome scale positive evolution and found genes associated with growth (exopolyphosphatase, HesB_IscA_SufA family protein), detoxification (moeB, nifU-like domain protein, alpha/beta hydrolase), chemotaxis (cheB, luxR) and stress response (phoQ, uspA, luxR, sufE) were positively selected. The study highlights the genomic plasticity of the Devosia spp. for conferring adaptation, bioremediation and the potential to utilize a wide range of substrates. The widespread toxin-antitoxin loci and 'open' state of the pangenome provided evidence of plastic genomes and a much larger genetic repertoire of the genus which is yet uncovered.

RevDate: 2020-01-30

Sanderson H, Ortega-Polo R, Zaheer R, et al (2020)

Comparative genomics of multidrug-resistant Enterococcus spp. isolated from wastewater treatment plants.

BMC microbiology, 20(1):20.

BACKGROUND: Wastewater treatment plants (WWTPs) are considered hotspots for the environmental dissemination of antimicrobial resistance (AMR) determinants. Vancomycin-Resistant Enterococcus (VRE) are candidates for gauging the degree of AMR bacteria in wastewater. Enterococcus faecalis and Enterococcus faecium are recognized indicators of fecal contamination in water. Comparative genomics of enterococci isolated from conventional activated sludge (CAS) and biological aerated filter (BAF) WWTPs was conducted.

RESULTS: VRE isolates, including E. faecalis (n = 24), E. faecium (n = 11), E. casseliflavus (n = 2) and E. gallinarum (n = 2) were selected for sequencing based on WWTP source, species and AMR phenotype. The pangenomes of E. faecium and E. faecalis were both open. The genomic fraction related to the mobilome was positively correlated with genome size in E. faecium (p < 0.001) and E. faecalis (p < 0.001) and with the number of AMR genes in E. faecium (p = 0.005). Genes conferring vancomycin resistance, including vanA and vanM (E. faecium), vanG (E. faecalis), and vanC (E. casseliflavus/E. gallinarum), were detected in 20 genomes. The most prominent functional AMR genes were efflux pumps and transporters. A minimum of 16, 6, 5 and 3 virulence genes were detected in E. faecium, E. faecalis, E. casseliflavus and E. gallinarum, respectively. Virulence genes were more common in E. faecalis and E. faecium, than E. casseliflavus and E. gallinarum. A number of mobile genetic elements were shared among species. Functional CRISPR/Cas arrays were detected in 13 E. faecalis genomes, with all but one also containing a prophage. The lack of a functional CRISPR/Cas arrays was associated with multi-drug resistance in E. faecium. Phylogenetic analysis demonstrated differential clustering of isolates based on original source but not WWTP. Genes related to phage and CRISPR/Cas arrays could potentially serve as environmental biomarkers.

CONCLUSIONS: There was no discernible difference between enterococcal genomes from the CAS and BAF WWTPs. E. faecalis and E. faecium have smaller genomes and harbor more virulence, AMR, and mobile genetic elements than other Enterococcus spp.

RevDate: 2020-02-08

Yun BR, Malik A, SB Kim (2020)

Genome based characterization of Kitasatospora sp. MMS16-BH015, a multiple heavy metal resistant soil actinobacterium with high antimicrobial potential.

Gene, 733:144379 pii:S0378-1119(20)30048-2 [Epub ahead of print].

An actinobacterial strain designated Kitasatospora sp. MMS16-BH015, exhibiting high level of heavy metal resistance, was isolated from soil of an abandoned metal mining site, and its potential for metal resistance and secondary metabolite production was studied. The strain was resistant to multiple heavy metals including zinc (up to 100 mM), nickel (up to 2 mM) and copper (up to 0.8 mM), and also showed antimicrobial potential against a broad group of microorganisms, in particular filamentous fungi. The genome of strain MMS16-BH015 was 8.96 Mbp in size with a G + C content of 72.7%, and contained 7270 protein-coding genes and 107 tRNA/rRNA genes. The genome analysis revealed presence of at least 121 metal resistance related genes, which was prominently higher in strain MMS16-BH015 compared to other genomes of Kitasatospora. The genes included those for proteins representing various families involved in the transport of heavy metals, for example dipeptide transport ATP-binding proteins, high-affinity nickel transport proteins, and P-type heavy metal-transporting ATPases. Additionally, 43 biosynthetic gene clusters (BGCs) for secondary metabolites, enriched with those for non-ribosomal peptides, were detected in this multiple heavy metal resistant actinobacterium, which was again the highest among the compared genomes of Kitasatospora. The pan-genome analysis also identified higher numbers of unique genes related to secondary metabolite production and metal resistance mechanism in strain MMS16-BH015. A high level of correlation between the biosynthetic potential and heavy metal resistance could be observed, thus indicating that heavy metal resistant actinobacteria can be a promising source of bioactive compounds.

RevDate: 2020-02-06

Wang L, Luo Y, Zhao Y, et al (2020)

Comparative genomic analysis reveals an 'open' pan-genome of African swine fever virus.

Transboundary and emerging diseases [Epub ahead of print].

The worldwide transmission of African swine fever virus (ASFV) drastically affects the pig industry and global trade. Development of vaccines is hindered by the lack of knowledge of the genomic characteristics of ASFV. In this study, we developed a pipeline for the de novo assembly of ASFV genome without virus isolation and purification. We then used a comparative genomics approach to systematically study 46 genomes of ASFVs to reveal the genomic characteristics. The analysis revealed that ASFV has an 'open' pan-genome based on both protein-coding genes and intergenic regions. Of the 151-174 genes found in the ASFV strains, only 86 were identified as core genes; the remainder were flexible accessory genes. Notably, 44 of the 86 core genes and 155 of the 324 accessory genes have been functionally annotated according to the known proteins. Interestingly, a dynamic number of taxis-related genes were identified in the accessory genes, and two potential virulence genes were identified in all ASFV isolates. The 'open' pan-genome of ASFV based on gene and intergenic regions reveals its pronounced natural diversity concerning genomic composition and regulation.
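Several abstracts in this list describe a pan-genome as 'open' without saying how that was decided. One commonly used criterion fits a power law, new_genes(N) = kappa * N^(-alpha), to the number of new genes found as the N-th genome is added, and treats alpha <= 1 as open and alpha > 1 as closed. The sketch below illustrates that criterion only; it is not the method of any paper cited here, the function names are hypothetical, and the input values are synthetic.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(n_genomes: Sequence[int], new_genes: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of new_genes = kappa * N**(-alpha) in log-log space.

    Returns (kappa, alpha). All inputs must be positive.
    """
    xs = [math.log(n) for n in n_genomes]
    ys = [math.log(g) for g in new_genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log(new_genes) = log(kappa) - alpha * log(N)
    return math.exp(intercept), -slope


def is_open(alpha: float) -> bool:
    """Assumed convention: alpha <= 1 means the pan-genome keeps growing (open)."""
    return alpha <= 1.0


# Synthetic data generated from kappa = 100, alpha = 0.5 (not real measurements).
genome_counts = list(range(2, 11))
new_gene_counts = [100.0 * n ** -0.5 for n in genome_counts]

kappa, alpha = fit_power_law(genome_counts, new_gene_counts)
```

In practice the new-gene counts would be averaged over many random orderings of the genomes before fitting, since a single ordering is noisy.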

RevDate: 2020-01-23

Alexandraki V, Kazou M, Blom J, et al (2019)

Comparative Genomics of Streptococcus thermophilus Support Important Traits Concerning the Evolution, Biology and Technological Properties of the Species.

Frontiers in microbiology, 10:2916.

Streptococcus thermophilus is a major starter for the dairy industry with great economic importance. In this study we analyzed 23 fully sequenced genomes of S. thermophilus to highlight novel aspects of the evolution, biology and technological properties of this species. Pan/core genome analysis revealed that the species has an important number of conserved genes and that the pan genome is probably going to be closed soon. According to whole genome phylogeny and average nucleotide identity (ANI) analysis, most S. thermophilus strains were grouped in two major clusters (i.e., clusters A and B). More specifically, cluster A includes strains with chromosomes above 1.83 Mbp, while cluster B includes chromosomes below this threshold. This observation suggests that strains belonging to the two clusters may be differentiated by gene gain or gene loss events. Furthermore, certain strains of cluster A could be further subdivided in subgroups, i.e., subgroup I (ASCC 1275, DGCC 7710, KLDS SM, MN-BM-A02, and ND07), II (MN-BM-A01 and MN-ZLW-002), III (LMD-9 and SMQ-301), and IV (APC151 and ND03). In cluster B certain strains formed one distinct subgroup, i.e., subgroup I (CNRZ1066, CS8, EPS, and S9). Clusters and subgroups observed for S. thermophilus indicate the existence of lineages within the species, an observation which was further supported to a variable degree by the distribution and/or the architecture of several genomic traits. These would include exopolysaccharide (EPS) gene clusters, Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs)-CRISPR associated (Cas) systems, as well as restriction-modification (R-M) systems and genomic islands (GIs). Of note, the histidine biosynthetic cluster was found present in all cluster A strains (plus strain NCTC12958T) but was absent from all strains in cluster B. Other loci related to lactose/galactose catabolism and urea metabolism, aminopeptidases, the majority of amino acid and peptide transporters, as well as amino acid biosynthetic pathways were found to be conserved in all strains suggesting their central role for the species. Our study highlights the necessity of sequencing and analyzing more S. thermophilus complete genomes to further elucidate important aspects of strain diversity within this starter culture that may be related to its application in the dairy industry.

RevDate: 2020-02-09

Lannes-Costa PS, Baraúna RA, Ramos JN, et al (2020)

Comparative genomic analysis and identification of pathogenicity islands of hypervirulent ST-17 Streptococcus agalactiae Brazilian strain.

Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases, 80:104195 pii:S1567-1348(20)30027-7 [Epub ahead of print].

Streptococcus agalactiae are important pathogenic bacteria that cause severe infections in humans, especially neonates. The mechanism by which ST-17 causes more invasive infections than other STs is not well understood. In this study, we sequenced the first genome of a S. agalactiae ST-17 strain isolated in Brazil using the Illumina HiSeq 2500 technology. S. agalactiae GBS90356 ST-17 belongs to the capsular type III and was isolated from a neonate with a fatal case of meningitis. The genome presented a size of 2.03 Mbp and a G + C content of 35.2%. S. agalactiae has 706 genes in its core genome and an open pan-genome with a size of 5,020 genes, suggesting a high genomic plasticity. GIPSy software was used to identify 10 Pathogenicity islands (PAIs) which corresponded to 15% of the genome size. IslandViewer4 corroborated the prediction of six PAIs. The pathogenicity islands showed important virulence factors genes for S. agalactiae e.g. neu, cps, dlt, fbs, cfb, lmb. SignalP detected 20 proteins with signal peptides among the 352 proteins found in PAIs, of which 60% were located in SagPAI_5. SagPAI_2 and 5 were mainly detected in ST-17 strains studied. Moreover, we identified 51 unique genes, 9 recombination regions and a large number of SNPs with an average of 760.3 polymorphisms, which may be related to high genomic plasticity and virulence during host-pathogen interactions. Our results showed implications for pathogenesis, evolution, concept of species and in silico analysis value to understand the epidemiology and genome plasticity of S. agalactiae.

RevDate: 2020-01-19

Ying J, Ye J, Xu T, et al (2019)

Comparative Genomic Analysis of Rhodococcus equi: An Insight into Genomic Diversity and Genome Evolution.

International journal of genomics, 2019:8987436.

Rhodococcus equi, a member of the Rhodococcus genus, is a gram-positive pathogenic bacterium. Rhodococcus possesses an open pan-genome that constitutes the basis of its high genomic diversity and allows for adaptation to specific niche conditions and the changing host environments. Our analysis further showed that the core genome of R. equi contributes to the pathogenicity and niche adaptation of R. equi. Comparative genomic analysis revealed that the genomes of R. equi shared identical collinearity relationship, and heterogeneity was mainly acquired by means of genomic islands and prophages. Moreover, genomic islands in R. equi were always involved in virulence, resistance, or niche adaptation and possibly working with prophages to cause the majority of genome expansion. These findings provide an insight into the genomic diversity, evolution, and structural variation of R. equi and a valuable resource for functional genomic studies.

RevDate: 2020-01-17

Mataragas M (2020)

Investigation of genomic characteristics and carbohydrates' metabolic activity of Lactococcus lactis subsp. lactis during ripening of a Swiss-type cheese.

Food microbiology, 87:103392.

Genetic diversity and metabolic properties of Lactococcus lactis subsp. lactis were explored using phylogenetic, pan-genomic and metatranscriptomic analysis. The genomes, used in the current study, were available and downloaded from the GenBank which were primarily related with microorganisms isolated from dairy products and secondarily from other foodstuffs. To study the genetic diversity of the microorganism, various bioinformatics tools were employed such as average nucleotide identity, digital DNA-DNA hybridization, phylogenetic analysis, clusters of orthologous groups analysis, KEGG orthology analysis and pan-genomic analysis. The results showed that Lc. lactis subsp. lactis strains cannot be sufficiently separated into phylogenetic lineages based on the 16S rRNA gene sequences and core genome-based phylogenetic analysis was more appropriate. Pan-genomic analysis of the strains indicated that the core, accessory and unique genome comprised of 1036, 3146 and 1296 genes, respectively. Considering the results of pan-genomic and KEGG orthology analyses, the metabolic network of Lc. lactis subsp. lactis was rebuild regarding its carbohydrates' metabolic capabilities. Based on the metatranscriptomic data during the ripening of the Swiss-type Maasdam cheese at 20 °C and 5 °C, it was shown that the microorganism performed mixed acid fermentation producing lactate, formate, acetate, ethanol and 2,3-butanediol. Mixed acid fermentation was more pronounced at higher ripening temperatures. At lower ripening temperatures, the genes involved in mixed acid fermentation were repressed while lactate production remained unaffected resembling to a homolactic fermentation. Comparative genomics and metatranscriptomic analysis are powerful tools to gain knowledge on the genomic diversity of the lactic acid bacteria used as starter cultures as well as on the metabolic activities occurring in fermented dairy products.
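The core / accessory / unique split reported in abstracts like the one above follows directly from a gene presence/absence table. A minimal sketch of that partition, assuming gene families have already been clustered and each strain is represented as a set of family identifiers (the function name, strain names and gene labels below are hypothetical toy values, not data from the cited study):

```python
from typing import Dict, Set, Tuple


def partition_pangenome(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split gene families into (core, accessory, unique).

    core      : present in every genome
    unique    : present in exactly one genome (when more than one genome is given)
    accessory : everything else, i.e. present in some but not all genomes
    """
    if not genomes:
        return set(), set(), set()

    # Count in how many genomes each gene family occurs.
    counts: Dict[str, int] = {}
    for genes in genomes.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1

    n = len(genomes)
    core = {g for g, c in counts.items() if c == n}
    unique = {g for g, c in counts.items() if c == 1 and n > 1}
    accessory = set(counts) - core - unique
    return core, accessory, unique


# Toy input: three strains, five gene families.
toy = {
    "strainA": {"g1", "g2", "g3"},
    "strainB": {"g1", "g2", "g4"},
    "strainC": {"g1", "g3", "g5"},
}
core, accessory, unique = partition_pangenome(toy)
pan_size = len(core) + len(accessory) + len(unique)
```

Real pipelines often relax the core definition (for example, presence in at least 95% or 99% of genomes) to tolerate draft assemblies; the strict all-genomes rule here is a simplifying assumption.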

RevDate: 2020-01-16

Yu J, Xiang X, Huang J, et al (2020)

Haplotyping by CRISPR-mediated DNA circularization (CRISPR-hapC) broadens allele-specific gene editing.

Nucleic acids research pii:5707197 [Epub ahead of print].

Allele-specific protospacer adjacent motif (asPAM)-positioning SNPs and CRISPRs are valuable resources for gene therapy of dominant disorders. However, one technical hurdle is to identify the haplotype comprising the disease-causing allele and the distal asPAM SNPs. Here, we describe a novel CRISPR-based method (CRISPR-hapC) for haplotyping. Based on the generation (with a pair of CRISPRs) of extrachromosomal circular DNA in cells, the CRISPR-hapC can map haplotypes from a few hundred bases to over 200 Mb. To streamline and demonstrate the applicability of the CRISPR-hapC and asPAM CRISPR for allele-specific gene editing, we reanalyzed the 1000 human pan-genome and generated a high frequency asPAM SNP and CRISPR database ( for four CRISPR systems (SaCas9, SpCas9, xCas9 and Cas12a). Using the huntingtin (HTT) CAG expansion and transthyretin (TTR) exon 2 mutation as examples, we showed that the asPAM CRISPRs can specifically discriminate active and dead PAMs for all 23 loci tested. Combination of the CRISPR-hapC and asPAM CRISPRs further demonstrated the capability for achieving highly accurate and haplotype-specific deletion of the HTT CAG expansion allele and TTR exon 2 mutation in human cells. Taken together, our study provides a new approach and an important resource for genome research and allele-specific (haplotype-specific) gene therapy.

RevDate: 2020-01-23

He Y, Zhou X, Chen Z, et al (2020)

PRAP: Pan Resistome analysis pipeline.

BMC bioinformatics, 21(1):20.

BACKGROUND: Antibiotic resistance genes (ARGs) can spread among pathogens via horizontal gene transfer, resulting in imparities in their distribution even within the same species. Therefore, a pan-genome approach to analyzing resistomes is necessary for thoroughly characterizing patterns of ARGs distribution within particular pathogen populations. Software tools are readily available for either ARGs identification or pan-genome analysis, but few exist to combine the two functions.

RESULTS: We developed Pan Resistome Analysis Pipeline (PRAP) for the rapid identification of antibiotic resistance genes from various formats of whole genome sequences based on the CARD or ResFinder databases. Detailed annotations were used to analyze pan-resistome features and characterize distributions of ARGs. The contribution of different alleles to antibiotic resistance was predicted by a random forest classifier. Results of analysis were presented in browsable files along with a variety of visualization options. We demonstrated the performance of PRAP by analyzing the genomes of 26 Salmonella enterica isolates from Shanghai, China.

CONCLUSIONS: PRAP was effective for identifying ARGs and visualizing pan-resistome features, therefore facilitating pan-genomic investigation of ARGs. This tool has the ability to further excavate potential relationships between antibiotic resistance genes and their phenotypic traits.

RevDate: 2020-02-04

Park CJ, CP Andam (2020)

Distinct but Intertwined Evolutionary Histories of Multiple Salmonella enterica Subspecies.

mSystems, 5(1):.

Salmonella is responsible for many nontyphoidal foodborne infections and enteric (typhoid) fever in humans. Of the two Salmonella species, Salmonella enterica is highly diverse and includes 10 known subspecies and approximately 2,600 serotypes. Understanding the evolutionary processes that generate the tremendous diversity in Salmonella is important in reducing and controlling the incidence of disease outbreaks and the emergence of virulent strains. In this study, we aim to elucidate the impact of homologous recombination in the diversification of S. enterica subspecies. Using a data set of previously published 926 Salmonella genomes representing the 10 S. enterica subspecies and Salmonella bongori, we calculated a genus-wide pan-genome composed of 84,041 genes and the S. enterica pan-genome of 81,371 genes. The size of the accessory genomes varies between 12,429 genes in S. enterica subsp. arizonae (subsp. IIIa) to 33,257 genes in S. enterica subsp. enterica (subsp. I). A total of 12,136 genes in the Salmonella pan-genome show evidence of recombination, representing 14.44% of the pan-genome. We identified genomic hot spots of recombination that include genes associated with flagellin and the synthesis of methionine and thiamine pyrophosphate, which are known to influence host adaptation and virulence. Last, we uncovered within-species heterogeneity in rates of recombination and preferential genetic exchange between certain donor and recipient strains. Frequent but biased recombination within a bacterial species may suggest that lineages vary in their response to environmental selection pressure. Certain lineages, such as the more uncommon non-enterica subspecies (non-S. enterica subsp. enterica), may also act as a major reservoir of genetic diversity for the wider population. IMPORTANCE: S. enterica is a major foodborne pathogen, which can be transmitted via several distinct routes from animals and environmental sources to human hosts. Multiple subspecies and serotypes of S. enterica exhibit considerable differences in virulence, host specificity, and colonization. This study provides detailed insights into the dynamics of recombination and its contributions to S. enterica subspecies evolution. Widespread recombination within the species means that new adaptations arising in one lineage can be rapidly transferred to another lineage. We therefore predict that recombination has been an important factor in the emergence of several major disease-causing strains from diverse genomic backgrounds and their ability to adapt to disparate environments.

RevDate: 2020-01-31

Nakamura K, Murase K, Sato MP, et al (2020)

Differential dynamics and impacts of prophages and plasmids on the pangenome and virulence factor repertoires of Shiga toxin-producing Escherichia coli O145:H28.

Microbial genomics, 6(1):.

Phages and plasmids play important roles in bacterial evolution and diversification. Although many draft genomes have been generated, phage and plasmid genomes are usually fragmented, limiting our understanding of their dynamics. Here, we performed a systematic analysis of 239 draft genomes and 7 complete genomes of Shiga toxin (Stx)-producing Escherichia coli O145:H28, the major virulence factors of which are encoded by prophages (PPs) or plasmids. The results indicated that PPs are more stably maintained than plasmids. A set of ancestrally acquired PPs was well conserved, while various PPs, including Stx phages, were acquired by multiple sublineages. In contrast, gains and losses of a wide range of plasmids have frequently occurred across the O145:H28 lineage, and only the virulence plasmid was well conserved. The different dynamics of PPs and plasmids have differentially impacted the pangenome of O145:H28, with high proportions of PP- and plasmid-associated genes in the variably present and rare gene fractions, respectively. The dynamics of PPs and plasmids have also strongly impacted virulence gene repertoires, such as the highly variable distribution of stx genes and the high conservation of a set of type III secretion effectors, which probably represents the core effectors of O145:H28 and the genes on the virulence plasmid in the entire O145:H28 population. These results provide detailed insights into the dynamics of PPs and plasmids, and show the application of genomic analyses using a large set of draft genomes and appropriately selected complete genomes.

RevDate: 2020-01-14

Tetz VV, GV Tetz (2020)

A new biological definition of life.

Biomolecular concepts, 11(1):1-6 pii:/j/bmc.2020.11.issue-1/bmc-2020-0001/bmc-2020-0001.xml.

Here we have proposed a new biological definition of life based on the function and reproduction of existing genes and creation of new ones, which is applicable to both unicellular and multicellular organisms. First, we coined a new term "genetic information metabolism" comprising functioning, reproduction, and creation of genes and their distribution among living and non-living carriers of genetic information. Encompassing this concept, life is defined as organized matter that provides genetic information metabolism. Additionally, we have articulated the general biological function of life as Tetz biological law: "General biological function of life is to provide genetic information metabolism" and formulated novel definition of life: "Life is an organized matter that provides genetic information metabolism". New definition of life and Tetz biological law allow to distinguish in a new way living and non-living objects on Earth and other planets based on providing genetic information metabolism.

RevDate: 2020-01-23

Song JM, Guan Z, Hu J, et al (2020)

Eight high-quality genomes reveal pan-genome architecture and ecotype differentiation of Brassica napus.

Nature plants, 6(1):34-45.

Rapeseed (Brassica napus) is the second most important oilseed crop in the world but the genetic diversity underlying its massive phenotypic variations remains largely unexplored. Here, we report the sequencing, de novo assembly and annotation of eight B. napus accessions. Using pan-genome comparative analysis, millions of small variations and 77.2-149.6 megabase presence and absence variations (PAVs) were identified. More than 9.4% of the genes contained large-effect mutations or structural variations. PAV-based genome-wide association study (PAV-GWAS) directly identified causal structural variations for silique length, seed weight and flowering time in a nested association mapping population with ZS11 (reference line) as the donor, which were not detected by single-nucleotide polymorphisms-based GWAS (SNP-GWAS), demonstrating that PAV-GWAS was complementary to SNP-GWAS in identifying associations to traits. Further analysis showed that PAVs in three FLOWERING LOCUS C genes were closely related to flowering time and ecotype differentiation. This study provides resources to support a better understanding of the genome architecture and acceleration of the genetic improvement of B. napus.

RevDate: 2020-01-15

Jaiswal AK, Tiwari S, Jamal SB, et al (2020)

The pan-genome of Treponema pallidum reveals differences in genome plasticity between subspecies related to venereal and non-venereal syphilis.

BMC genomics, 21(1):33.

BACKGROUND: Spirochetal organisms of the Treponema genus are responsible for causing Treponematoses. Pathogenic treponemes are Gram-negative, motile spirochetes that cause syphilis in humans. Treponema pallidum subsp. endemicum (TEN) causes endemic syphilis (bejel); T. pallidum subsp. pallidum (TPA) causes venereal syphilis; T. pallidum subsp. pertenue (TPE) causes yaws; and T. pallidum subsp. carateum causes pinta. Out of these four high morbidity diseases, venereal syphilis is mediated by sexual contact; the other three diseases are transmitted by close personal contact. The global distribution of syphilis is alarming and there is an increasing need of proper treatment and preventive measures. Unfortunately, effective measures are limited.

RESULTS: Here, the genome sequences of 53 T. pallidum strains isolated from different parts of the world and a diverse range of hosts were comparatively analysed using pan-genomic strategy. Phylogenomic, pan-genomic, core genomic and singleton analysis disclosed the close connection among all strains of the pathogen T. pallidum, its clonal behaviour and showed increases in the sizes of the pan-genome. Based on the genome plasticity analysis of the subsets containing the subspecies T. pallidum subsp. pallidum, T. pallidum subsp. endemicum and T. pallidum subsp. pertenue, we found differences in the presence/absence of pathogenicity islands (PAIs) and genomic islands (GIs) on subsp.-based study.

CONCLUSIONS: In summary, we identified four pathogenicity islands (PAIs), eight genomic islands (GIs) in subsp. pallidum, whereas subsp. endemicum has three PAIs and seven GIs and subsp. pertenue harbours three PAIs and eight GIs. Concerning the presence of genes in PAIs and GIs, we found some genes related to lipid and amino acid biosynthesis that were only present in the subsp. of T. pallidum, compared to T. pallidum subsp. endemicum and T. pallidum subsp. pertenue.

RevDate: 2020-02-09

Si-Tuan N, Ngoc HM, Nhat LD, et al (2020)

Genomic features, whole-genome phylogenetic and comparative genomic analysis of extreme-drug-resistant ventilator-associated-pneumonia Acinetobacter baumannii strain in a Vietnam hospital.

Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases, 80:104178 pii:S1567-1348(20)30010-1 [Epub ahead of print].

OBJECTIVES: Acinetobacter baumannii is a major cause of ventilator-associated-pneumonia (VAP) worldwide due to its impressive propensity to rapidly acquire resistance elements to a wide range of antibacterial agents. We sought to explore the genomic features of this pathogen from a sputum specimen of a VAP male patient.

METHODS: Whole genome analysis of A. baumannii DMS06670 included de novo assembly; functional annotation, whole-genome-phylogenetic analysis, antibiotics genes identification, prophage regions, virulent factor and pan-genome analysis.

RESULTS: Assembly of whole-genome shotgun sequences of strain DMS06670 yielded an estimated genome size of 3.8 Mb with Sequence Type 447. Functional annotation and orthologous protein cluster analysis identified several potential antibiotic resistance genes (including 1 novel gene), prophage regions, and virulence factors. Clusters of orthologous groups (COGs) analysis of the protein sequences of the A. baumannii strain, compared with five other genomes, showed that the orthologous protein clusters responsible for multi-drug resistance exist in highly antimicrobial-resistant strains. Whole-genome phylogenetic and in silico MLST analysis revealed that this A. baumannii strain is in the same clade as strains LAC-4 and BJAB0715. Comparative analysis of 23 available genomes of A. baumannii revealed a pan-genome consisting of 15,883 genes.

CONCLUSION: Our findings provide insight into the virulence-associated genes of this strain, which were compared with the genomes of other A. baumannii strains by calculation of ANI values and pan-genome analysis. Functional studies of these pathogens are required to validate these findings.

RevDate: 2020-01-12

Rodriguez CI, JBH Martiny (2020)

Evolutionary relationships among bifidobacteria and their hosts and environments.

BMC genomics, 21(1):26.

BACKGROUND: The assembly of animal microbiomes is influenced by multiple environmental factors and host genetics, although the relative importance of these factors remains unclear. Bifidobacteria (genus Bifidobacterium, phylum Actinobacteria) are common first colonizers of gut microbiomes in humans and inhabit other mammals, social insects, food, and sewage. In humans, the presence of bifidobacteria in the gut has been correlated with health-promoting benefits. Here, we compared the genome sequences of a subset of the over 400 Bifidobacterium strains publicly available to investigate the adaptation of bifidobacteria diversity. We tested 1) whether bifidobacteria show a phylogenetic signal with their isolation sources (hosts and environments) and 2) whether key traits encoded by the bifidobacteria genomes depend on the host or environment from which they were isolated. We analyzed Bifidobacterium genomes available in the PATRIC and NCBI repositories and identified the hosts and/or environment from which they were isolated. A multilocus phylogenetic analysis was conducted to compare the genetic relatedness of the strains harbored by different hosts and environments. Furthermore, we examined differences in genomic traits and genes related to amino acid biosynthesis and degradation of carbohydrates.

RESULTS: We found that bifidobacteria diversity appears to have evolved with their hosts as strains isolated from the same host were non-randomly associated with their phylogenetic relatedness. Moreover, bifidobacteria isolated from different sources displayed differences in genomic traits such as genome size and accessory gene composition and on particular traits related to amino acid production and degradation of carbohydrates. In contrast, when analyzing diversity within human-derived bifidobacteria, we observed no phylogenetic signal or differences on specific traits (amino acid biosynthesis genes and CAZymes).

CONCLUSIONS: Overall, our study shows that bifidobacteria diversity is strongly adapted to specific hosts and environments and that several genomic traits were associated with their isolation sources. However, this signal is not observed in human-derived strains alone. Looking into the genomic signatures of bifidobacteria strains in different environments can give insights into how this bacterial group adapts to their environment and what types of traits are important for these adaptations.

RevDate: 2020-02-11

Garcia Teijeiro R, Belimov AA, IC Dodd (2019)

Microbial inoculum development for ameliorating crop drought stress: A case study of Variovorax paradoxus 5C-2.

New biotechnology, 56:103-113 pii:S1871-6784(19)30008-1 [Epub ahead of print].

Drought affects plant hormonal homeostasis, including root to shoot signalling. The plant is intimately connected below-ground with soil-dwelling microbes, including plant growth promoting rhizobacteria (PGPR) that can modulate plant hormonal homeostasis. Incorporating PGPR into the rhizosphere often delivers favourable results in greenhouse experiments, while field applications are much less predictable. We review the natural processes that affect the formation and dynamics of the rhizosphere, establishing a model for successful field application of PGPR utilizing an example microbial inoculum, Variovorax paradoxus 5C-2.

RevDate: 2020-01-03

Rasheed A, Takumi S, Hassan MA, et al (2020)

Appraisal of wheat genomics for gene discovery and breeding applications: a special emphasis on advances in Asia.

TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik pii:10.1007/s00122-019-03523-w [Epub ahead of print].

KEY MESSAGE: We discussed the most recent efforts in wheat functional genomics to discover new genes and their deployment in breeding with special emphasis on advances in Asian countries. The wheat research community is making significant progress in bridging the genotype-to-phenotype gap and applying this knowledge in genetic improvement. The advances in genomics and phenomics have intrigued wheat researchers in Asia to make best use of this knowledge in gene and trait discovery. These advancements include, but are not limited to, map-based gene cloning, translational genomics, gene mapping, association genetics, gene editing and genomic selection. We reviewed more than 57 homeologous genes discovered underpinning important traits and multiple strategies used for their discovery. Further, the complementary advancements in wheat phenomics and analytical approaches to understand the genetics of wheat adaptability, resilience to climate extremes and resistance to pests and diseases were discussed. The challenge to build a gold standard reference genome sequence of bread wheat is now achieved and several de novo reference sequences from the cultivars representing different gene pools will be available soon. New pan-genome sequencing resources of wheat will strengthen the foundation required for accelerated gene discovery and provide more opportunities to practice knowledge-based breeding.

RevDate: 2020-01-11

Sulthana A, Lakshmi SG, RS Madempudi (2019)

High-quality draft genome and characterization of commercially potent probiotic Lactobacillus strains.

Genomics & informatics, 17(4):e43.

Lactobacillus acidophilus UBLA-34, L. paracasei UBLPC-35, L. plantarum UBLP-40, and L. reuteri UBLRU-87 were isolated from different varieties of fermented foods. To determine the probiotic safety at the strain level, the whole genome of the respective strains was sequenced, assembled, and characterized. Both the core-genome and pan-genome phylogeny showed that L. reuteri was closer to L. plantarum than to L. acidophilus, which was closest to L. paracasei. The genomic analysis of all the strains confirmed the absence of genes encoding putative virulence factors, antibiotic resistance, and the plasmids.

RevDate: 2020-01-08

Hu H, Yuan Y, Bayer PE, et al (2020)

Legume Pangenome Construction Using an Iterative Mapping and Assembly Approach.

Methods in molecular biology (Clifton, N.J.), 2107:35-47.

A pangenome is a collection of genomic sequences found in the entire species rather than a single individual. It allows for comprehensive, species-wide characterization of genetic variations and mining of variable genes which may play important roles in phenotypes of interest. Recent advances in sequencing technologies have facilitated draft genome sequence construction and have made pangenome constructions feasible. Here, we present a reference genome-based iterative mapping and assembly method to construct a pangenome for a legume species.

RevDate: 2020-02-11

Kim Y, Gu C, Kim HU, et al (2019)

Current status of pan-genome analysis for pathogenic bacteria.

Current opinion in biotechnology, 63:54-62 pii:S0958-1669(19)30138-7 [Epub ahead of print].

Biological knowledge accumulated over the decades and advances in computational methods have facilitated the implementation of pan-genome analysis that aims at better understanding of genotype-phenotype associations of a specific group of organisms. Pan-genome analysis has been shown to be an effective approach to better understand a clade of pathogenic bacteria because it helps develop varied and tailored therapeutic strategies on the basis of their biological similarities and differences. Here, we review recent progress in the pan-genome analysis of pathogenic bacteria. In particular, we focus on computational tools that allow streamlined pan-genome analysis. Also, various applications of pan-genome analysis including those relevant to devising strategies for the prevention and treatment of pathogenic bacteria are reviewed.

RevDate: 2020-02-05

Coutinho FH, Edwards RA, F Rodríguez-Valera (2019)

Charting the diversity of uncultured viruses of Archaea and Bacteria.

BMC biology, 17(1):109.

BACKGROUND: Viruses of Archaea and Bacteria are among the most abundant and diverse biological entities on Earth. Unraveling their biodiversity has been challenging due to methodological limitations. Recent advances in culture-independent techniques, such as metagenomics, shed light on the unknown viral diversity, revealing thousands of new viral nucleotide sequences at an unprecedented scale. However, these novel sequences have not been properly classified and the evolutionary associations between them were not resolved.

RESULTS: Here, we performed phylogenomic analysis of nearly 200,000 viral nucleotide sequences to establish GL-UVAB: Genomic Lineages of Uncultured Viruses of Archaea and Bacteria. The pan-genome content of the identified lineages shed light on some of their infection strategies, potential to modulate host physiology, and mechanisms to escape host resistance systems. Furthermore, using GL-UVAB as a reference database for annotating metagenomes revealed elusive habitat distribution patterns of viral lineages and environmental drivers of community composition.

CONCLUSIONS: These findings provide insights about the genomic diversity and ecology of viruses of prokaryotes. The source code used in these analyses is freely available at

RevDate: 2020-01-17

Golicz AA, Bayer PE, Bhalla PL, et al (2020)

Pangenomics Comes of Age: From Bacteria to Plant and Animal Applications.

Trends in genetics : TIG, 36(2):132-145.

The pangenome refers to a collection of genomic sequence found in the entire species or population rather than in a single individual; the sequence can be core, present in all individuals, or accessory (variable or dispensable), found in a subset of individuals only. While pangenomic studies were first undertaken in bacterial species, developments in genome sequencing and assembly approaches have allowed construction of pangenomes for eukaryotic organisms, fungi, plants, and animals, including two large-scale human pangenome projects. Analysis of these pangenomes revealed key differences, most likely stemming from divergent evolutionary histories, but also surprising similarities.

RevDate: 2020-01-08

Lee IPA, CP Andam (2019)

Pan-genome diversification and recombination in Cronobacter sakazakii, an opportunistic pathogen in neonates, and insights to its xerotolerant lifestyle.

BMC microbiology, 19(1):306.

BACKGROUND: Cronobacter sakazakii is an emerging opportunistic bacterial pathogen known to cause neonatal and pediatric infections, including meningitis, necrotizing enterocolitis, and bacteremia. Multiple disease outbreaks of C. sakazakii have been documented in the past few decades, yet little is known of its genomic diversity, adaptation, and evolution. Here, we analyzed the pan-genome characteristics and phylogenetic relationships of 237 genomes of C. sakazakii and 48 genomes of related Cronobacter species isolated from diverse sources.

RESULTS: The C. sakazakii pan-genome contains 17,158 orthologous gene clusters, and approximately 19.5% of these constitute the core genome. Phylogenetic analyses reveal the presence of at least ten deep branching monophyletic lineages indicative of ancestral diversification. We detected enrichment of functions involved in proton transport and rotational mechanism in accessory genes exclusively found in human-derived strains. In environment-exclusive accessory genes, we detected enrichment for those involved in tryptophan biosynthesis and indole metabolism. However, we did not find significantly enriched gene functions for those genes exclusively found in food strains. The most frequently detected virulence genes are those that encode proteins associated with chemotaxis, enterobactin synthesis, ferrienterobactin transporter, type VI secretion system, galactose metabolism, and mannose metabolism. The genes fos, which encodes resistance against fosfomycin, a broad-spectrum cell wall synthesis inhibitor, and mdf(A), which encodes a multidrug efflux transporter, were found in nearly all genomes. We found that a total of 2991 genes in the pan-genome have had a history of recombination. Many of the most frequently recombined genes are associated with nutrient acquisition, metabolism and toxin production.

CONCLUSIONS: Overall, our results indicate that the presence of a large accessory gene pool, ability to switch between ecological niches, a diverse suite of antibiotic resistance, virulence and niche-specific genes, and frequent recombination partly explain the remarkable adaptability of C. sakazakii within and outside the human host. These findings provide critical insights that can help define the development of effective disease surveillance and control strategies for Cronobacter-related diseases.

RevDate: 2020-02-04

Wang Y, Luo L, Li Q, et al (2019)

Genomic dissection of the most prevalent Listeria monocytogenes clone, sequence type ST87, in China.

BMC genomics, 20(1):1014.

BACKGROUND: Listeria monocytogenes consists of four lineages that occupy a wide variety of ecological niches. Sequence type (ST) 87 (serotype 1/2b), belonging to lineage I, is one of the most common STs isolated from food products, food associated environments and sporadic listeriosis in China. Here, we performed a comparative genomic analysis of the L. monocytogenes ST87 clone by sequencing 71 strains representing a diverse range of sources, different geographical locations and isolation years.

RESULTS: The core genome and pan genome of ST87 contained 2667 genes and 3687 genes respectively. Phylogenetic analysis based on core genome SNPs divided the 71 strains into 10 clades. The clinical strains were distributed among multiple clades. Four clades contained strains from multiple geographic regions and showed high genetic diversity. The major gene content variation of ST87 genomes was due to putative prophages, with eleven hotspots of the genome that harbor prophages. All strains carry an intact CRISPR/Cas system. Two major CRISPR spacer profiles were found which were not clustered phylogenetically. A large plasmid of about 90 Kb, which carried heavy metal resistance genes, was found in 32.4% (23/71) of the strains. All ST87 strains harbored the Listeria pathogenicity island (LIPI)-4 and a unique 10-open reading frame (ORF) genomic island containing a novel restriction-modification system.

CONCLUSION: Whole genome sequence analysis of L. monocytogenes ST87 enabled a clearer understanding of the population structure and the evolutionary history of ST87 L. monocytogenes in China. The novel genetic elements identified may contribute to its virulence and adaptation to different environmental niches. Our findings will be useful for the development of effective strategies for the prevention and treatment of listeriosis caused by this prevalent clone.

RevDate: 2020-01-08

Albert K, Rani A, DA Sela (2019)

Comparative Pangenomics of the Mammalian Gut Commensal Bifidobacterium longum.

Microorganisms, 8(1): pii:microorganisms8010007.

Bifidobacterium longum colonizes mammalian gastrointestinal tracts where it could metabolize host-indigestible oligosaccharides. Although B. longum strains are currently segregated into three subspecies that reflect common metabolic capacities and genetic similarity, heterogeneity within subspecies suggests that these taxonomic boundaries may not be completely resolved. To address this, the B. longum pangenome was analyzed from representative strains isolated from a diverse set of sources. As a result, the B. longum pangenome is open and contains almost 17,000 genes, with over 85% of genes found in ≤28 of 191 strains. B. longum genomes share a small core gene set of only ~500 genes, or ~3% of the total pangenome. Although the individual B. longum subspecies pangenomes share similar relative abundances of clusters of orthologous groups, strains show inter- and intrasubspecies differences with respect to carbohydrate utilization gene content and growth phenotypes.

RevDate: 2019-12-18

Sitto F, FU Battistuzzi (2019)

Estimating PanGenomes with Roary.

Molecular biology and evolution pii:5652084 [Epub ahead of print].

A description of the genetic make-up of a species based on a single genome is often insufficient because it ignores the variability in gene repertoire among multiple strains. The estimation of the pangenome of a species is a solution to this issue as it provides an overview of genes that are shared by all strains and genes that are present in only some of the genomes. These different sets of genes can then be analyzed functionally to explore correlations with unique phenotypes and adaptations. This protocol presents the usage of Roary, a Linux-native pangenome application. Roary is a straightforward software that provides (i) an overview about core and accessory genes for those interested in general trends and, also, (ii) detailed information on gene presence/absence in each genome for in-depth analyses. Results are provided both in text and graphic format.
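The split between genes shared by all strains and genes present in only some genomes can be pictured with a short Python sketch. This is only an illustration under stated assumptions: the toy table, the gene and genome names, and the function `split_core_accessory` are hypothetical and are not taken from Roary's code or output. Treating "core" as "present in every genome" is also an assumption; a lower threshold can be passed through `core_fraction`.

```python
from typing import Dict, Set, Tuple


def split_core_accessory(
    presence: Dict[str, Set[str]], n_genomes: int, core_fraction: float = 1.0
) -> Tuple[Set[str], Set[str]]:
    """Split genes into core and accessory sets.

    presence maps each gene name to the set of genomes carrying it.
    A gene is core when it occurs in at least core_fraction of the genomes.
    """
    core, accessory = set(), set()
    for gene, genomes in presence.items():
        if len(genomes) >= core_fraction * n_genomes:
            core.add(gene)
        else:
            accessory.add(gene)
    return core, accessory


# Hypothetical toy data: three genomes (g1..g3), four genes.
toy = {
    "geneA": {"g1", "g2", "g3"},
    "geneB": {"g1", "g2", "g3"},
    "geneC": {"g1"},
    "geneD": {"g2", "g3"},
}
core, accessory = split_core_accessory(toy, n_genomes=3)
print(sorted(core))       # ['geneA', 'geneB']
print(sorted(accessory))  # ['geneC', 'geneD']
```

The per-gene sets in `presence` correspond to the "gene presence/absence in each genome" view, while the two returned sets give the overview of general trends.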

RevDate: 2019-12-18

Heo S, Lee JS, Lee JH, et al (2019)

Comparative genomic analysis of food-originated coagulase-negative Staphylococcus: Analysis of conserved core genes and diversity of the pan-genome.

Journal of microbiology and biotechnology pii:10.4014/jmb.1910.10049 [Epub ahead of print].

To shed light on the genetic differences among food-originated coagulase-negative Staphylococcus (CNS), we performed pan-genome analysis of five species: Staphylococcus carnosus (two strains), Staphylococcus equorum (two strains), Staphylococcus succinus (three strains), Staphylococcus xylosus (two strains), and Staphylococcus saprophyticus (one strain). The pan-genome size increases with each new strain and currently holds about 4,500 genes from 10 genomes. Specific genes were shown to be strain dependent but not species dependent. Most specific genes were of unknown function or encoded restriction-modification enzymes, transposases, or prophages. Our results indicate that unique genes have been acquired or lost by convergent evolution within individual strains.
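The observation that the pan-genome grows with each added strain is usually read off an accumulation curve. The following Python sketch shows one simple way such a curve could be computed; it is not the method used in the study. The strain names, gene labels, and the function `accumulation_curve` are hypothetical, and a real analysis would typically average over many random orderings of the genomes.

```python
from typing import Dict, List, Set, Tuple


def accumulation_curve(
    gene_sets: Dict[str, Set[str]], order: List[str]
) -> Tuple[List[int], List[int]]:
    """Return (pan_sizes, core_sizes) after adding genomes in the given order."""
    pan: Set[str] = set()
    core: Set[str] = set()
    pan_sizes, core_sizes = [], []
    for i, name in enumerate(order):
        genes = gene_sets[name]
        pan |= genes  # the union can only grow or stay flat
        # the intersection can only shrink or stay flat
        core = set(genes) if i == 0 else core & genes
        pan_sizes.append(len(pan))
        core_sizes.append(len(core))
    return pan_sizes, core_sizes


# Hypothetical gene content for three strains.
strains = {
    "s1": {"a", "b", "c"},
    "s2": {"a", "b", "d"},
    "s3": {"a", "e", "f"},
}
pan_sizes, core_sizes = accumulation_curve(strains, ["s1", "s2", "s3"])
print(pan_sizes)   # [3, 4, 6]
print(core_sizes)  # [3, 2, 1]
```

A pan-genome curve that keeps rising as genomes are added, as in this toy case, is the pattern commonly described as an open pan-genome.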

RevDate: 2020-01-08

Liang CY, Yang CH, Lai CH, et al (2019)

Comparative Genomics of 86 Whole-Genome Sequences in the Six Species of the Elizabethkingia Genus Reveals Intraspecific and Interspecific Divergence.

Scientific reports, 9(1):19167.

Bacteria of the genus Elizabethkingia are emerging infectious agents that can cause infection in humans. The number of published whole-genome sequences of Elizabethkingia is rapidly increasing. In this study, we used comparative genomics to investigate the genomes of the six species in the Elizabethkingia genus, namely E. meningoseptica, E. anophelis, E. miricola, E. bruuniana, E. ursingii, and E. occulta. In silico DNA-DNA hybridization, whole-genome sequence-based phylogeny, pan genome analysis, and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed, and clusters of orthologous groups were evaluated. Of the 86 whole-genome sequences available in GenBank, 21 were complete genome sequences and 65 were shotgun sequences. In silico DNA-DNA hybridization clearly delineated the six Elizabethkingia species. Phylogenetic analysis confirmed that E. bruuniana, E. ursingii, and E. occulta were closer to E. miricola than to E. meningoseptica and E. anophelis. A total of 2,609 clusters of orthologous groups were identified among the six type strains of the Elizabethkingia genus. Metabolism-related clusters of orthologous groups accounted for the majority of gene families in KEGG analysis. New genes were identified that substantially increased the total repertoire of the pan genome after the addition of 86 Elizabethkingia genomes, which suggests that Elizabethkingia has shown adaptive evolution to environmental change. This study presents a comparative genomic analysis of Elizabethkingia, and the results of this study provide knowledge that facilitates a better understanding of this microorganism.

RevDate: 2020-02-05

D'Mello A, Ahearn CP, Murphy TF, et al (2019)

ReVac: a reverse vaccinology computational pipeline for prioritization of prokaryotic protein vaccine candidates.

BMC genomics, 20(1):981.

BACKGROUND: Reverse vaccinology accelerates the discovery of potential vaccine candidates (PVCs) prior to experimental validation. Current programs typically use one bacterial proteome to identify PVCs through a filtering architecture using feature prediction programs or a machine learning approach. Filtering approaches may eliminate potential antigens based on limitations in the accuracy of prediction tools used. Machine learning approaches are heavily dependent on the selection of training datasets with experimentally validated antigens (positive control) and non-protective-antigens (negative control). The use of one or few bacterial proteomes does not assess PVC conservation among strains, an important feature of vaccine antigens.

RESULTS: We present ReVac, which implements both a panoply of feature prediction programs without filtering out proteins, and scoring of candidates based on predictions made on curated positive and negative control PVCs datasets. ReVac surveys several genomes assessing protein conservation, as well as DNA and protein repeats, which may result in variable expression of PVCs. ReVac's orthologous clustering of conserved genes, identifies core and dispensable genome components. This is useful for determining the degree of conservation of PVCs among the population of isolates for a given pathogen. Potential vaccine candidates are then prioritized based on conservation and overall feature-based scoring. We present the application of ReVac, applied to 69 Moraxella catarrhalis and 270 non-typeable Haemophilus influenzae genomes, prioritizing 64 and 29 proteins as PVCs, respectively.

CONCLUSION: ReVac's use of a scoring scheme ranks PVCs for subsequent experimental testing. It employs a redundancy-based approach in its predictions of features using several prediction tools. The protein's features are collated, and each protein is ranked based on the scoring scheme. Multi-genome analyses performed in ReVac allow for a comprehensive overview of PVCs from a pan-genome perspective, as an essential pre-requisite for any bacterial subunit vaccine design. ReVac prioritized PVCs of two human respiratory pathogens, identifying both novel and previously validated PVCs.

RevDate: 2019-12-26

Haro-Moreno JM, Rodriguez-Valera F, Rosselli R, et al (2019)

Ecogenomics of the SAR11 clade.

Environmental microbiology [Epub ahead of print].

Members of the SAR11 clade, despite their high abundance, are often poorly represented by metagenome-assembled genomes. This fact has hampered our knowledge about their ecology and genetic diversity. Here we examined 175 SAR11 genomes, including 47 new single-amplified genomes. The presence of the first genomes associated with subclade IV suggests that, in the same way as subclade V, they might be outside the proposed Pelagibacterales order. An expanded phylogenomic classification together with patterns of metagenomic recruitment at a global scale have allowed us to define new ecogenomic units of classification (genomospecies), appearing at different, and sometimes restricted, metagenomic data sets. We detected greater microdiversity across the water column at a single location than in samples collected from similar depth across the global ocean, suggesting little influence of biogeography. In addition, pangenome analysis revealed that the flexible genome was essential to shape genomospecies distribution. In one genomospecies preferentially found within the Mediterranean, a set of genes involved in phosphonate utilization was detected. Another, with a more cosmopolitan distribution, was unique in having an aerobic purine degradation pathway. Together, these results provide a glimpse of the enormous genomic diversity within this clade at a finer resolution than the currently defined clades.

RevDate: 2019-12-15

Choi JY, Kim SC, PC Lee (2019)

Comparative genome analysis of Psychrobacillus strain PB01, isolated from an iceberg.

Journal of microbiology and biotechnology pii:10.4014/jmb.1909.09008 [Epub ahead of print].

A novel psychrotolerant Psychrobacillus strain PB01, isolated from an Antarctic iceberg, was comparatively analyzed with five related strains. The complete genome of strain PB01 consists of a single circular chromosome (4.3 Mbp) and a plasmid (19 Kbp). As potential low-temperature adaptation strategies, strain PB01 has four genes encoding cold-shock proteins, two genes encoding DEAD-box RNA helicases, and eight genes encoding transporters for glycine betaine, which can serve as a cryoprotectant, on the genome. The pan-genome structure of the six Psychrobacillus strains suggests that strain PB01 might evolve to adapt to extreme environments by changing genome content such as high capacity for DNA repair, translation, and membrane transporter. Notably, strain PB01 possesses a complete TCA cycle consisting of eight enzymes as well as three additional Helicobacter pylori-type enzymes: ferredoxin-dependent 2-oxoglutarate synthase, succinyl-CoA/acetoacetyl-CoA transferase, and malate/quinone oxidoreductase. The co-existence of the genes for TCA cycle enzymes is also identified in the other five Psychrobacillus strains.

RevDate: 2019-12-18

Lee BH, Cole S, Badel-Berchoux S, et al (2019)

Biofilm Formation of Listeria monocytogenes Strains Under Food Processing Environments and Pan-Genome-Wide Association Study.

Frontiers in microbiology, 10:2698.

Concerns about food contamination by Listeria monocytogenes are on the rise with increasing consumption of ready-to-eat foods. Biofilm production of L. monocytogenes is presumed to be one of the ways that confer its increased resistance and persistence in the food chain. In this study, a collection of isolates from foods and food processing environments (FPEs) representing persistent, prevalent, and rarely detected genotypes was evaluated for biofilm forming capacities including adhesion and sessile biomass production under diverse environmental conditions. The quantity of sessile biomass varied according to growth conditions, lineage, serotype as well as genotype, but association of clonal complex (CC) 26 genotype with biofilm production was evidenced under cold temperature. In general, relative biofilm productivity of each strain varied inconsistently across growth conditions. Under our experimental conditions, there were no clear associations between biofilm formation efficiency and persistent or prevalent genotypes. Distinct extrinsic factors affected specific steps of biofilm formation. Sudden nutrient deprivation enhanced cellular adhesion while a prolonged nutrient deficiency impeded biofilm maturation. Salt addition increased biofilm production; moreover, nutrient limitation supplemented by salt significantly stimulated biofilm formation. Pan-genome-wide association study (Pan-GWAS) assessed genetic composition with regard to biofilm phenotypes for the first time. The number of reported genes differed depending on the growth conditions and the number of common genes was low. However, a broad overview of the ontology contents revealed similar patterns regardless of the conditions. Functional analysis showed that functions related to transformation/competence and surface proteins including Internalins were highly enriched.

RevDate: 2020-01-08

Jandrasits C, Kröger S, Haas W, et al (2019)

Computational pan-genome mapping and pairwise SNP-distance improve detection of Mycobacterium tuberculosis transmission clusters.

PLoS computational biology, 15(12):e1007527.

Next-generation sequencing based base-by-base distance measures have become an integral complement to epidemiological investigation of infectious disease outbreaks. This study introduces PANPASCO, a computational pan-genome mapping based, pairwise distance method that is highly sensitive to differences between cases, even when located in regions of lineage specific reference genomes. We show that our approach is superior to previously published methods in several datasets and across different Mycobacterium tuberculosis lineages, as its characteristics allow the comparison of a high number of diverse samples in one analysis, a scenario that becomes more and more likely with the increased usage of whole-genome sequencing in transmission surveillance.
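The abstract does not describe how the pairwise distance is computed, so the Python sketch below is not PANPASCO's implementation. It only illustrates one generic way to build a pairwise SNP distance, under the assumption that positions missing in either sample of a pair are skipped for that pair alone rather than dropped from the whole dataset. The `MISSING` symbol, the sample names, and both function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple

MISSING = "N"  # assumed symbol for an uncalled or uncovered position


def pairwise_snp_distance(a: str, b: str) -> int:
    """Count differing positions, skipping sites missing in either sample."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1 for x, y in zip(a, b)
        if x != MISSING and y != MISSING and x != y
    )


def distance_matrix(samples: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Compute the distance for every unordered pair of samples."""
    return {
        (p, q): pairwise_snp_distance(samples[p], samples[q])
        for p, q in combinations(sorted(samples), 2)
    }


# Hypothetical aligned variant sites for three samples.
samples = {"s1": "ACGTA", "s2": "ACGNA", "s3": "TCGTC"}
dists = distance_matrix(samples)
print(dists)
# {('s1', 's2'): 0, ('s1', 's3'): 2, ('s2', 's3'): 2}
```

Because the missing site in `s2` is ignored only in the pairs that involve `s2`, the comparison of `s1` with `s3` still uses all five positions.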

RevDate: 2020-02-05

Emery A, Marpaux N, Naegelen C, et al (2020)

Genotypic study of Citrobacter koseri, an emergent platelet contaminant since 2012 in France.

Transfusion, 60(2):245-249.

BACKGROUND: Transfusion-transmitted bacterial infection is a rare occurrence but the most feared complication in transfusion practices. Between 2012 and 2017, five cases of platelet concentrates (PCs) contaminated with the bacterial pathogen Citrobacter koseri (PC-Ck) have been reported in France, with two leading to the death of the recipients. We tested the possibilities of the emergence of a PC-specific clone of C. koseri (Ck) and of specific bacterial genes associated with PC contamination.

STUDY DESIGN AND METHODS: The phylogenetic network, based on a homemade Ck core genome scheme, inferred from the genomes of 20 worldwide Ck isolates unrelated to PC contamination taken as controls (U-Ck) and the genomes of the five PC-Ck, explored the clonal relationship between the genomes and evaluated the distribution of PC-Ck throughout the species. Along with this core genome multilocus sequence typing approach, a Ck pan genome has been used to seek genes specific to PC-Ck isolates.

RESULTS: Our genomic approach suggested that the population of C. koseri is nonclonal, although it also identified a cluster containing three PC-Ck and eight U-Ck. Indeed, the PC-Ck did not share any specific genes.

CONCLUSION: The elevated incidence of PCs contaminated by C. koseri in France between 2012 and 2017 was not due to the dissemination of a clone. The determinants of the recent outbreaks of PC contamination with C. koseri are still unknown.

RevDate: 2019-12-09

Li R, Fu W, Su R, et al (2019)

Towards the Complete Goat Pan-Genome by Recovering Missing Genomic Segments From the Reference Genome.

Frontiers in genetics, 10:1169.

It is broadly expected that next generation sequencing will ultimately generate a complete genome as is the latest goat reference genome (ARS1), which is considered to be one of the most continuous assemblies in livestock. However, the rich diversity of worldwide goat breeds indicates that a genome from one individual would be insufficient to represent the whole genomic contents of goats. By comparing nine de novo assemblies from seven sibling species of domestic goat with ARS1 and using resequencing and transcriptome data from goats for verification, we identified a total of 38.3 Mb sequences that were absent in ARS1. The pan-sequences contain genic fractions with considerable expression. Using the pan-genome (ARS1 together with the pan-sequences) as a reference genome, variation calling efficacy can be appreciably improved. A total of 56,657 spurious SNPs per individual were repressed and 24,414 novel SNPs per individual on average were recovered as a result of better reads mapping quality. The transcriptomic mapping rate was also increased by ∼1.15%. Our study demonstrated that comparing de novo assemblies from closely related species is an efficient and reliable strategy for finding missing sequences from the reference genome and could be applicable to other species. Pan-genome can serve as an improved reference genome in animals for a better exploration of the underlying genomic variations and could increase the probability of finding genotype-phenotype associations assessed by a comprehensive variation database containing much more differences between individuals. We have constructed a goat pan-genome web interface for data visualization (

RevDate: 2019-12-04

Sutton D, Livingstone PG, Furness E, et al (2019)

Genome-Wide Identification of Myxobacterial Predation Genes and Demonstration of Formaldehyde Secretion as a Potentially Predation-Resistant Trait of Pseudomonas aeruginosa.

Frontiers in microbiology, 10:2650.

Despite widespread use in human biology, genome-wide association studies (GWAS) of bacteria are few and have, to date, focused primarily on pathogens. Myxobacteria are predatory microbes with large patchwork genomes, with individual strains secreting unique cocktails of predatory proteins and metabolites. We investigated whether a GWAS strategy could be applied to myxobacteria to identify genes associated with predation. Deduced proteomes from 29 myxobacterial genomes (including eight Myxococcus genomes sequenced for this study), were clustered into orthologous groups, and the presence/absence of orthologues assessed in superior and inferior predators of ten prey organisms. 139 'predation genes' were identified as being associated significantly with predation, including some whose annotation suggested a testable predatory mechanism. Formaldehyde dismutase (fdm) was associated with superior predation of Pseudomonas aeruginosa, and predatory activity of a strain lacking fdm could be increased by the exogenous addition of a formaldehyde detoxifying enzyme, suggesting that production of formaldehyde by P. aeruginosa acts as an anti-predation behaviour. This study establishes the utility of bacterial GWAS to investigate microbial processes beyond pathogenesis, giving plausible and verifiable associations between gene presence/absence and predatory phenotype. We propose that the slow growth rate of myxobacteria, coupled with their predatory mechanism of constitutive secretion, has rendered them relatively resistant to genome streamlining. The resultant genome expansion made possible their observed accumulation of prey-specific predatory genes, without requiring them to be selected for by frequent or recent predation on diverse prey, potentially explaining both the large pan-genome and broad prey range of myxobacteria.
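The abstract does not name the statistical test behind the gene presence/absence associations. As a hedged illustration only, the Python sketch below applies a one-sided Fisher's exact test to a 2x2 table of strain counts, which is one common choice for this kind of presence/absence comparison and is an assumption here. The counts and the function name `fisher_exact_greater` are hypothetical.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for a 2x2 table.

    Table layout (counts of strains):
                         gene present   gene absent
    superior predators        a              b
    inferior predators        c              d

    Returns P(X >= a) under the hypergeometric null of no association.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


# Hypothetical counts: gene present in 7 of 8 superior and 1 of 8 inferior predators.
p = fisher_exact_greater(7, 1, 1, 7)
print(round(p, 4))  # 0.0051
```

In a scan over many orthologous groups, a p-value like this would be computed per group and then corrected for multiple testing before any gene is called associated.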

RevDate: 2020-01-08

Yuan J, Li YY, Xu Y, et al (2019)

Molecular Signatures Related to the Virulence of Bacillus cereus Sensu Lato, a Leading Cause of Devastating Endophthalmitis.

mSystems, 4(6):.

Bacillus endophthalmitis is a devastating eye infection that causes rapid blindness through extracellular tissue-destructive exotoxins. Despite its importance, knowledge of the phylogenetic relationships and population structure of intraocular Bacillus spp. is lacking. In this study, we sequenced the whole genomes of eight Bacillus intraocular pathogens independently isolated from 8/52 patients with posttraumatic Bacillus endophthalmitis infections in the Eye Hospital of Wenzhou Medical University between January 2010 and December 2018. Phylogenetic analysis revealed that the pathogenic intraocular isolates belonged to Bacillus cereus, Bacillus thuringiensis and Bacillus toyonensis. To determine the virulence of the ocular isolates, three representative strains were injected into mouse models, and severe endophthalmitis leading to blindness was observed. Through incorporating publicly available genomes for Bacillus spp., we found that the intraocular pathogens could be isolated independently but displayed a similar genetic context. In addition, our data provide genome-wide support for intraocular and gastrointestinal sources of Bacillus spp. belonging to different lineages. Importantly, we identified five molecular signatures of virulence and motility genes associated with intraocular infection, namely, plcA-2, InhA-3, InhA-4, hblA-5, and fliD using pangenome-wide association studies. The characterization of overrepresented genes in the intraocular isolates holds value to predict bacterial evolution and for the design of future intervention strategies in patients with endophthalmitis. IMPORTANCE: In this study, we provided a detailed and comprehensive clinicopathological and pathogenic report of Bacillus endophthalmitis over the 8 years of the study period. We first reported the whole-genome sequence of Bacillus spp. causing devastating endophthalmitis and found that Bacillus toyonensis is able to cause endophthalmitis. Finally, we revealed significant endophthalmitis-associated virulence genes involved in hemolysis, immunity inhibition, and pathogenesis. Overall, as more sequencing data sets become available, these data will facilitate comparative research and will reveal the emergence of pathogenic "ocular bacteria."

RevDate: 2020-02-10

Khan AW, Garg V, Roorkiwal M, et al (2020)

Super-Pangenome by Integrating the Wild Side of a Species for Accelerated Crop Improvement.

Trends in plant science, 25(2):148-158.

The pangenome provides genomic variations in the cultivated gene pool for a given species. However, as the crop's gene pool comprises many species, especially wild relatives with diverse genetic stock, here we suggest using accessions from all available species of a given genus for the development of a more comprehensive and complete pangenome, which we refer to as a super-pangenome. The super-pangenome provides a complete genomic variation repertoire of a genus and offers unprecedented opportunities for crop improvement. This opinion article focuses on recent developments in crop pangenomics, the need for a super-pangenome that should include wild species, and its application for crop improvement.

RevDate: 2019-12-06

Chaudhry V, PB Patil (2019)

Evolutionary insights into adaptation of Staphylococcus haemolyticus to human and non-human niches.

Genomics pii:S0888-7543(19)30804-3 [Epub ahead of print].

Staphylococcus haemolyticus is a well-known member of human skin microbiome and an emerging opportunistic human pathogen. Presently, evolutionary studies are limited to human isolates even though it is reported from plants with beneficial properties and in environmental settings. In the present study, we report isolation of novel S. haemolyticus strains from surface sterilized rice seeds and compare their genome to other isolates from diverse niches available in public domain. The study showed expanding nature of pan-genome and revealed set of genes with putative functions related to its adaptability. This is seen by presence of type II lanthipeptide cluster in rice isolates, metal homeostasis genes in an isolate from copper coin and gene encoding methicillin resistance in human isolates. The present study on differential genome dynamics and role of horizontal gene transfers has provided novel insights into capability for ecological diversification of a bacterium of significance to human health.

RevDate: 2019-12-01

Peeters C, De Canck E, Cnockaert M, et al (2019)

Comparative Genomics of Pandoraea, a Genus Enriched in Xenobiotic Biodegradation and Metabolism.

Frontiers in microbiology, 10:2556.

Comparative analysis of partial gyrB, recA, and gltB gene sequences of 84 Pandoraea reference strains and field isolates revealed several clusters that included no taxonomic reference strains. The gyrB, recA, and gltB phylogenetic trees were used to select 27 strains for whole-genome sequence analysis and for a comparative genomics study that also included 41 publicly available Pandoraea genome sequences. The phylogenomic analyses included a Genome BLAST Distance Phylogeny approach to calculate pairwise digital DNA-DNA hybridization values and their confidence intervals, average nucleotide identity analyses using the OrthoANIu algorithm, and a whole-genome phylogeny reconstruction based on 107 single-copy core genes using bcgTree. These analyses, along with subsequent chemotaxonomic and traditional phenotypic analyses, revealed the presence of 17 novel Pandoraea species among the strains analyzed, and allowed the identification of several unclassified Pandoraea strains reported in the literature. The genus Pandoraea has an open pan genome that includes many orthogroups in the 'Xenobiotics biodegradation and metabolism' KEGG pathway, which likely explains the enrichment of these species in polluted soils and participation in the biodegradation of complex organic substances. We propose to formally classify the 17 novel Pandoraea species as P. anapnoica sp. nov. (type strain LMG 31117T = CCUG 73385T), P. anhela sp. nov. (type strain LMG 31108T = CCUG 73386T), P. aquatica sp. nov. (type strain LMG 31011T = CCUG 73384T), P. bronchicola sp. nov. (type strain LMG 20603T = ATCC BAA-110T), P. capi sp. nov. (type strain LMG 20602T = ATCC BAA-109T), P. captiosa sp. nov. (type strain LMG 31118T = CCUG 73387T), P. cepalis sp. nov. (type strain LMG 31106T = CCUG 39680T), P. commovens sp. nov. (type strain LMG 31010T = CCUG 73378T), P. communis sp. nov. (type strain LMG 31110T = CCUG 73383T), P. eparura sp. nov. (type strain LMG 31012T = CCUG 73380T), P. horticolens sp. nov. (type strain LMG 31112T = CCUG 73379T), P. iniqua sp. nov. (type strain LMG 31009T = CCUG 73377T), P. morbifera sp. nov. (type strain LMG 31116T = CCUG 73389T), P. nosoerga sp. nov. (type strain LMG 31109T = CCUG 73390T), P. pneumonica sp. nov. (type strain LMG 31114T = CCUG 73388T), P. soli sp. nov. (type strain LMG 31014T = CCUG 73382T), and P. terrigena sp. nov. (type strain LMG 31013T = CCUG 73381T).

RevDate: 2020-01-08

Lupolova N, Lycett SJ, DL Gally (2019)

A guide to machine learning for bacterial host attribution using genome sequence data.

Microbial genomics, 5(12):.

With the ever-expanding number of available sequences from bacterial genomes, and the expectation that this data type will be the primary one generated from both diagnostic and research laboratories for the foreseeable future, then there is both an opportunity and a need to evaluate how effectively computational approaches can be used within bacterial genomics to predict and understand complex phenotypes, such as pathogenic potential and host source. This article applied various quantitative methods such as diversity indexes, pangenome-wide association studies (GWAS) and dimensionality reduction techniques to better understand the data and then compared how well unsupervised and supervised machine learning (ML) methods could predict the source host of the isolates. The study uses the example of the pangenomes of 1203 Salmonella enterica serovar Typhimurium isolates in order to predict 'host of isolation' using these different methods. The article is aimed as a review of recent applications of ML in infection biology, but also, by working through this specific dataset, it allows discussion of the advantages and drawbacks of the different techniques. As with all such sub-population studies, the biological relevance will be dependent on the quality and diversity of the input data. Given this major caveat, we show that supervised ML has the potential to add real value to interpretation of bacterial genomic data, as it can provide probabilistic outcomes for important phenotypes, something that is very difficult to achieve with the other methods.

RevDate: 2020-02-09

Eggertsson HP, Kristmundsdottir S, Beyter D, et al (2019)

GraphTyper2 enables population-scale genotyping of structural variation using pangenome graphs.

Nature communications, 10(1):5402.

Analysis of sequence diversity in the human genome is fundamental for genetic studies. Structural variants (SVs) are frequently omitted in sequence analysis studies, although each has a relatively large impact on the genome. Here, we present GraphTyper2, which uses pangenome graphs to genotype SVs and small variants using short-reads. Comparison to the syndip benchmark dataset shows that our SV genotyping is sensitive and variant segregation in families demonstrates the accuracy of our approach. We demonstrate that incorporating public assembly data into our pipeline greatly improves sensitivity, particularly for large insertions. We validate 6,812 SVs on average per genome using long-read data of 41 Icelanders. We show that GraphTyper2 can simultaneously genotype tens of thousands of whole-genomes by characterizing 60 million small variants and half a million SVs in 49,962 Icelanders, including 80 thousand SVs with high-confidence.

RevDate: 2020-02-05

Chernysheva N, Bystritskaya E, Stenkova A, et al (2019)

Comparative Genomics and CAZyme Genome Repertoires of Marine Zobellia amurskyensis KMM 3526T and Zobellia laminariae KMM 3676T.

Marine drugs, 17(12):.

We obtained two novel draft genomes of type Zobellia strains with estimated genome sizes of 5.14 Mb for Z. amurskyensis KMM 3526T and 5.16 Mb for Z. laminariae KMM 3676T. Comparative genomic analysis has been carried out between obtained and known genomes of Zobellia representatives. The pan-genome of Zobellia genus is composed of 4853 orthologous clusters and the core genome was estimated at 2963 clusters. The genus CAZome was represented by 775 GHs classified into 62 families, 297 GTs of 16 families, 100 PLs of 13 families, 112 CEs of 13 families, 186 CBMs of 18 families and 42 AAs of six families. A closer inspection of the carbohydrate-active enzyme (CAZyme) genomic repertoires revealed members of new putative subfamilies of GH16 and GH117, which can be biotechnologically promising for production of oligosaccharides and rare monomers with different bioactivities. We analyzed AA3s, among them putative FAD-dependent glycoside oxidoreductases (FAD-GOs) being of particular interest as promising biocatalysts for glycoside deglycosylation in food and pharmaceutical industries.
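Pan-genome and core-genome sizes such as those quoted above are, in essence, the union and the intersection of the orthologous clusters found in each genome. The sketch below illustrates that set arithmetic on toy data. The function name, genome labels and cluster identifiers are hypothetical, and the clustering step itself (assigning genes to orthologous clusters) is assumed to have been done beforehand.

```python
def summarize_pangenome(clusters_by_genome):
    """Return (pan, core, accessory) cluster sets from a genome -> clusters mapping."""
    genomes = [set(clusters) for clusters in clusters_by_genome.values()]
    if not genomes:
        raise ValueError("at least one genome is required")
    pan = set().union(*genomes)          # clusters seen in any genome
    core = set.intersection(*genomes)    # clusters seen in every genome
    accessory = pan - core               # clusters seen in some but not all genomes
    return pan, core, accessory

# Toy input: three hypothetical genomes with hypothetical cluster IDs.
toy = {
    "genome_1": {"A", "B", "C"},
    "genome_2": {"A", "B", "D"},
    "genome_3": {"A", "B", "C", "E"},
}
pan, core, accessory = summarize_pangenome(toy)
print(len(pan), len(core), len(accessory))
```

Published tools usually relax the strict intersection (for example, treating clusters present in nearly all genomes as core) to tolerate incomplete draft assemblies. That threshold is a design choice this sketch leaves out.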

RevDate: 2019-11-28

Cabrera-Contreras R, Santamaría RI, Bustos P, et al (2019)

Genomic diversity of prevalent Staphylococcus epidermidis multidrug-resistant strains isolated from a Children's Hospital in México City in an eight-years survey.

PeerJ, 7:e8068.

Staphylococcus epidermidis is a human commensal and pathogen distributed worldwide. In this work, we surveyed for multidrug-resistant S. epidermidis strains over eight years at a children's health-care unit in México City. Multidrug-resistant S. epidermidis were present in all years of the study, including resistance to methicillin, beta-lactams, fluoroquinolones, and macrolides. To understand the genetic basis of antibiotic resistance and its association with virulence and gene exchange, we sequenced the genomes of 17 S. epidermidis isolates. Whole-genome nucleotide identities between all the pairs of S. epidermidis strains were about 97% to 99%. We inferred a clonal structure and eight Multilocus Sequence Types (MLSTs) in the S. epidermidis sequenced collection. The profile of virulence includes genes involved in biofilm formation and phenol-soluble modulins (PSMs). Half of the S. epidermidis analyzed lacked the ica operon for biofilm formation. They are likely commensal S. epidermidis strains that are nonetheless multi-antibiotic resistant. Uneven distribution of insertion sequences, phages, and CRISPR-Cas phage immunity systems suggests frequent horizontal gene transfer. Recombination between S. epidermidis strains occurred at a higher rate than mutation and affected the whole genome. Therefore, multidrug resistance, independently of pathogenic traits, might explain the persistence of specific highly adapted S. epidermidis clonal lineages in nosocomial settings.

RevDate: 2019-11-28

Sujitha S, Vishnu US, Karthikeyan R, et al (2019)

Genome Investigation of a Cariogenic Pathogen with Implications in Cardiovascular Diseases.

Indian journal of microbiology, 59(4):451-459.

The proportion of people suffering from cardiovascular diseases (CVDs) has risen by 34% in the last 15 years in India. Cardiomyopathy is one of the many forms of CVD. Infection of the heart muscle is its suspected cause, and oral pathogens gaining entry into the bloodstream are responsible for such infections. Streptococcus mutans is an oral pathogen with implications in cardiovascular diseases. Previous studies have shown that certain strains of S. mutans are found predominantly within atherosclerotic plaques and extirpated valves. To decipher the genetic differences responsible for endothelial cell invasion, we have sequenced the genome of Streptococcus mutans B14. Pan-genome analysis, a search for adhesion proteins through a special algorithm, and a protein-protein interaction search through HPIDB have been done. Pan-genome analysis of 187 whole-genome assemblies revealed 6965 genes in total and 918 genes forming the core gene cluster. Adhesion to the endothelial cell is a critical virulence factor distinguishing virulent and non-virulent strains. Overall, 4% of the total proteins in S. mutans B14 were categorized as adhesion proteins. Protein-protein interaction between putative adhesion proteins and human extracellular matrix (ECM) components was predicted, revealing novel interactions. A conserved gene catalyzing the synthesis of branched-chain amino acids in S. mutans B14 shows possible interaction with isoforms of cathepsin protein of the ECM. This genome sequence analysis points towards other proteins in the S. mutans genome that might have a specific role to play in host cell interaction.

RevDate: 2020-01-08

Decano AG, T Downing (2019)

An Escherichia coli ST131 pangenome atlas reveals population structure and evolution across 4,071 isolates.

Scientific reports, 9(1):17394.

Escherichia coli ST131 is a major cause of infection with extensive antimicrobial resistance (AMR) facilitated by widespread beta-lactam antibiotic use. This drug pressure has driven extended-spectrum beta-lactamase (ESBL) gene acquisition and evolution in pathogens, so a clearer resolution of ST131's origin, adaptation and spread is essential. E. coli ST131's ESBL genes are typically embedded in mobile genetic elements (MGEs) that aid transfer to new plasmid or chromosomal locations, which are mobilised further by plasmid conjugation and recombination, resulting in a flexible ESBL, MGE and plasmid composition with a conserved core genome. We used population genomics to trace the evolution of AMR in ST131 more precisely by extracting all available high-quality Illumina HiSeq read libraries to investigate 4,071 globally-sourced genomes, the largest ST131 collection examined so far. We applied rigorous quality-control, genome de novo assembly and ESBL gene screening to resolve ST131's population structure across three genetically distinct Clades (A, B, C) and abundant subclades from the dominant Clade C. We reconstructed their evolutionary relationships across the core and accessory genomes using published reference genomes, long read assemblies and k-mer-based methods to contextualise pangenome diversity. The three main C subclades have co-circulated globally at relatively stable frequencies over time, suggesting that they attained an equilibrium after their origin and initial rapid spread. This contrasted with their ESBL genes, which had stronger patterns across time, geography and subclade, and were located at distinct locations across the chromosomes and plasmids between isolates. Within the three C subclades, the core and accessory genome diversity levels were not correlated due to plasmid and MGE activity, unlike patterns between the three main clades, A, B and C. This population genomic study highlights the dynamic nature of the accessory genomes in ST131, suggesting that surveillance should anticipate genetically variable outbreaks with broader antibiotic resistance levels. Our findings emphasise the potential of evolutionary pangenomics to improve our understanding of AMR gene transfer, adaptation and transmission to discover accessory genome changes linked to novel subtypes.

RevDate: 2020-01-23

de Fátima Rauber Würfel S, Jorge S, de Oliveira NR, et al (2020)

Campylobacter jejuni isolated from poultry meat in Brazil: in silico analysis and genomic features of two strains with different phenotypes of antimicrobial susceptibility.

Molecular biology reports, 47(1):671-681.

Campylobacter jejuni is the most common bacterial cause of foodborne diarrheal disease worldwide and is among the antimicrobial resistant "priority pathogens" that pose the greatest threat to public health. The genomes of two C. jejuni isolates from poultry meat sold on the retail market in Southern Brazil, phenotypically characterized as multidrug-resistant (CJ100) and susceptible (CJ104), were sequenced and analyzed by bioinformatic tools. The isolates CJ100 and CJ104 showed distinct multilocus sequence types (MLST). Comparative genomic analysis revealed a large number of single nucleotide polymorphisms, rearrangements, and inversions in both genomes, in addition to virulence factors, genomic islands, prophage sequences, and insertion sequences. A circular 103-kilobase megaplasmid carrying virulence factors was identified in the genome of CJ100, in addition to resistance mechanisms to aminoglycosides, beta-lactams, macrolides, quinolones, and tetracyclines. The molecular characterization of distinct phenotypes of foodborne C. jejuni and the discovery of a novel virulence megaplasmid provide useful data for pan-genome and large-scale studies, and monitoring of virulent C. jejuni in poultry meat is warranted.

RevDate: 2020-01-08

Chapeton-Montes D, Plourde L, Bouchier C, et al (2019)

Author Correction: The population structure of Clostridium tetani deduced from its pan-genome.

Scientific reports, 9(1):17409 pii:10.1038/s41598-019-53688-z.

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

RevDate: 2020-02-06

Lawson MAE, O'Neill IJ, Kujawska M, et al (2020)

Breast milk-derived human milk oligosaccharides promote Bifidobacterium interactions within a single ecosystem.

The ISME journal, 14(2):635-648.

Diet-microbe interactions play an important role in modulating the early-life microbiota, with Bifidobacterium strains and species dominating the gut of breast-fed infants. Here, we sought to explore how infant diet drives distinct bifidobacterial community composition and dynamics within individual infant ecosystems. Genomic characterisation of 19 strains isolated from breast-fed infants revealed a diverse genomic architecture enriched in carbohydrate metabolism genes, which was distinct to each strain, but collectively formed a pangenome across infants. Presence of gene clusters implicated in digestion of human milk oligosaccharides (HMOs) varied between species, with growth studies indicating that within single infants there were differences in the ability to utilise 2'FL and LNnT HMOs between strains. Cross-feeding experiments were performed with HMO degraders and non-HMO users (using spent or 'conditioned' media and direct co-culture). Further 1H-NMR analysis identified fucose, galactose, acetate, and N-acetylglucosamine as key by-products of HMO metabolism, as demonstrated by modest growth of non-HMO users on spent media from HMO metabolism. These experiments indicate how HMO metabolism permits the sharing of resources to maximise nutrient consumption from the diet and highlight the cooperative nature of bifidobacterial strains and their role as 'foundation' species in the infant ecosystem. The intra- and inter-infant bifidobacterial community behaviour may contribute to the diversity and dominance of Bifidobacterium in early life and suggests avenues for future development of new diet and microbiota-based therapies to promote infant health.

RevDate: 2019-12-09

Robertson J, Lin J, Wren-Hedgus A, et al (2019)

Development of a multi-locus typing scheme for an Enterobacteriaceae linear plasmid that mediates inter-species transfer of flagella.

PloS one, 14(11):e0218638.

Due to the public health importance of flagellar genes for typing, it is important to understand mechanisms that could alter their expression or presence. Phenotypic novelty in flagellar genes arises predominantly through accumulation of mutations, but horizontal transfer is known to occur. A linear plasmid termed pBSSB1, previously identified in Salmonella Typhi, was found to encode a flagellar operon that can mediate phase variation, which results in the rare z66 flagella phenotype. The identification and tracking of homologs of pBSSB1 is limited because it falls outside the normal replicon typing schemes for plasmids. Here we report the generation of nine new pBSSB1-family sequences using Illumina and Nanopore sequence data. Homologs of pBSSB1 were identified in 154 genomes representing 25 distinct serotypes from 67,758 Salmonella public genomes. Pangenome analysis of pBSSB1-family contigs was performed using roary and we identified three core genes amenable to a minimal pMLST scheme. Population structure analysis based on the newly developed pMLST scheme identified three major lineages representing 35 sequence types, and the distribution of these sequence types was found to span multiple serovars across the globe. This in silico pMLST scheme has shown utility in tracking and subtyping pBSSB1-family plasmids and it has been incorporated into the plasmid MLST database under the name "pBSSB1-family".

RevDate: 2019-11-21

Suresh G, Lodha TD, Indu B, et al (2019)

Taxogenomics Resolves Conflict in the Genus Rhodobacter: A Two and Half Decades Pending Thought to Reclassify the Genus Rhodobacter.

Frontiers in microbiology, 10:2480.

The genus Rhodobacter is taxonomically well studied, and some members are model organisms. However, this genus is comprised of a heterogeneous group of members. 16S rRNA gene-based phylogeny of the genus Rhodobacter indicates a motley assemblage of anoxygenic phototrophic bacteria (genus Rhodobacter) with interspersing members of other genera (chemotrophs) making the genus polyphyletic. Taxogenomics was performed to resolve the taxonomic conflicts of the genus Rhodobacter using twelve type strains. The phylogenomic analysis showed that Rhodobacter spp. can be grouped into four monophyletic clusters with interspersing chemotrophs. Genomic indices (ANI and dDDH) confirmed that all the current species are well defined, except Rhodobacter megalophilus. The average amino acid identity values between the monophyletic clusters of Rhodobacter members, as well as with the chemotrophic genera, are less than 80% whereas the percentage of conserved proteins values were below 70%, which has been observed among several genera related to Rhodobacter. The pan-genome analysis has shown that there are only 1239 core genes shared between the 12 species of the genus Rhodobacter. The polyphasic taxonomic analysis supports the phylogenomic and genomic studies in distinguishing the four Rhodobacter clusters. Each cluster is comprised of one to seven species according to the current Rhodobacter taxonomy. Therefore, to address this taxonomic discrepancy we propose to reclassify the members of the genus Rhodobacter into three new genera, Luteovulum gen. nov., Phaeovulum gen. nov. and Fuscovulum gen. nov., and provide an emended description of the genus Rhodobacter sensu stricto. Also, we propose reclassification of Rhodobacter megalophilus as a sub-species of Rhodobacter sphaeroides.

RevDate: 2020-01-08

Ghosh S, Sarangi AN, Mukherjee M, et al (2019)

Reanalysis of Lactobacillus paracasei Lbs2 Strain and Large-Scale Comparative Genomics Places Many Strains into Their Correct Taxonomic Position.

Microorganisms, 7(11):.

Lactobacillus paracasei are diverse Gram-positive bacteria that are very closely related to Lactobacillus casei, belonging to the Lactobacillus casei group. Due to extreme genome similarities between L. casei and L. paracasei, many strains have been cross placed in the other group. We had earlier sequenced and analyzed the genome of Lactobacillus paracasei Lbs2, but mistakenly identified it as L. casei. We re-analyzed Lbs2 reads into a 2.5 MB genome that is 91.28% complete with 0.8% contamination, which is now suitably placed under L. paracasei based on Average Nucleotide Identity and Average Amino Acid Identity. We took 74 sequenced genomes of L. paracasei from GenBank with assembly sizes ranging from 2.3 to 3.3 MB and genome completeness between 88% and 100% for comparison. The pan-genome of 75 L. paracasei strains holds 15,945 gene families (215,232 genes), while the core genome contained about 8.4% of the total genes (243 gene families with 18,225 genes) of the pan-genome. Phylogenomic analysis based on core gene families revealed that the Lbs2 strain has a closer relationship with L. paracasei subsp. tolerans DSM20258. Finally, the in-silico analysis of the L. paracasei Lbs2 genome revealed an important pathway that could underpin the production of thiamin, which may contribute to the host energy metabolism.
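The core-genome figures in this abstract can be cross-checked with simple arithmetic. The sketch below assumes the pan-genome gene total is 215,232 and that each core gene family contributes exactly one gene per strain. Both are assumptions made for illustration, and the function name is hypothetical.

```python
def core_gene_fraction(strains, core_families, pan_genes):
    """Return (core_genes, fraction of all pan-genome genes that are core).

    Assumes one gene per core family per strain (single-copy core families).
    """
    core_genes = core_families * strains
    return core_genes, core_genes / pan_genes

# Values taken from the abstract; 215,232 is the assumed pan-genome gene total.
core_genes, fraction = core_gene_fraction(strains=75, core_families=243, pan_genes=215_232)
print(core_genes, f"{fraction:.2%}")
```

Under these assumptions, 243 families across 75 strains reproduce the 18,225 core genes reported, and the resulting share of the total genes is consistent with the "about 8.4%" stated in the abstract.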

RevDate: 2019-11-15

Seribelli AA, Gonzales JC, de Almeida F, et al (2019)

Phylogenetic analysis revealed that Salmonella Typhimurium ST313 isolated from humans and food in Brazil presented a high genomic similarity.

Brazilian journal of microbiology : [publication of the Brazilian Society for Microbiology] pii:10.1007/s42770-019-00155-6 [Epub ahead of print].

Salmonella Typhimurium sequence type 313 (S. Typhimurium ST313) has caused invasive disease mainly in sub-Saharan Africa. In Brazil, ST313 strains have been recently described, and there is a lack of studies that have assessed the relationship of these strains by whole genome sequencing (WGS). The aims of this work were to study the phylogenetic relationship of 70 S. Typhimurium genomes, comparing strains of ST313 (n = 9) isolated from humans and food in Brazil among themselves, with other STs isolated in this country (n = 31) and in other parts of the globe (n = 30), by 16S rRNA sequences, the Gegenees software, whole genome multilocus sequence typing (wgMLST), and average nucleotide identity (ANI) for the genomes of ST313. Additionally, pangenome analysis was performed to verify the heterogeneity of these genomes. The phylogenetic analyses showed that the ST313 genomes were very similar among themselves. However, the ST313 genomes usually clustered more distantly from other STs of strains isolated in Brazil and in other parts of the world. By pangenome calculation, there were 2,880 core-genome CDSs and 4,171 singleton CDSs for all the 70 S. Typhimurium genomes studied. Considering the 10 ST313 genomes analyzed, there were 4,112 core-genome CDSs and 76 singleton CDSs. In conclusion, the ST313 genomes from Brazil showed a high similarity among them, information that might eventually help in the development of vaccines and antibiotics. The pangenome analysis showed that the S. Typhimurium genomes studied presented an open pangenome, but one tending to become closed for the ST313 strains.
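Whether a pangenome looks "open" or "closed" is usually judged from how many new gene families each additional genome contributes, alongside the number of singletons. The sketch below illustrates both quantities on toy data. The function names and gene labels are hypothetical. A full analysis would typically average the curve over many random genome orderings and fit a growth model, which this sketch does not attempt, and it is not the method of the study above.

```python
from collections import Counter

def accumulation_curve(gene_sets):
    """Cumulative number of distinct gene families as genomes are added in order."""
    seen, curve = set(), []
    for genes in gene_sets:
        seen |= set(genes)
        curve.append(len(seen))
    return curve

def count_singletons(gene_sets):
    """Number of gene families present in exactly one genome."""
    counts = Counter(g for genes in gene_sets for g in set(genes))
    return sum(1 for n in counts.values() if n == 1)

# Toy input: three hypothetical genomes.
toy_genomes = [{"A", "B", "C"}, {"A", "B", "D"}, {"A", "B", "C", "E"}]
curve = accumulation_curve(toy_genomes)
singletons = count_singletons(toy_genomes)
print(curve, singletons)
```

A curve that keeps rising as genomes are added suggests an open pangenome, while one that flattens suggests a pangenome approaching closure.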

RevDate: 2020-02-05

Chhotaray C, Wang S, Tan Y, et al (2020)

Comparative Analysis of Whole-Genome and Methylome Profiles of a Smooth and a Rough Mycobacterium abscessus Clinical Strain.

G3 (Bethesda, Md.), 10(1):13-22.

Mycobacterium abscessus is a fast growing Mycobacterium species mainly causing skin and respiratory infections in humans. M. abscessus is resistant to numerous drugs, which is a major challenge for treatment. In this study, we have sequenced the genomes of two clinical M. abscessus strains having rough and smooth morphology, using the single molecule real-time and Illumina HiSeq sequencing technology. In addition, we report the first comparative methylome profiles of rough and smooth M. abscessus clinical strains. The numbers of N4-methylcytosine (4mC) and N6-methyladenine (6mA) modified bases obtained from the smooth phenotype were two-fold and 1.6-fold higher, respectively, than those of the rough phenotype. We have also identified 4 distinct novel motifs in the two clinical strains, and genes encoding antibiotic-modifying/targeting enzymes and genes associated with intracellular survivability having different methylation patterns. To our knowledge, this is the first report about genome-wide methylation profiles of M. abscessus strains and identification of a natural linear plasmid (15 kb) in this critical pathogen harboring methylated bases. The pan-genome analysis of 25 M. abscessus strains including the two clinical strains revealed an open pan genome comprising 7596 gene clusters. Likewise, structural variation analysis revealed that the genome of the rough phenotype strain contains more insertions and deletions than the smooth phenotype and that of the reference strain. A total of 391 single nucleotide variations responsible for the non-synonymous mutations were detected in clinical strains compared to the reference genome. The comparative genomic analysis elucidates the genome plasticity in this emerging pathogen. Furthermore, the detection of genome-wide methylation profiles of M. abscessus clinical strains may provide insight into the significant role of DNA methylation in pathogenicity and drug resistance in this opportunistic pathogen.

RevDate: 2019-11-09

Kim KH, Chun BH, Baek JH, et al (2020)

Genomic and metabolic features of Lactobacillus sakei as revealed by its pan-genome and the metatranscriptome of kimchi fermentation.

Food microbiology, 86:103341.

The genomic and metabolic features of Lactobacillus sakei were investigated using its pan-genome and by analyzing the metatranscriptome of kimchi fermentation. In the genome-based relatedness analysis, the strains were divided into the Lb. sakei ssp. sakei and Lb. sakei ssp. carnosus lineage groups. Genomic and metabolic pathway analysis revealed that all Lb. sakei strains have the capability of producing d/l-lactate, ethanol, acetate, CO2, formate, l-malate, diacetyl, acetoin, and 2,3-butanediol from d-glucose, d-fructose, d-galactose, sucrose, d-lactose, l-arabinose, cellobiose, d-mannose, d-gluconate, and d-ribose through homolactic and heterolactic fermentation, whereas their capability of d-maltose, d-xylose, l-xylulose, d-galacturonate, and d-glucuronate metabolism is strain-specific. All strains carry genes for the biosynthesis of folate and thiamine, whereas genes for biogenic amine and toxin production, hemolysis, and antibiotic resistance were not identified. The metatranscriptomic analysis showed that the expression of Lb. sakei transcripts involved in carbohydrate metabolism increased as kimchi fermentation progressed, suggesting that Lb. sakei is more competitive during late fermentation stage. Homolactic fermentation pathway was highly expressed and generally constant during kimchi fermentation, whereas expression of heterolactic fermentation pathway increased gradually as fermentation progressed. l-Lactate dehydrogenase was more highly expressed than d-lactate dehydrogenase, suggesting that l-lactate is the major lactate metabolized by Lb. sakei.

RevDate: 2020-01-18

Bernheim A, R Sorek (2020)

The pan-immune system of bacteria: antiviral defence as a community resource.

Nature reviews. Microbiology, 18(2):113-119.

Viruses and their hosts are engaged in a constant arms race leading to the evolution of antiviral defence mechanisms. Recent studies have revealed that the immune arsenal of bacteria against bacteriophages is much more diverse than previously envisioned. These discoveries have led to seemingly contradictory observations: on one hand, individual microorganisms often encode multiple distinct defence systems, some of which are acquired by horizontal gene transfer, alluding to their fitness benefit. On the other hand, defence systems are frequently lost from prokaryotic genomes on short evolutionary time scales, suggesting that they impose a fitness cost. In this Perspective article, we present the 'pan-immune system' model in which we suggest that, although a single strain cannot carry all possible defence systems owing to their burden on fitness, it can employ horizontal gene transfer to access immune defence mechanisms encoded by closely related strains. Thus, the 'effective' immune system is not the one encoded by the genome of a single microorganism but rather by its pan-genome, comprising the sum of all immune systems available for a microorganism to horizontally acquire and use.

RevDate: 2020-02-05

Vila Nova M, Durimel K, La K, et al (2019)

Genetic and metabolic signatures of Salmonella enterica subsp. enterica associated with animal sources at the pangenomic scale.

BMC genomics, 20(1):814.

BACKGROUND: Salmonella enterica subsp. enterica is a public health issue related to food safety, and its adaptation to animal sources remains poorly described at the pangenome scale. Firstly, serovars presenting potential mono- and multi-animal sources were selected from a curated and synthesized subset of Enterobase. The corresponding sequencing reads were downloaded from the European Nucleotide Archive (ENA), providing a dataset of 440 Salmonella genomes balanced in terms of serovars and sources (i). Secondly, the coregenome variants and accessory genes were detected (ii). Thirdly, single nucleotide polymorphisms and small insertions/deletions from the coregenome, as well as the accessory genes, were associated with animal sources based on a microbial Genome Wide Association Study (GWAS) integrating an advanced correction of the population structure (iii). Lastly, a Gene Ontology Enrichment Analysis (GOEA) was applied to emphasize metabolic pathways mainly impacted by the pangenomic mutations associated with animal sources (iv).

RESULTS: Based on a genome dataset including Salmonella serovars from mono- and multi-animal sources (i), 19,130 accessory genes and 178,351 coregenome variants were identified (ii). Among these pangenomic mutations, 52 genomic signatures (iii) and 9 over-enriched metabolic signatures (iv) were associated with avian, bovine, swine and fish sources by GWAS and GOEA, respectively.

CONCLUSIONS: Our results suggest that the genetic and metabolic determinants of Salmonella adaptation to animal sources may have been driven by the natural feeding environment of the animal, distinct livestock diets modified by humans, environmental stimuli, physiological properties of the animal itself, and work habits for health protection of livestock.

RevDate: 2020-01-08

Aguirre de Cárcer D (2019)

A conceptual framework for the phylogenetically constrained assembly of microbial communities.

Microbiome, 7(1):142.

Microbial communities play essential and preponderant roles in all ecosystems. Understanding the rules that govern microbial community assembly will have a major impact on our ability to manage microbial ecosystems, positively impacting, for instance, human health and agriculture. Here, I present a phylogenetically constrained community assembly principle grounded on the well-supported facts that deterministic processes have a significant impact on microbial community assembly, that microbial communities show significant phylogenetic signal, and that microbial traits and ecological coherence are, to some extent, phylogenetically conserved. From these facts, I derive a few predictions which form the basis of the framework. Chief among them is the existence, within most microbial ecosystems, of phylogenetic core groups (PCGs), defined as discrete portions of the phylogeny of varying depth present in all instances of the given ecosystem, and related to specific niches whose occupancy requires a specific phylogenetically conserved set of traits. The predictions are supported by the recent literature, as well as by dedicated analyses. Integrating the effect of ecosystem patchiness, microbial social interactions, and scale sampling pitfalls takes us to a comprehensive community assembly model that recapitulates the characteristics most commonly observed in microbial communities. PCGs' identification is relatively straightforward using high-throughput 16S amplicon sequencing, and subsequent bioinformatic analysis of their phylogeny, estimated core pan-genome, and intra-group co-occurrence should provide valuable information on their ecophysiology and niche characteristics. Such a priori information for a significant portion of the community could be used to prime complementing analyses, boosting their usefulness. Thus, the use of the proposed framework could represent a leap forward in our understanding of microbial community assembly and function.

RevDate: 2020-02-03
CmpDate: 2020-02-03

Alonge M, Soyk S, Ramakrishnan S, et al (2019)

RaGOO: fast and accurate reference-guided scaffolding of draft genomes.

Genome biology, 20(1):224.

We present RaGOO, a reference-guided contig ordering and orienting tool that leverages the speed and sensitivity of Minimap2 to accurately achieve chromosome-scale assemblies in minutes. After the pseudomolecules are constructed, RaGOO identifies structural variants, including those spanning sequencing gaps. We show that RaGOO accurately orders and orients 3 de novo tomato genome assemblies, including the widely used M82 reference cultivar. We then demonstrate the scalability and utility of RaGOO with a pan-genome analysis of 103 Arabidopsis thaliana accessions by examining the structural variants detected in the newly assembled pseudomolecules. RaGOO is available open source at .

RevDate: 2020-01-21
CmpDate: 2020-01-21

Oh YJ, Kim JY, Park HK, et al (2019)

Salicibibacter halophilus sp. nov., a moderately halophilic bacterium isolated from kimchi.

Journal of microbiology (Seoul, Korea), 57(11):997-1002.

A Gram-stain-positive, rod-shaped, alkalitolerant, and halophilic bacterium, designated as strain NKC3-5T, was isolated from kimchi that was collected from the Geumsan area in the Republic of Korea. Cells of the isolated strain NKC3-5T were 0.5-0.7 μm wide and 1.4-2.8 μm long. Strain NKC3-5T could grow at up to 20.0% (w/v) NaCl (optimum 10%), pH 6.5-10.0 (optimum pH 9.0), and 25-40°C (optimum 35°C). The cells were able to reduce nitrate under aerobic conditions, which is the first such report in the genus Salicibibacter. The genome size and genomic G + C content of strain NKC3-5T were 3,754,174 bp and 45.9 mol%, respectively; it contained 3,630 coding sequences, 16 rRNA genes (six 16S, five 5S, and five 23S), and 59 tRNA genes. Phylogenetic analysis based on 16S rRNA showed that strain NKC3-5T clustered with bacterium Salicibibacter kimchii NKC1-1T, with a similarity of 96.2-97.6%, but formed a distinct branch with other published species of the family Bacillaceae. In addition, the OrthoANI value between strain NKC3-5T and Salicibibacter kimchii NKC1-1T was far lower than the species demarcation threshold. Functional genome annotation showed that genes related to carbohydrate, amino acid, and vitamin metabolism were highly distributed in the genome of strain NKC3-5T. Comparative genomic analysis revealed that strain NKC3-5T had 716 pan-genome orthologous groups (POGs), dominated by carbohydrate metabolism. Phylogenomic analysis based on the concatenated core POGs revealed that strain NKC3-5T was closely related to Salicibibacter kimchii. The predominant polar lipids were phosphatidylglycerol and two unidentified lipids. Anteiso-C15:0, iso-C17:0, anteiso-C17:0, and iso-C15:0 were the major cellular fatty acids, and menaquinone-7 was the major isoprenoid quinone present in strain NKC3-5T. Cell wall peptidoglycan analysis of strain NKC3-5T showed that meso-diaminopimelic acid was the diagnostic diamino acid. The phenotypic, genomic, phylogenetic, and chemotaxonomic properties reveal that the strain represents a novel species of the genus Salicibibacter, for which the name Salicibibacter halophilus sp. nov. is proposed, with the type strain NKC3-5T (= KACC 21230T = JCM 33437T).


ESP Quick Facts

ESP Origins

In the early 1990s, Robert Robbins was a faculty member at Johns Hopkins, where he directed the informatics core of GDB, the human gene-mapping database of the international human genome project. To share papers with colleagues around the world, he set up a small paper-sharing section on his personal web page. This small project evolved into The Electronic Scholarly Publishing Project.

ESP Support

In 1995, Robbins became the VP/IT of the Fred Hutchinson Cancer Research Center in Seattle, WA. Soon after arriving in Seattle, Robbins secured funding, through the ELSI component of the US Human Genome Project, to create the original ESP.ORG web site, with the formal goal of providing free, world-wide access to the literature of classical genetics.

ESP Rationale

Although the methods of molecular biology can seem almost magical to the uninitiated, the original techniques of classical genetics are readily appreciated by one and all: cross individuals that differ in some inherited trait, collect all of the progeny, score their attributes, and propose mechanisms to explain the patterns of inheritance observed.

ESP Goal

In reading the early works of classical genetics, one is drawn, almost inexorably, into ever more complex models, until molecular explanations begin to seem both necessary and natural. At that point, the tools for understanding genome research are at hand. Helping readers reach this point was the original goal of The Electronic Scholarly Publishing Project.

ESP Usage

Usage of the site grew rapidly and has remained high. Faculty began to use the site for their assigned readings. Other on-line publishers, ranging from The New York Times to Nature, referenced ESP materials in their own publications. Nobel laureates (e.g., Joshua Lederberg) regularly used the site and even wrote to suggest changes and improvements.

ESP Content

When the site began, no journals were making their early content available in digital format. As a result, ESP was obliged to digitize classic literature before it could be made available. For many important papers — such as Mendel's original paper or the first genetic map — ESP had to produce entirely new typeset versions of the works, if they were to be available in a high-quality format.

ESP Help

Early support from the DOE component of the Human Genome Project was critically important for getting the ESP project on a firm foundation. Since that funding ended (nearly 20 years ago), the project has been operated as a purely volunteer effort. Anyone wishing to assist in these efforts should send an email to Robbins.

ESP Plans

With the development of methods for adding typeset side notes to PDF files, the ESP project now plans to add annotated versions of some classical papers to its holdings. We also plan to add new reference and pedagogical material. We have already started providing regularly updated, comprehensive bibliographies to the ESP.ORG site.
