
Bibliography on: Pangenome

The Electronic Scholarly Publishing Project: Providing world-wide, free access to classic scientific papers and other scholarly materials, since 1993.


ESP: PubMed Auto Bibliography. Created: 27 May 2022 at 01:32


Although the enforced stability of genomic content is ubiquitous among multicellular eukaryotes (MCEs), the opposite is proving to be the case among prokaryotes, which exhibit remarkable and adaptive plasticity of genomic content. Early bacterial whole-genome sequencing efforts discovered that whenever a particular "species" was re-sequenced, new genes were found that had not been detected earlier: entirely new genes, not merely new alleles. This led to the concepts of the bacterial core-genome, the set of genes found in all members of a particular "species", and the flex-genome, the set of genes found in some, but not all members of the "species". Together these make up the species' pan-genome.
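The three definitions above reduce to set operations on per-strain gene content. The following is a minimal sketch only; the strain names, gene names, and the `partition_pangenome` helper are hypothetical and were invented for illustration, not taken from any study listed here.

```python
# Hypothetical gene content for three strains of one "species".
# Strain and gene names are invented for illustration only.
strains = {
    "strain_A": {"gyrB", "recA", "rpoB", "toxX"},
    "strain_B": {"gyrB", "recA", "rpoB", "phageY"},
    "strain_C": {"gyrB", "recA", "rpoB", "toxX", "plasmidZ"},
}

def partition_pangenome(gene_sets):
    """Return (pan, core, flex) gene sets for a collection of strains."""
    sets = list(gene_sets.values())
    pan = set().union(*sets)        # genes found in at least one strain
    core = set.intersection(*sets)  # genes found in every strain
    flex = pan - core               # genes found in some, but not all, strains
    return pan, core, flex

pan, core, flex = partition_pangenome(strains)
print(sorted(core))  # ['gyrB', 'recA', 'rpoB']
print(sorted(flex))  # ['phageY', 'plasmidZ', 'toxX']
```

Real analyses cluster genes into orthologous families before this step, so the sets would hold family identifiers rather than raw gene names.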

Created with PubMed® Query: pangenome or "pan-genome" or "pan genome" NOT pmcbook NOT ispreviousversion

Citations: The Papers (from PubMed®)


RevDate: 2022-05-26

Yao E, Blake VC, Cooper L, et al (2022)

GrainGenes: a data-rich repository for small grains genetics and genomics.

Database : the journal of biological databases and curation, 2022:.

As one of the US Department of Agriculture-Agricultural Research Service flagship databases, GrainGenes serves the data and community needs of globally distributed small grains researchers for the genetic improvement of the Triticeae family and Avena species that include wheat, barley, rye and oat. GrainGenes accomplishes its mission by continually enriching its cross-linked data content following the findable, accessible, interoperable and reusable principles, enhancing and maintaining an intuitive web interface, creating tools to enable easy data access and establishing data connections within and between GrainGenes and other biological databases to facilitate knowledge discovery. GrainGenes operates within the biological database community, collaborates with curators and genome sequencing groups and contributes to the AgBioData Consortium and the International Wheat Initiative through the Wheat Information System (WheatIS). Interactive and linked content is paramount for successful biological databases and GrainGenes now has 2917 manually curated gene records, including 289 genes and 254 alleles from the Wheat Gene Catalogue (WGC). There are >4.8 million gene models in 51 genome browser assemblies, 6273 quantitative trait loci and >1.4 million genetic loci on 4756 genetic and physical maps contained within 443 mapping sets, complete with standardized metadata. Most notably, 50 new genome browsers that include outputs from the Wheat and Barley PanGenome projects have been created. We provide an example of an expression quantitative trait loci track on the International Wheat Genome Sequencing Consortium Chinese Spring wheat browser to demonstrate how genome browser tracks can be adapted for different data types. To help users benefit more from its data, GrainGenes created four tutorials available on YouTube.
GrainGenes is executing its vision of service by continuously responding to the needs of the global small grains community by creating a centralized, long-term, interconnected data repository. Database URL:

RevDate: 2022-05-26

Neuzil-Bunesova V, Ramirez Garcia A, Modrackova N, et al (2022)

Feed Insects as a Reservoir of Granadaene-Producing Lactococci.

Frontiers in microbiology, 13:848490.

Insects are a component of the diet of different animal species and have been suggested as the major source of human dietary protein for the future. However, insects are also carriers of potentially pathogenic microbes that constitute a risk to food and feed safety. In this study, we report the occurrence of a hemolytic, orange-pigment-producing phenotype of Lactococcus garvieae/petauri/formosensis in the fecal microbiota of golden lion tamarins (Leontopithecus rosalia) and feed larvae (Zophobas atratus). Feed insects were identified as a regular source of L. garvieae/petauri/formosensis based on a reanalysis of available 16S rRNA gene libraries. Pan-genome analysis suggested the existence of four clusters within the L. garvieae/petauri/formosensis group. The presence of the cyl cluster indicated that some strains of the L. garvieae/petauri/formosensis group produced a pigment similar to granadaene, an orange cytotoxic lipid produced by group B streptococci, including Streptococcus agalactiae. Pigment production by L. garvieae/petauri/formosensis strains was dependent on the presence of fermentable sugars, with no pigment being observed at pH <4.7. The addition of buffering compounds or arginine, which can be metabolized to ammonium, restored pigment formation. In addition, pigment formation might be related to the source of peptone. These data suggest that edible insects are a possible source of granadaene-producing lactococci, which can be considered a pathogenic risk with zoonotic potential.

RevDate: 2022-05-25

Bach E, Rangel CP, Ribeiro IDA, et al (2022)

Pangenome analyses of Bacillus pumilus, Bacillus safensis, and Priestia megaterium exploring the plant-associated features of bacilli strains isolated from canola.

Molecular genetics and genomics : MGG [Epub ahead of print].

Previous genome mining of the strains Bacillus pumilus 7PB, Bacillus safensis 1TAz, 8Taz, and 32PB, and Priestia megaterium 16PB isolated from canola revealed differences in the profile of antimicrobial biosynthetic genes when compared to the species type strains. To evaluate not only the similarities among B. pumilus, B. safensis, and P. megaterium genomes but also the specificities found in the canola bacilli, we performed comparative genomic analyses through the pangenome evaluation of each species. In addition, other genome features were explored, especially focusing on plant-associated and biotechnological characteristics. The combination of the genome metrics Average Nucleotide Identity and digital DNA-DNA hybridization formulas 1 and 3 adopting the universal thresholds of 95 and 70%, respectively, was suitable to verify the identification of strains from these groups. On average, core genes corresponded to 45%, 52%, and 34% of B. pumilus, B. safensis, and P. megaterium open pangenomes, respectively. Many genes related to adaptations to plant-associated lifestyles were predicted, especially in the Bacillus genomes. These included genes for acetoin production, polyamines utilization, root exudate chemoreceptors, biofilm formation, and plant cell-wall degrading enzymes. Overall, we could observe that strains of these species exhibit many features in common, whereas most of their variable genome portions have features yet to be uncovered. The observed antifungal activity of canola bacilli might be a result of the synergistic action of secondary metabolites, siderophores, and chitinases. Genome analysis confirmed that these species and strains have biotechnological potential to be used both as agricultural inoculants and as hydrolase producers. To our knowledge, this is the first work that evaluates the pangenome features of P. megaterium.

RevDate: 2022-05-23

Saldarriaga-Córdoba M, R Avendaño-Herrera (2022)

Comparative pan-genomic analysis of 51 Renibacterium salmoninarum indicates heterogeneity in the principal virulence factor, the 57 kDa protein.

Journal of fish diseases [Epub ahead of print].

Renibacterium salmoninarum, a Gram-positive intracellular pathogen, is the causative agent of bacterial kidney disease (BKD), the impacts of which are high mortalities and economic losses for the salmon industry. This study provides novel analyses for the whole-genome sequences of 50 R. salmoninarum isolates and the reference strain ATCC 33209 using a pan-genomic approach to elucidate phylogenomic relationships and identify unique and shared genes associated with pathogenicity and infection mechanisms. Genome size varied from 3,061,638 to 3,155,332 bp; gene count from 3452 to 3580; and predicted coding sequences from 3402 to 3527. Comparative analyses revealed an open, but approaching closed, pan-genome. The pan-genome analysis recovered 4064 genes, with a core genome containing 3306 genes. Phylogenetic analysis of R. salmoninarum showed high genomic homogeneity, apart from one isolate obtained from Salmo trutta in Norway. All genomes presented the 57-kDa protein (p57). Strain ATCC 33209 and the Chilean isolates H-2 and DJ2R presented two copies of the msa gene, while the remaining isolates had one copy. The pan-genome analysis further identified differences in the number of copies and length of the signalling peptide for p57, the principal virulence factor reported for this bacterium. This heterogeneity could be associated with the secretion levels of p57, potentially influencing virulence. Additionally identified were numerous common genes related to iron uptake, the stress response and regulation, and cell signalling, all of which constitute the pathogenic repertoire of R. salmoninarum. This investigation provides information that is applicable in future studies for identifying therapeutic targets and/or for designing new strategies (e.g., vaccines) to prevent BKD infections in salmon farming.
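The pan-genome and core-genome counts reported in this abstract allow a quick worked calculation of how much of the gene pool is shared. The sketch below uses only those two reported numbers; treating every non-core gene as "accessory" is a simplifying assumption made here, not a breakdown given in the abstract.

```python
# Counts taken from the abstract above.
pan_genes = 4064   # genes recovered in the pan-genome
core_genes = 3306  # genes in the core genome

# Assumption: everything outside the core is counted as accessory.
accessory_genes = pan_genes - core_genes
core_fraction = core_genes / pan_genes

print(f"{accessory_genes} accessory genes; core = {core_fraction:.1%} of the pan-genome")
# 758 accessory genes; core = 81.3% of the pan-genome
```

A core share this large fits the abstract's description of high genomic homogeneity and of a pan-genome that is open but approaching closed.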

RevDate: 2022-05-23

de Korne-Elenbaas J, Bruisten SM, van Dam AP, et al (2022)

The Neisseria gonorrhoeae Accessory Genome and Its Association with the Core Genome and Antimicrobial Resistance.

Microbiology spectrum [Epub ahead of print].

The bacterial accessory genome provides the genetic flexibility needed to facilitate environment and host adaptation. In Neisseria gonorrhoeae, known accessory elements include plasmids which can transfer and mediate antimicrobial resistance (AMR); however, chromosomal accessory genes could also play a role in AMR. Here, the gonococcal accessory genome was characterized using gene-by-gene approaches and its association with the core genome and AMR was assessed. The gonococcal accessory gene pool consisted of 247 genes, which were mainly genes located on large mobile genetic elements, phage associated genes, or genes encoding putative secretion systems. Accessory elements showed similar synteny across genomes, indicating either a predisposition for particular genomic locations or ancestral inheritance that is conserved during strain expansion. Significant associations were found between the prevalence of accessory elements and core genome multi-locus sequence types (cgMLST), consistent with a structured gonococcal population despite frequent horizontal gene transfer (HGT). Increased prevalence of putative DNA exchange regulators was significantly associated with AMR, which included a putative secretion system, methyltransferases and a toxin-antitoxin system. Although frequent HGT results in high genetic diversity in the gonococcus, we found that this is mediated by a small gene pool. In fact, a highly organized genome composition was identified with a strong association between the accessory and core genome. Increased prevalence of DNA exchange regulators in antimicrobial resistant isolates suggests that genetic material exchange plays a role in the development or maintenance of AMR. These findings enhance our understanding of gonococcal genome architecture and have important implications for gonococcal population biology.
IMPORTANCE The emergence of antimicrobial resistance (AMR) against third generation cephalosporins in Neisseria gonorrhoeae is a major public health concern, as these are antibiotics of last resort for the effective treatment of gonorrhea. Although the resistance mechanisms against this class of antibiotics have not been entirely resolved, resistance against other classes of antibiotics, such as tetracyclines, is known to be mediated through plasmids, which are known gonococcal extra-chromosomal accessory elements. A complete assessment of the chromosomal accessory genome content and its role in AMR has not yet been undertaken. Here, we comprehensively characterize the gonococcal accessory genome to better understand genome architecture as well as the evolution and mechanisms of AMR in this species.

RevDate: 2022-05-23

Wang C, Ye Q, Jiang A, et al (2022)

Pseudomonas aeruginosa Detection Using Conventional PCR and Quantitative Real-Time PCR Based on Species-Specific Novel Gene Targets Identified by Pangenome Analysis.

Frontiers in microbiology, 13:820431.

Mining novel specific molecular targets and establishing efficient identification methods are significant for detecting Pseudomonas aeruginosa, which can enable P. aeruginosa tracing in food and water. Pangenome analysis was used to analyze the whole genomic sequences of 2,017 strains (including 1,000 P. aeruginosa strains and 1,017 other common foodborne pathogen strains) downloaded from gene databases to obtain novel species-specific genes, yielding a total of 11 such genes. Four novel target genes, UCBPP-PA14_00095, UCBPP-PA14_03237, UCBPP-PA14_04976, and UCBPP-PA14_03627, were selected for use, which had 100% coverage in the target strain and were not present in nontarget bacteria. PCR primers (PA1, PA2, PA3, and PA4) and qPCR primers (PA12, PA13, PA14, and PA15) were designed based on these target genes to establish detection methods. For the PCR primer set, the minimum detection limit for DNA was 65.4 fg/μl, which was observed for primer set PA2 of the UCBPP-PA14_03237 gene. The detection limit in pure culture without pre-enrichment was 10⁵ colony-forming units (CFU)/ml for primer set PA1, 10³ CFU/ml for primer set PA2, and 10⁴ CFU/ml for primer set PA3 and primer set PA4. Then, qPCR standard curves were established based on the novel species-specific targets. The standard curves showed perfect linear correlations, with R² values of 0.9901 for primer set PA12, 0.9915 for primer set PA13, 0.9924 for primer set PA14, and 0.9935 for primer set PA15. The minimum detection limit of the real-time PCR (qPCR) assay was 10² CFU/ml for pure cultures of P. aeruginosa. Compared with the endpoint PCR and traditional culture methods, the qPCR assay was more sensitive by one or two orders of magnitude. The feasibility of these methods was satisfactory in terms of sensitivity, specificity, and efficiency after evaluating 29 ready-to-eat vegetable samples and was almost consistent with that of the national standard detection method.
The developed assays can be applied for rapid screening and detection of pathogenic P. aeruginosa, providing accurate results to inform effective monitoring measures in order to improve microbiological safety.
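The selection criterion in this abstract (100% coverage in target strains, absent from nontarget bacteria) can be pictured as a filter over gene presence sets. The sketch below is illustrative only: the genome labels, gene names, and the `species_specific_genes` helper are hypothetical, and it is not the pipeline the authors used.

```python
def species_specific_genes(target_genomes, nontarget_genomes):
    """Genes present in every target genome and absent from every non-target genome."""
    in_all_targets = set.intersection(*target_genomes.values())
    in_any_nontarget = set().union(*nontarget_genomes.values())
    return in_all_targets - in_any_nontarget

# Hypothetical presence sets; labels and gene names are invented for illustration.
targets = {
    "target_1": {"geneA", "geneB", "geneC", "rpoB"},
    "target_2": {"geneA", "geneB", "rpoB"},
    "target_3": {"geneA", "geneB", "geneD", "rpoB"},
}
nontargets = {
    "nontarget_1": {"rpoB", "geneD"},
    "nontarget_2": {"rpoB", "geneB"},
}

candidates = species_specific_genes(targets, nontargets)
print(sorted(candidates))  # ['geneA']
```

Here `geneB` is dropped because one non-target genome carries it, and `geneC` and `geneD` are dropped because they are missing from at least one target genome.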

RevDate: 2022-05-23

Geng R, Cheng L, Cao C, et al (2022)

Comprehensive Analysis Reveals the Genetic and Pathogenic Diversity of Ralstonia solanacearum Species Complex and Benefits Its Taxonomic Classification.

Frontiers in microbiology, 13:854792.

Ralstonia solanacearum species complex (RSSC) is a diverse group of plant pathogens that attack a wide range of hosts and cause devastating losses worldwide. In this study, we conducted a comprehensive analysis of 131 RSSC strains to detect their genetic diversity, pathogenicity, and evolution dynamics. Average nucleotide identity analysis was performed to explore the genomic relatedness among these strains, and we finally obtained an open pangenome with 32,961 gene families. To better understand the diverse evolution and pathogenicity, we also conducted a series of analyses of virulence factors (VFs) and horizontal gene transfer (HGT) in the pangenome and at the single genome level. The distribution of VFs and mobile genetic elements (MGEs) showed significant differences among different groups and strains, which were consistent with the new nomenclatures of the RSSC with three distinct species. Further functional analysis showed that most HGT events originated from Burkholderiales and played a great role in shaping the genomic plasticity and genetic diversity of RSSC genomes. Our work provides insights into the genetic polymorphism, evolution dynamics, and pathogenetic variety of RSSC and provides strong support for the new taxonomic classification, as well as abundant resources for studying host specificity and pathogen emergence.

RevDate: 2022-05-19

Gluck-Thaler E, Ralston T, Konkel Z, et al (2022)

Giant Starship elements mobilize accessory genes in fungal genomes.

Molecular biology and evolution pii:6588634 [Epub ahead of print].

Accessory genes are variably present among members of a species and are a reservoir of adaptive functions. In bacteria, differences in gene distributions among individuals largely result from mobile elements that acquire and disperse accessory genes as cargo. In contrast, the impact of cargo-carrying elements on eukaryotic evolution remains largely unknown. Here, we show that variation in genome content within multiple fungal species is facilitated by Starships, a newly discovered group of massive mobile elements that are 110 kb long on average, share conserved components, and carry diverse arrays of accessory genes. We identified hundreds of Starship-like regions across every major class of filamentous Ascomycetes, including 28 distinct Starships that range from 27-393 kb and last shared a common ancestor ca. 400 mya. Using new long-read assemblies of the plant pathogen Macrophomina phaseolina, we characterize 4 additional Starships whose activities contribute to standing variation in genome structure and content. One of these elements, Voyager, inserts into 5S rDNA and contains a candidate virulence factor whose increasing copy number has contrasting associations with pathogenic and saprophytic growth, suggesting Voyager's activity underlies an ecological trade-off. We propose that Starships are eukaryotic analogs of bacterial integrative and conjugative elements based on parallels between their conserved components and may therefore represent the first dedicated agents of active gene transfer in eukaryotes. Our results suggest that Starships have shaped the content and structure of fungal genomes for millions of years and reveal a new concerted route for evolution throughout an entire eukaryotic phylum.

RevDate: 2022-05-18

Ghimire N, Kim B, Lee CM, et al (2022)

Comparative genome analysis among Variovorax species and genome guided aromatic compound degradation analysis emphasizing 4-hydroxybenzoate degradation in Variovorax sp. PAMC26660.

BMC genomics, 23(1):375.

BACKGROUND: While the genus Variovorax is known for its aromatic compound metabolism, no detailed study of the peripheral and central pathways of aromatic compound degradation has yet been reported. Variovorax sp. PAMC26660 is a lichen-associated bacterium isolated from Antarctica. The work presents the genome-based elucidation of peripheral and central catabolic pathways of aromatic compound degradation genes in Variovorax sp. PAMC26660. Additionally, the accessory, core and unique genes were identified among Variovorax species using the pan genome analysis tool. A detailed analysis of the genes related to xenobiotic metabolism revealed the potential roles of Variovorax sp. PAMC26660 and other species in bioremediation.

RESULTS: TYGS analysis, dDDH, phylogenetic placement and average nucleotide identity (ANI) analysis identified the strain as Variovorax sp. Cell morphology was assessed using scanning electron microscopy (SEM). On analysis of the core, accessory, and unique genes, xenobiotic metabolism accounted only for the accessory and unique genes. On detailed analysis of the aromatic compound catabolic genes, the peripheral pathway related to 4-hydroxybenzoate (4-HB) degradation was found among all species while phenylacetate and tyrosine degradation pathways were present in most of the species including PAMC26660. Likewise, central catabolic pathways, like protocatechuate, gentisate, homogentisate, and phenylacetyl-CoA, were also present. The peripheral pathway for 4-HB degradation was functionally tested using PAMC26660, which grew with 4-HB as its sole source of carbon.

CONCLUSIONS: Computational tools for genome and pan genome analysis are important to understand the behavior of an organism. Xenobiotic metabolism-related genes, which are found only among the accessory and unique genes, suggest evolution through events like lateral gene transfer, mutation and gene rearrangement. 4-HB, an aromatic compound present among lichen species, is utilized by lichen-associated Variovorax sp. PAMC26660 as the sole source of carbon. The strain holds genes and pathways for its utilization. Overall, this study outlines the importance of Variovorax in bioremediation and presents the genomic information of the species.

RevDate: 2022-05-17

Nanni AV, Morse AM, Newman JRB, et al (2022)

Variation in leaf transcriptome responses to elevated ozone corresponds with physiological sensitivity to ozone across maize inbred lines.

Genetics pii:6586798 [Epub ahead of print].

We examine the impact of sustained elevated ozone concentration on the leaf transcriptome of 5 diverse maize inbred genotypes, which vary in physiological sensitivity to ozone (B73, Mo17, Hp301, C123, NC338), using long reads to assemble transcripts and short reads to quantify expression of these transcripts. More than 99% of the long reads, 99% of the assembled transcripts, and 97% of the short reads map to both B73 and Mo17 reference genomes. Approximately 95% of the genes with assembled transcripts belong to known B73-Mo17 syntenic loci and 94% of genes with assembled transcripts are present in all temperate lines in the NAM pan-genome. While there is limited evidence for alternative splicing in response to ozone stress, there is a difference in the magnitude of differential expression among the 5 genotypes. The transcriptional response to sustained ozone stress in the ozone resistant B73 genotype (151 genes) was modest, while more than 3,300 genes were significantly differentially expressed in the more sensitive NC338 genotype. There is the potential for tandem duplication in 30% of genes with assembled transcripts, but there is no obvious association between potential tandem duplication and differential expression. Genes with a common response across the 5 genotypes (83 genes) were associated with photosynthesis, in particular photosystem I. The functional annotation of genes not differentially expressed in B73 but responsive in the other 4 genotypes (789) identifies reactive oxygen species. This suggests that B73 has a different response to long term ozone exposure than the other 4 genotypes. The relative magnitude of the genotypic response to ozone, and the enrichment analyses are consistent regardless of whether aligning short reads to: long read assembled transcripts; the B73 reference; the Mo17 reference. We find that prolonged ozone exposure directly impacts the photosynthetic machinery of the leaf.

RevDate: 2022-05-16

Abdullah IT, Ulijasz AT, Girija UV, et al (2022)

Structure-function analysis for development of peptide inhibitors for a Gram positive quorum sensing system.

Molecular microbiology [Epub ahead of print].

The Streptococcus pneumoniae Rgg144/SHP144 regulator-peptide quorum sensing (QS) system is critical for nutrient utilisation, oxidative stress response, and virulence. Here we characterised this system by assessing the importance of each residue within the active short hydrophobic peptide (SHP) by alanine-scanning mutagenesis and testing the resulting peptides for receptor binding and activation of the receptor. Interestingly, several of the mutations had little effect on binding to Rgg144 but reduced transcriptional activation appreciably. In particular, a proline substitution (P21A) reduced transcriptional activation by 29-fold but bound with 3-fold higher affinity than the wild-type SHP. Consistent with the function of Rgg144, the mutant peptide led to decreased utilisation of mannose and increased susceptibility to superoxide generator paraquat. Pangenome comparison showed full conservation of P21 across SHP144 allelic variants. Crystallisation of Rgg144 in the absence of peptide revealed a comparable structure to the DNA bound and free forms of its homologues suggesting similar mechanisms of activation. Together, these analyses identify key interactions in a critical pneumococcal QS system. Further manipulation of the SHP has the potential to facilitate the development of inhibitors that are functional across strains. The approach described here is likely to be effective across QS systems in multiple species.

RevDate: 2022-05-16

Chen H, Li Y, Xie X, et al (2022)

Exploration of the Molecular Mechanisms Underlying the Anti-Photoaging Effect of Limosilactobacillus fermentum XJC60.

Frontiers in cellular and infection microbiology, 12:838060.

Although lactic acid bacteria (LAB) were shown to be effective for preventing photoaging, the underlying molecular mechanisms have not been fully elucidated. Accordingly, we examined the anti-photoaging potential of 206 LAB isolates and discovered 32 strains with protective activities against UV-induced injury. All of these 32 LABs exhibited high levels of 2,2-diphenyl-picrylhydrazyl, as well as hydroxyl free radical scavenging ability (46.89-85.13% and 44.29-95.97%, respectively). Genome mining and metabonomic verification of the most effective strain, Limosilactobacillus fermentum XJC60, revealed that the anti-photoaging metabolite of LAB was nicotinamide (NAM; 18.50 mg/L in the cell-free serum of XJC60). Further analysis revealed that LAB-derived NAM could reduce reactive oxygen species levels by 70%, stabilize the mitochondrial membrane potential, and increase the NAD+/NADH ratio in UV-injured skin cells. Furthermore, LAB-derived NAM downregulated the transcript levels of matrix metalloproteinase (MMP)-1, MMP-3, interleukin (IL)-1β, IL-6, and IL-8 in skin cells. In vivo, XJC60 relieved inflammation and protected skin collagen fiber integrity in UV-injured guinea pigs. Overall, our findings elucidate that LAB-derived NAM might protect skin from photoaging by stabilizing mitochondrial function, establishing a theoretical foundation for the use of probiotics in the maintenance of skin health.

RevDate: 2022-05-14

Petereit J, Marsh JI, Bayer PE, et al (2022)

Genetic and Genomic Resources for Soybean Breeding Research.

Plants (Basel, Switzerland), 11(9): pii:plants11091181.

Soybean (Glycine max) is a legume species of significant economic and nutritional value. The yield of soybean continues to increase with the breeding of improved varieties, and this is likely to continue with the application of advanced genetic and genomic approaches for breeding. Genome technologies continue to advance rapidly, with an increasing number of high-quality genome assemblies becoming available. With accumulating data from marker arrays and whole-genome resequencing, studying variations between individuals and populations is becoming increasingly accessible. Furthermore, the recent development of soybean pangenomes has highlighted the significant structural variation between individuals, together with knowledge of what has been selected for or lost during domestication and breeding, information that can be applied for the breeding of improved cultivars. Because of this, resources such as genome assemblies, SNP datasets, pangenomes and associated databases are becoming increasingly important for research underlying soybean crop improvement.

RevDate: 2022-05-14

Du Y, Jin Y, Li B, et al (2022)

Comparative Genomic Analysis of Vibrio cincinnatiensis Provides Insights into Genetic Diversity, Evolutionary Dynamics, and Pathogenic Traits of the Species.

International journal of molecular sciences, 23(9): pii:ijms23094520.

Vibrio cincinnatiensis is a poorly understood pathogenic Vibrio species, and the underlying mechanisms of its genetic diversity, genomic plasticity, evolutionary dynamics, and pathogenicity have not yet been comprehensively investigated. Here, a comparative genomic analysis of V. cincinnatiensis was constructed. The open pan-genome with a flexible gene repertoire exhibited genetic diversity. The genomic plasticity and stability were characterized by the determinations of diverse mobile genetic elements (MGEs) and barriers to horizontal gene transfer (HGT), respectively. Evolutionary divergences were exhibited by the difference in functional enrichment and selective pressure between the different components of the pan-genome. The evolution on the Chr I and Chr II core genomes was mainly driven by purifying selection. Predicted essential genes in V. cincinnatiensis were mainly found in the core gene families on Chr I and were subject to stronger evolutionary constraints. We identified diverse virulence-related elements, including the gene clusters involved in encoding flagella, secretion systems, several pili, and scattered virulence genes. Our results indicated the pathogenic potential of V. cincinnatiensis and highlighted that HGT events from other Vibrio species promoted pathogenicity. This pan-genome study provides comprehensive insights into this poorly understood species from the genomic perspective.

RevDate: 2022-05-13

Song JM, Zhang Y, Zhou ZW, et al (2022)

Oil plant genomes: current state of the science.

Journal of experimental botany, 73(9):2859-2874.

Vegetable oils are an indispensable nutritional component of the human diet as well as important raw materials for a variety of industrial applications such as pharmaceuticals, cosmetics, oleochemicals, and biofuels. Oil plant genomes are highly diverse, and their genetic variation leads to a diversity in oil biosynthesis and accumulation along with agronomic traits. This review discusses plant oil biosynthetic pathways, current state of genome assembly, polyploidy and asymmetric evolution of genomes of oil plants and their wild relatives, and research progress of pan-genomics in oil plants. The availability of complete high-resolution genomes and pan-genomes has enabled the identification of structural variations in the genomes that are associated with the diversity of agronomic and environment fitness traits. These and future genomes also provide powerful tools to understand crop evolution and to harvest the rich natural variations to improve oil crops for enhanced productivity, oil quality, and adaptability to changing environments.

RevDate: 2022-05-13

Zhou J, Hu M, Hu A, et al (2022)

Isolation and Genome Analysis of Pectobacterium colocasium sp. nov. and Pectobacterium aroidearum, Two New Pathogens of Taro.

Frontiers in plant science, 13:852750.

Bacterial soft rot is one of the most destructive diseases of taro (Colocasia esculenta) worldwide. In recent years, frequent outbreaks of soft rot disease have seriously affected taro production and became a major constraint to the development of taro planting in China. However, little is known about the causal agents of this disease, and the only reported pathogens are two Dickeya species and P. carotovorum. In this study, we report taro soft rot caused by two novel Pectobacterium strains, LJ1 and LJ2, isolated from taro corms in Ruyuan County, Shaoguan City, Guangdong Province, China. We showed that LJ1 and LJ2 fulfill Koch's postulates for taro soft rot. The two pathogens can infect taro both individually and simultaneously, and neither synergistic nor antagonistic interaction was observed between the two pathogens. Genome sequencing of the two strains indicated that LJ1 represents a novel species of the genus Pectobacterium, for which the name "Pectobacterium colocasium sp. nov." is proposed, while LJ2 belongs to Pectobacterium aroidearum. Pan-genome analysis revealed multiple pathogenicity-related differences between LJ1, LJ2, and other Pectobacterium species, including unique virulence factors, variation in the copy number and organization of Type III, IV, and VI secretion systems, and differential production of plant cell wall degrading enzymes. This study identifies two new soft rot Pectobacteriaceae (SRP) pathogens causing taro soft rot in China, reports a new case of co-infection of plant pathogens, and provides valuable resources for further investigation of the pathogenic mechanisms of SRP.

RevDate: 2022-05-13

Guarracino A, Heumos S, Nahnsen S, et al (2022)

ODGI: understanding pangenome graphs.

Bioinformatics (Oxford, England) pii:6585331 [Epub ahead of print].

MOTIVATION: Pangenome graphs provide a complete representation of the mutual alignment of collections of genomes. These models offer the opportunity to study the entire genomic diversity of a population, including structurally complex regions. Nevertheless, analyzing hundreds of gigabase-scale genomes using pangenome graphs is difficult as it is not well-supported by existing tools. Hence, fast and versatile software is required to ask advanced questions to such data in an efficient way.

RESULTS: We wrote ODGI, a novel suite of tools that implements scalable algorithms and has an efficient in-memory representation of DNA pangenome graphs in the form of variation graphs. ODGI supports pre-built graphs in the Graphical Fragment Assembly format. ODGI includes tools for detecting complex regions, extracting pangenomic loci, removing artifacts, exploratory analysis, manipulation, validation, and visualization. Its fast parallel execution facilitates routine pangenomic tasks, as well as pipelines that can quickly answer complex biological questions of gigabase-scale pangenome graphs.

AVAILABILITY: ODGI is published as free software under the MIT open source license. Source code and documentation are available online. ODGI can be installed via Bioconda or GNU Guix.
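The abstract above notes that ODGI works on variation graphs stored in the Graphical Fragment Assembly (GFA) format. As a rough illustration of what such a file holds, the following is a minimal sketch in plain Python that reads segment (S), link (L) and path (P) records from a GFA version 1 text and spells out the sequence of each path. It does not use ODGI or reproduce any of its algorithms. The helper names `parse_gfa` and `path_sequence` and the toy graph are hypothetical, and the sketch assumes blunt `0M` overlaps between segments.

```python
# Minimal GFA v1 reader: an illustration of the file format only, not ODGI itself.
# Assumption: all links are blunt (0M overlap), so path sequences are plain concatenations.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def parse_gfa(text):
    """Return (segments, links, paths) parsed from tab-separated GFA v1 text."""
    segments, links, paths = {}, [], {}
    for line in text.splitlines():
        fields = line.split("\t")
        record = fields[0]
        if record == "S":
            # S <name> <sequence>
            segments[fields[1]] = fields[2]
        elif record == "L":
            # L <from> <from_orient> <to> <to_orient> <overlap>
            links.append((fields[1], fields[2], fields[3], fields[4]))
        elif record == "P":
            # P <name> <seg+/-,seg+/-,...> <overlaps>
            paths[fields[1]] = [(step[:-1], step[-1]) for step in fields[2].split(",")]
    return segments, links, paths


def path_sequence(segments, steps):
    """Spell a path by joining its segments, reverse-complementing '-' steps."""
    parts = []
    for name, orientation in steps:
        seq = segments[name]
        if orientation == "-":
            seq = seq.translate(COMPLEMENT)[::-1]
        parts.append(seq)
    return "".join(parts)


# Hypothetical toy graph: two samples that differ by one SNP (T versus C).
TOY_GFA = "\n".join("\t".join(row) for row in [
    ("H", "VN:Z:1.0"),
    ("S", "1", "ACG"),
    ("S", "2", "T"),
    ("S", "3", "C"),
    ("S", "4", "GA"),
    ("L", "1", "+", "2", "+", "0M"),
    ("L", "1", "+", "3", "+", "0M"),
    ("L", "2", "+", "4", "+", "0M"),
    ("L", "3", "+", "4", "+", "0M"),
    ("P", "sample_A", "1+,2+,4+", "*"),
    ("P", "sample_B", "1+,3+,4+", "*"),
])

segments, links, paths = parse_gfa(TOY_GFA)
for name, steps in paths.items():
    print(name, path_sequence(segments, steps))
```

Holding the whole graph in Python dictionaries like this would not scale to gigabase-sized graphs, which is the gap the abstract says ODGI is meant to fill.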

RevDate: 2022-05-13

Cella E, Sutcliffe CG, Tso C, et al (2022)

Carriage prevalence and genomic epidemiology of Staphylococcus aureus among Native American children and adults in the Southwestern USA.

Microbial genomics, 8(5):.

Native American individuals in the Southwestern USA experience a higher burden of invasive Staphylococcus aureus disease than the general population. However, little is known about S. aureus carriage in these communities. A cross-sectional study was conducted to determine the carriage prevalence, risk factors and genomic epidemiology of S. aureus among Native American children (<5 years, n=121) and adults (≥18 years, n=167) in the Southwestern USA. Short- and long-read sequencing data were generated using Illumina and Oxford Nanopore Technology platforms to produce high-quality hybrid assemblies, and antibiotic-resistance, virulence and pangenome analyses were performed. S. aureus carriage prevalence was 20.7 % among children, 30.2 % among adults 18-64 years and 16.7 % among adults ≥65 years. Risk factors among adults included recent surgery, prior S. aureus infection among household members, and recent use of gyms or locker rooms by household members. No risk factors were identified among children. The bacterial population structure was dominated by clonal complex 1 (CC1) (21.1 %), CC5 (22.2 %) and CC8 (22.2 %). Isolates from children and adults were intermixed throughout the phylogeny. While the S. aureus population was diverse, the carriage prevalence was comparable to that in the general USA population. Genomic and risk-factor data suggest household, community and healthcare transmission are important components of the local epidemiology.

RevDate: 2022-05-13

Mesa V, Monot M, Ferraris L, et al (2022)

Core-, pan- and accessory genome analyses of Clostridium neonatale: insights into genetic diversity.

Microbial genomics, 8(5):.

Clostridium neonatale is a potential opportunistic pathogen recovered from faecal samples in cases of necrotizing enterocolitis (NEC), a gastrointestinal disease affecting preterm neonates. Although the C. neonatale species description and name validation were published in 2018, comparative genomics are lacking. In the present study, we provide the closed genome assembly of the C. neonatale ATCC BAA-265T (=250.09) reference strain with a manually curated functional annotation of the coding sequences. Pan-, core- and accessory genome analyses were performed using the complete 250.09 genome (4.7 Mb), three new assemblies (4.6-5.6 Mb), and five publicly available draft genome assemblies (4.6-4.7 Mb). The C. neonatale pan-genome contains 6840 genes, while the core-genome has 3387 genes. Pan-genome analysis revealed an 'open' state and genomic diversity. The strain-specific gene families ranged from five to 742 genes. Multiple mobile genetic elements were predicted, including a total of 201 genomic islands, 13 insertion sequence families, one CRISPR-Cas type I-B system and 15 predicted intact prophage signatures. Primary virulence classes including offensive, defensive, regulation of virulence-associated genes and non-specific virulence factors were identified. The presence of a tet(W/N/W) gene encoding a tetracycline resistance ribosomal protection protein and a 23S rRNA methyltransferase ermQ gene were identified in two different strains. Together, our results revealed a genetic diversity and plasticity of C. neonatale genomes and provide a comprehensive view of this species genomic features, paving the way for the characterization of its biological capabilities.
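The pan-, core- and accessory-genome counts reported above follow from a simple set partition of gene families across strains. The following is a minimal sketch of that partition, assuming gene families have already been clustered and each strain is represented as a set of family identifiers. The function name `partition_pangenome` and the toy strains and genes are hypothetical and are not taken from the study.

```python
from collections import Counter


def partition_pangenome(gene_sets):
    """Split gene families into pan, core, accessory and strain-specific sets.

    gene_sets maps a strain name to the set of gene-family IDs it carries.
    """
    n_strains = len(gene_sets)
    # Count in how many strains each gene family occurs.
    counts = Counter(gene for genes in gene_sets.values() for gene in genes)
    pan = set(counts)                                            # seen in at least one strain
    core = {g for g, c in counts.items() if c == n_strains}      # seen in every strain
    unique = {g for g, c in counts.items() if c == 1}            # seen in exactly one strain
    accessory = pan - core                                       # everything that is not core
    return pan, core, accessory, unique


# Hypothetical toy data, for illustration only.
strains = {
    "strain_A": {"gyrB", "recA", "tetW", "phage1"},
    "strain_B": {"gyrB", "recA", "ermQ"},
    "strain_C": {"gyrB", "recA", "tetW"},
}

pan, core, accessory, unique = partition_pangenome(strains)
print(len(pan), len(core), len(accessory), len(unique))  # prints: 5 2 3 2
```

In this sketch the strain-specific families are a subset of the accessory genome; some tools report them as a separate category instead.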

RevDate: 2022-05-05

Tantoso E, Eisenhaber B, F Eisenhaber (2022)

Optimizing the Parametrization of Homologue Classification in the Pan-Genome Computation for a Bacterial Species: Case Study Streptococcus pyogenes.

Methods in molecular biology (Clifton, N.J.), 2449:299-324.

The paradigm shift associated with the introduction of the pan-genome concept has drawn the attention from singular reference genomes toward the actual sequence diversity within organism populations, strain collections, clades, etc. A single genome is no longer sufficient to describe bacteria of interest, but instead, the genomic repertoire of all existing strains is the key to the metabolic, evolutionary, or pathogenic potential of a species. The classification of orthologous genes derived from a collection of taxonomically related genome sequences is central to bacterial pan-genome computational analysis. In this work, we present a review of methods for computing pan-genome gene clusters including their comparative analysis for the case of Streptococcus pyogenes strain genomes. We exhaustively scanned the parametrization space of the homologue searching procedures and find optimal parameters (sequence identity (60%) and coverage (50-60%) in the pairwise alignment) for the orthologous clustering of gene sequences. We find that the sequence identity threshold influences the number of gene families ~3 times stronger than the sequence coverage threshold.
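To make the role of the two thresholds concrete, the following is a minimal sketch of threshold-based gene-family clustering: pairwise alignment hits are kept only if they pass both a sequence identity and a coverage cutoff, and the surviving hits are merged by single linkage with a union-find structure. This is an illustrative stand-in, not the procedure evaluated in the chapter. The function name `cluster_homologues`, the single-linkage choice and the toy hits are assumptions; only the default cutoffs (60% identity, 50% coverage) echo the values quoted in the abstract.

```python
def cluster_homologues(genes, hits, min_identity=60.0, min_coverage=50.0):
    """Group genes into families from pairwise hits (query, subject, identity %, coverage %)."""
    parent = {gene: gene for gene in genes}

    def find(gene):
        # Follow parent links to the family representative, compressing the path on the way.
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for query, subject, identity, coverage in hits:
        # A hit links two genes only if it passes both thresholds.
        if identity >= min_identity and coverage >= min_coverage:
            parent[find(query)] = find(subject)

    families = {}
    for gene in genes:
        families.setdefault(find(gene), set()).add(gene)
    return sorted(sorted(members) for members in families.values())


# Hypothetical toy alignment results, for illustration only.
genes = ["a1", "b1", "c1", "a2", "b2"]
hits = [
    ("a1", "b1", 92.0, 98.0),
    ("b1", "c1", 61.0, 55.0),
    ("a2", "b2", 75.0, 40.0),  # fails the coverage cutoff
    ("a1", "a2", 35.0, 90.0),  # fails the identity cutoff
]

relaxed = cluster_homologues(genes, hits)
strict = cluster_homologues(genes, hits, min_identity=90.0)
print(len(relaxed), len(strict))  # prints: 3 4
```

Raising the identity cutoff from 60% to 90% splits one family in the toy data, which is the kind of sensitivity to parametrization that the abstract describes.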

RevDate: 2022-05-03

Liu H, Zhao W, Hua W, et al (2022)

A large-scale population based organelle pan-genomes construction and phylogeny analysis reveal the genetic diversity and the evolutionary origins of chloroplast and mitochondrion in Brassica napus L.

BMC genomics, 23(1):339.

BACKGROUND: Allotetraploid oilseed rape (Brassica napus L.) is an important worldwide oil-producing crop. The origin of rapeseed is still undetermined due to the lack of wild resources. Although genetic architecture and phylogenetic studies have focused on large groups of Brassica nuclear genomes, organelle genome information at a global scale, which provides unique material for phylogenetic studies of B. napus, is largely unknown. Here, based on de novo assemblies of 1,579 B. napus accessions collected globally, we constructed the chloroplast and mitochondrial pan-genomes of B. napus, and investigated the genetic diversity and phylogenetic relationships of B. napus, B. rapa and B. oleracea.

RESULTS: Based on mitotype-specific markers and mitotype-variant ORFs, four main cytoplasmic haplotypes were identified in our groups corresponding to the nap, pol, ole, and cam mitotypes, among which the structure of chloroplast genomes was more conserved without any rearrangement than mitochondrial genomes. A total of 2,092 variants were detected in chloroplast genomes, whereas only 326 in mitochondrial genomes, indicating that chloroplast genomes exhibited a higher level of single-base polymorphism than mitochondrial genomes. Based on whole-genome variants diversity analysis, eleven genetic difference regions among different cytoplasmic haplotypes were identified on chloroplast genomes. The phylogenetic tree incorporating accessions of the B. rapa, B. oleracea, natural and synthetic populations of B. napus revealed multiple origins of B. napus cytoplasm. The cam-type and pol-type were both derived from B. rapa, while the ole-type originated from B. oleracea. Notably, the nap-type cytoplasm was identified in both the B. rapa population and the synthetic B. napus, suggesting that B. rapa might be the maternal ancestor of nap-type B. napus.

CONCLUSIONS: The phylogenetic results provide novel insights into the organelle genomic evolution of Brassica species. The natural rapeseeds contained at least four cytoplasmic haplotypes, of which the predominant nap-type might have originated from B. rapa. Besides, the organelle pan-genomes and the overall variation data offered useful resources for analysis of cytoplasmic inheritance related agronomically important traits of rapeseed, which can substantially facilitate the cultivation and improvement of rapeseed varieties.

RevDate: 2022-05-02

Burridge AJ, Winfield MO, Wilkinson PA, et al (2022)

The Use and Limitations of Exome Capture to Detect Novel Variation in the Hexaploid Wheat Genome.

Frontiers in plant science, 13:841855.

The bread wheat (Triticum aestivum) pangenome is a patchwork of variable regions, including translocations and introgressions from progenitors and wild relatives. Although a large number of these have been documented, it is likely that many more remain unknown. To map these variable regions and make them more traceable in breeding programs, wheat accessions need to be genotyped or sequenced. The wheat genome is large and complex and consequently, sequencing efforts are often targeted through exome capture. In this study, we employed exome capture prior to sequencing 12 wheat varieties; 10 elite T. aestivum cultivars and two T. aestivum landrace accessions. Sequence coverage across chromosomes was greater toward distal regions of chromosome arms and lower in centromeric regions, reflecting the capture probe distribution which itself is determined by the known telomere to centromere gene gradient. Superimposed on this general pattern, numerous drops in sequence coverage were observed. Several of these corresponded with reported introgressions. Other drops in coverage could not be readily explained and may point to introgressions that have not, to date, been documented.

RevDate: 2022-05-02

Nwaiwu O (2022)

Comparative genome analysis of the first Listeria monocytogenes core genome multi-locus sequence types CT2050 AND CT2051 strains with their close relatives.

AIMS microbiology, 8(1):61-72 pii:microbiol-08-01-006.

Genome sequences of the three strains of L. monocytogenes, which are the first core genome multi-locus sequence types (cgMLST) 2050 and 2051, were reviewed and compared with 21 close relatives and reference genomes. Using a pan-genomic approach to analyse whole genome sequences, it was found that the strains consisted of approximately 2200 shared genes and a much greater pool of genes present as an accessory genome. An unknown transmissible sequence of approximately 91 kb harbouring bacitracin resistance genes found in strain LmNG2 (1/2b) was revealed to be an Inc18 plasmid. The CT2051 strain LmNG3 (1/2a) harboured more unique genes (252 vs 230) than the well-known reference strain LmEGD-e (1/2a). More studies to monitor new strains can help reduce food-borne outbreaks.

RevDate: 2022-05-02

Song Y, Xu X, Huang Z, et al (2022)

Corrigendum: Genomic Characteristics and Pan-Genome Analysis of Rhodococcus equi.

Frontiers in cellular and infection microbiology, 12:884441.

[This corrects the article DOI: 10.3389/fcimb.2022.807610.].

RevDate: 2022-04-30

Mohd Saad NS, Neik TX, Thomas WJW, et al (2022)

Advancing designer crops for climate resilience through an integrated genomics approach.

Current opinion in plant biology, 67:102220 pii:S1369-5266(22)00049-8 [Epub ahead of print].

Climate change and exponential population growth are exposing an immediate need for developing future crops that are highly resilient and adaptable to changing environments to maintain global food security in the next decade. Rigorous selection from long domestication history has rendered cultivated crops genetically disadvantaged, raising concerns in their ability to adapt to these new challenges and limiting their usefulness in breeding programmes. As a result, future crop improvement efforts must rely on integrating various genomic strategies ranging from high-throughput sequencing to machine learning, in order to exploit germplasm diversity and overcome bottlenecks created by domestication, expansive multi-dimensional phenotypes, arduous breeding processes, complex traits and big data.

RevDate: 2022-04-30

Wang Z, Rouard M, Biswas MK, et al (2022)

A chromosome-level reference genome of Ensete glaucum gives insight into diversity and chromosomal and repetitive sequence evolution in the Musaceae.

GigaScience, 11:.

BACKGROUND: Ensete glaucum (2n = 2x = 18) is a giant herbaceous monocotyledonous plant in the small Musaceae family along with banana (Musa). A high-quality reference genome sequence assembly of E. glaucum is a resource for functional and evolutionary studies of Ensete, Musaceae, and the Zingiberales.

FINDINGS: Using Oxford Nanopore Technologies, chromosome conformation capture (Hi-C), Illumina and RNA survey sequence, supported by molecular cytogenetics, we report a high-quality 481.5 Mb genome assembly with 9 pseudo-chromosomes and 36,836 genes. A total of 55% of the genome is composed of repetitive sequences with predominantly LTR-retroelements (37%) and DNA transposons (7%). The single 5S ribosomal DNA locus had an exceptionally long monomer length of 1,056 bp, more than twice that of the monomers at multiple loci in Musa. A tandemly repeated satellite (1.1% of the genome, with no similar sequence in Musa) was present around all centromeres, together with a few copies of a long interspersed nuclear element (LINE) retroelement. The assembly enabled us to characterize in detail the chromosomal rearrangements occurring between E. glaucum and the x = 11 species of Musa. One E. glaucum chromosome has the same gene content as Musa acuminata, while others show multiple, complex, but clearly defined evolutionary rearrangements in the change between x = 9 and 11.

CONCLUSIONS: The advance towards a Musaceae pangenome including E. glaucum, tolerant of extreme environments, makes a complete set of gene alleles, copy number variation, and a reference for structural variation available for crop breeding and understanding environmental responses. The chromosome-scale genome assembly shows the nature of chromosomal fusion and translocation events during speciation, and features of rapid repetitive DNA change in terms of copy number, sequence, and genomic location, critical to understanding its role in diversity and evolution.

RevDate: 2022-04-28

Markello C, Huang C, Rodriguez A, et al (2022)

A complete pedigree-based graph workflow for rare candidate variant analysis.

Genome research pii:gr.276387.121 [Epub ahead of print].

Methods that use a linear genome reference for genome sequencing data analysis are reference-biased. In the field of clinical genetics for rare diseases, a resulting reduction in genotyping accuracy in some regions has likely prevented the resolution of some cases. Pangenome graphs embed population variation into a reference structure. Although pangenome graphs have helped to reduce reference mapping bias, further performance improvements are possible. We introduce VG-Pedigree, a pedigree-aware workflow based on the pangenome-mapping tool of Giraffe and the variant calling tool DeepTrio using a specially trained model for Giraffe-based alignments. We demonstrate mapping and variant calling improvements in both single-nucleotide variants (SNVs) and insertion and deletion (indel) variants over those produced by alignments created using BWA-MEM to a linear-reference and Giraffe mapping to a pangenome graph containing data from the 1000 Genomes Project. We have also adapted and upgraded deleterious-variant (DV) detecting methods and programs into a streamlined workflow. We used these workflows in combination to detect small lists of candidate DVs among 15 family quartets and quintets of the Undiagnosed Diseases Program (UDP). All candidate DVs that were previously diagnosed using the Mendelian models covered by the previously published methods were recapitulated by these workflows. The results of these experiments indicate that a slightly greater absolute count of DVs are detected in the proband population than in their matched unaffected siblings.

RevDate: 2022-04-28

Alotaibi G, Khan K, Al Mouslem AK, et al (2022)

Pan genome based reverse vaccinology approach to explore Enterococcus faecium (VRE) strains for identification of novel multi-epitopes vaccine candidate.

Immunobiology, 227(3):152221 pii:S0171-2985(22)00047-X [Epub ahead of print].

Enterococcus faecium is regarded as the fourth most common emerging pathogen causing hospital acquired infections (HAIs), with a high mortality rate, especially in children, elderly and immunocompromised patients. Recently, due to the emergence of E. faecium resistant strains, especially vancomycin-resistant strains (VRE), and their continuously growing resistance to antibiotics, design of a safe vaccine remains a choice for its control. Alternative control through vaccination has received much attention, but there is no clinically approved vaccine against this pathogen. Therefore, in the current study we have applied a triple helix approach, i.e., pan-genome, subtractive genome and reverse vaccinology, to identify and design potential vaccine candidates and a multiepitope-based vaccine (MEV) construct against E. faecium (via core genome analysis from 216 strains). In this study, only 2 outer membrane proteins were identified through genome subtraction of resistant strains genes against human and essential proteins. Subsequently, phosphate ABC transporter substrate binding protein (Psts) was selected as a promiscuous vaccine candidate to develop a potent vaccine model. A final set of four epitopes from CD8+ T-cell, CD4+ T-cell epitopes, and B-cell were shortlisted from outer membrane protein with highly antigenic, IFN-γ inducer, and overlapping characteristics for the construction of twelve vaccine models. The V3 construct was found to be highly immunogenic, non-toxic, non-allergenic, highly antigenic and most stable in terms of molecular docking and simulation studies against six HLAs, TLR2, and TLR4 complex. So far, this protein and multiepitope have never been characterized as vaccine targets against E. faecium. The current study proposed V3 as a significant vaccine candidate that could help the scientific community to treat E. faecium infections.

RevDate: 2022-04-28

Wu J, NicAogáin K, McAuliffe O, et al (2022)

Phylogenetic and Phenotypic Analyses of a Collection of Food and Clinical Listeria monocytogenes Isolates Reveal Loss of Function of Sigma B from Several Clonal Complexes.

Applied and environmental microbiology [Epub ahead of print].

To understand the molecular mechanisms that contribute to the stress responses of the important foodborne pathogen Listeria monocytogenes, we collected 139 strains (meat, n = 25; dairy, n = 10; vegetable, n = 8; seafood, n = 14; mixed food, n = 4; and food processing environments, n = 78), mostly isolated in Ireland, and subjected them to whole-genome sequencing. These strains were compared to 25 Irish clinical isolates and 4 well-studied reference strains. Core genome and pan-genome analysis confirmed a highly clonal and deeply branched population structure. Multilocus sequence typing showed that this collection contained a diverse range of strains from L. monocytogenes lineages I and II. Several groups of isolates with highly similar genome content were traced to single or multiple food business operators, providing evidence of strain persistence or prevalence, respectively. Phenotypic screening assays for tolerance to salt stress and resistance to acid stress revealed variants within several clonal complexes that were phenotypically distinct. Five of these phenotypic outliers were found to carry mutations in the sigB operon, which encodes the stress-inducible sigma factor sigma B. Transcriptional analysis confirmed that three of the strains that carried mutations in sigB, rsbV, or rsbU had reduced SigB activity, as predicted. These strains exhibited increased tolerance to salt stress and displayed decreased resistance to low pH stress. Overall, this study shows that loss-of-function mutations in the sigB operon are comparatively common in field isolates, probably reflecting the cost of the general stress response to reproductive fitness in this pathogen.

IMPORTANCE: The bacterial foodborne pathogen Listeria monocytogenes frequently contaminates various categories of food products and is able to cause life-threatening infections when ingested by humans. Thus, it is important to control the growth of this bacterium in food by understanding the mechanisms that allow its proliferation under suboptimal conditions. In this study, intraspecies heterogeneity in stress response was observed across a collection consisting of mainly Irish L. monocytogenes isolates. Through comparisons of genome sequence and phenotypes observed, we identified three strains with impairment of the general stress response regulator SigB. Two of these strains are used widely in food challenge studies for evaluating the growth potential of L. monocytogenes. Given that loss of SigB function is associated with atypical phenotypic properties, the use of these strains in food challenge studies should be re-evaluated.

RevDate: 2022-04-28

de Sá PHCG, Castro Alves JT, AAO Veras (2022)

Protocol to analyze the bacterial pangenome using PAN2HGENE software.

STAR protocols, 3(2):101327 pii:S2666-1667(22)00207-6.

The PAN2HGENE is a computational tool that enables two main analyses. First, the tool can identify gene products absent from the original prokaryotic genome sequence. Second, it enables automated comparative analysis for both complete and draft genomes. All analyses are performed through a simple and intuitive graphical user interface without the need for extensive and complex command lines. For complete details on the use and execution of this protocol, please refer to Silva de Oliveira (2021).

RevDate: 2022-04-27

Norsigian CJ, Danhof HA, Brand CK, et al (2022)

Systems biology approach to functionally assess the Clostridioides difficile pangenome reveals genetic diversity with discriminatory power.

Proceedings of the National Academy of Sciences of the United States of America, 119(18):e2119396119.

Significance: Clostridioides difficile infections are the most common source of hospital-acquired infections and are responsible for an extensive burden on the health care system. Strains of the C. difficile species comprise diverse lineages and demonstrate genome variability, with advantageous trait acquisition driving the emergence of endemic lineages. Here, we present a systems biology analysis of C. difficile that evaluates strain-specific genotypes and phenotypes to investigate the overall diversity of the species. We develop a strain typing method based on similarity of accessory genomes to identify and contextualize genetic loci capable of discriminating between strain groups.

RevDate: 2022-04-27

Adomako M, Ernst D, Simkovsky R, et al (2022)

Comparative Genomics of Synechococcus elongatus Explains the Phenotypic Diversity of the Strains.

mBio [Epub ahead of print].

Strains of the freshwater cyanobacterium Synechococcus elongatus were first isolated approximately 60 years ago, and PCC 7942 is well established as a model for photosynthesis, circadian biology, and biotechnology research. The recent isolation of UTEX 3055 and subsequent discoveries in biofilm and phototaxis phenotypes suggest that lab strains of S. elongatus are highly domesticated. We performed a comprehensive genome comparison among the available genomes of S. elongatus and sequenced two additional laboratory strains to trace the loss of native phenotypes from the standard lab strains and determine the genetic basis of useful phenotypes. The genome comparison analysis provides a pangenome description of S. elongatus, as well as correction of extensive errors in the published sequence for the type strain PCC 6301. The comparison of gene sets and single nucleotide polymorphisms (SNPs) among strains clarifies strain isolation histories and, together with large-scale genome differences, supports a hypothesis of laboratory domestication. Prophage genes in laboratory strains, but not UTEX 3055, affect pigmentation, while unique genes in UTEX 3055 are necessary for phototaxis. The genomic differences identified in this study include previously reported SNPs that are, in reality, sequencing errors, as well as SNPs and genome differences that have phenotypic consequences. One SNP in the circadian response regulator rpaA that has caused confusion is clarified here as belonging to an aberrant clone of PCC 7942, used for the published genome sequence, that has confounded the interpretation of circadian fitness research.

IMPORTANCE: Synechococcus elongatus is a versatile and robust model cyanobacterium for photosynthetic metabolism and circadian biology research, with utility as a biological production platform. We compared the genomes of closely related S. elongatus strains to create a pangenome annotation to aid gene discovery for novel phenotypes. The comparative genomic analysis revealed the need for a new sequence of the species type strain PCC 6301 and includes two new sequences for S. elongatus strains PCC 6311 and PCC 7943. The genomic comparison revealed a pattern of early laboratory domestication of strains, clarifies the relationship between the strains PCC 6301 and UTEX 2973, and showed that differences in large prophage regions, operons, and even single nucleotides have effects on phenotypes as wide-ranging as pigmentation, phototaxis, and circadian gene expression.

RevDate: 2022-04-27

Ferrés I, G Iraola (2021)

An object-oriented framework for evolutionary pangenome analysis.

Cell reports methods, 1(5):100085 pii:S2667-2375(21)00140-5.

Pangenome analysis is fundamental to explore molecular evolution occurring in bacterial populations. Here, we introduce Pagoo, an R framework that enables straightforward handling of pangenome data. The encapsulated nature of Pagoo allows the storage of complex molecular and phenotypic information using an object-oriented approach. This makes it easy to move back and forth through the data within a single programming environment and to save any stage of analysis (including the raw data) in a single file, making it sharable and reproducible. Pagoo provides tools to query, subset, compare, visualize, and perform statistical analyses, in concert with other microbial genomics packages available in the R ecosystem. As working examples, we used 1,000 Escherichia coli genomes to show that Pagoo is scalable, and a global dataset of Campylobacter fetus genomes to identify evolutionary patterns and genomic markers of host-adaptation in this pathogen.

RevDate: 2022-04-26

Rhodes J, Abdolrasouli A, Dunne K, et al (2022)

Population genomics confirms acquisition of drug-resistant Aspergillus fumigatus infection by humans from the environment.

Nature microbiology [Epub ahead of print].

Infections caused by the fungal pathogen Aspergillus fumigatus are increasingly resistant to first-line azole antifungal drugs. However, despite its clinical importance, little is known about how susceptible patients acquire infection from drug-resistant genotypes in the environment. Here, we present a population genomic analysis of 218 A. fumigatus isolates from across the UK and Ireland (comprising 153 clinical isolates from 143 patients and 65 environmental isolates). First, phylogenomic analysis shows strong genetic structuring into two clades (A and B) with little interclade recombination and the majority of environmental azole resistance found within clade A. Second, we show occurrences where azole-resistant isolates of near-identical genotypes were obtained from both environmental and clinical sources, indicating with high confidence the infection of patients with resistant isolates transmitted from the environment. Third, genome-wide scans identified selective sweeps across multiple regions indicating a polygenic basis to the trait in some genetic backgrounds. These signatures of positive selection are seen for loci containing the canonical genes encoding fungicide resistance in the ergosterol biosynthetic pathway, while other regions under selection have no defined function. Lastly, pan-genome analysis identified genes linked to azole resistance and previously unknown resistance mechanisms. Understanding the environmental drivers and genetic basis of evolving fungal drug resistance needs urgent attention, especially in light of increasing numbers of patients with severe viral respiratory tract infections who are susceptible to opportunistic fungal superinfections.

RevDate: 2022-04-26

Shi YM, Hirschmann M, Shi YN, et al (2022)

Global analysis of biosynthetic gene clusters reveals conserved and unique natural products in entomopathogenic nematode-symbiotic bacteria.

Nature chemistry [Epub ahead of print].

Microorganisms contribute to the biology and physiology of eukaryotic hosts and affect other organisms through natural products. Xenorhabdus and Photorhabdus (XP) living in mutualistic symbiosis with entomopathogenic nematodes generate natural products to mediate bacteria-nematode-insect interactions. However, a lack of systematic analysis of the XP biosynthetic gene clusters (BGCs) has limited the understanding of how natural products affect interactions between the organisms. Here we combine pangenome and sequence similarity networks to analyse BGCs from 45 XP strains that cover all sequenced strains in our collection and represent almost all XP taxonomy. The identified 1,000 BGCs belong to 176 families. The most conserved families are denoted by 11 BGC classes. We homologously (over)express the ubiquitous and unique BGCs and identify compounds featuring unusual architectures. The bioactivity evaluation demonstrates that the prevalent compounds are eukaryotic proteasome inhibitors, virulence factors against insects, metallophores and insect immunosuppressants. These findings explain the functional basis of bacterial natural products in this tripartite relationship.

RevDate: 2022-04-23

Lau Vetter MCY, Huang B, Fenske L, et al (2022)

Metabolism of the Genus Guyparkeria Revealed by Pangenome Analysis.

Microorganisms, 10(4): pii:microorganisms10040724.

Halophilic sulfur-oxidizing bacteria belonging to the genus Guyparkeria occur in both marine and terrestrial habitats. Common physiological characteristics displayed by Guyparkeria isolates have not yet been linked to the metabolic potential encoded in their genetic inventory. To provide a genetic basis for understanding the metabolism of Guyparkeria, nine genomes were compared to reveal the metabolic capabilities and adaptations. A detailed account is given of Guyparkeria's ability to assimilate carbon by fixation, to oxidize reduced sulfur, to oxidize thiocyanate, and to cope with salinity stress.

RevDate: 2022-04-23

Néron B, Littner E, Haudiquet M, et al (2022)

IntegronFinder 2.0: Identification and Analysis of Integrons across Bacteria, with a Focus on Antibiotic Resistance in Klebsiella.

Microorganisms, 10(4): pii:microorganisms10040700.

Integrons are flexible gene-exchanging platforms that contain multiple cassettes encoding accessory genes whose order is shuffled by a specific integrase. Integrons embedded within mobile genetic elements often contain multiple antibiotic resistance genes that they spread among nosocomial pathogens and contribute to the current antibiotic resistance crisis. However, most integrons are presumably sedentary and encode a much broader diversity of functions. IntegronFinder is a widely used software to identify novel integrons in bacterial genomes, but has aged and lacks some useful functionalities to handle very large datasets of draft genomes or metagenomes. Here, we present IntegronFinder version 2. We have updated the code, improved its efficiency and usability, adapted the output to incomplete genome data, and added a few novel functions. We describe these changes and illustrate the relevance of the program by analyzing the distribution of integrons across more than 20,000 fully sequenced genomes. We also take full advantage of its novel capabilities to analyze close to 4000 Klebsiella pneumoniae genomes for the presence of integrons and antibiotic resistance genes within them. Our data show that K. pneumoniae has a large diversity of integrons and the largest mobile integron in our database of plasmids. The pangenome of these integrons contains a total of 165 different gene families with most of the largest families being related with resistance to numerous types of antibiotics. IntegronFinder is a free and open-source software available on multiple public platforms.

RevDate: 2022-04-23

Aggarwal SK, Singh A, Choudhary M, et al (2022)

Pangenomics in Microbial and Crop Research: Progress, Applications, and Perspectives.

Genes, 13(4): pii:genes13040598.

Advances in sequencing technologies and bioinformatics tools have fueled a renewed interest in whole genome sequencing efforts in many organisms. The growing availability of multiple genome sequences has advanced our understanding of the within-species diversity, in the form of a pangenome. Pangenomics has opened new avenues for future research such as allowing dissection of complex molecular mechanisms and increased confidence in genome mapping. To comprehensively capture the genetic diversity for improving plant performance, the pangenome concept is further extended from species to genus level by the inclusion of wild species, constituting a super-pangenome. Characterization of pangenome has implications for both basic and applied research. The concept of pangenome has transformed the way biological questions are addressed. From understanding evolution and adaptation to elucidating host-pathogen interactions, finding novel genes or breeding targets to aid crop improvement to design effective vaccines for human prophylaxis, the increasing availability of the pangenome has revolutionized several aspects of biological research. The future availability of high-resolution pangenomes based on reference-level near-complete genome assemblies would greatly improve our ability to address complex biological problems.

RevDate: 2022-04-22

Yu J, Xu X, Wang Y, et al (2022)

Prophage-mediated genome differentiation of the Salmonella Derby ST71 population.

Microbial genomics, 8(4):.

Although Salmonella Derby ST71 strains have been recognized as poultry-specific by previous studies, multiple swine-associated S. Derby ST71 strains were identified in this long-term, multi-site epidemic study. Here, 15 representative swine-associated S. Derby ST71 strains were sequenced and compared with 65 (one swine-associated and 64 poultry-associated) S. Derby ST71 strains available in the NCBI database at a pangenomic level through comparative genomics analysis to identify genomic features related to the differentiation of swine-associated strains and previously reported poultry-associated strains. The distribution patterns of known Salmonella pathogenicity islands (SPIs) and virulence factor (VF) encoding genes were not capable of differentiating between the two strain groups. The results demonstrated that the S. Derby ST71 population harbours an open pan-genome, and swine-associated ST71 strains contain many more genes than the poultry-associated strains, mainly attributed to the prophage sequence contents in the genomes. The numbers of prophage sequences identified in the swine-associated strains were higher than those in the poultry-associated strains. Prophages specifically harboured by the swine-associated strains were found to contain genes that facilitate niche adaptation for the bacterial hosts. Gene deletion experiments revealed that the dam gene specifically present in the prophage of the swine-associated strains is important for S. Derby to adhere onto the host cells. This study provides novel insights into the roles of prophages during the genome differentiation of Salmonella.

RevDate: 2022-04-21

Jv Y, Xi C, Zhao Y, et al (2022)

Pan-Genomic and Transcriptomic Analyses of Marine Pseudoalteromonas agarivorans Hao 2018 Revealed Its Genomic and Metabolic Features.

Marine drugs, 20(4): pii:md20040248.

The genomic and carbohydrate metabolic features of Pseudoalteromonas agarivorans Hao 2018 (P. agarivorans Hao 2018) were investigated through pan-genomic and transcriptomic analyses, and key enzyme genes that may encode the process involved in its extracellular polysaccharide synthesis were screened. The pan-genome of the P. agarivorans strains consists of a core-genome containing 2331 genes, an accessory-genome containing 956 genes, and a unique-genome containing 1519 genes. Clusters of Orthologous Groups analyses showed that P. agarivorans harbors strain-specifically diverse metabolisms, probably representing high evolutionary genome changes. The Kyoto Encyclopedia of Genes and Genomes and reconstructed carbohydrate metabolic pathways displayed that P. agarivorans strains can utilize a variety of carbohydrates, such as d-glucose, d-fructose, and d-lactose. Analyses of differentially expressed genes showed that compared with the stationary phase (24 h), strain P. agarivorans Hao 2018 had upregulated expression of genes related to the synthesis of extracellular polysaccharides in the logarithmic growth phase (2 h), and that the expression of these genes affected extracellular polysaccharide transport, nucleotide sugar synthesis, and glycosyltransferase synthesis. This is the first investigation of the genomic and metabolic features of P. agarivorans through pan-genomic and transcriptomic analyses, and these intriguing discoveries provide the possibility to produce novel marine drug lead compounds with high biological activity.
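
The core/accessory/unique split reported in this abstract can be illustrated with a minimal sketch. It assumes the simplest definitions (core = present in every strain, unique = present in exactly one strain, accessory = everything else); the abstract does not state which thresholds or software the authors used, and the strain and gene names below are hypothetical.

```python
from collections import defaultdict

def partition_pangenome(strain_genes):
    """Split gene families into core, accessory, and unique sets.

    strain_genes maps a strain name to the set of gene families it carries.
    Assumed definitions: core = in all strains, unique = in exactly one
    strain, accessory = in more than one but not all strains.
    """
    n_strains = len(strain_genes)
    carriers = defaultdict(set)  # gene family -> strains carrying it
    for strain, genes in strain_genes.items():
        for gene in genes:
            carriers[gene].add(strain)
    core = {g for g, s in carriers.items() if len(s) == n_strains}
    unique = {g for g, s in carriers.items() if len(s) == 1 and n_strains > 1}
    accessory = set(carriers) - core - unique
    return core, accessory, unique

# Hypothetical toy input: three strains, five gene families.
toy = {
    "strainA": {"gyrB", "recA", "agaA", "hypo1"},
    "strainB": {"gyrB", "recA", "agaA"},
    "strainC": {"gyrB", "recA", "hypo2"},
}
core, accessory, unique = partition_pangenome(toy)
```

Published pan-genome tools often relax the core definition (for example, to genes present in nearly all strains) to tolerate draft assemblies, so counts from such tools need not match this strict partition.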

RevDate: 2022-04-21

Liu Y, Pei T, Du J, et al (2022)

Comparative Genomics Reveals Genetic Diversity and Metabolic Potentials of the Genus Qipengyuania and Suggests Fifteen Novel Species.

Microbiology spectrum [Epub ahead of print].

Members of the genus Qipengyuania are heterotrophic bacteria frequently isolated from marine environments with great application potential in areas such as carotenoid production. However, the genomic diversity, metabolic function, and adaptation of this genus remain largely unclear. Here, 16 isolates related to the genus Qipengyuania were recovered from coastal samples and their genomes were sequenced. The phylogenetic inference of these isolates and reference type strains of this genus indicated that the 16S rRNA gene was insufficient to distinguish them at the species level; instead, the phylogenomic reconstruction could provide reliable phylogenetic relationships and confirm 15 new well-supported branches, representing 15 putative novel genospecies corroborated by the digital DNA-DNA hybridization and average nucleotide identity analyses. Comparative genomics revealed that the genus Qipengyuania had an open pangenome and possessed multiple conserved genes and pathways related to metabolic functions and environmental adaptation, despite the presence of divergent genomic features and specific metabolic potential. Genetic analysis and pigment detection showed that the members of this genus were identified as carotenoid producers, while some proved to be potentially aerobic anoxygenic photoheterotrophs. Collectively, the first insight into the genetic diversity and metabolic potentials of the genus Qipengyuania will contribute to better understanding of the speciation and adaptive evolution in natural environments. IMPORTANCE The deciphering of the phylogenetic diversity and metabolic features of the abundant bacterial taxa is critical for exploring their ecological importance and application potential. Qipengyuania is a genus of frequently isolated heterotrophic microorganisms with great industrial application potential. Numerous strains related to the genus Qipengyuania have been isolated from diverse environments, but their genomic diversity and metabolic functions remain unclear. Our study revealed a high degree of genetic diversity, metabolic versatility, and environmental adaptation of the genus Qipengyuania using comparative genomics. Fifteen novel species of this genus have been established using a polyphasic taxonomic approach, expanding the number of described species to almost double. This study provided an overall view of the genus Qipengyuania at the genomic level and will enable us to better uncover its ecological roles and evolutionary history.

RevDate: 2022-04-21

Wang T, Antonacci-Fulton L, Howe K, et al (2022)

The Human Pangenome Project: a global resource to map genomic diversity.

Nature, 604(7906):437-446.

The human reference genome is the most widely used resource in human genetics and is due for a major update. Its current structure is a linear composite of merged haplotypes from more than 20 people, with a single individual comprising most of the sequence. It contains biases and errors within a framework that does not represent global human genomic variation. A high-quality reference with global representation of common variants, including single-nucleotide variants, structural variants and functional elements, is needed. The Human Pangenome Reference Consortium aims to create a more sophisticated and complete human reference genome with a graph-based, telomere-to-telomere representation of global genomic diversity. Here we leverage innovations in technology, study design and global partnerships with the goal of constructing the highest-possible quality human pangenome reference. Our goal is to improve data representation and streamline analyses to enable routine assembly of complete diploid genomes. With attention to ethical frameworks, the human pangenome reference will contain a more accurate and diverse representation of global genomic variation, improve gene-disease association studies across populations, expand the scope of genomics research to the most repetitive and polymorphic regions of the genome, and serve as the ultimate genetic resource for future biomedical research and precision medicine.

RevDate: 2022-04-20

Ferrero-Serrano Á, Sylvia MM, Forstmeier PC, et al (2022)

Experimental demonstration and pan-structurome prediction of climate-associated riboSNitches in Arabidopsis.

Genome biology, 23(1):101.

BACKGROUND: Genome-wide association studies (GWAS) aim to correlate phenotypic changes with genotypic variation. Upon transcription, single nucleotide variants (SNVs) may alter mRNA structure, with potential impacts on transcript stability, macromolecular interactions, and translation. However, plant genomes have not been assessed for the presence of these structure-altering polymorphisms or "riboSNitches."

RESULTS: We experimentally demonstrate the presence of riboSNitches in transcripts of two Arabidopsis genes, ZINC RIBBON 3 (ZR3) and COTTON GOLGI-RELATED 3 (CGR3), which are associated with continentality and temperature variation in the natural environment. These riboSNitches are also associated with differences in the abundance of their respective transcripts, implying a role in regulating the gene's expression in adaptation to local climate conditions. We then computationally predict riboSNitches transcriptome-wide in mRNAs of 879 naturally inbred Arabidopsis accessions. We characterize correlations between SNPs/riboSNitches in these accessions and 434 climate descriptors of their local environments, suggesting a role of these variants in local adaptation. We integrate this information in CLIMtools V2.0 and provide a new web resource, T-CLIM, that reveals associations between transcript abundance variation and local environmental variation.

CONCLUSION: We functionally validate two plant riboSNitches and, for the first time, demonstrate riboSNitch conditionality dependent on temperature, coining the term "conditional riboSNitch." We provide the first pan-genome-wide prediction of riboSNitches in plants. We expand our previous CLIMtools web resource with riboSNitch information and with 1868 additional Arabidopsis genomes and 269 additional climate conditions, which will greatly facilitate in silico studies of natural genetic variation, its phenotypic consequences, and its role in local adaptation.

RevDate: 2022-04-18

Belaouni HA, Compant S, Antonielli L, et al (2022)

In-depth genome analysis of Bacillus sp. BH32, a salt stress-tolerant endophyte obtained from a halophyte in a semiarid region.

Applied microbiology and biotechnology [Epub ahead of print].

Endophytic strains belonging to the Bacillus cereus group were isolated from the halophytes Atriplex halimus L. (Amaranthaceae) and Tamarix aphylla L. (Tamaricaceae) from coastal and continental regions in Algeria. Based on their salt tolerance (up to 5%), the strains were tested for their ability to alleviate salt stress in tomato and wheat. Bacillus sp. strain BH32 showed the highest potential to reduce salinity stress (up to +50% and +58% of dry weight improvement, in tomato and wheat, respectively, compared to the control). To determine putative mechanisms involved in salt tolerance and plant growth promotion, the whole genome of Bacillus sp. BH32 was sequenced, annotated, and used for comparative genomics against the genomes of closely related strains. The pangenome of Bacillus sp. BH32 and its closest relative was further analyzed. The phylogenomic analyses confirmed its taxonomic position, a member of the Bacillus cereus group, with intergenomic distances (GBDP analysis) pointing to a new taxon (digital DNA-DNA hybridization, dDDH < 70%). Genome mining unveiled several genes involved in stress tolerance, production of anti-oxidants and genes involved in plant growth promotion as well as in the production of secondary metabolites. KEY POINTS: • Bacillus sp. BH32 and other bacterial endophytes were isolated from halophytes, to be tested on tomato and wheat and to limit salt stress adverse effects. • The strain with the highest potential was then studied at the genomic level to highlight numerous genes linked to plant growth promotion and stress tolerance. • Pangenome approaches suggest that the strain belongs to a new taxon within the Bacillus cereus group.

RevDate: 2022-04-18

Li Z, Li Z, Peng Y, et al (2022)

Trans-Regional and Cross-Host Spread of mcr-Carrying Plasmids Revealed by Complete Plasmid Sequences - 44 Countries, 1998-2020.

China CDC weekly, 4(12):242-248.

Background: The surveillance of antimicrobial resistance genes (ARGs) and bacteria is one critical approach to prevent and control antimicrobial resistance (AMR). Next-generation sequencing (NGS) is a powerful tool in monitoring the emergence and spread of ARGs and resistant bacteria. The horizontal transfer of ARGs across host bacteria mediated by plasmids is a challenge in NGS surveillance for resistance because short-read sequencing can hardly generate the complete plasmid genome sequence, and the correlation between ARGs and plasmids is difficult to determine.

Methods: The complete genome sequences of 455 mcr-carrying plasmids (pMCRs), and the data of their host bacteria and isolation regions were collected from the NCBI database. Genes of Inc types and ARGs were searched for each plasmid. The genome similarity of these plasmids was analyzed by pangenome clustering and genome alignment.

Results: A total of 52 Inc types, including a variety of fusion plasmids containing 2 or more Inc types were identified in these pMCRs and carried by complex host bacteria. The cooccurrence of ARGs in pMCRs was generally observed, with an average of 3.9 ARGs per plasmid. Twenty-two clusters with consistent or highly similar sequences and gene compositions were identified by the pangenome clustering, which were characterized with distributions in different countries/regions, years or host bacteria in each cluster.

Discussion: Based on the complete plasmid sequences, distribution of mcr genes in different Inc type plasmids, their co-existence with other AMRs, and transmission of one pMCR across regions and host bacteria can be revealed definitively. Complete plasmid genomes and comparisons in the laboratory network are necessary for spread tracing of ARG-carrying plasmids and risk assessment in AMR surveillance.

RevDate: 2022-04-18

Yuan PB, Zhan Y, Zhu JH, et al (2022)

Pan-Genome Analysis of Laribacter hongkongensis: Virulence Gene Profiles, Carbohydrate-Active Enzyme Prediction, and Antimicrobial Resistance Characterization.

Frontiers in microbiology, 13:862776.

Laribacter hongkongensis is an emerging foodborne pathogen that causes community-acquired gastroenteritis and traveler's diarrhea. However, the genetic features of L. hongkongensis are not yet well understood. A total of 45 aquatic animal-associated L. hongkongensis strains isolated from intestinal specimens of frogs and grass carps were subjected to whole-genome sequencing (WGS). Together with the genome data of 4 reported human clinical strains, virulence genes, carbohydrate-active enzymes, and antimicrobial resistance (AMR) determinants were analyzed to gain a comprehensive understanding of this new foodborne pathogen. Phylogenetic trees indicated that human clinical strains were genetically more closely related to some strains from frogs. The distribution of virulence genes and carbohydrate-active enzymes exhibited different patterns among strains of different sources, reflecting their adaptation to different host environments and indicating different potentials to infect humans. Thirty-two AMR genes were detected, and susceptibility to 18 clinically used antibiotics, including aminoglycoside, chloramphenicol, trimethoprim, and sulfa, was tested to evaluate the availability of clinical medicines. Resistance to rifampicin, cefazolin, ceftazidime, ampicillin, and ceftriaxone is prevalent in most strains, whereas resistance to tetracycline, trimethoprim-sulfamethoxazole, ciprofloxacin, and levofloxacin is concentrated in nearly half of the frog-derived strains, suggesting that drug resistance of frog-derived strains is more serious and that clinical treatment of L. hongkongensis infection should be more cautious.

RevDate: 2022-04-18

Xu S, Wei M, Li G, et al (2022)

Comprehensive Analysis of the Nocardia cyriacigeorgica Complex Reveals Five Species-Level Clades with Different Evolutionary and Pathogenicity Characteristics.

mSystems [Epub ahead of print].

Nocardia cyriacigeorgica is a common etiological agent of nocardiosis that has increasingly been implicated in serious pulmonary infections, especially in immunocompromised individuals. However, the evolution, diversity, and pathogenesis of N. cyriacigeorgica have remained unclear. Here, we performed a comparative genomic analysis using 91 N. cyriacigeorgica strains, 45 of which were newly sequenced in this study. Phylogenetic and average nucleotide identity (ANI) analyses revealed that N. cyriacigeorgica contained five species-level clades (8.6 to 14.6% interclade genetic divergence), namely, the N. cyriacigeorgica complex (NCC). Further pan-genome analysis revealed extensive differences among the five clades in nine functional categories, such as energy production, lipid metabolism, secondary metabolites, and signal transduction mechanisms. All 2,935 single-copy core genes undergoing purifying selection were highly conserved across NCC. However, clades D and E exhibited reduced selective constraints, compared to clades A to C. Horizontal gene transfer (HGT) and mobile genetic elements contributed to genomic plasticity, and clades A and B had experienced a higher level of HGT events than other clades. A total of 129 virulence factors were ubiquitous across NCC, such as the mce operon, hemolysin, and type VII secretion system (T7SS). However, different distributions of three toxin-coding genes and two new types of mce operons were detected, which might contribute to pathogenicity differences among the members of the NCC. Overall, our study provides comprehensive insights into the evolution, genetic diversity, and pathogenicity of NCC, facilitating the prevention of infections. IMPORTANCE Nocardia species are opportunistic bacterial pathogens that can affect all organ systems, primarily the skin, lungs, and brain. N. cyriacigeorgica is the most prevalent species within the genus, exhibits clinical significance, and can cause severe infections when disseminated throughout the body. However, the evolution, diversity, and pathogenicity of N. cyriacigeorgica remain unclear. Here, we have conducted a comparative genomic analysis of 91 N. cyriacigeorgica strains and revealed that N. cyriacigeorgica is not a single species but is composed of five closely related species. In addition, we discovered that these five species differ in many ways, involving selection pressure, horizontal gene transfer, functional capacity, pathogenicity, and antibiotic resistance. Overall, our work provides important clues in dissecting the evolution, genetic diversity, and pathogenicity of NCC, thereby advancing prevention measures against these infections.

RevDate: 2022-04-18

Yang MR, YW Wu (2022)

Enhancing predictions of antimicrobial resistance of pathogens by expanding the potential resistance gene repertoire using a pan-genome-based feature selection approach.

BMC bioinformatics, 23(Suppl 4):131.

BACKGROUND: Predicting which pathogens might exhibit antimicrobial resistance (AMR) based on genomics data is one of the promising ways to swiftly and precisely identify AMR pathogens. Currently, the most widely used genomics approach is through identifying known AMR genes from genomic information in order to predict whether a pathogen might be resistant to certain antibiotic drugs. The list of known AMR genes, however, is still far from comprehensive and may result in inaccurate AMR pathogen predictions. We thus felt the need to expand the AMR gene set and proposed a pan-genome-based feature selection method to identify potential gene sets for AMR prediction purposes.

RESULTS: By building pan-genome datasets and extracting gene presence/absence patterns from four bacterial species, each with more than 2000 strains, we showed that machine learning models built from pan-genome data can be very promising for predicting AMR pathogens. The gene set selected by the eXtreme Gradient Boosting (XGBoost) feature selection approach further improved prediction outcomes, and an incremental approach selecting subsets of XGBoost-selected features brought the machine learning model performance to the next level. Investigating selected gene sets revealed that on average about 50% of genes had no known function and very few of them were known AMR genes, indicating the potential of the selected gene sets to expand resistance gene repertoires.

CONCLUSIONS: We demonstrated that a pan-genome-based feature selection approach is suitable for building machine learning models for predicting AMR pathogens. The extracted gene sets may provide future clues to expand our knowledge of known AMR genes and provide novel hypotheses for inferring bacterial AMR mechanisms.

RevDate: 2022-04-14

Akwani WC, van Vliet AHM, Joel JO, et al (2022)

The Use of Comparative Genomic Analysis for the Development of Subspecies-Specific PCR Assays for Mycobacterium abscessus.

Frontiers in cellular and infection microbiology, 12:816615.

Mycobacterium abscessus complex (MABC) is an important pathogen of immunocompromised patients. Accurate and rapid determination of MABC at the subspecies level is vital for optimal antibiotic therapy. Here we have used comparative genomics to design MABC subspecies-specific PCR assays. Analysis of single nucleotide polymorphisms and core genome multilocus sequence typing showed clustering of genomes into three distinct clusters representing the MABC subspecies M. abscessus, M. bolletii and M. massiliense. Pangenome analysis of 318 MABC genomes from the three subspecies allowed for the identification of 15 MABC subspecies-specific genes. In silico testing of primer sets against 1,663 publicly available MABC genomes and 66 other closely related Mycobacterium genomes showed that all assays had >97% sensitivity and >98% specificity. Subsequent experimental validation of two subspecies-specific genes each showed the PCR assays worked well in individual and multiplex format with no false-positivity with 5 other mycobacteria of clinical importance. In conclusion, we have developed a rapid, accurate, multiplex PCR-assay for discriminating MABC subspecies that could improve their detection, diagnosis and inform correct treatment choice.
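
The sensitivity and specificity figures quoted for the in silico primer tests follow from a standard confusion-matrix calculation. The sketch below shows that arithmetic on hypothetical data; it is not the authors' pipeline, and the function name and input layout are assumptions made for illustration.

```python
def sensitivity_specificity(results):
    """Compute (sensitivity, specificity) for an in silico assay.

    results is an iterable of (is_target, primer_hit) boolean pairs, one per
    genome: is_target says whether the genome belongs to the subspecies the
    assay is meant to detect, primer_hit says whether the primers matched.
    """
    tp = fn = tn = fp = 0
    for is_target, hit in results:
        if is_target and hit:
            tp += 1          # target genome correctly detected
        elif is_target:
            fn += 1          # target genome missed
        elif hit:
            fp += 1          # non-target genome wrongly detected
        else:
            tn += 1          # non-target genome correctly rejected
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec

# Hypothetical toy data: 4 target genomes (3 detected), 5 non-targets (1 false hit).
toy_results = [(True, True)] * 3 + [(True, False)] + [(False, False)] * 4 + [(False, True)]
sens, spec = sensitivity_specificity(toy_results)
```

Here sensitivity is the fraction of target genomes detected (3 of 4) and specificity is the fraction of non-target genomes rejected (4 of 5).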

RevDate: 2022-04-14

Wambui J, Stevens MJA, Cernela N, et al (2022)

Unraveling the Genotypic and Phenotypic Diversity of the Psychrophilic Clostridium estertheticum Complex, a Meat Spoilage Agent.

Frontiers in microbiology, 13:856810.

The spoilage of vacuum-packed meat by Clostridium estertheticum complex (CEC), which may or may not be accompanied by production of copious amounts of gas, has been linked to the acetone-butyrate-ethanol fermentation, but the mechanism behind the variable gas production has not been fully elucidated. The reconstruction and comparison of intra- and interspecies metabolic pathways linked to meat spoilage at the genomic level can unravel the genetic basis for the variable phenotype. However, this is hindered by the unavailability of CEC genomes, which has also hampered the determination of genetic diversity and its drivers within CEC. Therefore, the current study aimed at determining the diversity of CEC through comprehensive comparative genomics. Fifty CEC genomes from 11 CEC species were compared. Recombination and gene gain/loss events were identified as important sources of natural variation within CEC, with the latter being pronounced in genomospecies2 that has lost genes related to flagellar assembly and signaling. Pan-genome analysis revealed variations in carbohydrate metabolic and hydrogenase genes within the complex. Variable inter- and intraspecies gas production in meat by C. estertheticum and Clostridium tagluense was associated with the distribution of the [NiFe]-hydrogenase hyp gene cluster whose absence or presence was associated with occurrence or lack of pack distention, respectively. Through comparative genomics, we have shown that CEC species exhibit high genetic diversity that can be partly attributed to recombination and gene gain/loss events. We have also shown that the genetic basis for variable gas production in meat can be attributed to the presence/absence of the hyp gene cluster.

RevDate: 2022-04-13

Weisberg AJ, Rahman A, Backus D, et al (2022)

Pangenome Evolution Reconciles Robustness and Instability of Rhizobial Symbiosis.

mBio [Epub ahead of print].

Root nodulating rhizobia are nearly ubiquitous in soils and provide the critical service of nitrogen fixation to thousands of legume species, including staple crops. However, the magnitude of fixed nitrogen provided to hosts varies markedly among rhizobia strains, despite host legumes having mechanisms to selectively reward beneficial strains and to punish ones that do not fix sufficient nitrogen. Variation in the services of microbial mutualists is considered paradoxical given host mechanisms to select beneficial genotypes. Moreover, the recurrent evolution of non-fixing symbiont genotypes is predicted to destabilize symbiosis, but breakdown has rarely been observed. Here, we deconstructed hundreds of genome sequences from genotypically and phenotypically diverse Bradyrhizobium strains and revealed mechanisms that generate variation in symbiotic nitrogen fixation. We show that this trait is conferred by a modular system consisting of many extremely large integrative conjugative elements and few conjugative plasmids. Their transmissibility and propensity to reshuffle genes generate new combinations that lead to uncooperative genotypes and make individual partnerships unstable. We also demonstrate that these same properties extend beneficial associations to diverse host species and transfer symbiotic capacity among diverse strains. Hence, symbiotic nitrogen fixation is underpinned by modularity, which engenders flexibility, a feature that reconciles evolutionary robustness and instability. These results provide new insights into mechanisms driving the evolution of mobile genetic elements. Moreover, they yield a new predictive model on the evolution of rhizobial symbioses, one that informs on the health of organisms and ecosystems that are hosts to symbionts and that helps resolve the long-standing paradox. IMPORTANCE Genetic variation is fundamental to evolution yet is paradoxical in symbiosis. Symbionts exhibit extensive variation in the magnitude of services they provide despite hosts having mechanisms to select and increase the abundance of beneficial genotypes. Additionally, evolution of uncooperative symbiont genotypes is predicted to destabilize symbiosis, but breakdown has rarely been observed. We analyzed genome sequences of Bradyrhizobium, bacteria that, in symbioses with legume hosts, fix nitrogen, a nutrient essential for ecosystems. We show that genes for symbiotic nitrogen fixation are within elements that can move between bacteria and reshuffle gene combinations that change host range and quality of symbiosis services. Consequently, nitrogen fixation is evolutionarily unstable for individual partnerships, but is evolutionarily stable for legume-Bradyrhizobium symbioses in general. We developed a holistic model of symbiosis evolution that reconciles robustness and instability of symbiosis and informs on applications of rhizobia in agricultural settings.

RevDate: 2022-04-12

Ebler J, Ebert P, Clarke WE, et al (2022)

Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes.

Nature genetics [Epub ahead of print].

Typical genotyping workflows map reads to a reference genome before identifying genetic variants. Generating such alignments introduces reference biases and comes with substantial computational burden. Furthermore, short-read lengths limit the ability to characterize repetitive genomic regions, which are particularly challenging for fast k-mer-based genotypers. In the present study, we propose a new algorithm, PanGenie, that leverages a haplotype-resolved pangenome reference together with k-mer counts from short-read sequencing data to genotype a wide spectrum of genetic variation, a process we refer to as genome inference. Compared with mapping-based approaches, PanGenie is more than 4 times faster at 30-fold coverage and achieves better genotype concordances for almost all variant types and coverages tested. Improvements are especially pronounced for large insertions (≥50 bp) and variants in repetitive regions, enabling the inclusion of these classes of variants in genome-wide association studies. PanGenie efficiently leverages the increasing amount of haplotype-resolved assemblies to unravel the functional impact of previously inaccessible variants while being faster compared with alignment-based workflows.
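
The k-mer counts this abstract refers to can be illustrated with a minimal sketch of k-mer counting over short reads. This is not PanGenie's algorithm, only the counting step that k-mer-based genotypers start from; the function name and toy reads are hypothetical, and refinements such as canonical (strand-independent) k-mers are omitted.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every length-k substring across a collection of reads.

    k-mers containing the ambiguous base N are skipped. Strands are not
    merged, so a k-mer and its reverse complement are counted separately.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts

# Hypothetical toy reads: two overlapping 6-bp reads, k = 3.
reads = ["ACGTAC", "CGTACG"]
counts = count_kmers(reads, 3)
```

A genotyper would then compare such counts against k-mers that are specific to each allele in the pangenome reference; how that comparison is modeled is beyond what the abstract describes.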

RevDate: 2022-04-11

Patrick S (2022)

A tale of two habitats: Bacteroides fragilis, a lethal pathogen and resident in the human gastrointestinal microbiome.

Microbiology (Reading, England), 168(4):.

Bacteroides fragilis is an obligately anaerobic Gram-negative bacterium and a major colonizer of the human large colon where Bacteroides is a predominant genus. During the growth of an individual clonal population, an astonishing number of reversible DNA inversion events occur, driving within-strain diversity. Additionally, the B. fragilis pan-genome contains a large pool of diverse polysaccharide biosynthesis loci, DNA restriction/modification systems and polysaccharide utilization loci, which generates remarkable between-strain diversity. Diversity clearly contributes to the success of B. fragilis within its normal habitat of the gastrointestinal (GI) tract and during infection in the extra-intestinal host environment. Within the GI tract, B. fragilis is usually symbiotic, for example providing localized nutrients for the gut epithelium, but B. fragilis within the GI tract may not always be benign. Metalloprotease toxin production is strongly associated with colorectal cancer. B. fragilis is unique amongst bacteria; some strains export a protein >99 % structurally similar to human ubiquitin and antigenically cross-reactive, which suggests a link to autoimmune diseases. B. fragilis is not a primary invasive enteric pathogen; however, if colonic contents contaminate the extra-intestinal host environment, it successfully adapts to this new habitat and causes infection; classically peritoneal infection arising from rupture of an inflamed appendix or GI surgery, which if untreated, can progress to bacteraemia and death. In this review selected aspects of B. fragilis adaptation to the different habitats of the GI tract and the extra-intestinal host environment are considered, along with the considerable challenges faced when studying this highly variable bacterium.

RevDate: 2022-04-11

Baker JL, Tang X, LaBonte S, et al (2022)

mucG, mucH, and mucI Modulate Production of Mutanocyclin and Reutericyclins in Streptococcus mutans B04Sm5.

Journal of bacteriology [Epub ahead of print].

Streptococcus mutans is considered a primary etiologic agent of dental caries, which is the most common chronic infectious disease worldwide. S. mutans B04Sm5 was recently shown to produce reutericyclins and mutanocyclin through the muc biosynthetic gene cluster and to utilize reutericyclins to inhibit the growth of neighboring commensal streptococci. In this study, examination of S. mutans and muc phylogeny suggested evolution of an ancestral S. mutans muc into three lineages within one S. mutans clade and then horizontal transfer of muc to other S. mutans clades. The roles of the mucG and mucH transcriptional regulators and the mucI transporter were also examined. mucH was demonstrated to encode a transcriptional activator of muc. mucH deletion reduced production of mutanocyclin and reutericyclins and eliminated the impaired growth and inhibition of neighboring streptococci phenotypes, which are associated with reutericyclin production. ΔmucG had increased mutanocyclin and reutericyclin production, which impaired growth and increased the ability to inhibit neighboring streptococci. However, deletion of mucG also caused reduced expression of mucD, mucE, and mucI. Deletion of mucI reduced mutanocyclin and reutericyclin production but enhanced growth, suggesting that mucI may not transport reutericyclin as its homolog does in Limosilactobacillus reuteri. Further research is needed to determine the roles of mucG and mucI and to identify any cofactors affecting the activity of the mucG and mucH regulators. Overall, this study provided pangenome and phylogenetic analyses that serve as a resource for S. mutans research and began elucidation of the regulation of reutericyclins and mutanocyclin production in S. mutans. IMPORTANCE S. mutans must be able to outcompete neighboring organisms in its ecological niche in order to cause dental caries. S. mutans B04Sm5 inhibited the growth of neighboring commensal streptococci through production of reutericyclins via the muc biosynthetic gene cluster. In this study, an S. mutans pangenome database and updated phylogenetic tree were generated that will serve as valuable resources for the S. mutans research community and that provide insights into the carriage and evolution of S. mutans muc. The MucG and MucH regulators, and the MucI transporter, were shown to modulate production of reutericyclins and mutanocyclin. These genes also affected the ability of S. mutans to inhibit neighboring commensals, suggesting that they may play a role in S. mutans virulence.

RevDate: 2022-04-11

Pan W, Cheng Z, Han Z, et al (2022)

Efficient genetic transformation and CRISPR/Cas9-mediated genome editing of watermelon assisted by genes encoding developmental regulators.

Journal of Zhejiang University. Science. B, 23(4):339-344.

Cucurbitaceae is an important family of flowering plants containing multiple species of important food plants, such as melons, cucumbers, squashes, and pumpkins. However, a highly efficient genetic transformation system has not been established for most of these species (Nanasato and Tabei, 2020). Watermelon (Citrullus lanatus), an economically important and globally cultivated fruit crop, is a model species for fruit quality research due to its rich diversity of fruit size, shape, flavor, aroma, texture, peel and flesh color, and nutritional composition (Guo et al., 2019). Through pan-genome sequencing, many candidate loci associated with fruit quality traits have been identified (Guo et al., 2019). However, few of these loci have been validated. The major barrier is the low transformation efficiency of the species, with only a few successful cases of genetic transformation reported so far (Tian et al., 2017; Feng et al., 2021; Wang JF et al., 2021; Wang YP et al., 2021). For example, Tian et al. (2017) obtained only 16 transgenic lines from about 960 cotyledon fragments, yielding a transformation efficiency of 1.67%. Therefore, efficient genetic transformation could not only facilitate functional genomic studies in watermelon and other horticultural species, but also speed up transgenic and genome-editing breeding.
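
The transformation efficiency quoted in this abstract is a simple ratio, and the short Python sketch below reproduces it from the two numbers given (16 transgenic lines, about 960 cotyledon fragments). The function name is hypothetical, and the definition used here (transgenic lines per explant, as a percentage) is an assumption inferred from those two numbers.

```python
def transformation_efficiency(transgenic_lines, explants):
    """Return transgenic lines per explant, expressed as a percentage."""
    if explants <= 0:
        raise ValueError("explants must be a positive number")
    return 100.0 * transgenic_lines / explants


# Numbers quoted in the abstract for Tian et al. (2017).
efficiency = transformation_efficiency(16, 960)
print(f"{efficiency:.2f}%")  # prints "1.67%"
```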

RevDate: 2022-04-11

Sun Y, Wang J, Li Y, et al (2022)

Pan-Genome Analysis Reveals the Abundant Gene Presence/Absence Variations Among Different Varieties of Melon and Their Influence on Traits.

Frontiers in plant science, 13:835496.

Melon (Cucumis melo L.) is an important vegetable crop that has been subjected to domestication and improvement. Several varieties of melons with diverse phenotypes have been produced. In this study, we constructed a melon pan-genome based on 297 accessions comprising 168 Mb novel sequences and 4,325 novel genes. Based on the results, there were abundant genetic variations among different melon groups, including 364 unfavorable genes in the IMP_A vs. LDR_A group, 46 favorable genes, and 295 unfavorable genes in the IMP_M vs. LDR_M group. The distribution of 709 resistance gene analogs (RGAs) was also characterized across 297 melon lines, of which 603 were core genes. Further, 106 genes were found to be variable, 55 of which were absent in the reference melon genome. Using gene presence/absence variation (PAV)-based genome-wide association analysis (GWAS), 13 gene PAVs associated with fruit length, fruit shape, and fruit width were identified, four of which were located in pan-genome additional contigs.
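
The split of the 709 RGAs into 603 core and 106 variable genes follows from a gene presence/absence table. The Python sketch below shows one plausible way to make that split, assuming "core" means present in every accession and "variable" means missing from at least one; the function name, gene names, and accession names are hypothetical toy data, and the study's own thresholds may differ.

```python
def classify_genes(presence, accessions):
    """Split genes into core (present in every accession) and variable (missing from at least one).

    presence maps each gene name to the set of accessions that carry it.
    """
    all_accessions = set(accessions)
    core, variable = [], []
    for gene, carriers in presence.items():
        if all_accessions <= carriers:
            core.append(gene)
        else:
            variable.append(gene)
    return sorted(core), sorted(variable)


# Toy presence/absence data (hypothetical names).
accessions = ["acc1", "acc2", "acc3"]
presence = {
    "rga_a": {"acc1", "acc2", "acc3"},
    "rga_b": {"acc1", "acc3"},
    "rga_c": {"acc1", "acc2", "acc3"},
    "rga_d": set(),
}

core, variable = classify_genes(presence, accessions)
print(core, variable)
```

With the counts reported in the abstract, the two lists would have lengths 603 and 106, which together account for all 709 RGAs.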

RevDate: 2022-04-11

Kaushik A, Roberts DP, Ramaprasad A, et al (2022)

Pangenome Analysis of the Soilborne Fungal Phytopathogen Rhizoctonia solani and Development of a Comprehensive Web Resource: RsolaniDB.

Frontiers in microbiology, 13:839524.

Rhizoctonia solani is a collective group of genetically and pathologically diverse basidiomycetous fungi that damage economically important crops. Its isolates are classified into 13 Anastomosis Groups (AGs) and subgroups having distinctive morphology and host ranges. The genetic factors driving the unique features of R. solani pathology are not well characterized due to the limited availability of its annotated genomes. Therefore, we performed genome sequencing, assembly, annotation and functional analysis of 12 R. solani isolates covering 7 AGs and select subgroups (AG1-IA; AG1-IB; AG1-IC; AG2-2IIIB; AG3-PT, isolates Rhs 1AP and the hypovirulent Rhs1A1; AG3-TB; AG4-HG-I, isolates Rs23 and R118-11; AG5; AG6; and AG8), in which six genomes are reported for the first time. Using a pangenome comparative analysis of 12 R. solani isolates and 15 other Basidiomycetes, we defined the unique and shared secretomes, CAZymes, and effectors across the AGs. We have also elucidated the R. solani-derived factors potentially involved in determining AG-specific host preference, and the attributes distinguishing them from other Basidiomycetes. Finally, we present the largest repertoire of R. solani genomes and their annotated components as a comprehensive database, viz. RsolaniDB, with tools for large-scale data mining, functional enrichment and sequence analysis not available with other state-of-the-art platforms.

RevDate: 2022-04-09

Zhang F, Xue H, Dong X, et al (2022)

Long-read sequencing of 111 rice genomes reveals significantly larger pan-genomes.

Genome research pii:gr.276015.121 [Epub ahead of print].

The concept of a pan-genome, which is the collection of all genomes from a population, has shown great potential in genomics study, especially for crop sciences. The rice pan-genome constructed from the second-generation sequencing (SGS) data is about 270 Mb larger than Nipponbare, the rice reference genome (NipRG), but it still suffers from incompleteness and loss of genomic contexts. The third-generation sequencing (TGS) with long reads can help to construct better pan-genomes. In this paper, we reported a high-quality rice pan-genome construction method by introducing a series of new steps to deal with the long-read data including unmapped sequence block filtering, redundancy removing, and sequence block elongating. Compared to NipRG, the long-read sequencing-based pan-genome constructed from 105 rice accessions, which contains 604 Mb novel sequences, is much more comprehensive than the one constructed from ~3000 rice genomes sequenced with short reads. The repetitive sequences are the main components of novel sequences, which partially explained the differences between the pan-genomes based on TGS and SGS. Adding 6 wild rice accessions, there are about 879 Mb novel sequences and 19,000 novel genes in the rice pan-genome in total. In addition, we have created high-quality reference genomes for all representative rice populations, including 5 gapless reference genomes. This study has brought significant progress for our understanding about the rice pan-genome, and this pan-genome construction method for long-read data can be applied to accelerate a broad range of genomics studies.

RevDate: 2022-04-07

Jung H, Kim HS, Han G, et al (2022)

Comparative Analyses of Four Complete Genomes in Pseudomonas amygdali Revealed Differential Adaptation to Hostile Environments and Secretion Systems.

The plant pathology journal, 38(2):167-174.

Pseudomonas amygdali is a hemibiotrophic phytopathogen that causes disease in woody and herbaceous plants. Complete genomes of four P. amygdali pathovars were comparatively analyzed to decipher the impact of genomic diversity on host colonization. The pan-genome indicated that 3,928 core genes are conserved among pathovars, while 504-1,009 are unique to specific pathovars. The unique genome contained many mobile elements and exhibited a functional distribution different from the core genome. Genes involved in O-antigen biosynthesis and antimicrobial peptide resistance were significantly enriched for adaptation to hostile environments. While the type III secretion system was distributed in the core genome, unique genomes revealed a different organization of secretion systems as follows: type I in pv. tabaci, type II in pv. japonicus, type IV in pv. morsprunorum, and type VI in pv. lachrymans. These findings provide genetic insight into the dynamic interactions of the bacteria with plant hosts.

RevDate: 2022-04-06

Beier S, Thomson NR (2022)

Panakeia - a universal tool for bacterial pangenome analysis.

BMC genomics, 23(1):265.

BACKGROUND: Development of new pan-genome analysis tools is important because, in recent years, pangenome analysis has become an important method for defining the diversity of a selected taxon, most commonly a species. This enables comparison of strains from different ecological niches and can be used to define the functional potential in a bacterial population. It gives us a much better view of microbial genomics than can be gained from single genomes, which after all are just single representatives of a much more varied population.

RESULTS: We present Panakeia, a tool which strives to be easy to use and to provide a detailed view of the pangenome structure which can efficiently be utilised for discovery, or further in-depth analysis, of features of interest. It analyses synteny and multiple structural patterns of the pangenome, giving insights into the biological diversity and evolution of the studied taxon. Panakeia hence provides both broad and detailed information on the structure of a pangenome, for diverse and highly clonal populations of bacteria.

CONCLUSIONS: Previously published pangenome tools often reduce the information to a presence/absence matrix of unconnected genes or generate massive, hard-to-interpret output graphs. However, Panakeia includes synteny and structural information and presents it in a way that can readily be used for further analysis. Panakeia is available for download together with a detailed User Guide.
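
To illustrate the contrast drawn in this abstract between a presence/absence matrix and synteny information, the Python sketch below counts in how many genomes each pair of gene families occurs as direct neighbours. This is a generic adjacency count, not Panakeia's method; the function name, genome names, and family identifiers are hypothetical.

```python
from collections import Counter


def adjacency_counts(genomes):
    """Count in how many genomes each unordered pair of gene families is adjacent.

    genomes maps a genome name to its gene families listed in chromosomal order.
    """
    edges = Counter()
    for order in genomes.values():
        # A set ensures each adjacency is counted at most once per genome.
        pairs = {frozenset(pair) for pair in zip(order, order[1:])}
        edges.update(pairs)
    return edges


# Toy gene orders (hypothetical): genome_b lacks family f2.
genomes = {
    "genome_a": ["f1", "f2", "f3"],
    "genome_b": ["f1", "f3"],
    "genome_c": ["f1", "f2", "f3"],
}

edges = adjacency_counts(genomes)
for pair, n in sorted(edges.items(), key=lambda item: sorted(item[0])):
    print(sorted(pair), n)
```

A presence/absence matrix would record only that f2 is missing from genome_b; the adjacency counts also show that f1 and f3 become direct neighbours in that genome.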

RevDate: 2022-04-05

Sivertsen A, Dyrhovden R, Tellevik MG, et al (2022)

Escherichia marmotae-a Human Pathogen Easily Misidentified as Escherichia coli.

Microbiology spectrum [Epub ahead of print].

We hereby present the first descriptions of human-invasive infections caused by Escherichia marmotae, a recently described species that encompasses the former "Escherichia cryptic clade V." We describe four cases, one acute sepsis of unknown origin, one postoperative sepsis after cholecystectomy, one spondylodiscitis, and one upper urinary tract infection. Cases were identified through unsystematic queries in a single clinical lab over 6 months. Through genome sequencing of the causative strains combined with available genomes from elsewhere, we demonstrate Es. marmotae to be a likely ubiquitous species containing genotypic virulence traits associated with Escherichia pathogenicity. The invasive isolates were scattered among isolates from a range of nonhuman sources in the phylogenetic analyses, thus indicating inherent virulence in multiple lineages. Pan genome analyses indicate that Es. marmotae has a large accessory genome and is likely to obtain ecologically advantageous traits, such as genes encoding antimicrobial resistance. Reliable identification might be possible by matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS), but relevant spectra are missing in commercial databases. It can be identified through 16S rRNA gene sequencing. Escherichia marmotae could represent a relatively common human pathogen, and improved diagnostics will provide a better understanding of its clinical importance. IMPORTANCE Escherichia coli is the most common pathogen found in blood cultures and urine and among the most important pathogenic species in the realm of human health. The notion that some of these isolates are not Es. coli but rather another species within the same genus may have implications for what Es. coli constitutes. 
We only recently have obtained methods to separate the two species, which means that possible differences in important clinical aspects, such as antimicrobial resistance rates, virulence, and phylogenetic structure, may exist. We believe that Es. marmotae as a common pathogen is new merely because we have not looked or bothered to distinguish between the thousands of invasive Escherichia passing through microbiological laboratories each day.

RevDate: 2022-04-04

Zhang Z, Guo J, Cai X, et al (2022)

Improved Reference Genome Annotation of Brassica rapa by Pacific Biosciences RNA Sequencing.

Frontiers in plant science, 13:841618.

The species Brassica rapa includes several important vegetable crops. The draft reference genome of B. rapa ssp. pekinensis was completed in 2011, and it has since been updated twice. The pangenome with structural variations of 18 B. rapa accessions was published in 2021. Although extensive genomic analysis has been conducted on B. rapa, a comprehensive genome annotation including gene structure, alternative splicing (AS) events, and non-coding genes is still lacking. Therefore, we used the Pacific Biosciences (PacBio) single-molecule long-read technology to improve gene models and produced the annotated genome version 3.5. In total, we obtained 753,041 full-length non-chimeric (FLNC) reads and collapsed these into 92,810 non-redundant consensus isoforms, capturing 48% of the genes annotated in the B. rapa reference genome annotation v3.1. Based on the isoform data, we identified 830 novel protein-coding genes that were missed in previous genome annotations, defined the untranslated regions (UTRs) of 20,340 annotated genes and corrected 886 wrongly spliced genes. We also identified 28,564 AS events and 1,480 long non-coding RNAs (lncRNAs). We produced a relatively complete and high-quality reference transcriptome for B. rapa that can facilitate further functional genomic research.

RevDate: 2022-04-04

Sanz MB, De Belder D, de Mendieta JM, et al (2022)

Carbapenemase-Producing Extraintestinal Pathogenic Escherichia coli From Argentina: Clonal Diversity and Predominance of Hyperepidemic Clones CC10 and CC131.

Frontiers in microbiology, 13:830209.

Extraintestinal pathogenic Escherichia coli (ExPEC) causes infections outside the intestine. Particular ExPEC clones, such as clonal complex (CC)/sequence type (ST)131, have been known to sequentially accumulate antimicrobial resistance that starts with chromosomal mutations against fluoroquinolones, followed with the acquisition of bla CTX-M-15 and, more recently, carbapenemases. Here we aimed to investigate the distribution of global epidemic clones of carbapenemase-producing ExPEC from Argentina in representative clinical isolates recovered between July 2008 and March 2017. Carbapenemase-producing ExPEC (n = 160) were referred to the Argentinean reference laboratory. Of these, 71 were selected for genome sequencing. Phenotypic and microbiological studies confirmed the presence of carbapenemases confirmed as KPC-2 (n = 52), NDM-1 (n = 16), IMP-8 (n = 2), and VIM-1 (n = 1) producers. The isolates had been recovered mainly from urine, blood, and abdominal fluids among others, and some were from screening samples. After analyzing the virulence gene content, 76% of the isolates were considered ExPEC, although non-ExPEC isolates were also obtained from extraintestinal sites. Pan-genome phylogeny and clonal analysis showed great clonal diversity, although the first phylogroup in abundance was phylogroup A, harboring CC10 isolates, followed by phylogroup B2 with CC/ST131, mostly H30Rx, the subclone co-producing CTX-M-15. Phylogroups D, B1, C, F, and E were also detected with fewer strains. CC10 and CC/ST131 were found throughout the country. In addition, CC10 nucleated most metalloenzymes, such as NDM-1. Other relevant international clones were identified, such as CC/ST38, CC155, CC14/ST1193, and CC23. Two isolates co-produced KPC-2 and OXA-163 or OXA-439, a point mutation variant of OXA-163, and three isolates co-produced MCR-1 among other resistance genes. To conclude, in this work, we described the molecular epidemiology of carbapenemase-producing ExPEC in Argentina. 
Further studies are necessary to determine the plasmid families disseminating carbapenemases in ExPEC in this region.

RevDate: 2022-03-31

Shropshire WC, Dinh AQ, Earley M, et al (2022)

Accessory Genomes Drive Independent Spread of Carbapenem-Resistant Klebsiella pneumoniae Clonal Groups 258 and 307 in Houston, TX.

mBio [Epub ahead of print].

Carbapenem-resistant Klebsiella pneumoniae (CRKp) is an urgent public health threat. Worldwide dissemination of CRKp has been largely attributed to clonal group (CG) 258. However, recent evidence indicates the global emergence of a CRKp CG307 lineage. Houston, TX, is the first large city in the United States with detected cocirculation of both CRKp CG307 and CG258. We sought to characterize the genomic and clinical factors contributing to the parallel endemic spread of CG258 and CG307. CRKp isolates were collected as part of the prospective, Consortium on Resistance against Carbapenems in Klebsiella and other Enterobacterales 2 (CRACKLE-2) study. Hybrid short-read and long-read genome assemblies were generated from 119 CRKp isolates (95 originated from Houston hospitals). A comprehensive characterization of phylogenies, gene transfer, and plasmid content with pan-genome analysis was performed on all CRKp isolates. Plasmid mating experiments were performed with CG307 and CG258 isolates of interest. Dissection of the accessory genomes suggested independent evolution and limited horizontal gene transfer between CG307 and CG258 lineages. CG307 contained a diverse repertoire of mobile genetic elements, which were shared with other non-CG258 K. pneumoniae isolates. Three unique clades of Houston CG307 isolates clustered distinctly from other global CG307 isolates, indicating potential selective adaptation of particular CG307 lineages to their respective geographical niches. CG307 strains were often isolated from the urine of hospitalized patients, likely serving as important reservoirs for genes encoding carbapenemases and extended-spectrum β-lactamases. Our findings suggest parallel cocirculation of high-risk lineages with potentially divergent evolution. IMPORTANCE The prevalence of carbapenem-resistant Klebsiella pneumoniae (CRKp) infections in nosocomial settings remains a public health challenge. 
High-risk clones such as clonal group 258 (CG258) are particularly concerning due to their association with blaKPC carriage, which can severely complicate antimicrobial treatments. There is a recent emergence of clonal group 307 (CG307) worldwide with little understanding of how this successful clone has been able to adapt while cocirculating with CG258. We provide the first evidence of potentially divergent evolution between CG258 and CG307 with limited sharing of adaptive genes. Houston, TX, is home to the largest medical center in the world, with a large influx of domestic and international patients. Thus, our unique geographical setting, where two pandemic strains of CRKp are circulating, provides an indication of how differential accessory genome content can drive stable, endemic populations of CRKp. Pan-genomic analyses such as these can reveal unique signatures of successful CRKp dissemination, such as the CG307-associated plasmid (pCG307_HTX), and provide invaluable insights into the surveillance of local carbapenem-resistant Enterobacterales (CRE) epidemiology.

RevDate: 2022-03-30

Gan L, Yan C, Cui J, et al (2022)

Genetic Diversity and Pathogenic Features in Klebsiella pneumoniae Isolates from Patients with Pyogenic Liver Abscess and Pneumonia.

Microbiology spectrum [Epub ahead of print].

While Klebsiella pneumoniae is a common cause of nosocomial and community-acquired infections, including pneumonia and pyogenic liver abscess, little is known about the population structure of this bacterium. In this study, we investigated the prevalence and molecular characteristics of K. pneumoniae isolates from carriers, pyogenic liver abscess patients, and pneumonia patients, and genomic and phenotypic assays were used to determine the differences among the isolates. A total of 232 K. pneumoniae isolates were subtyped into 74 sequence types (STs). The isolates from different sources had their own STs, and the predominant subtypes in liver abscess and pneumonia patients were ST23 and ST11, respectively. Pangenome analysis also distinguished three phylogroups that were consistent with the isolate sources. The isolates collected from liver abscess patients carried significantly more virulence factors, and those from pneumonia patients harbored significantly more resistance genes and replicons. Almost all isolate STs (93/97 [95.88%]) from liver abscesses strongly correlated with the virulence factor salmochelin, while most isolate STs (52/53 [98.11%]) from pneumonia did not correlate with salmochelin. The isolates collected from liver abscesses showed higher virulence in the cytotoxicity and mouse models. These data provide genomic support for the proposal that isolates collected from carriers, liver abscess patients, and pneumonia patients have distinct genomic features. Isolates from the different sources are largely nonoverlapping, suggesting that different patients may be infected via different sources. Further studies on the pathogenic mechanisms of salmochelin and other virulence factors will be required. IMPORTANCE While Klebsiella pneumoniae is a common cause of nosocomial and community-acquired infections, including pneumonia and pyogenic liver abscess, little is known about the population structure of this bacterium.
We collected 232 isolates from carriers, pyogenic liver abscess patients, and pneumonia patients, and the isolates from different sources had their own sequence types. Pangenome analysis also distinguished three phylogroups that were consistent with the isolate sources. The isolates collected from liver abscess patients carried significantly more virulence factors, and those from pneumonia patients harbored significantly more resistance genes and replicons. Besides, there was a strong link between salmochelin and liver abscess. The isolates collected from liver abscesses also showed higher virulence in the cytotoxicity and mouse models. Isolates collected from different sources have distinct genomic features, suggesting that different patients may be infected via different sources.

RevDate: 2022-03-26

Coll F, Gouliouris T, Bruchmann S, et al (2022)

PowerBacGWAS: a computational pipeline to perform power calculations for bacterial genome-wide association studies.

Communications biology, 5(1):266.

Genome-wide association studies (GWAS) are increasingly being applied to investigate the genetic basis of bacterial traits. However, approaches to perform power calculations for bacterial GWAS are limited. Here we implemented two alternative approaches to conduct power calculations using existing collections of bacterial genomes. First, a sub-sampling approach was undertaken to reduce the allele frequency and effect size of a known and detectable genotype-phenotype relationship by modifying phenotype labels. Second, a phenotype-simulation approach was conducted to simulate phenotypes from existing genetic variants. We implemented both approaches into a computational pipeline (PowerBacGWAS) that supports power calculations for burden testing, pan-genome and variant GWAS; and applied it to collections of Enterococcus faecium, Klebsiella pneumoniae and Mycobacterium tuberculosis. We used this pipeline to determine sample sizes required to detect causal variants of different minor allele frequencies (MAF), effect sizes and phenotype heritability, and studied the effect of homoplasy and population diversity on the power to detect causal variants. Our pipeline and user documentation are made available and can be applied to other bacterial populations. PowerBacGWAS can be used to determine sample sizes required to find statistically significant associations, or the associations detectable with a given sample size. We recommend performing power calculations using existing genomes of the bacterial species and population of study.

RevDate: 2022-03-26

Kim E, Kim D, Yang SM, et al (2022)

Validation of probiotic species or subspecies identity in commercial probiotic products using high-resolution PCR method based on large-scale genomic analysis.

Food research international (Ottawa, Ont.), 154:111011.

The health-promoting effects of probiotics are species-specific, and hence it is important to declare the correct information in products. However, some studies have identified issues related to the accuracy of labeling commercial probiotic products. In this study, we developed a high-resolution real-time PCR method based on pangenome analysis for a more affordable, rapid, and accurate identification of commercial probiotic products than sequencing methods. We selected as targets 25 species or subspecies that are primarily used as probiotic strains or are closely related to them. To extract molecular markers, 354 whole-genome sequences were compared to find genes present in the target genomes but absent from the pangenome of the other genomes, which resulted in the identification of molecular marker genes. The marker genes exhibited 100% specificity for 100 strains as assessed by the real-time PCR method. Fifty probiotic and dairy products were investigated to verify the information claimed on the label. Real-time PCR results showed that most products reflected the bacterial species declared in the label claim, whereas 12 products showed the presence of undeclared species or missing species. Our method for accurately verifying the labeling of probiotic products would be useful for quality control and safety.
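
The marker-selection step described in this abstract can be read as a set operation: keep the genes shared by every genome of the target species and remove anything seen in the pangenome of the remaining genomes. The Python sketch below shows that reading under stated assumptions; the function name and gene identifiers are hypothetical, and the study's actual pipeline and thresholds are not given in the abstract.

```python
def marker_genes(target_genomes, other_genomes):
    """Return genes found in every target genome and in none of the other genomes.

    Each genome is represented as a set of gene identifiers.
    """
    if not target_genomes:
        return set()
    shared_by_targets = set.intersection(*target_genomes)
    background_pangenome = set().union(*other_genomes)
    return shared_by_targets - background_pangenome


# Toy gene sets (hypothetical identifiers).
targets = [{"g1", "g2", "g3"}, {"g1", "g2", "g4"}]
others = [{"g2", "g5"}, {"g4", "g6"}]

markers = marker_genes(targets, others)
print(sorted(markers))
```

Here g2 is shared by both target genomes but also occurs in a non-target genome, so only g1 remains as a candidate marker.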

RevDate: 2022-03-26

Rivera-Ramírez A, Salgado-Morales R, Jiménez-Pérez A, et al (2022)

Comparative Genomics and Pathogenicity Analysis of Two Bacterial Symbionts of Entomopathogenic Nematodes: The Role of the GroEL Protein in Virulence.

Microorganisms, 10(3): pii:microorganisms10030486.

Bacteria of the genera Xenorhabdus and Photorhabdus are symbionts of entomopathogenic nematodes. Despite their close phylogenetic relationship, they show differences in their pathogenicity and virulence mechanisms in target insects. These differences were explored by the analysis of the pangenome, as it provides a framework for characterizing and defining the gene repertoire. We performed the first pangenome analysis of 91 strains of Xenorhabdus and Photorhabdus; the analysis showed that the Photorhabdus genus has a higher number of genes associated with pathogenicity. However, biological tests showed that whole cells of X. nematophila SC 0516 were more virulent than those of P. luminescens HIM3 when both were injected into G. mellonella larvae. In addition, we cloned and expressed the GroEL proteins of both bacteria, as this protein has been previously indicated to show insecticidal activity in the genus Xenorhabdus. Among these proteins, Cpn60-Xn was found to be the most toxic at all concentrations tested, with an LC50 value of 102.34 ng/larva. Sequence analysis suggested that the Cpn60-Xn toxin was homologous to Cpn60-Pl; however, Cpn60-Xn contained thirty-five differentially substituted amino acid residues that could be responsible for its insecticidal activity.

RevDate: 2022-03-26

Moreno E, Blasco JM, Letesson JJ, et al (2022)

Pathogenicity and Its Implications in Taxonomy: The Brucella and Ochrobactrum Case.

Pathogens (Basel, Switzerland), 11(3): pii:pathogens11030377.

The intracellular pathogens of the genus Brucella are phylogenetically close to Ochrobactrum, a diverse group of free-living bacteria with a few species occasionally infecting medically compromised patients. A group of taxonomists recently included all Ochrobactrum organisms in the genus Brucella based on global genome analyses and alleged equivalences with genera such as Mycobacterium. Here, we demonstrate that such equivalencies are incorrect because they overlook the complexities of pathogenicity. By summarizing Brucella and Ochrobactrum divergences in lifestyle, structure, physiology, population, closed versus open pangenomes, genomic traits, and pathogenicity, we show that when they are adequately understood, they are highly relevant in taxonomy and not unidimensional quantitative characters. Thus, the Ochrobactrum and Brucella differences are not limited to their assignments to different "risk-groups", a biologically (and hence, taxonomically) oversimplified description that, moreover, does not support ignoring the nomen periculosum rule, as proposed. Since the epidemiology, prophylaxis, diagnosis, and treatment are thoroughly unrelated, merging free-living Ochrobactrum organisms with highly pathogenic Brucella organisms brings evident risks for veterinarians, medical doctors, and public health authorities who confront brucellosis, a significant zoonosis worldwide. Therefore, from taxonomical and practical standpoints, the Brucella and Ochrobactrum genera must be maintained apart. Consequently, we urge researchers, culture collections, and databases to keep their canonical nomenclature.

RevDate: 2022-03-25

Qi W, Lim YW, Patrignani A, et al (2022)

The haplotype-resolved chromosome pairs of a heterozygous diploid African cassava cultivar reveal novel pan-genome and allele-specific transcriptome features.

GigaScience, 11:.

BACKGROUND: Cassava (Manihot esculenta) is an important clonally propagated food crop in tropical and subtropical regions worldwide. Genetic gain by molecular breeding has been limited, partially because cassava is a highly heterozygous crop with a repetitive and difficult-to-assemble genome.

FINDINGS: Here we demonstrate that Pacific Biosciences high-fidelity (HiFi) sequencing reads, in combination with the assembler hifiasm, produced genome assemblies at near complete haplotype resolution with higher continuity and accuracy compared to conventional long sequencing reads. We present 2 chromosome-scale haploid genomes phased with Hi-C technology for the diploid African cassava variety TME204. With consensus accuracy >QV46, contig N50 >18 Mb, BUSCO completeness of 99%, and 35k phased gene loci, it is the most accurate, continuous, complete, and haplotype-resolved cassava genome assembly so far. Ab initio gene prediction with RNA-seq data and Iso-Seq transcripts identified abundant novel gene loci, with enriched functionality related to chromatin organization, meristem development, and cell responses. During tissue development, differentially expressed transcripts of different haplotype origins were enriched for different functionality. In each tissue, 20-30% of transcripts showed allele-specific expression (ASE) differences. ASE bias was often tissue specific and inconsistent across different tissues. Direction-shifting was observed in <2% of the ASE transcripts. Despite high gene synteny, the HiFi genome assembly revealed extensive chromosome rearrangements and abundant intra-genomic and inter-genomic divergent sequences, with large structural variations mostly related to LTR retrotransposons. We use the reference-quality assemblies to build a cassava pan-genome and demonstrate its importance in representing the genetic diversity of cassava for downstream reference-guided omics analysis and breeding.

CONCLUSIONS: The phased and annotated chromosome pairs allow a systematic view of the heterozygous diploid genome organization in cassava with improved accuracy, completeness, and haplotype resolution. They will be a valuable resource for cassava breeding and research. Our study may also provide insights into developing cost-effective and efficient strategies for resolving complex genomes with high resolution, accuracy, and continuity.

RevDate: 2022-03-25

Chen Y, Ji S, Sun L, et al (2022)

The novel fosfomycin resistance gene fosY is present on a genomic island in CC1 methicillin-resistant Staphylococcus aureus.

Emerging microbes & infections [Epub ahead of print].

Fosfomycin has gained attention as a combination therapy for methicillin-resistant Staphylococcus aureus infections. Hence, the detection of novel fosfomycin-resistance mechanisms in S. aureus is important. Here, the minimal inhibitory concentrations (MICs) of fosfomycin in CC1 methicillin-resistant S. aureus were determined. The pangenome analysis and comparative genomics were used to analyse CC1 MRSA. The gene function was confirmed by cloning the gene into pTXΔ. A phylogenetic tree was constructed to determine the clustering of the CC1 strains of S. aureus. We identified a novel gene, designated fosY, that confers fosfomycin resistance in S. aureus. The FosY protein is a putative bacillithiol transferase enzyme sharing 65.9% to 77.5% amino acid identity with FosB and FosD, respectively. The function of fosY in decreasing fosfomycin susceptibility was confirmed by cloning it into pTXΔ. The pTX-fosY transformant exhibited a 16-fold increase in fosfomycin MIC. The bioinformatic analysis showed that fosY is in a novel genomic island designated RIfosY (for "resistance island carrying fosY") that originated from other species. The global phylogenetic tree of ST1 MRSA displayed this fosY-positive ST1 clone, originating from different regions, in the same clade. The novel resistance gene in the fos family, fosY, and a genomic island, RIfosY, can promote cross-species gene transfer and confer resistance to CC1 MRSA causing the failure of clinical treatment. This emphasises the importance of genetic surveillance of resistance genes among MRSA isolates.

RevDate: 2022-03-25

Shang Y, Ye Q, Wu Q, et al (2022)

Novel multiplex PCR assays for rapid identification of Salmonella serogroups B, C1, C2, D, E, S. enteritidis, and S. typhimurium.

Analytical methods : advancing methods and applications [Epub ahead of print].

Foodborne illnesses caused by Salmonella represent a significant public health problem worldwide. The aim of this study was to establish multiplex PCR (mPCR) for the rapid identification of Salmonella serogroups B, C1, C2, D, and E as well as for the serovars enteritidis and typhimurium. Employing pan-genome analysis and PCR verification, B-rfbJ, C1-9679, C2-pimB, D-rfbJ, E-rfbC, and four genes (SE18636, SE16574, SE2599, and SE13329) were identified as specific target genes for Salmonella serogroups B, C1, C2, D, E, and S. enteritidis, respectively. Thereafter, three novel mPCR assays (one of 3-mPCR and two of 2-mPCR) were successfully developed to identify these bacteria based on the target genes and another S. typhimurium-specific STM4495 gene. The primers targeting C1-9679, C2-pimB, and E-rfbC genes specific to the serogroups C1, C2, and E, respectively, constituted a 3-mPCR, while the other two 2-mPCRs, respectively, consisting primers specific to serogroup D and S. enteritidis (D-rfbJ and SE16574), and serogroup B and S. typhimurium-specific primers (B-rfbJ and STM4495), were also designed. The specificity of each mPCR was further evaluated by using non-target strains. The detection limits of mPCRs were approximately 10³-10⁴ CFU mL⁻¹ in pure culture and 10⁴-10⁵ CFU g⁻¹ in spiked chicken meat. In addition, mPCR assays could correctly detect target Salmonella in food samples. These results suggest that specific targets could be mined efficiently through a pan-genome analysis tool, and the novel mPCR assays developed in this study offer a promising technique for rapid and accurate detection of five serogroups of Salmonella (B, C1, C2, D, and E) and two serovars (S. enteritidis and S. typhimurium).

RevDate: 2022-03-25

Liu W, Yu SH, Zhang HP, et al (2022)

Two Cladosporium Fungi with Opposite Functions to the Chinese White Wax Scale Insect Have Different Genome Characters.

Journal of fungi (Basel, Switzerland), 8(3): pii:jof8030286.

Insects encounter infection of microorganisms, and they also harbor endosymbiosis to participate in nutrition providing and act as a defender against pathogens. We previously found the Chinese white wax scale insect, Ericerus pela, was infected and killed by Cladosporium sp. (pathogen). We also found it harbored Cladosporium sp. (endogensis). In this study, we cultured these two Cladosporium fungi and sequenced their genome. The results showed Cladosporium sp. (endogensis) has a larger genome size and more genes than Cladosporium sp. (pathogen). Pan-genome analysis showed Cladosporium sp. (endogensis)-specific genes enriched in pathways related to nutrition production, such as amino acid metabolism, carbohydrate metabolism, and energy metabolism. These pathways were absent in that of Cladosporium sp. (pathogen). Gene Ontology analysis showed Cladosporium sp. (pathogen)-specific genes enriched in the biosynthesis of asperfuranone, emericellamide, and fumagillin. These terms were not found in that of Cladosporium sp. (endogensis). Pathogen Host Interactions analysis found Cladosporium sp. (endogensis) had more genes related to loss of pathogenicity and reduced virulence than Cladosporium sp. (pathogen). Cytotoxicity assay indicated Cladosporium sp. (pathogen) had cytotoxicity, while Cladosporium sp. (endogensis) had no cytotoxicity. These characters reflect the adaptation of endosymbiosis to host-restricted lifestyle and the invader of the entomopathogen to the host.

RevDate: 2022-03-25

Tenea GN (2022)

Decoding the Gene Variants of Two Native Probiotic Lactiplantibacillus plantarum Strains through Whole-Genome Resequencing: Insights into Bacterial Adaptability to Stressors and Antimicrobial Strength.

Genes, 13(3): pii:genes13030443.

In this study, whole-genome resequencing of two native probiotic Lactiplantibacillus plantarum strains, UTNGt21A and UTNGt2, was assessed in order to identify variants and perform annotation of genes involved in bacterial adaptability to different stressors, as well as their antimicrobial strength. A total of 21,906 single-nucleotide polymorphisms (SNPs) were detected in UTNGt21A, while 17,610 were disclosed in the UTNGt2 genome. The comparative genomic analysis revealed a greater number of deletions, transversions, and transitions within the UTNGt21A genome, while a small difference in the number of insertions was detected between the strains. A divergent number of types of variant annotations were detected in both strains, and categorized in terms of low, moderate, and high modifier impact on the protein effectiveness. Although both native strains shared common specific genes involved in the stress response to the gastrointestinal environment, which may qualify as a putative probiotic (bile salt, acid, temperature, osmotic stress), they were different in their antimicrobial gene cluster organization, with UTNGt21A displaying a complex bacteriocin gene arrangement and dissimilar gene variants that might alter their defense mechanisms and overall inhibitory capacity. The genome comparison revealed 34 and 9 genomic islands (GIs) in the UTNGt21A and UTNGt2 genomes, respectively, with the overrepresentation of genes involved in defense mechanisms and carbohydrate utilization. In addition, pan-genome analysis disclosed the presence of various strain-specific genes (shell genes), suggesting a high genome variation between strains. This genome analysis illustrates that the bacteriocin signature and gene variants reflect a niche-inherent pattern. These extensive genomic datasets will guide us to understand the potential benefits of the native strains and their utility in the food or pharmaceutical sectors.

RevDate: 2022-03-25

Pudova DS, Toymentseva AA, Gogoleva NE, et al (2022)

Comparative Genome Analysis of Two Bacillus pumilus Strains Producing High Level of Extracellular Hydrolases.

Genes, 13(3): pii:genes13030409.

Whole-genome sequencing of a soil isolate Bacillus pumilus, strain 7P, and its streptomycin-resistant derivative, B. pumilus 3-19, showed genome sizes of 3,609,117 bp and 3,609,444 bp, respectively. Annotation of the genome showed 3794 CDS (3204 with predicted function) and 3746 CDS (3173 with predicted function) in the genome of strains 7P and 3-19, respectively. In the genomes of both strains, the prophage regions Bp1 and Bp2 were identified. These include 52 ORF of prophage proteins in the Bp1 region and 38 prophages ORF in the Bp2 region. Interestingly, more than 50% of Bp1 prophage proteins are similar to the proteins of the phi105 in B. subtilis. The DNA region of Bp2 has 15% similarity to the DNA of the Brevibacillus Jimmer phage. Degradome analysis of the genome of both strains revealed 148 proteases of various classes. These include 60 serine proteases, 48 metalloproteases, 26 cysteine proteases, 4 aspartate proteases, 2 asparagine proteases, 3 threonine proteases, and 2 unclassified proteases. Likewise, three inhibitors of proteolytic enzymes were found. Comparative analysis of variants in the genomes of strains 7P and 3-19 showed the presence of 81 nucleotide variants in the genome 3-19. Among them, the missense mutations in the rpsL, comA, spo0F genes and in the upstream region of the srlR gene were revealed. These nucleotide polymorphisms may have affected the streptomycin resistance and overproduction of extracellular hydrolases of the 3-19 strain. Finally, a plasmid DNA was found in strain 7P, which is lost in its derivative, strain 3-19. This plasmid contains five coding DNA sequencing (CDS), two regulatory proteins and three hypothetical proteins.

RevDate: 2022-03-24

Kim E, Kim D, Yang SM, et al (2022)

Multiplex SYBR Green real-time PCR for Lactobacillus acidophilus group species targeting biomarker genes revealed by a pangenome approach.

Microbiological research, 259:127013 pii:S0944-5013(22)00053-2 [Epub ahead of print].

The Lactobacillus acidophilus group consists of seven closely related species. Among these, Lb. acidophilus, Lb. gallinarum, and Lb. helveticus help maintain gut health and are used as a starter for fermented food. However, these species are difficult to differentiate using conventional methods due to the high similarity between the 16S rRNA and housekeeping genes. Thus, in this study, we selected biomarker genes to identify and discriminate the three species via pangenome analysis, and a multiplex SYBR Green real-time PCR that can be detected simultaneously in a single tube was developed. Pangenome analysis revealed three specific target genes: mucus-binding protein precursor to detect Lb. acidophilus, an amino acid ABC superfamily ATP binding cassette transporter carrier protein to detect Lb. gallinarum, and selenocysteine lyase to detect Lb. helveticus. The specificity was robustly verified using 26 Lb. acidophilus group strains and 62 other strains. The detection limits were 10¹ colony-forming units (CFU)/ml in pure culture. As per our findings, the developed method satisfactorily monitored Lb. acidophilus group species in probiotic and dairy products. This result suggests that real-time PCR based on specific targets provides a promising approach for the rapid, accurate, and sensitive identification of these three species.

RevDate: 2022-03-24

Li M, Sun C, Xu N, et al (2022)

De novo assembly of 20 chicken genomes reveals the undetectable phenomenon for thousands of core genes on micro-chromosomes and sub-telomeric regions.

Molecular biology and evolution pii:6553873 [Epub ahead of print].

The gene numbers and evolutionary rates of birds were assumed to be much lower than those of mammals, which is in sharp contrast to the huge species number and morphological diversity of birds. It is therefore necessary to construct a complete avian genome and analyze its evolution. We constructed a chicken pan-genome from 20 de novo assembled genomes with high sequencing depth, and identified 1,335 protein-coding genes and 3,011 long noncoding RNAs not found in GRCg6a. The majority of these novel genes were detected across most individuals of the examined transcriptomes but were seldomly measured in each of the DNA sequencing data regardless of Illumina or PacBio technology. Furthermore, different from previous pan-genome models, most of these novel genes were overrepresented on chromosomal sub-telomeric regions and micro-chromosomes, surrounded by extremely high proportions of tandem repeats, which strongly blocks DNA sequencing. These hidden genes were proved to be shared by all chicken genomes, included many housekeeping genes, and enriched in immune pathways. Comparative genomics revealed the novel genes had three-fold elevated substitution rates than known ones, updating the knowledge about evolutionary rates in birds. Our study provides a framework for constructing a better chicken genome, which will contribute towards the understanding of avian evolution and improvement of poultry breeding.

RevDate: 2022-03-24

Khedkar S, Smyshlyaev G, Letunic I, et al (2022)

Landscape of mobile genetic elements and their antibiotic resistance cargo in prokaryotic genomes.

Nucleic acids research pii:6552054 [Epub ahead of print].

Prokaryotic Mobile Genetic Elements (MGEs) such as transposons, integrons, phages and plasmids, play important roles in prokaryotic evolution and in the dispersal of cargo functions like antibiotic resistance. However, each of these MGE types is usually annotated and analysed individually, hampering a global understanding of phylogenetic and environmental patterns of MGE dispersal. We thus developed a computational framework that captures diverse MGE types, their cargos and MGE-mediated horizontal transfer events, using recombinases as ubiquitous MGE marker genes and pangenome information for MGE boundary estimation. Applied to ∼84k genomes with habitat annotation, we mapped 2.8 million MGE-specific recombinases to six operational MGE types, which together contain on average 13% of all the genes in a genome. Transposable elements (TEs) dominated across all taxa (∼1.7 million occurrences), outnumbering phages and phage-like elements (<0.4 million). We recorded numerous MGE-mediated horizontal transfer events across diverse phyla and habitats involving all MGE types, disentangled and quantified the extent of hitchhiking of TEs (17%) and integrons (63%) with other MGE categories, and established TEs as dominant carriers of antibiotic resistance genes. We integrated all these findings into a resource (, which should facilitate future studies on the large mobile part of genomes and its horizontal dispersal.

RevDate: 2022-03-23

Liu J, Xu Z, Li H, et al (2022)

Metagenomic Approaches Reveal Strain Profiling and Genotyping of Klebsiella pneumoniae from Hospitalized Patients in China.

Microbiology spectrum [Epub ahead of print].

Klebsiella pneumoniae is a leading cause of highly drug-resistant infections in hospitals worldwide. Strain-level bacterial identification on the genetic determinants of multidrug resistance and high pathogenicity is critical for the surveillance and treatment of this clinically relevant pathogen. In this study, metagenomic next-generation sequencing was performed for specimens collected from August 2020 to May 2021 in Ruijin Hospital, Ningbo Women and Children's Hospital, and the Second Affiliated Hospital of Harbin Medical University. Genome biology of K. pneumoniae prevalent in China was characterized based on metagenomic data. Thirty K. pneumoniae strains derived from 14 sequence types were identified by multilocus sequence typing. The hypervirulent ST11 K. pneumoniae strains carrying the KL64 capsular locus were the most prevalent in the hospital population. The phylogenomic analyses revealed that the metagenome-reconstructed strains and public isolate genomes belonging to the same STs were closely related in the phylogenetic tree. Furthermore, the pangenome structure of the detected K. pneumoniae strains was analyzed, particularly focusing on the distribution of antimicrobial resistance genes and virulence genes across the strains. The genes encoding carbapenemases and extended-spectrum beta-lactamases were frequently detected in the strains of ST11 and ST15. The highest numbers of virulence genes were identified in the well-known hypervirulent strains affiliated to ST23 bearing the K1 capsule. In comparison to traditional cultivation and identification, strain-level metagenomics is advantageous to understand the mechanisms underlying resistance and virulence of K. pneumoniae directly from clinical specimens. Our findings should provide novel clues for future research into culture-independent metagenomic surveillance for bacterial pathogens. 
IMPORTANCE Routine culture and PCR-based molecular testing in the clinical microbiology laboratory are unable to recognize pathogens at the strain level and to detect strain-specific genetic determinants involved in virulence and resistance. To address this issue, we explored the strain-level profiling of K. pneumoniae prevalent in China based on metagenome-sequenced patient materials. Genome biology of the targeted bacterium can be well characterized through decoding sequence signatures and functional gene profiles at the single-strain resolution. The in-depth metagenomic analysis on strain profiling presented here shall provide a promising perspective for culture-free pathogen surveillance and molecular epidemiology of nosocomial infections.

RevDate: 2022-03-23

Chapman MA, He Y, M Zhou (2022)

Beyond a reference genome: pangenomes and population genomics of underutilized and orphan crops for future food and nutrition security.

The New phytologist [Epub ahead of print].

Underutilized crops are, by definition, under-researched compared to staple crops yet come with traits that may be especially important given climate change and the need to feed a globally increasing population. These crops are often stress-tolerant, and this combined with unique and beneficial nutritional profiles. Whilst progress is being made by generating reference genome sequences, in this Tansley Review, we show how this is only the very first step. We advocate that going 'beyond a reference genome' should be a priority, as it is only at this stage one can identify the specific genes and the adaptive alleles that underpin the valuable traits. We sum up how population genomic and pangenomic approaches have led to the identification of stress- and disease-tolerant alleles in staple crops and compare this to the small number of examples from underutilized crops. We also demonstrate how previously underutilized crops have benefitted from genomic advances and that many breeding targets in underutilized crops are often well studied in staple crops. This cross-crop population-level resequencing could lead to an understanding of the genetic basis of adaptive traits in underutilized crops. This level of investment may be crucial for fully understanding the value of these crops before they are lost.

RevDate: 2022-03-22

Monir MM, Hossain T, Morita M, et al (2022)

Genomic Characteristics of Recently Recognized Vibrio cholerae El Tor Lineages Associated with Cholera in Bangladesh, 1991 to 2017.

Microbiology spectrum [Epub ahead of print].

Comparative genomic analysis of Vibrio cholerae El Tor associated with endemic cholera in Asia revealed two distinct lineages, one dominant in Bangladesh and the other in India. An in-depth whole-genome study of V. cholerae El Tor strains isolated during endemic cholera in Bangladesh (1991 to 2017) included reference genome sequence data obtained online. Core genome phylogeny established using single nucleotide polymorphisms (SNPs) showed V. cholerae El Tor strains comprised two lineages, BD-1 and BD-2, which, according to Bayesian phylodynamic analysis, originated from paraphyletic group BD-0 around 1981. BD-1 and BD-2 lineages overlapped temporally but were negatively associated as causative agents of cholera during 2004 to 2017. Genome-wide association study (GWAS) revealed 140 SNPs and 31 indels, resulting in gene alleles unique to BD-1 and BD-2. Regression analysis of root to tip distance and year of isolation indicated early BD-0 strains at the base, whereas BD-1 and BD-2 subsequently emerged and progressed by accumulating SNPs. Pangenome analysis provided evidence of gene acquisition by both BD-1 and BD-2, of which six crucial proteins of known function were predominant in BD-2. BD-1 and BD-2 diverged and have distinctively different genomic traits, namely, heterogeneity in VSP-2, VPI-1, mobile elements, toxin encoding elements, and total gene abundance. In addition, the observed phage-inducible chromosomal island-like element (PLE1), and SXT ICE elements (ICETET) in BD-2 presumably provided a fitness advantage for the lineage to outcompete BD-1 as the etiological agent of endemic cholera in Bangladesh, with implications for global cholera epidemiology.
IMPORTANCE Cholera is a global disease with specific reference to the Bay of Bengal Ganges Delta where Vibrio cholerae O1 El Tor, the causative agent of the disease showed two circulating lineages, one dominant in Bangladesh and the other in India. Results of an in-depth genomic study of V. cholerae associated with endemic cholera during the past 27 years (1991 to 2017) indicate emergence and succession of the two lineages, BD-1 and BD-2, arising from a common ancestral paraphyletic group, BD-0, comprising the early strains and short-term evolution of the bacterium in Bangladesh. Among the two V. cholerae lineages, BD-2 supersedes BD-1 and is predominant in the most recent endemic cholera in Bangladesh. The BD-2 lineage contained significantly more SNPs and indels, and showed richness in gene abundance, including antimicrobial resistance genes, gene cassettes, and PLE to fight against bacteriophage infection, acquired over time. These findings have important epidemic implications on a global scale.

RevDate: 2022-03-21

Kahn AK, RPP Almeida (2022)

Phylogenetics of Historical Host Switches in a Bacterial Plant Pathogen.

Applied and environmental microbiology [Epub ahead of print].

Xylella fastidiosa is an insect-transmitted bacterial plant pathogen found across the Americas and, more recently, worldwide. X. fastidiosa infects plants of at least 563 species belonging to 82 botanical families. While the species X. fastidiosa infects many plants, particular strains have increased plant specificity. Understanding the molecular underpinnings of plant host specificity in X. fastidiosa is vital for predicting host shifts and epidemics. While there may exist multiple genetic determinants of host range in X. fastidiosa, the drivers of the unique relationships between X. fastidiosa and its hosts should be elucidated. Our objective with this study was to predict the ancestral plant hosts of this pathogen using phylogenetic and genomic methods based on a large data set of pathogen whole-genome data from agricultural hosts. We used genomic data to construct maximum-likelihood (ML) phylogenetic trees of subsets of the core and pan-genomes. With those trees, we ran ML ancestral state reconstructions of plant host at two taxonomic scales (genus and multiorder clades). Both the core and pan-genomes were informative in terms of predicting ancestral host state, giving new insight into the history of the plant hosts of X. fastidiosa. Subsequently, gene gain and loss in the pan-genome were found to be significantly correlated with plant host through genes that had statistically significant associations with particular hosts.
IMPORTANCE Xylella fastidiosa is a globally important bacterial plant pathogen with many hosts; however, the underpinnings of host specificity are not known. This paper contains important findings about the usage of phylogenetics to understand the history of host specificity in this bacterial species, as well as convergent evolution in the pan-genome. There are strong signals of historical host range that give us insights into the history of this pathogen and its various invasions. The data from this paper are relevant in making decisions for quarantine and eradication, as they show the historical trends of host switching, which can help us predict likely future host shifts. We also demonstrate that using multilocus sequence type (MLST) genes in this system, which is still a commonly used process for policymaking, does not reconstruct the same phylogenetic topology as whole-genome data.

RevDate: 2022-03-19

Estrada AA, Gottschalk M, Gebhart CJ, et al (2022)

Comparative analysis of Streptococcus suis genomes identifies novel candidate virulence-associated genes in North American isolates.

Veterinary research, 53(1):23.

Streptococcus suis is a significant economic and welfare concern in the swine industry. Pan-genome analysis provides an in-silico approach for the discovery of genes involved in pathogenesis in bacterial pathogens. In this study, we performed pan-genome analysis of 208 S. suis isolates classified into the pathogenic, possibly opportunistic, and commensal pathotypes to identify novel candidate virulence-associated genes (VAGs) of S. suis. Using chi-square tests and LASSO regression models, three accessory pan-genes corresponding to S. suis strain P1/7 markers SSU_RS09525, SSU_RS09155, and SSU_RS03100 (>95% identity) were identified as having a significant association with the pathogenic pathotype. The proposed novel SSU_RS09525+/SSU_RS09155+/SSU_RS03100+ genotype identified 96% of the pathogenic pathotype strains, suggesting a novel genotyping scheme for predicting the pathogenicity of S. suis isolates in North America. In addition, mobile genetic elements carrying antimicrobial resistance genes (ARGs) and VAGs were identified but did not appear to play a major role in the spread of ARGs and VAGs.

RevDate: 2022-03-17

Silva M, Pontes A, Franco-Duarte R, et al (2022)

A glimpse at an early stage of microbe domestication revealed in the variable genome of Torulaspora delbrueckii, an emergent industrial yeast.

Molecular ecology [Epub ahead of print].

Microbe domestication has a major applied relevance but is still poorly understood from an evolutionary perspective. The yeast Torulaspora delbrueckii is gaining importance for biotechnology but little is known about its population structure, variation in gene content, or possible domestication routes. Here, we show that T. delbrueckii is composed of five major clades. Among the three European clades, a lineage associated with the wild arboreal niche is sister to the two other lineages that are linked with anthropic environments, one to wine fermentations and the other to diverse sources including dairy products and bread dough (Mix-Anthropic clade). Using 64 genomes we assembled the pangenome and the variable genome of T. delbrueckii. A comparison with Saccharomyces cerevisiae indicated that the weight of the variable genome in the pangenome of T. delbrueckii is considerably smaller. An association of gene content and ecology supported the hypothesis that the Mix-Anthropic clade has the most specialized genome and indicated that some of the exclusive genes were implicated in galactose and maltose utilization. More detailed analyses traced the acquisition of a cluster of GAL genes in strains associated with dairy products and the expansion and functional diversification of MAL genes in strains isolated from bread dough. Contrary to S. cerevisiae, domestication in T. delbrueckii is not primarily driven by alcoholic fermentation but rather by adaptation to dairy and bread-production niches. This study expands our views on the processes of microbe domestication and on the trajectories leading to adaptation to anthropic niches.

RevDate: 2022-03-16

Roux E, Nicolas A, Valence F, et al (2022)

The genomic basis of the Streptococcus thermophilus health-promoting properties.

BMC genomics, 23(1):210.

BACKGROUND: Streptococcus thermophilus is a Gram-positive bacterium widely used as starter in the dairy industry as well as in many traditional fermented products. In addition to its technological importance, it has also gained interest in recent years as beneficial bacterium due to human health-promoting functionalities. The objective of this study was to inventory the main health-promoting properties of S. thermophilus and to study their intra-species diversity at the genomic and genetic level within a collection of representative strains.

RESULTS: In this study various health-related functions were analyzed at the genome level from 79 genome sequences of strains isolated over a long time period from diverse products and different geographic locations. While some functions are widely conserved among isolates (e.g., degradation of lactose, folate production) suggesting their central physiological and ecological role for the species, others including the tagatose-6-phosphate pathway involved in the catabolism of galactose, and the production of bioactive peptides and gamma-aminobutyric acid are strain-specific. Most of these strain-specific health-promoting properties seems to have been acquired via horizontal gene transfer events. The genetic basis for the phenotypic diversity between strains for some health related traits have also been investigated. For instance, substitutions in the galK promoter region correlate with the ability of some strains to catabolize galactose via the Leloir pathway. Finally, the low occurrence in S. thermophilus genomes of genes coding for biogenic amine production and antibiotic resistance is also a contributing factor to its safety status.

CONCLUSIONS: The natural intra-species diversity of S. thermophilus, therefore, represents an interesting source for innovation in the field of fermented products enriched for healthy components that can be exploited to improve human health. A better knowledge of the health-promoting properties and their genomic and genetic diversity within the species may facilitate the selection and application of strains for specific biotechnological and human health-promoting purpose. Moreover, by pointing out that a substantial part of its functional potential still defies us, our work opens the way to uncover additional health-related functions through the intra-species diversity exploration of S. thermophilus by comparative genomics approaches.

RevDate: 2022-03-15

Zhuang Y, Wang X, Li X, et al (2022)

Phylogenomics of the genus Glycine sheds light on polyploid evolution and life-strategy transition.

Nature plants [Epub ahead of print].

Polyploidy and life-strategy transitions between annuality and perenniality often occur in flowering plants. However, the evolutionary propensities of polyploids and the genetic bases of such transitions remain elusive. We assembled chromosome-level genomes of representative perennial species across the genus Glycine including five diploids and a young allopolyploid, and constructed a Glycine super-pangenome framework by integrating 26 annual soybean genomes. These perennial diploids exhibit greater genome stability and possess fewer centromere repeats than the annuals. Biased subgenomic fractionation occurred in the allopolyploid, primarily by accumulation of small deletions in gene clusters through illegitimate recombination, which was associated with pre-existing local subgenomic differentiation. Two genes annotated to modulate vegetative-reproductive phase transition and lateral shoot outgrowth were postulated as candidates underlying the perenniality-annuality transition. Our study provides insights into polyploid genome evolution and lays a foundation for unleashing genetic potential from the perennial gene pool for soybean improvement.

RevDate: 2022-03-15

Cao H, Xu D, Zhang T, et al (2022)

Comprehensive and functional analyses reveal the genomic diversity and potential toxicity of Microcystis.

Harmful algae, 113:102186.

Microcystis is a cyanobacterial genus that is widely distributed across the world. It has attracted great attention because it produces the hepatotoxin microcystin (MC), which can inhibit eukaryotic protein phosphatases and pose a great risk to animal and human health. Due to the high diversity of morphospecies and genomes, it is still difficult to classify Microcystis species. In this study, we investigated the pangenome of 23 Microcystis strains to detect the genetic diversity and evolutionary dynamics. Microcystis revealed an open pangenome containing 22,009 gene families and exhibited different functional constraints. The core-genome phylogenetic analysis accurately differentiated the toxic and nontoxic strains and could be used as a taxonomic standard at the genetic level. We also investigated the functions of HGT events, most of which were conferred from cyanobacteria and closely related species. In order to detect the potential toxicity of Microcystis, we searched and characterized MC biosynthetic gene clusters and other secondary metabolite gene clusters. Our work provides insights into the genetic diversity, evolutionary dynamics, and potential toxicity of Microcystis, which could benefit the species classification and development of new methods for drinking water quality control and management of bloom formation in the future.

RevDate: 2022-03-14

Yan W, Feng X, Lin TH, et al (2022)

Diverse Subclade Differentiation Attributed to the Ubiquity of Prochlorococcus High-Light-Adapted Clade II.

mBio [Epub ahead of print].

Prochlorococcus is the key primary producer in marine ecosystems, and the high-light-adapted clade II (HLII) is the most abundant ecotype. However, the genomic and ecological basis of Prochlorococcus HLII in the marine environment has remained elusive. Here, we show that the ecologically coherent subclade differentiation of HLII corresponds to genomic and ecological characteristics on the basis of analyses of 31 different strains of HLII, including 12 novel isolates. Different subclades of HLII with different core and accessory genes were identified, and their distribution in the marine environment was explored using the TARA Oceans metagenome database. Three major subclade groups were identified, viz., the surface group (HLII-SG), the transition group (HLII-TG), and the deep group (HLII-DG). These subclade groups showed different temperature ranges and optima for distribution. In regression analyses, temperature and nutrient availability were identified as key factors affecting the distribution of HLII subclades. A 35% increase in the relative abundance of HLII-SG by the end of the 21st century was predicted under the Representative Concentration Pathway 8.5 scenario. Our results show that the ubiquity and distribution of Prochlorococcus HLII in the marine environment are associated with the differentiation of diverse subclades. These findings provide insights into the large-scale shifts in the Prochlorococcus community in response to future climate change.

IMPORTANCE: Prochlorococcus is the most abundant oxygenic photosynthetic microorganism on Earth, and high-light-adapted clade II (HLII) is the dominant ecotype. However, the factors behind the dominance of HLII in the vast oligotrophic oceans are still unknown. Here, we identified three distinct groups of HLII subclades, viz., the surface group (HLII-SG), the transition group (HLII-TG), and the deep group (HLII-DG). We further demonstrated that the ecologically coherent subclade differentiation of HLII corresponds to genomic and ecological characteristics. Our study suggests that the differentiation of diverse subclades underlies the ubiquity and distribution of Prochlorococcus HLII in the marine environment and provides insights into the shifts in the Prochlorococcus community in response to future climate change.

RevDate: 2022-03-12

Cai X, Lin R, Liang J, et al (2022)

Transposable element insertion: a hidden major source of domesticated phenotypic variation in Brassica rapa.

Plant biotechnology journal [Epub ahead of print].

Transposable elements (TEs) are prevalent in plant genomes. However, studies on their impact on phenotypic evolution in crop plants are relatively rare, because systematically identifying TE insertions within a species has been a challenge. Here, we present a novel approach for uncovering TE insertion polymorphisms (TIPs) using pan-genome analysis combined with population-scale resequencing, and we adopt this pipeline to retrieve TIPs in a Brassica rapa germplasm collection. We found that 23% of genes within the reference Chiifu-401-42 genome harbored TIPs. TIPs tended to have large transcriptional effects, including modifying gene expression levels and altering gene structure by introducing new introns. Among 524 diverse accessions, TIPs broadly influenced genes related to traits and played a crucial role in the domestication of B. rapa morphotypes. As examples, four specific TIP-containing genes were found to be candidates that are potentially involved in various climatic conditions, promoting the formation of diverse vegetable crops in B. rapa. Our work reveals the hitherto hidden TIPs implicated in agronomic traits and highlights their widespread utility in studies of crop domestication.

RevDate: 2022-03-10

Dwiyanto J, Hor JW, Reidpath D, et al (2022)

Pan-genome and resistome analysis of extended-spectrum β-lactamase-producing Escherichia coli: A multi-setting epidemiological surveillance study from Malaysia.

PloS one, 17(3):e0265142 pii:PONE-D-21-27273.

OBJECTIVES: This study profiled the prevalence of extended-spectrum β-lactamase-producing Escherichia coli (ESBL-EC) in the community and compared their resistome and genomic profiles with isolates from clinical patients through whole-genome sequencing.

METHODS: Fecal samples from 233 community dwellers from Segamat, a town in southern Malaysia, were obtained between May and August 2018. Putative ESBL strains were screened and tested using antibiotic susceptibility tests. Additionally, eight clinical ESBL-EC were obtained from a hospital in the same district between June and October 2020. Whole-genome sequencing was then conducted on selected ESBL-EC from both settings (n = 40) for pan-genome comparison, cluster analysis, and resistome profiling.

RESULTS: A mean ESBL-EC carriage rate of 17.82% (95% CI: 10.48%- 24.11%) was observed in the community and was consistent across demographic factors. Whole-genome sequences of the ESBL-EC (n = 40) enabled the detection of multiple plasmid replicon groups (n = 28), resistance genes (n = 34) and virulence factors (n = 335), with no significant difference in the number of genes carried between the community and clinical isolates (plasmid replicon groups, p = 0.13; resistance genes, p = 0.47; virulence factors, p = 0.94). Virulence gene marker analysis detected the presence of extraintestinal pathogenic E. coli (ExPEC), uropathogenic E. coli (UPEC), and enteroaggregative E. coli (EAEC) in both the community and clinical isolates. Multiple blaCTX-M variants were observed, dominated by blaCTX-M-27 (n = 12), blaCTX-M-65 (n = 10), and blaCTX-M-15 (n = 9). The clinical and community isolates did not cluster together based on the pan-genome comparison, suggesting isolates from the two settings were clonally unrelated. However, cluster analysis based on carried plasmids, resistance genes and phenotypic susceptibility profiles identified four distinct clusters, with similar patterns between the community and clinical isolates.

CONCLUSION: ESBL-EC from the clinical and community settings shared similar resistome profiles, suggesting the frequent exchange of genetic materials through horizontal gene transfer.

RevDate: 2022-03-10

Mancebo FJ, Parras-Moltó M, García-Ríos E, et al (2022)

Deciphering the Potential Coding of Human Cytomegalovirus: New Predicted Transmembrane Proteome.

International journal of molecular sciences, 23(5): pii:ijms23052768.

CMV is a major cause of morbidity and mortality in immunocompromised individuals that will benefit from the availability of a vaccine. Despite the efforts made during the last decade, no CMV vaccine is available. An ideal CMV vaccine should elicit a broad immune response against multiple viral antigens including proteins involved in virus-cell interaction and entry. However, the therapeutic use of neutralizing antibodies targeting glycoproteins involved in viral entry achieved only partial protection against infection. In this scenario, a better understanding of the CMV proteome potentially involved in viral entry may provide novel candidates to include in new potential vaccine design. In this study, we aimed to explore the CMV genome to identify proteins with putative transmembrane domains to identify new potential viral envelope proteins. We have performed in silico analysis using the genome sequences of nine different CMV strains to predict the transmembrane domains of the encoded proteins. We have identified 77 proteins with transmembrane domains, 39 of which were present in all the strains and were highly conserved. Among the core proteins, 17 of them such as UL10, UL139 or US33A have no ascribed function and may be good candidates for further mechanistic studies.

RevDate: 2022-03-10

Tay Fernandez CG, Nestor BJ, Danilevicz MF, et al (2022)

Pangenomes as a Resource to Accelerate Breeding of Under-Utilised Crop Species.

International journal of molecular sciences, 23(5): pii:ijms23052671.

Pangenomes are a rich resource to examine the genomic variation observed within a species or genera, supporting population genetics studies, with applications for the improvement of crop traits. Major crop species such as maize (Zea mays), rice (Oryza sativa), Brassica (Brassica spp.), and soybean (Glycine max) have had pangenomes constructed and released, and this has led to the discovery of valuable genes associated with disease resistance and yield components. However, pangenome data are not available for many less prominent crop species that are currently under-utilised. Despite many under-utilised species being important food sources in regional populations, the scarcity of genomic data for these species hinders their improvement. Here, we assess several under-utilised crops and review the pangenome approaches that could be used to build resources for their improvement. Many of these under-utilised crops are cultivated in arid or semi-arid environments, suggesting that novel genes related to drought tolerance may be identified and used for introgression into related major crop species. In addition, we discuss how previously collected data could be used to enrich pangenome functional analysis in genome-wide association studies (GWAS) based on studies in major crops. Considering the technological advances in genome sequencing, pangenome references for under-utilised species are becoming more obtainable, offering the opportunity to identify novel genes related to agro-morphological traits in these species.

RevDate: 2022-03-10

Syrokou MK, Paramithiotis S, Drosinos EH, et al (2022)

A Comparative Genomic and Safety Assessment of Six Lactiplantibacillus plantarum subsp. argentoratensis Strains Isolated from Spontaneously Fermented Greek Wheat Sourdoughs for Potential Biotechnological Application.

International journal of molecular sciences, 23(5): pii:ijms23052487.

The comparative genome analysis of six Lactiplantibacillus plantarum subsp. argentoratensis strains previously isolated from spontaneously fermented Greek wheat sourdoughs is presented. Genomic attributes related to food safety have been studied according to the European Food Safety Authority (EFSA) suggestions for the use of lactic acid bacteria (LAB) in the production of foods. Bioinformatic analysis revealed a complete set of genes for maltose, sucrose, glucose, and fructose fermentation; conversion of fructose to mannitol; folate and riboflavin biosynthesis; acetoin production; conversion of citrate to oxaloacetate; and the ability to produce antimicrobial compounds (plantaricins). Pathogenic factors were absent but some antibiotic resistance genes were detected. CRISPR and cas genes were present as well as various mobile genetic elements (MGEs) such as plasmids, prophages, and insertion sequences. The production of biogenic amines by these strains was not possible due to the absence of key genes in their genome except lysine decarboxylase associated with cadaverine; however, potential degradation of these substances was identified due to the presence of a blue copper oxidase precursor and a multicopper oxidase protein family. Finally, comparative genomics and pan-genome analysis showed genetic differences between the strains (e.g., variable pln locus), and it facilitated the identification of various phenotypic and probiotic-related properties.

RevDate: 2022-03-09

Kim JS, Kang SW, Lee JH, et al (2022)

The evolution and competitive strategies of Akkermansia muciniphila in gut.

Gut microbes, 14(1):2025017.

Akkermansia muciniphila is a commensal bacterium using mucin as its sole carbon and nitrogen source. A. muciniphila is a promising candidate for next-generation probiotics to prevent inflammatory and metabolic disorders, including diabetes and obesity, and to increase the response to cancer immunotherapy. In this study, a comparative pan-genome analysis was conducted to investigate the genomic diversity and evolutionary relationships between complete genomes of 27 A. muciniphila strains, including KGMB strains isolated from healthy Koreans. The analysis showed that A. muciniphila strains formed two clades, groups A and B, in a phylogenetic tree constructed using 1,219 orthologous single-copy core genes. Interestingly, group A comprised strains from human feces in Korea, whereas most of group B comprised strains from human feces in Europe and China, and from mouse feces. As groups A and B branched, mucin hydrolysis played an important role in the stability of the core genome and drove evolution in the direction of defense against invading pathogens, survival in, and colonization of, the mucus layer. In addition, WapA and anSME, which function in competition and post-translational modification of sulfatase, respectively, have been under particularly important selective pressure in the evolution of group A. KGMB strains in group A with the anSME gene showed sulfatase activity, but KCTC 15667T in group B without anSME did not. Our findings revealed that KGMB strains evolved to gain an edge in the competition with other gut bacteria by increasing the utilization of sulfated mucin, which will allow them to colonize the gut environment at high levels.

RevDate: 2022-03-07

Song Y, Xu X, Huang Z, et al (2022)

Genomic Characteristics Revealed Plasmid-Mediated Pathogenicity and Ubiquitous Rifamycin Resistance of Rhodococcus equi.

Frontiers in cellular and infection microbiology, 12:807610.

Rhodococcus equi is a zoonotic pathogen that can cause fatal disease in patients who are immunocompromised. At present, the epidemiology and pathogenic mechanisms of R. equi infection are not clear. This study characterized the genomes of 53 R. equi strains from different sources. Pan-genome analysis showed that the R. equi strains contained 11,481 pan genes, including 3,690 core genes and 602-1,079 accessory genes. Functional annotation of the pan-genome focused on the genes related to basic lifestyle, such as the storage and expression of metabolic and genetic information. Phylogenetic analysis based on the pan-genome showed that the R. equi strains were clustered into six clades, which were not directly related to the isolation location and host source. Also, a total of 84 virulence genes were predicted in the 53 R. equi strains. These virulence factors can be divided into 20 categories related to substance metabolism, secreted protein and immune escape. Meanwhile, six antibiotic resistance genes (RbpA, tetA(33), erm(46), sul1, qacEdelta1 and aadA9) were detected, and all strains carried RbpA related to rifamycin resistance. In addition, 28 plasmids were found in the 53 R. equi strains, belonging to Type-A (n = 14), Type-B (n = 8) and Type-N (n = 6), respectively. The genetic structures of the same type of plasmid were highly similar. In conclusion, R. equi strains show different genomic characteristics, virulence-related genes, potential drug resistance and virulence plasmid structures, which may be conducive to the evolution of its pathogenesis.
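
The core/accessory split reported in pan-genome abstracts such as the one above can be illustrated with a minimal Python sketch. It assumes a hypothetical mapping from strain name to the set of gene-family IDs found in that strain; the function name, strain names, and family IDs are invented for illustration and are not taken from the study, whose actual clustering pipeline and thresholds are not described here.

```python
from collections import Counter


def partition_pangenome(genomes):
    """Split gene families into core, accessory, and strain-unique sets.

    genomes: dict mapping strain name -> set of gene-family IDs (hypothetical input).
    A family is "core" if present in every strain, "unique" if present in
    exactly one strain, and "accessory" otherwise.
    """
    n_strains = len(genomes)
    # Count in how many strains each gene family occurs.
    counts = Counter(fam for families in genomes.values() for fam in families)
    core = {fam for fam, c in counts.items() if c == n_strains}
    # Subtract core so the sets stay disjoint even with a single strain.
    unique = {fam for fam, c in counts.items() if c == 1} - core
    accessory = set(counts) - core - unique
    return core, accessory, unique


# Toy data only: three made-up strains and five made-up gene families.
toy = {
    "strainA": {"f1", "f2", "f3"},
    "strainB": {"f1", "f2", "f4"},
    "strainC": {"f1", "f3", "f5"},
}
core, accessory, unique = partition_pangenome(toy)
pan_size = len(core | accessory | unique)
print(sorted(core), sorted(accessory), sorted(unique), pan_size)
```

Published analyses often relax the strict "present in every strain" rule (for example, by tolerating a small fraction of missing strains), so counts from a sketch like this would not necessarily match reported figures.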

RevDate: 2022-03-07

Ma L, Yang W, Huang S, et al (2022)

Integrative Assessments on Molecular Taxonomy of Acidiferrobacter thiooxydans ZJ and Its Environmental Adaptation Based on Mobile Genetic Elements.

Frontiers in microbiology, 13:826829.

Acidiferrobacter spp. are facultatively anaerobic acidophiles that belong to a distinctive Acidiferrobacteraceae family, which are similar to Ectothiorhodospiraceae phylogenetically, and are closely related to Acidithiobacillia class/subdivision physiologically. The limited genome information has kept them from being studied on molecular taxonomy and environmental adaptation in depth. Herein, Af. thiooxydans ZJ was isolated from acid mine drainage (AMD), and the complete genome sequence was reported to scan its genetic constitution for taxonomic and adaptative feature exploration. The genome has a single chromosome of 3,302,271 base pairs (bp), with a GC content of 63.61%. The phylogenetic tree based on OrthoANI highlighted the unique position of Af. thiooxydans ZJ, which harbored more unique genes among the strains from Ectothiorhodospiraceae and Acidithiobacillaceae by pan-genome analysis. The diverse mobile genetic elements (MGEs), such as insertion sequence (IS), clustered regularly interspaced short palindromic repeat (CRISPR), prophage, and genomic island (GI), have been identified and characterized in Af. thiooxydans ZJ. The results showed that Af. thiooxydans ZJ may effectively resist the infection of foreign viruses and gain functional gene fragments or clusters to shape its own genome advantageously. This study will offer more evidence of the genomic plasticity and improve our understanding of evolutionary adaptation mechanisms to extreme AMD environment, which could expand the potential utilization of Af. thiooxydans ZJ as an iron and sulfur oxidizer in industrial bioleaching.

RevDate: 2022-03-03

Yang R, Zhang B, Xu Y, et al (2022)

Genomic insights revealed the environmental adaptability of Planococcus halotolerans Y50 isolated from petroleum-contaminated soil on the Qinghai-Tibet Plateau.

Gene pii:S0378-1119(22)00187-1 [Epub ahead of print].

The Tibetan Plateau niche provides unprecedented opportunities to find microbes of functional and commercial significance. The present study investigated the physiological and genomic characteristics of Planococcus halotolerans Y50, which was isolated from a petroleum-contaminated soil sample from the Qinghai-Tibet Plateau and displayed psychrotolerant, antiradiation, and oil-degrading characteristics. Whole genome sequencing indicated that strain Y50 has a 3.52 Mb genome and 44.7% G+C content, and it possesses 3377 CDSs. The presence of a wide range of UV damage repair genes uvrX and uvsE, DNA repair genes radA and recN, superoxide dismutase, peroxiredoxin and dioxygenase genes provided the genomic basis for adaptation to the plateau environment polluted by petroleum. Related experiments also verified that the Y50 strain could degrade n-alkanes from C11-C23, and approximately 30% of the total petroleum at 25 °C within 7 days. Meanwhile, strain Y50 could withstand 5×10³ J/m² UVC and 10 kGy gamma ray radiation, and it had strong antioxidant activity and high radical-scavenging capacity for superoxide anion, hydroxyl radical and DPPH. In addition, pan-genome analysis and horizontal gene transfers revealed that strains with different niches have obtained various genes through horizontal gene transfer in the process of evolution, and the more similar their geographical locations, the more similar their members are genetically and ecologically. In conclusion, P. halotolerans Y50 possesses high potential for application in the bioremediation of alpine hydrocarbon-contaminated environments.

RevDate: 2022-03-03

Ricci ML, Fillo S, Ciammaruconi A, et al (2022)

Genome analysis of Legionella pneumophila ST23 from various countries reveals highly similar strains.

Life science alliance, 5(6): pii:5/6/e202101117.

Legionella pneumophila serogroup 1 (Lp1) sequence type (ST) 23 is one of the most commonly detected STs in Italy where it currently causes all investigated outbreaks. ST23 has caused both epidemic and sporadic cases between 1995 and 2018 and was analysed at genomic level and compared with ST23 isolated in other countries to determine possible similarities and differences. A core genome multi-locus sequence typing (cgMLST), based on a previously described set of 1,521 core genes, and single-nucleotide polymorphisms (SNPs) approaches were applied to an ST23 collection including genomes from Italy, France, Denmark and Scotland. DNAs were automatically extracted, libraries prepared using NextEra library kit and MiSeq sequencing performed. Overall, 63 among clinical and environmental Italian Lp1 isolates and a further seven and 11 ST23 from Denmark and Scotland, respectively, were sequenced, and pangenome analysed. Both cgMLST and SNPs analyses showed very few loci and SNP variations in ST23 genomes. All the ST23 causing outbreaks and sporadic cases in Italy and elsewhere, were phylogenetically related independent of year, town or country of isolation. Distances among the ST23s were further shortened when SNPs due to horizontal gene transfers were removed. The Lp1 ST23 isolated in Italy have kept their monophyletic origin, but they are phylogenetically close also to ST23 from other countries. The ST23 are quite widespread in Italy, and a thorough epidemiological investigation is compelled to determine sources of infection when this ST is identified in both LD sporadic cases and outbreaks.

RevDate: 2022-03-01

Yin Z, Liu X, Qian C, et al (2022)

Pan-Genome Analysis of Delftia tsuruhatensis Reveals Important Traits Concerning the Genetic Diversity, Pathogenicity, and Biotechnological Properties of the Species.

Microbiology spectrum [Epub ahead of print].

Delftia tsuruhatensis strains have long been known to promote plant growth and biological control. Recently, it has become an emerging opportunistic pathogen in humans. However, the genomic characteristics of the genetic diversity, pathogenicity, and biotechnological properties have not yet been comprehensively investigated. Here, a comparative pan-genome analysis was constructed. The open pan-genome with a large and flexible gene repertoire exhibited a high degree of genetic diversity. The purifying selection was the main force to drive pan-genome evolution. Significant differences were observed in the evolutionary relationship, functional enrichment, and degree of selective pressure between the different components of the pan-genome. A high degree of genetic plasticity was characterized by the determinations of diverse mobile genetic elements (MGEs), massive genomic rearrangement, and horizontal genes. Horizontal gene transfer (HGT) plays an important role in the genetic diversity of this bacterium and the formation of genomic traits. Our results revealed the occurrence of diverse virulence-related elements associated with macromolecular secretion systems, virulence factors associated with multiple nosocomial infections, and antimicrobial resistance, indicating the pathogenic potential. Lateral flagellum, T1SS, T2SS, T6SS, Tad pilus, type IV pilus, and a part of virulence-related genes exhibited general properties, whereas polar flagellum, T4SS, a part of virulence-related genes, and resistance genes presented heterogeneous properties. The pan-genome also harbors abundant genetic traits related to secondary metabolism, carbohydrate active enzymes (CAZymes), and phosphate transporter, indicating rhizosphere adaptation, plant growth promotion, and great potential uses in agriculture and biological control. This study provides comprehensive insights into this uncommon species from the genomic perspective.

IMPORTANCE: D. tsuruhatensis is considered a plant growth-promoting rhizobacterium (PGPR), an organic pollutant degradation strain, and an emerging opportunistic pathogen to the human. However, the genetic diversity, the evolutionary dynamics, and the genetic basis of these remarkable traits are still little known. We constructed a pan-genome analysis for D. tsuruhatensis and revealed extensive genetic diversity and genetic plasticity exhibited by open pan-genome, diverse mobile genetic elements (MGEs), genomic rearrangement, and horizontal genes. Our results highlight that horizontal gene transfer (HGT) and purifying selection are important forces in D. tsuruhatensis genetic evolution. The abundant virulence-related elements associated with macromolecular secretion systems, virulence factors, and antimicrobial resistance could contribute to the pathogenicity of this bacterium. Therefore, clinical microbiologists need to be aware of D. tsuruhatensis as an opportunistic pathogen. The genetic profiles of secondary metabolism, carbohydrate active enzymes (CAZymes), and phosphate transporter could provide insight into the genetic armory of potential applications for agriculture and biological control of D. tsuruhatensis in general.

RevDate: 2022-02-23
CmpDate: 2022-02-23

Li H, Wang S, Chai S, et al (2022)

Graph-based pan-genome reveals structural and sequence variations related to agronomic traits and domestication in cucumber.

Nature communications, 13(1):682.

Structural variants (SVs) represent a major source of genetic diversity and are related to numerous agronomic traits and evolutionary events; however, their comprehensive identification and characterization in cucumber (Cucumis sativus L.) have been hindered by the lack of a high-quality pan-genome. Here, we report a graph-based cucumber pan-genome by analyzing twelve chromosome-scale genome assemblies. Genotyping of seven large chromosomal rearrangements based on the pan-genome provides useful information for use of wild accessions in breeding and genetic studies. A total of ~4.3 million genetic variants including 56,214 SVs are identified leveraging the chromosome-level assemblies. The pan-genome graph integrating both variant information and reference genome sequences aids the identification of SVs associated with agronomic traits, including warty fruits, flowering times and root growth, and enhances the understanding of cucumber trait evolution. The graph-based cucumber pan-genome and the identified genetic variants provide rich resources for future biological research and genomics-assisted breeding.

