




Bibliography on: Pangenome

The Electronic Scholarly Publishing Project: Providing world-wide, free access to classic scientific papers and other scholarly materials, since 1993.


ESP: PubMed Auto Bibliography. Created: 08 Dec 2022 at 11:37


Although the enforced stability of genomic content is ubiquitous among MCEs, the opposite is proving to be the case among prokaryotes, which exhibit remarkable and adaptive plasticity of genomic content. Early bacterial whole-genome sequencing efforts discovered that whenever a particular "species" was re-sequenced, new genes were found that had not been detected earlier — entirely new genes, not merely new alleles. This led to the concepts of the bacterial core-genome, the set of genes found in all members of a particular "species", and the flex-genome, the set of genes found in some, but not all members of the "species". Together these make up the species' pan-genome.
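The definitions above map directly onto set operations. The following is a minimal sketch, assuming each strain's gene content is already available as a set of gene identifiers; the strain names and gene names are hypothetical and serve only as illustration.

```python
# Hypothetical gene-content sets for three strains of one "species".
strains = {
    "strain_A": {"gyrA", "recA", "rpoB", "toxX"},
    "strain_B": {"gyrA", "recA", "rpoB", "abrY"},
    "strain_C": {"gyrA", "recA", "rpoB", "toxX", "phgZ"},
}


def partition_pangenome(gene_sets):
    """Return (core, flex, pan) for a non-empty mapping of strain -> gene set."""
    sets = list(gene_sets.values())
    pan = set().union(*sets)          # genes found in at least one strain
    core = set.intersection(*sets)    # genes found in every strain
    flex = pan - core                 # genes found in some, but not all, strains
    return core, flex, pan


core, flex, pan = partition_pangenome(strains)
print(sorted(core))  # ['gyrA', 'recA', 'rpoB']
print(sorted(flex))  # ['abrY', 'phgZ', 'toxX']
print(len(pan))      # 6
```

Real pan-genome tools first cluster genes into orthologous families before this step; the sketch skips that and treats identical names as the same gene.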

Created with PubMed® Query: ( pangenome OR "pan-genome" OR "pan genome" ) NOT pmcbook NOT ispreviousversion

Citations: The Papers (from PubMed®)


RevDate: 2022-12-08

Djeghout B, Bloomfield SJ, Rudder S, et al (2022)

Comparative genomics of Campylobacter jejuni from clinical campylobacteriosis stool specimens.

Gut pathogens, 14(1):45.

BACKGROUND: Campylobacter jejuni is a pervasive pathogen of major public health concern with a complex ecology requiring accurate and informative approaches to define pathogen diversity during outbreak investigations. Source attribution analysis may be confounded if the genetic diversity of a C. jejuni population is not adequately captured in a single specimen. The aim of this study was to determine the genomic diversity of C. jejuni within individual stool specimens from four campylobacteriosis patients. Direct plating and pre-culture filtration of one stool specimen per patient was used to culture multiple isolates per stool specimen. Whole genome sequencing and pangenome level analysis were used to investigate genomic diversity of C. jejuni within a patient.

RESULTS: A total of 92 C. jejuni isolates were recovered from four patients presenting with gastroenteritis. The number of isolates ranged from 13 to 30 per patient stool. Three patients yielded a single C. jejuni multilocus sequence type: ST-21 (n = 26, patient 4), ST-61 (n = 30, patient 1) and ST-2066 (n = 23, patient 2). Patient 3 was infected with two different sequence types [ST-51 (n = 12) and ST-354 (n = 1)]. Isolates belonging to the same sequence type from the same patient specimen shared 12-43 core non-recombinant SNPs and 0-20 frameshifts with each other, and the pangenomes of each sequence type consisted of 1406-1491 core genes and 231-264 accessory genes. However, neither the mutation nor the accessory genes were connected to a specific functional gene category.

CONCLUSIONS: Our findings show that the C. jejuni population recovered from an individual patient's stool is genetically diverse even within the same ST and may have shared common ancestors before specimens were obtained. The population is unlikely to have evolved from a single isolate at the time point of initial patient infection, leading us to conclude that patients were likely infected with a heterogeneous C. jejuni population. The diversity of the C. jejuni population found within individual stool specimens can inform future methodological approaches to attribution and outbreak investigations.

RevDate: 2022-12-08

Ullah A, Ullah Khan S, Haq MU, et al (2022)

Computational study to investigate Proteus mirabilis proteomes for multi-epitope vaccine construct design.

Journal of biomolecular structure & dynamics [Epub ahead of print].

Proteus mirabilis is a gram-negative bacterium particularly known for its unique swarming ability. The swarming gives the bacteria the ability to enhance adherence to the catheter surface and epithelium cells of the urethra to cause catheter associated urinary tract infections. P. mirabilis has evolved resistance to antibiotics. Additionally, there is no approved vaccine against P. mirabilis, thus demanding identification of new vaccine targets. This gram-negative bacterium consists of 19,502 core proteins, out of which 19,063 are redundant proteins and the remaining 439 are non-redundant proteins. The non-redundant proteins have 21 proteins present on the cell surface out of which 11 proteins are virulent. Antigenicity analysis predicted only 2 proteins as antigenic (fimbrial biogenesis outer membrane usher protein and ligand-gated channel protein). Four and seven B-cell epitopes were predicted from the former and latter proteins, respectively. The predicted B-cell epitopes were used for T-cell epitope prediction. The predicted epitopes were linked to each other through GPGPG linkers and joined with cholera toxin beta subunit adjuvant. A multi-epitope vaccine construct consisting of 226 residues was docked with MHC-I, MHC-II and TLR-4. The best docked complex in each case has binding energy of -714.6, -744.6 and -829.5 kcal/mol, respectively. Moreover, the docking results were validated through molecular dynamics simulation and binding free energies estimation. The net energy of -137.2 kcal/mol was calculated for vaccine-MHC-I complex, -133.39 kcal/mol for vaccine-MHC-II and -158.68 kcal/mol for vaccine-TLR-4 complex. The designed vaccine construct could provoke immune responses against targeted pathogen and may be used in experimental testing. Communicated by Ramaswamy H. Sarma.

RevDate: 2022-12-06

Wang M, Li J, Qi Z, et al (2022)

Genomic innovation and regulatory rewiring during evolution of the cotton genus Gossypium.

Nature genetics [Epub ahead of print].

Phenotypic diversity and evolutionary innovation ultimately trace to variation in genomic sequence and rewiring of regulatory networks. Here, we constructed a pan-genome of the Gossypium genus using ten representative diploid genomes. We document the genomic evolutionary history and the impact of lineage-specific transposon amplification on differential genome composition. The pan-3D genome reveals evolutionary connections between transposon-driven genome size variation and both higher-order chromatin structure reorganization and the rewiring of chromatin interactome. We linked changes in chromatin structures to phenotypic differences in cotton fiber and identified regulatory variations that decode the genetic basis of fiber length, the latter enabled by sequencing 1,005 transcriptomes during fiber development. We showcase how pan-genomic, pan-3D genomic and genetic regulatory data serve as a resource for delineating the evolutionary basis of spinnable cotton fiber. Our work provides insights into the evolution of genome organization and regulation and will inform cotton improvement by enabling regulome-based approaches.

RevDate: 2022-12-07
CmpDate: 2022-12-07

Yebra G, Harling-Lee JD, Lycett S, et al (2022)

Multiclonal human origin and global expansion of an endemic bacterial pathogen of livestock.

Proceedings of the National Academy of Sciences of the United States of America, 119(50):e2211217119.

Most new pathogens of humans and animals arise via switching events from distinct host species. However, our understanding of the evolutionary and ecological drivers of successful host adaptation, expansion, and dissemination is limited. Staphylococcus aureus is a major bacterial pathogen of humans and a leading cause of mastitis in dairy cows worldwide. Here we trace the evolutionary history of bovine S. aureus using a global dataset of 10,254 S. aureus genomes including 1,896 bovine isolates from 32 countries in 6 continents. We identified 7 major contemporary endemic clones of S. aureus causing bovine mastitis around the world and traced them back to 4 independent host-jump events from humans that occurred up to 2,500 y ago. Individual clones emerged and underwent clonal expansion from the mid-19th to late 20th century coinciding with the commercialization and industrialization of dairy farming, and older lineages have become globally distributed via established cattle trade links. Importantly, we identified lineage-dependent differences in the frequency of host transmission events between humans and cows in both directions revealing high risk clones threatening veterinary and human health. Finally, pangenome network analysis revealed that some bovine S. aureus lineages contained distinct sets of bovine-associated genes, consistent with multiple trajectories to host adaptation via gene acquisition. Taken together, we have dissected the evolutionary history of a major endemic pathogen of livestock providing a comprehensive temporal, geographic, and gene-level perspective of its remarkable success.

RevDate: 2022-12-07
CmpDate: 2022-12-07

Zhao C, Goldman M, Smith BJ, et al (2022)

Genotyping Microbial Communities with MIDAS2: From Metagenomic Reads to Allele Tables.

Current protocols, 2(12):e604.

The Metagenomic Intra-Species Diversity Analysis System 2 (MIDAS2) is a scalable pipeline that identifies single nucleotide variants and gene copy number variants in metagenomes using comprehensive reference databases built from public microbial genome collections (metagenotyping). MIDAS2 is the first metagenotyping tool with functionality to control metagenomic read mapping filters and to customize the reference database to the microbial community, features that improve the precision and recall of detected variants. In this article we present four basic protocols for the most common use cases of MIDAS2, along with supporting protocols for installation and use. In addition, we provide in-depth guidance on adjusting command line parameters, editing the reference database, optimizing hardware utilization, and understanding the metagenotyping results. All the steps of metagenotyping, from raw sequencing reads to population genetic analysis, are demonstrated with example data in two downloadable sequencing libraries of single-end metagenomic reads representing a mixture of multiple bacterial species. This set of protocols empowers users to accurately genotype hundreds of species in thousands of samples, providing rich genetic data for studying the evolution and strain-level ecology of microbial communities. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC.

Basic Protocol 1: Species prescreening
Basic Protocol 2: Download MIDAS reference database
Basic Protocol 3: Population single nucleotide variant calling
Basic Protocol 4: Pan-genome copy number variant calling
Support Protocol 1: Installing MIDAS2
Support Protocol 2: Command line inputs
Support Protocol 3: Metagenotyping with a custom collection of genomes
Support Protocol 4: Metagenotyping with advanced parameters

RevDate: 2022-12-06

Pais AKL, Santos LVSD, Albuquerque GMR, et al (2022)

Comparative genomics and phylogenomics of the Ralstonia solanacearum Moko ecotype and its symptomatological variants.

Genetics and molecular biology, 45(4):e20220038 pii:S1415-47572022000500402.

Banana tree bacterial wilt is caused by the Ralstonia solanacearum Moko ecotype. These strains vary in their symptom progression in banana, and are classified as typical Moko variants (phylotype IIA and IIB strains from across Central and South America), Bugtok variant (Philippines), and Sergipe facies (the states of Sergipe and Alagoas, Brazil). This study used comparative genomic and phylogenomic approaches to identify a correlation between the symptom progression of the Moko ecotypes based on the analysis of 23 available genomes. Average nucleotide identity and in silico DNA-DNA hybridization revealed a high correlation (>96% and >78%, respectively) between the genomes of Moko variants. Pan-genome analysis identified 21.3% of inheritable regions between representatives of the typical Moko and Sergipe facies variants, which could be traced to an abundance of exclusive homolog clusters. Moko ecotype genomes shared 1,951 orthologous genes, but representatives with typical symptoms did not display unique orthologues. Moreover, Bugtok disease and Sergipe facies genomes did not share any unique genes, suggesting convergent evolution to a shared symptom progression. Overall, genomic and phylogenomic analyses were insufficient to differentiate the Moko variants based on symptom progression.

RevDate: 2022-12-06

Lee JH, Venkatesh J, Jo J, et al (2022)

High-quality chromosome-scale genomes facilitate effective identification of large structural variations in hot and sweet peppers.

Horticulture research, 9:uhac210.

Pepper (Capsicum annuum) is an important vegetable crop that has been subjected to intensive breeding, resulting in limited genetic diversity, especially for sweet peppers. Previous studies have reported pepper draft genome assemblies using short read sequencing, but their capture of the extent of large structural variants (SVs), such as presence-absence variants (PAVs), inversions, and copy-number variants (CNVs) in the complex pepper genome falls short. In this study, we sequenced the genomes of representative sweet and hot pepper accessions by long-read and/or linked-read methods and advanced scaffolding technologies. First, we developed a high-quality reference genome for the sweet pepper cultivar 'Dempsey' and then used the reference genome to identify SVs in 11 other pepper accessions and constructed a graph-based pan-genome for pepper. We annotated an average of 42 972 gene families in each pepper accession, defining a set of 19 662 core and 23 115 non-core gene families. The new pepper pan-genome includes informative variants, 222 159 PAVs, 12 322 CNVs, and 16 032 inversions. Pan-genome analysis revealed PAVs associated with important agricultural traits, including potyvirus resistance, fruit color, pungency, and pepper fruit orientation. Comparatively, a large number of genes are affected by PAVs, which is positively correlated with the high frequency of transposable elements (TEs), indicating TEs play a key role in shaping the genomic landscape of peppers. The datasets presented herein provide a powerful new genomic resource for genetic analysis and genome-assisted breeding for pepper improvement.

RevDate: 2022-12-06

Núñez-Montero K, Rojas-Villalta D, L Barrientos (2022)

Antarctic Sphingomonas sp. So64.6b showed evolutive divergence within its genus, including new biosynthetic gene clusters.

Frontiers in microbiology, 13:1007225.

INTRODUCTION: The antibiotic crisis is a major human health problem. Bioprospecting screenings suggest that proteobacteria and other extremophile microorganisms have biosynthetic potential for the production of novel antimicrobial compounds. An Antarctic Sphingomonas strain (So64.6b) previously showed interesting antibiotic activity and elicitation response; a relationship between environmental adaptations and its biosynthetic potential was therefore hypothesized. We aimed to determine the genomic characteristics in So64.6b strain related to evolutive traits for the adaptation to the Antarctic environment that could lead to its diversity of potentially novel antibiotic metabolites.

METHODS: The complete genome sequence of the Antarctic strain was obtained and mined for Biosynthetic Gene Clusters (BGCs) and other unique genes related to adaptation to extreme environments. Comparative genome analysis based on multi-locus phylogenomics, BGC phylogeny, and pangenomics were conducted within the closest genus, aiming to determine the taxonomic affiliation and differential characteristics of the Antarctic strain.

RESULTS AND DISCUSSION: The Antarctic strain So64.6b showed a closest identity with Sphingomonas alpina, however containing a significant genomic difference of ortholog cluster related to degradation of multiple pollutants. Strain So64.6b had a total of six BGCs, which were predicted with low to no similarity with other reported clusters; three were associated with potential novel antibiotic compounds using ARTS tool. Phylogenetic and synteny analysis of a common BGC showed great diversity between Sphingomonas genus but grouping in clades according to similar isolation environments, suggesting an evolution of BGCs that could be linked to the specific ecosystems. Comparative genomic analysis also showed that Sphingomonas species isolated from extreme environments had the greatest number of predicted BGCs and a higher percentage of genetic content devoted to BGCs than the isolates from mesophilic environments. In addition, some extreme-exclusive clusters were found related to oxidative and thermal stress adaptations, while pangenome analysis showed unique resistance genes on the Antarctic strain included in genetic islands. Altogether, our results showed the unique genetic content on Antarctic strain Sphingomonas sp. So64.6b, a probable new species of this genetically divergent genus, which could have potentially novel antibiotic compounds acquired to cope with Antarctic poly-extreme conditions.

RevDate: 2022-12-06

Jesus HNR, Rocha DJPG, Ramos RTJ, et al (2022)

Pan-genomic analysis of Corynebacterium amycolatum gives insights into molecular mechanisms underpinning the transition to a pathogenic phenotype.

Frontiers in microbiology, 13:1011578.

Corynebacterium amycolatum is a nonlipophilic coryneform which is increasingly being recognized as a relevant human and animal pathogen showing multidrug resistance to commonly used antibiotics. However, little is known about the molecular mechanisms involved in transition from colonization to the MDR invasive phenotype in clinical isolates. In this study, we performed a comprehensive pan-genomic analysis of C. amycolatum, including 26 isolates from different countries. We obtained the novel genome sequences of 8 of them, which are multidrug resistant clinical isolates from Spain and Tunisia. They were analyzed together with other 18 complete or draft C. amycolatum genomes retrieved from GenBank. The species C. amycolatum presented an open pan-genome (α = 0.854905), with 3,280 gene families, being 1,690 (51.52%) in the core genome, 1,121 related to accessory genes (34.17%), and 469 related to unique genes (14.29%). Although some classic corynebacterial virulence factors are absent in the species C. amycolatum, we did identify genes associated with immune evasion, toxin, and antiphagocytosis among the predicted putative virulence factors. Additionally, we found genomic evidence for extensive acquisition of antimicrobial resistance genes through genomic islands.
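The gene-family breakdown and the open/closed call in this abstract can be checked with a few lines of arithmetic. The sketch below assumes the common Heaps' law convention in which an exponent α at or below 1 is read as an open pan-genome and α above 1 as closed; the function names are hypothetical and not taken from the paper.

```python
def gene_family_shares(core, accessory, unique):
    """Return the total family count and each category's share in percent."""
    total = core + accessory + unique
    counts = {"core": core, "accessory": accessory, "unique": unique}
    shares = {name: round(100 * n / total, 2) for name, n in counts.items()}
    return total, shares


def is_open_pangenome(alpha):
    """Assumed convention: alpha <= 1 means new genes keep appearing (open)."""
    return alpha <= 1.0


# Counts and alpha as reported in the abstract above.
total, shares = gene_family_shares(core=1690, accessory=1121, unique=469)
print(total)                        # 3280
print(shares)                       # {'core': 51.52, 'accessory': 34.18, 'unique': 14.3}
print(is_open_pangenome(0.854905))  # True
```

The three counts sum to the reported 3,280 families. Rounding gives 34.18% and 14.30% for the accessory and unique shares, so the abstract's 34.17% and 14.29% appear to be truncated rather than rounded.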

RevDate: 2022-12-06

Park J, Jung H, Mannaa M, et al (2022)

Genome-guided comparative in planta transcriptome analyses for identifying cross-species common virulence factors in bacterial phytopathogens.

Frontiers in plant science, 13:1030720.

Plant bacterial disease is a complex outcome achieved through a combination of virulence factors that are activated during infection. However, the common virulence factors across diverse plant pathogens are largely uncharacterized. Here, we established a pan-genome shared across the following plant pathogens: Burkholderia glumae, Ralstonia solanacearum, and Xanthomonas oryzae pv. oryzae. By overlaying in planta transcriptomes onto the pan-genome, we investigated the expression profiles of common genes during infection. We found over 70% of identical patterns for genes commonly expressed by the pathogens in different plant hosts or infection sites. Co-expression patterns revealed the activation of a signal transduction cascade to recognize and respond to external changes within hosts. Using mutagenesis, we uncovered a relationship between bacterial virulence and functions highly conserved and shared in the studied genomes of the bacterial phytopathogens, including flagellar biosynthesis protein, C4-dicarboxylate ABC transporter, 2-methylisocitrate lyase, and protocatechuate 3,4-dioxygenase (PCD). In particular, the disruption of PCD gene led to attenuated virulence in all pathogens and significantly affected phytotoxin production in B. glumae. This PCD gene was ubiquitously distributed in most plant pathogens with high homology. In conclusion, our results provide cross-species in planta models for identifying common virulence factors, which can be useful for the protection of crops against diverse pathogens.

RevDate: 2022-12-06

Tirnaz S, Zandberg J, Thomas WJW, et al (2022)

Application of crop wild relatives in modern breeding: An overview of resources, experimental and computational methodologies.

Frontiers in plant science, 13:1008904.

Global agricultural industries are under pressure to meet the future food demand; however, the existing crop genetic diversity might not be sufficient to meet this expectation. Advances in genome sequencing technologies and availability of reference genomes for over 300 plant species reveal the hidden genetic diversity in crop wild relatives (CWRs), which could have significant impacts in crop improvement. There are many ex-situ and in-situ resources around the world holding rare and valuable wild species, of which many carry agronomically important traits and it is crucial for users to be aware of their availability. Here we aim to explore the available ex-/in-situ resources such as genebanks, botanical gardens, national parks, conservation hotspots and inventories holding CWR accessions. In addition we highlight the advances in availability and use of CWR genomic resources, such as their contribution in pangenome construction and introducing novel genes into crops. We also discuss the potential and challenges of modern breeding experimental approaches (e.g. de novo domestication, genome editing and speed breeding) used in CWRs and the use of computational (e.g. machine learning) approaches that could speed up utilization of CWR species in breeding programs towards crop adaptability and yield improvement.

RevDate: 2022-12-06

Ma J, Wei H, Yu X, et al (2022)

Compared analysis with a high-quality genome of weedy rice reveals the evolutionary game of de-domestication.

Frontiers in plant science, 13:1065449.

The weedy rice (Oryza sativa f. spontanea) harbors large numbers of excellent traits and genetic diversities, which serves as a valuable germplasm resource and has been considered as a typical material for research about de-domestication. However, there are relatively few reference genomes on weedy rice that severely limit exploiting these genetic resources and revealing more details about de-domestication events. In this study, a high-quality genome (~376.4 Mb) of weedy rice A02 was assembled based on Nanopore ultra-long platform with a coverage depth of about 79.3× and 35,423 genes were predicted. Compared to Nipponbare genome, 5,574 structural variations (SVs) were found in A02. Based on super pan-genome graph, population SVs of 238 weedy rice and cultivated rice accessions were identified using public resequencing data. Furthermore, the de-domestication sites of weedy rice and domestication sites of wild rice were analyzed and compared based on SVs and single-nucleotide polymorphisms (SNPs). Interestingly, an average of 2,198 genes about de-domestication could only be found by FST analysis based on SVs (SV-FST) while not by FST analysis based on SNPs (SNP-FST) in divergent region. Additionally, there was a low overlap between domestication and de-domestication intervals, which demonstrated that two different mechanisms existed in these events. Our finding could facilitate pinpointing of the evolutionary events that had shaped the genomic architecture of wild, cultivated, and weedy rice, and provide a good foundation for cloning of the superior alleles for breeding.

RevDate: 2022-12-06
CmpDate: 2022-12-06

Xiang X, Diao E, Shang Y, et al (2022)

Rapid quantitative detection of Vibrio parahaemolyticus via high-fidelity target-based microfluidic identification.

Food research international (Ottawa, Ont.), 162(Pt A):112032.

With the rapid development of logistics, a growing number of pathogenic microorganisms has the means to spread worldwide using food as a carrier; thus, there is an urgent need to develop effective detection strategies to ensure food safety. By combining novel markers identified by pan-genome analysis and a digital recombinase-aided amplification (RAA) detection method based on a microfluidic chip, a strategy of high-fidelity target-based microfluidic identification (HFTMI) has been developed. Herein, a proof-of-concept study of HFTMI for rapid pathogen detection of V. parahaemolyticus was investigated. Specific primers designed for the gene group_41170 identified in the pan-genome analysis showed high sensitivity and a broad spectrum for the detection of V. parahaemolyticus. Different power systems were investigated to increase the partition rate on specifically designed chamber-based digital chips. The performance of HFTMI was greatly improved compared with qPCR. Collectively, this novel HFTMI system provides more reliable guidance for food safety testing.

RevDate: 2022-12-05

Marone MP, Singh HC, Pozniak CJ, et al (2022)

A technical guide to TRITEX, a computational pipeline for chromosome-scale sequence assembly of plant genomes.

Plant methods, 18(1):128.

BACKGROUND: As complete and accurate genome sequences are becoming easier to obtain, more researchers wish to get one or more of them to support their research endeavors. Reliable and well-documented sequence assembly workflows find use in reference or pangenome projects.

RESULTS: We describe modifications to the TRITEX genome assembly workflow motivated by the rise of fast and easy long-read contig assembly of inbred plant genomes and the routine deployment of the toolchains in pangenome projects. New features include the use as surrogates of or complements to dense genetic maps and the introduction of user-editable tables to make the curation of contig placements easier and more intuitive.

CONCLUSION: Even maximally contiguous sequence assemblies of the telomere-to-telomere sort, and to a yet greater extent, the fragmented kind require validation, correction, and comparison to reference standards. As pangenomics is burgeoning, these tasks are bound to become more widespread and TRITEX is one tool to get them done. This technical guide is supported by a step-by-step computational tutorial accessible under . The TRITEX source code is hosted under this URL: .

RevDate: 2022-12-01

Prondzinsky P, Toyoda S, SE McGlynn (2022)

The methanogen core and pangenome: conservation and variability across biology's growth temperature extremes.

DNA research : an international journal for rapid publication of reports on genes and genomes pii:6862058 [Epub ahead of print].

Temperature is a key variable in biological processes. However, a complete understanding of biological temperature adaptation is lacking, in part because of the unique constraints among different evolutionary lineages and physiological groups. Here we compared the genomes of cultivated psychrotolerant and thermotolerant methanogens, which are physiologically related and span growth temperatures from -2.5 °C to 122 °C. Despite being phylogenetically distributed amongst three phyla in the archaea, the methanogenic genome core comprises about one third of a given methanogen's genome, and the genome fraction shared by any two organisms decreases with increasing phylogenetic distance between them. Increased growth temperature is associated with reduced genome size, and thermotolerant organisms have larger core genome fractions, suggesting that genome reduction is governed by temperature rather than phylogeny. Thermotolerant methanogens are enriched in metal and other transporters, and psychrotolerant methanogens are enriched in proteins related to structure and motility. Observed amino acid compositional differences between temperature groups include proteome charge, polarity, and unfolding entropy. Our results suggest that in the methanogens, shared physiology maintains a large, conserved core even across large phylogenetic distances and biology's temperature extremes.

RevDate: 2022-12-01

Pham HM, Le DT, Le LT, et al (2022)

A highly quality genome sequence of Penicillium oxalicum species isolated from the root of Ixora chinensis in Vietnam.

G3 (Bethesda, Md.) pii:6858938 [Epub ahead of print].

Penicillium oxalicum has been reported as a multienzyme producing fungus and is widely used in industry due to great potential for cellulase release. Until now, there are only ten available genome assemblies of P. oxalicum species deposited in the GenBank database. In this study, the genome of the I1R1 strain isolated from the root of Ixora chinensis was completely sequenced by PacBio Sequel sequencing technology, assembled into eight chromosomes with the genome size of 30.8 Mb, as well as a mitogenome of 26 Kb. The structural and functional analyses of the I1R1 genome revealed gene model annotations encoding an enzyme set involved in significant metabolic processes, along with cytochrome P450s and secondary metabolite biosynthesis. The comparative analysis of the P. oxalicum species based on orthology and gene family duplications indicated their large and closed pan genome of 9,500 orthologous groups. This is valuable data for future phylogenetic and population genomics studies.

RevDate: 2022-12-01

Rabanal FA, Gräff M, Lanz C, et al (2022)

Pushing the limits of HiFi assemblies reveals centromere diversity between two Arabidopsis thaliana genomes.

Nucleic acids research pii:6858746 [Epub ahead of print].

Although long-read sequencing can often enable chromosome-level reconstruction of genomes, it is still unclear how one can routinely obtain gapless assemblies. In the model plant Arabidopsis thaliana, other than the reference accession Col-0, all other accessions de novo assembled with long-reads until now have used PacBio continuous long reads (CLR). Although these assemblies sometimes achieved chromosome-arm level contigs, they inevitably broke near the centromeres, excluding megabases of DNA from analysis in pan-genome projects. Since PacBio high-fidelity (HiFi) reads circumvent the high error rate of CLR technologies, albeit at the expense of read length, we compared a CLR assembly of accession Eyach15-2 to HiFi assemblies of the same sample. The use of five different assemblers starting from subsampled data allowed us to evaluate the impact of coverage and read length. We found that centromeres and rDNA clusters are responsible for 71% of contig breaks in the CLR scaffolds, while relatively short stretches of GA/TC repeats are at the core of >85% of the unfilled gaps in our best HiFi assemblies. Since the HiFi technology consistently enabled us to reconstruct gapless centromeres and 5S rDNA clusters, we demonstrate the value of the approach by comparing these previously inaccessible regions of the genome between the Eyach15-2 accession and the reference accession Col-0.

RevDate: 2022-12-01

Belloso Daza MV, Almeida-Santos AC, Novais C, et al (2022)

Distinction between Enterococcus faecium and Enterococcus lactis by a gluP PCR-Based Assay for Accurate Identification and Diagnostics.

Microbiology spectrum [Epub ahead of print].

It was recently proposed that Enterococcus faecium colonizing the human gut (previous clade B) actually corresponds to Enterococcus lactis. Our goals were to develop a PCR assay to rapidly differentiate these species and to discuss the main phenotypic and genotypic differences from a clinical perspective. The pan-genome of 512 genomes of E. faecium and E. lactis strains was analyzed to assess diversity in genes between the two species. Sequences were aligned to find the best candidate gene for designing species-specific primers, and their accuracy was tested with a collection of 382 enterococci. E. lactis isolates from clinical origins were further characterized by whole-genome sequencing (Illumina). Pan-genome analysis resulted in 12 gene variants, with gene gluP (rhomboid protease) being selected as the candidate for species differentiation. The nucleotide sequence of gluP diverged by 90 to 92% between sets, which allowed species identification through PCR with 100% specificity and no cross-reactivity. E. lactis strains were greatly pan-susceptible and not host specific. Hospital E. lactis isolates were susceptible to clinically relevant antibiotics, lacked infection-associated virulence markers, and were associated with patients presenting risk factors for enhanced bacterial translocation. Here, we propose a PCR-based assay using gluP for easy routine differentiation between E. faecium and E. lactis that could be implemented in different public health contexts. We further suggest that E. lactis, a dominant human gut species, can cross the gut barrier in severely ill, immunodeficient, and surgical patients. Knowing that bacterial translocation may be a sepsis promoter, the relevance of infections caused by E. lactis strains, even if they are pan-susceptible, should be explored.

IMPORTANCE: Enterococcus faecium is a WHO priority pathogen that causes severe and hard-to-treat human infections. It was recently proposed that E. faecium colonizing the human gut (previous clade B) actually corresponds to Enterococcus lactis; therefore, some of the human infections occurring globally are being misidentified. In this work, we developed a PCR-based rapid identification method for the differentiation of E. faecium and E. lactis and discussed the main phenotypic and genotypic differences of these species from a clinical perspective. We identified the gluP gene as the best candidate, based on the phylogenomic analysis of 512 published genomes, and validated the PCR assay with a comprehensive collection of 382 enterococci obtained from different sources. Further detailed analysis of clinical E. lactis strains showed that they are highly susceptible to antibiotics and lack the typical virulence markers of E. faecium but are able to cause severe human infections in immunosuppressed patients, possibly in part due to gut barrier translocation.

RevDate: 2022-12-02
CmpDate: 2022-12-02

Sarkar S, Kamke A, Ward K, et al (2022)

Pseudomonas cultivated from Andropogon gerardii rhizosphere show functional potential for promoting plant host growth and drought resilience.

BMC genomics, 23(1):784.

BACKGROUND: Climate change will result in more frequent droughts that can impact soil-inhabiting microbiomes (rhizobiomes) in the agriculturally vital North American perennial grasslands. Rhizobiomes have contributed to enhancing drought resilience and stress resistance properties in plant hosts. In the predicted events of more future droughts, how the changing rhizobiome under environmental stress can impact the plant host resilience needs to be deciphered. There is also an urgent need to identify and recover candidate microorganisms along with their functions, involved in enhancing plant resilience, enabling the successful development of synthetic communities.

RESULTS: In this study, we used the combination of cultivation and high-resolution genomic sequencing of bacterial communities recovered from the rhizosphere of a tallgrass prairie foundation grass, Andropogon gerardii. We cultivated the plant host-associated microbes under artificial drought-induced conditions and identified the microbe(s) that might play a significant role in the rhizobiome of Andropogon gerardii under drought conditions. Phylogenetic analysis of the non-redundant metagenome-assembled genomes (MAGs) identified a bacterial genome of interest - MAG-Pseudomonas. Further metabolic pathway and pangenome analyses recovered genes and pathways related to stress responses including ACC deaminase; nitrogen transformation including assimilatory nitrate reductase in MAG-Pseudomonas, which might be associated with enhanced drought tolerance and growth for Andropogon gerardii.

CONCLUSIONS: Our data indicated that the metagenome-assembled MAG-Pseudomonas has the functional potential to contribute to the plant host's growth during stressful conditions. Our study also suggested the nitrogen transformation potential of MAG-Pseudomonas that could impact Andropogon gerardii growth in a positive way. The cultivation of MAG-Pseudomonas sets the foundation to construct a successful synthetic community for Andropogon gerardii. To conclude, stress resilience mediated through genes ACC deaminase, nitrogen transformation potential through assimilatory nitrate reductase in MAG-Pseudomonas could place this microorganism as an important candidate of the rhizobiome aiding the plant host resilience under environmental stress. This study, therefore, provided insights into the MAG-Pseudomonas and its potential to optimize plant productivity under ever-changing climatic patterns, especially in frequent drought conditions.

RevDate: 2022-12-05
CmpDate: 2022-12-02

Groza C, Bourque G, C Goubert (2023)

A Pangenome Approach to Detect and Genotype TE Insertion Polymorphisms.

Methods in molecular biology (Clifton, N.J.), 2607:85-94.

Pangenome graphs are flexible data structures that contain the genetic variation that exists in a population of genomes and describe the sequences of the many possible ensuing haplotypes. Here, we use such a pangenome graph to represent and genotype transposable element (TE) polymorphisms. By combining the transposable element annotation (Alus, L1s, and SVAs) of the human genome reference with novel transposable element insertions observed in two high-quality assemblies (HG002 and HG00733), we show how to create a transposable element pangenome that consists of ~1.2 million reference and 2939 non-reference transposable elements. We then demonstrate this approach by aligning short-read sequencing data and genotyping transposable element deletions and insertions with reasonable specificity and sensitivity (0.85 F1-score).
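The F1-score quoted in the abstract above is the harmonic mean of precision and recall (recall is the same quantity as sensitivity). The sketch below shows the calculation only; the function name and the genotype-call counts are hypothetical and are not taken from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from call counts."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)  # fraction of reported calls that are correct
    recall = tp / (tp + fn)     # fraction of true events that were reported
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts chosen only to illustrate the arithmetic.
example = f1_score(tp=85, fp=15, fn=15)
print(round(example, 2))
```

With these made-up counts, precision and recall are both 0.85, so the harmonic mean is 0.85 as well.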

RevDate: 2022-11-30

Garrison E, A Guarracino (2022)

Unbiased pangenome graphs.

Bioinformatics (Oxford, England) pii:6854971 [Epub ahead of print].

MOTIVATION: Pangenome variation graphs model the mutual alignment of collections of DNA sequences. A set of pairwise alignments implies a variation graph, but there are no scalable methods to generate such a graph from these alignments. Existing related approaches depend on a single reference, a specific ordering of genomes, or a de Bruijn model based on a fixed k-mer length. A scalable, self-contained method to build pangenome graphs without such limitations would be a key step in pangenome construction and manipulation pipelines.

RESULTS: We design the seqwish algorithm, which builds a variation graph from a set of sequences and alignments between them. We first transform the alignment set into an implicit interval tree. To build up the variation graph, we query this tree-based representation of the alignments to reduce transitive matches into single DNA segments in a sequence graph. By recording the mapping from input sequence to output graph, we can trace the original paths through this graph, yielding a pangenome variation graph. We present an implementation that operates in external memory, using disk-backed data structures and lock-free parallel methods to drive the core graph induction step. We demonstrate that our method scales to very large graph induction problems by applying it to build pangenome graphs for several species.

AVAILABILITY: seqwish is published as free software under the MIT open source license. Source code and documentation are available at seqwish can be installed via Bioconda or GNU Guix
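The graph induction step described in the abstract above can be illustrated with a toy sketch. This is not the seqwish implementation: the abstract describes an implicit interval tree and disk-backed data structures, whereas the code below swaps in a plain in-memory union-find to take the transitive closure of pairwise base matches. The function name `induce_graph` and the match tuple layout are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical match layout: `length` bases of sequence a starting at
# a_pos are aligned to sequence b starting at b_pos.
Match = Tuple[str, int, str, int, int]


def induce_graph(seqs: Dict[str, str], matches: List[Match]):
    """Toy variation-graph induction via union-find (not seqwish itself)."""
    # Give every base of every input sequence a global offset.
    offsets, total = {}, 0
    for name, s in seqs.items():
        offsets[name] = total
        total += len(s)

    parent = list(range(total))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Union matched bases; transitive matches collapse into one class.
    for a, a_pos, b, b_pos, length in matches:
        for k in range(length):
            if seqs[a][a_pos + k] != seqs[b][b_pos + k]:
                continue  # only identical bases are merged
            ri = find(offsets[a] + a_pos + k)
            rj = find(offsets[b] + b_pos + k)
            if ri != rj:
                parent[rj] = ri

    # Assign compact node ids and record each input sequence as a path.
    node_id, labels, paths = {}, [], {}
    for name, s in seqs.items():
        path = []
        for k, base in enumerate(s):
            root = find(offsets[name] + k)
            if root not in node_id:
                node_id[root] = len(labels)
                labels.append(base)
            path.append(node_id[root])
        paths[name] = path
    return labels, paths


# Made-up input: x~y and y~z are aligned, so x and z share nodes transitively.
seqs = {"x": "ACGT", "y": "ACCT", "z": "ACGT"}
matches = [("x", 0, "y", 0, 4), ("y", 0, "z", 0, 4)]
labels, paths = induce_graph(seqs, matches)
```

Because each input sequence is stored as a path of node ids, spelling out the node labels along a path reproduces the original sequence, which mirrors the abstract's point about tracing the original paths through the graph.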

RevDate: 2022-12-02

Moniruzzaman M, Erazo-Garcia MP, FO Aylward (2022)

Endogenous giant viruses contribute to intraspecies genomic variability in the model green alga Chlamydomonas reinhardtii.

Virus evolution, 8(2):veac102.

Chlamydomonas reinhardtii is a unicellular eukaryotic alga that has been studied as a model organism for decades. Despite an extensive history as a model system, phylogenetic and genetic characteristics of viruses infecting this alga have remained elusive. We analyzed high-throughput genome sequence data of C. reinhardtii field isolates, and in six we discovered sequences belonging to endogenous giant viruses that reach up to several 100 kb in length. In addition, we have also discovered the entire genome of a closely related giant virus that is endogenized within the genome of Chlamydomonas incerta, the closest sequenced relative of C. reinhardtii. Endogenous giant viruses add hundreds of new gene families to the host strains, highlighting their contribution to the pangenome dynamics and interstrain genomic variability of C. reinhardtii. Our findings suggest that the endogenization of giant viruses may have important implications for structuring the population dynamics and ecology of protists in the environment.

RevDate: 2022-11-29

Yu Y, Cheng W, Chen X, et al (2022)

Cyanobacterial Blooms Are Not a Result of Positive Selection by Freshwater Eutrophication.

Microbiology spectrum [Epub ahead of print].

Long-standing cyanobacterial harmful algal blooms (CyanoHABs) are known to result from synergistic interaction between elevated nutrients and superior ecophysiology of cyanobacteria. However, it remains to be determined whether CyanoHABs are a result of positive selection by eutrophic waters. To address this, we conducted molecular evolutionary analyses on the genomes of 9 bloom-forming cyanobacteria, combined with pangenomics and metatranscriptomics. The results showed no positive selection by water eutrophication. Instead, all homologous genes in the species are under strong purifying selection based on the ratio of divergence at nonsynonymous and synonymous sites (dN/dS) and phylogeny. The dN/dS < 0.85 (median = 0.3) for all homologous genes are similar between the genes in the pathways driving CyanoHABs and housekeeping functions. Phylogenetic support for non-positive selection comes from the mixed clustering of strains: strains of the same species from diverse geographic origins form the same clusters, while strains from the same origins form different clusters. Further support lies in the codon adaptation index (CAI) and single nucleotide polymorphism (SNP). The CAI ranged from 0.42 to 0.9 (mean = 0.75), which indicates high-level codon usage bias; the pathways for CyanoHABs and housekeeping functions showed a similar CAI. Interestingly, CAI was negatively correlated with gene expression in 3 metatranscriptomes. The numbers of SNPs were concentrated around 5 to 50. As the SNP number increases, the gene expression level decreases. These negative correlations agree with the population-level dN/dS and phylogeny in supporting purifying selection in bloom-forming cyanobacteria. In summary, superior ecophysiology appears to be acquired prior to water eutrophication. IMPORTANCE CyanoHABs are global environmental hazards, and their mechanisms of action are being intensively investigated. 
On an ecological scale, CyanoHABs are consequences of synergistic interactions between biological functions and elevated nutrients in eutrophic waters. On an evolutionary scale, one important question is how bloom-forming cyanobacteria acquire these superior biological functions. There are several possibilities, including adaptive evolution and horizontal gene transfer. Here, we explored the possibility of positive selection. We reasoned that there are two possible periods for cyanobacteria to acquire these functions: before the onset of water eutrophication or during water eutrophication. Either way, there should be molecular signatures in protein sequences for positive selection. Interestingly, we found no positive selection by water eutrophication, but strong purifying selection instead on nearly all the genes, suggesting these superior functions aiding CyanoHABs are acquired prior to water eutrophication.
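The dN/dS argument in the abstract above rests on a conventional reading of the ratio: values well below 1 suggest purifying selection, values near 1 suggest neutrality, and values above 1 suggest positive selection. The sketch below encodes only that reading; the function name and the tolerance band around 1 are assumptions, and the study's actual estimation method is not reproduced here.

```python
def classify_selection(dn: float, ds: float, tol: float = 0.05) -> str:
    """Interpret omega = dN/dS using the conventional thresholds."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    omega = dn / ds
    if omega < 1 - tol:   # fewer nonsynonymous changes than expected
        return "purifying"
    if omega > 1 + tol:   # excess of nonsynonymous changes
        return "positive"
    return "neutral"      # within the assumed tolerance band around 1


# Illustrative inputs giving omega = 0.3, the median value quoted above.
print(classify_selection(0.03, 0.10))
```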

RevDate: 2022-11-29

Cheng S, Fleres G, Chen L, et al (2022)

Within-Host Genotypic and Phenotypic Diversity of Contemporaneous Carbapenem-Resistant Klebsiella pneumoniae from Blood Cultures of Patients with Bacteremia.

mBio [Epub ahead of print].

It is unknown whether bacterial bloodstream infections (BSIs) are commonly caused by single organisms or mixed microbial populations. We hypothesized that contemporaneous carbapenem-resistant Klebsiella pneumoniae (CRKP) strains from blood cultures of individual patients are genetically and phenotypically distinct. We determined short-read whole-genome sequences of 10 sequence type 258 (ST258) CRKP strains from blood cultures in each of 6 patients (Illumina HiSeq). Strains clustered by patient by core genome and pan-genome phylogeny. In 5 patients, there was within-host strain diversity by gene mutations, presence/absence of antibiotic resistance or virulence genes, and/or plasmid content. Accessory gene phylogeny revealed strain diversity in all 6 patients. Strains from 3 patients underwent long-read sequencing for genome completion (Oxford Nanopore) and phenotypic testing. Genetically distinct strains within individuals exhibited significant differences in carbapenem and other antibiotic responses, capsular polysaccharide (CPS) production, mucoviscosity, and/or serum killing. In 2 patients, strains differed significantly in virulence during mouse BSIs. Genetic or phenotypic diversity was not observed among strains recovered from blood culture bottles seeded with index strains from the 3 patients and incubated in vitro at 37°C. In conclusion, we identified genotypic and phenotypic variant ST258 CRKP strains from blood cultures of individual patients with BSIs, which were not detected by the clinical laboratory or in seeded blood cultures. The data suggest a new paradigm of CRKP population diversity during BSIs, at least in some patients. If validated for BSIs caused by other bacteria, within-host microbial diversity may have implications for medical, microbiology, and infection prevention practices and for understanding antibiotic resistance and pathogenesis. 
IMPORTANCE The long-standing paradigm for pathogenesis of bacteremia is that, in most cases, a single organism passes through a bottleneck and establishes itself in the bloodstream (single-organism hypothesis). In keeping with this paradigm, standard practice in processing positive microbiologic cultures is to test single bacterial strains from morphologically distinct colonies. This study is the first genome-wide analysis of within-host diversity of Klebsiella pneumoniae strains recovered from individual patients with bloodstream infections (BSIs). Our finding that positive blood cultures comprised genetically and phenotypically heterogeneous carbapenem-resistant K. pneumoniae strains challenges the single-organism hypothesis and suggests that at least some BSIs are caused by mixed bacterial populations that are unrecognized by the clinical laboratory. The data support a model of pathogenesis in which pressures in vivo select for strain variants with particular antibiotic resistance or virulence attributes and raise questions about laboratory protocols and treatment decisions directed against single strains.

RevDate: 2022-11-29

Conde C, Thézé J, Cochard T, et al (2022)

Genetic Features of Mycobacterium avium subsp. paratuberculosis Strains Circulating in the West of France Deciphered by Whole-Genome Sequencing.

Microbiology spectrum [Epub ahead of print].

Paratuberculosis is a chronic infection of the intestine, mainly the ileum, caused by Mycobacterium avium subsp. paratuberculosis in cattle and other ruminants. This enzootic disease is present worldwide and has a negative impact on the dairy cattle industry. For this subspecies, the current genotyping tools do not provide the needed resolution to investigate the genetic diversity of closely related strains. These limitations can be overcome by the application of whole-genome sequencing (WGS), particularly for clonal populations such as M. avium subsp. paratuberculosis. The purpose of the present study was to undertake a WGS analysis with a panel of 200 animal field M. avium subsp. paratuberculosis strains selected based on a previous large-scale longitudinal study of Prim'Holstein and Normande dairy breeds naturally infected with M. avium subsp. paratuberculosis in the West of France. The pangenome analysis revealed that M. avium subsp. paratuberculosis has a closed pangenome. The phylogeny, based on alignment of 2,786 nonhomoplasic single nucleotide polymorphisms (SNPs), showed that the strain population is structured into three clades independently of the cattle breed or geographic distribution. The increased resolution of phylogeny obtained by WGS confirmed the homoplasic nature of the markers variable-number tandem repeat (VNTR) and short sequence repeat (SSR) used for M. avium subsp. paratuberculosis genotyping. These phylogenetic data also revealed independent introductions of the different genotypes in two main waves since at least 2003. WGS applied to this sampling demonstrated the presence of mixed infections in herds and at the individual animal level. Collectively, the phylogeny results inferred with French isolates compared to M. avium subsp. paratuberculosis isolates from around the world suggest introductions of M. avium subsp. paratuberculosis genotypes through the animal trade. 
Relationships between genetic traits and epidemiological data can now be investigated to better understand transmission dynamics of the disease. IMPORTANCE Mycobacterium avium subsp. paratuberculosis causes Johne's disease in ruminants, which is present worldwide and has significant negative impacts on the dairy cattle industry and animal welfare. Prevention and control of M. avium subsp. paratuberculosis infection are hampered by knowledge gaps in strain virulence, genotype distribution, and transmission dynamics. This work has revealed new insights into M. avium subsp. paratuberculosis strains currently circulating in western France and how they are related to strains circulating globally. We applied whole-genome sequencing (WGS) to obtain comprehensive information on genome evolution and discrimination of closely related strains. This approach revealed the history of M. avium subsp. paratuberculosis infection in France, refined the pangenomic characteristics of M. avium subsp. paratuberculosis, and demonstrated the existence of mixed infection in animals. Finally, this study identified predominant genotypes, which allow a better understanding of disease transmission dynamics. This information will facilitate tracking of this pathogen on farms and across agricultural regions, thus informing transmission pathways and disease control points.

RevDate: 2022-11-29

Singh V, Pandey S, A Bhardwaj (2022)

From the reference human genome to human pangenome: Premise, promise and challenge.

Frontiers in genetics, 13:1042550.

The Reference Human Genome remains the single most important resource for mapping genetic variations and assessing their impact. However, it is monophasic, incomplete and not representative of the variation that exists in the population. Given the extent of ethno-geographic diversity and the consequent diversity in clinical manifestations of these variations, population specific references were developed over time. The dramatically plummeting cost of sequencing whole genomes and the advent of third generation long range sequencers allowing accurate, error-free, telomere-to-telomere assemblies of human genomes present us with a unique and unprecedented opportunity to develop a more composite standard reference consisting of a collection of multiple genomes that capture the maximal variation existing in the population, with the deepest annotation possible, enabling a realistic, reliable and actionable estimation of clinical significance of specific variations. The Human Pangenome Project thus is a logical next step promising a more accurate and global representation of genomic variations. The pangenome effort must be reciprocally complemented with precise variant discovery tools and exhaustive annotation to ensure unambiguous clinical assessment of the variant in ethno-geographical context. Here we discuss a broad roadmap, the challenges and way forward in developing a universal pangenome reference including data visualization techniques and integration of prior knowledge base in the new graph based architecture and tools to submit, compare, query, annotate and retrieve relevant information from the pangenomes. The biggest challenge, however, will be the ethical, legal and social implications and the training of human resource to the new reference paradigm.

RevDate: 2022-11-29
CmpDate: 2022-11-29

Zoaiter M, Magdy Wasfy R, Caputo A, et al (2022)

Streptococcus bouchesdurhonensis sp. nov. isolated from a bronchoalveolar lavage of a patient with pneumonia.

Archives of microbiology, 205(1):3.

Strain Marseille-Q6994 was isolated from a 72-year-old patient with pneumonia from Bouches-du-Rhône department, in France. Cells were Gram-positive, non-motile, catalase- and oxidase-negative cocci. The major fatty acids were hexadecanoic (47.4%) and tetradecanoic acids (28.3%). 16S rRNA gene sequence comparison suggested that strain Marseille-Q6994 was affiliated to the Streptococcus genus. GroEL phylogenetic analysis separated strain Marseille-Q6994 in a distinct branch from the closely related Streptococcus-type strains with standing in nomenclature. Whole genome sequencing-based methods (OrthoAverage Nucleotide Identity, digital DNA-DNA hybridization and pangenome analysis) supported the classification of the strain into a novel species. Therefore, based on the phenotypic, genomic, and phylogenetic analyses, we propose the name Streptococcus bouchesdurhonensis sp. nov., for which strain Marseille-Q6994[T] (CSUR Marseille-Q6994 = DSMZ 113892) is the type strain.

RevDate: 2022-11-28

Jha UC, Nayyar H, von Wettberg EJB, et al (2022)

Legume Pangenome: Status and Scope for Crop Improvement.

Plants (Basel, Switzerland), 11(22): pii:plants11223041.

In the last decade, legume genomics research has seen a paradigm shift due to advances in genome sequencing technologies, assembly algorithms, and computational genomics that enabled the construction of high-quality reference genome assemblies of major legume crops. These advances have certainly facilitated the identification of novel genetic variants underlying the traits of agronomic importance in many legume crops. Furthermore, these robust sequencing technologies have allowed us to study structural variations across the whole genome in multiple individuals and at the species level using 'pangenome analysis.' This review updates the progress of constructing pangenome assemblies for various legume crops and discusses the prospects for these pangenomes and how to harness the information to improve various traits of economic importance through molecular breeding to increase genetic gain in legumes and tackle the increasing global food crisis.

RevDate: 2022-11-29
CmpDate: 2022-11-29

Almuhayawi MS, Al Jaouni SK, Selim S, et al (2022)

Integrated Pangenome Analysis and Pharmacophore Modeling Revealed Potential Novel Inhibitors against Enterobacter xiangfangensis.

International journal of environmental research and public health, 19(22): pii:ijerph192214812.

Enterobacter xiangfangensis is a novel, multidrug-resistant pathogen belonging to the Enterobacter genus and has the ability to acquire resistance to multiple antibiotic classes. However, there is currently no registered E. xiangfangensis drug on the market that has been shown to be effective. Hence, there is an urgent need to identify novel therapeutic targets and effective treatments for E. xiangfangensis. In the current study, a bacterial pan genome analysis and subtractive proteomics approach was applied to the core proteomes of six strains of E. xiangfangensis using several bioinformatic tools, software, and servers. In total, 2611 nonredundant proteins were predicted from the 21,720 core proteins of the core proteome. Out of 2611 nonredundant proteins, 372 were obtained from Geptop2.0 as essential proteins. After the subtractive proteomics and subcellular localization analysis, only 133 proteins were found in cytoplasm. All cytoplasmic proteins were examined using BLASTp against the virulence factor database, which classifies 20 therapeutic targets as virulent. Out of these 20, 3 cytoplasmic proteins: ferric iron uptake transcriptional regulator (FUR), UDP-2,3-diacylglucosamine diphosphatase (UDP), and lipid-A-disaccharide synthase (lpxB) were chosen as potential drug targets. These drug targets are important for bacterial survival, virulence, and growth and could be used as therapeutic targets. More than 2500 plant chemicals were used to molecularly dock these proteins. Furthermore, the lowest-binding energetic docked compounds were found. The top five hit compounds, Adenine, Mollugin, Xanthohumol C, Sakuranetin, and Toosendanin demonstrated optimum binding against all three target proteins. Furthermore, molecular dynamics simulations and MM/GBSA analyses validated the stability of ligand-protein complexes and revealed that these compounds could serve as potential E. xiangfangensis replication inhibitors. 
Consequently, this study marks a significant step forward in the creation of new and powerful drugs against E. xiangfangensis. Future studies should validate these targets experimentally to prove their function in E. xiangfangensis survival and virulence.

RevDate: 2022-11-29
CmpDate: 2022-11-29

González-Castillo A, Carballo JL, E Bautista-Guerrero (2022)

Genomics, Phylogeny, and in Silico Phenotyping of Nitrosopumilus Genus.

Current microbiology, 80(1):3.

The present study reports the first genome of Nitrosopumilus extracted from the marine sponge Thoosa mismalolli. A genomic study of the genus Nitrosopumilus was performed using seven type-strain genomes (N. maritimus, N. piranensis, N. zosterae, N. ureiphilus, N. adriaticus, N. oxyclinae and N. cobalaminigenes), four Candidatus species genomes (Ca. N. koreensis, Ca. N. sp. AR2, Ca. N. salaria BD31, and SZUA-335), six reference genomes (SI075, SI0036, SI0060, SI0034, SI0048, and bin36o) isolated from marine sponge, a tropical marine fish tank, dimly lit deep coastal waters, the lower euphotic zone of coastal waters, and near-surface sediment, and the MAG N. sp. NMAG03 isolated from Thoosa mismalolli. These genomes were characterized by means of a polyphasic approach comprising multilocus sequence analysis (MLSA) of 139 single-copy genes (SCG), core-pangenome, ANI, and in silico phenotypic characterization. We found that the genomes of the Nitrosopumilus genus formed three separate clusters (A, B, and C) based on 139 SCG sequence similarity. The genomes showed ANI values between 75.2 and 99.5%; the core genome consisted of 168 gene families and the pangenome of 6,011 gene families. Based on the genomic analyses performed, cluster A may contain a potential new species (NMAG03), and cluster C could be represented by three new species of the genus. Finally, based on the results shown in this polyphasic approach, we support the use of the integrated approach for genomic analysis of poorly studied genera.
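Core-genome and pangenome sizes like those reported above come down to set operations over the gene families present in each genome. The sketch below is a minimal illustration under that assumption; the genome names and family labels are placeholders, and the clustering of genes into families, which real pipelines must do first, is taken as given.

```python
from typing import Dict, Set


def summarize_pangenome(families: Dict[str, Set[str]]):
    """Split gene families into pan, core, and accessory sets."""
    genomes = list(families.values())
    pan = set().union(*genomes)        # present in at least one genome
    core = set.intersection(*genomes)  # present in every genome
    accessory = pan - core             # present in some, but not all
    return pan, core, accessory


# Hypothetical gene-family content per genome; labels are placeholders.
toy = {
    "genome_1": {"famA", "famB", "famC"},
    "genome_2": {"famA", "famB", "famD"},
    "genome_3": {"famA", "famB"},
}
pan, core, accessory = summarize_pangenome(toy)
print(len(pan), len(core), len(accessory))  # prints: 4 2 2
```

Adding a genome can only grow or preserve the pangenome and can only shrink or preserve the core, which is why both counts depend on how many genomes are sampled.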

RevDate: 2022-11-26

Gtari M (2022)

Taxogenomic status of phylogenetically distant Frankia clusters warrants their elevation to the rank of genus: A description of Protofrankia gen. nov., Parafrankia gen. nov., and Pseudofrankia gen. nov. as three novel genera within the family Frankiaceae.

Frontiers in microbiology, 13:1041425.

The genus Frankia is at present the sole genus in the family Frankiaceae and encompasses filamentous, sporangia-forming actinomycetes principally isolated from root nodules of taxonomically disparate dicotyledonous hosts named actinorhizal plants. Multiple independent phylogenetic analyses agree with the division of the genus Frankia into four well-supported clusters. Within these clusters, Frankia strains are well defined based on host infectivity range, mode of infection, morphology, and their behaviour in culture. In this study, phylogenomics, overall genome related indices (OGRI), together with available data sets for phenotypic and host-plant ranges available for the type strains of Frankia species, were considered. The robustness and the deep radiation observed in Frankia at the subgeneric level, fulfilling the primary principle of phylogenetic systematics, were strengthened by establishing genome criteria for new genus demarcation boundaries. Therefore, the taxonomic elevation of the Frankia clusters to the rank of the genus is proposed. The genus Frankia should be revised to encompass cluster 1 species only and three novel genera, Protofrankia gen. nov., Parafrankia gen. nov., and Pseudofrankia gen. nov., are proposed to accommodate clusters 2, 3, and 4 species, respectively. New combinations for validly named species are also provided.

RevDate: 2022-11-29
CmpDate: 2022-11-28

Swetha RG, Basu S, Ramaiah S, et al (2022)

Multi-Epitope Vaccine for Monkeypox Using Pan-Genome and Reverse Vaccinology Approaches.

Viruses, 14(11):.

Outbreaks of monkeypox virus infections have imposed major health concerns worldwide, with high morbidity threats to children and immunocompromised adults. Although repurposed drugs and vaccines are being used to curb the disease, the evolving traits of the virus, exhibiting considerable genetic dynamicity, challenge the limits of a targeted treatment. A pan-genome-based reverse vaccinology approach can provide fast and efficient solutions to resolve persistent inconveniences in experimental vaccine design during an outbreak-exigency. The approach encompassed screening of available monkeypox whole genomes (n = 910) to identify viral targets. From 102 screened viral targets, viral proteins L5L, A28, and L5 were finalized based on their location, solubility, and antigenicity. The potential T-cell and B-cell epitopes were extracted from the proteins using immunoinformatics tools and algorithms. Multiple vaccine constructs were designed by combining the epitopes. Based on immunological properties, chemical stability, and structural quality, a novel multi-epitopic vaccine construct, V4, was finalized. Flexible-docking and coarse-dynamics simulation portrayed that the V4 had high binding affinity towards human HLA-proteins (binding energy < -15.0 kcal/mol) with low conformational fluctuations (<1 Å). Thus, the vaccine construct (V4) may act as an efficient vaccine to induce immunity against monkeypox, which encourages experimental validation and similar approaches against emerging viral infections.

RevDate: 2022-11-28
CmpDate: 2022-11-28

Jalil M, Quddos F, Anwer F, et al (2022)

Comparative Pan-Genomic Analysis Revealed an Improved Multi-Locus Sequence Typing Scheme for Staphylococcus aureus.

Genes, 13(11):.

The growing prevalence of antibiotic-resistant Staphylococcus aureus strains mandates selective susceptibility testing and epidemiological investigations. It also draws attention to an efficient typing strategy. Whole genome sequencing helps in genetic comparison, strain differentiation, and typing; however, it is not that cost-effective. In comparison, Multi-Locus Sequence Typing (MLST) is an efficient typing method employed for bacterial strain typing and characterizations. In this paper, a comprehensive pangenome and phylogenetic analysis of 502/1279 S. aureus genomes is carried out to understand the species divergence. Additionally, the current Multi-Locus Sequence Typing (MLST) scheme was evaluated, and genes were excluded or substituted by alternative genes based on reported shortcomings, genomic data, and statistical scores calculated. The data generated were helpful in devising a new Multi-Locus Sequence Typing (MLST) scheme for the efficient typing of S. aureus strains. The revised scheme is now a blend of previously used genes and new candidate genes. The genes yqiL, aroE, and gmk are replaced with better gene candidates, opuCC, aspS, and rpiB, based on their genome localization, representation, and statistical scores. Therefore, the proposed Multi-Locus Sequence Typing (MLST) method offers a greater resolution with 58 sequence types (STs) in comparison to the prior scheme's 42 STs.
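The resolution comparison above (more sequence types under the revised scheme) follows from how MLST works: each isolate's combination of allele numbers across the scheme's loci defines its sequence type, so a more discriminating locus set splits isolates into more types. The sketch below illustrates that mechanism only; the locus names, allele numbers, and ST numbering are hypothetical and do not correspond to the published scheme or to any database.

```python
from typing import Dict, List


def assign_sequence_types(
    profiles: Dict[str, Dict[str, int]], loci: List[str]
) -> Dict[str, int]:
    """Give each distinct allele profile over `loci` its own ST number."""
    st_of_profile: Dict[tuple, int] = {}
    result: Dict[str, int] = {}
    for isolate, alleles in profiles.items():
        key = tuple(alleles[locus] for locus in loci)
        if key not in st_of_profile:
            st_of_profile[key] = len(st_of_profile) + 1  # arbitrary numbering
        result[isolate] = st_of_profile[key]
    return result


# Hypothetical allele numbers at three placeholder loci.
profiles = {
    "iso1": {"locusA": 1, "locusB": 1, "locusC": 1},
    "iso2": {"locusA": 1, "locusB": 1, "locusC": 2},
    "iso3": {"locusA": 2, "locusB": 1, "locusC": 1},
}
coarse = assign_sequence_types(profiles, ["locusA", "locusB"])
fine = assign_sequence_types(profiles, ["locusA", "locusB", "locusC"])
print(len(set(coarse.values())), len(set(fine.values())))  # prints: 2 3
```

In this toy data, iso1 and iso2 are indistinguishable until locusC is included, which is the kind of gain in resolution the abstract reports.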

RevDate: 2022-11-24

Frankish A, Carbonell-Sala S, Diekhans M, et al (2022)

GENCODE: reference annotation for the human and mouse genomes in 2023.

Nucleic acids research pii:6845433 [Epub ahead of print].

GENCODE produces high quality gene and transcript annotation for the human and mouse genomes. All GENCODE annotation is supported by experimental data and serves as a reference for genome biology and clinical genomics. The GENCODE consortium generates targeted experimental data, develops bioinformatic tools and carries out analyses that, along with externally produced data and methods, support the identification and annotation of transcript structures and the determination of their function. Here, we present an update on the annotation of human and mouse genes, including developments in the tools, data, analyses and major collaborations which underpin this progress. For example, we report the creation of a set of non-canonical ORFs identified in GENCODE transcripts, the LRGASP collaboration to assess the use of long transcriptomic data to build transcript models, the progress in collaborations with RefSeq and UniProt to increase convergence in the annotation of human and mouse protein-coding genes, the propagation of GENCODE across the human pan-genome and the development of new tools to support annotation of regulatory features by GENCODE. Our annotation is accessible via Ensembl, the UCSC Genome Browser and

RevDate: 2022-11-25

Tripodi P (2022)

Next generation sequencing technologies to explore the diversity of germplasm resources: Achievements and trends in tomato.

Computational and structural biotechnology journal, 20:6250-6258.

Tomato is one of the major vegetable crops grown worldwide and a model species for genetic and biological research. Progress in genomic technologies made possible the development of forefront methods for high-scale sequencing, providing comprehensive insight into the genetic architecture of germplasm resources. This review revisits next-generation sequencing strategies and applications to investigate the diversity of tomato, describing the common platforms used for SNP genotyping of large collections, de novo sequencing, and whole genome resequencing. Significant findings in evolutionary history are outlined, thus discussing how genomics has provided new hints about the processes behind domestication. Finally, achievement and perspectives on pan-genome construction and graphical pan-genome development toward precise mining of the natural variation to be exploited for breeding purposes are presented.

RevDate: 2022-11-25

Wang Q, Zhang L, Zhang Y, et al (2022)

Comparative genomic analyses reveal genetic characteristics and pathogenic factors of Bacillus pumilus HM-7.

Frontiers in microbiology, 13:1008648.

Bacillus pumilus plays an important role in industrial application and biocontrol activities, but it also causes disease in humans and plants, leading to economic losses and biosafety concerns. However, the pathogenesis and underlying mechanisms of B. pumilus strains remain unclear. In our previous study, a representative isolate of B. pumilus named HM-7 was recovered and proved to be the causal agent of fruit rot on muskmelon (Cucumis melo). Herein, we present a complete and annotated genome sequence of HM-7 that contains 4,111 coding genes in a single 3,951,520 bp chromosome with 41.04% GC content. A total of 3,481 genes were functionally annotated with the GO, COG, and KEGG databases. Pan-core genome analysis of HM-7 and 20 representative B. pumilus strains, as well as six closely related Bacillus species, discovered 740 core genes and 15,205 genes in the pan-genome of 21 B. pumilus strains, in which 485 strain-specific genes were identified in the HM-7 genome. The average nucleotide identity (ANI) and whole-genome-based phylogenetic analysis revealed that HM-7 was most closely related to the C4, GR8, MTCC-B6033, TUAT1 and SH-B11 strains, but evolutionarily distinct from other strains in B. pumilus. Collinearity analysis of the six similar B. pumilus strains showed high levels of synteny but also several divergent regions for each strain. In the HM-7 genome, we identified 484 genes in the carbohydrate-active enzymes (CAZyme) class, 650 genes encoding virulence factors, and 1,115 genes associated with pathogen-host interactions. Moreover, three HM-7-specific regions were determined, which contained 424 protein-coding genes. Further investigation of these genes showed that 19 pathogenesis-related genes were mainly associated with flagella formation and secretion of toxic products, which might be involved in the virulence of strain HM-7. Our results provide detailed genomic and taxonomic information for the HM-7 strain and reveal its potential pathogenic mechanism, laying a foundation for developing effective prevention and control strategies against this pathogen in the future.

RevDate: 2022-11-25

Kumar P, Rani S, Dahiya P, et al (2022)

Whole genome analysis for plant growth promotion profiling of Pantoea agglomerans CPHN2, a non-rhizobial nodule endophyte.

Frontiers in microbiology, 13:998821.

Reduced agricultural production as well as issues like nutrient-depleted soils, eutrophication, and groundwater contamination have drawn attention to the use of endophyte-based bioformulations to restore soil fertility. Pantoea agglomerans CPHN2, a non-rhizobial nodule endophyte isolated from Cicer arietinum, exhibited a variety of plant growth-promoting traits. In this study, we used NextSeq500 technology to analyze whole-genome sequence information of this plant growth-promoting endophytic bacterium. The genome of P. agglomerans CPHN2 has a length of 4,839,532 bp and a G + C content of 55.2%. The whole genome comprises three different genomic fractions: one circular chromosome and two circular plasmids. A comparative analysis between P. agglomerans CPHN2 and 10 genetically similar strains was performed using a bacterial pan-genome pipeline. All the predicted and annotated gene sequences for plant growth promotion (PGP) traits of P. agglomerans CPHN2, such as phosphate solubilization, siderophore synthesis, nitrogen metabolism, and indole-3-acetic acid (IAA) production, were identified. The whole-genome analysis of P. agglomerans CPHN2 provides an insight into the mechanisms underlying PGP by endophytes and its potential applications as a biofertilizer.

RevDate: 2022-11-25
CmpDate: 2022-11-25

Brito LP, Santos DS, Freitas NSA, et al (2022)

In silico evaluation of genomic characteristics of Streptococcus infantarius subsp. infantarius for application in fermentations.

Anais da Academia Brasileira de Ciencias, 94(suppl 3):e20211447 pii:S0001-37652022000700904.

This study aims to evaluate the in silico genomic characteristics of Streptococcus infantarius subsp. infantarius, isolated from Coalho cheese from Paraíba, Brazil, with a view to application in lactic fermentations. rRNA sequences from the 16S ribosomal region were used as input to GenBank, in the search for patterns that could reveal a non-pathogenic behavior of S. infantarius subsp. infantarius, comparing mobile genetic elements, antibiotic resistance genes, pan-genome analysis and multi-genome alignment among related species. S. infantarius subsp. infantarius CJ18 was the only complete genome reported by BLAST/NCBI with high similarity, and comparative genetics with complete genomes of Streptococcus agalactiae (SAG153, NJ1606) and Streptococcus thermophilus (ST106, CS18, IDCC2201, APC151) revealed that CJ18 showed a low number of transposases and integrases, infection by bacteriophages of the Streptococcus genus, absence of antibiotic resistance genes and presence of bacteriocin-, folate- and riboflavin-producing genes. The genome alignment revealed that the collinear blocks of S. thermophilus ST106 and S. agalactiae SAG153 are inverted when compared to the CJ18 genome due to gene positioning, insertions and deletions. Therefore, the strains of S. infantarius subsp. infantarius isolated from Coalho cheese from Paraíba showed genomic similarity with CJ18, and the mobility of genes analyzed in silico showed absence of pathogenicity throughout the genome of CJ18, indicating the potential of these strains for the dairy industry.

RevDate: 2022-11-23

Yang L, Yang Y, Huang L, et al (2022)

From single- to multi-omics: future research trends in medicinal plants.

Briefings in bioinformatics pii:6840072 [Epub ahead of print].

Medicinal plants are the main source of natural metabolites with specialised pharmacological activities and have been widely examined by plant researchers. Numerous omics studies of medicinal plants have been performed to identify molecular markers of species and functional genes controlling key biological traits, as well as to understand biosynthetic pathways of bioactive metabolites and the regulatory mechanisms of environmental responses. Omics technologies have been widely applied to medicinal plants, including taxonomics, transcriptomics, metabolomics, proteomics, genomics, pangenomics, epigenomics and mutagenomics. However, because of the complex biological regulation network, single-omics approaches usually fail to explain specific biological phenomena. In recent years, reports of integrated multi-omics studies of medicinal plants have increased. Until now, there have been few assessments of recent developments and upcoming trends in omics studies of medicinal plants. We highlight recent developments in omics research of medicinal plants, summarise the typical bioinformatics resources available for analysing omics datasets, and discuss related future directions and challenges. This information facilitates further studies of medicinal plants and refinement of current approaches, and leads to new ideas.

RevDate: 2022-11-24

Golchha NC, Nighojkar A, S Nighojkar (2022)

Redefining genomic view of Clostridioides difficile through pangenome analysis and identification of drug targets from its core genome.

Drug target insights, 16:17-24.

INTRODUCTION: Clostridioides difficile infection (CDI) is a leading cause of gastrointestinal infections and is at present a major concern for the global health care system. The unavailability of specific antibiotics for CDI treatment and its emerging cases worldwide further broaden the challenge of controlling CDI.

METHODS: The availability of a large number of genome sequences for C. difficile and many bioinformatics tools for genome analysis provides the opportunity for in silico pangenomic analysis. In the present study, 97 strains of C. difficile were used for pangenomic studies and characterized by phylogenomic and functional analysis.

RESULTS: Pangenome analysis reveals an open pangenome of C. difficile and high genetic diversity. Sequence and interactome analysis of 1,481 core genes was done and eight potent drug targets were identified. Three drug targets, namely, aminodeoxychorismate synthase (PabB), D-alanyl-D-alanine carboxypeptidase (DD-CPase) and undecaprenyl diphospho-muramoyl pentapeptide beta-N-acetylglucosaminyl transferase (MurG transferase), have been reported as drug targets for other human pathogens, and five targets, namely, bifunctional diguanylate cyclase/phosphodiesterase (cyclic-diGMP), sporulation transcription factor (Spo0A), histidinol-phosphate transaminase (HisC), 3-deoxy-7-phosphoheptulonate synthase (DAHP synthase) and c-di-GMP phosphodiesterase (PdcA), are novel.

CONCLUSION: The suggested potent targets could act as broad-spectrum drug targets for C. difficile. However, further validation needs to be done before using them for lead compound discovery.

RevDate: 2022-11-22

Sánchez-Suárez J, Díaz L, Coy-Barrera E, et al (2022)

Specialized Metabolism of Gordonia Genus: An Integrated Survey on Chemodiversity Combined with a Comparative Genomics-Based Analysis.

Biotech (Basel (Switzerland)), 11(4): pii:biotech11040053.

Members of the phylum Actinomycetota (formerly Actinobacteria) have historically been the most prolific providers of small bioactive molecules. Although the genus Streptomyces is the best-known member in this respect, other genera, such as Gordonia, have shown interesting potential in their specialized metabolism. Thus, we combined herein the result of a comprehensive literature survey on metabolites derived from Gordonia strains with a comparative genomic analysis to examine the potential of the specialized metabolism of the genus Gordonia. Thirty Gordonia-derived compounds of different classes were gathered (i.e., alkaloids, amides, phenylpropanoids, and terpenoids), exhibiting antimicrobial and cytotoxic activities, and several were also isolated from Streptomyces (e.g., actinomycin, nocardamin, diolmycin A1). With the genome data, we estimated an open pan-genome of 57,901 genes, most of them being part of the cloud genome. Regarding the BGC content, 531 clusters were found, with terpene, RiPP-like, and NRPS clusters being the most frequent. Our findings demonstrated that Gordonia is a poorly studied genus in terms of its specialized metabolism production and potential applications. Nevertheless, given their BGC content, Gordonia spp. are a valuable biological resource that could expand the chemical spectrum of the phylum Actinomycetota, involving novel BGCs for inspiring innovative outlines for synthetic biology and further use in biotechnological initiatives. Therefore, further studies and more efforts should be made to explore different environments and evaluate other bioactivities.

RevDate: 2022-11-22

Mun T, Vaddadi NSK, B Langmead (2022)

Pangenomic Genotyping with the Marker Array.

Algorithms in bioinformatics : ... International Workshop, WABI ..., proceedings. WABI (Workshop), 242:.

We present a new method and software tool called rowbowt that applies a pangenome index to the problem of inferring genotypes from short-read sequencing data. The method uses a novel indexing structure called the marker array. Using the marker array, we can genotype variants with respect to large panels like the 1000 Genomes Project while avoiding the reference bias that results when aligning to a single linear reference. rowbowt can infer accurate genotypes in less time and memory compared to existing graph-based methods.

RevDate: 2022-11-21

Fullam A, Letunic I, Schmidt TSB, et al (2022)

proGenomes3: approaching one million accurately and consistently annotated high-quality prokaryotic genomes.

Nucleic acids research pii:6835361 [Epub ahead of print].

The interpretation of genomic, transcriptomic and other microbial 'omics data is highly dependent on the availability of well-annotated genomes. As the number of publicly available microbial genomes continues to increase exponentially, the need for quality control and consistent annotation is becoming critical. We present proGenomes3, a database of 907 388 high-quality genomes containing 4 billion genes that passed stringent criteria and have been consistently annotated using multiple functional and taxonomic databases including mobile genetic elements and biosynthetic gene clusters. proGenomes3 encompasses 41 171 species-level clusters, defined based on universal single copy marker genes, for which pan-genomes and contextual habitat annotations are provided. The database is available at

RevDate: 2022-12-01
CmpDate: 2022-12-01

Vij S, Thakur R, P Rishi (2022)

Reverse engineering approach: a step towards a new era of vaccinology with special reference to Salmonella.

Expert review of vaccines, 21(12):1763-1785.

INTRODUCTION: Salmonella is responsible for causing enteric fever, septicemia, and gastroenteritis in humans. Due to the high disease burden and the emergence of multi- and extensively drug-resistant Salmonella strains, it is becoming difficult to treat the infection with the existing battery of antibiotics, as we are not able to discover newer antibiotics at the same pace at which the pathogens are acquiring resistance. Though vaccines against Salmonella are available commercially, they have limited efficacy. Advancements in genome sequencing technologies and immunoinformatics approaches have solved the problem significantly by giving rise to a new era of vaccine design, i.e. 'Reverse engineering.' Reverse engineering/vaccinology has expedited the vaccine identification process. Using this approach, multiple potential proteins/epitopes can be identified and constructed as a single entity to tackle enteric fever.

AREAS COVERED: This review provides details of reverse engineering approach and discusses various protein and epitope-based vaccine candidates identified using this approach against typhoidal Salmonella.

EXPERT OPINION: Reverse engineering approach holds great promise for developing strategies to tackle the pathogen(s) by overcoming the limitations posed by existing vaccines. Progressive advancements in the arena of reverse vaccinology, structural biology, and systems biology combined with an improved understanding of host-pathogen interactions are essential components to design new-generation vaccines.

RevDate: 2022-11-28
CmpDate: 2022-11-22

Guo Y, Zeng C, Ma C, et al (2022)

Comparative genomics analysis of the multidrug-resistant Aeromonas hydrophila MX16A providing insights into antibiotic resistance genes.

Frontiers in cellular and infection microbiology, 12:1042350.

In this paper, the whole genome of the multidrug-resistant Aeromonas hydrophila MX16A was comprehensively analyzed and compared after sequencing by PacBio RS II. To shed light on the drug resistance mechanism of A. hydrophila MX16A, a Kirby-Bauer disk diffusion method was used to assess the phenotypic drug susceptibility. Importantly, resistance against β-lactams, sulfonamides, rifamycins, macrolides, tetracyclines and chloramphenicols was largely consistent with the prediction analysis results of drug resistance genes in the CARD database. The varied types of resistance genes identified from A. hydrophila MX16A revealed multiple resistance mechanisms, including enzyme inactivation, gene mutation and active efflux. The publicly available complete genomes of 35 Aeromonas hydrophila strains on NCBI, including MX16A, were downloaded for genomic comparison and analysis. The analysis of 33 genomes with ANI greater than 95% showed that the pan-genome consisted of 9556 genes, and the core genes converged to 3485 genes. In summary, the obtained results showed that A. hydrophila exhibited great genomic diversity as well as diverse metabolic function, and it is believed that frequent exchanges between strains lead to the horizontal transfer of drug resistance genes.
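
Several abstracts in this bibliography report pan-genome and core-genome sizes. As a minimal sketch of how such counts relate to each other, the Python below derives the pan, core, accessory and strain-specific gene sets from per-genome gene-family sets. It is an illustration only, not the pipeline used in any cited study: the function name `summarize_pangenome`, the strain labels and the gene-family labels are all hypothetical, and it assumes gene families have already been clustered by an upstream tool.

```python
from typing import Dict, Set


def summarize_pangenome(gene_sets: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split gene families into pan, core, accessory and strain-specific sets.

    gene_sets maps a genome name to the set of gene-family identifiers
    found in that genome (assumed to come from a prior clustering step).
    """
    if not gene_sets:
        raise ValueError("at least one genome is required")
    genomes = list(gene_sets.values())
    pan = set().union(*genomes)          # families seen in any genome
    core = set.intersection(*genomes)    # families seen in every genome
    # Count in how many genomes each family occurs.
    counts = {gene: sum(gene in g for g in genomes) for gene in pan}
    unique = {gene for gene, n in counts.items() if n == 1}
    accessory = pan - core               # everything that is not core
    return {"pan": pan, "core": core, "accessory": accessory, "unique": unique}


# Hypothetical toy input: three strains with made-up gene-family labels.
toy = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam2", "fam3", "fam5"},
}
result = summarize_pangenome(toy)
```

In this toy input, `fam1` and `fam2` are core, `fam3` is accessory but shared by two strains, and `fam4` and `fam5` are strain-specific. Real analyses typically relax the strict "present in every genome" rule (for example, to tolerate incomplete assemblies), which this sketch does not attempt.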

RevDate: 2022-11-20

Orata FD, Hussain NAS, Liang KYH, et al (2022)

Genomes of Vibrio metoecus co-isolated with Vibrio cholerae extend our understanding of differences between these closely related species.

Gut pathogens, 14(1):42.

BACKGROUND: Vibrio cholerae, the causative agent of cholera, is a well-studied species, whereas Vibrio metoecus is a recently described close relative that is also associated with human infections. The availability of V. metoecus genomes provides further insight into its genetic differences from V. cholerae. Additionally, both species have been co-isolated from a cholera-free brackish coastal pond and have been suggested to interact with each other by horizontal gene transfer (HGT).

RESULTS: The genomes of 17 strains from each species were sequenced. All strains share a large core genome (2675 gene families) and very few genes are unique to each species (< 3% of the pan-genome of both species). This led to the identification of potential molecular markers (for nitrite reduction, as well as peptidase and rhodanese activities) to further distinguish V. metoecus from V. cholerae. Interspecies HGT events were inferred in 21% of the core genes and 45% of the accessory genes. A directional bias in gene transfer events was found in the core genome, where V. metoecus was a recipient of three times (75%) more genes from V. cholerae than it was a donor (25%).

CONCLUSION: V. metoecus was misclassified as an atypical variant of V. cholerae due to their resemblance in a majority of biochemical characteristics. More distinguishing phenotypic assays can be developed based on the discovery of potential gene markers to avoid any future misclassifications. Furthermore, differences in relative abundance or seasonality were observed between the species and could contribute to the bias in directionality of HGT.

RevDate: 2022-12-06
CmpDate: 2022-12-06

Lofgren LA, Ross BS, Cramer RA, et al (2022)

The pan-genome of Aspergillus fumigatus provides a high-resolution view of its population structure revealing high levels of lineage-specific diversity driven by recombination.

PLoS biology, 20(11):e3001890.

Aspergillus fumigatus is a deadly agent of human fungal disease where virulence heterogeneity is thought to be at least partially structured by genetic variation between strains. While population genomic analyses based on reference genome alignments offer valuable insights into how gene variants are distributed across populations, these approaches fail to capture intraspecific variation in genes absent from the reference genome. Pan-genomic analyses based on de novo assemblies offer a promising alternative to reference-based genomics with the potential to address the full genetic repertoire of a species. Here, we evaluate 260 genome sequences of A. fumigatus including 62 newly sequenced strains, using a combination of population genomics, phylogenomics, and pan-genomics. Our results offer a high-resolution assessment of population structure and recombination frequency, phylogenetically structured gene presence-absence variation, evidence for metabolic specificity, and the distribution of putative antifungal resistance genes. Although A. fumigatus disperses primarily via asexual conidia, we identified extraordinarily high levels of recombination with the lowest linkage disequilibrium decay value reported for any fungal species to date. We provide evidence for 3 primary populations of A. fumigatus, with recombination occurring only rarely between populations and often within them. These 3 populations are structured by both gene variation and distinct patterns of gene presence-absence with unique suites of accessory genes present exclusively in each clade. Accessory genes displayed functional enrichment for nitrogen and carbohydrate metabolism suggesting that populations may be stratified by environmental niche specialization. Similarly, the distribution of antifungal resistance genes and resistance alleles was often structured by phylogeny. Altogether, the pan-genome of A. fumigatus represents one of the largest fungal pan-genomes reported to date including many genes unrepresented in the Af293 reference genome. These results highlight the inadequacy of relying on a single-reference genome-based approach for evaluating intraspecific variation and the power of combined genomic approaches to elucidate population structure, genetic diversity, and putative ecological drivers of clinically relevant fungi.

RevDate: 2022-11-18

Jiang ZM, Deng Y, Han XF, et al (2022)

Geminicoccus flavidas sp. nov. and Geminicoccus harenae sp. nov., two IAA-producing novel rare bacterial species inhabiting desert biological soil crusts.

Frontiers in microbiology, 13:1034816.

Two Gram-staining negative strains (CPCC 101082[T] and CPCC 101083[T]) were isolated from biological sandy soil crust samples collected from the Badain Jaran desert, China. Both isolates were heterotrophic phototrophs and could produce indole-3-acetic acid. The 16S rRNA gene sequences of these two strains were closely related to the members of the family Geminicoccaceae, showing high similarities with Geminicoccus roseus DSM 18922[T] (96.9%) and Arboricoccus pini B29T1[T] (90.1%), respectively. In the phylogenetic tree based on 16S rRNA gene sequences, strains CPCC 101082[T] and CPCC 101083[T] formed a robust distinct clade with Geminicoccus roseus DSM 18922[T] within the family Geminicoccaceae, which indicated that these two isolates could be classified into the genus Geminicoccus. The growth of strain CPCC 101082[T] occurred at 15-42°C and pH 4.0-10.0 (optima at 28-37°C and pH 6.0-8.0). The growth of strain CPCC 101083[T] occurred at 4-45°C and pH 4.0-10.0 (optima at 25-30°C and pH 6.0-8.0). The major cellular fatty acids of CPCC 101082[T] and CPCC 101083[T] contained C18:1 ω7c/C18:1 ω6c, cyclo-C19:0 ω8c, and C16:0. Q-10 was detected as the sole respiratory quinone. Diphosphatidylglycerol, phosphatidylglycerol, phosphatidylcholine, phosphatidylethanolamine, an unidentified phospholipid and an unidentified aminolipid were detected in the polar lipid profile. The genomes of the two isolates were characterized as about 5.9 Mbp in size with a G + C content of nearly 68%. The IAA-producing encoding genes were predicted in both genomes. The values of average nucleotide identity were 80.6, 81.2 and 92.4% based on a pairwise comparison of the genomes of strains CPCC 101082[T] and CPCC 101083[T] and Geminicoccus roseus DSM 18922[T], respectively. On the basis of the genotypic, chemotaxonomic and phenotypic characteristics, the strains CPCC 101082[T] (=NBRC 113513[T] = KCTC 62853[T]) and CPCC 101083[T] (=NBRC 113514[T] = KCTC 62854[T]) are proposed to represent two novel species of the genus Geminicoccus with the names Geminicoccus flavidas sp. nov. and Geminicoccus harenae sp. nov.

RevDate: 2022-11-15

Daware A, Malik A, Srivastava R, et al (2022)

Rice Pangenome Array (RPGA): an efficient genotyping solution for pangenome-based accelerated crop improvement in rice.

The Plant journal : for cell and molecular biology [Epub ahead of print].

The advent of the pangenome era has unraveled previously unknown genetic variation existing within diverse crop plants, including rice. This untapped genetic variation is believed to account for a major portion of phenotypic variation existing in crop plants. However, the use of conventional single reference-guided genotyping often fails to capture a large portion of this genetic variation, leading to a reference bias. This makes it difficult to identify and utilize novel population/cultivar-specific genes for crop improvement. Thus, we developed a rice pangenome genotyping array (RPGA) harboring probes assaying 80K single nucleotide polymorphisms (SNPs) and presence-absence variants (PAVs) spanning the entire 3K rice pangenome. This array provides a simple, user-friendly and cost-effective (60 to 80 USD per sample) solution for rapid pangenome-based genotyping in rice. The GWAS conducted using RPGA-SNP genotyping data of a rice diversity panel detected a total of 42 loci, including previously known as well as novel genomic loci regulating grain size/weight traits in rice. Eight of these identified trait-associated loci (dispensable loci) could not be detected with conventional single reference genome-based GWAS. A WD repeat-containing PROTEIN 12 gene underlying one such dispensable locus on chromosome 7 (qLWR7), along with other non-dispensable loci, was subsequently detected using high-resolution QTL mapping, confirming the authenticity of RPGA-led GWAS. This demonstrates the potential of RPGA-based genotyping to overcome reference bias. The application of RPGA-based genotyping for population structure analysis, hybridity testing, ultra-high-density genetic map construction and chromosome-level genome assembly, and marker-assisted selection was also demonstrated. A web application ( was further developed to provide an easy-to-use platform for the imputation of RPGA-based genotyping data using the 3K Rice Reference Panel and subsequent GWAS.

RevDate: 2022-11-27

Tello D, Gonzalez-Garcia LN, Gomez J, et al (2022)

NGSEP 4: Efficient and accurate identification of orthogroups and whole-genome alignment.

Molecular ecology resources [Epub ahead of print].

Whole-genome alignment allows researchers to understand the genomic structure and variation among genomes. Approaches based on direct pairwise comparisons of DNA sequences require large computational capacities. As a consequence, pipelines combining tools for orthologous gene identification and synteny have been developed. In this manuscript, we present the latest functionalities implemented in NGSEP 4 to identify orthogroups and perform whole genome alignments. NGSEP implements functionalities for identification of clusters of homologous genes, synteny analysis and whole genome alignment. Our results showed that the NGSEP algorithm for orthogroup identification has competitive accuracy and efficiency in comparison to commonly used tools. The implementation also includes a visualization of the whole genome alignment based on synteny of the orthogroups that were identified, and a reconstruction of the pangenome based on frequencies of the orthogroups among the genomes. NGSEP 4 also includes a new graphical user interface based on the JavaFX technology. We expect that these new developments will be very useful for several studies in evolutionary biology and population genomics.

RevDate: 2022-11-30

Chivian D, Jungbluth SP, Dehal PS, et al (2022)

Metagenome-assembled genome extraction and analysis from microbiomes using KBase.

Nature protocols [Epub ahead of print].

Uncultivated Bacteria and Archaea account for the vast majority of species on Earth, but obtaining their genomes directly from the environment, using shotgun sequencing, has only become possible recently. To realize the hope of capturing Earth's microbial genetic complement and to facilitate the investigation of the functional roles of specific lineages in a given ecosystem, technologies that accelerate the recovery of high-quality genomes are necessary. We present a series of analysis steps and data products for the extraction of high-quality metagenome-assembled genomes (MAGs) from microbiomes using the U.S. Department of Energy Systems Biology Knowledgebase (KBase) platform ( Overall, these steps take about a day to obtain extracted genomes when starting from smaller environmental shotgun read libraries, or up to about a week from larger libraries. In KBase, the process is end-to-end, allowing a user to go from the initial sequencing reads all the way through to MAGs, which can then be analyzed with other KBase capabilities such as phylogenetic placement, functional assignment, metabolic modeling, pangenome functional profiling, RNA-Seq and others. While portions of such capabilities are available individually from other resources, the combination of the intuitive usability, data interoperability and integration of tools in a freely available computational resource makes KBase a powerful platform for obtaining MAGs from microbiomes. While this workflow offers tools for each of the key steps in the genome extraction process, it also provides a scaffold that can be easily extended with additional MAG recovery and analysis tools, via the KBase software development kit (SDK).

RevDate: 2022-11-14

Gonçalves Dos Santos R, Castillo RH, Neres Rodrigues DL, et al (2022)

Comparative genomic analysis of the Dietzia genus: an insight into genomic diversity, and adaptation.

Research in microbiology pii:S0923-2508(22)00079-1 [Epub ahead of print].

Dietzia strains are widely distributed in the environment, presenting an opportunistic role, and some species have undetermined taxonomic characteristics. Here, we propose the existence of errors in the classification of species in this genus using comparative genomics. We performed ANI, dDDH, pangenome and genomic plasticity analyses to better elucidate the phylogenomic relationships between Dietzia strains. For this, we used 55 genomes of Dietzia downloaded from public databases that were combined with a newly sequenced genome. Analysis of a phylogenetic tree based on genome similarity comparisons, together with dDDH and ANI analyses, supported grouping different Dietzia species into four distinct groups. The pangenome analysis corroborated the classification of these groups, supporting the idea that some species of Dietzia could be reassigned in a possible classification into three distinct species, each containing less variability than that found within the global pangenome of all strains. Additionally, analysis of genomic plasticity based on groups containing Dietzia strains found differences in the presence and absence of symbiotic islands and pathogenic islands related to their isolation site. We propose that the comparison of pangenome subsets together with phylogenomic approaches can be used as an alternative for the classification and differentiation of new species of the genus Dietzia.

RevDate: 2022-11-29
CmpDate: 2022-11-29

Islam J, Sarkar H, Hoque H, et al (2022)

In-silico approach of identifying novel therapeutic targets against Yersinia pestis using pan and subtractive genomic analysis.

Computational biology and chemistry, 101:107784.

The magnitude of human affliction brought about by bacterial infections has been on the rise since the mid-5th century. Yersinia pestis is one such notable, gram-negative bacterium that inflicted havoc around the globe three times throughout different millenniums by causing deadly plagues. Despite the unremitting efforts by scientists, different strains of Yersinia pestis are still affecting the populations in various parts of the world by growing resistant to existing antimicrobial agents owing to their overuse. The current scenario, therefore, calls for new therapeutics to further combat the disease. In this study, 3105 core, 387 pathogen-specific unique, 536 choke-point, 796 virulence factors, and 115 antimicrobial resistant proteins were found using a pan-genomic and subtractive genome analysis of nine Yersinia pestis strains that could be instrumental in the development of drugs against Yersinia pestis. Subsequently, 1461 and 1114 essential proteins were identified as non-homologous to human and gut microflora. 535 and 30 proteins were predicted as cytoplasmic and broad-spectrum targets respectively. Finally, four potential targets were selected for their high connectivity in protein-protein interaction network. These selected target proteins are associated with one of the major lipopolysaccharide biosynthesis pathways. Therefore, dismantling their activity might indicate a probable strategy for developing therapeutics to combat bacterial infection caused by Yersinia pestis. However, further experimental validation in the laboratory is needed to consolidate the research findings.

RevDate: 2022-11-15
CmpDate: 2022-11-15

Qu L, Li Y, Wang W, et al (2022)

Aestuarium zhoushanense is a later heterotypic synonym of Marivivens donghaensis, and transfer of Paradonghicola geojensis to the genus Marivivens as Marivivens geojensis comb. nov.

International journal of systematic and evolutionary microbiology, 72(11):.

The 16S rRNA genes of Aestuarium zhoushanense G7[T] and Paradonghicola geojensis FJ12[T] shared 100 % sequence identity with Marivivens donghaensis AM-4[T]. Phylogeny of 16S rRNA gene sequences showed that the three type strains formed a monophyletic clade within the genus Marivivens. Whole genome sequence comparisons showed that three type strains shared 46.7-69.7 % digital DNA-DNA hybridization, 92.1-96.4 % average nucleotide identity and 96.2-98.1 % average amino acid identity. The high 16S rRNA gene similarity values show that three type strains should belong to the same genus. The pan-genome of the five strains contained 5754 genes including 1877 core genes. Based on the principle of priority, we propose that A. zhoushanense Yu et al. 2019 is a later heterotypic synonym of M. donghaensis Park et al. 2016, and P. geojensis should be reclassified as Marivivens geojensis comb. nov., respectively.

RevDate: 2022-11-29

Mushtaq M, Khan S, Hassan M, et al (2022)

Computational Design of a Chimeric Vaccine against Plesiomonas shigelloides Using Pan-Genome and Reverse Vaccinology.

Vaccines, 10(11):.

The swift emergence of antibiotic resistance (AR) in bacterial pathogens to make themselves adaptable to changing environments has become an alarming health issue. To prevent AR infection, many ways can be accomplished such as by decreasing the misuse of antibiotics in human and animal medicine. Among these AR bacterial species, Plesiomonas shigelloides is one of the etiological agents of intestinal infection in humans. It is a gram-negative rod-shaped bacterium that is highly resistant to several classes of antibiotics, and no licensed vaccine against the aforementioned pathogen is available. Hence, substantial efforts are required to screen protective antigens from the pathogen whole genome that can be subjected easily to experimental evaluations. Here, we employed a reverse vaccinology (RV) approach to design a multi-antigenic epitopes based vaccine against P. shigelloides. The complete genomes of P. shigelloides were retrieved from the National Center for Biotechnological Information (NCBI) that on average consist of 5226 proteins. The complete proteomes were subjected to different subtractive proteomics filters, and in the results of that analysis, out of total proteins, 2399 were revealed as non-redundant and 2827 as redundant proteins. The non-redundant proteins were further checked for subcellular localization analysis, in which three were localized in the extracellular matrix, eight were outer membrane, and 13 were found in the periplasmic membrane. All surface localized proteins were found to be virulent. Out of a total of 24 virulent proteins, three proteins (flagellar hook protein (FlgE), hypothetical protein, and TonB-dependent hemoglobin/transferrin/lactoferrin family receptor protein) were considered as potential vaccine targets and subjected to epitopes prediction. The predicted epitopes were further examined for antigenicity, toxicity, and solubility. 
A total of 10 epitopes were selected (GFKESRAEF, VQVPTEAGQ, KINENGVVV, ENKALSQET, QGYASANDE, RLNPTDSRW, TLDYRLNPT, RVTKKQSDK, GEREGKNRP, RDKKTNQPL). The selected epitopes were linked with each other via specific GPGPG linkers in order to design a multi-epitopes vaccine construct, and linked with cholera toxin B subunit adjuvant to make the designed vaccine construct more efficient in terms of antigenicity. The 3D structure of the vaccine construct was modeled ab initio as no appropriate template was available. Furthermore, molecular docking was carried out to check the interaction affinity of the designed vaccine with major histocompatibility complex (MHC)-I (PDB ID: 1L1Y), MHC-II (1KG0), and toll-like receptor 4 (TLR-4) (PDB: 4G8A). Molecular dynamic simulation was applied to evaluate the dynamic behavior of vaccine-receptor complexes. Lastly, the binding free energies of the vaccine with receptors were estimated by using MMPB/GBSA methods. All of the aforementioned analyses concluded that the designed vaccine molecule is a good candidate to be used in experimental studies to disclose its immune protective efficacy in animal models.
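
The linker step described in this abstract can be sketched in a few lines of Python. This is only an illustration of joining the ten listed epitopes with GPGPG linkers; the order of the epitopes in the authors' construct, the adjuvant sequence, and the linker used to attach the adjuvant are not given in the abstract, so the listing order is assumed and the adjuvant is left out. The function name `join_epitopes` is hypothetical.

```python
# Epitope sequences exactly as listed in the abstract.
EPITOPES = [
    "GFKESRAEF", "VQVPTEAGQ", "KINENGVVV", "ENKALSQET", "QGYASANDE",
    "RLNPTDSRW", "TLDYRLNPT", "RVTKKQSDK", "GEREGKNRP", "RDKKTNQPL",
]
LINKER = "GPGPG"  # linker named in the abstract


def join_epitopes(epitopes, linker=LINKER):
    """Concatenate epitopes, placing the linker between adjacent pairs.

    Assumption: epitopes are joined in the order given; the abstract does
    not state the order used in the published construct.
    """
    return linker.join(epitopes)


# Epitope-only portion of the construct (adjuvant not included).
construct = join_epitopes(EPITOPES)
print(len(construct))
print(construct)
```

Ten 9-residue epitopes and nine 5-residue linkers give a 135-residue epitope region under these assumptions.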

RevDate: 2022-11-29

Murr L, Huber I, Pavlovic M, et al (2022)

Whole-Genome Sequence Comparisons of Listeria monocytogenes Isolated from Meat and Fish Reveal High Inter- and Intra-Sample Diversity.

Microorganisms, 10(11):.

Interpretation of whole-genome sequencing (WGS) data for foodborne outbreak investigations is complex, as the genetic diversity within processing plants and transmission events need to be considered. In this study, we analyzed 92 food-associated Listeria monocytogenes isolates by WGS-based methods. We aimed to examine the genetic diversity within meat and fish production chains and to assess the applicability of suggested thresholds for clustering of potentially related isolates. Therefore, meat-associated isolates originating from the same samples or processing plants as well as fish-associated isolates were analyzed as distinct sets. In silico serogrouping, multilocus sequence typing (MLST), core genome MLST (cgMLST), and pangenome analysis were combined with screenings for prophages and genetic traits. Isolates of the same subtypes (cgMLST types (CTs) or MLST sequence types (STs)) were additionally compared by SNP calling. This revealed the occurrence of more than one CT within all three investigated plants and within two samples. Analysis of the fish set resulted in predominant assignment of isolates from pangasius catfish and salmon to ST2 and ST121, respectively, potentially indicating persistence within the respective production chains. The approach not only allowed the detection of distinct subtypes but also the determination of differences between closely related isolates, which need to be considered when interpreting WGS data for surveillance.

RevDate: 2022-11-17
CmpDate: 2022-11-14

Khoder M, Osman M, Kassem II, et al (2022)

Whole Genome Analyses Accurately Identify Neisseria spp. and Limit Taxonomic Ambiguity.

International journal of molecular sciences, 23(21):.

Genome sequencing facilitates the study of bacterial taxonomy and allows the re-evaluation of the taxonomic relationships between species. Here, we aimed to analyze the draft genomes of four commensal Neisseria clinical isolates from the semen of infertile Lebanese men. To determine the phylogenetic relationships among these strains and other Neisseria spp. and to confirm their identity at the genomic level, we compared the genomes of these four isolates with the complete genome sequences of Neisseria gonorrhoeae and Neisseria meningitidis and the draft genomes of Neisseria flavescens, Neisseria perflava, Neisseria mucosa, and Neisseria macacae that are available in the NCBI Genbank database. Our findings revealed that the WGS analysis accurately identified and corroborated the matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) species identities of the Neisseria isolates. The combination of three well-established genome-based taxonomic tools (in silico DNA-DNA Hybridization, Ortho Average Nucleotide identity, and pangenomic studies) proved to be relatively the best identification approach. Notably, we also discovered that some Neisseria strains that are deposited in databases contain many taxonomical errors. The latter is very important and must be addressed to prevent misdiagnosis and missing emerging etiologies. We also highlight the need for robust cut-offs to delineate the species using genomic tools.

RevDate: 2022-11-17
CmpDate: 2022-11-14

Hameed A, Poznanski P, Nadolska-Orczyk A, et al (2022)

Graph Pangenomes Track Genetic Variants for Crop Improvement.

International journal of molecular sciences, 23(21):.

Global climate change and the urgency to transform crops require an exhaustive genetic evaluation. The large polyploid genomes of food crops, such as cereals, make it difficult to identify candidate genes with confirmed heritability. Although genome-wide association studies (GWAS) have been proficient in identifying genetic variants that are associated with complex traits, the resolution of acquired heritability faces several significant bottlenecks such as incomplete detection of structural variants (SV), genetic heterogeneity, and/or locus heterogeneity. Consequently, a biased estimate is generated with respect to agronomically complex traits. The graph pangenomes have resolved this missing heritability and provide significant details in terms of specific loci segregating among individuals and evolving to variations. The graph pangenome approach facilitates crop improvements through genome-linked fast breeding.

RevDate: 2022-11-17

Cinque A, Minnei R, Floris M, et al (2022)

The Clinical and Molecular Features in the VHL Renal Cancers; Close or Distant Relatives with Sporadic Clear Cell Renal Cell Carcinoma?.

Cancers, 14(21):.

Von Hippel-Lindau (VHL) disease is an autosomal dominant inherited cancer syndrome caused by germline mutations in the VHL tumor suppressor gene, characterized by the susceptibility to a wide array of benign and malignant neoplasms, including clear-cell renal cell carcinoma. Moreover, VHL somatic inactivation is a crucial molecular event also in sporadic ccRCCs tumorigenesis. While systemic biomarkers in the VHL syndrome do not currently play a role in clinical practice, a new promising class of predictive biomarkers, microRNAs, has been increasingly studied. Lots of pan-genomic studies have deeply investigated the possible biological role of microRNAs in the development and progression of sporadic ccRCC; however, few studies have investigated the miRNA profile in VHL patients. Our review summarizes all the new insights related to clinical and molecular features in VHL renal cancers, with a particular focus on the overlap with sporadic ccRCC.

RevDate: 2022-11-26

Moglad E, Alanazi N, HN Altayb (2022)

Genomic Study of Chromosomally and Plasmid-Mediated Multidrug Resistance and Virulence Determinants in Klebsiella Pneumoniae Isolates Obtained from a Tertiary Hospital in Al-Kharj, KSA.

Antibiotics (Basel, Switzerland), 11(11):.

Klebsiella pneumoniae is an emergent pathogen causing respiratory tract, bloodstream, and urinary tract infections in humans. This study defines the genomic sequence data, genotypic and phenotypic characterization of K. pneumoniae clinically isolated from Al-Kharj, KSA. Whole-genome analysis of four K. pneumoniae strains was performed, including de novo assembly, functional annotation, whole-genome-phylogenetic analysis, antibiotic-resistant gene identification, prophage regions, virulent factor, and pan-genome analysis. The results showed that K6 and K7 strains were MDR and ESBL producers, K16 was an ESBL producer, and K8 was sensitive to all tested drugs except ampicillin. K6 and K7 were identified with sequence type (ST) 23, while K16 and K8 were identified with STs 353 and 592, respectively. K6 and K7 were identified with the K1 (wzi1 genotype) capsule and O1 serotype, while K8 was identified with the K57 (wzi206 genotype) capsule and O3b. K6 isolates harbored 10 antimicrobial resistance genes (ARGs) associated with four different plasmids; the chloramphenicol acetyltransferase (catB3), blaOXA-1 and aac(6')-Ib-cr genes were detected in plasmid pB-8922_OXA-48. K6 and K7 also carried a similar gene cassette in plasmid pC1K6P0122-2; the gene cassettes were the trimethoprim-resistant gene (dfrA14), integron integrase (IntI1), insertion sequence (IS1), transposase protein, and replication initiation protein (RepE). Two hypervirulent plasmids were reported in isolates K6 and K7 that carried synthesis genes (iucA, iucB, iucC, iucD, and iutA) and iron siderophore genes (iroB, iroC, iroD, and iroN). The presence of these plasmids in high-risk clones suggests their dissemination in our region, which represents a serious health problem.

RevDate: 2022-11-24

Oren E, Dafna A, Tzuri G, et al (2022)

Pan-genome and multi-parental framework for high-resolution trait dissection in melon (Cucumis melo).

The Plant journal : for cell and molecular biology [Epub ahead of print].

Linking genotype with phenotype is a fundamental goal in biology and requires robust data for both. Recent advances in plant-genome sequencing have expedited comparisons among multiple-related individuals. The abundance of structural genomic within-species variation that has been discovered indicates that a single reference genome cannot represent the complete sequence diversity of a species, leading to the expansion of the pan-genome concept. For high-resolution forward genetics, this unprecedented access to genomic variation should be paralleled and integrated with phenotypic characterization of genetic diversity. We developed a multi-parental framework for trait dissection in melon (Cucumis melo), leveraging a novel pan-genome constructed for this highly variable cucurbit crop. A core subset of 25 diverse founders (MelonCore25), consisting of 24 accessions from the two widely cultivated subspecies of C. melo, encompassing 12 horticultural groups, and 1 feral accession was sequenced using a combination of short- and long-read technologies, and their genomes were assembled de novo. The construction of this melon pan-genome exposed substantial variation in genome size and structure, including detection of ~300 000 structural variants and ~9 million SNPs. A half-diallel derived set of 300 F2 populations, representing all possible MelonCore25 parental combinations, was constructed as a framework for trait dissection through integration with the pan-genome. We demonstrate the potential of this unified framework for genetic analysis of various melon traits, including rind color intensity and pattern, fruit sugar content, and resistance to fungal diseases. We anticipate that utilization of this integrated resource will enhance genetic dissection of important traits and accelerate melon breeding.
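
The population count in this abstract follows from the half-diallel design: each unordered pair of the 25 founders is crossed once, with no selfs and no reciprocals. A minimal check in Python, using hypothetical founder labels (`F01` to `F25`) that are not from the paper:

```python
from itertools import combinations
from math import comb

N_FOUNDERS = 25  # MelonCore25 founders, per the abstract

# Hypothetical labels standing in for the real accession names.
founders = [f"F{i:02d}" for i in range(1, N_FOUNDERS + 1)]

# A half-diallel takes each unordered parental pair exactly once.
crosses = list(combinations(founders, 2))

n_crosses = len(crosses)
print(n_crosses)                  # number of parental combinations
print(comb(N_FOUNDERS, 2))        # same value from the closed form n(n-1)/2
```

The result, 25 × 24 / 2 = 300, matches the 300 F2 populations reported.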

RevDate: 2022-11-09

Dong X, Zhu M, Li Y, et al (2022)

Whole-Genome Sequencing-Based Species Classification, Multilocus Sequence Typing, and Antimicrobial Resistance Mechanism Analysis of the Enterobacter cloacae Complex in Southern China.

Microbiology spectrum [Epub ahead of print].

Members of the Enterobacter cloacae complex (ECC) are important opportunistic nosocomial pathogens that are associated with a great variety of infections. Due to limited data on the genome-based classification of species and investigation of resistance mechanisms, in this work, we collected 172 clinical ECC isolates between 2019 and 2020 from three hospitals in Zhejiang, China and performed a retrospective whole-genome sequencing to analyze their population structure and drug resistance mechanisms. Of the 172 ECC isolates, 160 belonged to 9 classified species, and 12 belonged to unclassified species based on ANI analysis. Most isolates belonged to E. hormaechei (45.14%) followed by E. kobei (13.71%), which contained 126 STs, including 62 novel STs, as determined by multilocus sequence typing (MLST) analysis. Pan-genome analysis of the two ECC species showed that they have an "open" tendency, which indicated that their pan-genome increased considerably with the addition of new genomes. A total of 80 resistance genes associated with 11 antimicrobial agent categories were identified in the genomes of all the isolates. The most prevailing resistance genes (12/29, 41.38%) were related to β-lactams followed by aminoglycosides. A total of 247 β-lactamase genes were identified, of which the blaACT genes were the most dominant (145/247, 58.70%), followed by the blaTEM genes (21/247, 8.50%). The inherent ACT type β-lactamase genes differed among different species. blaACT-2 and blaACT-3 were only present in E. asburiae, while blaACT-9, blaACT-12, and blaACT-6 exclusively appeared in E. kobei, E. ludwigii, and E. mori. Among the six carbapenemase-encoding genes (blaNDM-1, blaNDM-5, blaIMP-1, blaIMP-4, blaIMP-26, and blaKPC-2) identified, two (blaNDM-1 and blaIMP-1) were identified in an ST78 E. hormaechei isolate. 
Comparative genomic analysis of the carbapenemase gene-related sequences was performed, and the corresponding genetic structure of these resistance genes was analyzed. Genome-wide molecular characterization of the ECC population and resistance mechanism would offer valuable insights into the effective management of ECC infection in clinical settings. IMPORTANCE The presence and emergence of multiple species/subspecies of ECC have led to diversity and complications at the taxonomic level, which impedes our further understanding of the epidemiology and clinical significance of species/subspecies of ECC. Accurate identification of ECC species is extremely important. Also, it is of great importance to study the carbapenem-resistant genes in ECC and to further understand the mechanism of horizontal transfer of the resistance genes by analyzing the surrounding environment around the genes. The occurrence of ECC carrying two MBL genes also indicates that the selection pressure of bacteria is further increased, suggesting that we need to pay special attention to the emergence of such bacteria in the clinic.
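
The "open" pan-genome described in this abstract means that the union of gene families keeps growing as genomes are added, while the shared core shrinks or levels off. The sketch below illustrates that bookkeeping on invented toy data; the strain names, gene labels, and the function `accumulation` are hypothetical and have no connection to the ECC dataset. Pan-genome tools typically average such curves over many genome orderings, which this sketch does not attempt.

```python
# Toy presence/absence data: strain -> set of gene-family labels (all invented).
toy_genomes = {
    "strainA": {"g1", "g2", "g3", "g4"},
    "strainB": {"g1", "g2", "g3", "g5", "g6"},
    "strainC": {"g1", "g2", "g7", "g8"},
    "strainD": {"g1", "g2", "g3", "g9"},
}


def accumulation(genomes):
    """Return (strain, pan size, core size) after adding each genome in order.

    pan  = union of all gene families seen so far
    core = intersection of all gene sets seen so far
    """
    pan, core, curve = set(), None, []
    for name, genes in genomes.items():
        pan |= genes
        core = set(genes) if core is None else core & genes
        curve.append((name, len(pan), len(core)))
    return curve


curve = accumulation(toy_genomes)
for name, pan_size, core_size in curve:
    print(f"{name}: pan={pan_size} core={core_size}")
```

In this toy set every added strain contributes at least one new gene family, so the pan-genome count never plateaus, which is the pattern the term "open" refers to.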

RevDate: 2022-11-16
CmpDate: 2022-11-09

Otani H, Udwary DW, NJ Mouncey (2022)

Comparative and pangenomic analysis of the genus Streptomyces.

Scientific reports, 12(1):18909.

Streptomycetes are highly metabolically gifted bacteria with the abilities to produce bioproducts that have profound economic and societal importance. These bioproducts are produced by metabolic pathways including those for the biosynthesis of secondary metabolites and catabolism of plant biomass constituents. Advancements in genome sequencing technologies have revealed a wealth of untapped metabolic potential from Streptomyces genomes. Here, we report the largest Streptomyces pangenome generated by using 205 complete genomes. Metabolic potentials of the pangenome and individual genomes were analyzed, revealing degrees of conservation of individual metabolic pathways and strains potentially suitable for metabolic engineering. Of them, Streptomyces bingchenggensis was identified as a potent degrader of plant biomass. Polyketide, non-ribosomal peptide, and gamma-butyrolactone biosynthetic enzymes are primarily strain specific while ectoine and some terpene biosynthetic pathways are highly conserved. A large number of transcription factors associated with secondary metabolism are strain-specific while those controlling basic biological processes are highly conserved. Although the majority of genes involved in morphological development are highly conserved, there are strain-specific varieties which may contribute to fine tuning the timing of cellular differentiation. Overall, these results provide insights into the metabolic potential, regulation and physiology of streptomycetes, which will facilitate further exploitation of these important bacteria.

RevDate: 2022-11-08

Lynch T, Nandi T, Jayaprakash T, et al (2022)

Genomic analysis of group A Streptococcus isolated during a correctional facility outbreak of MRSA in 2004.

Journal of the Association of Medical Microbiology and Infectious Disease Canada = Journal officiel de l'Association pour la microbiologie medicale et l'infectiologie Canada, 7(1):23-35.

BACKGROUND: In 2004-2005, an outbreak of impetigo occurred at a correctional facility during a sentinel outbreak of methicillin-resistant Staphylococcus aureus (MRSA) in Alberta, Canada. Next-generation sequencing (NGS) was used to characterize the group A Streptococcus (GAS) isolates and evaluate whether genomic biomarkers could distinguish between those recovered alone and those co-isolated with S. aureus.

METHODS: Superficial wound swabs collected from all adults with impetigo during this outbreak were cultured using standard methods. NGS was used to characterize and compare all of the GAS and S. aureus genomes.

RESULTS: Fifty-three adults were culture positive for GAS, with a subset of specimens also positive for MRSA (n = 5) or methicillin-sensitive S. aureus (n = 3). Seventeen additional MRSA isolates from this facility from the same time frame (no GAS co-isolates) were also included. All 78 bacterial genomes were analyzed for the presence of known virulence factors, plasmids, and antimicrobial resistance (AMR) genes. Among the GAS isolates were 12 emm types, the most common being 41.2 (n = 27; 51%). GAS genomes were phylogenetically compared with local and public datasets of invasive and non-invasive isolates. GAS genomes had diverse profiles for virulence factors, plasmids, and AMR genes. Pangenome analysis did not identify horizontally transferred genes in the co-infection versus single infections.

CONCLUSIONS: GAS recovered from invasive and non-invasive sources were not genetically distinguishable. Virulence factors, plasmids, and AMR profiles grouped by emm type, and no genetic changes were identified that predict co-infection or horizontal gene transfer between GAS and S. aureus.

RevDate: 2022-12-02
CmpDate: 2022-12-01

Weigert S, Perez-Garcia P, Gisdon FJ, et al (2022)

Investigation of the halophilic PET hydrolase PET6 from Vibrio gazogenes.

Protein science : a publication of the Protein Society, 31(12):e4500.

The handling of plastic waste and the associated ubiquitous occurrence of microplastic poses one of the biggest challenges of our time. Recent investigations of plastic degrading enzymes have opened new prospects for biological microplastic decomposition as well as recycling applications. For polyethylene terephthalate, in particular, several natural and engineered enzymes are known to have such promising properties. From a previous study that identified new PETase candidates by homology search, we chose the candidate PET6 from the globally distributed, halophilic organism Vibrio gazogenes for further investigation. By mapping the occurrence of Vibrios containing PET6 homologs we demonstrated their ubiquitous prevalence in the pangenome of several Vibrio strains. The biochemical characterization of PET6 showed that PET6 has a comparatively lower activity than other enzymes but also revealed a superior turnover at very high salt concentrations. The crystal structure of PET6 provides structural insights into this adaptation to saline environments. By grafting only a few beneficial mutations from other PET degrading enzymes onto PET6, we increased the activity up to three-fold, demonstrating the evolutionary potential of the enzyme. MD simulations of the variant helped rationalize the mutational effects of those mutants and elucidate the interaction of the enzyme with a PET substrate. With tremendous amounts of plastic waste in the Ocean and the prevalence of Vibrio gazogenes in marine biofilms and estuarine marshes, our findings suggest that Vibrio and the PET6 enzyme are worthy subjects to study the PET degradation in marine environments.

RevDate: 2022-11-08
CmpDate: 2022-11-08

Luo X, Kang X, A Schönhuth (2022)

VeChat: correcting errors in long reads using variation graphs.

Nature communications, 13(1):6657.

Error correction is the canonical first step in long-read sequencing data analysis. Current self-correction methods, however, are affected by consensus sequence induced biases that mask true variants in haplotypes of lower frequency showing in mixed samples. Unlike consensus sequence templates, graph-based reference systems are not affected by such biases, so do not mistakenly mask true variants as errors. We present VeChat, as an approach to implement this idea: VeChat is based on variation graphs, as a popular type of data structure for pangenome reference systems. Extensive benchmarking experiments demonstrate that long reads corrected by VeChat contain 4 to 15 (Pacific Biosciences) and 1 to 10 times (Oxford Nanopore Technologies) fewer errors than when being corrected by state of the art approaches. Further, using VeChat prior to long-read assembly significantly improves the haplotype awareness of the assemblies. VeChat is an easy-to-use open-source tool and publicly available at .
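
To make the contrast with a single consensus template concrete, here is a minimal sketch of a variation graph as a data structure: nodes carry sequence fragments, edges connect fragments that are adjacent in some haplotype, and each haplotype is a path through the graph. This is not VeChat's implementation; the node contents, path names, and the `spell` helper are invented for illustration.

```python
# Nodes hold sequence fragments; node 2 and node 3 are alternative alleles.
nodes = {1: "ACG", 2: "T", 3: "C", 4: "GGA"}

# Directed edges between fragments that may follow one another.
edges = {(1, 2), (1, 3), (2, 4), (3, 4)}

# Each haplotype is stored as a path (ordered list of node ids).
paths = {
    "hap_major": [1, 2, 4],
    "hap_minor": [1, 3, 4],  # low-frequency variant kept as its own path
}


def spell(path):
    """Return the sequence spelled by a path, checking every step is an edge."""
    for a, b in zip(path, path[1:]):
        if (a, b) not in edges:
            raise ValueError(f"no edge {a}->{b}")
    return "".join(nodes[n] for n in path)


sequences = {name: spell(p) for name, p in paths.items()}
print(sequences)
```

Because both alleles exist as separate nodes, the minor haplotype is represented in the graph itself instead of being collapsed into the majority base, as would happen with a single consensus string.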

RevDate: 2022-11-05

Alsowayeh N, A Albutti (2022)

Designing a novel chimeric multi-epitope vaccine against Burkholderia pseudomallei, a causative agent of melioidosis.

Frontiers in medicine, 9:945938.

Burkholderia pseudomallei, a gram-negative soil-dwelling bacterium, is primarily considered a causative agent of melioidosis infection in both animals and humans. Despite the severity of the disease, there is currently no licensed vaccine on the market. The development of an effective vaccine against B. pseudomallei could help prevent the spread of infection. The purpose of this study was to develop a multi-epitope-based vaccine against B. pseudomallei using advanced bacterial pan-genome analysis. A total of four proteins were prioritized for epitope prediction by using multiple subtractive proteomics filters. Following that, a multi-epitopes based chimeric vaccine construct was modeled and joined with an adjuvant to improve the potency of the designed vaccine construct. The structure of the construct was predicted and analyzed for flexibility. A population coverage analysis was performed to evaluate the broad-spectrum applicability of B. pseudomallei. The computed combined world population coverage was 99.74%. Molecular docking analysis was applied further to evaluate the binding efficacy of the designed vaccine construct with the human toll-like receptors-5 (TLR-5). Furthermore, the dynamic behavior and stability of the docked complexes were investigated using molecular dynamics simulation, and the binding free energy determined for Vaccine-TLR-5 was delta total -168.3588. The docking result revealed that the vaccine construct may elicit a suitable immunological response within the host body. Hence, we believe that the designed in-silico vaccine could be helpful for experimentalists in the formulation of a highly effective vaccine for B. pseudomallei.

RevDate: 2022-11-30
CmpDate: 2022-11-07

Amulyasai B, Anusha R, Sasikala C, et al (2022)

Phylogenomic analysis of a metagenome-assembled genome indicates a new taxon of an anoxygenic phototroph bacterium in the family Chromatiaceae and the proposal of "Candidatus Thioaporhodococcus" gen. nov.

Archives of microbiology, 204(12):688.

In this study, three metagenome-assembled genomes of a sediment sample were constructed. A Bin1 (JB001) genome was identified as a photo-litho-auto/heterotroph (purple sulfur bacteria) bacterium with the ability to fix nitrogen, tolerate salt, and to produce bacteriochlorophyll a. It has a genome length of 4.1 Mb and a G + C content of 64.9%. Phylogenetic studies based on concatenated 92 core genes and photosynthetic genes (pufLM and bchY) showed that Bin JB001 is related to Thiococcus pfennigii, "Thioflavicoccus mobilis" and to the Lamprocystis purpurea lineage. Bin JB001 and its closely related members were subjected to the genome-based study of phenotypic and phylogenomic analysis. Genomic similarity indices (dDDH and ANI) showed that Bin JB001 could be defined as a novel species. The average amino acid identity (AAI) and percentage of conserved proteins (POCP) values were below 60 and 50%, respectively. The pan-genome analysis indicated that the pan-genome was an open type wherein Bin JB001 had 855 core genes. This study shows that the binned genome, Bin JB001 could represent a novel species of a new genus under the family Chromatiaceae, for which the name "Candidatus Thioaporhodococcus sediminis" gen. nov. sp. nov. is proposed.

RevDate: 2022-11-03

Amas J, Thomas WJW, Zhang Y, et al (2022)

Key advances in the new era of genomics-assisted disease resistance improvement of Brassica species.

Phytopathology [Epub ahead of print].

Disease resistance improvement remains a major focus in breeding programs as diseases continue to devastate Brassica production systems due to intensive cultivation and climate change. Genomics has paved the way to understand the complex genomes of Brassicas, which has been pivotal in the dissection of the genetic underpinnings of agronomic traits driving the development of superior cultivars. The new era of genomics-assisted disease resistance breeding has been marked by the development of high-quality genome references, accelerating the identification of disease resistance genes controlling both qualitative (major) gene and quantitative resistance (QR). This facilitates the development of molecular markers for marker assisted selection (MAS) and enables genome editing approaches for targeted gene manipulation to enhance the genetic value of disease resistance traits. This review summarizes the key advances in the development of genomic resources for Brassica species, focusing on improved genome references, based on long-read sequencing technologies, and pangenome assemblies. This is further supported by the advances in pathogen genomics, which have resulted in the discovery of pathogenicity factors, complementing the mining of disease resistance genes in the host. Recognizing the co-evolutionary arms race between the host and pathogen, it is critical to identify novel resistance genes using crop wild relatives (CWRs) and synthetic cultivars or through genetic manipulation via genome-editing to sustain the development of superior cultivars. Integrating these key advances with new breeding techniques and improved phenotyping using advanced data analysis platforms will make disease resistance improvement in Brassica species more efficient and responsive to current and future demand.

RevDate: 2022-11-02

Maynard-Smith L, Derrick JP, Borrow R, et al (2022)

Genome-wide association studies identify an association of transferrin binding protein B variation and invasive serogroup Y meningococcal disease in older adults.

The Journal of infectious diseases pii:6794087 [Epub ahead of print].

BACKGROUND: Neisseria meningitidis serogroup Y, especially ST-23 clonal complex (Y:cc23), represents a larger proportion of invasive meningococcal disease (IMD) in older adults compared to younger individuals. This study explored the meningococcal genetic variation underlying this association.

METHODS: Maximum-likelihood phylogenies and the pangenome were analysed using whole genome sequence (WGS) data from 200 Y:cc23 isolates in the Neisseria PubMLST database. Genome-wide association studies (GWAS) were performed on WGS data from 250 Y:cc23 isolates from individuals with IMD aged ≥65 years versus < 65 years.

RESULTS: Y:cc23 meningococcal variants did not cluster by age-group or disease phenotype in phylogenetic analyses. Pangenome comparisons found no differences in presence or absence of genes in IMD isolates from the different age groups. GWAS identified differences in nucleotide polymorphisms within the transferrin-binding protein B (tbpB) gene in isolates from individuals ≥65 years of age. TbpB structure modelling suggests these may impact binding of human transferrin.

CONCLUSION: These data suggest differential iron scavenging capacity amongst Y:cc23 meningococci isolated from older compared to younger patients. Iron acquisition is essential for many bacterial pathogens including the meningococcus. These polymorphisms may facilitate colonisation, thereby increasing the risk of disease in vulnerable older people with altered nasopharyngeal microbiomes and nutritional status.

RevDate: 2022-11-01

Martin FJ, Amode MR, Aneja A, et al (2022)

Ensembl 2023.

Nucleic acids research pii:6786199 [Epub ahead of print].

Ensembl ( has produced high-quality genomic resources for vertebrates and model organisms for more than twenty years. During that time, our resources, services and tools have continually evolved in line with both the publicly available genome data and the downstream research and applications that utilise the Ensembl platform. In recent years we have witnessed a dramatic shift in the genomic landscape. There has been a large increase in the number of high-quality reference genomes through global biodiversity initiatives. In parallel, there have been major advances towards pangenome representations of higher species, where many alternative genome assemblies representing different breeds, cultivars, strains and haplotypes are now available. In order to support these efforts and accelerate downstream research, it is our goal at Ensembl to create high-quality annotations, tools and services for species across the tree of life. Here, we report our resources for popular reference genomes, the dramatic growth of our annotations (including haplotypes from the first human pangenome graphs), updates to the Ensembl Variant Effect Predictor (VEP), interactive protein structure predictions from AlphaFold DB, and the beta release of our new website.

RevDate: 2022-11-01

Liu N, Liu D, Li K, et al (2022)

Pan-Genome Analysis of Staphylococcus aureus Reveals Key Factors Influencing Genomic Plasticity.

Microbiology spectrum [Epub ahead of print].

The massive quantities of bacterial genomic data being generated have facilitated in-depth analyses of bacteria for pan-genomic studies. However, the pan-genome compositions of one species differed significantly between different studies, so we used Staphylococcus aureus as a model organism to explore the influences driving bacterial pan-genome composition. We selected a series of diverse strains for pan-genomic analysis to explore the pan-genomic composition of S. aureus at the species level and the actual contribution of influencing factors (sequence type [ST], source of isolation, country of isolation, and date of collection) to pan-genome composition. We found that the distribution of core genes in bacterial populations restrained under different conditions differed significantly and showed "local core gene regions" in the same ST. Therefore, we propose that ST may be a key factor driving the dynamic distribution of bacterial genomes and that phylogenetic analyses using whole-genome alignment are no longer appropriate in populations containing multiple ST strains. Pan-genomic analysis showed that some of the housekeeping genes of multilocus sequence typing (MLST) are carried at less than 60% in S. aureus strains. Consequently, we propose a new set of marker genes for the classification of S. aureus, which provides a reference for finding a new set of housekeeping genes to apply to MLST. In this study, we explored the role of driving factors influencing pan-genome composition, providing new insights into the study of bacterial pan-genomes. IMPORTANCE We sought to explore the impact of driving factors influencing pan-genome composition using Staphylococcus aureus as a model organism to provide new insights for the study of bacterial pan-genomes. We believe that the sequence type (ST) of the strains under consideration plays a significant role in the dynamic distribution of bacterial genes. 
Our findings indicate that there are a certain number of essential genes in Staphylococcus aureus; however, the number of core genes is not as high as previously thought. The new classification method proposed herein suggests that a new set of housekeeping genes more suitable for Staphylococcus aureus must be identified to improve the current classification status of this species.

RevDate: 2022-11-01

Yuan Y, Seif Y, Rychel K, et al (2022)

Pan-Genome Analysis of Transcriptional Regulation in Six Salmonella enterica Serovar Typhimurium Strains Reveals Their Different Regulatory Structures.

mSystems [Epub ahead of print].

Establishing transcriptional regulatory networks (TRNs) in bacteria has been limited to well-characterized model strains. Using machine learning methods, we established the transcriptional regulatory networks of six Salmonella enterica serovar Typhimurium strains from their transcriptomes. By decomposing a compendia of RNA sequencing (RNA-seq) data with independent component analysis, we obtained 400 independently modulated sets of genes, called iModulons. We (i) performed pan-genome analysis of the phylogroup structure of S. Typhimurium and analyzed the iModulons against this background, (ii) revealed different genetic signatures in pathogenicity islands that explained phenotypes, (iii) discovered three transport iModulons linked to antibiotic resistance, (iv) described concerted responses to cationic antimicrobial peptides, and (v) uncovered new regulons. Thus, by combining pan-genome and transcriptomic analytics, we revealed variations in TRNs across six strains of serovar Typhimurium. IMPORTANCE Salmonella enterica serovar Typhimurium is a pathogen involved in human nontyphoidal infections. Treating S. Typhimurium infections is difficult due to the species's dynamic adaptation to its environment, which is dictated by a complex transcriptional regulatory network (TRN) that is different across strains. In this study, we describe the use of independent component analysis to characterize the differential TRNs across the S. Typhimurium pan-genome using a compendium of high-quality RNA-seq data. This approach provided unprecedented insights into the differences between regulation of key cellular functions and pathogenicity in the different strains. The study provides an impetus to initiate a large-scale effort to reveal the TRN differences between the major phylogroups of the pathogenic bacteria, which could fundamentally impact personalizing treatments of bacterial pathogens.

RevDate: 2022-10-31

Hur JI, Kim J, Ryu S, et al (2022)

Phylogenetic Association and Genetic Factors in Cold Stress Tolerance in Campylobacter jejuni.

Microbiology spectrum [Epub ahead of print].

Campylobacter jejuni is a major foodborne pathogen transmitted to humans primarily via contaminated poultry meat. Since poultry meat is generally processed, distributed, and stored in the cold chain, the survival of C. jejuni at refrigeration temperatures crucially affects human exposure to C. jejuni. Here, we investigated genetic factors associated with cold stress tolerance in C. jejuni. Seventy-nine C. jejuni strains isolated from retail raw chicken exhibited different survival levels at 4°C for 21 days. Multilocus sequence typing (MLST) clonal complex 21 (CC-21) and CC-443 were dominant among cold stress-tolerant strains, whereas CC-45 was common among cold stress-sensitive strains. Genome-wide average nucleotide identity (ANI) analysis identified a phylogenetic cluster associated with cold stress tolerance. Moreover, a pangenome analysis revealed 58 genes distinctively present in the cold stress-tolerant phylogenetic cluster. Among these 58 genes, cfrA, encoding the ferric enterobactin receptor involved in ion transport and metabolism, was selected for further analysis. Remarkably, the viability of a ΔcfrA mutant at 4°C was significantly decreased, while the levels of total reactive oxygen species and intracellular iron exceeded those of the wild type. Additionally, a knockout mutation of cfrA also significantly decreased the viability of three cold stress-tolerant isolates at 4°C, confirming the role of cfrA in cold stress tolerance. The results of this study demonstrate that unique phylogenetic clusters of C. jejuni associated with cold stress tolerance exist and that cfrA is a genetic factor contributing to cold stress tolerance in C. jejuni. IMPORTANCE The tolerance of foodborne pathogens to environmental stresses significantly affects food safety. Several studies have demonstrated that C. jejuni survives extended exposures to low temperatures, but the mechanisms of cold stress tolerance are not fully understood. Here, we demonstrate that C. jejuni strains in certain phylogenetic groups exhibit increased tolerance to cold stress. Notably, cfrA is present in the phylogenetic cluster associated with cold stress tolerance and plays a role in the survival of C. jejuni at low temperatures by alleviating oxidative stress. This is the first study to discover phylogenetic associations involving cold stress tolerance and to identify genetic elements conferring cold stress tolerance to C. jejuni.

RevDate: 2022-11-01
CmpDate: 2022-11-01

Chen Y, Miao Y, Bai W, et al (2022)

Characteristics and potential functional effects of long insertions in Asian butternuts.

BMC genomics, 23(1):732.

BACKGROUND: Structural variants (SVs) play important roles in adaptive evolution and species diversification. In plants especially, many phenotypic responses to the environment have been found to be associated with SVs. Despite the prevalence and significance of SVs, long insertions remain poorly detected and studied in all but model species.

RESULTS: We used whole-genome resequencing of paired reads from 80 Asian butternuts to detect long insertions and further analyse their characteristics and potential functional effects. By combining mapping-based and de novo assembly-based methods, we obtained a multiple related species pangenome representing higher taxonomic groups. We obtained 89,312 distinct contigs totaling 147,773,999 base pairs (bp) of new sequences, of which 347 were putative long insertions placed in the reference genome. Most of the putative long insertions appeared in multiple species; in contrast, only 62 putative long insertions appeared in one species, which may be involved in the response to the environment. 65 putative long insertions fell into 61 distinct protein-coding genes involved in plant development, and 105 putative long insertions fell upstream of 106 distinct protein-coding genes involved in cellular respiration. 3,367 genes were annotated in 2,606 contigs. We propose PLAINS, a streamlined, comprehensive pipeline for the prediction and analysis of long insertions using whole-genome resequencing.

CONCLUSIONS: Our study lays down an important foundation for further whole-genome long insertion studies, allowing the investigation of their effects by experiments.

RevDate: 2022-10-29

Zia K, Rao MJ, Sadaqat M, et al (2022)

Pangenome-wide analysis of cyclic nucleotide-gated channel (CNGC) gene family in citrus Spp. Revealed their intraspecies diversity and potential roles in abiotic stress tolerance.

Frontiers in genetics, 13:1034921.

Cyclic nucleotide-gated channels (CNGC) gene family has been found to be involved in physiological processes including signaling pathways, environmental stresses, plant growth, and development. This gene family of non-selective cation channels is known to regulate the uptake of calcium and is reported in several plant species. The pangenome-wide studies enable researchers to understand the genetic diversity comprehensively; as a comparative analysis of multiple plant species or member of a species at once helps to better understand the evolutionary relationships and diversity present among them. In the current study, pangenome-wide analysis of the CNGC gene family has been performed on five Citrus species. As a result, a total of 32 genes in Citrus sinensis, 27 genes in Citrus recticulata, 30 genes in Citrus grandis, 31 genes in Atalantia buxfolia, and 30 genes in Poncirus trifoliata were identified. In addition, two unique genes CNGC13 and CNGC14 were identified, which may have potential roles. All the identified CNGC genes were unevenly distributed on 9 chromosomes except P. trifoliata had genes distributed on 7 chromosomes and were classified into four major groups and two sub-groups namely I, II, III, IV-A, and IV-B. Cyclic nucleotide binding (CNB) motif, calmodulin-binding motif (CaMB), and motif for IQ-domain were conserved in Citrus Spp. Intron exon structures of citrus species were not exactly as same as the gene structures of Arabidopsis. The majority of cis-regulatory elements (CREs) were light responsive and others include growth, development, and stress-related indicating potential roles of the CNGC gene family in these functions. Both segmental and tandem duplication were involved in the expansion of the CNGC gene family in Citrus Spp. The miRNAs are involved in the response of CsCNGC genes towards drought stress along with having regulatory association in the expression of these genes. 
Protein-protein interaction (PPI) analysis also showed the interaction of CNGC proteins with other CNGCs which suggested their potential role in pathways regulating different biological processes. GO enrichment revealed that CNGC genes were involved in the transport of ions across membranes. Furthermore, tissue-specific expression patterns of leaf samples of C. sinensis were studied under drought stress. Out of 32 genes of C. sinensis, 3 genes, i.e., CsCNGC1.4, CsCNGC2.1, and CsCNGC4.2, were highly up-regulated, and only CsCNGC4.6 was highly down-regulated. The qRT-PCR analysis also showed that CNGC genes were highly expressed after treatment with drought stress, while gene expression was lower under controlled conditions. This work includes findings based on multiple genomes instead of one; therefore, it will provide more genomic information than single genome-based studies. These findings will serve as a basis for further functional insights into the CNGC gene family.

RevDate: 2022-10-31
CmpDate: 2022-10-31

Zhang J, Xu J, Lei H, et al (2023)

The development of variation-based rifampicin resistance in Staphylococcus aureus deciphered through genomic and transcriptomic study.

Journal of hazardous materials, 442:130112.

Rifampicin (RIF) resistance imposes a challenge on the antimicrobial treatment of pathogen infections. Figuring out the development mechanism of RIF resistance is critical to improving antimicrobial therapy strategy in clinics and biological treatment strategy of RIF polluted sewage in environmental engineering. The RIF resistance development of Staphylococcus aureus (S. aureus) with exposure to RIF at sub-inhibitory concentrations was comprehensively investigated via genomic and transcriptomic approaches in this study. RIF minimal inhibitory concentration (MIC) for S. aureus rapidly increased from 0.032 to 256 mg/L. Membrane permeability decrease, biofilm formation enhancement, and ROS production increase associated with RIF resistance were observed in RIF-induced strains. Through comparative genomic analysis, mutations in rpoB and rpoC were considered to be associated with RIF resistance in S. aureus mutants. Pan-genome-wide single-nucleotide variant analysis indicated that mutations at rpoB-1412, rpoB-1451, and rpoB-1457 were prevalent in 13849 public genomes of S. aureus, while mutations at rpoB-2256, and rpoC-3092 were first discovered in this study. The panorama of adaptative alteration of cellular physiological processes was observed via transcriptomic analysis. The oxidation pressure responses, metabolism, transporters, virulence factors, and multiple steps of DNA and RNA machinery were found to be perturbed by RIF in S. aureus.

RevDate: 2022-10-31
CmpDate: 2022-10-31

Leigh RJ, McKenna C, McWade R, et al (2022)

Comparative genomics and pangenomics of vancomycin-resistant and susceptible Enterococcus faecium from Irish hospitals.

Journal of medical microbiology, 71(10):.

Introduction. Enterococcus faecium has emerged as an important nosocomial pathogen, which is increasingly difficult to treat due to the genetic acquisition of vancomycin resistance. Ireland has a recalcitrant vancomycin-resistant bloodstream infection rate compared to other developed countries. Hypothesis/Gap statement. Vancomycin resistance rates persist amongst E. faecium isolates from Irish hospitals. The evolutionary genomics governing these trends have not been fully elucidated. Methodology. A set of 28 vancomycin-resistant isolates was sequenced to construct a dataset alongside 61 other publicly available Irish genomes. This dataset was extensively analysed using in silico methodologies (comparative genomics, pangenomics, phylogenetics, genotypics and comparative functional analyses) to uncover distinct evolutionary, coevolutionary and clinically relevant population trends. Results. These results suggest that a stable (in terms of genome size, GC% and number of genes), yet genetically diverse population (in terms of gene content) of E. faecium persists in Ireland with acquired resistance arising via plasmid acquisition (vanA) or, to a lesser extent, chromosomal recombination (vanB). Population analysis revealed five clusters with one cluster partitioned into four clades which transcend isolation dates. Pangenomic and recombination analyses revealed an open (whole genome and chromosomal specific) pangenome illustrating a rampant evolutionary pattern. Comparative resistomics and virulomics uncovered distinct chromosomal and mobilomal propensity for multidrug resistance, widespread chromosomal point-mutation-mediated resistance and chromosomally harboured arsenals of virulence factors. Interestingly, a potential difference in biofilm formation strategies was highlighted by coevolutionary analysis, suggesting differential biofilm genotypes between vanA and vanB isolates. Conclusions. These results highlight the evolutionary history of Irish E. faecium isolates and may provide insight into underlying infection dynamics in a clinical setting. Due to the apparent ease of vancomycin resistance acquisition over time, susceptible E. faecium should be concurrently reduced in Irish hospitals to mitigate potential resistant infections.

RevDate: 2022-10-30

Nawaz M, Ullah A, Al-Harbi AI, et al (2022)

Genome-Based Multi-Antigenic Epitopes Vaccine Construct Designing against Staphylococcus hominis Using Reverse Vaccinology and Biophysical Approaches.

Vaccines, 10(10):.

Staphylococcus hominis is a Gram-positive bacterium from the Staphylococcus genus; it is also a member of coagulase-negative staphylococci because of its opportunistic nature and ability to cause life-threatening bloodstream infections in immunocompromised patients. Gram-positive and opportunistic bacteria have become a major concern for the medical community. It has also drawn the attention of scientists due to the evaluation of immune evasion tactics and the development of multidrug-resistant strains. This prompted the need to explore novel therapeutic approaches as an alternative to antibiotics. The current study aimed to develop a broad-spectrum, multi-epitope vaccine to control bacterial infections and reduce the burden on healthcare systems. A computational framework was designed to filter the immunogenic potent vaccine candidate. This framework consists of pan-genomics, subtractive proteomics, and immunoinformatics approaches to prioritize vaccine candidates. A total of 12,285 core proteins were obtained using a pan-genome analysis of all strains. The screening of the core proteins resulted in the selection of only two proteins for the next epitope prediction phase. Eleven B-cell derived T-cell epitopes were selected that met the criteria of different immunoinformatics approaches such as allergenicity, antigenicity, immunogenicity, and toxicity. A vaccine construct was formulated using EAAAK and GPGPG linkers and a cholera toxin B subunit. This formulated vaccine construct was further used for downward analysis. The vaccine was loop refined and improved for structure stability through disulfide engineering. For an efficient expression, the codons were optimized as per the usage pattern of the E. coli (K12) expression system.
The top three refined docked complexes of the vaccine that docked with the MHC-I, MHC-II, and TLR-4 receptors were selected, which proved the best binding potential of the vaccine with immune receptors; this was followed by molecular dynamic simulations. The results indicate the best intermolecular bonding between immune receptors and vaccine epitopes and that they are exposed to the host's immune system. Finally, the binding energies were calculated to confirm the binding stability of the docked complexes. This work aimed to provide a manageable list of immunogenic and antigenic epitopes that could be used as potent vaccine candidates for experimental in vivo and in vitro studies.

RevDate: 2022-10-30

Liu Y, Cui X, Yang R, et al (2022)

Genomic Insights into the Radiation-Resistant Capability of Sphingomonas qomolangmaensis S5-59[T] and Sphingomonas glaciei S8-45[T], Two Novel Bacteria from the North Slope of Mount Everest.

Microorganisms, 10(10):.

Mount Everest provides natural advantages to finding radiation-resistant extremophiles that are functionally mechanistic and possess commercial significance. (1) Background: Two bacterial strains, designated S5-59[T] and S8-45[T], were isolated from moraine samples collected from the north slope of Mount Everest at altitudes of 5700 m and 5100 m above sea level. (2) Methods: The present study investigated the polyphasic features and genomic characteristics of S5-59[T] and S8-45[T]. (3) Results: The major fatty acids and the predominant respiratory menaquinone of S5-59[T] and S8-45[T] were summed as feature 3 (comprising C16:1 ω6c and/or C16:1 ω7c) and ubiquinone-10 (Q-10). Phylogenetic analyses based on 16S rRNA sequences and average nucleotide identity values among these two strains and their reference type strains were below the species demarcation thresholds of 98.65% and 95%. Strains S5-59[T] and S8-45[T] harbored great radiation resistance. The genomic analyses showed that DNA damage repair genes, such as mutL, mutS, radA, radC, recF, recN, etc., were present in the S5-59[T] and S8-45[T] strains. Additionally, strain S5-59[T] possessed more genes related to DNA protection proteins. The pan-genome analysis and horizontal gene transfers revealed that strains of Sphingomonas had a consistently homologous genetic evolutionary radiation resistance. Moreover, enzymatic antioxidative proteins also served critical roles in converting ROS into harmless molecules that resulted in resistance to radiation. Further, pigments and carotenoids such as zeaxanthin and alkylresorcinols of the non-enzymatic antioxidative system were also predicted to protect them from radiation. (4) Conclusions: Type strains S5-59[T] (=JCM 35564[T] =GDMCC 1.3193[T]) and S8-45[T] (=JCM 34749[T] =GDMCC 1.2715[T]) represent two novel species of the genus Sphingomonas with the proposed name Sphingomonas qomolangmaensis sp. nov. and Sphingomonas glaciei sp. nov.
The type strains, S5-59[T] and S8-45[T], were assessed in a deeply genomic study of their radiation-resistant mechanisms and this thus resulted in a further understanding of their greater potential application for the development of anti-radiation protective drugs.

RevDate: 2022-10-30

Zhang Z, Guo Y, Yang F, et al (2022)

Pan-Genome Analysis Reveals Functional Divergences in Gut-Restricted Gilliamella and Snodgrassella.

Bioengineering (Basel, Switzerland), 9(10):.

Gilliamella and Snodgrassella, members of the core gut microbiota in corbiculate bees, have high species diversity and adaptability to a wide range of hosts. In this study, we performed species taxonomy and phylogenetic analysis for Gilliamella and Snodgrassella strains that we isolated in our laboratory, in combination with published whole genomes. Functional effects of accessory and unique genes were investigated by KEGG category and pathway annotation in pan-genome analysis. Consequently, in Gilliamella, we inferred the importance of carbohydrate metabolism, amino acid metabolism, membrane transport, energy metabolism, and metabolism of cofactors and vitamins in accessory or unique genes. The pathways mentioned above, plus infectious disease, lipid metabolism, nucleotide metabolism as well as replication and repair, play a pivotal role in accessory or unique genes of Snodgrassella. Further analysis revealed the existence of functional differentiation of accessory and unique genes among Apis-derived genomes and Bombus-derived genomes. We also identified eight and four biosynthetic gene clusters in all Gilliamella and Snodgrassella genomes, respectively. Our study provides a good insight to better understand how host heterogeneity influences the bacterial speciation and affects the versatility of the genome of the gut bacteria.

RevDate: 2022-10-26

McInerney JO (2022)

Prokaryotic Pangenomes Act as Evolving Ecosystems.

Molecular biology and evolution pii:6775222 [Epub ahead of print].

Understanding adaptation to the local environment is a central tenet and a major focus of evolutionary biology. But this is only part of the adaptionist story. In addition to the external environment, one of the main drivers of genome composition is genetic background. In this perspective, I argue that there is a growing body of evidence that intra-genomic selective pressures play a significant part in the composition of prokaryotic genomes and play a significant role in the origin, maintenance and structuring of prokaryotic pangenomes.

RevDate: 2022-11-18
CmpDate: 2022-11-18

Sun X, Chen Z, Kong T, et al (2022)

Mycobacteriaceae Mineralizes Micropolyethylene in Riverine Ecosystems.

Environmental science & technology, 56(22):15705-15717.

Microplastic (MP) contamination is a serious global environmental problem. Plastic contamination has attracted extensive attention during the past decades. While physiochemical weathering may influence the properties of MPs, biodegradation by microorganisms could ultimately mineralize plastics into CO2. Compared to the well-studied marine ecosystems, the MP biodegradation process in riverine ecosystems, however, is less understood. The current study focuses on the MP biodegradation in one of the world's most plastic contaminated rivers, Pearl River, using micropolyethylene (mPE) as a model substrate. Mineralization of [13]C-labeled mPE into [13]CO2 provided direct evidence of mPE biodegradation by indigenous microorganisms. Several Actinobacteriota genera were identified as putative mPE degraders. Furthermore, two Mycobacteriaceae isolates related to the putative mPE degraders, Mycobacterium sp. mPE3 and Nocardia sp. mPE12, were retrieved, and their ability to mineralize [13]C-mPE into [13]CO2 was confirmed. Pangenomic analysis reveals that the genes related to the proposed mPE biodegradation pathway are shared by members of Mycobacteriaceae. While both Mycobacterium and Nocardia are known for their pathogenicity, these populations on the plastisphere in this study were likely nonpathogenic as they lacked virulence factors. The current study provided direct evidence for MP mineralization by indigenous biodegraders and predicted their biodegradation pathway, which may be harnessed to improve bioremediation of MPs in urban rivers.

RevDate: 2022-10-28

Rodrigues Blanco I, José Luduverio Pizauro L, Victor Dos Anjos Almeida J, et al (2022)

Pan-genomic and comparative analysis of Pediococcus pentosaceus focused on the in silico assessment of pediocin-like bacteriocins.

Computational and structural biotechnology journal, 20:5595-5606.

Bacteriocins are antimicrobial peptides produced by different species of bacteria, especially the Gram-positive lactic acid bacteria (LAB). Pediococcus pentosaceus is widely applied in the industry and stands out as Bacteriocin-Like Inhibitory Substances (BLIS) producer known to inhibit pathogens commonly considered a concern in the food industries. This study aimed to perform in silico comparisons of P. pentosaceus genomes available in the public GenBank database focusing on their pediocin-like bacteriocins repertoire. The pan-genome analysis evidenced a temporal signal in the pattern of gene gain and loss, supporting the hypothesis that the complete genetic repertoire of this group of bacteria is still uncovered. Thirteen bacteriocin genes from Class II and III were predicted in the accessory genome. Four pediocin-like bacteriocins (54% of the detected bacteriocin repertoire) and their accompanying immunity genes are highlighted; penocin A, coagulin A, pediocin PA-1, and plantaricin 423. Additionally, in silico, modeling of the pediocin-like bacteriocins revealed different configurations of the helix motif compared to other physically determined pediocin-like structures. Comparative and phylogenomic analyses support the hypothesis that a dynamic mechanism of bacteriocin acquisition and purging is not dependent on the bacterial isolation source origin. Synteny analysis revealed that while coagulin A, pediocin PA-1, and Plantaricin 423 loci are associated with insertion sequences mainly from the IS30 family and are likely of plasmid origin, penocin A lies in a conserved chromosomal locus. The results presented here provide insights into the unique pediocin-like bacteriocin peptide fold, genomic diversity, and the evolution of the bacteriocin genetic repertoire of P. pentosaceus, shedding new insights into the role of these biomolecules for application in inhibiting bacterial pathogens, and suggesting that prospecting and sequencing new strains is still an alternative to mining for new probiotic compounds.

RevDate: 2022-10-28
CmpDate: 2022-10-27

Chia CT, Bender AT, Lillis L, et al (2022)

Rapid detection of hepatitis C virus using recombinase polymerase amplification.

PloS one, 17(10):e0276582.

Over 71 million people are infected with hepatitis C virus (HCV) worldwide, and approximately 400,000 global deaths result from complications of untreated chronic HCV. Pan-genomic direct-acting antivirals (DAAs) have recently become widely available and feature high cure rates in less than 12 weeks of treatment. The rollout of DAAs is reliant on diagnostic tests for HCV RNA to identify eligible patients with viremic HCV infections. Current PCR-based HCV RNA assays are restricted to well-resourced central laboratories, and there remains a prevailing clinical need for expanded access to decentralized HCV RNA testing to provide rapid chronic HCV diagnosis and linkage to DAAs in outpatient clinics. This paper reports a rapid, highly accurate, and minimally instrumented assay for HCV RNA detection using reverse transcription recombinase polymerase amplification (RT-RPA). The assay detects all HCV genotypes with a limit of detection of 25 copies per reaction for genotype 1, the most prevalent in the United States and worldwide. The clinical sensitivity and specificity of the RT-RPA assay were both 100% when evaluated using 78 diverse clinical serum specimens. The accuracy, short runtime, and low heating demands of RT-RPA may enable implementation in a point-of-care HCV test to expand global access to effective treatment via rapid chronic HCV diagnosis.

RevDate: 2022-11-03
CmpDate: 2022-10-26

Gourlie R, McDonald M, Hafez M, et al (2022)

The pangenome of the wheat pathogen Pyrenophora tritici-repentis reveals novel transposons associated with necrotrophic effectors ToxA and ToxB.

BMC biology, 20(1):239.

BACKGROUND: In fungal plant pathogens, genome rearrangements followed by selection pressure for adaptive traits have facilitated the co-evolutionary arms race between hosts and their pathogens. Pyrenophora tritici-repentis (Ptr) has emerged recently as a foliar pathogen of wheat worldwide and its populations consist of isolates that vary in their ability to produce combinations of different necrotrophic effectors. These effectors play vital roles in disease development. Here, we sequenced the genomes of a global collection (40 isolates) of Ptr to gain insights into its gene content and genome rearrangements.

RESULTS: A comparative genome analysis revealed an open pangenome, with an abundance of accessory genes (~ 57%) reflecting Ptr's adaptability. A clear distinction between pathogenic and non-pathogenic genomes was observed in size, gene content, and phylogenetic relatedness. Chromosomal rearrangements and structural organization, specifically around effector coding genes, were detailed using long-read assemblies (PacBio RS II) generated in this work in addition to previously assembled genomes. We also discovered the involvement of large mobile elements associated with Ptr's effectors: ToxA, the gene encoding for the necrosis effector, was found as a single copy within a 143-kb 'Starship' transposon (dubbed 'Horizon') with a clearly defined target site and target site duplications. 'Horizon' was located on different chromosomes in different isolates, indicating mobility, and the previously described ToxhAT transposon (responsible for horizontal transfer of ToxA) was nested within this newly identified Starship. Additionally, ToxB, the gene encoding the chlorosis effector, was clustered as three copies on a 294-kb element, which is likely a different putative 'Starship' (dubbed 'Icarus') in a ToxB-producing isolate. ToxB and its putative transposon were missing from the ToxB non-coding reference isolate, but the homolog toxb and 'Icarus' were both present in a different non-coding isolate. This suggests that ToxB may have been mobile at some point during the evolution of the Ptr genome which is contradictory to the current assumption of ToxB vertical inheritance. Finally, the genome architecture of Ptr was defined as 'one-compartment' based on calculated gene distances and evolutionary rates.

CONCLUSIONS: These findings together reflect on the highly plastic nature of the Ptr genome which has likely helped to drive its worldwide adaptation and has illuminated the involvement of giant transposons in facilitating the evolution of virulence in Ptr.

RevDate: 2022-10-24

Suryaletha K, Savithri AV, Nayar SA, et al (2022)

Demystifying Bacteriocins of human microbiota by genome guided prospects: An impetus to rekindle the antimicrobial research.

Current protein & peptide science pii:CPPS-EPUB-127084 [Epub ahead of print].

The human microbiome is a reservoir of potential bacteriocins that can counteract multidrug-resistant bacterial pathogens. Unlike antibiotics, bacteriocins selectively inhibit a spectrum of competent bacteria and are said to safeguard gut commensals, reducing the chance of dysbiosis. Bacteriocinogenic probiotics or bacteriocins of human origin will be more pertinent in human physiological conditions for therapeutic applications to act against invading pathogens. Recent advances in omics approaches enable the mining of diverse and novel bacteriocins by identifying biosynthetic gene clusters from the human microbial genome, pangenome or shotgun metagenome, which is a breakthrough in the discovery of novel bacteriocins. This review summarizes the most recent trends and therapeutic potential of bacteriocins of human microbial origin, the advances in in silico algorithms and databases for the discovery of novel bacteriocins, and how to bridge the gap between the discovery of bacteriocin genes from big datasets and their in vitro production. In addition, the latter part of the review discusses the various impediments to their clinical application and possible solutions to bring them into frontline therapeutics to control infections, thereby meeting the challenges of global antimicrobial resistance.

RevDate: 2022-11-29
CmpDate: 2022-11-29

González-Torres B, González-Gómez JP, Ramírez K, et al (2023)

Population structure of the Salmonella enterica serotype Oranienburg reveals similar virulence, regardless of isolation years and sources.

Gene, 851:146966.

Salmonella enterica serotype Oranienburg is a multi-host, ubiquitous, and prevalent non-typhoidal Salmonella (NTS) in subtropical rivers, particularly in sediments; however, the possible adaptation and establishment of this microorganism based on its genetic content has been little studied so far. This study focused on the first five genomes of S. Oranienburg from sediments, obtained through whole-genome sequencing (WGS), and 61 river water genomes isolated in previous studies. Results showed an open pangenome with 5,594 gene clusters (GCs), divided into 3,303 core genes, 741 persistent genes, 1,282 accessory genes, and 268 unique genes. Additionally, the analysis showed three main subclades within the same serotype and a conserved genetic content, suggesting the display of different adaptation strategies for its establishment. Nine genes for antimicrobial resistance were detected: aac(6')-Iy, H-NS, golS, marA, mdsABC, mdtK, and sdiA, as well as a mutation in the parC gene (p.T57S) conferring resistance. In addition, virulence genes and Salmonella pathogenicity islands (SPIs) were analyzed, finding 92 genes and an identity above 80% in SPIs 1 to 5 and in centisomes 54 and 63. The environmental strains of S. Oranienburg do not represent a concern as multidrug-resistant (MDR) bacteria; however, virulence genes remain a potential health risk. This study contributes to understanding its adaptation to aquatic environments in Mexico.

RevDate: 2022-12-03
CmpDate: 2022-11-14

Dyrhage K, Garcia-Montaner A, Tamarit D, et al (2022)

Genome Evolution of a Symbiont Population for Pathogen Defense in Honeybees.

Genome biology and evolution, 14(11):.

The honeybee gut microbiome is thought to be important for bee health, but the role of the individual members is poorly understood. Here, we present closed genomes and associated mobilomes of 102 Apilactobacillus kunkeei isolates obtained from the honey crop (foregut) of honeybees sampled from beehives in Helsingborg in the south of Sweden and from the islands Gotland and Åland in the Baltic Sea. Each beehive contained a unique composition of isolates and repeated sampling of similar isolates from two beehives in Helsingborg suggests that the bacterial community is stably maintained across bee generations during the summer months. The sampled bacterial population contained an open pan-genome structure with a high genomic density of transposons. A subset of strains affiliated with phylogroup A inhibited growth of the bee pathogen Melissococcus plutonius, all of which contained a 19.5 kb plasmid for the synthesis of the antimicrobial compound kunkecin A, while a subset of phylogroups B and C strains contained a 32.9 kb plasmid for the synthesis of a putative polyketide antibiotic. This study suggests that the mobile gene pool of A. kunkeei plays a key role in pathogen defense in honeybees, providing new insights into the evolutionary dynamics of defensive symbiont populations.

RevDate: 2022-11-30
CmpDate: 2022-11-29

Jarvis ED, Formenti G, Rhie A, et al (2022)

Semi-automated assembly of high-quality diploid human reference genomes.

Nature, 611(7936):519-531.

The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society[1,2]. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals[3,4]. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome[5]. To address these limitations, the Human Pangenome Reference Consortium formed with the goal of creating high-quality, cost-effective, diploid genome assemblies for a pangenome reference that represents human genetic diversity[6]. Here, in our first scientific report, we determined which combination of current genome sequencing and assembly approaches yield the most complete and accurate diploid genome assembly with minimal manual curation. Approaches that used highly accurate long reads and parent-child data with graph-based haplotype phasing during assembly outperformed those that did not. Developing a combination of the top-performing methods, we generated our first high-quality diploid reference assembly, containing only approximately four gaps per chromosome on average, with most chromosomes within ±1% of the length of CHM13. Nearly 48% of protein-coding genes have non-synonymous amino acid changes between haplotypes, and centromeric regions showed the highest diversity. Our findings serve as a foundation for assembling near-complete diploid human genomes at scale for a pangenome reference to capture global genetic variation from single nucleotides to structural rearrangements.

RevDate: 2022-10-20
CmpDate: 2022-10-20

Abram KZ, Jun SR, Udaondo Z (2022)

Pseudomonas aeruginosa Pangenome: Core and Accessory Genes of a Highly Resourceful Opportunistic Pathogen.

Advances in experimental medicine and biology, 1386:3-28.

In this chapter, we leverage a novel approach to assess the seamless population structure of Pseudomonas aeruginosa, using the full repertoire of genomes sequenced to date (GenBank, April 6, 2020). In order to assess the set of core functions that represents the species as well as the differences in these core functions among the phylogroups observed in the population structure analysis, we performed pangenome analyses at the species level and at the phylogroup level. The existence of the phylogroups described in the population structure analyses was supported by their different profiles of antibiotic-resistance determinants. Finally, we utilized a presence/absence matrix of protein families from the entire species to evaluate whether P. aeruginosa phylogroups can be differentiated according to their accessory genomic content. Our analysis shows that the core genome of P. aeruginosa is approximately 62% of the average gene content for the species, and it is highly enriched with pathways related to the metabolism of carbohydrates and amino acids as well as cellular processes and cell maintenance. The analysis of the accessory genome of P. aeruginosa performed in this chapter confirmed not only the existence of the three phylogroups previously described in the population structure analysis, but also of 29 genetic substructures (subgroups) within the main phylogroups. Our work illustrates the utility of population genomics pipelines to better understand highly complex bacterial species such as P. aeruginosa.
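
Editorial illustration (not from the chapter): the relationship between a presence/absence matrix, the core genome, and the "core as a fraction of average gene content" figure quoted above can be sketched in a few lines of Python. The strain names, family identifiers, and the strict "present in every genome" definition of core are assumptions made for this toy example; the authors' actual pipeline and thresholds are not described in the abstract.

```python
from typing import Dict, Set, Tuple


def split_pangenome(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, accessory) protein-family sets.

    `genomes` maps a genome name to the set of families present in it,
    i.e. one row of a presence/absence matrix. Core is defined here,
    as an assumption, as families present in every genome.
    """
    if not genomes:
        return set(), set()
    family_sets = list(genomes.values())
    pan = set().union(*family_sets)          # every family seen anywhere
    core = set.intersection(*family_sets)    # families shared by all genomes
    return core, pan - core


def core_fraction_of_average(genomes: Dict[str, Set[str]]) -> float:
    """Core size divided by the mean number of families per genome."""
    core, _ = split_pangenome(genomes)
    mean_size = sum(len(families) for families in genomes.values()) / len(genomes)
    return len(core) / mean_size


# Hypothetical toy data: three strains, seven protein families.
toy = {
    "strainA": {"f1", "f2", "f3", "f4"},
    "strainB": {"f1", "f2", "f3", "f5"},
    "strainC": {"f1", "f2", "f6", "f7"},
}
core, accessory = split_pangenome(toy)
```

In this toy set, two families are shared by all three strains and each strain carries four families, so the core is half of the average gene content.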

RevDate: 2022-10-18

Wang S, Qian YQ, Zhao RP, et al (2022)

Graph-based pan-genome: increased opportunities in plant genomics.

Journal of experimental botany pii:6762754 [Epub ahead of print].

Due to the development of sequencing technology and the great reduction in sequencing costs, the genomes of an increasing number of plant species have been assembled, and the numerous genomes have revealed large amounts of variation. However, a single reference genome does not allow the exploration of species diversity; therefore, the concept of the pan-genome was developed. A pan-genome is a collection of all sequences available for a species, including a large number of consensus sequences, large structural variations (SVs), and small variations, including single nucleotide polymorphisms (SNPs) and insertions/deletions (InDels). A simple linear pan-genome does not allow these SVs to be intuitively characterized, so graph-based pan-genomes have been developed. These pan-genomes store sequence and SV information in the form of nodes and paths to store and display species variation information in a more intuitive manner. The key role of graph-based pan-genome is to expand the coordinate system of the linear reference genome to accommodate more regions of genetic diversity. Here, we review the origin and development of graph-based pan-genomes, explore their application in plant research, and further highlight the application of graph-based pan-genomes for future plant breeding.
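
Editorial illustration (not from the review): the idea of storing sequence and structural variation as nodes and paths can be sketched with a minimal data structure. The class name, method names, and sequences below are hypothetical and chosen only for this example; real graph pan-genome tools use richer models (for example, edges and strand orientation) that are omitted here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class SequenceGraph:
    """Toy graph pan-genome: nodes hold sequence, paths are walks over nodes."""
    nodes: Dict[int, str] = field(default_factory=dict)
    paths: Dict[str, List[int]] = field(default_factory=dict)

    def add_node(self, node_id: int, sequence: str) -> None:
        self.nodes[node_id] = sequence

    def add_path(self, name: str, node_ids: List[int]) -> None:
        # Reject walks that reference nodes not yet in the graph.
        missing = [n for n in node_ids if n not in self.nodes]
        if missing:
            raise KeyError(f"unknown node ids: {missing}")
        self.paths[name] = list(node_ids)

    def spell(self, name: str) -> str:
        """Reconstruct the linear sequence of one genome from its path."""
        return "".join(self.nodes[n] for n in self.paths[name])

    def shared_nodes(self) -> Set[int]:
        """Nodes traversed by every path (sequence common to all genomes)."""
        node_sets = [set(p) for p in self.paths.values()]
        return set.intersection(*node_sets) if node_sets else set()


# Hypothetical example: node 2 models an insertion carried by one genome only.
graph = SequenceGraph()
graph.add_node(1, "ACGT")
graph.add_node(2, "TTTGGG")
graph.add_node(3, "CCA")
graph.add_path("reference", [1, 3])
graph.add_path("sample_with_insertion", [1, 2, 3])
```

Calling `graph.spell("reference")` walks nodes 1 and 3, while the second path also passes through node 2, so the insertion is represented once in the graph rather than as a separate linear sequence.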

RevDate: 2022-10-19
CmpDate: 2022-10-18

Monshizadeh M, Zomorodi S, Mortensen K, et al (2022)

Revealing bacteria-phage interactions in human microbiome through the CRISPR-Cas immune systems.

Frontiers in cellular and infection microbiology, 12:933516.

The human gut microbiome is composed of a diverse consortium of microorganisms. Relatively little is known about the diversity of the bacteriophage population and their interactions with microbial organisms in the human microbiome. Due to the persistent rivalry between microbial organisms (hosts) and phages (invaders), genetic traces of phages are found in the hosts' CRISPR-Cas adaptive immune system. Mobile genetic elements (MGEs) found in bacteria include genetic material from phage and plasmids, often resulting from invasion events. We developed a computational pipeline (BacMGEnet), which can be used for inference and exploratory analysis of putative interactions between microbial organisms and MGEs (phages and plasmids) and their interaction network. Given a collection of genomes as the input, BacMGEnet utilizes computational tools we have previously developed to characterize CRISPR-Cas systems in the genomes, which are then used to identify putative invaders from publicly available collections of phage/prophage sequences. In addition, BacMGEnet uses a greedy algorithm to summarize identified putative interactions to produce a bacteria-MGE network in a standard network format. Inferred networks can be utilized to assist further examination of the putative interactions and for discovery of interaction patterns. Here we apply the BacMGEnet pipeline to a few collections of genomic/metagenomic datasets to demonstrate its utility. BacMGEnet revealed a complex interaction network of the Phocaeicola vulgatus pangenome with its phage invaders, and the modularity analysis of the resulting network suggested differential activities of the different P. vulgatus CRISPR-Cas systems (Type I-C and Type II-C) against some phages. Analysis of the phage-bacteria interaction network of the human gut microbiome revealed a mixture of phages with a broad host range (resulting in large modules with many bacteria and phages), and phages with a narrow host range. We also showed that BacMGEnet can be used to infer phages that invade bacteria and their interactions in the wound microbiome. We anticipate that BacMGEnet will become an important tool for studying the interactions between bacteria and their invaders for microbiome research.

RevDate: 2022-10-15

Palevich N, Palevich FP, Gardner A, et al (2022)

Genome collection of Shewanella spp. isolated from spoiled lamb.

Frontiers in microbiology, 13:976152.

The diversity of the genus Shewanella and their roles across a variety of ecological niches is largely unknown highlighting the phylogenetic diversity of these bacteria. From a food safety perspective, Shewanella species have been recognized as causative spoilage agents of vacuum-packed meat products. However, the genetic basis and metabolic pathways for the spoilage mechanism are yet to be explored due to the unavailability of relevant Shewanella strains and genomic resources. In this study, whole-genome sequencing of 32 Shewanella strains isolated from vacuum-packaged refrigerated spoiled lamb was performed to examine their roles in meat spoilage. Phylogenomic reconstruction revealed their genomic diversity with 28 Shewanella spp. strains belonging to the same putative novel species, two Shewanella glacialipiscicola strains (SM77 and SM91), Shewanella xiamenensis NZRM825, and Shewanella putrefaciens DSM 50426 (ATCC 8072) isolated from butter. Genome-wide clustering of orthologous gene families revealed functional groupings within the major Shewanella cluster but also considerable plasticity across the different species. Pan-genome analysis revealed conserved occurrence of spoilage genes associated with sulfur and putrescine metabolism, while the complete set of trimethylamine metabolism genes was observed in only Shewanella sp. SM74, S. glacialipiscicola SM77 and SM91 strains. Through comparative genomics, some variations were also identified pertaining to genes associated with adaptation to environmental cues such as temperature, osmotic, salt, oxidative, antimicrobial peptide, and drug resistance stresses. Here we provide a reference collection of draft Shewanella genomes for subsequent species descriptions and future investigations into the molecular spoilage mechanisms for further applications in the meat industry.

RevDate: 2022-10-13

Jana B, Keppel K, Fridman CM, et al (2022)

Multiple T6SSs, Mobile Auxiliary Modules, and Effectors Revealed in a Systematic Analysis of the Vibrio parahaemolyticus Pan-Genome.

mSystems [Epub ahead of print].

Type VI secretion systems (T6SSs) play a major role in interbacterial competition and in bacterial interactions with eukaryotic cells. The distribution of T6SSs and the effectors they secrete vary between strains of the same bacterial species. Therefore, a pan-genome investigation is required to better understand the T6SS potential of a bacterial species of interest. Here, we performed a comprehensive, systematic analysis of T6SS gene clusters and auxiliary modules found in the pan-genome of Vibrio parahaemolyticus, an emerging pathogen widespread in marine environments. We identified 4 different T6SS gene clusters within genomes of this species; two systems appear to be ancient and widespread, whereas the other 2 systems are rare and appear to have been more recently acquired via horizontal gene transfer. In addition, we identified diverse T6SS auxiliary modules containing putative effectors with either known or predicted toxin domains. Many auxiliary modules are possibly horizontally shared between V. parahaemolyticus genomes, since they are flanked by DNA mobility genes. We further investigated a DUF4225-containing protein encoded on an Hcp auxiliary module, and we showed that it is an antibacterial T6SS effector that exerts its toxicity in the bacterial periplasm, leading to cell lysis. Computational analyses of DUF4225 revealed a widespread toxin domain associated with various toxin delivery systems. Taken together, our findings reveal a diverse repertoire of T6SSs and auxiliary modules in the V. parahaemolyticus pan-genome, as well as novel T6SS effectors and toxin domains that can play a major role in the interactions of this species with other cells. IMPORTANCE Gram-negative bacteria employ toxin delivery systems to mediate their interactions with neighboring cells. Vibrio parahaemolyticus, an emerging pathogen of humans and marine animals, was shown to deploy antibacterial toxins into competing bacteria via the type VI secretion system (T6SS). 
Here, we analyzed 1,727 V. parahaemolyticus genomes and revealed the pan-genome T6SS repertoire of this species, including the T6SS gene clusters, horizontally shared auxiliary modules, and toxins. We also identified a role for a previously uncharacterized domain, DUF4225, as a widespread antibacterial toxin associated with diverse toxin delivery systems.

RevDate: 2022-10-12

Wang F, Guo Y, Liu Z, et al (2022)

New insights into the novel sequences of the chicken pangenome by liquid chip.

Journal of animal science pii:6759641 [Epub ahead of print].

Increasing evidence indicates that the missing sequences and genes in the chicken reference genome are involved in many crucial biological pathways, including metabolism and immunity. The low detection rate of novel sequences by resequencing hindered the acquisition of these sequences and the exploration of the relationship between new genes and economic traits. To improve the capture ratio of novel sequences, a 48K liquid chip including 25K from the reference sequence and 23K from the novel sequence was designed. The assay was tested on a panel of 218 animals from 5 chicken breeds. The average capture ratio of the reference sequence was 99.55%, and the average sequencing depth of the target sites was approximately 187 X, indicating a good performance and successful application of liquid chips in farm animals. For the target region in the novel sequence, the average capture ratio was 33.15% and the average sequencing depth of target sites was approximately 60X, both of which were higher than that of resequencing. However, the different capture ratios and capture regions among varieties and individuals proved the difficulty of capturing these regions with complex structures. After genotyping, GWAS showed variations in novel sequences potentially relevant to immune-related traits. For example, a SNP close to the differentiation of lymphocyte-related gene IGHV3-23-like was associated with the H/L ratio. These results suggest that targeted capture sequencing is a preferred method to capture these sequences with complex structures and genes potentially associated with immune-related traits.

RevDate: 2022-10-18
CmpDate: 2022-10-14

Wagner DM, Birdsell DN, McDonough RF, et al (2022)

Genomic characterization of Francisella tularensis and other diverse Francisella species from complex samples.

PloS one, 17(10):e0273273.

Francisella tularensis, the bacterium that causes the zoonosis tularemia, and its genetic near neighbor species, can be difficult or impossible to cultivate from complex samples. Thus, there is a lack of genomic information for these species that has, among other things, limited the development of robust detection assays for F. tularensis that are both specific and sensitive. The objective of this study was to develop and validate approaches to capture, enrich, sequence, and analyze Francisella DNA present in DNA extracts generated from complex samples. RNA capture probes were designed based upon the known pan genome of F. tularensis and other diverse species in the family Francisellaceae. Probes that targeted genomic regions also present in non-Francisellaceae species were excluded, and probes specific to particular Francisella species or phylogenetic clades were identified. The capture-enrichment system was then applied to diverse, complex DNA extracts containing low-level Francisella DNA, including human clinical tularemia samples, environmental samples (i.e., animal tissue and air filters), and whole ticks/tick cell lines, which was followed by sequencing of the enriched samples. Analysis of the resulting data facilitated rigorous and unambiguous confirmation of the detection of F. tularensis or other Francisella species in complex samples, identification of mixtures of different Francisella species in the same sample, analysis of gene content (e.g., known virulence and antimicrobial resistance loci), and high-resolution whole genome-based genotyping. 
The benefits of this capture-enrichment system include: even very low target DNA can be amplified; it is culture-independent, reducing exposure for research and/or clinical personnel and allowing genomic information to be obtained from samples that do not yield isolates; and the resulting comprehensive data not only provide robust means to confirm the presence of a target species in a sample, but also can provide data useful for source attribution, which is important from a genomic epidemiology perspective.

RevDate: 2022-10-11

Bista PK, Pillai D, Roy C, et al (2022)

Comparative Genomic Analysis of Fusobacterium necrophorum Provides Insights into Conserved Virulence Genes.

Microbiology spectrum [Epub ahead of print].

Fusobacterium necrophorum is a Gram-negative, filamentous anaerobe prevalent in the mucosal flora of animals and humans. It causes necrotic infections in cattle, resulting in a substantial economic impact on the cattle industry. Although infection severity and management differ within F. necrophorum species, little is known about F. necrophorum speciation and the genetic virulence determinants between strains. To characterize the clinical isolates, we performed whole-genome sequencing of four bovine isolates (8L1, 212, B17, and SM1216) and one human isolate (MK12). To determine the phylogenetic relationship and evolution pattern and investigate the presence of antimicrobial resistance genes (ARGs) and potential virulence genes of F. necrophorum, we also performed comparative genomics with publicly available Fusobacterium genomes. Using up-to-date bacterial core gene (UBCG) set analysis, we uncovered distinct Fusobacterium species and F. necrophorum subspecies clades. Pangenome analyses revealed a high level of diversity among Fusobacterium strains down to species levels. The output also identified 14 and 26 genes specific to F. necrophorum subsp. necrophorum and F. necrophorum subsp. funduliforme, respectively, which could be essential for bacterial survival under different environmental conditions. ClonalFrameML-based recombination analysis suggested that extensive recombination among accessory genes led to species divergence. Furthermore, the only strain of F. necrophorum with ARGs was F. necrophorum subsp. funduliforme B35, with acquired macrolide and tetracycline resistance genes. Our custom search revealed common virulence genes, including toxins, adhesion proteins, outer membrane proteins, cell envelope, type IV secretion system, ABC (ATP-binding cassette) transporters, and transporter proteins. A focused study on these genes could help identify major virulence genes and inform effective vaccination strategies against fusobacterial infections. 
IMPORTANCE Fusobacterium necrophorum is an anaerobic bacterium that causes liver abscesses in cattle with an annual incidence rate of 10% to 20%, resulting in a substantial economic impact on the cattle industry. The lack of definite biochemical tests makes it difficult to distinguish F. necrophorum subspecies phenotypically, where genomic characterization plays a significant role. However, due to the lack of a good reference genome for comparison, F. necrophorum subspecies-level identification represents a significant challenge. To overcome this challenge, we used comparative genomics to validate clinical test strains for subspecies-level identification. The findings of our study help predict specific clades of previously uncharacterized strains of F. necrophorum. Our study identifies both general and subspecies-specific virulence genes through a custom search-based analysis. The virulence genes identified in this study can be the focus of future studies aimed at evaluating their potential as vaccine targets to prevent fusobacterial infections in cattle.

RevDate: 2022-12-06
CmpDate: 2022-10-12

Moolhuijzen PM, See PT, Shi G, et al (2022)

A global pangenome for the wheat fungal pathogen Pyrenophora tritici-repentis and prediction of effector protein structural homology.

Microbial genomics, 8(10):.

The adaptive potential of plant fungal pathogens is largely governed by the gene content of a species, consisting of core and accessory genes across the pathogen isolate repertoire. To approximate the complete gene repertoire of a globally significant crop fungal pathogen, a pangenomic analysis was undertaken for Pyrenophora tritici-repentis (Ptr), the causal agent of tan (or yellow) spot disease in wheat. In this study, 15 new Ptr genomes were sequenced, assembled and annotated, including isolates from three races not previously sequenced. Together with 11 previously published Ptr genomes, a pangenome for 26 Ptr isolates from Australia, Europe, North Africa and America, representing nearly all known races, revealed a conserved core-gene content of 57% and presents a new Ptr resource for searching natural homologues (orthologues not acquired by horizontal transfer from another species) using remote protein structural homology. Here, we identify for the first time a non-synonymous mutation in the Ptr necrotrophic effector gene ToxB, multiple copies of the inactive toxb within an isolate, a distant natural Pyrenophora homologue of a known Parastagonospora nodorum necrotrophic effector (SnTox3), and clear genomic break points for the ToxA effector horizontal transfer region. This comprehensive genomic analysis of Ptr races includes nine isolates sequenced via long-read technologies. Accordingly, these resources provide a more complete representation of the species, and serve as a resource to monitor variations potentially involved in pathogenicity.

RevDate: 2022-10-11

Kim E, Yang SM, Kim IS, et al (2022)

Identification of Leuconostoc species based on novel marker genes identified using real-time PCR via computational pangenome analysis.

Frontiers in microbiology, 13:1014872.

Leuconostoc species are important microorganisms in food fermentation but also cause food spoilage. Although these species are commercially important, their taxonomy is still based on inaccurate identification methods. Here, we used computational pangenome analysis to develop a real-time PCR-based method for identifying and differentiating the 12 major Leuconostoc species found in food. Analysis of pan and core-genome phylogenies showed clustering of strains into 12 distinct groups according to the species. Pangenome analysis of 130 Leuconostoc genomes from these 12 species enabled the identification of each species-specific gene. In silico testing of the species-specific genes against 143 publicly available Leuconostoc and 100 other lactic acid bacterial genomes showed that all the assays had 100% inclusivity/exclusivity. We also verified the specificity for each primer pair targeting each specific gene using 23 target and 124 non-target strains and found high specificity (100%). The sensitivity of the real-time PCR method was 10^2 colony forming units (CFUs)/ml in pure culture and spiked food samples. All standard curves showed good linear correlations, with an R^2 value of ≥0.996, suggesting that screened targets have good specificity and strong anti-interference ability from food sample matrices and non-target strains. The real-time PCR method can be potentially used to determine the taxonomic status and identify the Leuconostoc species in foods.

